2_Data structures used in Algorithms

Ivan
7 min readApr 15, 2023
The Last Dinner, Leonard Da Vinci, 1494–1498

Data structures used in algorithms As mentioned in the previous article, the right data structures are required for the performance of an algorithm to operate. An incorrect structure could prove to be detrimental or unsustainable to an algorithm. So we are going to see the data structure that every (or almost) programming language has. (Any examples or specificities will be reported in Python).

Reading notes: all images show Python code and the corresponding output. Within the images you will find the code comments. Understanding everything written in the code is not essential, the important thing is to have a first visualization of what is reported.

Lists

Definitely the most used structure, they are a set of elements separated by the comma. In the case of Python the elements in a list can be of any type, number or string but in other programming languages they must all be of the same type.

The peculiarities of the lists and why are so useful:

  • The items in a list are numbered, so they can be extracted. If I only need the twentieth item in a list of one hundred, I can only have that;
  • You can split lists with specific ranges by slicing;
  • You can choose the elements to be divided or the individual elements we are interested in, also using negative indices, which work the same way but starting from the bottom. For example, index 1 will extract the second element (see below why*) while index -1 will extract the last element;
  • They are iterative, so with certain programming structures, the whole list can be ‘consulted’ and the same operations can be applied to all or part of the elements via automatic control. Operations that consult the entire list one item at a time are called cycles.

* The index one will extract the second element because in computer science we always start from 0 for counting the elements of a list. So the zero elements is the first, the one element is the second, and so on. As for negative indices, there is no -0 so the last element is simply -1.

List creation in Python and select the first half of the list and the second half of the list.
Print the results

In the images, a list of four colours is created, then displayed in two, the last two colors and the first two.

Many functions can be used on lists, i.e. operations that work on data within a list, these operations allow you to filter data greater than a certain number, multiply all the values in the list and much more.

A particularly important function is range() which allows you to create lists in a range of numbers of your choice. With the range function I can create lists that start from zero to 1000, I can create lists ranging from 0 to 1000 that contain only one number every ten (10,20,30,40 …), in short, an essential tool for programming and for this structure.

Filter the list
Print the result of the previous filtering

Three lists are created to which three functions are applied. In the first case only the elements greater than 100 are returned, in the second case the powers and in the third the sum of all the values within the list.

Use the range function
Print the result of the range

Two lists are created with the indicated values, one from zero to six, the other from three to twenty-nine with steps of two.

The lists are enclosed in square brackets.

Tuple

They are like lists but can’t be modified, you can’t change the values inside them. A tuple is faster than a list, so a program should have tuples where it is possible.

The tuples are enclosed in round brackets.

Dictionaries

Dictionaries are a map structure, so each value is associated with a key and the value is extracted by calling the corresponding key. The key, which is always put in the first position is separated by two points from the value, each value key pair is separated by commas. It is particularly convenient to extract values and replace them.

Dictionaries are enclosed in curly braces.

Creation of a dictionary with the typical dictionary form
Print using the key associated with the value

A dictionary is created with three colors associated with three keys, then the first key is printed and then replaced, then printed again.

Sets

The peculiarity of dictionaries is that there can be equal values. If an existing value is entered, it will not be considered. Two interesting operations in the sets are the union and the intersection. In the first case we are returned the different values between two sets, in the second the values in common between two sets.

Sets are enclosed in curly braces, they are distinguishable from dictionaries because they do not have the division made with the colons between keys to values.

Creation of two sets and operation on the sets
Print the result of the operation

Two sets containing colors are created, then two operations are performed, in the first case only the unique colors in the two datasets appear, in the second only the common colors.

Dataframe

They are used to store tabular data (tables that have rows and columns). Both can have names to distinguish values in the column, for example, name, surname and age.

You can select subsets of rows and columns to create subsets within the dataframe. For example, you can choose only one column or several columns.

Data frame creation and operation on the data frame created

A data frame with four rows and four columns is created, namely, unique number, name, age, and decision.

Then the prints of the first two columns are shown, the rows from zero to three, the first two rows, the rows where the age is greater than 800, and those in which the two conditions are met.

Matrices

They are data structures arranged in rows and columns, you cannot name rows or columns. Usually all values are numeric as opposed to data frames, they are especially useful if you have to do calculations on all values in the array.

They are especially useful for representing images that are not matrices where in both rows and columns there are numbers representing a colour. So if we want to decrease everyone’s brightness, it will be enough to halve the value of all pixels, the opposite is if we want to make the image darker.

In addition to this data there are abstract structures, they are structures that do not really ‘exist’ but are used in the ways that will be described to facilitate the logical understanding of an algorithm.

Matrix creation
Print of the matrix

An array is created and then printed.

Vectors

They can be created with lists. They are numbers arranged linearly.

Stack

In one-dimensional lists, the data entered is processed in LIFO mode, so if I have a stack composed of three numbers and let’s say 7 is the last to be entered I can only and exclusively delete the number seven, the other two can not be touched. The operation with which I add data is the push() and the one with which I remove it is the pop().

Code

Again they are one-dimensional lists, the data is entered in FIFO mode, so if I have inserted three elements and I have to remove one I am forced to remove the first element inserted. In the stacks I had to remove the last element inserted.

Trees

Hierarchical structures are important for Machine Learning for the creation of Decision trees (I will talk about it soon if we are lucky).

They are composed of two symbols, the nodes(the spheres) and the connections(the lines that connect the spheres). The image will surely make you understand what we are talking about, some important features:

  • The first node, the one that has no connections above it is the root, usually contains the most important data;
  • The layer is the distance between a node and the root;
  • The brothers are the knots on the same level;
  • Parents are the nodes at the top level to which the underlying nodes are connected;
  • The degree of the node is the number of children of a node;
  • The grade of the tree is the maximum grade of the child;
  • An under tree is when we want to extract a portion of a tree. If it is done the root node also becomes the first node without parents;
  • A leaf knot is a childless knot;
  • While an internal node is a node with children and parents. Note that trees are a type of graph (also if we’re lucky I’ll talk about it later).
An example of decisional trees

BECOME a WRITER at MLearning.ai

--

--

Ivan

I’m a data science student, I want to share here my passion for the future, data, artificial intelligence and so much more! I am open to every type of collab.