Understanding Graph Neural Networks with a hands-on example | Part 2

--

Photo by Paulius Andriekus on Unsplash

Welcome back to the next part of this Blog Series on Graph Neural Networks!

The following section will provide a little introduction to PyTorch Geometric, and then we’ll use this library to construct our very own Graph Neural Network! For this approach, I will make use of the MNIST-Superpixel dataset.

Here is the first part of this series

This portion of the series is also accessible as a Colab notebook, which can be found at this link - https://colab.research.google.com/drive/1EMgPuFaD-xpboG_ZwZcytnlOlr39rakd

PyTorch Geometric Introduction

PyTorch Geometric is a Python library for deep learning on irregular data structures such as graphs. It is widely used as the framework for constructing Graph Neural Networks. It can be installed with the pip package manager by running the following commands:
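A rough sketch of what those commands look like (the exact wheel URL depends on your installed PyTorch and CUDA versions, so treat the placeholders below as values you need to fill in yourself):

    # core library
    pip install torch-geometric

    # optional compiled extensions; replace ${TORCH} and ${CUDA} with your
    # installed PyTorch version and CUDA tag (e.g. 2.1.0 and cu121, or cpu)
    pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html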

How Data is represented in PyTorch Geometric

I’ll quickly review the library’s core concepts in the sections that follow. As the name implies, it is an extension of PyTorch and hence operates similarly to how torch models are constructed. Data can be stored in a specialized Data object that contains the following attributes:

  • data.x: Node feature matrix with shape [num_nodes, num_node_features]. This means that for each node in the graph, we have a node feature vector; data.x simply holds these per-node feature vectors stacked into a matrix.
  • data.edge_index: Graph connectivity in COO format with shape [2, num_edges] and type torch.long. COO stands for coordinate list and is a special format used to represent sparse matrices: it contains 2-tuples of elements that are connected. This is an alternative form to the already mentioned adjacency matrix.
  • data.edge_attr: Edge feature matrix with shape [num_edges, num_edge_features]

As explained before, edges can also have features, which are stored the same way as for the nodes — resulting in a matrix.

  • data.y: Target to train against (may have arbitrary shape), e.g., node-level targets of shape [num_nodes, *] or graph-level targets of shape [1, *]

In the end, data can be loaded using the provided DataLoader, which allows batching, iterating, shuffling, efficient handling of the graph structure, and numerous other features.
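As an illustration, here is a minimal sketch of building a tiny Data object by hand and batching it with the DataLoader (the toy graph and its values are made up purely for demonstration):

    import torch
    from torch_geometric.data import Data
    from torch_geometric.loader import DataLoader  # torch_geometric.data.DataLoader in older versions

    # a toy graph with 3 nodes, 1 feature per node, and 2 undirected edges
    x = torch.tensor([[1.0], [2.0], [3.0]])                      # [num_nodes, num_node_features]
    edge_index = torch.tensor([[0, 1, 1, 2],
                               [1, 0, 2, 1]], dtype=torch.long)  # [2, num_edges] in COO format
    y = torch.tensor([0])                                        # graph-level target

    data = Data(x=x, edge_index=edge_index, y=y)

    # the DataLoader combines several graphs into one big disconnected batch graph
    loader = DataLoader([data, data, data], batch_size=2, shuffle=True)
    for batch in loader:
        print(batch)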

In the following section, I will make use of the MNIST Superpixels dataset, which is already provided in PyTorch Geometric.

Our example dataset: MNISTSuperpixels

In the following, I will use a dataset provided in the dataset collection of PyTorch Geometric (Here you find all datasets).

Here, the machine learning task is graph classification on the MNISTSuperpixels dataset with a Graph Neural Network.

In this paper, Monti and colleagues used the MNIST dataset and converted it into a graph-based format by using a superpixel-based representation.

Image source: https://arxiv.org/pdf/1611.08402.pdf

The regular grid is seen on the left in the image above (the graph is fixed for all images). The graph on the right represents the superpixel adjacency (different for each image). Vertices are represented by red circles, and edges are represented by red lines.

These superpixels form the nodes of the graph that the GCN operates on. Afterward, a fully connected graph is designed in which each superpixel is connected to every other superpixel in the image, allowing information to spread throughout the entire image.

Loading the Data

The MNISTSuperpixels data can be loaded directly into PyTorch Geometric; however, it is necessary to first import another library, NetworkX, before the data can be used.

After that, the dataset is imported using the following syntax:
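Something along these lines (the root folder name is my own choice; the dataset is downloaded to that folder on first use):

    from torch_geometric.datasets import MNISTSuperpixels

    dataset = MNISTSuperpixels(root="data/MNISTSuperpixels")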

Some basic insights are observed by running:
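For example (assuming the dataset object loaded above):

    print(dataset)               # number of graphs in the dataset
    print(dataset.num_features)  # number of features per node
    print(dataset.num_classes)   # number of target classes (the ten digits)
    print(dataset[0])            # summary of the first graph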

This dataset contains 60000 samples and 1 feature at the node level for each of the nodes in the dataset. A closer examination of the first sample/graph reveals that it has 75 nodes and 1399 edges.

The node features for the first sample are also printed.
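A minimal way to do this, using the dataset loaded above:

    print(dataset[0].x)  # node feature matrix of the first graph, shape [num_nodes, num_node_features]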

In the same way, the edge information is investigated by looking at the edge index:
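For example (transposing so that each row is one source/target pair):

    print(dataset[0].edge_index.t())  # each row is a (source, target) node pair in COO format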

Input Graph Visualization

The NetworkX library, which is imported here, is used to visualize the graph.
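A minimal sketch of such a visualization, using the to_networkx helper from PyTorch Geometric (the figure size and node size are arbitrary choices):

    import networkx as nx
    import matplotlib.pyplot as plt
    from torch_geometric.utils import to_networkx

    sample = dataset[0]
    g = to_networkx(sample, to_undirected=True)

    plt.figure(figsize=(6, 6))
    nx.draw(g, node_size=30)
    plt.show()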

The following is the graph representation of one input to the GCN, whose actual target is 0 (zero):

Building the Graph Neural Network Model

The code in the following section creates a straightforward GNN model. To begin, I import the GCNConv layer from PyTorch Geometric and create a first layer that converts the node features into a size that corresponds to the size of the embedding. After that, I stack three more Message Passing layers on top of each other. This means that the model gathers information from neighborhoods up to four hops away from each node.

Between the layers, I utilize a tanh activation function. After that, I consolidate the node embeddings into a single embedding vector by applying a pooling operation to them; here, a mean and a maximum operation are performed on the node states. The reason for this is that I want to make a prediction at the graph level and hence require a composite embedding; when dealing with predictions at the node level, this pooling step would not be necessary.

There are a variety of alternative Pooling layers available in PyTorch Geometric, but I’d like to keep things simple here and utilize this mix of mean and maximum.

Finally, a linear output layer ensures that I receive an output value that is continuous and unbounded. The flattened, pooled vector is used as the input to this layer.
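Putting the pieces above together, a sketch of a model matching this description could look as follows (the layer names and the exact structure of the forward pass are my own; dataset is the MNISTSuperpixels object loaded earlier):

    import torch
    from torch.nn import Linear
    from torch_geometric.nn import GCNConv, global_mean_pool, global_max_pool

    embedding_size = 64

    class GCN(torch.nn.Module):
        def __init__(self):
            super().__init__()
            # first message passing layer: node features -> embedding size
            self.initial_conv = GCNConv(dataset.num_features, embedding_size)
            # three further message passing layers
            self.conv1 = GCNConv(embedding_size, embedding_size)
            self.conv2 = GCNConv(embedding_size, embedding_size)
            self.conv3 = GCNConv(embedding_size, embedding_size)
            # linear output layer on the concatenated (mean, max) pooled embedding
            self.out = Linear(embedding_size * 2, dataset.num_classes)

        def forward(self, x, edge_index, batch_index):
            hidden = torch.tanh(self.initial_conv(x, edge_index))
            hidden = torch.tanh(self.conv1(hidden, edge_index))
            hidden = torch.tanh(self.conv2(hidden, edge_index))
            hidden = torch.tanh(self.conv3(hidden, edge_index))
            # pool the node embeddings into one graph embedding (mean + max)
            hidden = torch.cat([global_mean_pool(hidden, batch_index),
                                global_max_pool(hidden, batch_index)], dim=1)
            return self.out(hidden)

    model = GCN()
    print(model)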

After printing the model summary of the layers, it is seen that the node features are fed into the Message Passing layers, which produce hidden states of size 64; these are finally combined using the mean and max operations and mapped to the 10 output classes. The choice of the embedding size (64) is a hyperparameter and depends on factors such as the size of the graphs in the dataset.

Finally, this model has 13898 parameters, which seems reasonable, as I have 9000 samples. For demonstration purposes, 15% of the total dataset is used.

Training the Graph Neural Network Model

A batch size of 64 is selected (which means we have 64 graphs in each batch), and the shuffle option is enabled to distribute the graphs randomly across batches. The first 80 percent of the 15% subset of the main dataset is used as training data, and the remaining 20 percent of that subset as test data. Cross-entropy is used as the loss metric in my analysis. Adam (Adaptive Moment Estimation) is chosen as the optimizer, with an initial learning rate of 0.0007.
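A sketch of this setup (the variable names are mine; dataset and model come from the snippets above):

    from torch_geometric.loader import DataLoader

    data_size = len(dataset)
    subset = int(data_size * 0.15)   # use only 15% of the data for demonstration
    split = int(subset * 0.8)        # 80% of that subset for training, 20% for testing

    train_loader = DataLoader(dataset[:split], batch_size=64, shuffle=True)
    test_loader = DataLoader(dataset[split:subset], batch_size=64, shuffle=True)

    loss_fn = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.0007)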

It was then a simple matter of iterating over each batch of data loaded by the DataLoader inside a train function; fortunately, the DataLoader takes care of the batching for us. This train function is then called once per epoch, which in this case meant 500 times.
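A sketch of such a train function and the surrounding loop, under the assumptions from the setup above:

    def train():
        model.train()
        total_loss = 0
        for batch in train_loader:
            optimizer.zero_grad()
            pred = model(batch.x, batch.edge_index, batch.batch)
            loss = loss_fn(pred, batch.y)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        return total_loss / len(train_loader)

    losses = []
    for epoch in range(500):
        epoch_loss = train()
        losses.append(epoch_loss)
        if epoch % 10 == 0:
            print(f"Epoch {epoch} | Train loss {epoch_loss:.4f}")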

My training output looked like this:

I also visualized the training loss as follows:
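For instance, with matplotlib and the losses list collected in the training loop above:

    import matplotlib.pyplot as plt

    plt.plot(losses)
    plt.xlabel("Epoch")
    plt.ylabel("Training loss")
    plt.show()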

Based on the aforementioned plot, it can be seen that the loss is decreasing. This can be further enhanced by correct tuning of hyperparameters, such as changing the learning rate.

Validating the predictions

For a test batch, it is evaluated on a rough scale how accurate the graph predictions are. To this end, the actual labels are printed alongside the predictions.
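A minimal sketch of this check, taking one batch from the test loader defined above:

    model.eval()
    with torch.no_grad():
        batch = next(iter(test_loader))
        pred = model(batch.x, batch.edge_index, batch.batch).argmax(dim=1)
        print("Actual:   ", batch.y.tolist())
        print("Predicted:", pred.tolist())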

I hope you enjoyed this series on Graph Neural Networks. If you have any questions or require assistance, please do not hesitate to leave a remark, and I will try my best to assist you!
