## Graph Neural Networks
::: columns :::: {.column width=65%}
- Graph Neural Networks (GNNs): Neural Networks that operate on graph structured data
- Graph: consists of nodes that can be connected by edges; edges can be directed or undirected
- no grid structure as given for CNNs
- node features and edge features possible
- relation often represented by adjacency matrix: $A_{ij}=1$ if there is a link between node $i$ and node $j$, else 0
- tasks on node level, edge level and graph level
- full lecture: \url{https://web.stanford.edu/class/cs224w/} :::: :::: {.column width=35%} \begin{center} \includegraphics[width=1.1\textwidth]{figures/graph_example.png} \normalsize \end{center} :::: :::
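The adjacency-matrix representation above can be sketched in a few lines of NumPy (a hypothetical toy graph, not one from the lecture):

```python
import numpy as np

# Small undirected graph with 4 nodes and edges (0-1), (0-2), (2-3).
# A[i, j] = 1 if there is a link between node i and node j, else 0.
edges = [(0, 1), (0, 2), (2, 3)]
n_nodes = 4

A = np.zeros((n_nodes, n_nodes), dtype=int)
for i, j in edges:
    A[i, j] = 1
    A[j, i] = 1  # undirected: the adjacency matrix is symmetric

print(A)
degrees = A.sum(axis=1)  # node degrees are the row sums of A
print(degrees)           # [2 1 2 1]
```

For a directed graph one would drop the symmetrization line, and node/edge features would be stored in separate arrays alongside `A`.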
## Simple Example: Zachary's karate club
::: columns :::: {.column width=60%}
- link: \url{https://en.wikipedia.org/wiki/Zachary's_karate_club}
- 34 nodes: each node represents a member of the karate club
- 4 classes: a community each member belongs to
- task: classify the nodes
- many real-world problems for GNNs exist, e.g.\ social networks, molecules, recommender systems, particle tracks :::: :::: {.column width=40%} \begin{center} \includegraphics[width=1.\textwidth]{figures/karateclub.png} \normalsize \end{center} :::: :::
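The karate club graph ships with `networkx`, so the dataset can be inspected directly (note: `networkx` stores the two real-world factions in the `club` node attribute, while the PyTorch Geometric tutorial version uses four community labels):

```python
import networkx as nx

# Zachary's karate club: 34 members, edges are friendships
G = nx.karate_club_graph()
print(G.number_of_nodes())  # 34 members
print(G.number_of_edges())  # 78 friendships
print(G.nodes[0]["club"])   # faction of member 0: 'Mr. Hi'
```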
## From CNN to GNN
\begin{center} \includegraphics[width=0.8\textwidth]{figures/fromCNNtoGNN.png} \normalsize \newline \tiny (from Stanford GNN lecture) \end{center} \normalsize
- GNN: Generalization of convolutional neural network
- No grid structure, arbitrary number of neighbors defined by adjacency matrix
- Operations pass information from neighborhood
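The "pass information from the neighborhood" step is, at its core, a multiplication with the adjacency matrix: each node sums the features of the nodes it is connected to. A minimal sketch on a hypothetical path graph:

```python
import numpy as np

# Path graph 0 - 1 - 2 with one scalar feature per node
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
H = np.array([[1.0], [2.0], [3.0]])

# One aggregation step: row i of A selects the neighbors of node i,
# so A @ H sums each node's neighbor features
H_next = A @ H
print(H_next.ravel())  # [2. 4. 2.]
```

Real GNN layers add a learnable weight matrix, normalization, and a nonlinearity on top of this aggregation, as the next slides show.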
## Architecture: Graph Convolutional Network
::: columns :::: {.column width=60%}
- Message passing from connected nodes
- The graph convolution is defined as:
$$H^{(l+1)} = \sigma \left( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} \right)$$
- The adjacency matrix $A$ including self-connections is given by $\tilde{A} = A + I$
- The degree matrix of the corrected adjacency matrix is given by $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$
- The weights of the given layer are called $W^{(l)}$
- $H^{(l)}$ is the matrix of activations in layer $l$ :::: :::: {.column width=40%} \begin{center} \includegraphics[width=1.1\textwidth]{figures/GCN.png} \normalsize \end{center} \tiny \url{https://arxiv.org/abs/1609.02907} :::: :::
## Architecture: Graph Attention Network
::: columns :::: {.column width=50%}
- Calculate the attention coefficients $e_{ij}$ from the features $\vec{h}$ for each node $i$ with its neighbors $j$:
$$e_{ij} = a\left( W\vec{h}_i, W\vec{h}_j \right)$$
- $a$: learnable weight vector
- Normalize the attention coefficients:
$$\alpha_{ij} = \text{softmax}_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_k \exp(e_{ik})}$$
- Calculate the node features:
$$\vec{h}^{(l+1)}_i = \sigma \left( \sum_j \alpha_{ij} W \vec{h}^{(l)}_j \right)$$
::::
:::: {.column width=50%}
\begin{center}
\includegraphics[width=1.1\textwidth]{figures/GraphAttention.png}
\normalsize
\end{center}
\tiny \url{https://arxiv.org/abs/1710.10903}
::::
:::
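The three steps of the attention layer can be sketched in NumPy (a simplified version: the paper additionally applies a LeakyReLU to $e_{ij}$ and a final nonlinearity $\sigma$, both omitted here; the toy graph and the fixed scoring vector `a` are hypothetical):

```python
import numpy as np

def gat_layer(A, H, W, a):
    # Sketch of one graph-attention step after arXiv:1710.10903
    Wh = H @ W                                 # transform node features
    H_out = np.zeros_like(Wh)
    for i in range(A.shape[0]):
        nbrs = np.where(A[i] > 0)[0]
        # e_ij = a([Wh_i || Wh_j]): score each neighbor with vector a
        e = np.array([a @ np.concatenate([Wh[i], Wh[j]]) for j in nbrs])
        alpha = np.exp(e) / np.exp(e).sum()    # softmax over neighbors of i
        H_out[i] = alpha @ Wh[nbrs]            # attention-weighted sum
    return H_out

# toy graph: node 0 connected to nodes 1 and 2
A = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
H = np.eye(3)       # one-hot node features
W = np.eye(3)
a = np.ones(6)      # "learnable" scoring vector, fixed here for the sketch
out = gat_layer(A, H, W, a)
print(out[0])  # equal attention on both neighbors: [0.  0.5 0.5]
```

Since all scores are equal here, the softmax distributes the attention uniformly; with a trained `a`, informative neighbors would receive larger weights $\alpha_{ij}$.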
## Example: Identification of inelastic interactions in TRD
::: columns
:::: {.column width=60%}
* Identification of inelastic interactions of light antinuclei
in the Transition Radiation Detector in ALICE
* Thesis: \url{https://www.physi.uni-heidelberg.de/Publications/Bachelor_Thesis_Maximilian_Hammermann.pdf}
* Construct nearest neighbor graph from signals in detector
* Use global pooling for graph classification
::::
:::: {.column width=40%}
interaction of antideuteron:
\begin{center}
\includegraphics[width=0.8\textwidth]{figures/antideuteronsgnMax.png}
\normalsize
\end{center}
::::
:::
\begin{center}
\includegraphics[width=0.9\textwidth]{figures/GNN_conf.png}
\normalsize
\end{center}
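The two ingredients mentioned above, a nearest-neighbor graph over detector signals and global pooling for a graph-level prediction, can be sketched with random stand-in data (the thesis uses real TRD signals; everything below is hypothetical):

```python
import numpy as np

# Stand-in for detector signals: 6 hits with 2D positions, 4 features each
rng = np.random.default_rng(0)
pos = rng.random((6, 2))
feat = rng.random((6, 4))
k = 2

# 1) k-nearest-neighbor graph: connect each hit to its k closest hits
dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
np.fill_diagonal(dist, np.inf)                 # exclude self-distances
A = np.zeros((6, 6))
for i in range(6):
    A[i, np.argsort(dist[i])[:k]] = 1          # directed k-NN edges

# 2) global mean pooling: collapse all node features into one fixed-size
# vector, which a classifier head can map to a class score
graph_vec = feat.mean(axis=0)
print(int(A.sum()), graph_vec.shape)  # 12 (4,)
```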
## Example: Google Maps
* link: \url{https://www.deepmind.com/blog/traffic-prediction-with-advanced-graph-neural-networks}
* GNNs are used for traffic predictions and estimated times of arrival (ETAs)
\begin{center}
\includegraphics[width=0.8\textwidth]{figures/GNNgooglemaps.png}
\normalsize
\end{center}
## Example: Alpha Fold
* link: \url{https://www.deepmind.com/blog/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology}
* "A folded protein can be thought of as a 'spatial graph', where residues are the nodes and edges connect the residues in close proximity"
\begin{center}
\includegraphics[width=0.9\textwidth]{figures/alphafold.png}
\normalsize
\end{center}
## Exercise 1: Illustration of Graphs and Graph Neural Networks
On the PyTorch Geometric webpage, you can find official examples for the application of Graph Neural Networks:
https://pytorch-geometric.readthedocs.io/en/latest/get_started/colabs.html
\vspace{3ex}
The first introduction notebook shows the functionality of graphs with the example of the Karate Club. Follow and reproduce the first [\textcolor{green}{notebook}](https://colab.research.google.com/drive/1h3-vJGRVloF5zStxL5I0rSy4ZUPNsjy8?usp=sharing). Study and understand the data format.
\vspace{3ex}
At the end, the separation power of Graph Convolutional Networks (GCNs) is shown via the node embeddings. You can replace the GCN with Graph Attention layers and compare the results.