## Graph Neural Networks
::: columns :::: {.column width=65%}
- Graph Neural Networks (GNNs): Neural Networks that operate on graph structured data
- Graph: consists of nodes that can be connected by edges; edges can be directed or undirected
- no grid structure as given for CNNs
- node features and edge features possible
- relation often represented by adjacency matrix: $A_{ij}=1$ if there is a link between node $i$ and node $j$, else 0
- tasks on node level, edge level and graph level
- full lecture: \url{https://web.stanford.edu/class/cs224w/} :::: :::: {.column width=35%} \begin{center} \includegraphics[width=1.1\textwidth]{figures/graph_example.png} \normalsize \end{center} :::: :::
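The adjacency-matrix representation above can be sketched in a few lines of NumPy (a hypothetical toy graph, not one from the lecture):

```python
import numpy as np

# Small undirected graph with 4 nodes and edges (0-1), (0-2), (2-3).
# A[i, j] = 1 if there is a link between node i and node j, else 0.
edges = [(0, 1), (0, 2), (2, 3)]
n_nodes = 4

A = np.zeros((n_nodes, n_nodes), dtype=int)
for i, j in edges:
    A[i, j] = 1
    A[j, i] = 1  # undirected: the adjacency matrix is symmetric

print(A)
degrees = A.sum(axis=1)  # node degrees are the row sums of A
print(degrees)           # [2 1 2 1]
```

For a directed graph one would drop the symmetrization line, and node/edge features would be stored in separate arrays alongside `A`.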
## Simple Example: Zachary's karate club
::: columns :::: {.column width=60%}
- link: \url{https://en.wikipedia.org/wiki/Zachary's_karate_club}
- 34 nodes: each node represents a member of the karate club
- 4 classes: a community each member belongs to
- task: classify the nodes
- many real-world problems for GNNs exist, e.g.\ social networks, molecules, recommender systems, particle tracks :::: :::: {.column width=40%} \begin{center} \includegraphics[width=1.\textwidth]{figures/karateclub.png} \normalsize \end{center} :::: :::
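The karate club graph ships with `networkx`, so the dataset can be inspected directly (note: `networkx` stores the two real-world factions in the `club` node attribute, while the PyTorch Geometric tutorial version uses four community labels):

```python
import networkx as nx

# Zachary's karate club: 34 members, edges are friendships
G = nx.karate_club_graph()
print(G.number_of_nodes())  # 34 members
print(G.number_of_edges())  # 78 friendships
print(G.nodes[0]["club"])   # faction of member 0: 'Mr. Hi'
```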
## From CNN to GNN
\begin{center} \includegraphics[width=0.8\textwidth]{figures/fromCNNtoGNN.png} \normalsize \newline \tiny (from Stanford GNN lecture) \end{center} \normalsize
- GNN: Generalization of convolutional neural network
- No grid structure, arbitrary number of neighbors defined by adjacency matrix
- Operations pass information from neighborhood
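The "pass information from the neighborhood" step is, at its core, a multiplication with the adjacency matrix: each node sums the features of the nodes it is connected to. A minimal sketch on a hypothetical path graph:

```python
import numpy as np

# Path graph 0 - 1 - 2 with one scalar feature per node
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
H = np.array([[1.0], [2.0], [3.0]])

# One aggregation step: row i of A selects the neighbors of node i,
# so A @ H sums each node's neighbor features
H_next = A @ H
print(H_next.ravel())  # [2. 4. 2.]
```

Real GNN layers add a learnable weight matrix, normalization, and a nonlinearity on top of this aggregation, as the next slides show.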
## Architecture: Graph Convolutional Network
::: columns :::: {.column width=60%}
- Message passing from connected nodes
- The graph convolution is defined as:
$$H^{(l+1)} = \sigma \left( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} \right)$$
- The adjacency matrix $A$ including self-connections is given by $\tilde{A} = A + I$
- The degree matrix of the corrected adjacency matrix is given by $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$
- The weights of the given layer are called $W^{(l)}$
- $H^{(l)}$ is the matrix of activations in layer $l$ :::: :::: {.column width=40%} \begin{center} \includegraphics[width=1.1\textwidth]{figures/GCN.png} \normalsize \end{center} \tiny \url{https://arxiv.org/abs/1609.02907} :::: :::
## Architecture: Graph Attention Network
::: columns :::: {.column width=50%}
- Calculate the attention coefficients $e_{ij}$ from the features $\vec{h}$ for each node $i$ with its neighbors $j$:
$$e_{ij} = a\left( W\vec{h}_i, W\vec{h}_j \right)$$
- $a$: learnable weight vector
- Normalize the attention coefficients:
$$\alpha_{ij} = \text{softmax}_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_k \exp(e_{ik})}$$
- Calculate the node features:
$$\vec{h}^{(l+1)}_i = \sigma \left( \sum_j \alpha_{ij} W \vec{h}^{(l)}_j \right)$$
::::
:::: {.column width=50%}
\begin{center}
\includegraphics[width=1.1\textwidth]{figures/GraphAttention.png}
\normalsize
\end{center}
\tiny \url{https://arxiv.org/abs/1710.10903}
::::
:::
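The three steps of the attention layer can be sketched in NumPy (a simplified version: the paper additionally applies a LeakyReLU to $e_{ij}$ and a final nonlinearity $\sigma$, both omitted here; the toy graph and the fixed scoring vector `a` are hypothetical):

```python
import numpy as np

def gat_layer(A, H, W, a):
    # Sketch of one graph-attention step after arXiv:1710.10903
    Wh = H @ W                                 # transform node features
    H_out = np.zeros_like(Wh)
    for i in range(A.shape[0]):
        nbrs = np.where(A[i] > 0)[0]
        # e_ij = a([Wh_i || Wh_j]): score each neighbor with vector a
        e = np.array([a @ np.concatenate([Wh[i], Wh[j]]) for j in nbrs])
        alpha = np.exp(e) / np.exp(e).sum()    # softmax over neighbors of i
        H_out[i] = alpha @ Wh[nbrs]            # attention-weighted sum
    return H_out

# toy graph: node 0 connected to nodes 1 and 2
A = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
H = np.eye(3)       # one-hot node features
W = np.eye(3)
a = np.ones(6)      # "learnable" scoring vector, fixed here for the sketch
out = gat_layer(A, H, W, a)
print(out[0])  # equal attention on both neighbors: [0.  0.5 0.5]
```

Since all scores are equal here, the softmax distributes the attention uniformly; with a trained `a`, informative neighbors would receive larger weights $\alpha_{ij}$.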
## Example: Identification of inelastic interactions in TRD
::: columns
:::: {.column width=60%}
* Identification of inelastic interactions of light antinuclei
in the Transition Radiation Detector in ALICE
* Thesis: \url{https://www.physi.uni-heidelberg.de/Publications/Bachelor_Thesis_Maximilian_Hammermann.pdf}
* Construct nearest neighbor graph from signals in detector
* Use global pooling for graph classification
::::
:::: {.column width=40%}
interaction of antideuteron:
\begin{center}
\includegraphics[width=0.8\textwidth]{figures/antideuteronsgnMax.png}
\normalsize
\end{center}
::::
:::
\begin{center}
\includegraphics[width=0.9\textwidth]{figures/GNN_conf.png}
\normalsize
\end{center}
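The two ingredients mentioned above, a nearest-neighbor graph over detector signals and global pooling for a graph-level prediction, can be sketched with random stand-in data (the thesis uses real TRD signals; everything below is hypothetical):

```python
import numpy as np

# Stand-in for detector signals: 6 hits with 2D positions, 4 features each
rng = np.random.default_rng(0)
pos = rng.random((6, 2))
feat = rng.random((6, 4))
k = 2

# 1) k-nearest-neighbor graph: connect each hit to its k closest hits
dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
np.fill_diagonal(dist, np.inf)                 # exclude self-distances
A = np.zeros((6, 6))
for i in range(6):
    A[i, np.argsort(dist[i])[:k]] = 1          # directed k-NN edges

# 2) global mean pooling: collapse all node features into one fixed-size
# vector, which a classifier head can map to a class score
graph_vec = feat.mean(axis=0)
print(int(A.sum()), graph_vec.shape)  # 12 (4,)
```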
## Example: Google Maps
* link: \url{https://www.deepmind.com/blog/traffic-prediction-with-advanced-graph-neural-networks}
* GNNs are used for traffic predictions and estimated times of arrival (ETAs)
\begin{center}
\includegraphics[width=0.8\textwidth]{figures/GNNgooglemaps.png}
\normalsize
\end{center}
## Example: Alpha Fold
* link: \url{https://www.deepmind.com/blog/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology}
* "A folded protein can be thought of as a 'spatial graph', where residues are the nodes and edges connect the residues in close proximity"
\begin{center}
\includegraphics[width=0.9\textwidth]{figures/alphafold.png}
\normalsize
\end{center}
## Exercise 1: Illustration of Graphs and Graph Neural Networks
On the PyTorch Geometric webpage, you can find official examples for the application of Graph Neural Networks:
https://pytorch-geometric.readthedocs.io/en/latest/get_started/colabs.html
\vspace{3ex}
The first introduction notebook shows the functionality of graphs with the example of the Karate Club. Follow and reproduce the first [\textcolor{green}{notebook}](https://colab.research.google.com/drive/1h3-vJGRVloF5zStxL5I0rSy4ZUPNsjy8?usp=sharing). Study and understand the data format.
\vspace{3ex}
At the end, the separation power of Graph Convolutional Networks (GCNs) is shown via the node embeddings. You can replace the GCN with Graph Attention layers and compare the results.