Graph Neural Network
A class of deep learning models designed to operate on graph-structured data, enabling nodes to aggregate and propagate information across their neighbourhoods through a message-passing mechanism.
A graph neural network (GNN) is a class of neural network designed to operate on data that is naturally represented as a graph — a structure composed of nodes (vertices) and edges (connections between nodes). Traditional deep learning architectures such as convolutional neural networks and recurrent neural networks are designed for grid-structured data (images, sequences), and do not generalise straightforwardly to irregular, non-Euclidean domains. GNNs address this limitation by defining a learning procedure that respects the topology of the graph.
Graph-Structured Data
Many real-world phenomena are best modelled as graphs. Social networks consist of users (nodes) connected by friendship or follow relationships (edges). Molecular structures are graphs of atoms connected by chemical bonds. Citation networks represent papers as nodes and citations as directed edges. Road and transit networks are spatial graphs. Knowledge bases organise entities and their relationships as directed property graphs.
Each node and edge can carry feature vectors encoding properties — for example, a node in a molecular graph might carry features describing the atom type, charge, and hybridisation state. The GNN's task is to produce useful representations (embeddings) of nodes, edges, or the entire graph by combining these features with structural information.
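As a minimal sketch of how such an attributed graph might be held in memory, the following uses a small container with node features, an edge list, and precomputed neighbour lists. The `Graph` class, its field names, and the choice of features (atomic number, formal charge, bond order) are illustrative assumptions, not a standard library type or a fixed convention.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """A small attributed, undirected graph (illustrative container only)."""
    node_features: dict                                  # node id -> feature vector
    edges: list                                          # list of (u, v) pairs
    edge_features: dict = field(default_factory=dict)    # (u, v) -> feature vector
    adjacency: dict = field(init=False, default_factory=dict)

    def __post_init__(self):
        # Precompute neighbour lists so that N(v) can be looked up directly.
        self.adjacency = {v: [] for v in self.node_features}
        for u, v in self.edges:
            self.adjacency[u].append(v)
            self.adjacency[v].append(u)

    def neighbours(self, v):
        return self.adjacency[v]


# Toy molecular graph for water (H-O-H).
# Assumed node features: [atomic number, formal charge].
# Assumed edge features: [bond order].
water = Graph(
    node_features={0: [8.0, 0.0], 1: [1.0, 0.0], 2: [1.0, 0.0]},
    edges=[(0, 1), (0, 2)],
    edge_features={(0, 1): [1.0], (0, 2): [1.0]},
)

print(water.neighbours(0))  # neighbours of the oxygen atom
```

Production libraries usually store the same information as dense feature matrices plus a sparse edge index, which is more efficient but carries the same content.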
Message Passing
The dominant computational paradigm for GNNs is message passing, formalised by Gilmer and colleagues in 2017 as the Message Passing Neural Network (MPNN) framework. In each layer of a GNN, every node collects messages from its immediate neighbours, aggregates them (typically by summing or averaging), and updates its own representation using a learnable transformation. After L layers, each node's representation incorporates information from its L-hop neighbourhood.
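Formally, the update for node v at layer l is $h_v^{(l+1)} = \mathrm{UPDATE}\big(h_v^{(l)}, \mathrm{AGGREGATE}(\{h_u^{(l)} : u \in N(v)\})\big)$, where $h_v^{(l)}$ is the representation of node v at layer l and N(v) denotes the set of neighbours of v. UPDATE and AGGREGATE are differentiable functions, typically with learnable parameters, and AGGREGATE must be permutation-invariant (for example a sum, mean, or maximum) because a node's neighbours have no natural order. This formulation encompasses a broad family of architectures.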
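The update rule can be illustrated with a small pure-Python sketch. It assumes sum aggregation and an update of the form tanh(W_self h_v + W_neigh m_v), where m_v is the aggregated message; these choices, the weight names, and the helper functions are assumptions made for illustration, not the definition of any particular published model. The example also checks the receptive-field claim: on a path graph 0 - 1 - 2, changing the features of node 2 leaves node 0 unchanged after one layer but not after two.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def message_passing_layer(h, neighbours, W_self, W_neigh):
    """One round of message passing with sum aggregation and a tanh update."""
    new_h = {}
    for v, h_v in h.items():
        # AGGREGATE: sum the current representations of all neighbours of v.
        agg = [0.0] * len(h_v)
        for u in neighbours[v]:
            agg = [a + b for a, b in zip(agg, h[u])]
        # UPDATE: combine the node's own state with the aggregated message.
        combined = [s + n for s, n in zip(matvec(W_self, h_v), matvec(W_neigh, agg))]
        new_h[v] = [math.tanh(c) for c in combined]
    return new_h


def run_layers(h, neighbours, W_self, W_neigh, num_layers):
    """Apply the same layer num_layers times (weights shared for brevity)."""
    for _ in range(num_layers):
        h = message_passing_layer(h, neighbours, W_self, W_neigh)
    return h


# Path graph 0 - 1 - 2; node 2 is two hops away from node 0.
neighbours = {0: [1], 1: [0, 2], 2: [1]}
identity = [[1.0, 0.0], [0.0, 1.0]]
h_a = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 0.0]}
h_b = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}  # differs only at node 2

one_layer_same = (run_layers(h_a, neighbours, identity, identity, 1)[0]
                  == run_layers(h_b, neighbours, identity, identity, 1)[0])
two_layers_same = (run_layers(h_a, neighbours, identity, identity, 2)[0]
                   == run_layers(h_b, neighbours, identity, identity, 2)[0])
print(one_layer_same, two_layers_same)  # True False
```

In a trained model each layer would normally have its own weights learned by gradient descent; the identity matrices here only keep the arithmetic easy to follow.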
Major Variants
Graph Convolutional Network (GCN), introduced by Kipf and Welling in 2017, is derived from a first-order approximation of spectral graph convolutions. In practice, each layer adds self-loops, averages neighbourhood features using symmetric degree normalisation, and applies a shared linear transformation followed by a nonlinearity. GCN is simple and effective for semi-supervised node classification.
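The GCN propagation rule is commonly written as H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W), where A is the adjacency matrix, I adds self-loops, D is the degree matrix of A + I, H holds the node features, and W is a learnable weight matrix. The following is a minimal dense sketch of one such layer in pure Python; the function name and the list-of-lists matrix layout are assumptions for illustration, and a practical implementation would use sparse matrix operations.

```python
import math


def gcn_layer(adj, H, W):
    """One dense GCN layer: ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own signal.
    A = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A]
    # Symmetric degree normalisation of every edge weight.
    A_hat = [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    n_in, n_out = len(W), len(W[0])
    # Aggregate neighbourhood features: A_hat @ H.
    AH = [[sum(A_hat[i][k] * H[k][f] for k in range(n)) for f in range(n_in)]
          for i in range(n)]
    # Shared linear transformation followed by ReLU: (A_hat @ H) @ W.
    return [[max(0.0, sum(AH[i][f] * W[f][o] for f in range(n_in)))
             for o in range(n_out)] for i in range(n)]


# Path graph 0 - 1 - 2 with a single constant feature per node.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
H = [[1.0], [1.0], [1.0]]
W = [[1.0]]  # identity weight so the normalisation is visible in the output
out = gcn_layer(adj, H, W)
print(out)
```

Even with identical input features the outputs differ between the end nodes and the middle node, because the normalisation depends on node degree.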
Graph Attention Network (GAT), proposed by Velickovic and colleagues in 2018, replaces uniform neighbourhood averaging with learned attention weights, allowing each node to weight its neighbours' contributions differentially. This is analogous to the self-attention mechanism in Transformer models.
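The attention weights can be sketched as follows, assuming the commonly described single-head form: a score LeakyReLU(a · [W h_i || W h_j]) for each neighbour j, normalised with a softmax over the neighbourhood. The function names, the negative slope of 0.2, and the convention of passing already-transformed features `Wh` are assumptions made to keep the example short.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def attention_weights(i, neighbourhood, Wh, a):
    """Softmax-normalised attention of node i over the nodes in neighbourhood.

    Wh maps node id -> already linearly transformed feature vector.
    a is the attention vector, of length 2 * len(Wh[i]).
    """
    scores = {}
    for j in neighbourhood:
        concat = Wh[i] + Wh[j]  # list concatenation: [W h_i || W h_j]
        scores[j] = leaky_relu(sum(ak * ck for ak, ck in zip(a, concat)))
    # Numerically stable softmax over the neighbourhood.
    top = max(scores.values())
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}


def gat_aggregate(i, neighbourhood, Wh, a):
    """Attention-weighted sum of neighbour features for node i."""
    alpha = attention_weights(i, neighbourhood, Wh, a)
    dim = len(Wh[i])
    return [sum(alpha[j] * Wh[j][d] for j in neighbourhood) for d in range(dim)]


Wh = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 3.0]}
neighbourhood = [0, 1, 2]        # node 0 attends to itself and its two neighbours
a = [0.0, 0.0, 0.0, 1.0]         # hypothetical vector scoring the neighbour's 2nd feature
alpha = attention_weights(0, neighbourhood, Wh, a)
uniform = attention_weights(0, neighbourhood, Wh, [0.0] * 4)
print(alpha, uniform)
```

With an all-zero attention vector the weights reduce to a uniform average, which recovers the behaviour that attention is meant to generalise.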
GraphSAGE (Hamilton et al., 2017) introduced an inductive variant that learns an aggregation function applicable to unseen nodes, enabling scaling to large graphs via mini-batch sampling.
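A rough sketch of the idea, assuming a mean aggregator and a fixed neighbour sample size: each node samples at most k neighbours, averages their features, concatenates the result with its own features, and applies a shared weight matrix followed by ReLU. The function names and the toy weights are hypothetical, and capping the sample at the available neighbours is a simplification. Because the function depends only on local features, it can be applied unchanged to a node added after the weights were fixed.

```python
import random


def sample_neighbours(neighbours, v, k, rng):
    """Return at most k neighbours of v, sampled without replacement."""
    nbrs = neighbours[v]
    if len(nbrs) <= k:
        return list(nbrs)
    return rng.sample(nbrs, k)


def sage_mean_layer(v, h, neighbours, W, k, rng):
    """ReLU(W [h_v || mean of sampled neighbour features])."""
    sampled = sample_neighbours(neighbours, v, k, rng)
    dim = len(h[v])
    if sampled:
        mean = [sum(h[u][d] for u in sampled) / len(sampled) for d in range(dim)]
    else:
        mean = [0.0] * dim  # isolated node: no neighbour signal
    concat = h[v] + mean
    return [max(0.0, sum(w * x for w, x in zip(row, concat))) for row in W]


rng = random.Random(0)
# Star graph: node 0 in the centre, nodes 1..5 as leaves.
neighbours = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
h = {0: [0.0, 1.0]}
h.update({leaf: [1.0, 0.0] for leaf in range(1, 6)})
# Toy weights: output = [own first feature, mean neighbour first feature].
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]

centre_out = sage_mean_layer(0, h, neighbours, W, k=2, rng=rng)

# A node that was not present when W was chosen can reuse the same function.
neighbours[6] = [0]
neighbours[0].append(6)
h[6] = [1.0, 0.0]
unseen_out = sage_mean_layer(6, h, neighbours, W, k=2, rng=rng)
print(centre_out, unseen_out)
```

Fixing k bounds the cost per node regardless of its degree, which is what makes mini-batch training on large graphs tractable.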
Graph Isomorphism Network (GIN), developed by Xu and colleagues (2019), is provably as expressive as the 1-dimensional Weisfeiler-Leman (1-WL) graph isomorphism test, which is an upper bound on the discriminative power of standard message-passing GNNs. This makes it among the most powerful message-passing GNNs in terms of distinguishing non-isomorphic graphs.
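The Weisfeiler-Leman test itself can be sketched as iterative colour refinement: every node repeatedly replaces its colour with a label determined by its own colour and the multiset of its neighbours' colours, and two graphs are declared possibly isomorphic if their final colour histograms match. The following is an illustrative implementation with assumed function and variable names; the shared `palette` dictionary ensures that labels are comparable across graphs. It also shows a well-known limitation that message-passing GNNs inherit: two disjoint triangles and a six-node cycle receive identical histograms.

```python
from collections import Counter


def wl_histogram(neighbours, rounds, palette):
    """Colour histogram after `rounds` of 1-WL refinement.

    `palette` maps signatures to integer colours and must be shared
    between all graphs that are going to be compared.
    """
    colours = {v: 0 for v in neighbours}  # uniform initial colouring
    for _ in range(rounds):
        refined = {}
        for v in neighbours:
            # Signature: own colour plus the sorted multiset of neighbour colours.
            signature = (colours[v], tuple(sorted(colours[u] for u in neighbours[v])))
            refined[v] = palette.setdefault(signature, len(palette) + 1)
        colours = refined
    return Counter(colours.values())


two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

palette = {}
hist_triangles = wl_histogram(two_triangles, 3, palette)
hist_hexagon = wl_histogram(hexagon, 3, palette)
hist_path = wl_histogram(path, 3, palette)

print(hist_triangles == hist_hexagon)  # True: 1-WL cannot separate them
print(hist_path == hist_hexagon)       # False: degrees already differ
```

GIN's use of sum aggregation followed by an injective update is what lets it mirror this refinement step; mean or max aggregation can merge neighbour multisets that the test keeps apart.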
Applications
GNNs have achieved state-of-the-art results across numerous domains.
In drug discovery and bioinformatics, GNNs model molecules as graphs of atoms and bonds to predict properties such as toxicity, solubility, and binding affinity. DeepMind's AlphaFold 2 incorporates graph-based components to predict three-dimensional protein structures. RFdiffusion (Baker Lab) combines graph-based neural network components with diffusion models to design novel protein structures satisfying custom constraints.
In recommendation systems, platforms such as Pinterest (PinSage) and Alibaba use GNNs to model user-item interaction graphs, improving product and content recommendations over collaborative filtering baselines.
In fraud detection, financial institutions model transaction networks as graphs, where GNNs identify suspicious subgraph patterns that correspond to money laundering rings or account takeover attacks.
In traffic and transportation, GNNs model road networks to forecast travel time and optimise route planning, with deployments at Google Maps and DiDi.
In network security, GNNs capture complex dependencies in network topology graphs to detect intrusion attempts and anomalous lateral movement within enterprise networks.
Scalability and Open Challenges
Applying GNNs to graphs with billions of nodes and edges — such as social networks or the web graph — presents significant engineering challenges. Techniques including mini-batch neighbourhood sampling, cluster-based training (Cluster-GCN), and graph partitioning have been developed to address memory and compute constraints.
Over-smoothing is a well-documented failure mode in which stacking many message-passing layers causes node representations to converge to similar values, losing discriminative information. Over-squashing occurs when information from a rapidly growing number of distant nodes must be compressed into fixed-size representations, limiting the effective receptive field. Active research addresses both phenomena through architectural and training innovations.
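Over-smoothing can be demonstrated with a deliberately simplified model: repeated mean aggregation with self-loops and no learnable weights or nonlinearities, applied to scalar node features. This is an assumption-laden toy rather than a real GNN, and the function names are hypothetical, but it shows the mechanism: on a connected graph, the gap between the largest and smallest node value shrinks with every round.

```python
def mean_aggregate(h, neighbours):
    """Replace each node value with the mean over itself and its neighbours."""
    return {
        v: (h[v] + sum(h[u] for u in neighbours[v])) / (1 + len(neighbours[v]))
        for v in h
    }


def spread(h):
    """Gap between the largest and smallest node value."""
    return max(h.values()) - min(h.values())


def spread_after(h, neighbours, rounds):
    for _ in range(rounds):
        h = mean_aggregate(h, neighbours)
    return spread(h)


# Path graph 0 - 1 - 2 - 3 with a single distinctive node.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
h0 = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}

for rounds in (0, 1, 5, 50):
    print(rounds, spread_after(h0, neighbours, rounds))
```

After enough rounds the nodes become nearly indistinguishable, which is why deeper is not automatically better for message-passing models.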
References
- Scarselli, F., et al. (2009). The graph neural network model. IEEE Transactions on Neural Networks, 20(1), 61-80.
- Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. ICLR 2017.
- Velickovic, P., et al. (2018). Graph attention networks. ICLR 2018.
- Gilmer, J., et al. (2017). Neural message passing for quantum chemistry. ICML 2017.
- Hamilton, W., et al. (2017). Inductive representation learning on large graphs. NeurIPS 2017.