torch-geometric

Graph neural networks with PyTorch Geometric (PyG): node and graph classification, link prediction, GCN, GAT, GraphSAGE, heterogeneous graphs, and molecular property prediction for geometric deep learning.

Install

Download and extract to your skills directory

Copy command and send to OpenClaw for auto-install:

Download and install this skill https://openskills.cc/api/download?slug=k-dense-ai-scientific-skills-torch_geometric&locale=en&source=copy

PyTorch Geometric (PyG) - Graph Neural Network Deep Learning Library

Skill Overview

PyTorch Geometric (PyG) is a graph neural network (GNN) library built on PyTorch, designed for graph-structured data and geometric deep learning tasks. It offers 40+ graph convolutional layers (such as GCN, GAT, and GraphSAGE), a full message-passing framework, sampling-based training for large graphs, and model explainability tools. It is suitable for node classification, graph classification, link prediction, molecular property prediction, and more.

Applicable Scenarios

1. Graph Machine Learning and Deep Learning Research

PyG is a core tool for graph deep learning research, supporting classic tasks such as node classification, graph classification, and link prediction. Whether the task is paper classification in citation networks, community detection in social networks, or relationship modeling in recommender systems, PyG provides efficient workflows for data loading, model training, and evaluation. Built-in benchmark datasets such as Planetoid, TUDataset, and QM9 let researchers validate algorithms quickly.

2. Molecular Discovery and Chemical Property Prediction

In drug development and cheminformatics, PyG can handle graph representations of molecules, with atoms as nodes and chemical bonds as edges. Graph neural networks trained on these representations can predict physicochemical properties, drug activity, or chemical reaction outcomes. PyG also supports 3D molecular geometries, which suits geometric deep learning tasks and makes it an important tool in computational chemistry and drug discovery.
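
To make the "atoms as nodes, bonds as edges" representation concrete, here is a minimal dependency-free sketch. The toy molecule (water), the single atomic-number feature, and the helper name `molecule_to_graph` are illustrative assumptions; a real pipeline would use richer atom and bond features and store the result in PyG tensors.

```python
# Hypothetical toy molecule: water (H2O). Atom order and the choice of
# a single atomic-number feature are assumptions for illustration only.
atoms = ["O", "H", "H"]
bonds = [(0, 1), (0, 2)]  # undirected O-H bonds as (atom_i, atom_j)

ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8}


def molecule_to_graph(atoms, bonds):
    """Return (node_features, edge_index) for an undirected molecular graph."""
    x = [[ATOMIC_NUMBER[a]] for a in atoms]  # one feature row per atom
    src, dst = [], []
    for i, j in bonds:
        # An undirected bond is stored as two directed edges, i -> j and j -> i.
        src += [i, j]
        dst += [j, i]
    return x, [src, dst]


x, edge_index = molecule_to_graph(atoms, bonds)
print(x)           # [[8], [1], [1]]
print(edge_index)  # [[0, 1, 0, 2], [1, 0, 2, 0]]
```

The two-row `edge_index` (sources on top, destinations below) mirrors the COO layout that PyG expects for edges.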

3. Large-scale Heterogeneous Graph Learning

For complex graphs with multiple node and edge types (such as knowledge graphs and social platforms), PyG provides the HeteroData class and HeteroConv layers. Paired with NeighborLoader for neighbor sampling, this makes it possible to train on graphs with millions of nodes on a single machine. Multi-GPU parallel training is also supported for industrial workloads.
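
As a rough illustration of how a heterogeneous graph can be organized, the sketch below keys node features by node type and edges by a (source type, relation, destination type) triplet, then sums incoming messages per relation and across relations. The node types, relations, feature values, and the function `hetero_sum` are hypothetical; this is plain Python, not PyG's HeteroData or HeteroConv implementation, and summing across relations is only one possible way to combine them.

```python
# Hypothetical toy heterogeneous graph: authors write papers, papers cite papers.
node_features = {
    "author": [[1.0], [3.0]],
    "paper": [[10.0], [20.0], [30.0]],
}
# Each edge type maps to [source_indices, destination_indices].
edges = {
    ("author", "writes", "paper"): [[0, 1, 1], [0, 0, 2]],
    ("paper", "cites", "paper"): [[1, 2], [0, 0]],
}


def hetero_sum(node_features, edges):
    """Sum incoming neighbor features per destination node over all relations."""
    out = {t: [[0.0] * len(f[0]) for _ in f] for t, f in node_features.items()}
    for (src_type, _relation, dst_type), (src, dst) in edges.items():
        for s, d in zip(src, dst):
            # Message from node s of src_type to node d of dst_type.
            for k, value in enumerate(node_features[src_type][s]):
                out[dst_type][d][k] += value
    return out


out = hetero_sum(node_features, edges)
print(out["paper"])   # [[54.0], [0.0], [3.0]]
print(out["author"])  # [[0.0], [0.0]]
```

Paper 0 receives messages through both relations (1 + 3 from its authors, 20 + 30 from citing papers), which is the kind of per-type bookkeeping a heterogeneous layer has to manage.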

Core Features

1. Rich GNN Layers and Message-Passing Framework

PyG includes 40+ built-in graph convolutional layers, including GCN (Graph Convolutional Network), GAT (Graph Attention Network), and GraphSAGE. By subclassing the MessagePassing base class, users can write custom message-passing layers that implement their own neighborhood aggregation strategies. The framework handles graph data structures such as COO-format edge indices and node feature matrices, and it supports sparse matrix operations as well as edge weights and edge attributes.
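
The idea behind message passing over a COO edge index can be sketched without any library. The function below performs one round of mean aggregation: each edge carries the source node's features to the destination node, and each node averages what it receives. The name `mean_aggregate` and the toy graph are assumptions; a real PyG layer would do this on tensors and usually add learnable weights.

```python
def mean_aggregate(x, edge_index):
    """One round of message passing: each node averages its in-neighbors' features."""
    src, dst = edge_index  # COO format: edge k goes from src[k] to dst[k]
    num_nodes, dim = len(x), len(x[0])
    summed = [[0.0] * dim for _ in range(num_nodes)]
    count = [0] * num_nodes
    for s, d in zip(src, dst):
        # Message step: send the source node's feature vector along edge s -> d.
        for k in range(dim):
            summed[d][k] += x[s][k]
        count[d] += 1
    # Update step: divide by in-degree; nodes with no incoming edges stay at zero.
    return [
        [v / count[i] if count[i] else 0.0 for v in summed[i]]
        for i in range(num_nodes)
    ]


x = [[1.0], [2.0], [4.0]]
edge_index = [[0, 1, 1, 2], [1, 0, 2, 1]]  # path graph 0 - 1 - 2, both directions
out = mean_aggregate(x, edge_index)
print(out)  # [[2.0], [2.5], [2.0]]
```

Swapping the mean for a sum or max, or weighting each message before aggregation, gives different layer families built on the same loop.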

2. Efficient Data Processing and Training Pipeline

PyG batches multiple graphs by concatenating them into one large disconnected graph, which corresponds to a block-diagonal adjacency matrix, so computation stays efficient without padding. DataLoader supports automatic batching and shuffling, and NeighborLoader implements neighbor sampling for large-scale graphs. For datasets too large to hold in memory, streaming data loading and distributed training can be used. Data transforms (such as adding self-loops, normalizing features, or adding positional encodings) can be chained via Compose.
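
The block-diagonal batching idea can be sketched in a few lines: shift each graph's node indices by the number of nodes already placed, concatenate the edge lists, and record which graph every node came from. The function name `collate` and the `(num_nodes, edge_index)` tuple format are assumptions for this sketch, not PyG's DataLoader internals.

```python
def collate(graphs):
    """Merge (num_nodes, edge_index) graphs into one large disconnected graph."""
    src, dst, batch = [], [], []
    offset = 0
    for graph_id, (num_nodes, (g_src, g_dst)) in enumerate(graphs):
        # Shift node indices so this graph occupies its own block.
        src += [s + offset for s in g_src]
        dst += [d + offset for d in g_dst]
        # Record the owning graph of every node, for later per-graph pooling.
        batch += [graph_id] * num_nodes
        offset += num_nodes
    return [src, dst], batch


g0 = (2, [[0, 1], [1, 0]])              # 2 nodes joined by one undirected edge
g1 = (3, [[0, 1, 1, 2], [1, 0, 2, 1]])  # 3-node path
edge_index, batch = collate([g0, g1])
print(edge_index)  # [[0, 1, 2, 3, 3, 4], [1, 0, 3, 2, 4, 3]]
print(batch)       # [0, 0, 1, 1, 1]
```

Because no edge crosses between blocks, message passing on the merged graph gives the same result as running each graph separately, and the `batch` list is enough to pool node outputs back into per-graph outputs.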

3. Model Explainability and Visualization Analysis

PyG provides tools such as GNNExplainer to identify which edges and node features matter most for a model's predictions, helping users understand how the model reaches its decisions. It supports several explanation types, including node masks and edge masks. Combined with visualization tools, these explanations can display graph structures and attention weights directly, which helps with model debugging and with building trust in both research and industrial deployment.
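
To show what an edge-importance score means, the sketch below uses a simple leave-one-edge-out (occlusion) test: remove one edge at a time and measure how much the prediction for a target node changes. This is a deliberately simpler technique than GNNExplainer, which learns soft masks by optimization. The toy `predict` function, the feature values, and the edge list are all assumptions.

```python
def predict(x, edges, target):
    """Toy 'model' (an assumption): score = sum of the target's in-neighbor features."""
    return sum(x[s] for s, d in edges if d == target)


def edge_importance(x, edges, target):
    """Score each edge by the absolute prediction change when that edge is removed."""
    base = predict(x, edges, target)
    scores = {}
    for i, edge in enumerate(edges):
        kept = edges[:i] + edges[i + 1:]  # the graph without this one edge
        scores[edge] = abs(base - predict(x, kept, target))
    return scores


x = [0.5, 2.0, 0.1, 4.0]                  # one scalar feature per node
edges = [(0, 3), (1, 3), (2, 3), (1, 2)]  # directed (source, destination) pairs
scores = edge_importance(x, edges, target=3)

# Rank edges from most to least influential for node 3.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0])  # (1, 3)
```

The edge `(1, 2)` does not feed node 3 in this one-hop toy model, so its score is zero; an edge mask expresses the same idea as a continuous weight per edge.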

Frequently Asked Questions

How do I install PyTorch Geometric?

Installing PyG is straightforward with uv or pip: uv pip install torch_geometric. For accelerated features such as sparse operations and clustering, you can additionally install extension packages such as pyg_lib, torch_scatter, and torch_sparse. These extension packages must match your installed PyTorch and CUDA versions. The official site provides precompiled wheels, and you can point pip at the matching wheel index with the -f (--find-links) option.

How large a graph can PyG handle?

PyG is designed to support graphs from small to very large scale. For graphs that fit in memory (roughly tens of thousands of nodes, depending on hardware and feature size), full-graph training works directly. For large-scale graphs (millions of nodes), neighbor-sampled training with NeighborLoader is recommended. PyG's batching mechanism is also efficient for datasets of many small graphs. If you hit memory bottlenecks, you can reduce the number of sampled neighbors, lower the batch size, or use streaming data loading.
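
The memory benefit of neighbor sampling comes from capping how many neighbors each node contributes per hop. The sketch below illustrates the idea in plain Python; the function `sample_neighborhood`, the adjacency dictionary, and the fan-out values are assumptions, and this is not NeighborLoader's actual algorithm.

```python
import random


def sample_neighborhood(adj, seeds, num_neighbors, rng):
    """Sample up to num_neighbors[h] in-neighbors per frontier node at hop h."""
    nodes = set(seeds)
    frontier = list(seeds)
    sampled_edges = []
    for fanout in num_neighbors:
        next_frontier = []
        for node in frontier:
            neighbors = adj.get(node, [])
            # Cap the fan-out so the subgraph size stays bounded.
            chosen = rng.sample(neighbors, min(fanout, len(neighbors)))
            for nb in chosen:
                sampled_edges.append((nb, node))  # message flows nb -> node
                if nb not in nodes:
                    # Simplification: only newly seen nodes are expanded next hop.
                    nodes.add(nb)
                    next_frontier.append(nb)
        frontier = next_frontier
    return nodes, sampled_edges


# Hypothetical adjacency: adj[v] lists the in-neighbors of node v.
adj = {0: [1, 2, 3, 4], 1: [5, 6], 2: [0], 3: [], 4: [7]}
rng = random.Random(0)  # fixed seed so repeated runs sample the same subgraph
nodes, sampled_edges = sample_neighborhood(adj, seeds=[0], num_neighbors=[2, 1], rng=rng)
print(len(nodes), len(sampled_edges))
```

With fan-outs `[2, 1]`, a single seed yields at most 2 + 2 * 1 = 4 sampled edges no matter how large the full graph is, which is why lowering the fan-out or the number of seeds per batch reduces memory use.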

How do I choose between PyG and other GNN libraries (such as DGL)?

Both PyG and DGL are excellent graph neural network libraries. PyG is deeply integrated with the PyTorch ecosystem, and its API follows PyTorch conventions, which makes it friendlier for users already familiar with PyTorch. DGL has historically supported multiple backends (MXNet, TensorFlow, PyTorch), though backend support varies by release, and it is more mature in distributed training. If your project is based on PyTorch and focuses mainly on algorithm research, PyG is recommended. If you need cross-framework support or large-scale industrial deployment, consider DGL.