
General Base Layers for Graph Convolutions with tensorflow.keras

Keras Graph Convolutions

A set of layers for graph convolutions in TensorFlow Keras that use RaggedTensors.

General

The kgcnn package contains several layer classes for building graph convolution models. Some models are given as examples. Documentation is generated in docs. This repo is still under construction; any comments, suggestions or help are very welcome!

Requirements

For kgcnn, the latest version of TensorFlow is usually required; for simplicity it is listed as an extra requirement in setup.py. Additional Python packages are listed in the setup.py requirements and are installed automatically.

  • tensorflow>=2.4.1
  • rdkit>=2020.03.4

Installation

Clone the repository https://github.com/aimat-lab/gcnn_keras and install in editable mode:

pip install -e ./gcnn_keras

or install the latest release via the Python Package Index:

pip install kgcnn

Documentation

Auto-generated documentation is available at https://kgcnn.readthedocs.io/en/latest/index.html.

Implementation details

Representation

The most frequent use cases for graph convolutions are node and graph classification. Regarding size, either a single large graph (e.g. a citation network) or many small, batched graphs such as molecules have to be considered. A graph can be represented by an index list of connections plus feature information. Typical quantities in tensor format used to describe a graph are listed below.

  • nodes: Node-list of shape (batch, N, F) where N is the number of nodes and F is the node feature dimension.
  • edges: Edge-list of shape (batch, M, F) where M is the number of edges and F is the edge feature dimension.
  • indices: Connection-list of shape (batch, M, 2) where M is the number of edges. The indices denote a connection of incoming node i and outgoing node j as (i, j).
  • state: Graph state information of shape (batch, F) where F denotes the feature dimension.
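
As a concrete illustration of these quantities, a single undirected triangle graph could be written with NumPy as below. This is only a sketch with made-up feature values, and the batch dimension is omitted:

```python
import numpy as np

# A single undirected triangle graph: 3 nodes, 6 directed edges
# (both directions per undirected edge, see the Input section).
nodes = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])   # (N, F) node features
indices = np.array([[0, 1], [0, 2], [1, 0],
                    [1, 2], [2, 0], [2, 1]])             # (M, 2) as (i, j), sorted by i
edges = np.ones((len(indices), 1))                       # (M, F) edge features
state = np.array([0.5])                                  # (F,) graph state

assert nodes.shape == (3, 2)
assert indices.shape == (6, 2)
assert edges.shape == (6, 1)
```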

A major issue with graphs is their flexible size and shape when using mini-batches. For a graph implementation in the spirit of Keras, the batch dimension should also be kept between layers. This is realized by using RaggedTensors.

Input

For ragged tensors, the node list of shape (batch, None, F) and edge list of shape (batch, None, F') have one ragged dimension (None,). The graph structure is represented by an index list of shape (batch, None, 2) with the index of incoming node i and outgoing node j as (i, j). The first index, of the incoming node i, is usually expected to be sorted for faster pooling operations, but can also be unsorted (see layer arguments). Furthermore, the graph is directed, so an additional edge (j, i) is required for undirected graphs. A ragged constant can be obtained directly from a list of numpy arrays via tf.ragged.constant(indices, ragged_rank=1, inner_shape=(2, )), which yields shape (batch, None, 2).
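
To make the ragged input concrete, a batch of two differently sized graphs can first be assembled as a list of NumPy arrays; tf.ragged.constant(indices, ragged_rank=1, inner_shape=(2, )) would then turn such an index list into a ragged tensor of shape (batch, None, 2). A sketch of the list structure without TensorFlow:

```python
import numpy as np

# Two graphs of different size: a 2-node graph and a 3-node graph.
nodes = [np.random.rand(2, 3), np.random.rand(3, 3)]      # per-graph (N_i, F)
indices = [np.array([[0, 1], [1, 0]]),                    # graph 1: one undirected edge
           np.array([[0, 1], [0, 2], [1, 0], [1, 2],
                     [2, 0], [2, 1]])]                    # graph 2: fully connected

# The "ragged" dimension is simply the per-graph length N_i or M_i;
# tf.ragged.constant keeps these lengths instead of padding to a maximum.
assert [n.shape[0] for n in nodes] == [2, 3]
assert all(idx.shape[1] == 2 for idx in indices)
```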

Model

Models can be set up in a functional way. An example of message passing built from fundamental operations:

import tensorflow.keras as ks
from kgcnn.layers.gather import GatherNodes
from kgcnn.layers.keras import Dense, Concatenate  # ragged support
from kgcnn.layers.pool.pooling import PoolingLocalMessages, PoolingNodes

n = ks.layers.Input(shape=(None, 3), name='node_input', dtype="float32", ragged=True)
ei = ks.layers.Input(shape=(None, 2), name='edge_index_input', dtype="int64", ragged=True)

n_in_out = GatherNodes()([n, ei])  # gather node features at both ends of each edge
node_messages = Dense(10, activation='relu')(n_in_out)  # transform into edge messages
node_updates = PoolingLocalMessages()([n, node_messages, ei])  # pool messages per receiving node
n_node_updates = Concatenate(axis=-1)([n, node_updates])  # combine with previous node features
n_embedd = Dense(1)(n_node_updates)
g_embedd = PoolingNodes()(n_embedd)  # aggregate node embeddings into a graph embedding

message_passing = ks.models.Model(inputs=[n, ei], outputs=g_embedd)
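
To make explicit what these layers compute, the same message-passing step can be traced for a single, unbatched graph in plain NumPy. This is only an illustrative sketch with random weights standing in for the trained Dense layers; the names in the comments refer to the kgcnn layers above:

```python
import numpy as np

rng = np.random.default_rng(0)
nodes = rng.random((4, 3))                       # (N, 3) node features
indices = np.array([[0, 1], [1, 0], [1, 2],
                    [2, 1], [2, 3], [3, 2]])     # (M, 2) edges as (i, j)

# GatherNodes: features of node i and node j per edge, concatenated.
n_in_out = np.concatenate([nodes[indices[:, 0]],
                           nodes[indices[:, 1]]], axis=-1)   # (M, 6)

# Dense(10, activation='relu') stand-in with a random weight matrix.
w1 = rng.random((6, 10))
node_messages = np.maximum(n_in_out @ w1, 0.0)               # (M, 10)

# PoolingLocalMessages: sum messages at the receiving node i (scatter-add).
node_updates = np.zeros((nodes.shape[0], 10))
np.add.at(node_updates, indices[:, 0], node_messages)

# Concatenate with old node features, project, and read out with PoolingNodes.
n_node_updates = np.concatenate([nodes, node_updates], axis=-1)  # (N, 13)
w2 = rng.random((13, 1))
n_embedd = n_node_updates @ w2                               # (N, 1)
g_embedd = n_embedd.sum(axis=0)                              # (1,) graph embedding
```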

or via sub-classing the message-passing base layer, where only message_function and update_nodes have to be implemented:

from kgcnn.layers.conv.message import MessagePassingBase
from kgcnn.layers.keras import Dense, Add

class MyMessageNN(MessagePassingBase):

    def __init__(self, units, **kwargs):
        super(MyMessageNN, self).__init__(**kwargs)
        self.dense = Dense(units)
        self.add = Add()

    def message_function(self, inputs, **kwargs):
        n_in, n_out, edges = inputs
        return self.dense(n_out)  # build messages from the sending-node features

    def update_nodes(self, inputs, **kwargs):
        nodes, nodes_update = inputs
        return self.add([nodes, nodes_update])  # residual-style node update
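
The division of labor behind this base layer can be mimicked in plain Python: gather node features per edge, delegate to message_function, pool the messages, and delegate to update_nodes. The SimpleMessagePassing class below is a hypothetical, NumPy-only stand-in for a single graph to show the pattern, not kgcnn's actual MessagePassingBase implementation:

```python
import numpy as np

class SimpleMessagePassing:
    """Hypothetical, simplified message-passing template for a single graph."""

    def __call__(self, nodes, edges, indices):
        n_in = nodes[indices[:, 0]]                  # receiving-node features per edge
        n_out = nodes[indices[:, 1]]                 # sending-node features per edge
        messages = self.message_function([n_in, n_out, edges])
        pooled = np.zeros((nodes.shape[0], messages.shape[1]))
        np.add.at(pooled, indices[:, 0], messages)   # sum messages per receiving node
        return self.update_nodes([nodes, pooled])

class MyMessage(SimpleMessagePassing):
    def message_function(self, inputs):
        n_in, n_out, edges = inputs
        return n_out                                 # pass sender features unchanged

    def update_nodes(self, inputs):
        nodes, nodes_update = inputs
        return nodes + nodes_update                  # residual-style update

nodes = np.eye(3)                                    # 3 nodes with one-hot features
indices = np.array([[0, 1], [1, 0], [1, 2], [2, 1]])
edges = np.ones((4, 1))
out = MyMessage()(nodes, edges, indices)
assert out.shape == nodes.shape
```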

Literature

Versions of several models from the literature are implemented in kgcnn.

Datasets

In data.datasets there are graph learning datasets. They are downloaded from sources such as TUDatasets or MoleculeNet, or defined freely using the class definitions in data. For the simple case that the dataset fits into memory, the base class is defined as:

class MemoryGraphDataset:

    def __init__(self):
        # Node-level properties
        self.node_attributes = None
        self.node_labels = None

        # Edge-level properties
        self.edge_indices = None
        self.edge_attributes = None
        self.edge_labels = None

        # Graph-level properties
        self.graph_labels = None
        self.graph_attributes = None

or its extension, which adds geometric information to the graph structure:

from kgcnn.data.base import MemoryGraphDataset

class MemoryGeometricGraphDataset(MemoryGraphDataset):

    def __init__(self, **kwargs):
        super(MemoryGeometricGraphDataset, self).__init__(**kwargs)
        # Spatial node positions
        self.node_coordinates = None

        # Geometric "range" connections (e.g. node pairs within a distance cutoff)
        self.range_indices = None
        self.range_attributes = None
        self.range_labels = None

        # Angle information between range connections
        self.angle_indices = None
        self.angle_labels = None
        self.angle_attributes = None

Each property holds an iterable object (e.g. a list or array) with the length of the dataset.
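
A hypothetical toy dataset of two graphs could therefore be assembled as plain lists, one entry per graph, before being assigned to the corresponding dataset properties. All values below are made up:

```python
import numpy as np

# Two toy graphs: a 2-node graph and a 3-node graph.
node_attributes = [np.random.rand(2, 4), np.random.rand(3, 4)]   # per-graph (N_i, F)
edge_indices = [np.array([[0, 1], [1, 0]]),
                np.array([[0, 1], [1, 0], [1, 2], [2, 1]])]      # per-graph (M_i, 2)
edge_attributes = [np.ones((2, 1)), np.ones((4, 1))]             # per-graph (M_i, F)
graph_labels = [np.array([0]), np.array([1])]                    # one label per graph

# All properties must agree on the length of the dataset.
assert len(node_attributes) == len(edge_indices) == len(graph_labels) == 2
```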

Examples

A set of example training scripts can be found in training.

Issues

Some known issues to be aware of when using kgcnn or building new models or layers with it:

  • A RaggedTensor cannot yet be used as a Keras model output (https://github.com/tensorflow/tensorflow/issues/42320), which means only padded tensors can be used for batched node-embedding tasks.
  • Using RaggedTensors of arbitrary ragged rank outside of kgcnn.layers.keras can cause a significant performance decrease.

Citing

If you want to cite this repo, refer to our paper:

@article{REISER2021100095,
title = {Graph neural networks in TensorFlow-Keras with RaggedTensor representation (kgcnn)},
journal = {Software Impacts},
pages = {100095},
year = {2021},
issn = {2665-9638},
doi = {10.1016/j.simpa.2021.100095},
url = {https://www.sciencedirect.com/science/article/pii/S266596382100035X},
author = {Patrick Reiser and Andre Eberhard and Pascal Friederich}
}
