Graph ID

A universal identifier system for atomistic structures


What is Graph ID?

Graph ID generates unique, deterministic identifiers for atomistic structures, including crystals and molecules. It converts each atomic structure into a graph representation and computes a hash-based identifier that captures both topology and composition.

Quick Example

from pymatgen.core import Structure, Lattice
from graph_id import GraphIDMaker

# Create a structure (NaCl)
structure = Structure.from_spacegroup(
    "Fm-3m",
    Lattice.cubic(5.692),
    ["Na", "Cl"],
    [[0, 0, 0], [0.5, 0.5, 0.5]]
)

# Generate Graph ID
maker = GraphIDMaker()
graph_id = maker.get_id(structure)
print(graph_id)  # Output: NaCl-88c8e156db1b0fd9

How It Works

Graph ID operates through a multi-step process:

flowchart LR
    A[Atomic Structure] --> B[Graph Representation]
    B --> C[Local Environment Analysis]
    C --> D[Compositional Sequences]
    D --> E[Hash-based ID]

  1. Graph Construction: Convert atomic structures into graph representations where atoms are nodes and bonds are edges
  2. Environment Analysis: Analyze the local chemical environment around each atom using compositional sequences
  3. Hash Generation: Compute a deterministic hash-based identifier that captures both topology and composition
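The three steps above can be illustrated with a deliberately simplified sketch. This is not the Graph ID implementation: the function names `compositional_sequence` and `toy_graph_id`, the shell depth, and the use of SHA-256 are assumptions made for illustration, and the toy works on a finite (molecular) graph and ignores periodic boundaries. It only shows how neighbor-shell compositions can be combined into an identifier that does not depend on atom ordering.

```python
import hashlib
from collections import Counter


def compositional_sequence(labels, adjacency, start, depth=3):
    """Describe the element counts in successive neighbor shells around `start`.

    Hypothetical helper for illustration; not part of the graph_id package.
    """
    visited = {start}
    frontier = [start]
    shells = []
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for neighbor in adjacency[node]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    next_frontier.append(neighbor)
        counts = Counter(labels[n] for n in next_frontier)
        # Sort elements so the shell string does not depend on traversal order.
        shells.append("".join(f"{el}{counts[el]}" for el in sorted(counts)))
        frontier = next_frontier
    return f"{labels[start]}:" + "-".join(shells)


def toy_graph_id(labels, adjacency, depth=3):
    """Hash the sorted per-atom sequences into a short hexadecimal identifier."""
    sequences = sorted(
        compositional_sequence(labels, adjacency, i, depth)
        for i in range(len(labels))
    )
    return hashlib.sha256("|".join(sequences).encode()).hexdigest()[:16]


# Water written with two different atom orderings.
water_a = (["O", "H", "H"], {0: [1, 2], 1: [0], 2: [0]})
water_b = (["H", "O", "H"], {0: [1], 1: [0, 2], 2: [1]})

# Same composition (C2O), different connectivity: C-C-O versus C-O-C.
chain_cco = (["C", "C", "O"], {0: [1], 1: [0, 2], 2: [1]})
chain_coc = (["C", "O", "C"], {0: [1], 1: [0, 2], 2: [1]})

print(toy_graph_id(*water_a) == toy_graph_id(*water_b))      # same graph, same ID
print(toy_graph_id(*chain_cco) == toy_graph_id(*chain_coc))  # different topology
```

Sorting both the elements within each shell and the per-atom sequences is what makes the toy identifier independent of how the atoms are numbered.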

Applications

Graph ID is particularly useful for:

  • Materials Databases: Efficient indexing and deduplication of structure databases
  • High-throughput Screening: Rapid identification of unique structures in computational workflows
  • Polymorph Identification: Distinguishing between different polymorphs of the same composition
  • Machine Learning: Feature engineering for materials property prediction
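As a rough sketch of the deduplication use case, the helpers below group and filter items by an identifier function. The names `group_by_id` and `deduplicate` are hypothetical and not part of the graph_id package; the identifier function is passed in as `get_id`, so the example runs with a stand-in here.

```python
from collections import defaultdict


def group_by_id(items, get_id):
    """Map each identifier to the list of indices of items that share it."""
    groups = defaultdict(list)
    for index, item in enumerate(items):
        groups[get_id(item)].append(index)
    return dict(groups)


def deduplicate(items, get_id):
    """Keep the first item seen for each identifier, preserving input order."""
    seen = set()
    unique = []
    for item in items:
        key = get_id(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


# Stand-in identifier for demonstration only: case-insensitive string match.
entries = ["NaCl", "nacl", "MgO", "NACL", "mgo"]
print(group_by_id(entries, str.lower))
print(deduplicate(entries, str.lower))
```

With real structures, the `get_id` argument would be the `get_id` method of a `GraphIDMaker` instance, as in the quick example, and `items` would be a list of pymatgen `Structure` objects.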

Web Service

Try it online

You can search for materials by Graph ID at matfinder.net.

Next Steps