pygraphistry-gfql
PyGraphistry GFQL
Doc routing (local + canonical)
- First route with ../pygraphistry/references/pygraphistry-readthedocs-toc.md.
- Use ../pygraphistry/references/pygraphistry-readthedocs-top-level.tsv for section-level shortcuts.
- Only scan ../pygraphistry/references/pygraphistry-readthedocs-sitemap.xml when a needed page is missing.
- Use one batched discovery read before deep-page reads; avoid cat * and serial micro-reads.
- In user-facing answers, prefer canonical https://pygraphistry.readthedocs.io/en/latest/... links.
Two syntaxes, one entrypoint
g.gfql() accepts both chain-list (Python AST objects) and Cypher strings. It auto-detects the language from the argument type:
# Chain-list syntax (Python AST objects)
g2 = g.gfql([n({'type': 'person'}), e_forward(), n()])
# Cypher string syntax (auto-detected)
g2 = g.gfql("MATCH (p:Person)-[r:KNOWS]->(q:Person) RETURN p.name, q.name")
# Explicit language parameter (optional)
g2 = g.gfql(query_string, language="cypher")
When to use which:
- Chain-list: Programmatic composition, dynamic parameterization, when building queries from code
- Cypher: Readability, familiarity for Cypher users, complex pattern matching with RETURN/ORDER BY/LIMIT
Quick start — chain-list
from graphistry import n, e_forward
g2 = g.gfql([
    n({'type': 'person'}),
    e_forward({'relation': 'transfers_to'}, min_hops=1, max_hops=3),
    n({'risk': True})
])
Quick start — Cypher
# Simple pattern match
g2 = g.gfql("MATCH (p:Person)-[r:KNOWS]->(q:Person) WHERE p.age > 30 RETURN p.name, q.name")
# Variable-length paths
g2 = g.gfql("MATCH (a:Account)-[*1..3]->(m:Merchant) RETURN a, m")
# Parameterized queries
g2 = g.gfql(
    "MATCH (n) WHERE n.score > $cutoff RETURN n.id, n.score ORDER BY n.score DESC LIMIT $top_n",
    params={"cutoff": 50, "top_n": 10}
)
# Relationship type alternation
g2 = g.gfql("MATCH (a:Person)-[:KNOWS|COLLABORATES_WITH]->(b:Person) RETURN a.name, b.name")
Cypher node labels and DataFrame columns
GFQL Cypher maps :Label to boolean columns label__<Label>, not string columns. Prefer property filters (simpler, works with any column):
# Recommended: property filter (works with any string/numeric column)
g2 = g.gfql("MATCH (p) WHERE p.type = 'Person' AND p.age > 30 RETURN p.name")
# Alternative: pre-create boolean label columns for Cypher :Label syntax
nodes['label__Person'] = nodes['type'] == 'Person'
g = graphistry.edges(edges, 'src', 'dst').nodes(nodes, 'id')
g2 = g.gfql("MATCH (p:Person) WHERE p.age > 30 RETURN p.name")
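The label__<Label> naming can be applied to every distinct value of a type column at once. The sketch below is a minimal, standard-library illustration of that mapping; the helper name label_columns is hypothetical and is not part of PyGraphistry, and it assumes the label text should match the type value exactly (including case).

```python
def label_columns(type_values):
    """Build boolean label columns from a list of per-node type values.

    Hypothetical helper (not a PyGraphistry API): returns a dict mapping
    'label__<Type>' to a list of booleans, one flag per node.
    """
    labels = sorted(set(type_values))
    return {
        f"label__{label}": [value == label for value in type_values]
        for label in labels
    }


# Example node types, in node order
node_types = ["Person", "Account", "Person"]
columns = label_columns(node_types)

# With a pandas DataFrame, each entry could then be assigned as
# nodes[name] = flags before binding the nodes to the graph.
for name, flags in columns.items():
    print(name, flags)
```

Generating the columns in one pass keeps the :Label syntax usable for every type without hand-writing one assignment per label.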
Supported Cypher clauses
- Full: MATCH, WHERE, RETURN, WITH, ORDER BY, SKIP, LIMIT, DISTINCT, CALL graphistry.*, GRAPH {}, USE
- Partial: OPTIONAL MATCH (bounded subset), UNWIND (top-level), UNION/UNION ALL (direct g.gfql() only)
- Not supported: CREATE, MERGE, DELETE, SET, REMOVE (GFQL is read-only)
Cypher functions
- Scalar: labels(), type(), keys(), properties(), abs(), sqrt(), coalesce(), substring(), tointeger(), tofloat(), toboolean(), tostring()
- Aggregation: count(), sum(), min(), max(), avg(), collect(), count(DISTINCT ...)
- Operators: =, <>, <, <=, >, >=, IN, STARTS WITH, ENDS WITH, CONTAINS, IS NULL, IS NOT NULL, AND, OR, NOT
GRAPH constructor (Cypher extension)
# Extract subgraph as a graph object (not a table)
subgraph = g.gfql("GRAPH { MATCH (a)-[r]->(b) WHERE a.risk_score > 7 }")
# Multi-stage pipeline with named GRAPH bindings and USE
result = g.gfql("""
GRAPH g1 = GRAPH { MATCH (a)-[r]->(b) WHERE a.event_count > 100 }
GRAPH g2 = GRAPH { USE g1 CALL graphistry.degree.write() }
USE g2 MATCH (n) RETURN n.id, n.degree ORDER BY n.degree DESC LIMIT 10
""")
Let/DAG bindings
from graphistry import n, e_forward, let, ref
# Named bindings forming a DAG
result = g.gfql(let({
    'high_risk': n({'risk_score': {'$gt': 0.8}}),
    'neighborhoods': ref('high_risk', [e_forward(max_hops=2), n()])
}))
# Select specific binding output
result = g.gfql(let({...}), output='neighborhoods')
# Multi-stage DAG: sequential refs build on each other
result = g.gfql(let({
    'people': n({'type': 'person'}),
    'contacts': ref('people', [e_forward({'rel': 'contacts'}), n()]),
    'owned': ref('contacts', [e_forward({'rel': 'owns'}), n()])
}), output='owned')
# Nested let: inner DAGs execute as opaque units for parallel-friendly pipelines
result = g.gfql(let({
    'social': let({
        'people': n({'type': 'person'}),
        'friends': ref('people', [e_forward({'rel': 'knows'}), n()]),
    }),
    'infra': let({
        'servers': n({'type': 'server'}),
        'traffic': ref('servers', [e_forward({'rel': 'serves'}), n()]),
    }),
    'combined': ref('social', [e_forward(), n()])
}), output='combined')
# Let + degree computation + visual encoding
from graphistry import n, e_forward, let, ref, call
result = g.gfql(let({
    'seeds': n({'risk_flag': True}),
    'neighborhood': ref('seeds', [e_forward(max_hops=2), n()]),
}))
# Then compute degrees and encode color
result = result.get_degrees().encode_point_color('degree', as_continuous=True)
- Independent bindings operate on the root graph
- ref() bindings operate on the referenced binding's output
- Nested let scope rules (requires pygraphistry >= 0.53.7):
  - Inner bindings do NOT leak to outer scope
  - Inner bindings CAN read outer bindings (lexical closure)
  - Sibling nested lets may reuse names without collision
  - Each nested let is an opaque execution unit (parallel-friendly)
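The scope rules above can be pictured with a toy evaluator. This is only a mental model in plain Python, not how PyGraphistry executes let(): the function run_let, the tuple encoding of refs, and the string "results" are all hypothetical. It also assumes that a nested let yields the result of its last binding, which the rules above do not state.

```python
def run_let(bindings, outer_scope=None):
    """Evaluate a toy let-block and return only its own (local) results.

    Toy encoding (hypothetical, not the graphistry API):
      - str                  -> independent binding, evaluated on the root graph
      - ("ref", name, step)  -> binding that extends the result of `name`
      - dict                 -> nested let, evaluated as one opaque unit
    """
    # Copy the outer scope: inner code can read it, but writes stay local.
    visible = dict(outer_scope or {})
    local = {}
    for name, spec in bindings.items():
        if isinstance(spec, dict):
            inner = run_let(spec, visible)
            # ASSUMPTION: a nested let yields its last binding's result.
            result = list(inner.values())[-1]
        elif isinstance(spec, tuple):
            _, target, step = spec
            result = f"{visible[target]} > {step}"
        else:
            result = f"root > {spec}"
        visible[name] = result
        local[name] = result
    return local


pipeline = {
    "seeds": "risky",
    "social": {
        "people": "person",
        "friends": ("ref", "people", "knows"),
    },
    "infra": {
        "people": "server",                        # reuses a sibling's name
        "near_seeds": ("ref", "seeds", "serves"),  # reads an outer binding
    },
    "combined": ("ref", "social", "any"),
}
results = run_let(pipeline)
print(results)
```

In this model the outer results contain seeds, social, infra, and combined, but not people, friends, or near_seeds, mirroring the rule that inner bindings do not leak.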
Targeted patterns (high signal)
# Edge query filtering
g2 = g.gfql([n(), e_forward(edge_query="type == 'replied_to' and submolt == 'X'"), n()])
# Same-path constraints with where + compare/col
from graphistry import col, compare
g2 = g.gfql(
    [n(name='a'), e_forward(name='e'), n(name='b')],
    where=[compare(col('a', 'owner_id'), '==', col('b', 'owner_id'))]
)
# Traverse 2-4 hops but only return hops 3-4
g2 = g.gfql([e_forward(min_hops=2, max_hops=4, output_min_hops=3, output_max_hops=4)])
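The difference between how far a traversal walks and which hops it returns can be sketched with a plain breadth-first walk. This is a simplified illustration, not the GFQL engine: the functions edges_by_hop and slice_hops are hypothetical, edges are plain (src, dst) tuples, and only max_hops and the output slice are modeled (the min_hops path-length constraint is left out).

```python
from collections import defaultdict


def edges_by_hop(edges, seeds, max_hops):
    """Group forward edges by the hop (1-indexed) at which they are first walked."""
    out_edges = defaultdict(list)
    for src, dst in edges:
        out_edges[src].append(dst)

    by_hop = {}
    seen = set()
    frontier = set(seeds)
    for hop in range(1, max_hops + 1):
        hop_edges = set()
        next_frontier = set()
        for src in frontier:
            for dst in out_edges[src]:
                if (src, dst) not in seen:
                    hop_edges.add((src, dst))
                    next_frontier.add(dst)
        seen |= hop_edges
        by_hop[hop] = hop_edges
        frontier = next_frontier
    return by_hop


def slice_hops(by_hop, output_min_hops, output_max_hops):
    """Keep only the edges walked within the requested output hop range."""
    kept = set()
    for hop in range(output_min_hops, output_max_hops + 1):
        kept |= by_hop.get(hop, set())
    return kept


# A simple chain: a -> b -> c -> d -> e
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
traversed = edges_by_hop(edges, seeds={"a"}, max_hops=4)
returned = slice_hops(traversed, output_min_hops=3, output_max_hops=4)
print(sorted(returned))
```

The walk still has to pass through hops 1 and 2 to reach the later edges; the output slice only controls what is kept afterwards.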
Edge direction variants
- e_forward(): source-to-destination
- e_reverse(): destination-to-source
- e_undirected(): both directions
- e(): alias for any direction
High-value patterns
- g.gfql() is the unified entrypoint: pass chain-lists OR Cypher strings.
- NEVER use .chain() or .hop(); they are deprecated and emit warnings. Always use g.gfql([...]) for chain-list syntax or g.gfql("MATCH ...") for Cypher.
- When user explicitly asks for GFQL, final snippets must include explicit .gfql(...).
- When the task says remote execution/dataset, use gfql_remote(...).
- Use name= labels for intermediate matches when you need constraints.
- Use where=[...] for cross-step/path constraints.
- Use min_hops/max_hops and output_min_hops/output_max_hops for traversal vs returned slice.
- Use predicates (is_in, numeric/date predicates) for concise filtering.
- Use engine='auto' by default; force cudf/pandas only when needed.
Remote mode
# Remote with chain-list
rg = graphistry.bind(dataset_id='my-dataset')
res = rg.gfql_remote([n(), e_forward(), n()], engine='auto')
# Remote with Cypher string
res = rg.gfql_remote("MATCH (n:Person)-[r]->(m) WHERE n.risk_level = 'critical' RETURN n, r, m")
# Remote with Let/DAG
res = rg.gfql_remote(let({...}))
# Remote slim payload (only required columns)
res = rg.gfql_remote([n(), e_forward(), n()], output_type='nodes', node_col_subset=['node_id', 'time'])
# Post-process on remote side when you want trimmed transfer payloads
res = rg.python_remote_table(lambda g: g._edges[['src', 'dst']].head(1000))
Validation and safety
- Validate user-derived query fragments before execution.
- Normalize datetime columns before temporal predicates.
- Prefer small column subsets for remote result transfer.
- Preflight Cypher:
from graphistry.compute.gfql.cypher import parse_cypher, compile_cypher
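One way to validate user-derived fragments is to allowlist anything that must be spliced into the query text (such as a column name) and pass values through params instead. The sketch below uses only the standard library; safe_identifier and build_score_query are hypothetical helpers, not PyGraphistry functions, and the query shape is borrowed from the parameterized example earlier in this document.

```python
import re

# Conservative identifier shape: letters, digits, underscore; no leading digit.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_identifier(name, allowed):
    """Return name only if it is a plain identifier and explicitly allowlisted."""
    if not _IDENTIFIER.fullmatch(name) or name not in allowed:
        raise ValueError(f"column not allowed: {name!r}")
    return name


def build_score_query(column, allowed_columns):
    """Build a Cypher string for a user-chosen column; values stay in $params."""
    col = safe_identifier(column, allowed_columns)
    return (
        f"MATCH (n) WHERE n.{col} > $cutoff "
        f"RETURN n.id, n.{col} ORDER BY n.{col} DESC LIMIT $top_n"
    )


query = build_score_query("score", allowed_columns={"score", "age"})
print(query)
# The result would then be passed as:
#   g.gfql(query, params={"cutoff": 50, "top_n": 10})
```

Keeping user-supplied values in params and restricting spliced identifiers to a known set means untrusted text never becomes part of the query structure.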
Canonical docs
- GFQL index: https://pygraphistry.readthedocs.io/en/latest/gfql/index.html
- GFQL overview: https://pygraphistry.readthedocs.io/en/latest/gfql/overview.html
- GFQL quick reference: https://pygraphistry.readthedocs.io/en/latest/gfql/quick.html
- Predicate quick reference: https://pygraphistry.readthedocs.io/en/latest/gfql/predicates/quick.html
- GFQL remote mode: https://pygraphistry.readthedocs.io/en/latest/gfql/remote.html
- GFQL validation: https://pygraphistry.readthedocs.io/en/latest/gfql/validation/index.html
- GFQL + loaders/AI patterns: https://pygraphistry.readthedocs.io/en/latest/gfql/combo.html
- Cypher syntax guide: https://pygraphistry.readthedocs.io/en/latest/gfql/cypher.html
- Cypher-GFQL mapping: https://pygraphistry.readthedocs.io/en/latest/gfql/spec/cypher_mapping.html