PyGraphistry AI

Doc routing (local + canonical)

  • First route with ../pygraphistry/references/pygraphistry-readthedocs-toc.md.
  • Use ../pygraphistry/references/pygraphistry-readthedocs-top-level.tsv for section-level shortcuts.
  • Only scan ../pygraphistry/references/pygraphistry-readthedocs-sitemap.xml when a needed page is missing.
  • Use one batched discovery read before deep-page reads; avoid cat * and serial micro-reads.
  • In user-facing answers, prefer canonical https://pygraphistry.readthedocs.io/en/latest/... links.
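The sitemap fallback above can be sketched as a small keyword lookup. This is a minimal illustration, not part of the skill: it assumes the sitemap file follows the standard sitemaps.org 0.9 schema, `find_pages` is a hypothetical helper name, and the two sample URLs are made up for the demo.

```python
import xml.etree.ElementTree as ET

# Standard sitemap namespace (sitemaps.org protocol 0.9); assumed, not confirmed for this file.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def find_pages(sitemap_xml, keyword):
    """Return sitemap URLs that contain `keyword` (case-insensitive)."""
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [u for u in urls if keyword.lower() in u.lower()]


# Hypothetical two-entry sitemap; the page names are invented for illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://pygraphistry.readthedocs.io/en/latest/example-umap-page.html</loc></url>
  <url><loc>https://pygraphistry.readthedocs.io/en/latest/example-other-page.html</loc></url>
</urlset>"""

print(find_pages(SAMPLE, "umap"))
```

Running one lookup like this after the TOC and TSV miss keeps the sitemap scan to a single batched read.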

Typical workflow

  1. Build graph from nodes/edges.
  2. Run a feature/embedding method (umap, embed, optionally followed by dbscan).
  3. Inspect derived columns/features and visualize.
  4. Iterate on feature columns and sampling strategy.
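The four steps above can be sketched as one function. This is a rough sketch under stated assumptions: the column names 'src', 'dst', and 'id' are placeholders for your own schema, `new_columns` and `run_workflow` are hypothetical helper names, and `graphistry` is imported lazily so the comparison helper can be used without it.

```python
def new_columns(before, after):
    """Columns present in `after` but not in `before`, kept in `after` order."""
    seen = set(before)
    return [c for c in after if c not in seen]


def run_workflow(nodes_df, edges_df, feature_cols):
    """Build, embed, cluster, and report what the run added to the node table."""
    import graphistry  # lazy import; needs pygraphistry with its AI extras installed

    # 1. Build the graph from edges and nodes (column names are assumptions).
    g = graphistry.edges(edges_df, "src", "dst").nodes(nodes_df, "id")

    # 2. Embed on an explicit feature list, then cluster.
    g2 = g.umap(X=feature_cols).dbscan()

    # 3. Inspect which columns the run added to the node table.
    print("derived columns:", new_columns(nodes_df.columns, g2._nodes.columns))

    # 4. Return the graph so the caller can iterate with a new feature list.
    return g2


# The comparison helper works on plain lists as well as DataFrame columns.
print(new_columns(["id", "f1"], ["id", "f1", "x", "y"]))
```

Calling `g2.plot()` on the result requires a prior `graphistry.register(...)` against a Graphistry server.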

Baseline examples

import graphistry  # call graphistry.register(...) first if you plan to .plot()

# df, df_train, df_batch, and df_new are your own node DataFrames with an 'id' column.

# Similarity embedding / projection
g2 = graphistry.nodes(df, 'id').umap(X=['f1', 'f2', 'f3'])
g2.plot()

# Fit/transform flow for consistent projection on new batches
g_train = graphistry.nodes(df_train, 'id').umap(X=['f1', 'f2'])
g_batch = g_train.transform_umap(df_batch, return_graph=True)
g_batch.plot()

# Semantic search over embedded features
g2 = graphistry.nodes(df, 'id').umap(X=['text_col'])
results_df, query_vector = g2.search('suspicious login pattern')

# Text-first workflow: featurize, then search/cluster
g2 = graphistry.nodes(df, 'id').featurize(kind='nodes', X=['title', 'body']).umap(kind='nodes').dbscan()
hits, qv = g2.search('credential stuffing campaign')

# Precomputed embedding columns: reuse them instead of recomputing
embedding_cols = [c for c in df.columns if c.startswith('emb_')]
g2 = graphistry.nodes(df, 'id').umap(X=embedding_cols)
g_new = g2.transform_umap(df_new, return_graph=True)

Practical guardrails

  • Start with small/representative samples before full runs.
  • Keep explicit feature lists (X=...) for reproducibility.
  • Track engine/dataframe type for CPU vs GPU behavior.
  • For anomaly workflows, document thresholds and false-positive assumptions.
  • For graph ML tasks, route deeper model workflows to RGCN/link-prediction references.
  • For text workflows, prefer featurize(...).umap(...).search(...) when queries are natural language.
  • If users already have embeddings, reuse them via explicit embedding column lists (X=[...]) before recomputing.
  • When a user asks for a concise workflow snippet, prefer one short code block and avoid long narrative wrappers.
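Several of the guardrails above (sample first, keep explicit feature lists, reuse existing embeddings, track the engine) can be combined in a short sketch. The helper names `pick_features` and `embed_sample`, the 'emb_' prefix, and the 'id' column are illustrative assumptions. The `engine` keyword is passed through to `umap` on the understanding that it accepts values such as 'auto'; check this against the installed version.

```python
def pick_features(columns, explicit=None, emb_prefix="emb_"):
    """Prefer precomputed embedding columns; otherwise require an explicit list."""
    cols = list(columns)
    emb = [c for c in cols if str(c).startswith(emb_prefix)]
    if emb:
        return emb
    if explicit:
        missing = [c for c in explicit if c not in cols]
        if missing:
            raise KeyError(f"feature columns not found: {missing}")
        return list(explicit)
    raise ValueError("no embedding columns found and no explicit feature list given")


def embed_sample(df, explicit=None, sample_n=1000, seed=0, engine="auto"):
    """Embed a reproducible sample of `df` and log the settings used."""
    import graphistry  # lazy import; needs pygraphistry with its AI extras installed

    X = pick_features(df.columns, explicit)
    sample = df.sample(n=min(sample_n, len(df)), random_state=seed)
    # Record the engine and dataframe type so CPU vs GPU runs can be told apart.
    print(f"rows={len(sample)} engine={engine} frame={type(df).__module__} X={X}")
    return graphistry.nodes(sample, "id").umap(X=X, engine=engine)


print(pick_features(["id", "emb_0", "emb_1", "f1"], explicit=["f1"]))
```

Fixing `seed` and logging `X` keeps a small trial run comparable with the later full run.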

Canonical docs
