Tech Stack & Ecosystem

The technologies powering Graphify and how it fits into the broader AI coding landscape.

Core Technologies

Language

Python 3.10+

Core language. Chosen for ecosystem compatibility with tree-sitter, NetworkX, and AI model SDKs.

Graph Engine

NetworkX

In-memory graph data structure and algorithms. No database needed — the graph serializes to JSON.

Code Parsing

tree-sitter

Parser generator for concrete syntax trees. 19 languages. Deterministic, fast, local — no LLM calls.
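
To make the "deterministic, local" point concrete, here is a minimal sketch of structural extraction with no model calls. It uses Python's built-in `ast` module as a stand-in for tree-sitter (an assumption, so the example runs without extra packages). The `extract_edges` helper and the edge tuple shape are hypothetical, not Graphify's actual API.

```python
import ast

SOURCE = '''
import json
from pathlib import Path

def load(path):
    return json.loads(Path(path).read_text())

def main():
    data = load("graph.json")
    print(len(data))
'''

def extract_edges(source: str):
    """Return (source, relation, target) tuples found by walking the syntax tree."""
    tree = ast.parse(source)
    edges = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import json" -> one edge per imported module
            edges.extend(("<module>", "imports", alias.name) for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # "from pathlib import Path" -> edge to the module being imported from
            edges.append(("<module>", "imports", node.module or ""))
        elif isinstance(node, ast.FunctionDef):
            # Record direct calls by bare name; attribute calls such as
            # json.loads(...) are skipped to keep the sketch short.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    edges.append((node.name, "calls", inner.func.id))
    return edges

edges = extract_edges(SOURCE)
for edge in sorted(edges):
    print(edge)
```

Running the function twice on the same source yields the same edges, which is the property that separates a parser-based pass from an LLM-based one.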

Visualization

vis.js

Interactive force-directed graph layout in the browser. Search, filter, click-to-explore.

Clustering

Leiden (graspologic)

Community detection algorithm. Finds clusters by edge density — no embeddings or vector DB.
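
As an illustration of what clustering "by edge density" means, the sketch below scores two partitions of a toy graph with modularity, a quality function that algorithms in the Leiden family can optimize. This is not the Leiden algorithm itself, and the graph, node names, and `modularity` helper are made up for the example. It only shows that a partition can be judged from edges alone, with no embeddings.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in the same community
    degree_sum = defaultdict(int)  # summed degree of each community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Two triangles joined by a single bridge edge (c, d).
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}

print(round(modularity(EDGES, split), 4))   # densely connected groups score higher
print(round(modularity(EDGES, merged), 4))  # one big community scores 0
```

The split along the bridge scores 5/14, while lumping everything together scores 0, so the denser grouping wins purely on graph topology.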

Semantic Extraction

Claude / GPT-4

Parallel subagents extract concepts from docs, papers, and images. Only for non-code content.

Optional Dependencies

Extra	Package(s)	Purpose
`[mcp]`	mcp	MCP stdio server for agent queries
`[neo4j]`	neo4j	Direct push to Neo4j instance
`[pdf]`	pypdf, html2text	PDF text extraction
`[watch]`	watchdog	File system monitoring for auto-rebuild
`[leiden]`	graspologic	Leiden community detection (preferred over Louvain)
`[office]`	python-docx, openpyxl	.docx and .xlsx support
`[all]`	all of the above	Install everything

pip install "graphifyy[all]"   # install with all extras (quotes keep shells such as zsh from expanding the brackets)
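
Code that relies on an optional extra usually has to check whether the package is present. The sketch below shows one common way to do that with the standard library. The helper names and the Louvain fallback rule are assumptions for illustration; how Graphify itself selects a backend is not shown in this section.

```python
from importlib.util import find_spec

def has_module(name: str) -> bool:
    """True if a top-level module is importable; the module is not imported."""
    return find_spec(name) is not None

def pick_clustering_backend() -> str:
    # Assumption: use Leiden when graspologic is installed, otherwise Louvain.
    return "leiden" if has_module("graspologic") else "louvain"

print(pick_clustering_backend())
```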

Platform Integration

Claude Code

Deepest integration. Uses CLAUDE.md plus a PreToolUse hook that fires before Glob/Grep calls, surfacing graph context automatically.

Codex

Trigger: $graphify. Uses AGENTS.md for always-on instructions. Needs multi_agent = true for parallel extraction.

OpenCode

Trigger: /graphify. Uses AGENTS.md. Standard skill installation.

OpenClaw

Trigger: /graphify. Uses AGENTS.md. Sequential extraction only (parallel agent support is still early-stage).

Factory Droid

Trigger: /graphify. Uses AGENTS.md. Parallel extraction via Task tool.

Always-On vs. Explicit Commands

Two Modes of Operation

Always-On Hook

Reads GRAPH_REPORT.md before file searches
Gives the AI a "map" of the codebase
Passive — no user action needed
Best for: everyday navigation
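
The always-on behaviour listed above can be sketched as a small decision function. This is an assumption-heavy illustration, not Graphify's hook: the `graph_context` name is hypothetical, the payload is assumed to carry the tool name under a `tool_name` key, and the report location is passed in rather than guessed.

```python
import tempfile
from pathlib import Path
from typing import Optional

SEARCH_TOOLS = {"Glob", "Grep"}  # the file-search tools the hook watches for

def graph_context(payload: dict, report: Path) -> Optional[str]:
    """Return the report text when a search tool is about to run, else None."""
    if payload.get("tool_name") not in SEARCH_TOOLS:
        return None  # unrelated tool call: stay out of the way
    if not report.is_file():
        return None  # no graph has been built yet
    return report.read_text(encoding="utf-8")

# Demo with a temporary report file.
with tempfile.TemporaryDirectory() as tmp:
    report_path = Path(tmp) / "GRAPH_REPORT.md"
    report_path.write_text("# Graph report\n", encoding="utf-8")
    print(graph_context({"tool_name": "Grep"}, report_path))
    print(graph_context({"tool_name": "Edit"}, report_path))
```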

Explicit Commands

Traverses raw graph.json hop by hop
Traces exact paths between nodes
Active — user asks specific questions
Best for: deep investigation

The hook gives your assistant a map. The commands let it navigate the map precisely.
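
To show what hop-by-hop traversal looks like in practice, here is a minimal breadth-first search that traces a shortest path between two nodes. The JSON layout (a `nodes` list plus a `links` list with `source` and `target` keys) is an assumption about what graph.json might contain, the toy graph is invented, and edges are treated as undirected for simplicity.

```python
import json
from collections import defaultdict, deque

GRAPH_JSON = """
{"nodes": [{"id": "cli"}, {"id": "detect"}, {"id": "extract"},
           {"id": "build"}, {"id": "cluster"}],
 "links": [{"source": "cli", "target": "detect"},
           {"source": "detect", "target": "extract"},
           {"source": "extract", "target": "build"},
           {"source": "build", "target": "cluster"},
           {"source": "cli", "target": "build"}]}
"""

def shortest_path(graph: dict, start: str, goal: str):
    """Breadth-first search; returns the list of node ids, or None if unreachable."""
    adjacency = defaultdict(set)
    for link in graph["links"]:
        adjacency[link["source"]].add(link["target"])
        adjacency[link["target"]].add(link["source"])
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk parents back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in sorted(adjacency[node]):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None

graph = json.loads(GRAPH_JSON)
print(shortest_path(graph, "cli", "cluster"))
```

The search returns the two-hop route through `build` rather than the longer chain through `detect` and `extract`, which is the kind of exact path a summary report cannot provide.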

Supported File Types

Type	Extensions	Extraction Method
Code	.py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php .swift .lua .zig .ps1 .ex .exs .m .mm	tree-sitter AST + call-graph + rationale
Docs	.md .txt .rst	Claude concept extraction
Office	.docx .xlsx	Converted to Markdown, then Claude concept extraction
Papers	.pdf	Citation mining + concept extraction
Images	.png .jpg .webp .gif	Claude vision
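
The table above amounts to a lookup from file suffix to extraction route. A minimal sketch of that routing follows; the `classify` helper and the category names other than `code` are hypothetical, and the suffix lists are copied from the table.

```python
from pathlib import Path
from typing import Optional

EXTENSIONS = {
    "code": ".py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php "
            ".swift .lua .zig .ps1 .ex .exs .m .mm",
    "docs": ".md .txt .rst",
    "office": ".docx .xlsx",
    "papers": ".pdf",
    "images": ".png .jpg .webp .gif",
}

# Invert the table into a suffix -> category lookup.
KIND_BY_SUFFIX = {
    suffix: kind
    for kind, group in EXTENSIONS.items()
    for suffix in group.split()
}

def classify(path: str) -> Optional[str]:
    """Return the category for a path, or None for unsupported file types."""
    return KIND_BY_SUFFIX.get(Path(path).suffix.lower())

for name in ("src/main.rs", "notes/README.MD", "paper.pdf", "archive.tar.gz"):
    print(name, "->", classify(name))
```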

Extending Graphify

Adding a New Language

1. Define a LanguageConfig instance with AST node types for the language

2. Add a _import_<lang>() function for language-specific import handling

3. Add suffix to CODE_EXTENSIONS in detect.py and _WATCHED_EXTENSIONS in watch.py

4. Add tree-sitter package to pyproject.toml dependencies

5. Add fixture to tests/fixtures/ and tests to tests/test_languages.py

Programmatic Usage

from graphify.detect import detect
from graphify.extract import extract
from graphify.build import build_from_json
from graphify.cluster import cluster
from graphify.analyze import god_nodes, surprising_connections

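# Find candidate files under the project root, grouped by type
files = detect("./my-project")
# Run structural extraction over the code files only
extractions = extract(files["files"]["code"])
# Build the in-memory NetworkX graph from the extraction output
G = build_from_json(extractions)
# Group nodes into communities
communities = cluster(G)
# List the graph's god nodes
gods = god_nodes(G)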

Knowledge Check

Quiz: Test Your Understanding

1. Why does Graphify use two extraction passes instead of one?

To extract code twice for better accuracy

Code is parsed free via AST; only docs/images need LLM calls

The first pass is a draft and the second pass refines it

2. What does the EXTRACTED confidence tier mean?

The LLM extracted it with high confidence

It was extracted from documentation

Found directly in source code (import statement, direct call)

3. How does Graphify detect communities?

Using embeddings and a vector database

Graph topology via the Leiden algorithm (edge density)

By grouping files in the same directory

4. What happens when you run /graphify . --update?

Rebuilds the entire graph from scratch

Only re-extracts changed files, merges into existing graph

Updates the graph visualization CSS

Built from safishamsi/graphify
