Tech Stack & Ecosystem

The technologies powering Graphify and how it fits into the broader AI coding landscape.

Core Technologies

Language

Python 3.10+

Core language. Chosen for ecosystem compatibility with tree-sitter, NetworkX, and AI model SDKs.

Graph Engine

NetworkX

In-memory graph data structure and algorithms. No database needed — the graph serializes to JSON.

Code Parsing

tree-sitter

Parser generator for concrete syntax trees. 19 languages. Deterministic, fast, local — no LLM calls.

Visualization

vis.js

Interactive force-directed graph layout in the browser. Search, filter, click-to-explore.

Clustering

Leiden (graspologic)

Community detection algorithm. Finds clusters by edge density — no embeddings or vector DB.
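
One way to make "clusters by edge density" concrete is modularity, a quality score that Leiden-style algorithms commonly optimize: it rewards partitions whose communities hold more internal edges than their node degrees alone would predict. The sketch below only scores a given partition with the standard library. It is not the Leiden algorithm and not Graphify's code; the `modularity` helper and the toy graph are made up for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Score a partition of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to a community label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0

    internal = defaultdict(int)    # edges with both ends inside the community
    degree_sum = defaultdict(int)  # summed degree of the community's nodes

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    # Observed internal edge share minus the share expected from degrees alone.
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph (hypothetical): two triangles joined by one bridge edge, c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_blob = {node: 0 for node in "abcdef"}

print(modularity(edges, by_triangle))  # positive: dense groups kept together
print(modularity(edges, one_blob))     # zero: a single community earns nothing
```

Splitting the toy graph along the bridge scores higher than lumping everything together, which is the sense in which dense regions form clusters without any embeddings.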

Semantic Extraction

Claude / GPT-4

Parallel subagents extract concepts from docs, papers, and images. Only for non-code content.

Optional Dependencies

Extra | Package(s) | Purpose
[mcp] | mcp | MCP stdio server for agent queries
[neo4j] | neo4j | Direct push to Neo4j instance
[pdf] | pypdf, html2text | PDF text extraction
[watch] | watchdog | File system monitoring for auto-rebuild
[leiden] | graspologic | Leiden community detection (preferred over Louvain)
[office] | python-docx, openpyxl | .docx and .xlsx support
[all] | all of the above | Install everything

pip install "graphifyy[all]"   # install with all extras; the quotes stop shells such as zsh from expanding the brackets

Platform Integration

Claude Code
Deepest integration. Uses CLAUDE.md plus a PreToolUse hook that fires before Glob/Grep calls, surfacing graph context automatically.
Codex
Trigger: $graphify. Uses AGENTS.md for always-on instructions. Needs multi_agent = true for parallel extraction.
OpenCode
Trigger: /graphify. Uses AGENTS.md. Standard skill installation.
OpenClaw
Trigger: /graphify. Uses AGENTS.md. Sequential extraction only (parallel agent support is still at an early stage).
Factory Droid
Trigger: /graphify. Uses AGENTS.md. Parallel extraction via Task tool.

Always-On vs. Explicit Commands

Two Modes of Operation
Always-On Hook
  • Reads GRAPH_REPORT.md before file searches
  • Gives the AI a "map" of the codebase
  • Passive — no user action needed
  • Best for: everyday navigation
Explicit Commands
  • Traverses raw graph.json hop by hop
  • Traces exact paths between nodes
  • Active — user asks specific questions
  • Best for: deep investigation

The hook gives your assistant a map. The commands let it navigate the map precisely.
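
Tracing an exact path between two nodes can be pictured as a breadth-first search over the serialized graph. The sketch below assumes a node-link JSON layout (a `nodes` list with `id` fields and a `links` list with `source`/`target` fields). The real graph.json schema may differ, and the node names and the `load_adjacency` and `shortest_path` helpers are hypothetical.

```python
import json
from collections import deque

# Hypothetical stand-in for graph.json; the real schema may differ.
GRAPH_JSON = """
{
  "nodes": [{"id": "api.handler"}, {"id": "auth.check_token"},
            {"id": "db.get_user"}, {"id": "utils.log"}],
  "links": [{"source": "api.handler", "target": "auth.check_token"},
            {"source": "auth.check_token", "target": "db.get_user"},
            {"source": "api.handler", "target": "utils.log"}]
}
"""


def load_adjacency(raw):
    """Build an undirected adjacency map from node-link JSON text."""
    data = json.loads(raw)
    adjacency = {node["id"]: set() for node in data["nodes"]}
    for link in data["links"]:
        adjacency[link["source"]].add(link["target"])
        adjacency[link["target"]].add(link["source"])
    return adjacency


def shortest_path(adjacency, start, goal):
    """Return the node list of a fewest-hops path, or None if unreachable."""
    if start not in adjacency or goal not in adjacency:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk parent links back to the start
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):  # sorted for stable output
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


adjacency = load_adjacency(GRAPH_JSON)
print(shortest_path(adjacency, "utils.log", "db.get_user"))
```

Each hop in the returned list corresponds to one edge in the graph, which is what makes this kind of query more precise than reading a summary report.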

Supported File Types

Type | Extensions | Extraction Method
Code | .py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php .swift .lua .zig .ps1 .ex .exs .m .mm | tree-sitter AST + call-graph + rationale
Docs | .md .txt .rst | Claude concept extraction
Office | .docx .xlsx | Convert to markdown, then Claude
Papers | .pdf | Citation mining + concept extraction
Images | .png .jpg .webp .gif | Claude vision
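
The table above amounts to a lookup from file suffix to extraction route. Here is a minimal sketch of that lookup, using only the suffixes listed in the table. The `classify` function and the category labels are illustrative and are not Graphify's `detect` implementation.

```python
from pathlib import Path

# Suffixes copied from the table; the category labels are illustrative.
SUFFIXES_BY_TYPE = {
    "code": (".py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php "
             ".swift .lua .zig .ps1 .ex .exs .m .mm").split(),
    "docs": [".md", ".txt", ".rst"],
    "office": [".docx", ".xlsx"],
    "papers": [".pdf"],
    "images": [".png", ".jpg", ".webp", ".gif"],
}

# Invert the mapping once so each lookup is a single dict access.
TYPE_BY_SUFFIX = {
    suffix: kind
    for kind, suffixes in SUFFIXES_BY_TYPE.items()
    for suffix in suffixes
}


def classify(path):
    """Return the file type for a path, or None if the suffix is not listed."""
    return TYPE_BY_SUFFIX.get(Path(path).suffix.lower())


for name in ["src/app.py", "README.md", "paper.pdf", "logo.png", "Makefile"]:
    print(name, "->", classify(name))
```

Files that return `None` here have no listed suffix, so in this sketch they would simply be skipped.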

Extending Graphify

Adding a New Language

1. Define a LanguageConfig instance with AST node types for the language
2. Add a _import_<lang>() function for language-specific import handling
3. Register the file suffix in extract() dispatch and collect_files()
4. Add suffix to CODE_EXTENSIONS in detect.py and _WATCHED_EXTENSIONS in watch.py
5. Add tree-sitter package to pyproject.toml dependencies
6. Add fixture to tests/fixtures/ and tests to tests/test_languages.py
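
The first and third steps can be pictured as "describe the language, then make its suffix discoverable". The sketch below is a rough illustration only: `LanguageConfigSketch`, its field names, and the single `REGISTRY` dict are assumptions, not Graphify's actual `LanguageConfig` or its dispatch code, which the steps above say register the suffix in several places. The node type names are those of the tree-sitter Python grammar, used here only because they are familiar.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class LanguageConfigSketch:
    """Hypothetical stand-in for LanguageConfig; all field names are assumed."""
    name: str
    suffixes: tuple
    function_types: tuple
    class_types: tuple
    import_types: tuple


# Describe a language by the AST node types an extractor would look for.
PYTHON_LIKE = LanguageConfigSketch(
    name="python",
    suffixes=(".py",),
    function_types=("function_definition",),
    class_types=("class_definition",),
    import_types=("import_statement", "import_from_statement"),
)

# One suffix registry that a dispatcher could consult (an assumed design).
REGISTRY = {}


def register(config):
    """Make every suffix of the config resolvable, rejecting duplicates."""
    for suffix in config.suffixes:
        if suffix in REGISTRY:
            raise ValueError(f"suffix {suffix} is already registered")
        REGISTRY[suffix] = config


def config_for(path):
    """Return the registered config for a path, or None if unsupported."""
    return REGISTRY.get(Path(path).suffix.lower())


register(PYTHON_LIKE)
print(config_for("pkg/module.py").name)
print(config_for("notes.txt"))
```

Rejecting duplicate suffixes at registration time is one way to catch two languages claiming the same extension before any file is parsed.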

Programmatic Usage

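from graphify.detect import detect
from graphify.extract import extract
from graphify.build import build_from_json
from graphify.cluster import cluster
from graphify.analyze import god_nodes, surprising_connections

# Scan the project; the result groups the discovered files by type.
files = detect("./my-project")
# Run extraction on the code files only.
extractions = extract(files["files"]["code"])
# Assemble the graph from the extraction output.
G = build_from_json(extractions)
# Partition the graph into communities.
communities = cluster(G)
# Identify the graph's god nodes.
gods = god_nodes(G)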

Knowledge Check

Quiz: Test Your Understanding

1. Why does Graphify use two extraction passes instead of one?

  • To extract code twice for better accuracy
  • Code is parsed free via AST; only docs/images need LLM calls
  • The first pass is a draft and the second pass refines it

2. What does the EXTRACTED confidence tier mean?

  • The LLM extracted it with high confidence
  • It was extracted from documentation
  • Found directly in source code (import statement, direct call)

3. How does Graphify detect communities?

  • Using embeddings and a vector database
  • Graph topology via the Leiden algorithm (edge density)
  • By grouping files in the same directory

4. What happens when you run /graphify . --update?

  • Rebuilds the entire graph from scratch
  • Only re-extracts changed files, merges into existing graph
  • Updates the graph visualization CSS

Built from safishamsi/graphify
