Graphista: Dynamic Graph-Based LLM-Powered Memory
A proof-of-concept that uses LLM-driven loops for graph-based memory management. Two specialized loops — SmartNodeProcessor for ingestion and SmartRetrievalTool for querying — enable dynamic knowledge management with chain-of-thought reasoning.
Clone the repository and install dependencies:
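A typical setup might look like the following; the repository URL is a placeholder, since the source does not name one, and the OpenAI key is needed because ingestion generates embeddings via OpenAI:

```shell
# Placeholder URL — substitute the actual repository location
git clone https://github.com/your-org/graphista.git
cd graphista
pip install -r requirements.txt

# Required for embedding generation during ingestion
export OPENAI_API_KEY="sk-..."
```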
Project structure:
Initialize Memory, ingest text, and query — in three steps:
Import the Memory class and configure your ontology, extraction rules, and LLM settings. The backend defaults to local JSON for testing.
The ingest() method processes raw text through the SmartNodeProcessor loop: extracting entities, deduplicating, generating embeddings, and creating edges automatically.
The ask() method uses the SmartRetrievalTool to traverse the graph with chain-of-thought reasoning, returning both the answer and its reasoning steps.
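The three steps can be sketched end to end. Since the real package is not importable here, the snippet below uses a toy stand-in that mimics the documented surface (`Memory(backend=...)`, `ingest()`, `retrieve()`) with plain keyword search, no LLM loop or embeddings, just to show the call shapes:

```python
import uuid

class ToyMemory:
    """Toy stand-in for the documented Memory API (local JSON backend).
    The real class runs LLM loops; this only stores and keyword-searches."""

    def __init__(self, backend="local", db_path="graph.json"):
        self.db_path = db_path
        self.nodes = {}  # node_id -> {"label": ..., "props": {...}}

    def ingest(self, text):
        # Real ingest() runs the SmartNodeProcessor pipeline; here we
        # just store the raw text as a Document node.
        node_id = str(uuid.uuid4())
        self.nodes[node_id] = {"label": "Document", "props": {"content": text}}
        return {"id": node_id, "processing_result": "stored"}

    def retrieve(self, keyword):
        # Simple keyword search across node properties, per the API table.
        return [n for n in self.nodes.values()
                if any(keyword.lower() in str(v).lower()
                       for v in n["props"].values())]

memory = ToyMemory(backend="local", db_path="graph.json")
result = memory.ingest("Ada Lovelace worked with Charles Babbage.")
hits = memory.retrieve("Babbage")
print(len(hits))  # 1
```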
Full read/write access to the graph database. Runs an LLM loop that processes incoming text: extracting entities, checking for duplicates via vector similarity, merging or creating nodes, and establishing edges. Exits when the LLM returns a finish action.
Read-only graph access. Runs a chain-of-thought loop: the LLM iteratively decides which read tools to call (vector search, get edges, find similar nodes), building context until it has enough to synthesize a final answer.
Database abstraction layer that provides a unified API across multiple backends. Handles node/edge CRUD, vector storage, batch operations, and path queries. Supports Local JSON, Neo4j, and FalkorDB.
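The unified backend API could be sketched as a small interface that each backend implements; the method names below follow the write/read tool tables, while the `Protocol` shape itself is an assumption:

```python
from typing import Protocol

class GraphBackend(Protocol):
    """Hypothetical shape of the unified backend API described above."""
    def create_node(self, label: str, props: dict) -> str: ...
    def create_edge(self, src: str, tgt: str, rel: str) -> str: ...
    def get_edges(self, node_id: str) -> list: ...

class LocalJSONBackend:
    """In-memory sketch of the 'local' backend; the real one persists to JSON."""

    def __init__(self):
        self.nodes, self.edges = {}, []
        self._next = 0

    def _new_id(self):
        self._next += 1
        return f"n{self._next}"

    def create_node(self, label, props):
        nid = self._new_id()
        self.nodes[nid] = {"label": label, "props": props}
        return nid

    def create_edge(self, src, tgt, rel):
        self.edges.append({"src": src, "tgt": tgt, "rel": rel})
        return f"e{len(self.edges)}"

    def get_edges(self, node_id):
        # All edges touching the node, in either direction.
        return [e for e in self.edges if node_id in (e["src"], e["tgt"])]

db: GraphBackend = LocalJSONBackend()
a = db.create_node("Person", {"name": "Ada"})
b = db.create_node("Company", {"name": "Analytical Engines Ltd"})
db.create_edge(a, b, "WORKS_AT")
print(len(db.get_edges(a)))  # 1
```

Swapping in Neo4j or FalkorDB then only requires another class satisfying the same interface.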
The Memory class is the unified entry point. It ties together the graph database, ontology, LLM, and processing loops.
| Method | Description | Returns |
|---|---|---|
| Memory(backend, ontology_config, ...) | Initialize with ontology, extraction rules, LLM config | Memory instance |
| ingest(text) | Process text → extract entities → create graph nodes/edges | {id, processing_result} |
| ask(question) | Natural language query via chain-of-thought retrieval | {final_answer, chain_of_thought} |
| retrieve(keyword) | Simple keyword search across node properties | list[Node] |
| query(Query) | Advanced query object with filters, vector search, paths | list[Node] |
When you call memory.ingest(text), the following pipeline executes:
- Document node created with raw text stored as content
- SmartNodeProcessor loop starts — LLM reads text and decides actions
- Entity extraction — identifies Person, Company, Concept entities
- Deduplication — vector similarity search against existing nodes (threshold 0.92)
- Node creation or merge — new entities become nodes, duplicates get merged
- Edge creation — relationships between entities become directed edges
- Embedding generation — all new nodes get vector embeddings via OpenAI
- Loop exits — when the LLM returns a finish action with a processing summary
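The deduplication step hinges on a cosine-similarity check against the 0.92 threshold. A minimal sketch of that check, with toy vectors standing in for OpenAI embeddings:

```python
import math

DEDUP_THRESHOLD = 0.92  # threshold from the pipeline description above

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def find_duplicate(new_embedding, existing):
    """Return (node_id, score) of the best match above threshold, else None."""
    best = None
    for node_id, emb in existing.items():
        score = cosine_similarity(new_embedding, emb)
        if score >= DEDUP_THRESHOLD and (best is None or score > best[1]):
            best = (node_id, score)
    return best

existing = {"n1": [1.0, 0.0, 0.0], "n2": [0.0, 1.0, 0.0]}
print(find_duplicate([0.99, 0.05, 0.0], existing))  # matches n1
print(find_duplicate([0.5, 0.5, 0.0], existing))    # None — below 0.92
```

When a match is found, the pipeline merges into the existing node instead of creating a new one.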
When you call memory.ask(question), the SmartRetrievalTool loop executes:
- Question parsed — LLM analyzes the query intent
- Tool selection — LLM picks from find_similar_nodes, get_edges, get_connected_nodes, vector_search, and query
- Iterative traversal — each tool result is added to context, LLM decides next step
- Chain-of-thought recorded — every action logged for transparency
- Synthesis — LLM combines gathered context into a natural language answer
- Finish — returns {final_answer, chain_of_thought}
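The loop structure above can be sketched with stubs: the tool names come from the read-only tool table below, but `decide()` is a hypothetical stand-in for the LLM's next-action choice, and both tools return canned data:

```python
def vector_search(query, k=3):
    # Stub: the real tool embeds the query and runs a similarity search.
    return [{"id": "n1", "name": "Ada Lovelace"}]

def get_edges(node_id):
    # Stub: the real tool reads edges from the graph backend.
    return [{"src": "n1", "tgt": "n2", "rel": "WORKS_WITH"}]

TOOLS = {"vector_search": vector_search, "get_edges": get_edges}

def decide(question, context):
    """Hypothetical stand-in for the LLM's next-action decision."""
    if not context:
        return ("vector_search", {"query": question})
    if len(context) == 1:
        return ("get_edges", {"node_id": context[0]["result"][0]["id"]})
    return ("finish", {})

def ask(question, max_steps=8):
    context, chain_of_thought = [], []
    for _ in range(max_steps):
        action, args = decide(question, context)
        chain_of_thought.append({"action": action, "args": args})
        if action == "finish":
            break
        result = TOOLS[action](**args)
        context.append({"tool": action, "result": result})
    # The real system asks the LLM to synthesize; we just summarize.
    answer = f"Gathered {len(context)} tool results for: {question}"
    return {"final_answer": answer, "chain_of_thought": chain_of_thought}

out = ask("Who does Ada work with?")
print(out["chain_of_thought"][-1]["action"])  # finish
```

Note the `max_steps` cap: a bounded loop is a common safeguard against an LLM that never emits a finish action.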
Read-Only Tools (used by SmartRetrievalTool + SmartNodeProcessor):
| Function | Description | Returns |
|---|---|---|
| find_similar_nodes(embedding, k) | Vector cosine similarity search | list[Node] |
| get_node_by_property(key, val) | Exact property match lookup | Node \| null |
| get_connected_nodes(node_id) | All nodes connected by edges | list[Node] |
| get_edges(node_id) | All edges from/to a node | list[Edge] |
| vector_search(query, k) | Text → embed → similarity search | list[Node] |
| query(cypher_str) | Raw Cypher/GQL execution | ResultSet |
Write Tools (SmartNodeProcessor only):
| Function | Description | Returns |
|---|---|---|
| create_node(label, props) | Create node with auto-embedding | node_id |
| update_node(id, props) | Merge properties + refresh embedding | void |
| create_edge(src, tgt, rel) | Create directed relationship | edge_id |
| batch_create_nodes(list) | Bulk insert with parallel embeddings | list[id] |
| batch_create_edges(list) | Bulk edge creation | list[id] |
Efficiently create thousands of nodes in a single call with parallel embedding generation.
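Since embedding calls are network-bound, parallelizing them is where the batch speedup comes from. A sketch of that pattern, with `embed()` as a stand-in for the real OpenAI call:

```python
from concurrent.futures import ThreadPoolExecutor

def embed(text):
    # Stand-in for an OpenAI embedding request (network-bound, so a
    # thread pool pays off for large batches).
    return [float(len(text)), 0.0]

def batch_create_nodes(node_specs, max_workers=8):
    """Embed all nodes in parallel, then insert; returns new node ids."""
    texts = [spec["props"].get("name", "") for spec in node_specs]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        embeddings = list(pool.map(embed, texts))  # preserves input order
    ids = []
    for i, (spec, emb) in enumerate(zip(node_specs, embeddings)):
        spec["embedding"] = emb  # attach before insert
        ids.append(f"n{i}")      # a real backend returns db-generated ids
    return ids

specs = [{"label": "Person", "props": {"name": f"user{i}"}} for i in range(100)]
ids = batch_create_nodes(specs)
print(len(ids))  # 100
```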
Discover indirect relationships between nodes with configurable depth. Find how Person connects to Company through intermediate nodes.
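Depth-bounded path discovery is a breadth-first search over the edge set. A minimal sketch with a toy adjacency map (the node names are illustrative):

```python
from collections import deque

EDGES = {  # toy adjacency: Person -> Project -> Company
    "ada": ["engine_project"],
    "engine_project": ["acme_corp"],
    "acme_corp": [],
}

def find_path(start, goal, max_depth=3):
    """Breadth-first search up to max_depth hops; returns a node path or None."""
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) - 1 >= max_depth:
            continue  # depth budget exhausted on this branch
        for nxt in EDGES.get(path[-1], []):
            if nxt not in path:  # avoid cycles
                queue.append(path + [nxt])
    return None

print(find_path("ada", "acme_corp"))               # two hops, found
print(find_path("ada", "acme_corp", max_depth=1))  # None — too shallow
```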
Combine vector similarity with property filters for precise results. Filter by type/industry then rank by embedding similarity.
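The filter-then-rank pattern can be sketched in a few lines: exact property filters prune the candidate set, then cosine similarity against the query embedding orders the survivors (toy vectors in place of real embeddings):

```python
import math

NODES = [
    {"id": "n1", "type": "Company", "industry": "AI",     "embedding": [1.0, 0.0]},
    {"id": "n2", "type": "Company", "industry": "Retail", "embedding": [0.9, 0.1]},
    {"id": "n3", "type": "Person",  "industry": "AI",     "embedding": [1.0, 0.0]},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def hybrid_query(nodes, filters, query_embedding, k=5):
    """Filter by exact properties first, then rank survivors by similarity."""
    candidates = [n for n in nodes
                  if all(n.get(key) == val for key, val in filters.items())]
    candidates.sort(key=lambda n: cosine(n["embedding"], query_embedding),
                    reverse=True)
    return candidates[:k]

top = hybrid_query(NODES, {"type": "Company", "industry": "AI"}, [1.0, 0.0])
print([n["id"] for n in top])  # ['n1']
```

Filtering first keeps the similarity ranking cheap: only nodes that already match the structured constraints get scored.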
| Backend | Config | Status |
|---|---|---|
| Local JSON | backend="local", db_path="graph.json" | ✓ Stable |
| Neo4j | backend="neo4j", uri, auth | ⚠ Experimental |
| FalkorDB | backend="falkordb", host, port | ⚠ Experimental |
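Based on the Config column, initialization for each backend would look roughly like the following; the concrete values (URIs, credentials, port) are illustrative, and any parameter name not in the table is an assumption:

```python
# Local JSON — stable, good for testing
memory = Memory(backend="local", db_path="graph.json")

# Neo4j — experimental; illustrative URI and credentials
memory = Memory(backend="neo4j", uri="bolt://localhost:7687",
                auth=("neo4j", "password"))

# FalkorDB — experimental; illustrative host and port
memory = Memory(backend="falkordb", host="localhost", port=6379)
```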