MIT License — Open Source

# Graphista: Dynamic Graph-Based LLM-Powered Memory

A proof of concept that uses LLM-driven loops for graph-based memory management. Two specialized loops, SmartNodeProcessor for ingestion and SmartRetrievalTool for querying, enable dynamic knowledge management with chain-of-thought reasoning.

## 01. Installation

Clone the repository and install dependencies:

```bash
git clone https://github.com/pippinlovesyou/graphista.git
cd graphista
pip install -e .
```

Project structure:

```text
graphista/
├── memory.py            # Unified Memory class (entry point)
├── console.py           # Interactive CLI console
├── example.py           # Usage examples
├── graphrouter/         # Graph database abstraction layer
├── ingestion_engine/    # SmartNodeProcessor + extraction
├── llm_engine/          # LLM integration + tool definitions
├── docs/                # Documentation
└── tests/               # Test suite
```
## 02. Quickstart

Initialize Memory, ingest text, and query — in three steps:

### Step 1: Initialize Memory (`memory.py`)

Import the Memory class and configure your ontology, extraction rules, and LLM settings. The backend defaults to local JSON for testing.
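As a sketch, a configuration might look like the following. The field names inside `ontology_config` and `llm_config` are illustrative assumptions, not taken from the Graphista source; only the `Memory(backend, ontology_config, ...)` signature appears in the API table below.

```python
# Illustrative config shapes only -- the exact field names accepted by
# Memory are assumptions, not confirmed against the Graphista source.
ontology_config = {
    "node_types": {
        "Person": ["name", "role"],
        "Company": ["name", "industry"],
        "Concept": ["name"],
    },
    "edge_types": ["WORKS_AT", "RELATED_TO"],
}
llm_config = {"model": "gpt-4o-mini", "temperature": 0.0}

# With the library installed, initialization would look roughly like:
# from memory import Memory
# memory = Memory(backend="local", db_path="graph.json",
#                 ontology_config=ontology_config, llm_config=llm_config)
```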

### Step 2: Ingest Data (`ingest()`)

The ingest() method processes raw text through the SmartNodeProcessor loop: extracting entities, deduplicating, generating embeddings, and creating edges automatically.

### Step 3: Query the Graph (`ask()`)

The ask() method uses the SmartRetrievalTool to traverse the graph with chain-of-thought reasoning, returning both the answer and its reasoning steps.

## 03. System Architecture

*The project site renders an interactive architecture diagram here; animated particles trace the live data flow along the read path, the write path, and the LLM loops.*

## 04. Core Components
### SmartNodeProcessor (`ingestion_engine/`)

Full read/write access to the graph database. Runs an LLM loop that processes incoming text: extracting entities, checking for duplicates via vector similarity, merging or creating nodes, and establishing edges. Exits when the LLM returns a finish action.
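The loop's control flow can be sketched as follows. Here `fake_llm` is a stand-in for the real LLM call, and the action names are illustrative, not Graphista's actual protocol:

```python
# Sketch of an action loop that exits when the model returns "finish".
# fake_llm stands in for the real LLM call; action names are illustrative.
def fake_llm(context):
    if not context["nodes"]:
        return {"action": "create_node", "label": "Person",
                "props": {"name": "John Doe"}}
    return {"action": "finish",
            "summary": f"created {len(context['nodes'])} node(s)"}

def processor_loop(text, llm):
    context = {"text": text, "nodes": [], "log": []}
    while True:
        step = llm(context)
        context["log"].append(step["action"])
        if step["action"] == "finish":
            return {"chain_of_thought": context["log"],
                    "new_nodes": len(context["nodes"]),
                    "summary": step["summary"]}
        if step["action"] == "create_node":
            context["nodes"].append({"label": step["label"], **step["props"]})

result = processor_loop("John Doe is a software engineer.", fake_llm)
```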

### SmartRetrievalTool (`llm_engine/`)

Read-only graph access. Runs a chain-of-thought loop: the LLM iteratively decides which read tools to call (vector search, get edges, find similar nodes), building context until it has enough to synthesize a final answer.

### GraphRouter (`graphrouter/`)

Database abstraction layer that provides a unified API across multiple backends. Handles node/edge CRUD, vector storage, batch operations, and path queries. Supports Local JSON, Neo4j, and FalkorDB.
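One common way to build such an abstraction is an abstract base class per backend. This is a minimal sketch under that assumption; the real GraphRouter interface may differ:

```python
from abc import ABC, abstractmethod

# Sketch of a backend abstraction: each backend implements the same
# minimal interface. Method names are assumptions, not GraphRouter's API.
class GraphBackend(ABC):
    @abstractmethod
    def create_node(self, label, props): ...
    @abstractmethod
    def get_node(self, node_id): ...

class LocalJSONBackend(GraphBackend):
    def __init__(self):
        self.nodes, self._next = {}, 0
    def create_node(self, label, props):
        self._next += 1
        node_id = f"n{self._next}"
        self.nodes[node_id] = {"label": label, **props}
        return node_id
    def get_node(self, node_id):
        return self.nodes.get(node_id)

db = LocalJSONBackend()
nid = db.create_node("Person", {"name": "John Doe"})
```

Swapping in a Neo4j- or FalkorDB-backed class would leave calling code unchanged, which is the point of the unified API.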

## 05. Memory Class API

The Memory class is the unified entry point. It ties together the graph database, ontology, LLM, and processing loops.

| Method | Description | Returns |
|---|---|---|
| `Memory(backend, ontology_config, ...)` | Initialize with ontology, extraction rules, LLM config | `Memory` instance |
| `ingest(text)` | Process text → extract entities → create graph nodes/edges | `{id, processing_result}` |
| `ask(question)` | Natural-language query via chain-of-thought retrieval | `{final_answer, chain_of_thought}` |
| `retrieve(keyword)` | Simple keyword search across node properties | `list[Node]` |
| `query(Query)` | Advanced query object with filters, vector search, paths | `list[Node]` |
## 06. `ingest()`: Ingestion Pipeline

When you call memory.ingest(text), the following pipeline executes:

1. **Document node created**: raw text stored as `content`
2. **SmartNodeProcessor loop starts**: the LLM reads the text and decides actions
3. **Entity extraction**: identifies Person, Company, and Concept entities
4. **Deduplication**: vector similarity search against existing nodes (threshold 0.92)
5. **Node creation or merge**: new entities become nodes; duplicates are merged
6. **Edge creation**: relationships between entities become directed edges
7. **Embedding generation**: all new nodes get vector embeddings via OpenAI
8. **Loop exit**: the LLM returns a finish action with a processing summary
```python
result = memory.ingest("John Doe is a software engineer at Innotech.")
print(result["id"])                 # "doc_001"
print(result["processing_result"])  # {chain_of_thought: [...], new_nodes: 2, ...}
```
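The deduplication step above (cosine similarity against existing node embeddings, with a 0.92 threshold) can be sketched as:

```python
import math

# Sketch of the dedup check: compare a candidate embedding against
# existing node embeddings and merge when similarity exceeds 0.92.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def is_duplicate(new_emb, existing_embs, threshold=0.92):
    return any(cosine(new_emb, e) >= threshold for e in existing_embs)

existing = [[1.0, 0.0, 0.0]]
assert is_duplicate([0.99, 0.05, 0.0], existing)    # near-identical -> merge
assert not is_duplicate([0.0, 1.0, 0.0], existing)  # distinct -> new node
```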
## 07. `ask()`: Query Pipeline

When you call memory.ask(question), the SmartRetrievalTool loop executes:

1. **Question parsed**: the LLM analyzes the query intent
2. **Tool selection**: the LLM picks from `find_similar_nodes`, `get_edges`, `get_connected_nodes`, `vector_search`, `query`
3. **Iterative traversal**: each tool result is added to context; the LLM decides the next step
4. **Chain-of-thought recorded**: every action is logged for transparency
5. **Synthesis**: the LLM combines the gathered context into a natural-language answer
6. **Finish**: returns `{final_answer, chain_of_thought}`
```python
answer = memory.ask("Who works at Innotech?")
print(answer["final_answer"])      # "John Doe works at Innotech as a software engineer"
print(answer["chain_of_thought"])  # ["vector_search: Innotech", "get_edges: ...", ...]
```
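Tool selection is commonly implemented as a name-to-function registry that dispatches each LLM tool call. A minimal sketch, with toy tool bodies standing in for real graph reads:

```python
# Sketch of tool dispatch: tool names mirror the read-only tool library,
# but these bodies are toy stand-ins, not Graphista's implementations.
def vector_search(query, k=3):
    return [{"id": "n1", "name": "Innotech"}]

def get_edges(node_id):
    return [{"src": "n2", "rel": "WORKS_AT", "tgt": node_id}]

TOOLS = {"vector_search": vector_search, "get_edges": get_edges}

def run_tool(call):
    # call mimics an LLM tool invocation: {"tool": ..., "args": {...}}
    return TOOLS[call["tool"]](**call["args"])

edges = run_tool({"tool": "get_edges", "args": {"node_id": "n1"}})
```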
## 08. Tool Library

Read-Only Tools (used by SmartRetrievalTool + SmartNodeProcessor):

| Function | Description | Returns |
|---|---|---|
| `find_similar_nodes(embedding, k)` | Vector cosine-similarity search | `list[Node]` |
| `get_node_by_property(key, val)` | Exact property-match lookup | `Node \| null` |
| `get_connected_nodes(node_id)` | All nodes connected by edges | `list[Node]` |
| `get_edges(node_id)` | All edges from/to a node | `list[Edge]` |
| `vector_search(query, k)` | Text → embed → similarity search | `list[Node]` |
| `query(cypher_str)` | Raw Cypher/GQL execution | `ResultSet` |

Write Tools (SmartNodeProcessor only):

| Function | Description | Returns |
|---|---|---|
| `create_node(label, props)` | Create node with auto-embedding | `node_id` |
| `update_node(id, props)` | Merge properties + refresh embedding | `void` |
| `create_edge(src, tgt, rel)` | Create directed relationship | `edge_id` |
| `batch_create_nodes(list)` | Bulk insert with parallel embeddings | `list[id]` |
| `batch_create_edges(list)` | Bulk edge creation | `list[id]` |
## 09. Advanced Usage
### Batch Operations

Efficiently create thousands of nodes in a single call with parallel embedding generation.
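Parallel embedding generation can be sketched with a thread pool; `embed()` here is a toy stand-in for a real embedding call (e.g. OpenAI), and the function shape is an assumption:

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch of batch node creation with embeddings computed in parallel.
# embed() stands in for a real embedding API call.
def embed(text):
    return [float(len(text))]  # toy embedding

def batch_create_nodes(items, max_workers=8):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        embeddings = list(pool.map(embed, (i["name"] for i in items)))
    return [{**item, "embedding": emb}
            for item, emb in zip(items, embeddings)]

nodes = batch_create_nodes([{"name": "Innotech"}, {"name": "John Doe"}])
```

Threads suffice here because embedding calls are I/O-bound network requests.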

### Path Queries

Discover indirect relationships between nodes with configurable depth. Find how Person connects to Company through intermediate nodes.
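A depth-bounded breadth-first search captures the idea; the real path-query API may differ, and the node IDs below are hypothetical:

```python
from collections import deque

# Sketch of a depth-bounded path search over an adjacency list (BFS).
def find_path(edges, start, goal, max_depth=3):
    adj = {}
    for src, tgt in edges:
        adj.setdefault(src, []).append(tgt)
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) > max_depth:
            continue
        for nxt in adj.get(path[-1], []):
            if nxt not in path:
                queue.append(path + [nxt])
    return None

# Person -> Document -> Company, found within depth 2
edges = [("john", "doc_001"), ("doc_001", "innotech")]
path = find_path(edges, "john", "innotech", max_depth=2)
```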

### Hybrid Search

Combine vector similarity with property filters for precise results. Filter by type/industry then rank by embedding similarity.
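The filter-then-rank pattern can be sketched as follows (toy embeddings; the real implementation ranks via the graph's vector store):

```python
import math

# Sketch of hybrid search: property filter first, then rank the
# survivors by cosine similarity to the query embedding.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def hybrid_search(nodes, query_emb, node_type, k=2):
    candidates = [n for n in nodes if n["type"] == node_type]
    candidates.sort(key=lambda n: cosine(n["embedding"], query_emb),
                    reverse=True)
    return candidates[:k]

nodes = [
    {"name": "Innotech", "type": "Company", "embedding": [1.0, 0.0]},
    {"name": "Acme", "type": "Company", "embedding": [0.0, 1.0]},
    {"name": "John Doe", "type": "Person", "embedding": [1.0, 0.0]},
]
top = hybrid_search(nodes, [0.9, 0.1], node_type="Company", k=1)
```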

## 10. Database Backends
| Backend | Config | Status |
|---|---|---|
| Local JSON | `backend="local", db_path="graph.json"` | ✓ Stable |
| Neo4j | `backend="neo4j", uri, auth` | ⚠ Experimental |
| FalkorDB | `backend="falkordb", host, port` | ⚠ Experimental |