Your IDE has an index.
Your AI agent can’t use it.
Twira builds the same kind of structural map for your AI agent: every symbol, every call, every import, every reference. The result: your agent finds things with deterministic accuracy. Query the graph, get the exact answer. No guessing, no inference, no hallucination. Captured once on install, kept fresh by a post-commit hook, queried by every other tool in milliseconds across 26 languages.
You navigate by structure. Your agent navigates by text. Twira gives the agent the structure: guessing becomes querying, and the answer is exact, every time.
symbols: Every function, class, type, constant. Across 26 languages.
call graph: Every who-calls-what edge, with type and arg context.
dependencies: Every import. Both directions. Imports + imported-by.
references: Every non-call mention: value passes, type annotations, macro tokens.
embeddings: Optional vectors that power semantic code search.
You ask
“Give my agent the same structural view of this codebase that my IDE has.”
Twira instantly
- walks every file in the project
- parses 26 languages with the same grammar family the major IDEs use
- extracts symbols, calls, dependencies, and references
- writes a local knowledge graph (typically ~60 MB on a 10k-file repo)
- optionally adds semantic embeddings via your configured provider
Every other Twira tool now answers in milliseconds.
How you use this
Primarily a developer command (`twira index`). The agent can trigger a re-index via the `health` MCP tool with `action: "index"`.
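For readers wiring this up by hand, here is a rough sketch of what an agent-side re-index request could look like on the wire. It assumes the standard MCP `tools/call` JSON-RPC envelope; only the tool name `health` and the argument `action: "index"` come from the text above. The helper name `build_reindex_request` is hypothetical, and Twira's actual transport and response shape are not shown here.

```python
import json


def build_reindex_request(request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request asking the `health` tool to re-index."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",  # standard MCP method for invoking a tool
        "params": {
            "name": "health",                  # tool name, from the text above
            "arguments": {"action": "index"},  # argument, from the text above
        },
    }
    return json.dumps(request)


print(build_reindex_request())
```

In practice your MCP client builds this envelope for you; the sketch only shows which two values you supply.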
When you reach for it
- First time you install Twira on a project, `twira index` once. 10–40 s on a typical 10k-file project.
- After install: never, by default. A git post-commit hook re-indexes incrementally in the background on every commit.
- When you want to add semantic search to the mix, `twira index --embed` after configuring a provider key.
- After a big rename refactor, `twira index --rebuild` regenerates the cross-file call graph in one pass (incoming edges to renamed symbols self-heal otherwise as callers are touched, but `--rebuild` is the immediate full clean).
- When you’re curious what’s in the graph, `twira health` shows file count, symbol count, call count, embedding coverage, and last-indexed-at per file.
See it work
$ twira index
Technical depth, for engineers who want it
In your editor
In your editor, when you open a project, the IDE quietly builds an index of every file, every symbol, every import. `Cmd+T` jumps. Go to Definition. Find All References. They all work because the index is there, ready to query. You never see it, but you would notice the day it did not exist.
What Index does
Index builds the same kind of structural map for your AI agent. Every file, every symbol, every call, every import, every reference is captured once on install and kept fresh by a git post-commit hook. Every other Twira tool (search, impact, diagnose, lore, port) queries this graph in milliseconds. Your agent stops re-reading and starts navigating structure.
How it actually works
The index is the substrate. Running twira index walks your codebase once and extracts the structure your AI agent needs to reason about it (every symbol, every call, every dependency, every reference) into a local SQLite knowledge graph at .Twira/index.db. Every other Twira tool is a query against this graph.
The schema holds 26 production tables grouped by purpose, plus seven virtual tables for full-text and vector search. The code graph captures files, symbols, dependencies, package imports, and SQL references. The call graph captures caller-to-callee edges with full data-flow context. Knowledge anchors hold lore entries, drift indexes, suppressions, and baselines. The search layer adds FTS5 indexes over symbols and content. Embeddings, when enabled, live in three sqlite-vec virtual tables, one each for symbols, code chunks, and lore entries. The audit chain that records what your agent did against the index lives in a sibling audit.db with its own Merkle-chained provenance.
Twira parses 26 languages via tree-sitter: JavaScript, TypeScript, TSX, Python, Rust, Go, Java, PHP, C, C++, Swift, Kotlin, C#, Objective-C, Ruby, Lua, Dart, Scala, R, Bash, SQL, HTML, CSS, Haskell, Elixir, and Zig. For each symbol the indexer captures the signature, return type, parent class (for methods), export status, doc comment, and a JSON type_info column recording generic parameters, interface members, enum variants, and constraint details. Malformed files do not crash the extractor: every per-language module handles parser ERROR nodes gracefully.
The call graph is built at index time, not lazily on query. Every call site becomes a directed edge from caller to callee, carrying rich metadata. The call_type column distinguishes direct calls from method invocations, constructors, and await expressions. The argument_count column records how many arguments were passed. The return_usage column records whether the result was discarded, stored, returned, passed as an argument, chained, awaited, or voided. The assigned_to and passed_to columns capture data-flow context.
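To make the edge metadata concrete, here is a minimal sketch of a call-graph table and a "who calls this?" query, using Python's built-in `sqlite3`. The table name `call_graph` and the columns `call_type`, `argument_count`, `return_usage`, `assigned_to`, and `passed_to` come from the text. The `caller` and `callee` column names, the literal values such as `'await'` and `'stored'`, and the sample rows are assumptions made for illustration, not Twira's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE call_graph (
        caller         TEXT NOT NULL,    -- hypothetical column name
        callee         TEXT NOT NULL,    -- hypothetical column name
        call_type      TEXT NOT NULL,    -- e.g. direct, method, constructor, await
        argument_count INTEGER NOT NULL,
        return_usage   TEXT NOT NULL,    -- e.g. discarded, stored, returned
        assigned_to    TEXT,
        passed_to      TEXT
    )
""")
conn.executemany(
    "INSERT INTO call_graph VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("onSubmit", "handleLogin", "await", 2, "stored", "session", None),
        ("retryLogin", "handleLogin", "direct", 2, "discarded", None, None),
        ("onSubmit", "renderError", "direct", 1, "discarded", None, None),
    ],
)


def callers_of(symbol: str) -> list[tuple]:
    """Return (caller, call_type, return_usage) for every edge into `symbol`."""
    rows = conn.execute(
        "SELECT caller, call_type, return_usage FROM call_graph "
        "WHERE callee = ? ORDER BY caller",
        (symbol,),
    )
    return rows.fetchall()


print(callers_of("handleLogin"))
# [('onSubmit', 'await', 'stored'), ('retryLogin', 'direct', 'discarded')]
```

Because the edges are materialized at index time, the lookup is a single indexed read rather than a scan over source files.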
A separate symbol_refs table records every non-call mention (function pointers, type annotations, path expressions, macro tokens), each tagged with a ref_kind of value, type_annotation, path, or macro_token. A function pointer is not a function call, but it is a symbol reference, and that distinction matters for safe refactoring.
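The sketch below shows why the distinction matters: a function passed as a value never appears as a call edge, so a rename or delete check has to consult both tables. The table names `call_graph` and `symbol_refs` and the `ref_kind` value `value` come from the text; every column name other than `ref_kind`, and the sample rows, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE call_graph  (caller TEXT, callee TEXT);
    CREATE TABLE symbol_refs (from_symbol TEXT, target TEXT, ref_kind TEXT);
    -- main() calls compare(); sortUsers() only passes it as a value
    INSERT INTO call_graph  VALUES ('main', 'compare');
    INSERT INTO symbol_refs VALUES ('sortUsers', 'compare', 'value');
""")


def mentions(symbol: str) -> list[tuple[str, str]]:
    """Return (site, kind) for every call and every non-call mention of `symbol`."""
    return conn.execute(
        """
        SELECT caller, 'call' FROM call_graph WHERE callee = ?
        UNION ALL
        SELECT from_symbol, ref_kind FROM symbol_refs WHERE target = ?
        ORDER BY 1
        """,
        (symbol, symbol),
    ).fetchall()


print(mentions("compare"))
# [('main', 'call'), ('sortUsers', 'value')]
```

A check that looked only at call edges would miss `sortUsers` and report `compare` as having a single dependent.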
Cross-file resolution is supported by design; cross-language resolution is deliberately not. Within a language the graph captures everything statically resolvable: direct calls, method calls on known receivers, constructor calls, await expressions, trait method calls in Rust, virtual method calls within a known class hierarchy. What it does not capture: calls through eval or reflection, higher-order closures with no static target, and dynamic dispatch through abstract base classes where the runtime type is unknown.
Embeddings are an opt-in upgrade. When you configure an embedding provider (Anthropic, OpenAI, or your own; bring your own key), Twira stores 512-dimensional vectors via the sqlite-vec extension across three populations. One per symbol, embedding the symbol contract (name, kind, signature, doc, visibility, return type, async and unsafe flags, owning class, language, and top-15 callees) so the vector carries both intent and behaviour. One per function body via tree-sitter AST-aware chunking (one chunk per function), each chunk enriched with its symbol contract so intent and implementation embed together. One per lore entry, powering lore search and the semantic tier of drift detection. Semantic search fuses all three with Reciprocal Rank Fusion plus a final call-graph rescoring pass. The non-embedding search modes (symbol, path, content, regex) work without an embedding provider configured.
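Reciprocal Rank Fusion itself is a small, well-known formula: each item scores the sum of `1 / (k + rank)` over every ranked list it appears in. The sketch below shows the generic technique only. The constant `k = 60` is the conventional default from the RRF literature and is an assumption here, as are the sample result lists; Twira's actual weighting and its final call-graph rescoring pass are not modeled.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists into one, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest score first; ties broken by name so the output is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical hits from the three embedding populations.
symbol_hits = ["handleLogin", "validateToken", "logout"]
chunk_hits = ["validateToken", "handleLogin", "refreshSession"]
lore_hits = ["validateToken"]

fused = reciprocal_rank_fusion([symbol_hits, chunk_hits, lore_hits])
print(fused)
# ['validateToken', 'handleLogin', 'logout', 'refreshSession']
```

RRF needs only ranks, not comparable scores, which is what makes it a natural fit for merging results from populations whose similarity scores are not on the same scale.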
Drift detection runs across three tiers, all stored in the index. Tier 1 is the alias chain: when a finding is renamed, the finding_ref_aliases table keeps the old finding_ref resolvable to the new canonical_ref. Tier 2 is structural: every suppressed finding carries an AST hash (struct_hash_s, struct_hash_c, taint_sig) so the suppression survives a rename but not a semantic change. Tier 3 is semantic: embeddings catch deeper refactors that change the structure but not the meaning. Together the three tiers mean suppressions stay attached when code changes shape, and lore entries surface as stale when the symbol they were anchored to no longer matches.
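Tier 1 can be pictured as following pointers until a name has no further alias. The sketch below models the `finding_ref_aliases` mapping (old `finding_ref` to `canonical_ref`) as a plain dictionary. Whether Twira stores multi-hop chains or flattens them at write time is not stated above, so the multi-hop walk, the cycle guard, and the sample identifiers are assumptions.

```python
def resolve_canonical(finding_ref: str, aliases: dict[str, str]) -> str:
    """Follow alias links from `finding_ref` until a ref with no alias is reached."""
    seen = {finding_ref}
    current = finding_ref
    while current in aliases:
        current = aliases[current]
        if current in seen:
            # Defensive guard: a corrupt alias table should fail loudly, not loop.
            raise ValueError(f"alias cycle detected at {current!r}")
        seen.add(current)
    return current


# Hypothetical refs: F-001 was renamed to F-014, which was later renamed to F-020.
aliases = {"F-001": "F-014", "F-014": "F-020"}
print(resolve_canonical("F-001", aliases))  # F-020
```

A suppression recorded against the oldest name therefore still resolves to the finding's current identity.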
Every other Twira tool reads from the same graph. Code Search queries the FTS5 tables and the embedding tables. Code Read uses the indexed line ranges to fetch a symbol’s exact source without re-parsing. Impact (refs, deps, blast_radius) walks call_graph and dependencies. Diagnose writes baselines and suppressions into the index and uses the drift indexes to detect stale findings. Lore anchors entries against symbol_hash and surfaces them via lore_triggers. Port uses type_info to match structurally equivalent symbols across languages. Without the index, none of them work.
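As an illustration of what "walking `call_graph`" can mean for a blast-radius style query, the sketch below collects every direct and indirect caller of a symbol with a recursive SQLite query. The `caller` and `callee` column names and the sample edges are hypothetical; this shows the general technique, not Twira's actual query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE call_graph (caller TEXT, callee TEXT);
    INSERT INTO call_graph VALUES
        ('onSubmit',    'handleLogin'),
        ('LoginForm',   'onSubmit'),
        ('App',         'LoginForm'),
        ('handleLogin', 'validateToken');
""")


def transitive_callers(symbol: str) -> list[str]:
    """Return every symbol that reaches `symbol` through one or more calls."""
    rows = conn.execute(
        """
        WITH RECURSIVE upstream(name) AS (
            SELECT caller FROM call_graph WHERE callee = ?
            UNION  -- UNION (not UNION ALL) de-duplicates, so cycles terminate
            SELECT cg.caller
            FROM call_graph AS cg
            JOIN upstream ON cg.callee = upstream.name
        )
        SELECT name FROM upstream ORDER BY name
        """,
        (symbol,),
    )
    return [name for (name,) in rows]


print(transitive_callers("handleLogin"))
# ['App', 'LoginForm', 'onSubmit']
```

Changing `handleLogin` here potentially affects three upstream symbols, and that answer comes from one query instead of a search through every file.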
Storage is compact and incremental updates are clean. On a typical 10,000-file repository the database lands around 60 MB. First index takes 10 to 40 seconds depending on size and language mix, plus another 5 to 30 seconds if you are also building embeddings. After that, a git post-commit hook runs twira index in the background after every commit. The run is incremental, file-level, and content-hash gated, so unchanged files are skipped entirely. For files that did change, Twira deletes the old symbols, call edges, references, dependencies, and package imports, then inserts the fresh ones. Foreign-key cascades clean up embeddings and lore anchors. Files removed from disk drop out cleanly. No duplication, no orphan rows, no stale symbols.
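The update path described above can be sketched in a few lines: hash the file, skip it if the hash matches, otherwise delete the old rows and insert fresh ones in one transaction, letting a foreign-key cascade remove dependent rows. The three-table schema, the SHA-256 choice, and the `index_file` helper (whose `symbol_names` argument stands in for real parser output) are all assumptions for illustration.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce cascades by default
conn.executescript("""
    CREATE TABLE files (path TEXT PRIMARY KEY, content_hash TEXT NOT NULL);
    CREATE TABLE symbols (
        id   INTEGER PRIMARY KEY,
        path TEXT NOT NULL,
        name TEXT NOT NULL
    );
    CREATE TABLE embeddings (
        symbol_id INTEGER NOT NULL REFERENCES symbols(id) ON DELETE CASCADE,
        vector    BLOB
    );
""")


def index_file(path: str, content: bytes, symbol_names: list[str]) -> bool:
    """Re-index one file; return False when its content hash is unchanged."""
    digest = hashlib.sha256(content).hexdigest()
    row = conn.execute(
        "SELECT content_hash FROM files WHERE path = ?", (path,)
    ).fetchone()
    if row is not None and row[0] == digest:
        return False  # content-hash gate: nothing to do
    with conn:  # single transaction: old rows out, fresh rows in
        # Deleting symbols cascades to their embeddings.
        conn.execute("DELETE FROM symbols WHERE path = ?", (path,))
        conn.execute(
            "INSERT OR REPLACE INTO files (path, content_hash) VALUES (?, ?)",
            (path, digest),
        )
        conn.executemany(
            "INSERT INTO symbols (path, name) VALUES (?, ?)",
            [(path, name) for name in symbol_names],
        )
    return True
```

Calling `index_file` twice with identical bytes returns `True` and then `False`; calling it again with changed bytes replaces that file's symbols and drops any embeddings attached to the old ones.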
One subtlety worth knowing about renames. When you rename a heavily referenced symbol (handleLogin to handleSignIn, say), the outgoing call edges from that symbol’s file are refreshed immediately, but incoming call edges from other files keep pointing at the old name until those files are themselves re-indexed. The post-commit hook re-indexes any file you touch, so this self-heals on the next commit that touches a caller. After a sweeping rename refactor, twira index --rebuild regenerates the entire graph from scratch in one pass.
Privacy and isolation are non-negotiable. The index never leaves your machine. It is not synced to a cloud, not phoned home, not visible to Twira’s servers: there are no Twira servers in this loop. The only outbound calls Twira makes are to the AI provider you configured, with the key you supplied. Encryption at rest is bring-your-own-key: set TWIRA_DB_KEY, use the OS credential manager, or supply a key per database via .Twira/encryption.json. The audit chain in the sibling audit.db is Ed25519-signed and Merkle-chained, so any third party can verify it offline, without Twira infrastructure.
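Offline verifiability follows from the chaining itself: each entry commits to the hash of the one before it, so altering any past entry breaks every hash after it. The sketch below is a deliberately simplified linear hash chain using only the Python standard library. It omits the Ed25519 signatures mentioned above, and the entry layout, the `GENESIS` value, and all function names are assumptions rather than Twira's actual audit format.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical anchor hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash a payload together with the previous entry's hash."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list[dict], payload: dict) -> None:
    """Append `payload`, linking it to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; any edited or reordered entry makes this False."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Verification needs nothing but the entries themselves, which is why a third party can run it without network access.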
Why this all matters. Every AI coding agent operates against tokens: the model sees a window of text, and the bigger the window, the more it sees. But the agent is still pattern-matching over text it has to re-read each time. The index gives your agent something fundamentally different: a structured graph it can query in milliseconds. Asking "what calls handleLogin" becomes a database query against call_graph, not a context-window read across thousands of files. The answer is deterministic, exact, and instant. Hallucinations drop because the agent is grounded in what the index says is actually there.
What it isn’t
- The call graph is per-language. Calls across language boundaries (TypeScript calling a Python service, etc.) are not statically resolvable, so they are not in the graph.
- Calls through `eval`, reflection, or runtime dynamic dispatch are not in the index either. It holds only statically resolvable references.
- Embeddings are opt-in. They need an embedding provider configured. The non-embedding search modes (symbol, path, content, regex) work without one.
- You almost never run `twira index` manually after the first time. A git post-commit hook re-indexes incrementally in the background on every commit.
One install. Your agent will know the difference in the first session.
$ curl -fsSL twira.com/install.sh | sh