Diagnose (SAST) · MCP · Pro

Bugs take ages to find.
Whether you wrote them or your AI did.
Diagnose finds them in seconds, on demand.

Diagnose is Twira’s static-analysis (SAST) engine, with bug, code-health, DOM, and port-migration detectors layered on top. It runs 65 deterministic checks today, catching the issues humans and AI agents commonly ship: SQL injection, command injection, hardcoded secrets, path traversal, missing await, empty catch blocks, unhandled errors, and more. Per-file scans take milliseconds; a whole codebase takes seconds. There are two ways to run it: on demand from your editor, or inside the agent loop via MCP. Wherever you are writing code, Diagnose is one call away. Suppressions are drift-resilient and survive renames.

AI alone

Ships code without checks
"Looks reasonable to me"
Bugs caught hours later, in production
No deterministic safety net
Suppressions break on every rename
Slow feedback, or none at all

Twira Diagnose powertool

65 deterministic detectors
Per file in milliseconds
Whole codebase in seconds
Findings on demand, in your editor or the agent loop
Suppressions survive refactors
Same engine across editor and AI agent

How Diagnose compares to the typical cloud SAST shape

The cloud SAST category settled into a familiar shape over the last decade: per-developer pricing, code uploaded to the vendor's cloud, minutes-to-hours scan times, and suppressions keyed on file-line-rule so every refactor reopens yesterday's noise. Twira flips all of it.

Dimension	Typical cloud SAST tool	Twira Diagnose
Pricing	Per-developer or per-committer seats, typically $25–$80/dev/mo, with enterprise tiers running to five and six figures annually	Pro: $29.99/mo flat for the whole project, with no per-developer scan tax
Where your code goes	Uploaded to the vendor's cloud for analysis; regional data residency is typically an Enterprise-tier upgrade	Stays on your machine: no upload, no vendor cloud, no data-residency negotiation
Typical scan time	Minutes to hours, depending on engine, codebase size, and scan depth; usually delivered as a CI job, not inside the edit loop	Per file: milliseconds. Whole codebase: seconds. Fast enough for the agent to call inside a single turn.
Suppression resilience	Typically keyed on file-line-rule or content hash; renames, reformats, and refactors break the suppression and the warning re-fires	Three-tier drift system (exact match, structural hash, semantic embedding), so suppressions survive renames, reformats, and deeper restructures
Index freshness	Re-scans the codebase on every meaningful change; results decoupled from the developer's edit loop	Local index updated on every commit by a post-commit hook; the next scan starts from the already-fresh graph

This comparison describes general patterns observed across the cloud SAST category (pricing models, deployment architectures, scan-time ranges, and suppression-keying approaches), based on Twira's internal competitive research (May 2026), drawn from public pricing pages, vendor documentation, and independent benchmarks. Individual products vary; some vendors offer self-hosted editions or premium tiers that change one or more of the dimensions above. Twira timings are measured on a typical 10,000-file repository running locally. This is not procurement advice; evaluate any specific tool on its own merits.

Bugs take ages to find. Diagnose finds them in milliseconds. Yours and your agent’s.

scan

65 detectors across 4 profiles. Findings as JSON or SARIF 2.1.0.

baseline

Snapshot the finding set. Diff every later run against it.

suppress

Mark false positives. Suppressions survive renames and refactors automatically.

output

Structured findings as JSON or SARIF 2.1.0; pipe them to any downstream tooling you already have.

Sensitivity scale

RED: SQL injection · taint flow · nil-map writes · hardcoded secrets · command injection

YELLOW: missing await · floating promises · unchecked errors · empty catch · large functions

GREEN: TODO markers · dead exports · stray logs · discarded returns

You ask

“The agent just wrote handleStripeWebhook. Anything wrong?”

Twira instantly

runs the bug + security profiles on the changed file
catches a missing await on the database call (YELLOW)
catches a string-concatenated SQL query, flagged as taint flow (RED)
skips a false positive you suppressed last week (drift survived the rename)
returns findings with line numbers, severity, and confidence

The agent sees both findings and fixes them before commit. Nothing ships with the bugs in it.

Without Twira	With Twira
agent ships untested	agent checks before shipping
bugs caught in production	bugs caught on commit
"looks reasonable to me"	RED / YELLOW / GREEN with line numbers
no deterministic safety net	65 detectors
suppressions break on every rename	drift-resilient suppressions

How the agent uses this

Agent calls `diagnose` via MCP. 8 modes: `scan`, `suppress`, `unsuppress`, `suppressions`, `baseline`, `verify`, `remediate`, `reconcile`. `profile`: `bug` / `health` / `security` / `all`. Returns findings as JSON or SARIF 2.1.0.

When you reach for it

Pre-commit safety: run `twira diagnose --profile bug` before pushing to catch the missing await, the SQL concatenation, and the empty catch that the agent wrote and you didn’t spot.
Inside the agent loop: the agent calls `diagnose` over MCP after writing or editing code and fixes findings before you ever see them; deterministic catches keep the LLM from shipping the obvious bugs.
Before you merge an agent-suggested refactor: run `baseline` before and `verify` after, and the delta shows you exactly what changed in the finding set.
When you confirm a false positive: `suppress` it once, and the drift system keeps it suppressed even when the agent renames the surrounding function or restructures the file next week.
Multi-model code review: run Diagnose first to catch the deterministic issues, then run `team review` (Pro multi-model code review) so the LLMs spend their tokens on judgement calls, not pattern matching.
Compliance evidence: the suppression audit log and the Audit chain together give you a defensible record of every suppression decision: who, when, why, and with which intent.
Long-term code-health trends: run `baseline` at the start of each sprint and `verify` at the end; the delta becomes the engineering health metric.

See it work

$ twira diagnose --profile all --format sarif

✓ scanned 4,210 files · 38,127 symbols in 8.2s
⚠ 14 findings · 2 RED · 5 YELLOW · 7 GREEN
✓ 22 suppressions survived (12 via struct hash, 6 via alias, 4 exact)
✓ SARIF 2.1.0 written to .twira/diagnose.sarif
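Because SARIF 2.1.0 is a published standard, a report like the one written above can be read by a few lines of your own tooling. The sketch below counts results by their SARIF `level` and decides whether to block a commit. It is an illustration, not part of Twira: the function names and the sample report are hypothetical, and how Diagnose maps RED / YELLOW / GREEN onto SARIF levels is not stated here, so the gate works on the standard levels only.

```python
from collections import Counter


def count_by_level(sarif: dict) -> Counter:
    """Count SARIF results per level across every run in the report."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Assumption for this sketch: a result without an explicit
            # level is treated as "warning".
            counts[result.get("level", "warning")] += 1
    return counts


def should_block(sarif: dict, blocking_levels=("error",)) -> bool:
    """Return True when any result has one of the blocking levels."""
    counts = count_by_level(sarif)
    return any(counts[level] > 0 for level in blocking_levels)


# Hypothetical hand-written report, shaped like a minimal SARIF 2.1.0 log.
SAMPLE_REPORT = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "example-scanner"}},
            "results": [
                {"ruleId": "sql-concat", "level": "error",
                 "message": {"text": "SQL built by string concatenation"}},
                {"ruleId": "hardcoded-secret", "level": "error",
                 "message": {"text": "Secret literal in source"}},
                {"ruleId": "missing-await", "level": "warning",
                 "message": {"text": "Promise is not awaited"}},
                {"ruleId": "todo-marker",
                 "message": {"text": "TODO left in code"}},
            ],
        }
    ],
}

counts = count_by_level(SAMPLE_REPORT)
blocked = should_block(SAMPLE_REPORT)
```

In a real hook you would load the report with `json.load` from the path the scan printed and exit non-zero when `should_block` returns True.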

Technical depth, for engineers who want it

What Diagnose (SAST) does

Diagnose runs 65 deterministic detectors across four selectable profiles (bug, health, security, and port), plus an `all` profile that runs everything. Each finding comes back with a severity (RED · YELLOW · GREEN), a confidence level (High · Medium · Low), and a precise line. Per-file scans take milliseconds; a whole codebase takes seconds. There are two ways to run it: on demand from the CLI, or inside the agent loop via MCP. Structured output is available as JSON or SARIF 2.1.0 if you want to pipe findings into downstream tooling. Suppressions are drift-resilient: a false positive suppressed once stays suppressed even when the agent renames the surrounding function next week.

How it actually works

Diagnose is Twira’s static-analysis engine: a SAST tool by category, with bug, code-health, DOM, and port-migration detectors layered on top of the security profile. It combines 65 deterministic detectors, eight operating modes, drift-resilient suppressions that survive refactors, and standards-compliant output. It is the tool the agent reaches for when it needs to know whether the code it just wrote will actually behave, and the tool you reach for when you need a defensible answer to "did this change introduce anything new?"

The detectors are deterministic. Each one fires on a real pattern in the code, not on a probabilistic guess. They are organised into four selectable profiles you pick at scan time (bug, health, security, and port), plus an `all` profile that runs everything.
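To make "fires on a real pattern" concrete, here is a minimal sketch of what a deterministic detector can look like. It is not Twira's implementation: it uses Python's standard `ast` module on Python source rather than tree-sitter, and the `Finding` class and `find_empty_handlers` function are hypothetical names. The detector reports an exception handler only when its body is provably a no-op, so the same input always yields the same findings.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    detector: str
    line: int
    message: str


def _is_noop(stmt: ast.stmt) -> bool:
    """True for `pass` and for a bare `...` statement."""
    if isinstance(stmt, ast.Pass):
        return True
    return (
        isinstance(stmt, ast.Expr)
        and isinstance(stmt.value, ast.Constant)
        and stmt.value.value is Ellipsis
    )


def find_empty_handlers(source: str) -> list[Finding]:
    """Flag every `except` block whose body does nothing."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler) and all(_is_noop(s) for s in node.body):
            findings.append(
                Finding("EmptyCatch", node.lineno, "exception is silently swallowed")
            )
    return sorted(findings, key=lambda f: f.line)


SAMPLE = """\
try:
    charge(card)
except Exception:
    pass

try:
    refund(card)
except ValueError as err:
    log(err)
"""

findings = find_empty_handlers(SAMPLE)
```

Only the first handler is reported; the second one does something with the error, so the pattern does not match and nothing fires.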

By functional category, the 65 detectors cover:

Bug-class issues (4): empty catch blocks, type-system bypasses, SQL string concatenation, return-without-await.
Code health (6): TODO markers, oversized functions, dead exports, stray console.log, discarded return values, missing error handling.
Security (23): hardcoded secrets, SQL injection, XSS patterns, eval usage, insecure config, command injection, path traversal, env leaks, prototype pollution, insecure randomness, timing attacks, taint-flow tracking, SSRF (server-side request forgery), open redirect, LDAP injection, NoSQL injection, insecure deserialization, XXE (XML external entities), insecure cookie flags, weak crypto algorithms, weak TLS configuration, race-condition TOCTOU (time-of-check-time-of-use), CSRF pattern detection.
Cross-language quality (4): unwrap-in-production, missing error propagation, unchecked errors, missing defer.
Go-specific (4): goroutine leaks, nil map writes, context misuse, error shadowing.
DOM (6): dead CSS, orphaned CSS references, invisible text, CSS layout bugs, accessibility violations, specificity conflicts.
Cross-file analytics (10): circular dependencies, zombie component islands, high-risk hotspots, argument mismatch, missing await, floating promises, inconsistent await patterns, return-usage contract drift, habit-based argument shape detection, temporal pair breaks.
Port migration (8): orphaned symbols, signature drift, missing caller wiring, dead stubs, missing test coverage, contract drift, group incompleteness, import orphans.

All 65 ship in Pro.

Coverage is broad today and expanding. The 12 security detectors above cover most of the statically detectable OWASP Top 10: injection (SQL, command), hardcoded secrets, XSS sinks, taint flow, path traversal, insecure config, weak randomness, timing attacks, and more. The next phase adds 12 further categories: SSRF (server-side request forgery), insecure deserialization, XXE (XML external entities), weak password hashing, LDAP and NoSQL injection, open redirect, insecure cookie flags, weak crypto algorithms, weak TLS configuration, format-string vulnerabilities, race-condition (TOCTOU), and CSRF pattern detection. Once those land, Diagnose covers the full OWASP Top 10 + CWE Top 25 categories that are amenable to static analysis. (The universal SAST blind spots, such as business-logic flaws and design-level issues, remain; no SAST tool catches those, and detecting them is outside the static-analysis problem space.)

Eight modes, one tool. `scan` runs the detector pipeline across the scope you specify (file, directory, or whole repo) with optional profile, confidence, and detector filters. `suppress` and `unsuppress` create and revoke suppression records for specific findings. `suppressions` lists what is currently active. `baseline` snapshots the current finding set under a named tag; `verify` compares any subsequent scan against that baseline and returns a clean delta of new, fixed, and regressed findings. `remediate` and `reconcile` are the auto-fix and stale-suppression-housekeeping modes (present in the schema today; capability expanding in the next phase).

The drift-resilient suppression system is the differentiator. Most static-analysis tools key suppressions on file-line-rule or a content hash. Rename a function and the suppression breaks. Move a file and it breaks. Reformat the code and it breaks. Every refactor turns yesterday’s acknowledged false-positives back into today’s noise. Twira uses three layered tiers so a suppression matches the same finding even when the surrounding code is restructured.

Tier 1 matches on exact location (detector plus file plus line), the fast path when nothing has changed. Tier 2 matches on the structure of the statement: a hash of the abstract syntax tree with names and literals normalised away, so renames, reformats, and local edits keep the suppression attached. Tier 3 matches on meaning: a similarity search against the semantic embedding of the original suppression, so deeper refactors (a new branch added, a helper extracted, error handling rewritten) still find their historical match. Tier 3 matches surface as a "ghost candidate" on the finding rather than a silent auto-suppression: the developer or agent confirms it before it sticks. Confirmations are remembered; rejections are recorded so the same false candidate never nags twice.
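The Tier 2 idea, a hash of the syntax tree with names and literals normalised away, can be sketched in a few lines. This is an illustration under stated assumptions, not Twira's code: it uses Python's standard `ast` module on Python source, and `Normalise` and `structural_hash` are hypothetical names. Identifiers become a placeholder, literals keep only their type, and positions are left out of the canonical form, so a rename or a reformat leaves the hash unchanged.

```python
import ast
import hashlib


class Normalise(ast.NodeTransformer):
    """Strip identifiers and literal values, keeping only the tree shape."""

    def visit_Name(self, node):
        return ast.Name(id="_", ctx=node.ctx)

    def visit_arg(self, node):
        self.generic_visit(node)
        node.arg = "_"
        return node

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        node.name = "_"
        return node

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_Constant(self, node):
        # Keep the literal's type (int, str, ...) but not its value.
        return ast.Constant(value=type(node.value).__name__)


def structural_hash(source: str) -> str:
    tree = Normalise().visit(ast.parse(source))
    # include_attributes=False leaves line and column numbers out,
    # so reformatting does not change the canonical text.
    canonical = ast.dump(tree, include_attributes=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


ORIGINAL = "def total(items):\n    return sum(i.price for i in items) + 5\n"
RENAMED = (
    "def grand_total(rows):\n"
    "    return sum(\n"
    "        r.price for r in rows\n"
    "    ) + 7\n"
)
RESTRUCTURED = (
    "def total(items):\n"
    "    if not items:\n"
    "        return 0\n"
    "    return sum(i.price for i in items) + 5\n"
)
```

`ORIGINAL` and `RENAMED` hash to the same value even though every name, the literal, and the layout differ. `RESTRUCTURED` adds a branch, so its hash changes; that is the kind of edit a meaning-based tier would have to pick up.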

For AI-agent-heavy workflows where the codebase is constantly being restructured, this is the difference between "the tool still works after the agent’s third refactor" and "the tool’s output became noise an hour ago." It is built for the world where the codebase shape changes every few minutes.

Suppression provenance is recorded for every entry. Each suppression carries an intent (false positive, accepted risk, or temporary), a trust level (human, AI, or policy), and an optional expiry. Intent influences how confidently a future drift-tier match is accepted: a confirmed human "false positive" matches future findings more confidently than something parked as "temporary." Trust level is captured for attribution. Every suppress, unsuppress, or expire event lands in an append-only audit log that compliance can inspect.
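One common way to make an append-only log tamper-evident is to chain each entry to the hash of the one before it. The sketch below shows that technique with a plain SHA-256 hash chain. It is a simplified stand-in, not a description of Twira's audit format: there is no signing and no Merkle tree here, and every field name (`action`, `intent`, `trust`, `expires`) is a hypothetical choice for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64


def _digest(event: dict, prev_hash: str) -> str:
    body = {"event": event, "prev": prev_hash}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(log: list[dict], event: dict) -> dict:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev_hash, "hash": _digest(event, prev_hash)}
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["event"], entry["prev"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: list[dict] = []
append_event(audit_log, {
    "action": "suppress", "detector": "EmptyCatch",
    "intent": "false_positive", "trust": "human", "expires": None,
})
append_event(audit_log, {
    "action": "unsuppress", "detector": "EmptyCatch",
    "intent": "false_positive", "trust": "human", "expires": None,
})
intact_before = verify_chain(audit_log)

# Rewriting history after the fact is detectable.
audit_log[0]["event"]["intent"] = "temporary"
intact_after = verify_chain(audit_log)
```

The design point is that entries are only ever appended; changing who suppressed what, or why, invalidates every hash from that entry onward.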

Multi-detector corroboration. When more than one detector fires on the same location (say, EmptyCatch and MissingErrorHandling both flag the same try/catch), Diagnose merges them into a single finding with a `corroborated_by` array, so the agent sees one issue with two reasons rather than two separate noise lines to triage.
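A merge like this can be sketched as a group-by on location. The example below is hypothetical: the finding shape (`detector`, `file`, `line`), the `merge_corroborated` function, and the choice to keep the first detector as the primary and list the rest under `corroborated_by` are all assumptions made for illustration, not Diagnose's documented schema.

```python
from collections import defaultdict


def merge_corroborated(findings: list[dict]) -> list[dict]:
    """Collapse findings that share a file and line into one finding each."""
    by_location = defaultdict(list)
    for finding in findings:
        by_location[(finding["file"], finding["line"])].append(finding)

    merged = []
    for _location, group in sorted(by_location.items()):
        primary, *others = group
        merged.append({**primary, "corroborated_by": [o["detector"] for o in others]})
    return merged


raw = [
    {"detector": "EmptyCatch", "file": "billing.ts", "line": 42},
    {"detector": "MissingErrorHandling", "file": "billing.ts", "line": 42},
    {"detector": "TodoMarker", "file": "billing.ts", "line": 7},
]

merged = merge_corroborated(raw)
```

Three raw findings become two: the line-42 issue is reported once, with the second detector recorded as corroboration.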

Baselines and delta verification. `baseline` captures every active finding at a moment in time, named with whatever tag you choose (release branch, sprint start, pre-refactor checkpoint). `verify` runs a fresh scan and returns three clean lists: new findings that were not in the baseline, fixed findings that no longer reproduce, and regressed findings that have escalated in severity. Combined with the drift-resilient suppressions, this gives you a stable answer to "did my refactor introduce any new problems?", even when the refactor moved every line in the file.
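The three lists can be expressed as a small set comparison. This sketch assumes each finding already has a stable identifier that survives refactors (the `fp-...` strings below are made up) and maps it to a severity; the `diff_against_baseline` function and the ranking of GREEN below YELLOW below RED are assumptions for illustration rather than Diagnose's actual data model.

```python
SEVERITY_RANK = {"GREEN": 0, "YELLOW": 1, "RED": 2}


def diff_against_baseline(
    baseline: dict[str, str], current: dict[str, str]
) -> dict[str, list[str]]:
    """Compare two snapshots that map a finding id to its severity."""
    new = sorted(fid for fid in current if fid not in baseline)
    fixed = sorted(fid for fid in baseline if fid not in current)
    regressed = sorted(
        fid
        for fid in current
        if fid in baseline
        and SEVERITY_RANK[current[fid]] > SEVERITY_RANK[baseline[fid]]
    )
    return {"new": new, "fixed": fixed, "regressed": regressed}


baseline_snapshot = {
    "fp-empty-catch": "YELLOW",
    "fp-todo-marker": "GREEN",
    "fp-unchecked-error": "YELLOW",
}
current_snapshot = {
    "fp-empty-catch": "YELLOW",      # unchanged
    "fp-unchecked-error": "RED",     # escalated in severity
    "fp-sql-concat": "RED",          # introduced by the refactor
}

delta = diff_against_baseline(baseline_snapshot, current_snapshot)
```

Unchanged findings appear in none of the lists, which is what keeps the delta readable after a large refactor.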

Standards-aligned output. Diagnose parses 26 languages using the same tree-sitter grammar family the major IDEs use. Findings come back as structured JSON by default; you can also emit SARIF 2.1.0 (the OASIS industry standard for static-analysis output) if you want to pipe results into downstream tooling that understands the format. Every suppression event is signed and chained into the same audit Merkle tree that powers Audit, so compliance can reconstruct exactly who suppressed what and why. Dependency-vulnerability scanning has its own dedicated tool; see Dependency Vulnerabilities.

Two ways to run it, one engine. Run on demand from the CLI with `twira diagnose`. Run inside the agent loop via MCP: the agent sees the same findings you would and can act on them. Same engine, same findings, same drift handling either way. Twira is built to live where the code is being written: your editor and your AI agent. Not in a cloud pipeline waiting to be triggered.

Setup. Diagnose reads from the index, so run `twira index` once on install. The post-commit hook keeps the index fresh, and the suppression drift system handles everything else automatically.

What it isn’t

Diagnose is SAST (static analysis of source code), not DAST (testing a running application). It does not send requests to a deployed app or look for runtime issues; for that, you need a dedicated DAST tool.
Detectors fire on patterns we have written them to find. They do not invent novel bug classes. Each detector documents the exact pattern it covers; there are no probabilistic "maybe" findings.
RED, YELLOW and GREEN classify the *kind* of issue, not the certainty. A RED finding is more likely to bite than a GREEN one; both still want human review.
Confidence threshold tunes the noise floor. Default `low` shows every match. Raise to `high` to see only the highest-confidence signals when you are dialling down noise.
Diagnose covers SAST + bugs + code health. Dependency vulnerabilities (SCA) live in the sibling Dependency Vulnerabilities tool (same page family, separate scan path).
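The points above about severity and confidence being separate axes can be shown with a small filter. This is a hypothetical sketch of how a confidence floor behaves, not Diagnose's code: the finding shape and the `filter_by_confidence` function are assumptions, while the three confidence levels and the `low` default come from the description above.

```python
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def filter_by_confidence(findings: list[dict], threshold: str = "low") -> list[dict]:
    """Keep findings whose confidence is at or above the threshold."""
    floor = CONFIDENCE_RANK[threshold]
    return [f for f in findings if CONFIDENCE_RANK[f["confidence"]] >= floor]


findings = [
    {"detector": "SqlInjection", "severity": "RED", "confidence": "high"},
    {"detector": "TaintFlow", "severity": "RED", "confidence": "low"},
    {"detector": "TodoMarker", "severity": "GREEN", "confidence": "high"},
    {"detector": "MissingAwait", "severity": "YELLOW", "confidence": "medium"},
]

everything = filter_by_confidence(findings)
only_high = filter_by_confidence(findings, threshold="high")
```

With the default threshold all four findings come back. Raising it to `high` drops the low-confidence RED finding but keeps the high-confidence GREEN one, because the threshold acts on certainty, not on the kind of issue.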

One install. Your agent will know the difference in the first session.

$ curl -fsSL twira.com/install.sh | sh

Install Twira →
See pricing