See what the output artifacts look like, how the defect scan works, which models deliver the best results, and what kind of codebases benefit most.
These are real output sizes from a 6-phase run analyzing CodeCartographer itself. A small codebase produced over 21,000 tokens of structured findings.
System structure with 5 layers, 12 public surfaces catalogued, dependency direction analysis, runtime lifecycle documentation, and concurrency model notes. Each architectural claim tagged with its evidence level.
Six-pass scan finding 18 defects: 1 critical, 3 high, 8 medium, 6 low. Each defect includes file location, reproduction steps, recommended action, and evidence classification.
Feature-by-feature behavior with defaults, error handling, and acceptance tests. Nine feature areas catalogued with explicit black-box acceptance scenarios.
Event flows, state machines, persistence formats. Two state machines documented, five event protocols traced, persistence layer compatibility hazards flagged.
Synthesis layer ranking what matters and what is risky. Priority-ranked module list with rationale, portability hazard inventory, and dependency graph for rewrite sequencing.
Language-agnostic build plan with 8 modules, acceptance scenarios per module, and a known-unknowns section for what could not be determined from available sources.
The defect scan is not one general-purpose review. It runs six sequential passes, each with domain-specific instructions. Findings are triaged by severity and mapped to a recommended action.
Off-by-one errors, null handling gaps, boundary condition violations, type coercion risks, control flow dead spots.
Swallowed exceptions, missing fallback paths, error propagation gaps, inconsistent error taxonomy, silent failure modes.
Race conditions, deadlock risks, missing synchronization, shared state without guards, async ordering assumptions.
Injection surfaces, missing input validation, hardcoded secrets, auth bypass risks, dependency vulnerability exposure.
Parameter mismatches, return type drift, undocumented side effects, versioning hazards, broken backward compatibility.
Hardcoded paths, missing env validation, unsafe defaults, environment drift between dev and production, secret rotation gaps.
Every finding gets a severity (critical, high, medium, low) and a recommended action: fix before porting, port differently, or leave behind. The deep audit variant adds a second semantic pass with full contracts and protocols context, catching issues the mechanical scan misses.
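The triage step can be pictured as a small data model. This is a minimal sketch, not CodeCartographer's actual output format: the `Finding` class, its field names, and the sample entries are all hypothetical. Only the four severity levels and the three recommended actions come from the description above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Lower value sorts first, so critical findings lead the report.
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# The three recommended actions named in the text.
ACTIONS = ("fix before porting", "port differently", "leave behind")


@dataclass(frozen=True)
class Finding:
    # Hypothetical field names, chosen for illustration only.
    location: str
    summary: str
    severity: Severity
    action: str

    def __post_init__(self):
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")


def triage(findings):
    """Return findings ordered from most to least severe (stable sort)."""
    return sorted(findings, key=lambda f: f.severity)


def tally(findings):
    """Count findings per severity level, including levels with no hits."""
    counts = {level.name.lower(): 0 for level in Severity}
    for f in findings:
        counts[f.severity.name.lower()] += 1
    return counts


# Made-up sample findings; paths and summaries are not from a real run.
sample = [
    Finding("src/cache.py", "shared dict without a guard", Severity.MEDIUM, "port differently"),
    Finding("src/auth.py", "token check skipped on retry", Severity.CRITICAL, "fix before porting"),
    Finding("src/legacy.py", "unreachable branch", Severity.LOW, "leave behind"),
]

for finding in triage(sample):
    print(finding.severity.name, finding.location, "->", finding.action)
```

Keeping the action as a closed set means a report cannot carry a recommendation outside the three the scan defines.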
CodeCartographer is LLM-agnostic by design, but model choice affects what you can analyze and how good the results are.
Best for full-with-deep-audit on larger codebases. The semantic defect pass benefits most from strong reasoning, long context, and reliable tool use.
Good for architecture, contracts, and protocols. Recommended starting point: lite pipeline, then escalate only if the output is grounded and specific.
Use for quick structural orientation. Multi-phase pipelines can drift; prefer architecture-only unless you can manually review each phase.
Start with pipeline-architecture-only.yaml on a codebase you already understand. Compare the output against your own knowledge. That gives you a fast signal on whether to trust the model with deeper phases.
Different pipelines serve different goals. Here is when each one makes sense.
Run the full pipeline. Start with architecture to get your bearings. Continue through contracts and protocols to recover the behavior. End with porting and reimplementation specs if a rewrite is on the table.
Full with deep audit. The split defect scan means the reimplementation spec is designed around defects that were found with full contracts and protocols context. You get a build plan that accounts for the real state of the code.
Defect scan pipeline. Two phases: architecture plus six-pass audit. You get a triaged defect report with severity and actions. Nothing more.
Architecture only. One phase. System layers, public API surfaces, dependency direction, runtime notes. Low cost, fast signal.
Lite pipeline. Architecture, contracts, protocols. The fastest path to knowing how a system works and what it promises, without the porting bundle overhead.
Drop .codecarto/ into each package. Run architecture-only on packages you are curious about. Run full on packages you are targeting for changes. The status file keeps each analysis independent.
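For a monorepo, the per-package drop can be scripted. This is a minimal sketch that assumes one directory per package under a common root. The `install_codecarto` function and the directory layout are hypothetical and not part of CodeCartographer itself.

```python
import shutil
from pathlib import Path


def install_codecarto(template: Path, packages_root: Path) -> list[Path]:
    """Copy a template .codecarto/ directory into each package directory.

    Packages that already contain .codecarto/ are skipped, so any existing
    per-package state is left untouched. Hidden directories under the root
    (such as .git) are not treated as packages.
    """
    installed = []
    packages = sorted(
        p for p in packages_root.iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )
    for package in packages:
        target = package / ".codecarto"
        if target.exists():
            continue
        shutil.copytree(template, target)
        installed.append(target)
    return installed
```

A call such as `install_codecarto(Path("downloads/.codecarto"), Path("packages"))` returns the list of directories it created. Both paths here are placeholders for wherever the downloaded directory and your packages actually live.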
Download the repository, drop .codecarto/ into your project, and point an LLM at the guide.