Analysis Architecture

SqC uses a multi-pass analysis architecture:

Source Files
    |
    v
[Tree-sitter Parser] --> AST (per-file)
    |
    v
[Pre-scan Pass] --> Cross-file context (function defs, summaries, macros,
    |                 struct types, global states, call graph, call-site args)
    v
[CFG Construction] --> Per-function control-flow graphs
    |
    v
[Dataflow Analysis] --> Null state, value range, reaching defs, init state
    |
    v
[Rule Evaluation] --> 285 CERT C rules applied to AST + CFG + context
    |
    v
[Suppression Filter] --> Hash-based + wildcard (glob/prefix) suppression
    |
    v
[Export] --> CSV, XLSX, JSON, SARIF

Analysis Modules

Tree-sitter parsing (src/analyze/mod.rs).

Fast, incremental, error-tolerant C parsing. Each .c file is parsed into an AST; the orchestrator coordinates prescan, CFG construction, dataflow, and per-rule evaluation with optional Rayon parallelism.

Cross-file pre-scan (src/analyze/prescan.rs, context.rs).

Walks the directories passed with -d, collecting function definitions, header prototypes, function summaries, call graphs, macro constants/aliases, struct field types, global constants, and global pointer null states. A second pass aggregates call-site argument null states and propagates transitive frees through parameter pass-through chains (max 8 iterations). Results are stored in ProjectContext and can optionally be cached to a binary file (--save-prescan / --load-prescan). Consumed by 15+ rules.
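The transitive free propagation in the second pass amounts to a bounded fixed-point iteration over the summaries. The following is a minimal sketch, not the code in prescan.rs: the Summary struct is a hypothetical, simplified stand-in that borrows the frees_params and param_passthroughs names from the function summaries, and the tuple layout of a pass-through is an assumption made for illustration.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical, simplified stand-in for a per-function summary.
#[derive(Debug, Default, Clone)]
struct Summary {
    /// Indices of parameters this function frees (directly or transitively).
    frees_params: BTreeSet<usize>,
    /// (own parameter index, callee name, callee parameter index):
    /// the parameter is forwarded unchanged to the callee.
    param_passthroughs: Vec<(usize, String, usize)>,
}

/// Propagate frees up pass-through chains until nothing changes or the
/// iteration cap is hit. Returns the number of rounds that changed something.
fn propagate_transitive_frees(
    summaries: &mut HashMap<String, Summary>,
    max_iterations: usize,
) -> usize {
    let mut rounds = 0;
    for _ in 0..max_iterations {
        // Collect updates first so the map is not mutated while being read.
        let mut updates: Vec<(String, usize)> = Vec::new();
        for (name, summary) in summaries.iter() {
            for (param, callee, callee_param) in &summary.param_passthroughs {
                let callee_frees = summaries
                    .get(callee)
                    .map_or(false, |s| s.frees_params.contains(callee_param));
                if callee_frees && !summary.frees_params.contains(param) {
                    updates.push((name.clone(), *param));
                }
            }
        }
        if updates.is_empty() {
            break; // fixed point reached
        }
        for (name, param) in updates {
            if let Some(s) = summaries.get_mut(&name) {
                s.frees_params.insert(param);
            }
        }
        rounds += 1;
    }
    rounds
}
```

With a chain a -> b -> c in which only c calls free on its parameter, each round lifts the fact one level up the chain, so a cap of 8 bounds the length of the pass-through chains this sketch can resolve.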

Function summaries (src/analyze/function_summary.rs).

Lightweight inter-procedural summaries computed during prescan: frees_params, can_return_null, returns_allocation, checks_null_params, modifies_params, dereferences_params, never_returns, callsite_param_null_states (aggregated from all call sites), callsite_param_field_null_states (struct field propagation), callsite_param_pointee_null_states (pointer-to-pointer propagation), return_range (VRA inter-procedural), param_passthroughs (transitive free tracking). Consumed by 7 rules.

Control-flow graphs (src/analyze/cfg.rs).

Per-function CFG with basic blocks, typed edges (Fallthrough, TrueBranch, FalseBranch, BackEdge, Return, Break, Continue, Goto), and condition_range metadata for path-sensitive edge refinement. Optional macro-constant-aware construction for dead-branch elimination. Consumed by 8 rules (INT30/31/32/33/34-C, EXP33/34-C, MEM01-C).
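To illustrate how typed edges support dead-branch elimination, here is a small sketch. The EdgeKind variants are the ones listed above; the Edge struct and its known_condition field are hypothetical simplifications (the real CFG carries condition_range metadata, whose shape is not shown in this document).

```rust
use std::collections::HashSet;

/// Edge kinds as listed in the text.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EdgeKind {
    Fallthrough,
    TrueBranch,
    FalseBranch,
    BackEdge,
    Return,
    Break,
    Continue,
    Goto,
}

/// Hypothetical edge record for this sketch.
#[derive(Debug, Clone)]
struct Edge {
    from: usize,
    to: usize,
    kind: EdgeKind,
    /// Some(b) when the branch condition folded to a constant (for example
    /// through a macro constant); None when it is not statically known.
    known_condition: Option<bool>,
}

/// A branch edge is dead when the condition is known and contradicts it.
fn is_dead(edge: &Edge) -> bool {
    match (edge.kind, edge.known_condition) {
        (EdgeKind::TrueBranch, Some(false)) => true,
        (EdgeKind::FalseBranch, Some(true)) => true,
        _ => false,
    }
}

/// Depth-first reachability from `entry`, skipping dead branch edges.
fn reachable_blocks(entry: usize, edges: &[Edge]) -> HashSet<usize> {
    let mut seen = HashSet::new();
    let mut stack = vec![entry];
    while let Some(block) = stack.pop() {
        if !seen.insert(block) {
            continue; // already visited
        }
        for e in edges.iter().filter(|e| e.from == block && !is_dead(e)) {
            stack.push(e.to);
        }
    }
    seen
}
```

For `if (DEBUG) { ... } else { ... }` with DEBUG defined as 0, the TrueBranch edge is dead, so rules that only visit reachable blocks never see the then-block.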

Null state dataflow (src/analyze/null_state.rs).

Forward dataflow on CFG with NullState lattice (Unknown → DefinitelyNull / PossiblyNull / NotNull). Edge refinement on branch conditions supports compound || / && expressions. Seeded from global pointer states, call-site parameter states, and function summaries. Primary consumer: EXP34-C; also used by API00-C.
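The join and edge-refinement steps can be sketched as follows. This is an illustration under stated assumptions, not the contents of null_state.rs: Unknown is treated as "no fact yet" (the identity for join), and the Cond type is a hypothetical miniature of a branch condition.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NullState {
    Unknown,
    DefinitelyNull,
    PossiblyNull,
    NotNull,
}

/// Merge the states arriving from two predecessor blocks.
/// Assumption: Unknown carries no information, so the other side wins.
fn join(a: NullState, b: NullState) -> NullState {
    use NullState::*;
    match (a, b) {
        (x, y) if x == y => x,
        (Unknown, x) | (x, Unknown) => x,
        _ => PossiblyNull,
    }
}

/// Hypothetical miniature of a branch condition over pointer names.
enum Cond {
    NotNull(&'static str), // `p != NULL` or `p`
    IsNull(&'static str),  // `p == NULL` or `!p`
    And(Box<Cond>, Box<Cond>),
    Or(Box<Cond>, Box<Cond>),
}

/// Refine `env` along the edge taken when `cond` evaluates to `branch_taken`.
fn refine(cond: &Cond, branch_taken: bool, env: &mut HashMap<&'static str, NullState>) {
    match cond {
        Cond::NotNull(p) => {
            let s = if branch_taken { NullState::NotNull } else { NullState::DefinitelyNull };
            env.insert(*p, s);
        }
        Cond::IsNull(p) => {
            let s = if branch_taken { NullState::DefinitelyNull } else { NullState::NotNull };
            env.insert(*p, s);
        }
        Cond::And(l, r) => {
            // `a && b` being true means both operands were true.
            // On the false edge nothing is certain, so nothing is refined.
            if branch_taken {
                refine(l, true, env);
                refine(r, true, env);
            }
        }
        Cond::Or(l, r) => {
            // `a || b` being false means both operands were false.
            if !branch_taken {
                refine(l, false, env);
                refine(r, false, env);
            }
        }
    }
}
```

For `if (p && q)`, the true edge marks both pointers NotNull, while the false edge leaves the environment untouched because either operand could have failed.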

Value range analysis (src/analyze/value_range.rs).

CFG-based forward value-range dataflow for integer variables. Tracks TypedRange (interval + signedness/bit-width) per variable. Handles sequential assignments, conditional narrowing, loop bounds, and early-return guards. Inter-procedural return ranges from function summaries. Consumed by INT30/31/32/33/34-C.
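A typed interval with conditional narrowing might look like the sketch below. The TypedRange name comes from the text; its fields and methods here are assumptions chosen for illustration, with i128 bounds so that every 64-bit C integer type fits without wrapping inside the analysis itself.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TypedRange {
    lo: i128,
    hi: i128,
    signed: bool,
    bits: u32,
}

impl TypedRange {
    /// Representable bounds of an integer type with the given shape.
    fn type_bounds(signed: bool, bits: u32) -> (i128, i128) {
        if signed {
            (-(1i128 << (bits - 1)), (1i128 << (bits - 1)) - 1)
        } else {
            (0, (1i128 << bits) - 1)
        }
    }

    /// Range of a variable about which nothing is known yet.
    fn full(signed: bool, bits: u32) -> Self {
        let (lo, hi) = Self::type_bounds(signed, bits);
        TypedRange { lo, hi, signed, bits }
    }

    /// Range of a known constant.
    fn exact(value: i128, signed: bool, bits: u32) -> Self {
        TypedRange { lo: value, hi: value, signed, bits }
    }

    /// Narrow on the true edge of `x < bound`.
    fn narrow_less_than(self, bound: i128) -> Self {
        TypedRange { hi: self.hi.min(bound - 1), ..self }
    }

    /// Narrow on the true edge of `x >= bound`.
    fn narrow_at_least(self, bound: i128) -> Self {
        TypedRange { lo: self.lo.max(bound), ..self }
    }

    /// Can `self + other` leave the representable range of self's type?
    fn add_may_overflow(self, other: TypedRange) -> bool {
        let (min, max) = Self::type_bounds(self.signed, self.bits);
        self.lo + other.lo < min || self.hi + other.hi > max
    }
}
```

An unconstrained `int x` makes `x + 1` a potential overflow, while the same expression under a guard such as `if (x >= 0 && x < 100)` narrows x to [0, 99] and the report can be dropped.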

Constant evaluation (src/analyze/const_eval.rs).

Syntactic constant folding of #define macro constants and arithmetic expressions. Includes built-in C99 <limits.h>/<stdint.h> macros (LP64 model). try_evaluate_range() computes value ranges from constants + variables + loop bounds via ancestor walks. Consumed by 11 rules (INT, ENV, ERR, FIO, FLP, STR families).
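Syntactic folding over macro constants can be sketched with a tiny expression tree. Expr, builtin_limits, and try_fold are hypothetical names for this illustration and do not mirror the API of const_eval.rs; the built-in values shown are the LP64 values of the corresponding <limits.h> macros.

```rust
use std::collections::HashMap;

/// Hypothetical miniature expression tree.
enum Expr {
    Lit(i64),
    Macro(&'static str),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
}

/// A few built-in limits under the LP64 data model.
fn builtin_limits() -> HashMap<&'static str, i64> {
    let mut m = HashMap::new();
    m.insert("INT_MAX", 2_147_483_647);
    m.insert("UINT_MAX", 4_294_967_295);
    m.insert("LONG_MAX", i64::MAX);
    m
}

/// Fold `expr` to a constant, or return None when a macro is unknown,
/// the divisor is zero, or the result does not fit in i64.
fn try_fold(expr: &Expr, macros: &HashMap<&'static str, i64>) -> Option<i64> {
    match expr {
        Expr::Lit(v) => Some(*v),
        Expr::Macro(name) => macros.get(*name).copied(),
        Expr::Add(l, r) => try_fold(l, macros)?.checked_add(try_fold(r, macros)?),
        Expr::Mul(l, r) => try_fold(l, macros)?.checked_mul(try_fold(r, macros)?),
        Expr::Div(l, r) => try_fold(l, macros)?.checked_div(try_fold(r, macros)?),
    }
}
```

Folding in a type wider than the C operand type is a deliberate choice in this sketch: `INT_MAX + 1` folds to 2147483648, which a rule can then compare against the range of int instead of silently wrapping.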

Reaching definitions (src/analyze/dataflow.rs).

Standard iterative worklist algorithm computing which definitions (Declaration, Assignment, Parameter, NullAssignment, FreeCall, NullableCall) reach each program point. Supports use-after-free and null dereference queries. Primary consumer: MEM01-C.
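The iterative worklist algorithm is the textbook one and can be sketched compactly. The Block layout below is hypothetical, definitions are plain integer ids, and treating a FreeCall as a definition that kills the earlier ones for the same pointer is a modeling assumption of this sketch.

```rust
use std::collections::{BTreeSet, VecDeque};

/// Hypothetical basic block with precomputed GEN and KILL sets.
struct Block {
    preds: Vec<usize>,
    succs: Vec<usize>,
    generated: BTreeSet<usize>, // definition ids created in this block
    killed: BTreeSet<usize>,    // definition ids overwritten in this block
}

/// Convenience constructor for small sets of definition ids.
fn set(ids: &[usize]) -> BTreeSet<usize> {
    ids.iter().copied().collect()
}

/// Returns, for each block, the definitions reaching its entry.
fn reaching_definitions(blocks: &[Block]) -> Vec<BTreeSet<usize>> {
    let mut in_sets: Vec<BTreeSet<usize>> = vec![BTreeSet::new(); blocks.len()];
    let mut out_sets: Vec<BTreeSet<usize>> = vec![BTreeSet::new(); blocks.len()];
    let mut worklist: VecDeque<usize> = (0..blocks.len()).collect();

    while let Some(b) = worklist.pop_front() {
        // IN[b] = union of OUT[p] over all predecessors p.
        let mut in_set = BTreeSet::new();
        for &p in &blocks[b].preds {
            in_set.extend(out_sets[p].iter().copied());
        }
        // OUT[b] = GEN[b] union (IN[b] minus KILL[b]).
        let mut out_set: BTreeSet<usize> =
            in_set.difference(&blocks[b].killed).copied().collect();
        out_set.extend(blocks[b].generated.iter().copied());

        in_sets[b] = in_set;
        if out_set != out_sets[b] {
            out_sets[b] = out_set;
            // Successors must be revisited because their input changed.
            for &s in &blocks[b].succs {
                if !worklist.contains(&s) {
                    worklist.push_back(s);
                }
            }
        }
    }
    in_sets
}
```

Take `p = malloc(n); if (c) free(p); use(p);` with the allocation as definition 0 and the free as definition 1. Both definitions reach the block containing `use(p)`, which is the situation a use-after-free query looks for.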

Initialization state (src/analyze/init_state.rs).

Forward dataflow tracking initialization status with malloc-aware semantics (Uninitialized, MaybeUninitialized, Initialized, MallocUninitialized, MallocInitialized). Detects partial-init patterns in loops. Primary consumer: EXP33-C.
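The five states suggest a transfer function and a join along the following lines. This is a sketch under assumptions: the Event type is hypothetical, the join rule is one plausible choice rather than the one in init_state.rs, and partial-init detection in loops is not modeled here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InitState {
    Uninitialized,
    MaybeUninitialized,
    Initialized,
    MallocUninitialized,
    MallocInitialized,
}

/// Hypothetical events that change a variable's initialization state.
#[derive(Debug, Clone, Copy)]
enum Event {
    Declare, // `int *p;` with no initializer
    Assign,  // assignment from an ordinary expression
    Malloc,  // `p = malloc(n)`: memory exists but holds indeterminate values
    Calloc,  // `p = calloc(n, m)`: memory is zero-initialized
    FillAll, // whole object written, e.g. memset over the full size
}

fn transfer(state: InitState, event: Event) -> InitState {
    match event {
        Event::Declare => InitState::Uninitialized,
        Event::Assign => InitState::Initialized,
        Event::Malloc => InitState::MallocUninitialized,
        Event::Calloc => InitState::MallocInitialized,
        Event::FillAll => match state {
            InitState::MallocUninitialized | InitState::MallocInitialized => {
                InitState::MallocInitialized
            }
            _ => InitState::Initialized,
        },
    }
}

fn is_initialized(s: InitState) -> bool {
    matches!(s, InitState::Initialized | InitState::MallocInitialized)
}

/// Merge states from two predecessor blocks.
fn join(a: InitState, b: InitState) -> InitState {
    if a == b {
        a
    } else if is_initialized(a) && is_initialized(b) {
        InitState::Initialized
    } else {
        InitState::MaybeUninitialized
    }
}
```

Keeping the malloc states separate from the plain ones lets a rule distinguish "pointer never assigned" from "pointer valid but pointee contents indeterminate", which call for different diagnostics.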

Standard function database (src/utility/cert_c/std_functions.rs).

~370 C11, POSIX, and Windows API functions recognized to suppress false positives on standard library calls (DCL31-C, DCL07-C).

Suppression system (src/analyze/suppression.rs).

Suppressions come from inline // SQC-SUPPRESS comments and from .sqc-suppress.toml files. Two kinds are supported: SHA-256 hash-based point suppressions and glob/prefix wildcard suppressions.
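The wildcard half of the filter can be sketched with the standard library alone. Everything here is an assumption made for illustration: the WildcardSuppression struct, its two fields, and the choice that `*` matches any run of characters including `/`. The hash-based half is left out because SHA-256 is not part of the Rust standard library, and the actual schema of .sqc-suppress.toml is not shown in this document.

```rust
/// Match `text` against `pattern`, where `*` matches any run of characters
/// (including none). All other characters match literally.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Position after the most recent '*' and the text index it was tried at.
    let mut star: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last '*' absorb one more character.
            pi = sp;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Whatever is left of the pattern must consist of '*' only.
    p[pi..].iter().all(|&c| c == '*')
}

/// Hypothetical wildcard suppression entry: both fields are glob patterns.
struct WildcardSuppression {
    rule: String,
    path: String,
}

/// A finding is suppressed when any entry matches both its rule and its file.
fn is_suppressed(entries: &[WildcardSuppression], rule_id: &str, file: &str) -> bool {
    entries
        .iter()
        .any(|s| glob_match(&s.rule, rule_id) && glob_match(&s.path, file))
}
```

A prefix suppression is then simply a pattern that ends in `*`, such as `src/vendor/*` for the path or `INT3*` for the rule.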

Current Capabilities

Capability                     Implementation
-----------------------------  ---------------------------------------------
Local variable/type inference  Per-function collect_variable_types
Preprocessor block traversal   preproc_* node recursion
Standard function database     ~370 C11/POSIX/Windows functions
Cross-file function scanning   -d flag pre-scan with binary cache
CFG construction               Per-function with condition_range metadata
Reaching definitions           Iterative worklist dataflow (MEM01-C)
Inter-procedural summaries     Null returns, freed params, no-return, return ranges, dereferences, pass-throughs
CFG-based null state dataflow  Forward dataflow with NullState lattice, compound condition support, global/call-site seeding
Value range analysis           CFG-based forward dataflow, inter-procedural return ranges, type-aware intervals
Initialization state analysis  Forward dataflow with malloc-aware semantics
Constant evaluation            Macro resolution, built-in limits, sizeof types
Call-site null propagation     Aggregated argument states across all callers
Transitive free propagation    Parameter pass-through chains (MEM31-C)
Global pointer null state      Cross-file extern pointer tracking (EXP34-C)
Struct field type resolution   Prescan-collected struct definitions
Taint tracking                 Intra-function (FIO30-C, STR02-C)
Dead-branch elimination        Macro-constant-aware CFG construction

Known Limitations

Gap                            Impact
-----------------------------  ---------------------------------------------
No preprocessor expansion      Macros appear as function calls; partially mitigated by collect_macro_aliases
No alias analysis              Pointer aliasing unresolved; field-scoped alias collection causes cross-function issues
No symbolic execution          Complex path conditions not evaluated
No SSA form                    No use-def chains beyond reaching definitions
VRA intra-procedural only      Inter-procedural argument ranges and field-sensitive VRA not implemented; return ranges available
Limited taint tracking         Intra-function only (STR02-C, FIO30-C); cross-function taint for injection CWEs planned
Struct field tracking limited  Prescan-visible structs only (INT32-C/INT30-C); no field-level free or null tracking
No ownership model             Cross-function memory ownership untracked; limits MEM31-C/MEM30-C precision

Architectural Ceiling

Current TP rate: 67.5% (Juliet, v0.3.119, 74 CWEs). The remaining gaps are concentrated in CWEs requiring deeper analysis:

  • CWE-190/191 (integer overflow/underflow): 60.9%/55.3% vs clang-tidy 94%. Requires more complete value-range propagation and bounds-check recognition.

  • CWE-369 (divide by zero): 56.0% vs clang-tidy 94.7%. Requires stronger zero-value tracking through assignments.

  • CWE-476 (null dereference): 61.9% vs clang-tidy 94.3%. Requires deeper inter-procedural null propagation and alias analysis.

  • CWE-121 (stack buffer overflow): 57.5% vs clang-tidy 86.6%. Requires symbolic buffer size tracking across assignments.

Alias analysis and field-sensitive value tracking are the two capabilities most likely to lift the ceiling. Each would require significant architectural investment but could push TP rate toward 75%+.

Competitor Landscape

5-tool comparison on 15 overlapping Juliet CWEs (28,488 files):

Tool        Detection Rate  FP Rate  Analysis Depth                Price
----------  --------------  -------  ----------------------------  -----
clang-tidy  91.6%           0.8%     AST + path-sensitive          Free
SqC         67.5%           32.5%    AST + CFG + inter-procedural
Frama-C     61.0%           39.0%    Abstract interpretation       Free
Infer       43.6%           56.4%    Separation logic              Free
cppcheck    36.4%           63.6%    Data-flow                     Free

SqC figures come from the v0.3.119 Juliet benchmark (74 CWEs); competitor figures come from a prior study on the 15 overlapping CWEs, so the SqC row is not measured on the same CWE set as the other rows.

SqC achieves 100% precision (zero FP) on 34 CWEs, including CWE-690, CWE-761, CWE-78, and CWE-416. It also has the broadest CWE coverage of the tools compared (74+ CWEs benchmarked vs clang-tidy’s 15).

Key context: Tools on average find ~20% of weaknesses in Juliet (ISSTA2022). Even commercial tools miss 27% (Goseva2015). Industry FP target for adoption is 10–20%. See Bibliography for full references.