5-Pass Matching Pipeline
STET's reconciliation engine runs in 5 passes, processing matches in order of decreasing certainty to prevent the "greedy trap" where weak early matches steal better candidates from later, more rigorous passes.
Overview
Many reconciliation tools match greedily: they accept the first plausible match and move on. A mediocre early match can then permanently block a perfect later one, and the errors cascade.
STET solves this by running matching in strict rounds. Each pass applies specific criteria, and matched transactions are removed from the candidate pool before the next pass begins. Deterministic rules always run first; ML is used only as an enhancement for edge cases.
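The round structure described above can be sketched as follows. This is a minimal illustration, not STET's actual implementation: the `Txn` type, the `run_passes` function, and the deliberately loose `amount_and_date` rule are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Txn:
    id: str
    amount_cents: int  # integer cents avoid float rounding in "exact" checks
    day: date
    description: str


# Hypothetical rule type: True when a bank/ledger pair satisfies a pass.
Rule = Callable[[Txn, Txn], bool]


def run_passes(bank: List[Txn], ledger: List[Txn], passes: List[Tuple[str, Rule]]):
    """Apply rules strictest-first; a matched pair leaves both pools for good."""
    bank_pool, ledger_pool = list(bank), list(ledger)
    matches = []
    for name, rule in passes:
        for b in list(bank_pool):  # iterate a copy, the pool shrinks as we match
            hit = next((l for l in ledger_pool if rule(b, l)), None)
            if hit is not None:
                matches.append((name, b.id, hit.id))
                bank_pool.remove(b)
                ledger_pool.remove(hit)
    return matches, bank_pool, ledger_pool


def exact(a: Txn, b: Txn) -> bool:
    return (a.amount_cents, a.day, a.description) == (b.amount_cents, b.day, b.description)


def amount_and_date(a: Txn, b: Txn) -> bool:
    # Illustrative loose rule only; not one of STET's documented passes.
    return (a.amount_cents, a.day) == (b.amount_cents, b.day)


bank = [Txn("b1", 10000, date(2024, 3, 1), "AWS Inc"),
        Txn("b2", 10000, date(2024, 3, 1), "AWS")]
ledger = [Txn("l1", 10000, date(2024, 3, 1), "AWS")]

matches, bank_left, ledger_left = run_passes(
    bank, ledger, [("anchor", exact), ("loose", amount_and_date)]
)
```

With the strict rule running first, `l1` is paired with its exact counterpart `b2`, and `b1` stays unmatched for later review. Running only the loose rule would pair `b1` with `l1` instead, which is the greedy trap in miniature.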
Important: STET is a software tool. Its outputs are technical records of a matching process, not audit opinions or professional certifications. All results require review by a qualified professional before any reliance.
Pass 1: Anchor
- Exact amount match (to the cent)
- Exact date match
- Exact description match (after normalization)
Confidence: 100%
Perfect identity matches with zero ambiguity. These form the foundation of the reconciliation and are excluded from all subsequent passes.
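A sketch of what an anchor check might look like. The normalization steps are not specified here, so the ones below (lowercasing, dropping punctuation, collapsing whitespace) are assumptions, and `normalize` and `is_anchor_match` are hypothetical names.

```python
import re
from datetime import date


def normalize(description: str) -> str:
    # Assumed normalization: lowercase, replace punctuation with spaces,
    # then collapse runs of whitespace.
    text = re.sub(r"[^a-z0-9 ]+", " ", description.lower())
    return " ".join(text.split())


def is_anchor_match(a: dict, b: dict) -> bool:
    """All three criteria must hold: amount (in cents), date, normalized description."""
    return (
        a["amount_cents"] == b["amount_cents"]
        and a["date"] == b["date"]
        and normalize(a["description"]) == normalize(b["description"])
    )


bank_txn = {"amount_cents": 125000, "date": date(2024, 3, 1), "description": "  AWS, Inc. "}
ledger_txn = {"amount_cents": 125000, "date": date(2024, 3, 1), "description": "aws inc"}
anchor_hit = is_anchor_match(bank_txn, ledger_txn)
```

Storing amounts as integer cents keeps the "to the cent" comparison exact, since floating-point dollar values can differ by rounding error.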
Pass 2: High-Confidence Fuzzy
- Exact amount match
- Exact date match
- Description similarity ≥ 85% (Levenshtein ratio)
Confidence: 85–99%
Catches minor description variations like 'AWS' vs 'AWS Inc' or 'Wire Transfer' vs 'Wire Xfer'. Amount and date must still be exact.
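For illustration, one common definition of a Levenshtein ratio is 1 minus the edit distance divided by the longer string's length. Which definition STET uses, and how descriptions are normalized before scoring, is not stated here, so scores for a given pair may differ from what this sketch produces. The function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    # Assumed definition: 1 - distance / length of the longer string.
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    return similarity(a, b) >= threshold


score = similarity("acme corp payment", "acme corp paymnt")
```

Here the two descriptions differ by a single deleted character out of 17, so the score clears the 0.85 threshold.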
Pass 2.5: Semantic Match
- Exact amount match
- Exact date match
- Semantic embedding similarity ≥ 85%
Confidence: 85–99%
Uses ML embeddings to match conceptually similar descriptions that character-level fuzzy matching misses. Example: 'AWS' ↔ 'Amazon Web Services'.
ML Note: Semantic matching uses the frozen all-MiniLM-L6-v2 model, run entirely in your browser. Your data never leaves your device during this step. The model produces similarity scores only; it cannot invent or fabricate transactions.
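The similarity metric is not named above. Cosine similarity is a common choice for sentence embeddings and is assumed in the sketch below. The vectors are made-up toy values, not real model output, and the function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def is_semantic_match(u: Sequence[float], v: Sequence[float], threshold: float = 0.85) -> bool:
    return cosine_similarity(u, v) >= threshold


# Toy 3-dimensional vectors for illustration only.
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Because the function only compares two existing vectors and returns a number, a step like this can score candidate pairs but cannot create transactions.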
Pass 3: Float
- Exact amount match
- Date within ±3 business days
- Description similarity ≥ 70%
Confidence: 70–99%
Handles timing differences caused by weekends, holidays, or processing delays. A Friday bank transaction might appear Monday in the ledger. The auditor should verify each float match.
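The date window could be checked along these lines. This sketch covers only the ±3 business day criterion, counts Monday through Friday as business days, and takes holidays as an optional caller-supplied set. How STET sources its holiday calendar is not stated here, and the function names are hypothetical.

```python
from datetime import date, timedelta
from typing import FrozenSet


def business_days_between(start: date, end: date,
                          holidays: FrozenSet[date] = frozenset()) -> int:
    """Count business days stepped through when moving from one date to the other."""
    if start > end:
        start, end = end, start
    count = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        # weekday() is 0 for Monday through 6 for Sunday
        if current.weekday() < 5 and current not in holidays:
            count += 1
    return count


def within_float_window(a: date, b: date, max_gap: int = 3,
                        holidays: FrozenSet[date] = frozenset()) -> bool:
    return business_days_between(a, b, holidays) <= max_gap


friday = date(2024, 3, 1)
monday = date(2024, 3, 4)
gap = business_days_between(friday, monday)
```

The Friday-to-Monday case from the text comes out as a gap of one business day, well inside the window, while Friday to the following Thursday is four business days and falls outside it.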
Pass 4: Redline
After all matching passes complete, Redline analyzes remaining unmatched transactions and classifies them into discrepancy types for auditor triage. Redline is a flagging tool — investigation and final determination are the auditor's responsibility.
- Classification Errors: Same amount + date, but description similarity <40%
- Amount Variance: Same date + description, but different amounts
- Timing Mismatch: Same amount + description, but dates >3 business days apart
- Missing Entries: Transactions present in one source with no candidate in the other
Discrepancy Types
Missing Entries: Transaction exists in one source but has no match in the other. May indicate a recording error or fraud. Requires auditor investigation.
Timing Mismatch: Same transaction, but dates differ by more than 3 business days. Often benign (processing delays) but must be reviewed by the auditor.
Amount Variance: Same description and date, but amounts differ. Requires auditor investigation to determine root cause.
Classification Errors: Same amount and date, but descriptions are significantly different. Could be miscategorization or an unrelated transaction.
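The four discrepancy rules can be read as a small decision function. In the sketch below, the 40% and 3 business day thresholds come from the rules above. The cutoff for "same description" is not defined in this section, so `SAME_DESCRIPTION` is an assumption, as are the `Comparison` type, the `classify` function, and the `"unclassified"` fallback.

```python
from dataclasses import dataclass
from typing import Optional

LOW_SIMILARITY = 0.40    # from the Classification Errors rule
MAX_FLOAT_GAP = 3        # business days, from the Timing Mismatch rule
SAME_DESCRIPTION = 0.85  # assumption: "same description" is not defined in the source


@dataclass(frozen=True)
class Comparison:
    """Precomputed comparison of an unmatched transaction with its best candidate."""
    same_amount: bool
    business_day_gap: int          # 0 means same date
    description_similarity: float  # 0.0 to 1.0


def classify(c: Optional[Comparison]) -> str:
    """Return a discrepancy label; None means no candidate exists in the other source."""
    if c is None:
        return "missing_entry"
    same_date = c.business_day_gap == 0
    same_description = c.description_similarity >= SAME_DESCRIPTION
    if c.same_amount and same_date and c.description_similarity < LOW_SIMILARITY:
        return "classification_error"
    if same_date and same_description and not c.same_amount:
        return "amount_variance"
    if c.same_amount and same_description and c.business_day_gap > MAX_FLOAT_GAP:
        return "timing_mismatch"
    return "unclassified"  # hypothetical fallback for pairs fitting no rule


label = classify(Comparison(same_amount=True, business_day_gap=5, description_similarity=0.95))
```

The function only assigns a label for triage. Whether a flagged item is material, intentional, or an error remains the reviewer's call.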
Auditor's Responsibility: Discrepancy flags are produced automatically by pattern matching. STET does not determine whether a discrepancy is material, intentional, or an error. That determination belongs solely to the qualified professional reviewing the output.
Confidence Scores
Every match includes a confidence score from 0.0 to 1.0 based on the pass and criteria used. Confidence scores are inputs to the auditor's judgment, not final verdicts.
Anchor or near-exact matches
Strong fuzzy or semantic matches
Float matches with date variance — warrant review
Flagged for auditor triage