The Multi-Pass Assay
STET's matching algorithm runs in five passes (numbered 1, 2, 2.5, 3, and 4), processing matches in order of decreasing certainty to prevent the "greedy trap", where weak early matches steal better candidates.
Overview
Traditional reconciliation systems often use greedy matching: they accept the first plausible match and move on. This leads to cascading errors, where a mediocre early match consumes a candidate that a later transaction would have matched perfectly.
The Multi-Pass Assay solves this by processing matches in rounds. Each pass has strict criteria, and matched transactions are removed from the pool before the next pass begins.
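The round-based structure can be sketched as follows. This is a minimal illustration, not STET's implementation: the `Txn` record, `run_passes`, and the two example rules are hypothetical names, and each pass here simply takes the first qualifying candidate.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Txn:
    """Hypothetical transaction record; amounts are integer cents."""
    id: str
    amount_cents: int
    posted: date
    description: str


Rule = Callable[[Txn, Txn], bool]


def run_passes(bank: List[Txn], ledger: List[Txn], passes: List[Tuple[str, Rule]]):
    """Apply passes strictest-first; matched rows leave both pools."""
    bank_pool, ledger_pool = list(bank), list(ledger)
    matches = []
    for name, rule in passes:
        # Iterate over a copy because the pool shrinks as matches are found.
        for txn in list(bank_pool):
            hit = next((entry for entry in ledger_pool if rule(txn, entry)), None)
            if hit is not None:
                matches.append((name, txn, hit))
                bank_pool.remove(txn)
                ledger_pool.remove(hit)
    return matches, bank_pool, ledger_pool


def exact_rule(txn: Txn, entry: Txn) -> bool:
    """Illustrative strict rule: amount, date, and lowercased description agree."""
    return (txn.amount_cents, txn.posted, txn.description.lower()) == (
        entry.amount_cents, entry.posted, entry.description.lower())


def loose_rule(txn: Txn, entry: Txn) -> bool:
    """Illustrative weak rule: only amount and date agree."""
    return (txn.amount_cents, txn.posted) == (entry.amount_cents, entry.posted)
```

If bank rows "AWS Inc" and "AWS" compete for ledger rows "AWS" and "AWS Incorporated" (same amount and date), running `loose_rule` alone pairs "AWS Inc" with "AWS". Running `exact_rule` first reserves the ledger "AWS" row for the bank "AWS" row, and the loose pass then pairs the remaining two.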
Pass 1: The Anchor
- Exact amount match (to the cent)
- Exact date match
- Exact description match (after normalization)
Confidence: 100%
Anchor matches are perfect identity matches with zero ambiguity. These form the foundation of the reconciliation.
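A sketch of an Anchor check is shown below. The normalization rules (case-folding, dropping punctuation, collapsing whitespace) are an assumption, since the exact steps are not specified here. The dictionary keys and function names are likewise hypothetical.

```python
import re


def normalize_description(text: str) -> str:
    """Assumed normalization: casefold, replace punctuation, collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.casefold())
    return " ".join(text.split())


def is_anchor_match(bank: dict, ledger: dict) -> bool:
    """True only when amount, date, and normalized description are all identical.

    Amounts are compared as integer cents to avoid floating-point rounding.
    """
    return (
        bank["amount_cents"] == ledger["amount_cents"]
        and bank["date"] == ledger["date"]
        and normalize_description(bank["description"])
        == normalize_description(ledger["description"])
    )
```

Storing amounts as integer cents keeps "to the cent" comparisons exact. Comparing floats such as `0.1 + 0.2` and `0.3` with `==` would not be.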
Pass 2: High Confidence Fuzzy
- Exact amount match
- Exact date match
- Description similarity ≥ 85% (Levenshtein ratio)
Confidence: 85-99%
Catches minor description variations like 'AWS' vs 'AWS Inc' or 'Wire Transfer' vs 'Wire Xfer'.
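One common way to turn Levenshtein distance into a similarity score is sketched below. It assumes the score is `1 - distance / max(len)`. Libraries define "Levenshtein ratio" in slightly different ways, and the result also depends on how descriptions are normalized first, so this may not reproduce STET's scores exactly. The function names are hypothetical.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed ratio: 1.0 for identical strings, 0.0 for entirely different ones."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / max(len(a), len(b))


def is_fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Description test for Pass 2; amount and date must already match exactly."""
    return similarity(a, b) >= threshold
```

Short strings are sensitive to this measure because a single edit removes a large share of the score, which is one reason normalization before comparison matters.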
Pass 2.5: Semantic Match
- Exact amount match
- Exact date match
- Semantic embedding similarity ≥ 85%
Confidence: 85-99%
Uses ML embeddings (sentence-transformers) to match conceptually similar descriptions that fuzzy matching misses. Example: 'AWS' ↔ 'Amazon Web Services'.
ML Enhancement: Semantic matching uses the all-MiniLM-L6-v2 model to compute text embeddings and cosine similarity.
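The semantic check can be sketched as below. The cosine similarity is written in plain Python so the logic is visible. `load_minilm_embedder` shows one way to obtain embeddings with the sentence-transformers package, which must be installed separately and downloads the model on first use. The function names and the pluggable `embed` parameter are assumptions made for illustration, not STET's API.

```python
import math
from typing import Callable, Sequence

Embedder = Callable[[str], Sequence[float]]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def load_minilm_embedder() -> Embedder:
    """Build an embedder backed by all-MiniLM-L6-v2 (imported lazily)."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return lambda text: model.encode(text).tolist()


def is_semantic_match(a: str, b: str, embed: Embedder, threshold: float = 0.85) -> bool:
    """Description test for Pass 2.5; amount and date must already match exactly."""
    return cosine_similarity(embed(a), embed(b)) >= threshold
```

A typical call would be `is_semantic_match("AWS", "Amazon Web Services", load_minilm_embedder())`. Whether a given pair clears the 0.85 threshold depends on the model's actual embeddings.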
Pass 3: The Float
- Exact amount match
- Date within ±3 business days
- Description similarity ≥ 70%
Confidence: 70-99%
Handles timing differences caused by weekends, holidays, or processing delays. A Friday bank transaction might appear on Monday in the ledger.
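The ±3 business day window can be sketched with a simple weekday counter. The counting convention (business days after the earlier date, up to and including the later date) and the optional `holidays` set are assumptions. The function names are hypothetical.

```python
from datetime import date, timedelta
from typing import AbstractSet


def business_days_between(start: date, end: date,
                          holidays: AbstractSet[date] = frozenset()) -> int:
    """Count weekdays (excluding holidays) after the earlier date, through the later date."""
    if start > end:
        start, end = end, start
    count = 0
    day = start
    while day < end:
        day += timedelta(days=1)
        if day.weekday() < 5 and day not in holidays:  # Monday=0 ... Friday=4
            count += 1
    return count


def within_float_window(bank_date: date, ledger_date: date,
                        max_business_days: int = 3,
                        holidays: AbstractSet[date] = frozenset()) -> bool:
    """Date test for Pass 3; the order of the two dates does not matter."""
    return business_days_between(bank_date, ledger_date, holidays) <= max_business_days
```

Under this convention, a bank transaction on Friday 2024-03-01 and a ledger entry on Monday 2024-03-04 are one business day apart, so they fall inside the window.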
Pass 4: Redline Analysis
After the matching passes, Redline analyzes the remaining unmatched transactions and flags discrepancies:
- Classification Errors: Same amount + date, but description similarity <40%
- Amount Variance: Same date + description, but different amounts
- Timing Mismatch: Same amount + description, but dates >3 business days apart
- Missing Entries: Transactions in one source with no candidate in the other
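The four rules can be expressed as a small classifier, sketched below. Several details are assumptions: "same description" is taken to mean a similarity of at least 0.70, "same date" means zero business days apart, and the 3-day limit is counted in business days. The `Pair` record and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Pair:
    """Precomputed comparison between one leftover transaction and one candidate."""
    same_amount: bool
    description_similarity: float  # 0.0 to 1.0
    business_days_apart: int


def classify_pair(pair: Pair, same_description: float = 0.70) -> Optional[str]:
    """Return a discrepancy label for a candidate pair, or None if no rule applies."""
    same_date = pair.business_days_apart == 0
    similar = pair.description_similarity >= same_description
    if pair.same_amount and same_date and pair.description_similarity < 0.40:
        return "classification_error"
    if not pair.same_amount and same_date and similar:
        return "amount_variance"
    if pair.same_amount and similar and pair.business_days_apart > 3:
        return "timing_mismatch"
    return None


def classify_transaction(candidates: Iterable[Pair]) -> str:
    """Label a leftover transaction; with no qualifying candidate it is a missing entry."""
    for pair in candidates:
        label = classify_pair(pair)
        if label is not None:
            return label
    return "missing_entry"
```

Pairs that fall between the thresholds (for example, same amount and date with a similarity of 0.5) match none of the first three rules, so this sketch treats the transaction as a missing entry.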
Discrepancy Types
- Missing Entries: Transaction exists in one source but has no match in the other. May indicate a recording error or fraud.
- Timing Mismatch: Same transaction, but dates differ by more than 3 business days. Often benign but worth reviewing.
- Amount Variance: Same transaction description and date, but amounts don't match. Requires investigation.
- Classification Errors: Same amount and date, but descriptions are significantly different. Could be miscategorization.
Confidence Scores
Every match includes a confidence score from 0.0 to 1.0. Scores fall into four bands, listed here from highest to lowest:
- Anchor or near-exact matches
- Strong fuzzy or semantic matches
- Float matches with date variance
- Flagged for review (usually discrepancies)