EVERY M&A DEAL CARRIES THREE RISKS: BAD DATA, LIMITED TIME, AND MISSED DISCREPANCIES.
Buy-side teams inherit the seller's data room. The problems are already inside.
BAD DATA
Sellers present financials across dozens of PDFs, Excel exports, and management accounts. Reference numbers don't match. Amounts differ by rounding. Duplicate invoices inflate revenue. Manual cross-referencing catches 60–70% of these issues on a good day.
LIMITED TIME
Buy-side analysts burn 40–80 hours per deal manually reconciling ledger entries against data room documents. With compressed deal timelines and lean team sizes, bandwidth is the bottleneck — not insight.
MISSED DISCREPANCIES
The gaps that matter — deferred revenue, timing adjustments, undisclosed obligations — surface at closing, in rep & warranty claims, or after the wire clears. By then, negotiation leverage is gone.
HOW WE WORK
CONNECT VDR
Link to the seller's data room via Box, Dropbox, or Datasite — or point STET at a local folder. Files are pulled and processed entirely on-device. OAuth2 + PKCE. Nothing uploaded.
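For readers who want to see what the PKCE part of that handshake involves, here is a minimal sketch of the standard RFC 7636 verifier and challenge derivation (S256 method). It illustrates the public protocol only and is not STET's actual code; the function names `pkce_challenge` and `make_pkce_pair` are hypothetical.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def pkce_challenge(verifier: str) -> str:
    """Derive the S256 code challenge from a code verifier."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for one authorization attempt.

    Hypothetical helper: 32 random bytes give a 43-character verifier,
    which is the minimum length the RFC allows.
    """
    verifier = _b64url(secrets.token_bytes(32))
    return verifier, pkce_challenge(verifier)


if __name__ == "__main__":
    verifier, challenge = make_pkce_pair()
    # The challenge goes in the authorization request; the verifier is
    # kept on the device and sent only when exchanging the code for a token.
    print(len(verifier), len(challenge))
```

Because the verifier stays on the device until the token exchange, an intercepted authorization code is of no use to anyone who does not also hold the verifier.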
RUN RECONCILIATION
STET's 5-pass engine matches every transaction: exact hash dedup → fuzzy dedup → trigram Jaccard similarity → content fingerprint → HNSW semantic deep-dive. Handles 100k+ row ledgers and TB-scale data rooms.
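To make the third pass concrete, here is a minimal sketch of trigram Jaccard similarity: each string is broken into overlapping three-character windows, and the score is the size of the shared set divided by the size of the combined set. The normalization (lowercasing, whitespace collapsing) and the function names are assumptions for illustration, not a description of STET's internals.

```python
def trigrams(text: str) -> set[str]:
    """Return the set of overlapping 3-character windows in the text.

    Assumed normalization: lowercase and collapse runs of whitespace.
    """
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


def trigram_jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 1.0  # two empty (or too short) strings are treated as identical
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    # Near-duplicate descriptions score high; unrelated ones score low.
    print(trigram_jaccard("Invoice 1001 Acme", "Invoice 1001 Acme Ltd"))
    print(trigram_jaccard("Invoice 1001 Acme", "Payroll March"))
```

A score close to 1.0 suggests the two ledger descriptions refer to the same item despite small differences in wording; where to set the cut-off is a tuning choice.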
TRIAGE FINDINGS
Every flagged discrepancy links back to its source document in the data room. Annotate status, set priority, track resolution. Export a multi-tab Excel workbook or PDF report ready for your QoE memo or IC presentation.
THE CIM MAKES CLAIMS. STET CHECKS THEM.
Every CIM tells a story. Revenue growth, EBITDA margins, customer concentration, working capital normalization. STET reads the CIM, extracts those claims, and then asks the VDR to prove them.
Claims that can't be supported by source documents in the data room are flagged for analyst review — with a link to the CIM page that made the claim and the closest matching (or missing) evidence in the VDR.
This is how buy-side teams build their initial request list: not by reading the CIM and guessing what might be missing, but by running STET and knowing exactly what isn't supported.
CIM Ingestion
Upload the CIM PDF. STET extracts financial claims — revenue figures, margins, headcount, customer concentration, normalized EBITDA — from narrative text and financial tables.
Claim Verification
Each extracted claim is checked against the VDR: supporting documents, management accounts, signed financials. Verified, partially supported, and unsupported claims are categorized separately.
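One way to picture the three-way categorization is a simple numeric rule: compare the figure the CIM claims against every figure found for that metric in the VDR. The sketch below is purely illustrative. The rule itself, the 0.5% tolerance, and the names `ClaimStatus` and `classify_claim` are all assumptions, not STET's actual logic.

```python
import math
from enum import Enum


class ClaimStatus(Enum):
    VERIFIED = "verified"
    PARTIALLY_SUPPORTED = "partially supported"
    UNSUPPORTED = "unsupported"


def classify_claim(claimed: float, evidence: list[float],
                   rel_tol: float = 0.005) -> ClaimStatus:
    """Classify a claimed figure against values found in source documents.

    Assumed rule: all evidence agrees -> verified; some agrees ->
    partially supported; none agrees, or no evidence -> unsupported.
    """
    if not evidence:
        return ClaimStatus.UNSUPPORTED
    agreeing = [v for v in evidence if math.isclose(claimed, v, rel_tol=rel_tol)]
    if len(agreeing) == len(evidence):
        return ClaimStatus.VERIFIED
    if agreeing:
        return ClaimStatus.PARTIALLY_SUPPORTED
    return ClaimStatus.UNSUPPORTED


if __name__ == "__main__":
    # Claimed revenue of 12.5m checked against two documents that disagree.
    print(classify_claim(12_500_000, [12_500_000, 11_000_000]).value)
```

A relative tolerance absorbs the rounding differences that are common between a CIM and the underlying accounts, while still separating figures that genuinely conflict.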
Request Generation
Unsupported or conflicting claims become your initial diligence request list. Send to the seller with document references and expected evidence format — no manual drafting required.
Recheck on Response
When the seller uploads documents in response to your requests, STET reruns the verification automatically and updates claim status. You see what changed and what's still open.
FAST ENOUGH TO RUN BEFORE THE CALL.
85-page CIM. 220 data points extracted.
STET parsed an 85-page Confidential Information Memorandum — financial tables, narrative claims, operating metrics — and extracted 220 structured data points in under two seconds. Every extracted claim is traceable back to its source page.
4 GB data room. 0.5 MB ledger. Reconciled.
A 4-gigabyte virtual data room — hundreds of PDFs, Excel exports, and management accounts — reconciled against a half-megabyte financial ledger in five seconds. Every transaction matched, every discrepancy flagged with a link to its source document.
BUILT FOR BUY-SIDE DEAL TEAMS.
VDR Ingest
Connect directly to Box, Dropbox, or Datasite — or point STET at a local folder. Files are processed on-device. Nothing uploaded.
OAuth2 + PKCE. Handles PDF, Excel, CSV, PPTX, DOCX.
5-Pass Matching Engine
Exact hash dedup → Fuzzy dedup → Trigram Jaccard similarity → Content fingerprint → HNSW semantic deep-dive. Every transaction matched with a link to its source.
Handles ledgers with 100k+ rows. Processes TB-scale data rooms.
CIM Claim Extraction
Upload the CIM. STET extracts revenue figures, margin claims, headcount, working capital normalization — then checks each claim against the VDR.
Unsupported claims become your initial diligence request list.
Discrepancy Triage
Every flagged item links back to its source document in the data room. Annotate status, set priority, and track resolution through close.
Filter by amount, type, or status. Export triage directly to Excel.
Audit Certificate
SHA-256 hash-chained log of every match, discrepancy, and analyst annotation. Export a signed certificate for your deal file or lender package.
Reproducible and verifiable. Accepted by Big 4 audit teams.
Structured Export
Multi-tab Excel workbook and PDF formatted for QoE memos, working capital analyses, and IC presentations. Ready to attach to your deal file.
CSV export, issue packs, audit log — all included.
ZERO-KNOWLEDGE BY DESIGN. NOT BY POLICY.
Every other diligence tool asks you to upload deal documents to a server its vendor controls. That is a legal problem on day one: the moment you upload files from an NDA-protected data room, you may have breached your confidentiality obligations.
STET processes everything on your machine. Source text, extracted data, embeddings, match results — none of it is transmitted anywhere. The only thing that touches our infrastructure is your license check at startup.
This isn't just a privacy preference. It's the only architecture that is structurally compatible with deal confidentiality. It also means there is no breach surface: we cannot lose your clients' data because we never have it.
NDA Compliant from Day One
No upload ever occurs. STET reads files directly from disk or through the VDR API, and source text never leaves your machine. Your confidentiality obligations are structurally preserved, not just promised.
No Breach Surface
We cannot expose your clients' deal data because we don't store it. There is no database of extracted documents, no embeddings server, no match result log on our infrastructure.
Deterministic, Reproducible Results
Every match and discrepancy is produced by a deterministic pipeline — no probabilistic AI that changes answers between runs. The same input always produces the same output. Auditable.
SHA-256 Audit Certificate
Every run produces a hash-chained audit log covering every document processed, every match made, and every discrepancy flagged. Tamper-evident and verifiable by any third party.
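The idea behind a hash-chained log can be sketched in a few lines: each entry's SHA-256 hash covers both its own content and the previous entry's hash, so changing any earlier entry invalidates everything after it. This is a generic illustration under stated assumptions. The entry layout, the all-zero starting hash, and the function names are hypothetical and do not describe STET's certificate format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash before the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> dict:
    """Append an event to the log, linking it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev_hash": prev_hash,
             "hash": _entry_hash(prev_hash, event)}
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash in order; return False on the first mismatch."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log: list[dict] = []
    append_entry(log, {"type": "match", "ledger_row": 17, "doc": "invoice_0042.pdf"})
    append_entry(log, {"type": "discrepancy", "ledger_row": 18, "delta": "0.01"})
    print(verify_chain(log))            # chain is intact
    log[0]["event"]["ledger_row"] = 99  # tamper with an earlier entry
    print(verify_chain(log))            # tampering is detected
```

Anyone holding the log can rerun the verification without access to the original documents, which is what makes such a record checkable by a third party.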