Unsiloed AI — Investment Memo · Orange Collective

Thesis

Unsiloed AI is an API that converts complex multimodal documents (PDFs, scans, slides, charts) into structured, LLM-ready data with a vision-first pipeline — not an LLM-first one — built for speed, determinism, and enterprise deployment (incl. air-gapped on-prem).^[1] The market is shaping up like AI code (Codex / Cursor / Replit) and AI legal (Harvey / Eve / Crosby) — an industry, not a winner-take-all market. Unsiloed is the horizontal ingestion layer for regulated, multimodal workflows.^[19]

01
Deterministic, auditable parsing is mandatory for regulated AI. Banks, insurers, legal, and healthcare teams need reproducible, explainable outputs with layout preservation and confidence scoring. LLM-only pipelines are probabilistic, drift with model updates, and are hard to audit. Unsiloed produces deterministic, schema-aligned, citation-backed outputs — and the accuracy claim is now benchmarked: Parser v3.1 posted the top strict pass rate (88.0) on olmOCR-Bench in May 2026, ahead of GPT-5.5 (84.6), Reducto (66.0), and Unstructured (39.9).^[1]^[20]
02
Vision-first beats LLM-first on speed and cost. Vision models parallelize on GPUs; 7B-scale LLMs in the extraction loop drive higher latency, nondeterminism, and per-page token cost. Unsiloed's ~$0.01/page pricing undercuts Reducto, which charges roughly 2× as much per page.^[15]
03
A proprietary corpus becomes a compounding moat. 1M+ real multimodal documents and domain ontologies for finance, legal, and healthcare. Customer-specific post-training on low-confidence fields and corrections feeds back into the model — a reinforcement loop that compounds over time.
04
The team is the canonical technical pair for this stack. Aman (ultra low-latency trading in C++/Rust; AI copilots for Goldman / Schwab) and Adnan (MIT; multimodal models at a Fortune 10; autonomous perception at Mercedes). Both IIT Kharagpur. Already shipping into Fortune 150 banks and NASDAQ-listed enterprises.^[1]

Problem

AI teams spend 6+ months building document workflows. Fewer than 10% reach production.

Generic LLM parsers and OCR collapse on multimodal documents that contain text, tables, images, and charts. Poor parsing and suboptimal chunking cripple RAG pipelines and downstream automation. The proof-of-concept demo passes; the production rollout doesn't.^[1]

Financial, insurance, legal, and healthcare documents are not text-only. They frequently contain charts, infographics, styled text, footnotes, merged cells, multi-page tables, color-encoded semantics, and irregular multi-column layouts. These structures carry meaning that generic LLM parsers routinely miss, conflate, or hallucinate.

More importantly, LLMs cannot parse these multimodal elements deterministically — making them unsuitable for high-stakes, auditable extraction. A bank's reconciliation pipeline can't tolerate non-determinism; an insurance claim adjudicator can't accept "the parser sometimes flips merged cells."

80% of enterprise data is unstructured · only a fraction is analyzed

<10% of doc workflows reach production after 6+ months of build

1M+ documents in corpus · proprietary, multimodal, domain-tuned

Why Now

AI document extraction is an industry, not a market.

Like AI code (Codex / Cursor / Replit) and AI legal (Harvey / Eve / Crosby), document extraction will support multiple horizontal infra players and vertical specialists. Unsiloed is positioned as the horizontal ingestion layer for regulated, multimodal workflows.

Industries, not markets. AI categories like code, legal, and document extraction are not winner-take-all — they support multiple horizontal infrastructure players and vertical specialists.

Anish Acharya^[19]

General Partner · Andreessen Horowitz

The unstructured-to-AI layer is becoming core infra.

The data problem is enormous. 80% of enterprise data is unstructured; only a fraction is analyzed. Turning documents into AI-ready data is becoming as fundamental as databases were in the last era.^[2]

IDP is growing at 26%+ CAGR. Fortune Business Insights pegs Intelligent Document Processing at $10.6B in 2025, growing to $91B by 2034 — with North America holding ~48% share and banking and financial services leading adoption.^[3]

Capital has validated the category — fast. Reducto closed a $75M Series B led by a16z in October 2025 ($108M total), reporting 6× volume growth in five months and close to a billion pages processed monthly, with customers like Harvey, Rogo, and Scale AI.^[6]^[24] LandingAI raised a Series B in September 2025 ($57M total) behind Andrew Ng's Agentic Document Extraction.^[26] LlamaParse has now processed 1B+ documents for 300k+ users.^[32] When a16z, Benchmark, and Menlo all fund the same layer within 18 months, the layer is real.

The teams that bet on LLM-only extraction in 2024 are circling back. Document AI spend is rising as enterprises move from pilots to production RAG and agents — and discovering they need the deterministic infra layer underneath.^[4]

Product & Technology

Segment visually → preserve structure → decode deterministically.

The full pipeline is engineered around the principle that LLMs should not be in the extraction loop for regulated workflows.
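To make the determinism and confidence-scoring claims concrete, the sketch below shows one way a downstream team might consume schema-aligned, confidence-scored output. It is illustrative only: the `ExtractedField` type, the `route_fields` and `audit_digest` helpers, the 0.90 threshold, and the sample values are all assumptions made for this memo, not Unsiloed's API or data.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ExtractedField:
    """Hypothetical shape of one parsed field; not Unsiloed's actual schema."""
    name: str
    value: str
    confidence: float  # 0.0 to 1.0
    page: int          # citation back to the source page


def route_fields(fields, threshold=0.90):
    """Split fields into auto-accepted and needs-human-review buckets."""
    accepted = [f for f in fields if f.confidence >= threshold]
    review = [f for f in fields if f.confidence < threshold]
    return accepted, review


def audit_digest(fields):
    """Order-independent SHA-256 digest of the extracted fields.

    If the parser is deterministic, re-running it on the same document
    yields the same digest, which is something an auditor can check.
    """
    rows = sorted((asdict(f) for f in fields), key=lambda r: (r["page"], r["name"]))
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Made-up sample values, for illustration only.
fields = [
    ExtractedField("invoice_total", "1204.50", 0.99, page=1),
    ExtractedField("due_date", "2026-06-30", 0.72, page=2),
]
accepted, review = route_fields(fields)
print([f.name for f in accepted], [f.name for f in review])
print(audit_digest(fields) == audit_digest(list(reversed(fields))))
```

The design point is that a reproducible digest plus a per-field review queue is cheap to build on top of deterministic output, and hard to build on top of output that changes between runs.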

Multimodal strengths where generic OCR + LLM parsers fail.

Charts and infographics. Unsiloed reads tables, charts, and infographics directly — extracting values from axes, legends, and series. Generic OCR collapses to raw text; generic LLM parsers hallucinate the numbers. Unsiloed treats the chart as the structured object it actually is.

Long-tail layouts. A proprietary corpus of 1M+ real multimodal documents and domain ontologies for finance, legal, and healthcare enables higher fidelity on long-tail structures — nested tables, multi-page figures, format-encoded semantics — that generic models consistently miss.

Synthetic post-training. The team also post-trains on synthetically generated multimodal datasets that mimic rare layouts, edge cases, and domain-specific templates — expanding coverage where real-world labeled data is sparse.

Forward compatibility. The architecture is model-agnostic. It can incorporate emerging OCR-free vision-RAG (ColPali) and VLM components as they mature — without abandoning the deterministic decoding and confidence scaffolding that probabilistic LLM-only stacks fundamentally lack.^[14]

olmOCR-Bench — strict pass rate, May 2026

Unsiloed Parser v3.1 leads 19 systems at 88.0 across 1,403 PDFs and 8,413 unit tests (olmocr==0.4.27 scorer) — ahead of frontier VLMs (GPT-5.5: 84.6, Claude Opus 4.7: 81.9, Gemini 3 Pro: 77.7) and funded direct competitors (LlamaParse: 73.5, Reducto: 66.0, Extend: 64.0). Re-scoring failures with an LLM-as-judge lifts Unsiloed to 94.8. Caveat: this is a vendor-run evaluation, though the scorer is deterministic and reproducible.^[20]

Source · Unsiloed AI olmOCR-Bench publication, May 2026 [20] · Ai2 olmOCR-Bench [21]

A benchmark the field actually competes on.

olmOCR-Bench (from Ai2) has become the de facto public scoreboard for document parsing — Datalab, LlamaIndex, and the model labs all publish against it.^[21]^[22]^[23] Unsiloed's weakest sub-category is old scans (52.9); its strongest are exactly the ones regulated buyers care about: tables (93.2), multi-column layouts (87.9), and headers/footers (94.6).^[20]

The result that matters most for the thesis: the two best-funded direct competitors — Reducto ($108M raised) and Extend — scored 66.0 and 64.0 on the same run. Even discounting for vendor selection effects, a 20+ point gap on a deterministic scorer is not noise. It is the kind of gap that wins bake-offs.^[20]^[24]
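The gaps discussed above can be recomputed directly from the quoted scores. The snippet below is a small arithmetic check using only the strict pass rates cited in this memo; the variable names are ours, and the figures remain vendor-reported.

```python
# Strict pass rates as quoted in this memo (olmOCR-Bench, May 2026, vendor-run).
scores = {
    "Unsiloed Parser v3.1": 88.0,
    "GPT-5.5": 84.6,
    "Claude Opus 4.7": 81.9,
    "Gemini 3 Pro": 77.7,
    "LlamaParse": 73.5,
    "Reducto": 66.0,
    "Extend": 64.0,
    "Unstructured": 39.9,
}

leader = max(scores, key=scores.get)
# Gap to the leader in points, rounded to one decimal to hide float noise.
gaps = {
    name: round(scores[leader] - s, 1)
    for name, s in scores.items()
    if name != leader
}

for name, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{name:<16} {gap:>5.1f} points behind {leader}")
```

On these inputs the frontier VLMs sit within single digits of the leader, while the funded direct competitors trail by 14.5 to 24 points.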

Traction

Already shipping into the buyers most others can't access.

$24k MRR (~$300k ARR) · profitable at this scale

14 paying customers: Fortune 150 bank, NASDAQ-listed cos., 10+ YC startups

100% daily API use · every paying customer uses the API every day

Pipeline depth that unlocks 6- to 7-figure ACVs.

Volume. Millions of pages processed weekly. The API is in the hot path for production reconciliation, document review, and RAG ingestion at Fortune 150 banks and NASDAQ-listed enterprises.^[1]

Pipeline. 120+ companies in pipeline. 15 ongoing pilots, including Rippling and a large public tech company. Single bottoms-up logos in finance and legal land at tens to hundreds of thousands of dollars per account; Fortune 500 deployments scale into 7-figure ACVs across business units.

Signed enterprise LOI. $500k LOI with a global bank. This is the early enterprise signal that the deterministic, auditable, air-gapped product positioning lands with the buyers Reducto and LlamaParse are chasing.^[1]

Momentum since the memo was written. The company closed a $500K seed in September 2025^[29], shipped Parser v3.1 to the #1 strict pass rate on olmOCR-Bench in May 2026^[20], published an April 2026 head-to-head parser comparison that doubles as developer-facing GTM^[33], and added a native Claude integration for parsing, extraction, classification, and splitting inside Claude document workflows.^[29] All of this on four people and half a million dollars.

Market

Unstructured-to-AI is the next core infrastructure layer.

80–90% of enterprise data is unstructured, growing ~3× faster than structured data — and only a fraction is analyzed. Turning documents into AI-ready data is becoming as fundamental as databases were in the last era.^[2]^[31] Every production RAG system, every vertical AI agent, every regulated automation pipeline needs the ingestion layer underneath.

IDP: $10.6B (2025) → $91B (2034) at a 26.2% CAGR, per Fortune Business Insights — with North America holding ~48% share and banking and financial services leading adoption, followed by healthcare and legal.^[3] Document AI alone is roughly $12–13B in 2024 → ~$27B by 2030 as enterprises move from pilots to production.^[4]
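As a sanity check on the growth figures, the implied compound annual growth rate can be recomputed from the quoted endpoints. The `cagr` helper below is our own; the inputs are the dollar figures cited above. The endpoint-implied IDP rate comes out at roughly 27%, slightly above the cited 26.2%, which may simply reflect a different base year in the source's forecast window (an assumption on our part).

```python
def cagr(start, end, years):
    """Compound annual growth rate between two endpoint values."""
    return (end / start) ** (1 / years) - 1


# IDP endpoints quoted above, in $B.
idp = cagr(10.6, 91.0, 2034 - 2025)

# Document AI range quoted above: $12-13B (2024) to ~$27B (2030).
doc_ai_low = cagr(13.0, 27.0, 2030 - 2024)
doc_ai_high = cagr(12.0, 27.0, 2030 - 2024)

print(f"IDP implied CAGR: {idp:.1%}")
print(f"Document AI implied CAGR: {doc_ai_low:.1%} to {doc_ai_high:.1%}")
```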

Competitive landscape

Four categories of competition. Unsiloed wins on determinism, throughput, and air-gapped deployment.

The market splits into LLM-centric specialists, OSS / DIY toolkits, hyperscaler APIs, and — increasingly — frontier VLMs used directly. As of the May 2026 olmOCR-Bench run, Unsiloed outscores all four.

Document-AI parsing — total capital raised

Reducto $108M^[24] · Unstructured $65M^[25] · LandingAI $57M^[26] · LlamaIndex $27.5M^[27] · Extend $17M^[9] · Tensorlake $8M^[28] · Unsiloed $0.5M^[29]. The asymmetry cuts both ways: rivals can outspend Unsiloed on GTM, but Unsiloed topping the category benchmark on roughly 1/200th of Reducto's capital is the capital-efficiency signal we underwrite at pre-seed.

Source · Company announcements, Tracxn, PitchBook — see refs [9] [24] [25] [26] [27] [28] [29]
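The capital asymmetry is easy to quantify from the totals listed above. The snippet below is a back-of-the-envelope calculation on those figures only; the dictionary and variable names are ours. Reducto's total works out to 216× Unsiloed's, consistent with the 1/200th framing.

```python
# Total capital raised in $M, as listed in this memo.
raised_musd = {
    "Reducto": 108.0,
    "Unstructured": 65.0,
    "LandingAI": 57.0,
    "LlamaIndex": 27.5,
    "Extend": 17.0,
    "Tensorlake": 8.0,
    "Unsiloed": 0.5,
}

base = raised_musd["Unsiloed"]
# How many times Unsiloed's capital each rival has raised.
multiples = {
    name: amount / base
    for name, amount in raised_musd.items()
    if name != "Unsiloed"
}
total = sum(raised_musd.values())

for name, m in multiples.items():
    print(f"{name:<13} has raised {m:,.0f}x Unsiloed's capital")
print(f"Unsiloed share of listed capital: {base / total:.2%}")
```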

Our APIs are already parsing hundreds of thousands of documents for startups and NASDAQ-listed enterprises, powering vertical AI solutions across industries.

— Unsiloed AI launch post^[1]

Strategic advantages

Moat

Deterministic + confidence-scored. Matches the audit and governance posture regulated buyers actually require.
Benchmark leadership. #1 strict pass rate on olmOCR-Bench (88.0, May 2026) — 22 points clear of Reducto on the same deterministic scorer.^[20]
Vision-first cost / throughput. ~$0.01/page vs. LLM-centric incumbents — unit economics compound with volume.^[15]
Proprietary corpus + domain decoders. 1M+ real multimodal documents and finance / legal / healthcare ontologies create a data moat that compounds with every customer.
Air-gapped on-prem. The deployment posture that unlocks BFSI procurement — one that Reducto and LlamaParse don't lead with.
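To illustrate how the per-page economics in the list above scale, the sketch below compares monthly spend at a few page volumes. The ~$0.01/page figure and the ~2× multiple come from this memo; the volumes are hypothetical, and the multiple is the memo's approximation rather than a quoted list price.

```python
UNSILOED_CENTS_PER_PAGE = 1.0  # ~$0.01/page, as cited in this memo
REDUCTO_MULTIPLE = 2.0         # "~2x" per the memo; an approximation, not a list price


def monthly_cost_usd(pages, cents_per_page):
    """Monthly spend in dollars for a given page volume and per-page price."""
    return pages * cents_per_page / 100


for pages in (1_000_000, 10_000_000, 100_000_000):  # hypothetical volumes
    ours = monthly_cost_usd(pages, UNSILOED_CENTS_PER_PAGE)
    theirs = monthly_cost_usd(pages, UNSILOED_CENTS_PER_PAGE * REDUCTO_MULTIPLE)
    print(
        f"{pages:>11,} pages/mo: ${ours:>12,.0f} vs ${theirs:>12,.0f} "
        f"(difference ${theirs - ours:,.0f})"
    )
```

Under these assumptions the absolute difference grows linearly with volume, which is why the pricing edge matters most in high-volume deals.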

Founder deep dive

The canonical technical pair for vision-first document AI.

The shared foundation. Both Aman and Adnan are IIT Kharagpur alumni — one of the densest concentrations of systems and ML engineering talent in the world. They bring complementary skill sets to a problem that requires both extreme-performance systems engineering and applied multimodal ML research.

Aman: low-latency systems + AI copilots in regulated finance. After IIT Kharagpur, he started out building software systems for high-frequency trading at Teesta Investment: multi-threaded C++ and Rust optimized for ultra-low-latency execution, moving billions on crypto exchanges. He then became founding engineer (#1) at a stealth SF startup building AI copilots for institutions like Goldman Sachs and Charles Schwab, exactly the regulated-finance design constraints Unsiloed now serves. He has lived inside the requirements: deterministic, auditable, integrated with legacy compliance flows.

Adnan: multimodal ML at Fortune 10 scale + autonomous perception. IIT Kharagpur → MIT Masters. He built multimodal models deployed at a Fortune 10 company, then worked on autonomous navigation systems at Mercedes-Benz R&D: perception pipelines that must hold up under real-time, safety-critical constraints. This is the exact skill stack Unsiloed needs: vision-first models that are layout-aware, domain-tuned, and deployable in regulated environments.

The thesis they bring. Unsiloed's published positioning emphasizes that LLMs cannot deterministically parse multimodal documents — and that the right answer is specialized vision models combined with OCR-based models, dual-stream representation (data + layout), and domain-specific decoders trained with RL. This is not a wrapper company. It is a vision-model and infrastructure company, with founders whose careers were already pointed at this problem.^[1]

Why now — and why them. Aman is publicly writing about vision models for enterprise documents (Forbes Business Council). Adnan is building the technical strategy for accuracy-sensitive deployments in finance, legal, and healthcare — including on-prem and air-gapped options that match the realities of BFSI procurement. Together they are the canonical pair for this exact category at this exact moment.

Founders

Aman Mishra

Co-founder & CEO

IIT Kharagpur (B.Tech, Industrial & Systems Engineering, CS minor). Previously built ultra low-latency C++/Rust trading systems moving billions at a hedge fund. Founding Engineer (#1) at an SF-based stealth AI copilot startup serving Goldman Sachs and Charles Schwab. Launched a P2P rental platform from his dorm room, scaling it to thousands of orders within 2 months. Forbes Business Council contributor on vision models for enterprise documents.

Adnan Abbas

Co-founder & CTO

IIT Kharagpur (B.Tech) → MIT (Masters). Built multi-modal models deployed at a Fortune 10 company. Was building autonomous navigation systems at Mercedes-Benz R&D. Launched India's first Web 3.0 audio app while in college, scaling it to thousands of users within a month. Leads technical strategy for vision-first, layout-aware multimodal models — including on-prem / air-gapped enterprise options.

Risks & mitigations

Risk

Reducto's scale and GTM outspend in enterprise — $108M raised through its October 2025 Series B (a16z), ~1B pages/month, and marquee customers like Harvey, Rogo, and Scale AI.

Mitigation

Win bake-offs in finance and legal via deterministic accuracy, chart and table fidelity, and on-prem deployment; the May 2026 olmOCR-Bench gap (88.0 vs. 66.0) is the wedge. Leverage the cost and throughput edge for high-volume deals: Unsiloed's ~$0.01/page pricing undercuts Reducto, which charges roughly 2× as much per page.

Risk

Open-source commoditization from Unstructured.io and IBM Docling — DIY teams may settle for 'good enough' free tools.

Mitigation

Offer SLA'd, air-gapped enterprise deployments and maintain an accuracy lead on long-tail documents with proprietary data and domain ontologies (Unstructured scored 39.9 on the same olmOCR-Bench run). Chunkr's pivot away from OSS parsing suggests free tooling alone isn't holding the regulated segment. Reduce integration effort vs. DIY: Unsiloed gets customers to production in hours, not months.

Risk

Frontier VLMs are closing the gap — GPT-5.5 scored 84.6 and Claude Opus 4.7 scored 81.9 on olmOCR-Bench, within ~3–6 points of Unsiloed's 88.0. Datalab claims the benchmark is approaching saturation.

Mitigation

Keep the architecture model-agnostic and integrate emerging OCR-free vision-RAG and VLM components as they mature. The durable moat isn't the raw score — it's deterministic decoding, schema enforcement, per-field confidence, per-page cost at volume, and air-gapped deployment, none of which a frontier-model API call provides for regulated buyers.

What we're watching

Conversion of the $500k LOI with a global bank into a production contract — and the speed at which it expands across business units.
Whether the 15 ongoing pilots (including Rippling and a large public tech co.) convert at 6- to 7-figure ACVs.
Vertical expansion beyond finance — early signal of healthcare or legal logos at parity accuracy.
Reducto's response: does it cut pricing, deepen on-prem, or push deeper into a specific vertical?
Third-party replication of the May 2026 olmOCR-Bench result — an independent run (Ai2, Datalab, or a customer bake-off) would convert a vendor benchmark into a category fact.
Whether frontier VLM gains (GPT-5.5 at 84.6 and climbing) compress the specialized-parser premium faster than Unsiloed converts determinism, cost, and on-prem into contracted revenue.

References