Cipher AI · B2B SaaS / Logistics · September 2025

Order Intelligence: LLM Classification for 40k+ B2B Orders/Day

Automated categorization of incoming B2B orders with 94% accuracy, cutting 11 hours of daily manual work to 8 minutes.

Accuracy: 94%
Manual Work: –98%
Throughput: 40k/day
Time-to-Production: 7 weeks

Next.js · LangChain · FastAPI · PostgreSQL · Claude Sonnet · pgvector

Starting Point

Cipher AI processed 40,000+ B2B orders daily, arriving via email, EDI, and unstructured PDFs. Three full-time operations staff spent a combined 11 hours per day sorting each order into one of 47 internal classification buckets. The manual error rate was 4–6%, with downstream costs in the five-figure range per month.

Our Approach

Instead of a classical ML classifier with an expensive labeling pipeline, we built Retrieval-Augmented Classification: embeddings of all historical orders are stored in pgvector, and each new order is classified by an LLM call that receives the five most similar historical orders as few-shot context. No model training cycles are required; the system learns through new entries in the vector DB.
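The retrieval step can be illustrated with a minimal sketch. It is not the production code: the names `LabeledOrder`, `top_k_similar`, and `build_few_shot_prompt`, the toy three-dimensional embeddings, the sample orders, and the prompt wording are all assumptions made for illustration. The ranking is done in plain Python here; in PostgreSQL the same ordering would typically come from pgvector's cosine distance operator (`<=>`) with a `LIMIT 5`.

```python
import math
from dataclasses import dataclass


@dataclass
class LabeledOrder:
    """A historical order with its known category (hypothetical structure)."""
    text: str
    category: str
    embedding: list[float]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query: list[float], history: list[LabeledOrder], k: int = 5) -> list[LabeledOrder]:
    """Return the k historical orders most similar to the query embedding."""
    ranked = sorted(
        history,
        key=lambda order: cosine_similarity(query, order.embedding),
        reverse=True,
    )
    return ranked[:k]


def build_few_shot_prompt(order_text: str, examples: list[LabeledOrder], categories: list[str]) -> str:
    """Assemble a few-shot classification prompt from the retrieved examples."""
    lines = [
        "Classify the order into exactly one of these categories:",
        ", ".join(categories),
        "",
        "Examples:",
    ]
    for example in examples:
        lines.append(f"Order: {example.text}")
        lines.append(f"Category: {example.category}")
        lines.append("")
    lines.append(f"Order: {order_text}")
    lines.append("Category:")
    return "\n".join(lines)


# Toy data: real embeddings would come from an embedding model.
history = [
    LabeledOrder("500 pallets shrink wrap, deliver to dock 4", "packaging", [0.9, 0.1, 0.0]),
    LabeledOrder("Replacement conveyor belt motor", "spare_parts", [0.1, 0.9, 0.1]),
    LabeledOrder("200 rolls stretch film", "packaging", [0.8, 0.2, 0.1]),
]
neighbors = top_k_similar([0.85, 0.15, 0.05], history, k=2)
prompt = build_few_shot_prompt("300 rolls shrink wrap", neighbors, ["packaging", "spare_parts"])
print(prompt)
```

Adding a newly confirmed order to `history` changes what future queries retrieve, which is the sense in which such a system learns without a training cycle.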

Architecture

  • FastAPI service with async queue for batch processing
  • pgvector as retrieval layer directly in PostgreSQL (no separate vector DB)
  • Claude Sonnet as classifier with structured JSON output
  • Confidence routing: classifications below 85% confidence go to manual review, plus automatic re-training loops
  • Next.js dashboard for operations team with live metrics + override function
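The structured-output and confidence-routing bullets can be sketched as follows. This is an illustration under assumptions, not the deployed service: the JSON field names `category` and `confidence`, the helper names `parse_classification` and `route`, and the queue labels `auto_accept` and `manual_review` are hypothetical. Only the 85% threshold comes from the list above.

```python
import json
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.85  # below this, the order goes to manual review


@dataclass
class Classification:
    category: str
    confidence: float


def parse_classification(raw_json: str, allowed_categories: set[str]) -> Optional[Classification]:
    """Validate the model's JSON reply; return None if it is malformed or out of range."""
    try:
        data = json.loads(raw_json)
        category = data["category"]
        confidence = float(data["confidence"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return None
    if not isinstance(category, str) or category not in allowed_categories:
        return None
    if not 0.0 <= confidence <= 1.0:
        return None
    return Classification(category, confidence)


def route(result: Optional[Classification]) -> str:
    """Send unparseable or low-confidence results to manual review."""
    if result is None or result.confidence < CONFIDENCE_THRESHOLD:
        return "manual_review"
    return "auto_accept"


allowed = {"packaging", "spare_parts"}
print(route(parse_classification('{"category": "packaging", "confidence": 0.93}', allowed)))
print(route(parse_classification('{"category": "packaging", "confidence": 0.61}', allowed)))
```

Treating an unparseable or out-of-vocabulary reply the same way as a low-confidence one keeps the failure mode conservative: anything the service cannot verify ends up in front of a person.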

Result

After 7 weeks in production: 94% accuracy (vs. 94–96% for manual sorting), a 98% reduction in manual work, and a stable 40k orders/day. ROI achieved after 4 months.
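The headline figures can be checked with a few lines of arithmetic. The inputs are the numbers stated in this case study; the assumption that orders arrive evenly over 24 hours is ours and only serves to give a rough average rate.

```python
# Manual effort per day, before and after (figures from this case study).
manual_minutes_before = 11 * 60  # 11 hours
manual_minutes_after = 8

reduction = 1 - manual_minutes_after / manual_minutes_before
print(f"Manual work reduction: {reduction:.1%}")  # 98.8%

# Average order rate, assuming an even spread over 24 hours (our assumption).
orders_per_day = 40_000
orders_per_second = orders_per_day / (24 * 60 * 60)
print(f"Average rate: {orders_per_second:.2f} orders/second")  # 0.46
```

The reduction works out slightly above the 98% quoted, so the headline number is conservative.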

What We Learned

Retrieval-first architectures beat fine-tuning for domain-specific classification tasks almost every time: less infrastructure, faster iteration, and every new category works without retraining.
