Order Intelligence: LLM Classification for 40k+ B2B Orders/Day
Automated categorization of incoming B2B orders with 94% accuracy, reducing 11h of manual work to 8 minutes.
Starting Point
Cipher AI processed 40,000+ B2B orders daily, arriving via email, EDI, and unstructured PDFs. Three full-time operations staff spent a combined 11 hours per day sorting every order into one of 47 internal classification buckets. Error rate: 4–6%; downstream costs: five figures per month.
Our Approach
Instead of a classical ML classifier with an expensive labeling pipeline, we built Retrieval-Augmented Classification: embeddings of all historical orders live in pgvector, and each incoming order triggers a single LLM call with the top-5 most similar historical orders as few-shot context. No model training cycles are required; the system learns through new entries in the vector DB.
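A minimal sketch of the few-shot assembly step, in Python. The `Neighbor` type, helper names, and prompt wording are illustrative, not the production code; in production the neighbors come from a pgvector similarity query.

```python
from dataclasses import dataclass

@dataclass
class Neighbor:
    text: str          # historical order text
    label: str         # one of the 47 classification buckets
    similarity: float  # similarity score from the retrieval query

def build_fewshot_prompt(order_text: str, neighbors: list[Neighbor], k: int = 5) -> str:
    """Assemble the classification prompt from the k most similar
    historical orders. In production the neighbors come from a pgvector
    nearest-neighbor query; here they are simply passed in."""
    top = sorted(neighbors, key=lambda n: n.similarity, reverse=True)[:k]
    lines = ["Classify the order into exactly one bucket.", ""]
    for i, n in enumerate(top, 1):
        lines.append(f"Example {i} (bucket: {n.label}):\n{n.text}\n")
    lines.append(f"Order to classify:\n{order_text}\n")
    lines.append('Answer as JSON: {"bucket": "<name>", "confidence": <0.0-1.0>}')
    return "\n".join(lines)
```

Pinning the response to a small JSON schema keeps the downstream confidence-routing step trivial to parse.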
Architecture
- FastAPI service with async queue for batch processing
- pgvector as retrieval layer directly in PostgreSQL (no separate vector DB)
- Claude Sonnet as classifier with structured JSON output
- Confidence routing: below 85% confidence → manual review; corrected labels flow back into the vector DB as new retrieval examples
- Next.js dashboard for operations team with live metrics + override function
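The confidence-routing rule above can be sketched as a single function; the dict shape and return values are assumptions for illustration:

```python
def route(result: dict, threshold: float = 0.85) -> str:
    """Auto-accept classifications at or above the confidence threshold;
    everything else goes to the operations team's review queue.
    A missing confidence is treated as zero, i.e. always reviewed."""
    if result.get("confidence", 0.0) >= threshold:
        return "auto_accept"
    return "manual_review"
```

Reviewed orders then become new entries in the vector DB, which is how the system improves without any retraining step.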
Result
After 7 weeks in production: 94% accuracy (vs. 94–96% manual), a 98% reduction in manual work, and stable throughput at 40k orders/day. ROI reached after 4 months.
What We Learned
Retrieval-first architectures beat fine-tuning for domain-specific classification tasks almost every time: less infrastructure, faster iteration, and every new category works without retraining.
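To make the "new category without retraining" point concrete: in a retrieval index, adding one labeled example makes a brand-new bucket immediately classifiable. A toy in-memory stand-in for the pgvector index (brute-force cosine search; the embeddings and bucket names are made up):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

index = []  # (embedding, label) rows; pgvector plays this role in production

def add_example(embedding: list[float], label: str) -> None:
    index.append((embedding, label))

def top_k(query: list[float], k: int = 5) -> list[tuple[list[float], str]]:
    return sorted(index, key=lambda row: cosine(query, row[0]), reverse=True)[:k]

add_example([1.0, 0.0], "bucket_returns")
add_example([0.0, 1.0], "bucket_new_freight_type")  # brand-new category
print(top_k([0.1, 0.9], k=1)[0][1])  # → bucket_new_freight_type
```

The new bucket participates in retrieval the moment its first example is inserted; a fine-tuned classifier would need a labeling round and a training run to learn the same thing.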