Case Studies

We OCR'd 40 Receipts With AI. Here's What It Actually Cost.

· 6 min read

Document processing is one of the first things businesses try to automate with AI. Receipts, invoices, delivery dockets — the kind of paperwork that piles up and eats hours. But when you sit down to build it, the questions start immediately: which model? How much will it cost at scale? Do I need the expensive one?

We ran a real test to find out. Forty receipts — fuel dockets, hardware store purchases, auto repair invoices, restaurant bills, and petrol station slips — processed through two different Claude models to measure exactly what you get for your money.

The Setup

Our test set was deliberately messy. Not clean scans from a flatbed scanner, but the kind of images businesses actually deal with:

We sent each image to Claude's vision API with a structured extraction prompt asking for vendor name, date, total, line item count, and a quality assessment. Every image was processed twice — once with Haiku (the fast, cheap model) and once with Sonnet (the more capable, more expensive one).

What It Cost

The total bill for processing 40 receipts through both models: $0.36.

Broken down:

Sonnet costs exactly 3.1x more than Haiku. At scale, that gap matters: processing 10,000 receipts costs $22 on Haiku versus $67 on Sonnet. But the question is whether Sonnet earns that premium.

Where Sonnet Justified the Price

In our workshops, we often tell clients that the expensive model isn't always the right model. But this test revealed genuine differences that matter for production systems.

Vendor identification was measurably better. On a crumpled receipt with faded thermal ink, Haiku returned "Unknown" for the vendor. Sonnet correctly identified the business name and location. On an angled photo of a hardware store receipt, Haiku misread the store name entirely — returning a garbled version. Sonnet got it right.

Across all 40 receipts, Sonnet consistently returned more complete vendor names. Where Haiku would return just "7-Eleven", Sonnet returned "7-Eleven (Carrara South, Store 4237)". Where Haiku said "Bunnings", Sonnet said "Bunnings Warehouse - Burleigh Waters". For businesses that need to categorise expenses by location or match receipts to supplier records, this detail saves manual cleanup.

Sonnet also used more output tokens — 4,702 versus Haiku's 4,074 across the batch. That 15% increase reflects the richer detail in its responses, not wasted verbosity.

Where Haiku Was Good Enough

For the core extraction task — date, total, item count — both models performed almost identically. Haiku correctly pulled the total amount and date from 39 out of 40 receipts. The one failure was a phone screenshot so compressed that neither model could read it.

If your workflow is "extract the total and date, match it to a bank transaction," Haiku at $0.002 per receipt is the clear winner. You're paying 3x less for essentially the same result on the fields that matter most for reconciliation.

The Compression Surprise

We also tested what happens when you send raw, uncompressed photos versus resized images. The iPhone photos in our set were up to 4284x5712 pixels — about 3-5MB each as JPEGs.

The results were counterintuitive:

Always resize images before sending them to a vision API. You save bandwidth, avoid upload failures, and pay exactly the same token cost. There is no benefit to sending larger files.

A Practical Pipeline

Based on these results, we'd recommend a two-tier approach for any business processing documents at volume:

  1. Resize all images to 1500px maximum on the longest side before sending to the API. This is a one-line operation in any image library and eliminates upload failures entirely.
  2. Run Haiku first on everything. At $0.002 per document, you can process thousands for the cost of a coffee.
  3. Flag low-confidence results — where the vendor is "Unknown" or the model reports difficulty reading the image.
  4. Re-process flagged items with Sonnet only. In our test, roughly 10% of receipts would benefit from the upgrade.

This hybrid approach gives you Sonnet-level accuracy at near-Haiku cost. For 10,000 receipts, you'd spend approximately $28 instead of $67 — a 58% saving with the same quality outcome. We've seen similar patterns in fraud detection workflows where a cheap first pass filters the workload for a more capable model.

What This Means for Your Business

The cost of AI document processing has dropped to the point where it's cheaper than the human time spent opening an envelope. At $0.002 per receipt, a business processing 500 receipts a month spends $1 on extraction. Even with Sonnet on every image, it's $3.50.

The real cost isn't the API — it's the engineering time to build a reliable pipeline. Resize images, handle errors, validate outputs, match to accounting records. The model choice is the easy part. The lesson from this test is that you don't need to agonise over it: start with the cheap model, measure the gaps, and upgrade selectively where the data tells you to.

Continue Reading

Related articles worth reading next

These are the closest practical follow-ons if you want to go deeper on this topic.

Need help deciding what to build or teach first?

We help teams choose the right next step, whether that is training, workflow design, or a system built for a specific business problem.

Book a call See services

This article was reviewed, edited, and approved by Tahae Mahaki. AI tools supported research and drafting, but the final recommendations, examples, and wording were refined through human review.