Document processing is one of the first things businesses try to automate with AI. Receipts, invoices, delivery dockets — the kind of paperwork that piles up and eats hours. But when you sit down to build it, the questions start immediately: which model? How much will it cost at scale? Do I need the expensive one?
We ran a real test to find out. Forty receipts — fuel dockets, hardware store purchases, auto repair invoices, restaurant bills, and petrol station slips — processed through two different Claude models to measure exactly what you get for your money.
The Setup
Our test set was deliberately messy. Not clean scans from a flatbed scanner, but the kind of images businesses actually deal with:
- iPhone photos taken at angles, on dashboards, on kitchen benches
- Email attachments forwarded from suppliers
- Crumpled thermal paper with faded ink
- Phone screenshots of digital receipts
- One receipt with a hole punched through it
We sent each image to Claude's vision API with a structured extraction prompt asking for vendor name, date, total, line item count, and a quality assessment. Every image was processed twice — once with Haiku (the fast, cheap model) and once with Sonnet (the more capable, more expensive one).
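For readers who want to reproduce the setup, the per-image request looks roughly like this. It is a sketch against the Anthropic Python SDK's message format; the prompt wording and the `build_message` helper are ours, not the exact code from the test:

```python
import base64

# Fields we asked the model to extract from each receipt.
EXTRACTION_PROMPT = (
    "Extract the following from this receipt and reply as JSON: "
    "vendor name, date, total, line item count, and a short "
    "quality assessment of the image."
)

def build_message(image_path: str) -> list[dict]:
    """Build the content blocks for a single vision request."""
    with open(image_path, "rb") as f:
        image_b64 = base64.standard_b64encode(f.read()).decode("utf-8")
    return [
        {
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/jpeg",
                "data": image_b64,
            },
        },
        {"type": "text", "text": EXTRACTION_PROMPT},
    ]
```

The returned list goes into `client.messages.create(...)` as the user message's `content`, once per model per image.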
What It Cost
The total bill for processing 40 receipts through both models: $0.36.
Broken down:
- Haiku: $0.09 for 40 images — roughly $0.002 per receipt
- Sonnet: $0.27 for 40 images — roughly $0.007 per receipt
Sonnet costs three times as much as Haiku. At scale, that gap matters: processing 10,000 receipts costs $22 on Haiku versus $67 on Sonnet. But the question is whether Sonnet earns that premium.
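Projecting these measured rates to other volumes is simple arithmetic; a quick sketch (the constants are the averages from our 40-image batch, not published pricing):

```python
# Measured average cost per receipt from the 40-image batch.
HAIKU_PER_RECEIPT = 0.09 / 40    # about $0.00225
SONNET_PER_RECEIPT = 0.27 / 40   # about $0.00675

def batch_cost(n_receipts: int, per_receipt: float) -> float:
    """Project the cost of a batch at a measured per-receipt rate."""
    return n_receipts * per_receipt

print(round(batch_cost(10_000, HAIKU_PER_RECEIPT), 2))   # 22.5
print(round(batch_cost(10_000, SONNET_PER_RECEIPT), 2))  # 67.5
```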
Where Sonnet Justified the Price
In our workshops, we often tell clients that the expensive model isn't always the right model. But this test revealed genuine differences that matter for production systems.
Vendor identification was measurably better. On a crumpled receipt with faded thermal ink, Haiku returned "Unknown" for the vendor. Sonnet correctly identified the business name and location. On an angled photo of a hardware store receipt, Haiku misread the store name entirely — returning a garbled version. Sonnet got it right.
Across all 40 receipts, Sonnet consistently returned more complete vendor names. Where Haiku would return just "7-Eleven", Sonnet returned "7-Eleven (Carrara South, Store 4237)". Where Haiku said "Bunnings", Sonnet said "Bunnings Warehouse - Burleigh Waters". For businesses that need to categorise expenses by location or match receipts to supplier records, this detail saves manual cleanup.
Sonnet also used more output tokens — 4,702 versus Haiku's 4,074 across the batch. That 15% increase reflects the richer detail in its responses, not wasted verbosity.
Where Haiku Was Good Enough
For the core extraction task — date, total, item count — both models performed almost identically. Haiku correctly pulled the total amount and date from 39 out of 40 receipts. The one failure was a phone screenshot so compressed that neither model could read it.
If your workflow is "extract the total and date, match it to a bank transaction," Haiku at $0.002 per receipt is the clear winner. You're paying a third of the price for essentially the same result on the fields that matter most for reconciliation.
The Compression Surprise
We also tested what happens when you send raw, uncompressed photos versus resized images. The iPhone photos in our set were up to 4284x5712 pixels — about 3-5MB each as JPEGs.
The results were counterintuitive:
- Token usage was identical. Claude's API resizes images internally to approximately 1568 pixels before tokenising. A 5MB photo and a 280KB compressed version cost the same number of input tokens.
- One image failed entirely. A 5MB JPEG exceeded the API's upload limit and returned a hard error. That receipt was simply lost — no extraction, no fallback.
- File size dropped 72% after resizing to 1500px max — from 33MB to 9.4MB across 40 images — with zero impact on extraction quality.
Always resize images before sending them to a vision API. You save bandwidth, avoid upload failures, and pay exactly the same token cost. There is no benefit to sending larger files.
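A minimal version of that resize step, assuming Pillow is available; `thumbnail()` preserves aspect ratio and never upscales, so it is safe to run on every image:

```python
from PIL import Image

MAX_EDGE = 1500  # longest side after resizing, in pixels

def resize_for_api(src: str, dst: str) -> None:
    """Downscale an image so its longest edge is at most MAX_EDGE.

    thumbnail() keeps the aspect ratio and leaves images that are
    already small enough untouched.
    """
    with Image.open(src) as im:
        im.thumbnail((MAX_EDGE, MAX_EDGE))
        im.save(dst, "JPEG", quality=85)
```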
A Practical Pipeline
Based on these results, we'd recommend a two-tier approach for any business processing documents at volume:
- Resize all images to 1500px maximum on the longest side before sending to the API. This is a one-line operation in any image library and eliminates upload failures entirely.
- Run Haiku first on everything. At $0.002 per document, you can process thousands for the cost of a coffee.
- Flag low-confidence results — where the vendor is "Unknown" or the model reports difficulty reading the image.
- Re-process flagged items with Sonnet only. In our test, roughly 10% of receipts would benefit from the upgrade.
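The routing logic in the last three steps can be sketched as a small loop. Everything here is an assumption on our part: `extract_cheap` and `extract_strong` stand in for Haiku and Sonnet calls, and the "Unknown" vendor check mirrors the flagging rule above:

```python
def needs_upgrade(result: dict) -> bool:
    """Flag extractions worth re-running on the stronger model."""
    return (
        result.get("vendor") in (None, "", "Unknown")
        or result.get("quality") == "poor"
    )

def process(images, extract_cheap, extract_strong):
    """Run the cheap model on everything; escalate only flagged results."""
    results = []
    for img in images:
        result = extract_cheap(img)
        if needs_upgrade(result):
            result = extract_strong(img)
        results.append(result)
    return results
```

With roughly 10% of receipts escalating, the strong model's cost applies to only a tenth of the volume.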
This hybrid approach gives you Sonnet-level accuracy at near-Haiku cost. For 10,000 receipts, you'd spend approximately $29 instead of $67, a 57% saving with the same quality outcome. We've seen similar patterns in fraud detection workflows where a cheap first pass filters the workload for a more capable model.
What This Means for Your Business
The cost of AI document processing has dropped to the point where it's cheaper than the human time spent opening an envelope. At $0.002 per receipt, a business processing 500 receipts a month spends $1 on extraction. Even with Sonnet on every image, it's $3.50.
The real cost isn't the API — it's the engineering time to build a reliable pipeline. Resize images, handle errors, validate outputs, match to accounting records. The model choice is the easy part. The lesson from this test is that you don't need to agonise over it: start with the cheap model, measure the gaps, and upgrade selectively where the data tells you to.
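That matching step can start very simply: pair each extracted total and date against a bank feed, with a small date window for settlement lag. All names and field shapes here are illustrative assumptions, not code from our test:

```python
from datetime import date

def match_receipt(receipt: dict, transactions: list[dict],
                  days: int = 3, tolerance: float = 0.01):
    """Find a bank transaction matching the receipt's total and date.

    Allows a small date window because card transactions often settle
    a day or two after the purchase.
    """
    for tx in transactions:
        same_amount = abs(tx["amount"] - receipt["total"]) <= tolerance
        close_date = abs((tx["date"] - receipt["date"]).days) <= days
        if same_amount and close_date:
            return tx
    return None
```

Unmatched receipts then join the same review queue as low-confidence extractions, which keeps the manual workload in one place.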