Early in ShipDoc development we held a naive belief: if OCR were accurate enough, most of the customer's problem would be solved. That turned out to be only half right.
Single-document OCR, with LLMs plus structured output, reaches 95%+ accuracy without much difficulty. What actually eats ops time lies elsewhere: the 5-10 documents that describe the same shipment must agree on every critical field.
A very typical case: the B/L says "Leather Shoes", the invoice says "Leather Footwear", the packing list says "Shoes". Each on its own is fine; together they trigger a reviewer sanity check — is this really the same shipment?
Our approach is to group all documents for one shipment into an event, then run field normalization and cross-document consistency checks inside that event. Commodity names go through semantic similarity; HS codes go through a rules table; quantities and weights go through tolerance intervals; dates go through timeline sanity checks. Any mismatch is highlighted in the UI, together with the most likely source of the error.
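To make the event-level check concrete, here is a minimal Python sketch. It is not ShipDoc's actual implementation: the `Doc` fields, the `SYNONYMS` table (a crude stand-in for a real semantic-similarity model), the six-digit HS comparison, the 1% weight tolerance, the 30-day date window, and the sample values are all assumptions made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from itertools import combinations
from statistics import median


@dataclass
class Doc:
    doc_type: str            # e.g. "bill_of_lading", "invoice", "packing_list"
    commodity: str
    hs_code: str
    gross_weight_kg: float
    issued: date


# Hypothetical synonym table; a real system would use an embedding model.
SYNONYMS = {"footwear": "shoes"}


def commodity_tokens(name: str) -> set:
    """Lowercase, split on whitespace, and map synonyms to one form."""
    return {SYNONYMS.get(tok, tok) for tok in name.lower().split()}


def commodity_similarity(a: str, b: str) -> float:
    """Overlap coefficient on normalized tokens, in [0, 1]."""
    ta, tb = commodity_tokens(a), commodity_tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / min(len(ta), len(tb))


def hs6(code: str) -> str:
    """Assumed rule: keep digits only and compare the first six."""
    return "".join(ch for ch in code if ch.isdigit())[:6]


def check_event(docs, min_similarity=0.5, weight_tolerance=0.01,
                max_date_span_days=30):
    """Return (field, suspect) pairs for one shipment's documents."""
    if len(docs) < 2:
        return []
    findings = []

    # Commodity names: every pair must be similar enough.
    for a, b in combinations(docs, 2):
        if commodity_similarity(a.commodity, b.commodity) < min_similarity:
            findings.append(("commodity", f"{a.doc_type} vs {b.doc_type}"))

    # HS codes: the document that disagrees with the majority is the suspect.
    majority, _ = Counter(hs6(d.hs_code) for d in docs).most_common(1)[0]
    for d in docs:
        if hs6(d.hs_code) != majority:
            findings.append(("hs_code", d.doc_type))

    # Weights: flag documents outside a relative tolerance of the median.
    mid = median(d.gross_weight_kg for d in docs)
    for d in docs:
        if abs(d.gross_weight_kg - mid) > weight_tolerance * mid:
            findings.append(("gross_weight_kg", d.doc_type))

    # Dates: all documents should fall inside one plausible window.
    dates = [d.issued for d in docs]
    if (max(dates) - min(dates)).days > max_date_span_days:
        findings.append(("issued", "event-wide"))

    return findings


# Made-up sample event: three documents for one shipment.
sample = [
    Doc("bill_of_lading", "Leather Shoes", "6403.99", 1200.0, date(2024, 3, 1)),
    Doc("invoice", "Leather Footwear", "64039900", 1203.0, date(2024, 3, 3)),
    Doc("packing_list", "Shoes", "6403.99", 1350.0, date(2024, 3, 2)),
]
print(check_event(sample))  # [('gross_weight_kg', 'packing_list')]
```

In this sample the three commodity names normalize to overlapping token sets and the HS codes agree on their first six digits, so the only item that reaches the reviewer is the packing list's weight. Using the majority value or the median to name a suspect is one simple way to approximate "most likely source of error"; a production system would likely weigh document types differently.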
The impact is much bigger than squeezing another percentage point out of OCR: it converts the reviewer's time-consuming cross-check step into an exceptions-only flow.