{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreigya5dttqxxg6elgzz3qv5wvh5bt5a754ad4mvqaa7ar6cyyd7k7e",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mie4mywnr7d2"
},
"path": "/t/invoice-data-recognition/174564#post_3",
"publishedAt": "2026-03-31T09:49:43.000Z",
"site": "https://discuss.huggingface.co",
"textContent": "What about the costs of shipping, freight (which is different from shipping in our accounting system), and discounts? Those all affect the final invoice amount.\n\nJohn6666 gives good advice. I once tried to extract text from 500+ invoices in a single PDF file, and it didn’t go well. I was searching for text like “invoice number \\d{4,6}”, where the invoice number was 4-6 digits in a row, but the OCR software would often not extract this text in the same order. As a result, the pattern failed to find the invoice number, and many other fields as well.\n\nBut I did that with free OCR software, not with AI designed to do OCR.",
"title": "Invoice Data Recognition"
}