ORDERCAPTURE OCR FOR ERPNext
Upload purchase orders. Extract line items. Create Sales Orders automatically.
1. Overview
OrderCapture OCR is an ERPNext application that automates purchase order processing using optical character recognition and structured document parsing. Instead of manually reading PDFs or Excel sheets and typing order lines, users upload customer purchase orders and let the system extract customer, address, item, quantity, and pricing details into an interactive review screen.
The app supports multiple file formats such as PDF, XLS, XLSX and CSV, with specialized handling for vendor-specific layouts (for example FlipKart and BB). This allows the system to interpret different marketplace or customer templates without changing your internal ERPNext process.
Once the data is extracted, users verify and adjust it in a guided dialog, compare rates against the customer's price list, match or create addresses, and then post a clean Sales Order in ERPNext with the original document attached for full auditability.
2. What Problems Does It Solve?
OrderCapture OCR eliminates many manual, repetitive and error-prone tasks in sales order creation:
- Reading POs manually from PDFs and spreadsheets → Automated extraction of order details and totals directly from uploaded documents.
- Typing each item, quantity and rate into ERPNext → Items are pre-filled into a review table where users only fine-tune or correct specific fields.
- Handling different vendor PO formats manually → Vendor models (e.g. FlipKart, BB) help interpret columns and structure for non-PDF files.
- Missing or inconsistent customer addresses → Intelligent address matching suggests the closest saved address or prompts to create a new one.
- Not catching price mismatches against your price list → Rate comparison with the customer price list and visual highlights for discrepancies.
- Accidentally creating duplicate Sales Orders for the same PO → Duplicate PO checks before Sales Order posting.
- Slow processing of multiple incoming POs → A dedicated dashboard to upload, review, track status, and create Sales Orders in bulk.
3. Key Benefits
- High-speed order capture
  Cut down the time spent converting customer purchase orders into ERPNext Sales Orders by automatically extracting key fields.
- Lower data entry errors
  Data flows from document → structured JSON → ERPNext, with humans focusing on review and exceptions instead of manual typing.
- Smarter pricing visibility
  Integrated price list checks and row-level highlights make it easy to catch under-billing or rate mismatches before confirming the order.
- Better control & auditability
  Each processed document is tracked with status (Pending, Failed, Completed), linked Sales Order, and original file attached for future reference.
- Consistent order structure
  Even when POs arrive in different formats, Sales Orders in ERPNext follow your standard structure: correct customer, address, items, and taxes.
- Flexible OCR and parsing options
  Support for classic OCR endpoints plus LLM-based structured extraction models (e.g. Gemini) gives you flexibility for complex layouts.
4. Core Features
- Multi-format document upload
  Upload purchase orders as PDF, CSV, XLS, or XLSX directly from the OrderCapture OCR dashboard. The uploader supports drag-and-drop and multiple files at once.
- Customer-linked uploads
  Each upload is tied to a chosen customer, so extracted orders are always associated with the correct ERPNext Customer record. Recent documents auto-suggest the last used customer.
- OCR Document Processor doctype
  Every file creates an OCR Document Processor record storing file path, customer, date, status, vendor type, processed JSON, and linked Sales Order. This becomes the anchor for processing and tracking.
- Interactive processing dialog
  A full-screen dialog shows customer details, address, PO metadata, extracted items table, totals, and action buttons (Process File, View File, Fetch Price List Rate, Save Changes, Post Sales Order) with document navigation controls.
- Smart extraction based on file type and OCR model
  PDF files use the configured OCR model (e.g. Gemini vs default). Non-PDF files use a structured extraction method with vendor_type (FlipKart / BB) passed for correct interpretation.
- Vendor-specific logic for Excel/CSV POs
  For XLS/CSV formats, users can select a vendor type. The system remembers this choice on the OCR Document Processor and adjusts extraction rules accordingly.
- Customer & address enrichment
  Customer details are fetched from ERPNext, and intelligent matching logic tries to find best-fit addresses based on similarity, updating the document with both the link and formatted address display.
- Item grid with rate comparison and highlighting
  Extracted items populate a grid with columns like item code, name, quantity, rate, landing rate, GST, price-list rate, and line total. Rows automatically highlight when the PO rate differs from the price list rate or when mapping is missing.
- Totals and tax breakdown
  The dialog calculates total item quantity, grand total, total taxes, and net amount to give users quick financial visibility before creating the Sales Order.
- Price list rate fetch & validation workflow
  With one action, the system fetches price list rates for all items based on the customer's selling price list, verifies against Customer Item Code Mapping, and marks problematic rows prominently.
- Save Changes & reprocessing support
  Users can adjust table data and metadata, then save changes back as structured JSON. Documents can be reprocessed if needed without losing context.
- Sales Order creation & linking
  On confirmation, the app creates a Sales Order, attaches the original file, updates the OCR Document Processor status to Completed, and links the Sales Order back for quick navigation.
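The similarity-based address matching mentioned in the feature list could be approximated with a plain string-similarity score. The sketch below is only an illustration under stated assumptions: the function names `normalize` and `best_address_match`, the 0.6 threshold, and the use of Python's `difflib` are all hypothetical and are not taken from the app's code.

```python
from difflib import SequenceMatcher
from typing import Optional


def normalize(text: str) -> str:
    """Lowercase the text and replace punctuation with spaces so that
    formatting differences do not dominate the similarity score."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def best_address_match(
    extracted: str,
    saved: dict[str, str],
    threshold: float = 0.6,  # assumed cut-off, tune for real data
) -> Optional[str]:
    """Return the name of the most similar saved address,
    or None if no candidate reaches the threshold."""
    target = normalize(extracted)
    best_name, best_score = None, 0.0
    for name, text in saved.items():
        score = SequenceMatcher(None, target, normalize(text)).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

When the function returns None, the review dialog would be the natural place to prompt the user to create a new address instead of linking an existing one.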
5. How It Works
- Upload customer purchase orders
  From the OrderCapture OCR dashboard, users select a customer and upload one or more PO files (PDF, XLS, CSV, XLSX). Each file creates an OCR Document Processor record with status Pending.
- Open the processing dialog
  Clicking "Process" on a document opens a full-screen dialog showing file details, customer info, address, and an initially empty items table.
- Run OCR / structured extraction
  The user clicks "Process File". Depending on the file extension and configured OCR model, the app either calls a PDF OCR engine or an Excel/CSV parser with vendor-specific logic. The system stores the processed JSON and fills the item table.
- Verify customer, address, items and prices
  Users review the extracted data, run price list checks, resolve mapping issues, and adjust quantities or rates where necessary. Totals and tax fields update automatically.
- Save changes & finalize structure
  When satisfied, users save changes. The current table and header data are written back into structured JSON, ensuring a clean snapshot of the order.
- Create Sales Order in ERPNext
  Clicking "Post Sales Order" transforms the structured OCR data into a new Sales Order, attaches the original file, and updates the status to Completed.
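To make the "save changes" step more concrete, here is a minimal sketch of how reviewed header and item data might be written back as a JSON snapshot with recalculated totals. The field names (`qty`, `rate`, `gst_percent`, `net_total`, `total_tax`, `grand_total`) and the function `build_snapshot` are assumptions for illustration; the sketch also assumes that the net total is the pre-tax sum and the grand total is net plus tax, which the app may define differently.

```python
import json


def build_snapshot(header: dict, rows: list[dict]) -> str:
    """Serialize reviewed header and item rows into one JSON document.

    Assumed row keys: item_code, qty, rate, gst_percent (optional).
    """
    items = []
    net_total = 0.0
    total_tax = 0.0
    total_qty = 0.0
    for row in rows:
        amount = row["qty"] * row["rate"]            # pre-tax line total
        tax = amount * row.get("gst_percent", 0) / 100
        items.append({**row, "amount": round(amount, 2)})
        net_total += amount
        total_tax += tax
        total_qty += row["qty"]

    snapshot = {
        "header": header,
        "items": items,
        "totals": {
            "total_qty": total_qty,
            "net_total": round(net_total, 2),
            "total_tax": round(total_tax, 2),
            "grand_total": round(net_total + total_tax, 2),
        },
    }
    return json.dumps(snapshot, indent=2, sort_keys=True)
```

Recomputing totals at save time, rather than trusting values extracted from the PO, keeps the stored snapshot consistent with whatever quantities and rates the user adjusted in the dialog.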
6. Common Use Cases
- Processing daily incoming purchase orders from key customers
- Capturing marketplace / e-commerce POs (FlipKart, BB, etc.) into ERPNext
- Monitoring price adherence vs customer-specific price lists
- Speeding up order entry for high-volume B2B customers
- Reprocessing or correcting previously uploaded documents with improved mappings
7. Who Is It For?
- Sales Operations teams handling large numbers of customer POs
- Finance teams who want consistency between POs and Sales Orders
- B2B companies dealing with marketplace or portal-based purchase orders
- ERPNext admins seeking to reduce manual order entry workload
- Businesses scaling order volume and wanting automation without losing control
Industries:
- Manufacturing
- Wholesale & Distribution
- E-commerce & Marketplaces
- B2B Services
- Any business receiving recurring purchase orders from customers
8. Setup & Onboarding
- Install the OrderCapture OCR app on your Frappe/ERPNext instance.
- Configure Order Capture OCR Configuration (OCR model, default settings).
- (Optional) Define vendor models and mappings for special PO formats (e.g. FlipKart, BB).
- Ensure customers and basic item masters are created in ERPNext.
- Open the Order Capture OCR page, select a customer and upload test files.
- Process 1-2 sample documents, review extraction, and verify rates using price list comparison.
- Post a Sales Order from a processed document and validate downstream flow (delivery, invoice, etc.).
- Roll out to daily operations for the defined customers and vendor types.
9. Technical Details (Admins Only)
Architecture
- Vue.js front-end dashboard embedded in a Frappe page
- OCR Document Processor doctype as central record
- Multiple client-side components (document loader, table handler, navigation handler, sales order handler, save handler) orchestrating the workflow
File handling
- Uploaded files are stored and linked to the OCR Document Processor record
- Supports PDF, CSV, and Excel file formats
- File paths are used for in-dialog preview and are also attached to the created Sales Order for audits
OCR & parsing
- PDF documents are processed using the configured OCR engine (standard OCR or LLM-powered extraction)
- Spreadsheet files use structured parsing logic, adaptable using vendor-type configurations
- Extracted output is stored as structured JSON inside the OCR Document Processor
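The routing between PDF OCR and spreadsheet parsing can be pictured as a small dispatch on the file extension. This is a hedged sketch rather than the app's actual code: `choose_extraction`, the method labels `pdf_ocr` and `structured_parse`, and the shape of the returned dictionary are hypothetical; only the supported extensions, the OCR model setting, and `vendor_type` come from the description above.

```python
from pathlib import Path
from typing import Optional

SPREADSHEET_EXTENSIONS = {".csv", ".xls", ".xlsx"}


def choose_extraction(
    file_name: str,
    ocr_model: str = "default",
    vendor_type: Optional[str] = None,
) -> dict:
    """Decide which extraction path a file should take.

    PDFs go to the configured OCR model; spreadsheets go to the
    structured parser together with the selected vendor type.
    """
    ext = Path(file_name).suffix.lower()
    if ext == ".pdf":
        return {"method": "pdf_ocr", "ocr_model": ocr_model}
    if ext in SPREADSHEET_EXTENSIONS:
        return {"method": "structured_parse", "vendor_type": vendor_type}
    raise ValueError(f"Unsupported file type: {ext or file_name}")
```

Rejecting unknown extensions up front means an unsupported upload fails with a clear message before any OCR or parsing call is made.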
Price & item logic
- Customer price list is auto-identified and used for correct rate comparison
- Customer item codes map to internal item master records
- Highlighting is applied where rates or mappings mismatch expected values
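As an illustration of the highlighting rules, the following sketch flags a row when the item mapping is missing or when the PO rate differs from the price list rate. The function name `row_flags`, the flag labels, and the 0.01 tolerance are assumptions made for this example and do not describe the app's internal implementation.

```python
from typing import Optional


def row_flags(
    po_rate: float,
    price_list_rate: Optional[float],
    item_code: Optional[str],
    tolerance: float = 0.01,  # assumed rounding tolerance
) -> list[str]:
    """Return the reasons a row should be highlighted (empty list if none)."""
    flags = []
    if not item_code:
        # The customer item code could not be mapped to an internal item.
        flags.append("missing_mapping")
    if price_list_rate is None:
        # No rate found in the customer's price list for this item.
        flags.append("no_price_list_rate")
    elif abs(po_rate - price_list_rate) > tolerance:
        flags.append("rate_mismatch")
    return flags
```

Returning a list of reasons rather than a single boolean lets the grid show why a row is highlighted, which helps users decide whether to fix a mapping or a rate.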
Sales Order Generation
- Cleaned and validated extracted data becomes a Sales Order payload
- Original PO file is attached to the Sales Order for audit trails
- OCR document status, reference and timestamps update automatically
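A rough sketch of turning the reviewed data into a Sales Order payload, including a duplicate PO check, might look like the following. The function `build_sales_order_payload` and the snapshot keys are hypothetical; the payload field names (`customer`, `po_no`, `items`, `item_code`, `qty`, `rate`) follow common ERPNext Sales Order conventions but should be verified against your ERPNext version. The set of existing PO numbers is assumed to have been looked up beforehand.

```python
def build_sales_order_payload(snapshot: dict, existing_po_numbers: set[str]) -> dict:
    """Build a Sales Order payload from reviewed data.

    Raises ValueError if a Sales Order for the same PO number already exists.
    """
    po_no = snapshot["po_number"]
    if po_no in existing_po_numbers:
        raise ValueError(f"A Sales Order already exists for PO {po_no}")

    return {
        "doctype": "Sales Order",
        "customer": snapshot["customer"],
        "po_no": po_no,
        "items": [
            {"item_code": row["item_code"], "qty": row["qty"], "rate": row["rate"]}
            for row in snapshot["items"]
        ],
    }
```

Running the duplicate check before building the payload keeps a repeated upload of the same PO from reaching the posting step at all.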
10. Frequently Asked Questions
- Which file types are supported?
  PDF, CSV, XLS, and XLSX are supported for purchase order uploads.
- Do I still need to check the data after extraction?
  Yes, the system automates extraction, but users should review items, rates and totals before posting the Sales Order.
- Can it handle different formats for different customers or marketplaces?
  Yes, vendor models and vendor_type support allow tailoring of parsing logic to specific sources like FlipKart or BB.
- What happens if an item code cannot be mapped?
  The row is highlighted and listed as an exception; users can update mappings or item codes before posting.
- Will the original PO file be available later?
  Yes, the file is stored in ERPNext and attached to the created Sales Order for future reference.
- Can I reprocess or edit a document after an error?
  Yes, you can reopen the processing dialog, adjust data, save changes, and retry posting as needed.
11. Security & Data Handling
- Files are uploaded and stored within your ERPNext file system.
- Communication with OCR endpoints uses HTTPS.
- Processed JSON, status and links are stored in ERPNext doctypes for full traceability.
- No external system receives data beyond your configured OCR/LLM providers.
- Access is governed by ERPNext user permissions on the relevant doctypes.
12. Availability
OrderCapture OCR is offered as a Frappe app and implementation service by Akhilam Inc.
We can help you with:
- App installation and configuration
- Vendor model design and mapping (FlipKart, BB, other marketplaces)
- Customization of dialogs and item grids
- End-to-end testing and go-live support
13. Support
Email: support@akhilaminc.com
Or submit a request through our website.
14. Related Integrations
- Airwallex Bank Integration
- Xero Integration
- PostGrid Integration
- Other ERPNext automation apps by AkhilaM Inc