Bank Statement Parsing
This article explains how Account Reconciliation parses bank statement files to extract transaction data for reconciliation matching.
What is Parsing?
Parsing is the process of reading and extracting structured data from uploaded bank statement files. When you upload a bank statement, the system automatically:
- Identifies the file format (PDF, CSV, or Excel)
- Locates transaction tables within the file
- Extracts transaction details (dates, descriptions, amounts)
- Extracts the closing balance (when available)
- Prepares the data for AI-powered matching against GL transactions
The parsing process runs automatically when you upload bank statement files to a reconciliation. No manual configuration is required.
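As an illustration of the extraction step for a structured file, the following sketch reads CSV text and separates transaction rows from a closing balance row. The column names (`Date`, `Description`, `Amount`), the `Closing Balance` row label, and the function name `parse_statement_csv` are assumptions made for this example. Actual bank exports vary, and this is not the product's internal parser.

```python
import csv
import io
from decimal import Decimal


def parse_statement_csv(text):
    """Return (transactions, closing_balance) from CSV text.

    Assumed layout (hypothetical): a header row Date,Description,Amount,
    with an optional row whose Description is "Closing Balance".
    """
    transactions = []
    closing_balance = None
    for row in csv.DictReader(io.StringIO(text)):
        amount = Decimal(row["Amount"])
        if row["Description"].strip().lower() == "closing balance":
            # Balance row: record it, but do not treat it as a transaction.
            closing_balance = amount
            continue
        transactions.append(
            {
                "date": row["Date"],
                "description": row["Description"],
                "amount": amount,
            }
        )
    return transactions, closing_balance


sample = (
    "Date,Description,Amount\n"
    "2024-01-05,Office rent,-1200.00\n"
    "2024-01-31,Closing Balance,8300.50\n"
)
transactions, closing_balance = parse_statement_csv(sample)
```

When the file has no closing balance row, `closing_balance` stays `None`, which corresponds to the case where the balance must come from another source.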
Supported File Types
Account Reconciliation supports the following bank statement formats:
| File Type | Extensions | Parsing Capability |
|---|---|---|
| CSV | .csv | Full parsing (transactions and closing balance if present) |
| Excel | .xlsx, .xls | Full parsing (transactions and closing balance if present) |
| PDF | .pdf | Full parsing including multi-page support |
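Format identification can be pictured as a lookup on the file extension. The sketch below is a simplified illustration, not the system's actual detection logic: the mapping name `EXTENSION_FORMATS` and the function `detect_format` are hypothetical, and the `.pdf` extension is assumed for PDF files.

```python
from pathlib import Path

# Hypothetical mapping from file extension to statement format.
EXTENSION_FORMATS = {
    ".csv": "CSV",
    ".xlsx": "Excel",
    ".xls": "Excel",
    ".pdf": "PDF",  # assumed extension for PDF statements
}


def detect_format(filename):
    """Return the statement format for a filename, based on its extension."""
    ext = Path(filename).suffix.lower()
    if ext not in EXTENSION_FORMATS:
        raise ValueError(f"Unsupported bank statement format: {ext or '(none)'}")
    return EXTENSION_FORMATS[ext]
```

The extension is lowercased first, so `statement.CSV` and `statement.csv` are treated the same way in this sketch.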
Parsing Rules
When you upload bank statement files, the system applies the following rules:
Rule 1: CSV/Excel files are always preferred over PDF files
CSV and Excel files contain structured data that the system can read directly and accurately. PDF files require additional processing to extract data from visual layouts.
Rule 2: Transactions are extracted from CSV/Excel when available
If you upload a CSV or Excel file, the system extracts transaction data from that file. PDF is used for transactions only when no CSV/Excel file is uploaded.
Rule 3: Closing balance source depends on what's available
- If CSV/Excel contains a closing balance → System uses it
- If CSV/Excel does not contain a closing balance but PDF is uploaded → System extracts closing balance from PDF
- If only CSV/Excel is uploaded without closing balance → You must enter closing balance manually
Rule 4: PDF serves different roles based on what else is uploaded
- PDF only → Primary source for all data
- PDF with CSV/Excel (CSV/Excel has closing balance) → PDF attached for reference only
- PDF with CSV/Excel (CSV/Excel has no closing balance) → PDF used for closing balance only
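The four rules above can be condensed into a single decision function. The following is a minimal sketch of that logic under the stated rules. The `SourcePlan` type, the `plan_sources` function, and the string labels are hypothetical names chosen for illustration and do not represent the product's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourcePlan:
    transactions_from: str     # "structured" (CSV/Excel) or "pdf"
    closing_balance_from: str  # "structured", "pdf", or "manual"
    pdf_role: Optional[str]    # "primary", "closing_balance_only", "reference_only", or None


def plan_sources(has_structured, structured_has_balance, has_pdf):
    """Decide where transactions and the closing balance come from."""
    if not has_structured and not has_pdf:
        raise ValueError("No bank statement file uploaded")
    if not has_structured:
        # PDF only: primary source for all data.
        return SourcePlan("pdf", "pdf", "primary")
    if structured_has_balance:
        # CSV/Excel supplies everything; any PDF is kept for reference.
        return SourcePlan("structured", "structured", "reference_only" if has_pdf else None)
    if has_pdf:
        # Transactions from CSV/Excel, closing balance from the PDF.
        return SourcePlan("structured", "pdf", "closing_balance_only")
    # CSV/Excel only, no closing balance: the user enters it manually.
    return SourcePlan("structured", "manual", None)
```

For example, `plan_sources(True, False, True)` describes the case where both file types are uploaded and the structured file lacks a closing balance, so the PDF is consulted for the balance only.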
Parsing Workflow Scenarios
The following table explains how the system handles different file upload combinations:
| Files Uploaded | Description |
|---|---|
| PDF only | If only a PDF is uploaded, the system performs full PDF parsing. Both transaction data and the closing balance are extracted from the PDF. |
| CSV/Excel only (with Closing Balance) | If only a CSV/Excel file is uploaded and it contains a closing balance, the system extracts all data from the file. Both transaction lines and the closing balance come from the structured file. |
| CSV/Excel only (without Closing Balance) | If only a CSV/Excel file is uploaded and it does not contain a closing balance, the system extracts transaction data from the file but requires you to enter the closing balance manually. |
| PDF + CSV/Excel (CSV/Excel has Closing Balance) | If both PDF and CSV/Excel are uploaded and the CSV/Excel contains a closing balance, the system uses CSV/Excel as the primary data source. The PDF is attached for reference and audit purposes only. |
| PDF + CSV/Excel (CSV/Excel has no Closing Balance) | If both PDF and CSV/Excel are uploaded but the CSV/Excel does not contain a closing balance, the system extracts transactions from CSV/Excel and falls back to the PDF for the closing balance. |
Multi-Page PDF Parsing
Multi-page PDF parsing is the system's ability to process bank statements that span multiple pages within a single PDF document. Many bank statements, especially for accounts with high transaction volumes, extend across several pages. Account Reconciliation reads and processes all pages in a PDF file as a unified document, extracting transaction data regardless of which page it appears on.
When processing multi-page PDF files, the system:
- Analyzes all pages within the document
- Identifies transaction tables across pages
- Extracts transaction dates, descriptions, and amounts
- Consolidates data from all pages into a single transaction set
- Triggers AI matching against GL transactions
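The consolidation step can be sketched as merging per-page rows into one list. The example below assumes the rows have already been extracted from each page and are held as dictionaries. The function name `consolidate_pages` and the added `page` field are illustrative assumptions, not a description of the system's internals.

```python
from typing import Dict, List

Row = Dict[str, str]


def consolidate_pages(pages: List[List[Row]]) -> List[Row]:
    """Merge per-page transaction rows into one list, preserving page order."""
    consolidated = []
    for page_number, rows in enumerate(pages, start=1):
        for row in rows:
            # Record the source page on each row ("page" is an illustrative field).
            consolidated.append({**row, "page": str(page_number)})
    return consolidated


pages = [
    [
        {"date": "2024-01-03", "description": "Card payment", "amount": "-45.10"},
        {"date": "2024-01-09", "description": "Customer receipt", "amount": "900.00"},
    ],
    [
        {"date": "2024-01-28", "description": "Bank fee", "amount": "-12.00"},
    ],
]
all_rows = consolidate_pages(pages)
```

Keeping the page number on each row is a design choice for this sketch: it makes it easier to trace a consolidated transaction back to its place in the original statement.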
Best Practices for PDF Files
To ensure optimal PDF parsing results:
- Use electronically generated PDFs: these contain embedded text data that the system can read directly. Scanned documents may require OCR, which can introduce errors.
- Avoid password-protected files: the system cannot access protected files and will return an error status.
- Match the statement period to the reconciliation period: period mismatches can cause incorrect data alignment and parsing failures.
- Use original bank-generated PDFs: third-party modified PDFs or statements exported from other applications may have non-standard formatting.
The following table lists common parsing errors, their causes, and how to resolve them:
| Error | Cause | Resolution |
|---|---|---|
| Password-protected PDF | The system cannot read protected files | Remove password protection before uploading or request an unprotected version from your bank |
| Scanned image without text | Scanned documents lack embedded text | Use an electronically generated PDF or run OCR software before uploading |
| Corrupted file | File is incomplete or damaged | Re-download the statement from your bank portal |
| Non-standard formatting | Unusual table layouts the system cannot interpret | Upload a CSV or Excel version of the statement instead |
| Period mismatch | Statement dates do not align with reconciliation period | Verify the uploaded file matches the correct accounting period |
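A period mismatch can often be caught before uploading by checking the statement's transaction dates against the reconciliation period. The sketch below shows one way to do that locally. The function name `dates_outside_period` is hypothetical, and the check is an assumption about what a mismatch looks like, not the system's actual validation.

```python
from datetime import date
from typing import Iterable, List


def dates_outside_period(
    txn_dates: Iterable[date], period_start: date, period_end: date
) -> List[date]:
    """Return the transaction dates that fall outside the period (inclusive bounds)."""
    return [d for d in txn_dates if not (period_start <= d <= period_end)]


statement_dates = [date(2024, 1, 5), date(2024, 1, 31), date(2024, 2, 1)]
mismatched = dates_outside_period(statement_dates, date(2024, 1, 1), date(2024, 1, 31))
```

An empty result means every transaction date falls within the period. Any returned dates suggest the file may belong to a different accounting period.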