A Process to Extract the BCA Bank’s E-Statement Data Using Google Document AI
There are many ways to extract data from a PDF file. Google provides Document AI, a service built specifically for extracting data from documents. We will try three of its processors: Document OCR, the specialized Bank Statement Parser, and a custom processor.
Google also has a Vertex AI OCR parser that can scan PDF files.
Processor Overview
We can create our own custom processor or use an existing one by clicking the “Explore Processor” button, which brings you to the Processor Gallery page.
The difference is that you can train the model of a custom processor, but you cannot train the model of an existing processor.
If your documents are common ones, an existing processor may be a good fit. If the file has a less common layout, e.g. the BCA e-statement, you can go with a custom processor.
In this article, we will try processors from the gallery and then create a custom processor to extract the text from the PDF.
Existing Processor
In this case, we want to extract the text from the bank’s e-statement, so the suitable processors are Document OCR (General) and Bank Statement Parser (Specialized).
Document OCR
We do not expect the result to be accurate, because this is a general-purpose OCR. Let’s create the processor with Document OCR; for example, select the US region.
After that, you will see its details.
Let’s continue by uploading the test document and see how the model analyzes it.
As we can see, the model selected every piece of text in the uploaded PDF file. In the sidebar, we can see the list of selected text. Click the “Extract JSON” button in the navbar to see the structured data.
Result:
- The “text” key contains all selected text.
- Some of the scanned text is incorrect, in both the text selection and the order.
- The “entities” property does not exist, so it is hard to pull specific values out of the result.
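To make these observations concrete, here is a minimal sketch that checks what such a response offers. The `sampleOcrResponse` object is a hand-written stand-in (an assumption, not real output); only the “text” and “entities” keys come from the result described above.

```javascript
// Hypothetical, hand-written stand-in for a Document OCR response (not real output).
const sampleOcrResponse = {
  document: {
    text: "REKENING TAHAPAN\nNO. REKENING : 1234567890\nSALDO AWAL 1,000,000.00\n",
  },
};

// Summarize what the response gives us to work with.
function summarizeOcrResponse(response) {
  const doc = response.document ?? {};
  const text = doc.text ?? "";
  const entities = doc.entities ?? [];
  return {
    textLength: text.length,
    // Count the non-empty lines of the flat text blob.
    lineCount: text.split("\n").filter((line) => line.trim() !== "").length,
    // Document OCR returns no labeled fields, so this stays false.
    hasEntities: entities.length > 0,
  };
}

const ocrSummary = summarizeOcrResponse(sampleOcrResponse);
console.log(ocrSummary);
```

With only one flat text blob available, any field extraction would have to rely on fragile string matching, which is the reason to look at processors that return entities.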
Bank Statement Parser
If we want to try this processor, we need to request access from Google, because it is a private program.
They redirected me to a Google form.
Let’s skip this and move to the custom processor.
Custom Processor
Custom processors allow us to train the model until it fits our e-statement format. The selections are collected in the “entities” property, so we can easily get specific data.
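As a rough illustration of why the “entities” property is convenient, the sketch below groups extracted values by field name. The `sampleEntities` array is hypothetical; it assumes each entity carries a `type` (the field name) and a `mentionText` (the selected text).

```javascript
// Hypothetical entities, shaped like the "entities" array of a custom processor response.
const sampleEntities = [
  { type: "account_number", mentionText: "1234567890" },
  { type: "transaction_title", mentionText: "TRSF E-BANKING DB" },
  { type: "transaction_title", mentionText: "BIAYA ADM" },
];

// Group the extracted values by field name.
function groupEntitiesByType(entities) {
  const grouped = {};
  for (const entity of entities) {
    if (!grouped[entity.type]) grouped[entity.type] = [];
    grouped[entity.type].push(entity.mentionText);
  }
  return grouped;
}

const groupedEntities = groupEntitiesByType(sampleEntities);
console.log(groupedEntities);
```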
Create Custom Extractor
Let’s create a “Custom Extractor”
This is an overview of the created custom processor
Define the Fields
Let’s get started and upload the test document
After the file is uploaded, you will see the same analysis result screen. The difference is that you now need to define fields, so that you get only the data you need.
When you create the fields, the AI helps you by automatically selecting the text. For example, when I add an account_number or transaction_title field, the AI automatically selects the matching area, and if the selection is correct, you confirm it by checking it.
Always take action on the purple selections (AI suggestions): confirm, delete, or modify each one.
In my case, “Saldo Awal” is not selected, so I need to “Add instance” and then click “Annotate” to draw the area covering that text.
These are the created fields:
- account_information (type: Plain text, occurrence: Required once)
- account_number (type: Plain text, occurrence: Required once)
- period (type: Datetime, occurrence: Required once)
- transaction_date (type: Datetime, occurrence: Optional multiple)
- transaction_title (type: Plain text, occurrence: Optional multiple)
- transaction_detail (type: Plain text, occurrence: Optional multiple)
- transaction_amount (type: Number, occurrence: Optional multiple)
- transaction_type (type: Plain text, occurrence: Optional multiple)
- current_balance (type: Number, occurrence: Optional multiple)
Make sure all the text that you want to extract is labeled on every page. If you only label page 1, the result will not be accurate.
After you have added all the fields, save them by clicking the “MARK AS LABELED” button. You will then see the field summary.
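The occurrence rules in this list can also be checked in code once a response comes back. The following is a minimal sketch, assuming each entity has a `type` equal to the field name; the rule labels `required_once` and `optional_multiple` are my own shorthand for the occurrence settings, not Document AI identifiers.

```javascript
// Field names and occurrence rules copied from the field list above.
const FIELD_OCCURRENCE = {
  account_information: "required_once",
  account_number: "required_once",
  period: "required_once",
  transaction_date: "optional_multiple",
  transaction_title: "optional_multiple",
  transaction_detail: "optional_multiple",
  transaction_amount: "optional_multiple",
  transaction_type: "optional_multiple",
  current_balance: "optional_multiple",
};

// Return a list of problems; an empty list means the entities respect the rules.
function checkOccurrences(entities) {
  const counts = {};
  for (const entity of entities) {
    counts[entity.type] = (counts[entity.type] ?? 0) + 1;
  }
  const problems = [];
  for (const [field, rule] of Object.entries(FIELD_OCCURRENCE)) {
    const count = counts[field] ?? 0;
    if (rule === "required_once" && count !== 1) {
      problems.push(`${field}: expected exactly 1, found ${count}`);
    }
  }
  for (const type of Object.keys(counts)) {
    if (!(type in FIELD_OCCURRENCE)) problems.push(`${type}: unknown field`);
  }
  return problems;
}

// Example: only the account number was extracted, so two required fields are missing.
console.log(checkOccurrences([{ type: "account_number", mentionText: "1234567890" }]));
```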
Let’s go to the next step
Build and Evaluate the Model
Let’s import the documents for training and testing; I use the same document for both. Then, if the “start labeling” button is still active, label the remaining documents so that the whole dataset is covered.
After you finish labeling, the “Unlabeled” count should be 0. Then click “Create New Version” in the “Call foundation model” section and name the version “version-1-0-0”. A snackbar notification appears at the bottom of the page.
When the version has been created, you will see a completion notification.
Let’s move on to evaluating the version.
Deploy & Use
Make sure the version is deployed; if it is not, click the deploy button.
Then let’s move to the evaluation page to test it.
Evaluate and Test
Select your processor version, and run a new evaluation.
It will show you the details of the version.
Then you can test it again by uploading the test document to see the result. If the result is not good enough, add more document types or labels in the “Improve your processor” section.
Build — Fine Tuning and Train
We should improve the model to increase its accuracy, which depends on how much training data and how many labels you use. Fine-tuning and training have some requirements, so let's look at them.
We need at least 10 documents for training and testing, so prepare those first. After you upload them, you need to label the documents again.
Evaluation
Training still takes time and the result is still not accurate, so I decided to try another selection method:
- Select the entire transaction row instead of each piece of text, then process each row to get the transaction date, title, detail, amount, type, and balance.
Try Different Selection Methods
Follow the same flow until you have created and deployed the version.
The difference is in the fields: this time we only use bank_name (type: Plain text, occurrence: Optional multiple) and transaction_row (type: Plain text, occurrence: Optional multiple).
That’s simple.
Deploy the Version
After the version is created, it will be shown in the deployment list, and you need to deploy it.
Consume the Version
After the version is deployed, you can send a request to that version. There is a sample request button to see how to interact with the version.
Copy the “Prediction Endpoint” from the version details.
Next, I try it with Postman.
I got the JSON response from that request.
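The same request can be sent from code instead of Postman. This is a minimal sketch, assuming Node 18+ (for the global `fetch`), an OAuth access token obtained separately (for example with `gcloud auth print-access-token`), and the “Prediction Endpoint” URL copied from the console. The function names `buildProcessRequest` and `processStatement` are hypothetical helpers, not part of any Google library.

```javascript
// Build the HTTP request for a Document AI ":process" call.
// predictionEndpoint: the "Prediction Endpoint" URL copied from the version details.
// accessToken: an OAuth 2.0 access token (assumed to be obtained elsewhere).
// pdfBytes: a Buffer or Uint8Array with the PDF content.
function buildProcessRequest(predictionEndpoint, accessToken, pdfBytes) {
  return {
    url: predictionEndpoint,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${accessToken}`,
        "Content-Type": "application/json; charset=utf-8",
      },
      body: JSON.stringify({
        rawDocument: {
          // The REST API expects the file content as a base64 string.
          content: Buffer.from(pdfBytes).toString("base64"),
          mimeType: "application/pdf",
        },
      }),
    },
  };
}

// Send the request and return the parsed JSON response.
async function processStatement(predictionEndpoint, accessToken, pdfBytes) {
  const { url, options } = buildProcessRequest(predictionEndpoint, accessToken, pdfBytes);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Document AI request failed with status ${res.status}`);
  return res.json();
}

// Usage (not executed here):
//   const fs = require("node:fs");
//   const response = await processStatement(endpoint, token, fs.readFileSync("statement.pdf"));
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before sending anything.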
Extract the Transaction From The JSON
The fields that we created before are shown in the “entities” property.
These are the samples of the entities' values.
Now I want an array of transaction strings, so I transform the JSON with this code:
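// documentAIResponse is the parsed JSON body returned by the prediction endpoint
const documentAIResponse = { document: { entities: [] } }; // replace with the real response
const transactions = documentAIResponse.document.entities
  // keep only the entities labeled with the transaction_row field
  .filter((e) => e.type === "transaction_row")
  // flatten multi-line rows into single-line strings
  .map((e) => e.mentionText.replaceAll("\n", " "));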
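Each row string can then be split into its parts. The sketch below is only an illustration under a loud assumption: it expects a row shaped like `DD/MM description amount [DB] [balance]`, with amounts written as `1,234.56` and an optional `DB` marker for debits. That layout and the sample row are my assumptions, so adjust the pattern to what your statement actually contains. Splitting the description further into title and detail is left out.

```javascript
// Assumed row layout (hypothetical): "DD/MM description amount [DB] [balance]".
const ROW_PATTERN =
  /^(\d{2}\/\d{2})\s+(.+?)\s+([\d,]+\.\d{2})(?:\s+(DB))?(?:\s+([\d,]+\.\d{2}))?$/;

// Convert "1,950,000.00" to the number 1950000.
function parseAmount(text) {
  return Number(text.replaceAll(",", ""));
}

// Parse one flattened transaction row; returns null when the row does not match.
function parseTransactionRow(row) {
  // Collapse repeated whitespace left over from the PDF layout.
  const match = row.trim().replace(/\s+/g, " ").match(ROW_PATTERN);
  if (!match) return null;
  const [, date, description, amount, debitMarker, balance] = match;
  return {
    date,
    description,
    amount: parseAmount(amount),
    // Assumption: rows without the DB marker are credits.
    type: debitMarker ? "debit" : "credit",
    balance: balance === undefined ? null : parseAmount(balance),
  };
}

// Made-up sample row, for illustration only.
console.log(
  parseTransactionRow("01/02 TRSF E-BANKING DB 0102/FTSCY/WS95051 50,000.00 DB 1,950,000.00")
);
```

Rows that return null are worth logging, since they usually point to a layout the pattern does not cover yet.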
Conclusion
We have succeeded in extracting the data from the BCA e-statement. In the next article, we will classify the transactions into defined categories, e.g. Transfer, Food, Clothes, etc.
Thank you for reading my article!
Reach me on Linkedin: https://www.linkedin.com/in/didikmulyadi