Bank’s E-Statement Data Extraction Using Google Vertex AI Studio (Gemini)

Didik Mulyadi
Jun 20, 2024


Photo by Growtika on Unsplash

Google Gemini (Generative AI) makes tasks such as extraction and classification easy to implement. If the image or page has a clean layout and contains common kinds of data, Generative AI can be a good alternative to Google Document AI.

We will extract the data from a PDF file with Google Vertex AI Studio (Gemini).

Vertex AI Studio

Go to Multimodal and create a single-turn prompt. We will extract and classify the transactions in the same request.

Then, upload the statement file and enter this prompt:

Extract the data from that file into an object like this {bank,account_name, account_number,estatement_date, transactions: [{ date, detail, amount, balance, category}]},
for the category field, please categorize the transaction based on the detail's field with this options: groceries, transfer IN, top-up e-money, transfer OUT, investment, withdraw, transaction fee, utilities. please consider this sample to determine the category:
assume it to be top-up e-money if the detail field value is related to an e-money platform e.g. ShopeePay, OVO, Gopay, Dana, etc.
assume it to be groceries if the detail field value is related to an e-commerce brand e.g. Blibli, TikTok, etc.
assume it to be an investment if the detail field value is related to an investment context e.g. gold, stocks, etc.
assume it to utilities if the detail field value is related to the laundry, PLN.
assume it to transfer OUT if the detail field value context is transfer amount value to the person

The prompt contains two instructions:
1. Extract the PDF file into the defined object.
2. Classify the category field based on the detail field, following the samples.
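
Because the model's output is not guaranteed to follow the prompt exactly, it can help to check the parsed result against the shape and category list given in the prompt. The following is a minimal sketch: `validateStatement` and `CATEGORIES` are hypothetical names introduced here for illustration, and the check only covers field presence and category values.

```javascript
// Categories listed in the prompt above.
const CATEGORIES = [
  "groceries", "transfer IN", "top-up e-money", "transfer OUT",
  "investment", "withdraw", "transaction fee", "utilities",
];

// Hypothetical helper (not part of any Google SDK): returns a list of
// problems found in a parsed statement object; an empty list means it
// matches the shape requested in the prompt.
function validateStatement(statement) {
  if (typeof statement !== "object" || statement === null) {
    return ["statement is not an object"];
  }
  const problems = [];
  for (const key of ["bank", "account_name", "account_number", "estatement_date"]) {
    if (!(key in statement)) problems.push(`missing field: ${key}`);
  }
  if (!Array.isArray(statement.transactions)) {
    problems.push("transactions is not an array");
    return problems;
  }
  statement.transactions.forEach((t, i) => {
    if (typeof t !== "object" || t === null) {
      problems.push(`transactions[${i}] is not an object`);
      return;
    }
    for (const key of ["date", "detail", "amount", "balance", "category"]) {
      if (!(key in t)) problems.push(`transactions[${i}] missing field: ${key}`);
    }
    if ("category" in t && !CATEGORIES.includes(t.category)) {
      problems.push(`transactions[${i}] unknown category: ${t.category}`);
    }
  });
  return problems;
}
```

If the returned list is not empty, you could retry the request or flag the statement for manual review.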

Prompt and Response

It’s that simple. Then, save the prompt if you want to access it from your application.

API Response

You will get raw JSON like this:

[
{
"candidates": [
{
"content": {
"role": "model",
"parts": [
{
"text": "```"
}
]
}
}
]
},
...
]

Do this to extract only the JSON:

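// Load the saved response: an array of streamed chunks.
const documentAIResponse = require("./ai-gemini-result.json");
// Join the text of every chunk, then strip the opening and closing code fences.
const jsons = documentAIResponse
  .map((d) => d.candidates[0].content.parts[0].text)
  .join("")
  .replace("```json", "")
  .replace("```", "")
  .trim();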
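
The result of the snippet above is still a string. A slightly more defensive variant, sketched below, also parses it into an object. `parseGeminiChunks` is a hypothetical helper name, and it assumes the chunks have the `candidates[0].content.parts[0].text` shape shown in the raw response; chunks without text are treated as empty.

```javascript
// Hypothetical helper: turns the array of streamed chunks into a parsed object.
function parseGeminiChunks(chunks) {
  // Join the text of every chunk; a chunk with no text contributes nothing.
  const text = chunks
    .map((chunk) => chunk.candidates?.[0]?.content?.parts?.[0]?.text ?? "")
    .join("");
  // Take what sits between the opening fence (with an optional "json" tag)
  // and the closing fence; fall back to the whole text if there is no fence.
  const fenced = text.match(/```(?:json)?\s*([\s\S]*?)```/);
  const body = (fenced ? fenced[1] : text).trim();
  // JSON.parse throws a SyntaxError if the model returned invalid JSON.
  return JSON.parse(body);
}
```

You would call it as `parseGeminiChunks(require("./ai-gemini-result.json"))` and wrap the call in `try`/`catch` if you want to handle malformed output.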

Access it by API

You can use the code from the <> GET CODE button. If you are interested in how to create a request with cURL and a JWT based on a service account, see this article and go to the last section.
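
The code that Vertex AI Studio generates is the version to rely on. As a rough illustration of what such a request carries, the sketch below builds a request body with the PDF as base64 inline data followed by the prompt text. `buildRequestBody` is a hypothetical helper, and the field names (`contents`, `parts`, `inlineData`, `mimeType`) are assumed to follow the Gemini `generateContent` request format; check them against the generated code before use.

```javascript
// Hypothetical helper: builds a generateContent-style request body.
// pdfBuffer is a Node.js Buffer with the statement's bytes,
// prompt is the extraction prompt shown earlier.
function buildRequestBody(pdfBuffer, prompt) {
  return {
    contents: [
      {
        role: "user",
        parts: [
          // The PDF travels as base64-encoded inline data.
          {
            inlineData: {
              mimeType: "application/pdf",
              data: pdfBuffer.toString("base64"),
            },
          },
          // The prompt follows the file, as in the Studio UI.
          { text: prompt },
        ],
      },
    ],
  };
}
```

For example, `buildRequestBody(fs.readFileSync("statement.pdf"), prompt)` would produce the payload to send with your authenticated request, where `statement.pdf` is a placeholder file name.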
