SEPA payment form recognizer
September 26, 2020 — 5 Min Read
How long does it take to build a SEPA payment recognizer?
We decided to build a production-ready model to answer this question. Read on to find out how.
Disclaimer: If you were hoping to see how to train a model from scratch, well then, sorry to burst your bubble... but we didn't need to.
SEPA is the abbreviation for Single Euro Payments Area.
A SEPA payment form, as depicted in this picture, can be used to transfer money between two parties. These forms are quite common in Europe. All banks with a physical presence have them, and it is also quite common to receive them by letter. Usually a company that wants money from you will send a SEPA payment form with its bill. As the payer, we just need to sign the form and hand it to our bank. Or we take the more comfortable route via online banking, where we fill out a transfer request with the information from the form.
But it's 2020, and we thought: why fill out the information manually? How difficult can it be to automate the extraction using a smartphone?
It turns out that it takes one day
The norm at neurocode is to start with the simplest solution possible, e.g. using cloud services. If the cloud doesn't quite fit, we'll try to adapt it with some transfer learning. If that doesn't fit either, we'll go for custom training, usually with a combination of open source tools and Azure infrastructure.
There is an often overlooked benefit to this strategy: by sidestepping custom training and infrastructure pipelines, we get to experiment with the use case and the solution immediately.
Another benefit is that we are still very flexible. Nothing stops us from building a custom model from scratch if the service takes off.
If you are eager to see the result, head over to our sample!
For demonstration purposes we built the recognizer to extract only the name of the payee, the amount due and the IBAN.
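Of the three fields, the IBAN is the one that can be sanity-checked after extraction, because it carries its own check digits (the standard mod-97 scheme). The following is a minimal sketch of such a check, not something our recognizer is described as doing; the function name `isValidIban` is hypothetical.

```typescript
// Hypothetical helper: validates an IBAN string with the standard mod-97 check.
// It only verifies the overall shape and the check digits, not country-specific lengths.
function isValidIban(raw: string): boolean {
  const iban = raw.replace(/\s+/g, "").toUpperCase();
  // Two letters (country), two check digits, then 11 to 30 alphanumerics.
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(iban)) {
    return false;
  }
  // Move the first four characters to the end.
  const rearranged = iban.slice(4) + iban.slice(0, 4);
  // Compute the number mod 97 piece by piece; letters map to 10..35.
  let remainder = 0;
  for (let i = 0; i < rearranged.length; i++) {
    const code = rearranged.charCodeAt(i);
    const value = code >= 65 ? code - 55 : code - 48;
    remainder = (remainder * (value > 9 ? 100 : 10) + value) % 97;
  }
  return remainder === 1;
}
```

A recognized value that fails this check is most likely an OCR misread and could be flagged for manual correction.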
Our first approach was to use an AI service from Azure called Form Recognizer.
Unfortunately that didn't go well. We did not receive any meaningful results, probably because Azure trains its form recognizer on US forms.
Time wasted - 30 min
Our next approach was to train the form recognizer with SEPA forms we got off the internet. We also edited a couple of SEPA orders in Photoshop to make the training more interesting 😁
In total we used 17 images for training. Reading Azure's documentation took more time than the actual training 😃
Time spent - 1 hour
Once we got a model that showed sufficient accuracy, we took it for a test ride.
We used real SEPA payment forms we had lying around and were surprised at the accuracy we got. Out of the 10 tests we conducted, all tests showed correct results with high confidence levels. Try it out yourself!
Now for the fun part. Once we had the model, how would we give the world access to it? Surely we don't want to give everybody access to our Azure account 😁
We love serverless and we love Cloudflare Workers in particular. Cloudflare Workers have virtually no cold starts and are globally distributed. Coupled with their generous offer of 100,000 invocations per day, for free, it was an easy decision to make.
There are two ways of interacting with the custom model: through REST or through a client library from Azure. Unfortunately for us, the Azure JS SDK does not work with Cloudflare Workers, which meant we had to use Azure's REST API.
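To give an idea of what the REST route involves, here is a minimal sketch of how a request to a custom model could be assembled. It assumes the v2.0 Form Recognizer REST API (the `/formrecognizer/v2.0/custom/models/{modelId}/analyze` path and the `Ocp-Apim-Subscription-Key` header); the API version we actually used is not stated here, and the names `AnalyzeRequest` and `buildAnalyzeRequest` are hypothetical.

```typescript
// Hypothetical shape of a prepared request; `init` can be passed to fetch().
interface AnalyzeRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the "analyze form" request for a custom model.
// Assumption: v2.0 REST API, with the image referenced by URL (e.g. a blob SAS URL).
function buildAnalyzeRequest(
  endpoint: string,
  modelId: string,
  apiKey: string,
  sourceUrl: string
): AnalyzeRequest {
  const base = endpoint.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/formrecognizer/v2.0/custom/models/${encodeURIComponent(modelId)}/analyze`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": apiKey,
      },
      body: JSON.stringify({ source: sourceUrl }),
    },
  };
}
```

Inside a worker, the result would be sent with `fetch(request.url, request.init)`. In the v2.0 API the service answers with `202 Accepted` and an `Operation-Location` header, which is then polled with a `GET` until the analysis result is ready.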
We set up two Cloudflare Worker endpoints: one for issuing short-lived Azure blob SAS tokens and one to trigger the recognition. The majority of the time went into figuring out how to create short-lived Azure blob SAS URLs without using the Azure SDK. After we figured that out, creating and testing the API was straightforward.
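For readers facing the same problem, the following is a rough sketch of how a blob service SAS can be signed with the Web Crypto API that Workers provide. It is not our production code. It assumes the string-to-sign layout documented by Azure for service version `2018-11-09` (15 newline-separated fields, most of them left empty here), and all names (`SasParams`, `buildStringToSign`, `signSasUrl`) are hypothetical. Check the field order against the Azure documentation for the version you target.

```typescript
// Hypothetical parameter bag for a single-blob SAS.
interface SasParams {
  account: string;     // storage account name
  container: string;
  blob: string;        // blob name, not URL-encoded
  permissions: string; // e.g. "cw" (create + write) or "r" (read)
  expiresAt: Date;
}

const SAS_VERSION = "2018-11-09"; // assumed service version

// Azure expects UTC timestamps without milliseconds.
function toSasTime(d: Date): string {
  return d.toISOString().replace(/\.\d{3}Z$/, "Z");
}

// Field order assumed from the Azure "Create a service SAS" docs for 2018-11-09.
function buildStringToSign(p: SasParams): string {
  return [
    p.permissions,                                 // signedPermissions
    "",                                            // signedStart (omitted)
    toSasTime(p.expiresAt),                        // signedExpiry
    `/blob/${p.account}/${p.container}/${p.blob}`, // canonicalizedResource
    "",                                            // signedIdentifier
    "",                                            // signedIP
    "https",                                       // signedProtocol
    SAS_VERSION,                                   // signedVersion
    "b",                                           // signedResource (blob)
    "",                                            // signedSnapshotTime
    "", "", "", "", "",                            // rscc, rscd, rsce, rscl, rsct
  ].join("\n");
}

// Signs the string with HMAC-SHA256 using the base64-decoded account key.
async function signSasUrl(p: SasParams, accountKeyBase64: string): Promise<string> {
  const rawKey = atob(accountKeyBase64);
  const keyBytes = new Uint8Array(rawKey.length);
  for (let i = 0; i < rawKey.length; i++) keyBytes[i] = rawKey.charCodeAt(i);

  const key = await crypto.subtle.importKey(
    "raw", keyBytes, { name: "HMAC", hash: "SHA-256" }, false, ["sign"]
  );
  const mac = await crypto.subtle.sign(
    "HMAC", key, new TextEncoder().encode(buildStringToSign(p))
  );

  let binary = "";
  new Uint8Array(mac).forEach((b) => { binary += String.fromCharCode(b); });

  const query = new URLSearchParams({
    sv: SAS_VERSION, sr: "b", sp: p.permissions,
    se: toSasTime(p.expiresAt), spr: "https", sig: btoa(binary),
  });
  return `https://${p.account}.blob.core.windows.net/${p.container}/${encodeURIComponent(p.blob)}?${query.toString()}`;
}
```

Keeping `expiresAt` only a few minutes in the future is what makes the token short-lived: a leaked URL stops working soon after it was issued.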
Time spent - 4 hours
We use Azure for monitoring. We set up metrics for latency, client errors, server errors and successful calls. We also record the accuracy levels for every request we receive. If we see big fluctuations in accuracy, we can take a look and adjust. Since most of the metrics are provided by Azure, we didn't have much to do. We just created one alert to notify us of sudden bursts in traffic.
Time spent - 0 hours
We take data privacy and security seriously. We have an Azure serverless function running that deletes all files older than 1 hour. This helps protect us and our users. We also deny recognition requests if the image to analyze does not come from our Azure blob storage in Europe.
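Both rules boil down to small checks. The sketch below shows one way they could look; it is an illustration under assumptions, not our deployed code. The names `isAllowedSource`, `StoredFile` and `selectExpired` are hypothetical, and the storage host is passed in rather than taken from our setup.

```typescript
// Accepts a source URL only if it is HTTPS and points at the expected storage host.
function isAllowedSource(sourceUrl: string, allowedHost: string): boolean {
  try {
    const u = new URL(sourceUrl);
    // Exact host match, so "host.evil.example" style look-alikes are rejected.
    return u.protocol === "https:" && u.hostname === allowedHost.toLowerCase();
  } catch {
    return false; // not a parseable URL
  }
}

// Hypothetical minimal view of a stored file.
interface StoredFile {
  name: string;
  lastModified: Date;
}

// Returns the files older than maxAgeMs (default: 1 hour), i.e. the ones to delete.
function selectExpired(
  files: StoredFile[],
  now: Date,
  maxAgeMs: number = 60 * 60 * 1000
): StoredFile[] {
  return files.filter((f) => now.getTime() - f.lastModified.getTime() > maxAgeMs);
}
```

A scheduled function would list the container, pass the entries through `selectExpired` and delete whatever comes back.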
Time spent - 3 hours
Most of the time went into coding the API 😍 and the security around it.
Second most - getting to know Azure Cognitive Services and reading their docs. We were pleased to find out that Azure offers transfer learning on our data. What was also surprising was the accuracy of the model given the tiny dataset we had.
The only downside we saw concerns the prediction times. They vary between 0.5 and 5 seconds. We observed that the time it takes to receive a response is correlated with the size of the image. For now we decided not to work around this issue, but going forward we might resize large images in the cloud before the analysis. In a serverless manner of course 😁
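The arithmetic behind such a resize step is simple. As a small sketch, assuming we cap the longest side of the image and keep the aspect ratio (the function name `fitWithin` and the cap value are hypothetical, and the actual pixel resizing would still need an image library or service):

```typescript
// Computes target dimensions so that the longest side is at most maxSide,
// preserving the aspect ratio. Images that already fit are left unchanged.
function fitWithin(
  width: number,
  height: number,
  maxSide: number
): { width: number; height: number } {
  const longest = Math.max(width, height);
  if (longest <= maxSide) {
    return { width, height };
  }
  const scale = maxSide / longest;
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}
```

For example, a 4000 x 3000 smartphone photo capped at 2000 pixels would be scaled to 2000 x 1500 before being sent for analysis.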
Overall we had fun while working on this experiment.
If you are interested in the source code, don't hesitate to send us a message.