Our receipt management engine is powered by machine learning techniques that allow us to have the most accurate, efficient, and scalable solution on the market. (For those who need a refresher on how we do this, refer to Part I and Part II of our machine learning series.) In brief, these techniques allow our technology to improve its performance based on previous results, with little to no human intervention.
But like humans, technology can stagnate, indicating that it’s time to introduce something new to the environment. In our case, we started experimenting with some innovative machine learning frameworks to see how they would impact the accuracy of our machine’s receipt-reading prowess. What we found, however, was that a combination of frameworks yielded better results than choosing one or the other. Fittingly, we dubbed this “The Multi-Brain Approach”.
Before I try to explain this, you should understand that our receipt processing system is divided into two different components:
1. Optical Character Recognition (OCR)
When a user uploads an image of a receipt, OCR identifies what parts of the image are text and translates that into data that can be used to train our machine learning model.
At Sensibill, we use two different OCR systems, each with its own strengths and weaknesses depending on the image. Once the image has been translated into text, the next component of our system, machine learning, takes over.
2. Machine Learning Framework
Sensibill uses machine learning to teach our machine how to read and understand text on receipts based on previous experiences with similar receipts. It’s how our system knows that the text “Vt Icd Mcch” on your Starbucks receipt is a “Venti Iced Macchiato”, which is a “Coffee”, and so on.
Up until our most recent functional release, we only used one machine learning framework…
But two brains are better than one!
The old adage holds true. In the same way that we use two OCR systems to make sure we’re getting the clearest text, we introduced a second machine learning framework, with its own advantages and weaknesses, into the mix.
We pair one OCR with one machine learning framework and get an output. Then we pair the other OCR with the other machine learning framework and get another output. Whichever configuration produces the more accurate output is the one we send through to the user.
Technically speaking, we’re currently using a Double-Brain approach (two different configurations of the OCR and the machine learning frameworks), but this model is absolutely scalable, although, to be fair, it does take some extra work on our end.
By having two “brains” working at the same time, we increased our accuracy by 10% without any extra wait time for the user. Imagine what we could accomplish with 3, 4, or 10 brains.
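For readers who think in code, here is a minimal sketch of the idea described above: pair each OCR with a model, run the pairs at the same time, and keep the best output. This is an illustration, not Sensibill's actual implementation. In particular, this post does not say how "most accurate" is judged at runtime, so the sketch assumes each model reports a confidence score and picks the highest one. Every name here (`Extraction`, `make_brain`, `best_extraction`, and the stub OCR and model functions with their hard-coded scores) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Extraction:
    brain: str              # which OCR + model pairing produced this result
    fields: Dict[str, str]  # structured data read from the receipt
    confidence: float       # ASSUMPTION: self-reported score in [0, 1]


Ocr = Callable[[bytes], str]
Model = Callable[[str], Tuple[Dict[str, str], float]]
Brain = Callable[[bytes], Extraction]


def make_brain(name: str, ocr: Ocr, model: Model) -> Brain:
    """Pair one OCR system with one model to form a single 'brain'."""
    def run(image: bytes) -> Extraction:
        text = ocr(image)                 # step 1: image -> raw text
        fields, confidence = model(text)  # step 2: raw text -> structured data
        return Extraction(name, fields, confidence)
    return run


def best_extraction(image: bytes, brains: List[Brain]) -> Extraction:
    """Run every brain concurrently and keep the highest-confidence output."""
    if not brains:
        raise ValueError("at least one brain is required")
    with ThreadPoolExecutor(max_workers=len(brains)) as pool:
        results = list(pool.map(lambda brain: brain(image), brains))
    return max(results, key=lambda r: r.confidence)


# Placeholder stubs standing in for real OCR systems and trained models.
def ocr_a(image: bytes) -> str:
    return "Vt Icd Mcch"


def ocr_b(image: bytes) -> str:
    return "Vt Icd Mcch"


def model_a(text: str) -> Tuple[Dict[str, str], float]:
    return {"item": "Venti Iced Macchiato"}, 0.82


def model_b(text: str) -> Tuple[Dict[str, str], float]:
    return {"item": "Venti Iced Macchiato", "category": "Coffee"}, 0.91


if __name__ == "__main__":
    brains = [
        make_brain("brain-A", ocr_a, model_a),
        make_brain("brain-B", ocr_b, model_b),
    ]
    print(best_extraction(b"<receipt image bytes>", brains))
```

Because the brains run side by side, the total wait is roughly that of the slowest brain rather than the sum of all of them, and adding a third or fourth brain only means appending another pairing to the list.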
What does this mean for our end users?
The Multi-Brain approach enables us to increase the accuracy of our data extraction without any added processing time. Now, our end users can capture receipts instantly with near-perfect accuracy. That means no manual data entry, and more time back on their calendar to focus on the important things: running their business.