Beyond OCR: Using Deep Learning to Understand Documents


Extracting key fields from a variety of document types remains a challenging problem. Cloud providers such as AWS and Google Cloud offer text extraction services that digitize images or PDFs. These services use OCR techniques and return phrases, words, and characters with their corresponding coordinate locations. Working with these outputs is difficult and does not scale: different document types require different heuristics, and new types are uploaded daily. Additionally, OCR doesn’t attempt to understand the document; for example, dollar amounts need to be numerical, yet OCR may read a “1” as a lowercase “L.” Furthermore, a performance ceiling is reached even when parsing algorithms work perfectly: while third-party-service OCR is excellent, it isn’t perfectly accurate.
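To make the heuristic burden concrete, here is a minimal sketch of the kind of post-processing rule that ends up being written on top of raw OCR output for dollar amounts. The confusion table and the function name `parse_dollar_amount` are assumptions made for illustration; they are not taken from any OCR service or from the talk.

```python
import re

# Hypothetical confusion table: letters an OCR engine might return in place
# of digits. A real table would be tuned per document type.
CONFUSABLE = str.maketrans(
    {"l": "1", "I": "1", "O": "0", "o": "0", "S": "5", "B": "8"}
)


def parse_dollar_amount(raw):
    """Return the amount as a float, or None if the text cannot be repaired."""
    # Swap confusable letters for digits, then drop currency formatting.
    cleaned = raw.strip().translate(CONFUSABLE)
    cleaned = cleaned.replace("$", "").replace(",", "").replace(" ", "")
    # Accept only plain decimal numbers with at most two decimal places.
    if re.fullmatch(r"\d+(\.\d{1,2})?", cleaned) is None:
        return None
    return float(cleaned)
```

A rule like this only covers one field type in one locale, which is why such heuristics tend to multiply as new document types arrive.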

We propose an end-to-end, scalable solution using a deep learning architecture that consists of a computer vision component connected to a sequence generation component. Through training on millions of documents, the model learns document trends and characteristics and extracts important fields directly from raw documents. The result is a marked improvement in accuracy compared to third-party OCR services. Additional benefits include character-level probabilities that can serve as confidence scores, and the ability to use explainability algorithms such as LIME to determine which “hot pixels” in the document are responsible for the predictions.
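As an illustration of how character-level probabilities could become a confidence score, the sketch below assumes the sequence generation component emits one probability distribution over characters per output position. Greedy decoding and the product of per-character probabilities are assumptions chosen for simplicity, not necessarily what the described model does, and the names `greedy_decode` and `field_confidence` are hypothetical.

```python
import math


def greedy_decode(step_distributions):
    """Pick the most likely character at each step.

    step_distributions: list of dicts mapping character -> probability.
    Returns the decoded text and the probability of each chosen character.
    """
    chars, probs = [], []
    for dist in step_distributions:
        best_char = max(dist, key=dist.get)
        chars.append(best_char)
        probs.append(dist[best_char])
    return "".join(chars), probs


def field_confidence(char_probs):
    """Combine per-character probabilities into one field-level score."""
    if not char_probs or any(p <= 0.0 for p in char_probs):
        return 0.0
    # Sum logs instead of multiplying directly to limit underflow on long fields.
    return math.exp(sum(math.log(p) for p in char_probs))


# Toy distributions for a two-character field (made-up numbers).
steps = [{"1": 0.9, "l": 0.1}, {"2": 0.8, "Z": 0.2}]
text, probs = greedy_decode(steps)
score = field_confidence(probs)
```

A field whose score falls below a chosen threshold could then be routed to manual review; a single low character probability is enough to pull the whole field's score down.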


I'm the Chief Data Scientist at and have many years of experience as a scientist and researcher. My recent focus is in machine learning, deep learning, applied statistics and engineering. Before, I was a Postdoctoral Scholar at Lawrence Berkeley National Lab, received my PhD in Physics from Boston University and my B.S. in Astrophysics from University of California Santa Cruz. I have 2 patents and 11 publications to date and have spoken about data at various conferences around the world.

Open Data Science




