Document Recovery with OCR and NLP


Goal:

To digitize documents that are in poor condition and recover their text.

Objective:

To extract text from documents using OCR, then use Natural Language Processing to fill in missing words and correct mistakes in the OCR output.

Method:

Segment the document layout, i.e. identify which parts of the document are headings, images, body text, etc.
Try out handwriting recognition for handwritten documents.
Use OCR or handwriting recognition to extract the text from the image.
Fill in the missing details by using an NLP model that understands the surrounding context.
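
The last step above can be sketched in a few lines of Python. This is only a toy illustration, not the project's implementation: it assumes the OCR step has already produced a list of tokens, and it swaps a real NLP model for two much simpler stand-ins from the standard library, fuzzy string matching (difflib) for misread words and bigram counts for missing words. The names CORPUS, MISSING, correct_word, fill_gap and repair are all hypothetical.

```python
import difflib
from collections import Counter, defaultdict

# Hypothetical reference text standing in for a real language corpus.
CORPUS = (
    "the quick brown fox jumps over the lazy dog . "
    "the lazy dog sleeps under the old tree . "
    "the brown fox runs under the old bridge ."
)

# Assumed marker for a word the OCR step could not read at all.
MISSING = "<gap>"


def build_bigrams(tokens):
    """Count, for every word, which words follow it in the corpus."""
    following = defaultdict(Counter)
    for prev_word, next_word in zip(tokens, tokens[1:]):
        following[prev_word][next_word] += 1
    return following


def correct_word(word, vocabulary, cutoff=0.6):
    """Replace a misread word with its closest vocabulary entry, if any."""
    if word in vocabulary:
        return word
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def fill_gap(prev_word, next_word, following):
    """Pick the word that best fits between its left and right neighbours."""
    candidates = following.get(prev_word)
    if not candidates:
        return MISSING  # no context available, leave the gap as it is

    def score(word):
        # First prefer words that are followed by next_word in the corpus,
        # then words that follow prev_word more often.
        fits_right = following[word][next_word] if next_word else 0
        return (fits_right, candidates[word])

    return max(candidates, key=score)


def repair(ocr_tokens, corpus=CORPUS):
    """Correct misread words, then fill gaps using the neighbouring words."""
    tokens = corpus.split()
    vocabulary = sorted(set(tokens))
    following = build_bigrams(tokens)

    # Pass 1: fix spelling-like OCR mistakes, leaving gap markers untouched.
    fixed = [
        tok if tok == MISSING else correct_word(tok.lower(), vocabulary)
        for tok in ocr_tokens
    ]

    # Pass 2: fill each gap from its (already corrected) neighbours.
    for i, tok in enumerate(fixed):
        if tok == MISSING:
            prev_word = fixed[i - 1] if i > 0 else None
            next_word = fixed[i + 1] if i + 1 < len(fixed) else None
            fixed[i] = fill_gap(prev_word, next_word, following)
    return fixed


if __name__ == "__main__":
    noisy = "the qu1ck brovvn fox <gap> over the lazy d0g".split()
    print(" ".join(repair(noisy)))
```

In a real pipeline the token list would come from an OCR engine, and the bigram counts would likely be replaced by a pretrained language model that scores candidate words using the whole sentence rather than just the two neighbouring words.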


Computer Vision and Intelligence
