Description
Goal
Align images of pages captured in different orientations, using keypoint matching and a homography matrix, so that OCR can be applied to the text on each page.
Considerations
We'll be developing a generalized document scanner, so there won't be a dedicated text-filled template from which to extract keypoints for matching. A perfectly aligned empty page might be a good template instead. From the keypoints matched between the input image and the template, we can then compute a homography matrix, which allows us to apply a perspective warp that aligns the page image with the template.
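To make the homography step concrete, here is a minimal, dependency-free sketch of how a homography can be recovered from exactly four point correspondences and then applied to a point. The function names (`solve_linear_system`, `homography_from_points`, `apply_homography`) and the sample coordinates are illustrative assumptions, not part of this issue. The sketch also assumes the bottom-right entry of the matrix can be fixed to 1. In a real submission, a library routine such as OpenCV's `cv2.findHomography` followed by `cv2.warpPerspective` would normally replace this, since it handles many noisy matches robustly.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] without mutating the inputs.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_points(src, dst):
    """Return a 3x3 homography mapping four src points onto four dst points.

    Assumption: the entry H[2][2] is fixed to 1, leaving eight unknowns.
    """
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four correspondences are required")
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear_system(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(H, point):
    """Map a 2D point through H, dividing by the projective scale."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )


# Hypothetical corners of a skewed page in a photo (pixels) ...
photo_corners = [(120, 80), (520, 110), (560, 700), (90, 660)]
# ... and where they should land in an upright 400x600 template.
template_corners = [(0, 0), (400, 0), (400, 600), (0, 600)]

H = homography_from_points(photo_corners, template_corners)
```

Four correspondences give eight equations for the eight unknowns, which is why four is the minimum. With many keypoint matches, some of them wrong, a robust estimator such as RANSAC is the usual choice.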
Deliverables
- A Colab Notebook to demonstrate the idea.
- A Python script (you may modularize the code across multiple scripts) for end-to-end execution, i.e., the script takes an image of a page as its input, aligns it to the proper orientation, and displays the aligned image on the screen.
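One possible shape for the end-to-end script is sketched below, assuming OpenCV (`cv2`) and NumPy are installed. The choice of ORB features with a brute-force Hamming matcher is an assumption made for illustration, as are the function names, the command-line flags, and the default values; the issue does not prescribe any of them. Keypoint detection on a mostly blank template may yield too few matches, which the sketch reports as an error rather than trying to solve.

```python
import argparse


def build_parser():
    """Command-line interface (flag names are illustrative assumptions)."""
    parser = argparse.ArgumentParser(
        description="Align a photographed page to a reference template."
    )
    parser.add_argument("image", help="path to the photographed page")
    parser.add_argument("--template", required=True,
                        help="path to the aligned reference page")
    parser.add_argument("--max-features", type=int, default=500,
                        help="maximum number of ORB keypoints per image")
    parser.add_argument("--keep", type=float, default=0.2,
                        help="fraction of best matches to keep")
    return parser


def align_page(image, template, max_features=500, keep=0.2):
    """Warp `image` onto `template` using ORB matches and a homography."""
    import cv2  # imported lazily so the parser works without OpenCV
    import numpy as np

    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray_template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)

    orb = cv2.ORB_create(max_features)
    kp_image, desc_image = orb.detectAndCompute(gray_image, None)
    kp_template, desc_template = orb.detectAndCompute(gray_template, None)
    if desc_image is None or desc_template is None:
        raise ValueError("no keypoints found in one of the images")

    # ORB descriptors are binary, so Hamming distance is appropriate.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(desc_image, desc_template),
                     key=lambda m: m.distance)
    matches = matches[:max(4, int(len(matches) * keep))]
    if len(matches) < 4:
        raise ValueError("need at least four matches for a homography")

    src = np.float32([kp_image[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_template[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # RANSAC discards matches that disagree with the dominant transform.
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        raise ValueError("homography estimation failed")

    height, width = template.shape[:2]
    return cv2.warpPerspective(image, H, (width, height))


def main(argv=None):
    """Read both images, align the page, and show the result on screen."""
    import cv2

    args = build_parser().parse_args(argv)
    image = cv2.imread(args.image)
    template = cv2.imread(args.template)
    if image is None or template is None:
        raise SystemExit("could not read the input image or the template")

    aligned = align_page(image, template, args.max_features, args.keep)
    cv2.imshow("Aligned page", aligned)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
```

To run this as a script, call `main()` under an `if __name__ == "__main__":` guard and invoke it with the page image as the positional argument and the reference page passed via `--template`.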
Tools
You are free to use open-source pre-trained models. If you use someone else's work, please attribute it. If your code is plagiarized, you will be suspended (applicable only if you are a WoC participant).
This template was adapted from Deep Fusion AI's organization template: thank you Sayak Paul for writing it!