Align images of pages

Goal

Align images of pages that were captured in different orientations, using keypoint matching and a homography matrix, so that we can apply OCR to the text on each page.

Considerations

We'll be developing a generalized document scanner, so there won't be a special text-filled template from which to extract keypoints for matching. A perfectly aligned empty page might be a good template instead. Once the input image has been matched against the template, we can compute a homography matrix, which lets us apply a perspective warp that aligns the image of the page.
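To make the homography step concrete, here is a minimal sketch of estimating the matrix from point correspondences with the direct linear transform (DLT), using only NumPy. The function names `estimate_homography` and `apply_homography`, the corner coordinates, and the 600x800 output size are all hypothetical and chosen for illustration. In practice, OpenCV's `cv2.findHomography` and `cv2.warpPerspective` would likely replace this code.

```python
import numpy as np


def estimate_homography(src_pts, dst_pts):
    """Estimate the 3x3 matrix H such that dst ~ H @ src (DLT method)."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2 or len(src) < 4:
        raise ValueError("need at least four matching (x, y) pairs")

    # Each correspondence (x, y) -> (u, v) contributes two linear equations in
    # the nine entries of H.
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])

    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    # Normalise so H[2, 2] == 1 (assumes a non-degenerate configuration).
    return H / H[2, 2]


def apply_homography(H, pts):
    """Map an (N, 2) array of points through H, dividing out the scale term."""
    pts = np.asarray(pts, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical corners of a tilted page in the photo, in the order
# top-left, top-right, bottom-right, bottom-left.
page_corners = [(120.0, 80.0), (640.0, 140.0), (600.0, 860.0), (60.0, 790.0)]
# Corners of the upright 600x800 template they should land on.
template_corners = [(0.0, 0.0), (600.0, 0.0), (600.0, 800.0), (0.0, 800.0)]

H = estimate_homography(page_corners, template_corners)
mapped_corners = apply_homography(H, page_corners)
```

Four correspondences are the minimum for a homography. With more than four (for example, many matched keypoints), the same SVD yields a least-squares fit, though a robust estimator such as RANSAC is usually preferred when some matches are wrong.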

Deliverables

A Colab Notebook to demonstrate the idea.
A Python script for end-to-end execution (you may split the code across multiple scripts). The script takes an image of a page as input, aligns it to the proper orientation, and displays the aligned image on screen.

Tools

You are free to use open-source pre-trained models. If you use someone else's work, please attribute it. If your code is plagiarized, you will be suspended (applicable only if you are a WoC participant).

This template was adapted from Deep Fusion AI's organization template: thank you Sayak Paul for writing it!
