The document describes a text extraction model that uses convolutional neural networks (CNNs) to detect and recognize text in images. It discusses pre-processing techniques like binarization and filtering used to improve accuracy. A CNN based on ResNet18 architecture is used for text recognition, trained with CTC loss to handle variable-length text. Keywords can be searched for in extracted text and highlighted. The system allows browsing images, extracting text, searching text, and storing extracted text in an editable document format. While current technology can extract text from simple backgrounds, this model aims to handle more complex real-world images.