
Some missing words from converting PDF to Image  #282

@jason-ng-zq99

Description

Hi, I am currently seeing some words go missing from the output images when using the convert_from_bytes function.

On my Mac, this happens specifically when I open a fillable PDF and fill it in using Preview.
Words that are filled in this way do not appear in the converted image.
Screenshot 2024-04-08 at 20 58 08

Screenshot 2024-04-08 at 21 00 40

If I use strict=True, or run the pdftoppm -r 200 -jpeg sample_pdf.pdf out command in my terminal,
I get the following error message:

Syntax Error: Unknown font tag 'ArialMT'
Syntax Error: Unknown font tag 'ArialMT'
Syntax Error (69): No font in show
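To make this easier to reproduce, here is a minimal sketch that pulls the font tags out of pdftoppm's stderr. The helper names `unknown_font_tags` and `run_pdftoppm` are hypothetical, made up for illustration, and the regex assumes the message format shown above. `run_pdftoppm` simply wraps the same command line used in this report.

```python
import re
import subprocess

# Matches Poppler messages such as: Syntax Error: Unknown font tag 'ArialMT'
# (assumes the message format quoted in this report)
FONT_TAG_RE = re.compile(r"Unknown font tag '([^']+)'")


def unknown_font_tags(stderr_text):
    """Return the distinct font tags reported as unknown, in first-seen order."""
    seen = []
    for tag in FONT_TAG_RE.findall(stderr_text):
        if tag not in seen:
            seen.append(tag)
    return seen


def run_pdftoppm(pdf_path, out_prefix="out", dpi=200):
    """Run pdftoppm -r <dpi> -jpeg <pdf_path> <out_prefix> and return its stderr."""
    result = subprocess.run(
        ["pdftoppm", "-r", str(dpi), "-jpeg", pdf_path, out_prefix],
        capture_output=True,
        text=True,
        check=False,  # pdftoppm may still write images even when it logs errors
    )
    return result.stderr


if __name__ == "__main__":
    # Sample text copied from the error message in this report.
    sample = (
        "Syntax Error: Unknown font tag 'ArialMT'\n"
        "Syntax Error: Unknown font tag 'ArialMT'\n"
        "Syntax Error (69): No font in show\n"
    )
    print(unknown_font_tags(sample))
    # With Poppler installed, the same check against a real file would be:
    #   unknown_font_tags(run_pdftoppm("sample_pdf.pdf"))
```

Running it over several affected files should show whether the same tags (ArialMT, Helvetica) come up each time.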

I have also gotten Unknown font tag 'Helvetica' on other files.

I have also checked for these fonts on my system using the fc-match ArialMT command, which returns the matched font, in this case Verdana.ttf: "Verdana" "Regular"
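One caveat on that check: fc-match reports the closest font it can find rather than failing, so getting a result back does not by itself show that the requested font is installed. The sketch below parses the default fc-match output line and compares the returned family with the requested name. The helper names and the prefix-based comparison are assumptions for illustration only, and the comparison is a rough heuristic (PostScript names such as ArialMT do not always map cleanly onto family names).

```python
import re

# Default fc-match output looks like: Verdana.ttf: "Verdana" "Regular"
FC_MATCH_RE = re.compile(
    r'^(?P<file>[^:]+):\s*"(?P<family>[^"]*)"\s*"(?P<style>[^"]*)"\s*$'
)


def parse_fc_match(line):
    """Split one fc-match output line into (file, family, style), or None."""
    match = FC_MATCH_RE.match(line.strip())
    if match is None:
        return None
    return match.group("file"), match.group("family"), match.group("style")


def looks_like_requested_font(requested, fc_match_line):
    """Heuristic: does the matched family look like the font that was asked for?

    Compares names case-insensitively with spaces and hyphens removed, and
    accepts the match if the requested name starts with the family name
    (so "ArialMT" is accepted for family "Arial").
    """
    parsed = parse_fc_match(fc_match_line)
    if parsed is None:
        return False
    family = parsed[1].replace(" ", "").replace("-", "").lower()
    wanted = requested.replace(" ", "").replace("-", "").lower()
    return bool(family) and wanted.startswith(family)


if __name__ == "__main__":
    # Output line copied from this report.
    line = 'Verdana.ttf: "Verdana" "Regular"'
    print(parse_fc_match(line))
    print(looks_like_requested_font("ArialMT", line))
```

For the output quoted above, the matched family is Verdana, which suggests fontconfig substituted a fallback for ArialMT rather than finding it.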

Interestingly, text that is added via the text box tool is still converted correctly, as seen below:
Screenshot 2024-04-08 at 21 03 48
Screenshot 2024-04-08 at 21 03 59

This problem was first found in my Debian GNU/Linux 11 Docker container, which shows exactly the same behavior.

I have also tried installing font packages such as fonts-freefont-ttf, fonts-liberation, fonts-liberation2 and ttf-mscorefonts-installer, but the same issue persists.

P.S. Suspecting it might be an issue with editable fields, I also tried to flatten the pdf first using fillpdf before using convert_from_path, but the same issue remains.

Problem replicated on two systems:

  • OS: macOS Sonoma 14.2.1
  • pdf2img version: 1.17.0
  • pdftoppm/pdftotext version: 24.03.0

  • OS: Debian GNU/Linux 11
  • Poppler version: 22.11.0
  • Poppler-data version: 0.4.10

Thanks in advance!
