95

Suppose I have a PDF and I want to obtain whatever metadata is available for that PDF. What utility should I use?

The piece of information I am usually most interested in is the paper size, which PDF viewers usually don't report: is the PDF letter, legal, A4, or something else? Other available metadata may be of interest too.

2 Answers

120

One of the canonical tools for this is pdfinfo, which comes with xpdf, if I recall (on Debian, at least, it is part of poppler-utils). Example output:

[0 1017 17:10:17] ~/temp % pdfinfo test.pdf
Creator:        TeX
Producer:       pdfTeX-1.40.14
CreationDate:   Sun May 18 09:53:06 2014
ModDate:        Sun May 18 09:53:06 2014
Tagged:         no
Form:           none
Pages:          1
Encrypted:      no
Page size:      595.276 x 841.89 pts (A4)
Page rot:       0
File size:      19700 bytes
Optimized:      no
PDF version:    1.5
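
If you want these fields in a script rather than on screen, the "Key: value" lines are easy to split. This is only a minimal sketch: `parse_pdfinfo` and `pdfinfo` are hypothetical helper names, the sample text is trimmed from the output above, and the wrapper assumes the pdfinfo binary is on your PATH.

```python
import subprocess


def parse_pdfinfo(text):
    """Turn pdfinfo's 'Key: value' lines into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so dates like 09:53:06 stay intact.
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            info[key.strip()] = value.strip()
    return info


def pdfinfo(path):
    """Run pdfinfo on a file (assumes it is on PATH) and parse its output."""
    result = subprocess.run(
        ["pdfinfo", path], capture_output=True, text=True, check=True
    )
    return parse_pdfinfo(result.stdout)


# Sample trimmed from the example output above.
sample = """\
Creator:        TeX
CreationDate:   Sun May 18 09:53:06 2014
Page size:      595.276 x 841.89 pts (A4)
"""
info = parse_pdfinfo(sample)
print(info["Page size"])
```

On a real file you would call `pdfinfo("test.pdf")["Page size"]` instead of parsing the sample string.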
  • 7
    In Debian at least this is part of poppler-utils. I have a file here which pdfinfo reports as 595.2 x 841.44 pts. But this is not reported as A4. What decides to mark it as A4? Commented Jun 29, 2014 at 15:30
  • 3
    The names are hardcoded: letter is 612 ± 0.1 x 792 ± 0.1, and the magic sizes for DIN/ISO A are (all ±1 pt): 3370.98, 2383.64, 1685.49, 1191.82, 842.74, 595.91, ..., so it seems your page is a tiny bit too short (841.44 vs. 842.74 pt) for pdfinfo to pick that up. Commented Jun 29, 2014 at 16:20
  • 1
    I see, the bit of code if ((fabs(w - 612) < 0.1 && fabs(h - 792) < 0.1) || (fabs(w - 792) < 0.1 && fabs(h - 612) < 0.1))? Commented Jun 29, 2014 at 16:23
  • 2
    That's letter, the A formats are in the loop with the sqrt(2)s. Commented Jun 29, 2014 at 16:24
  • This was (pre?)installed on both my Mac OS X and Ubuntu (unlike exiftool). So it gets my vote. Commented Dec 24, 2023 at 4:03
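
The size check discussed in the comments above can be sketched roughly as follows. This is a reconstruction from the numbers quoted there, not pdfinfo's actual source: `paper_name` is a hypothetical function, the A0 starting height is derived from A0 having an area of one square metre (which reproduces the quoted 3370.98 pt), and the A0 to A6 range is an assumption.

```python
import math


def paper_name(w, h, iso_tol=1.0):
    """Guess a paper name from page width and height in points."""
    # Letter: 612 x 792 pt, either orientation, within 0.1 pt.
    if (abs(w - 612) < 0.1 and abs(h - 792) < 0.1) or (
        abs(w - 792) < 0.1 and abs(h - 612) < 0.1
    ):
        return "letter"
    # ISO A series: start at A0 and shrink by sqrt(2) at each step.
    h_iso = math.sqrt(math.sqrt(2.0)) * 7200 / 2.54  # about 3370.98 pt
    w_iso = h_iso / math.sqrt(2.0)                   # about 2383.64 pt
    for n in range(7):  # A0..A6 (assumed range)
        if (abs(w - w_iso) < iso_tol and abs(h - h_iso) < iso_tol) or (
            abs(w - h_iso) < iso_tol and abs(h - w_iso) < iso_tol
        ):
            return f"A{n}"
        # The next size's height is this size's width.
        h_iso, w_iso = w_iso, w_iso / math.sqrt(2.0)
    return None  # no known name matched


print(paper_name(595.276, 841.89))  # page size from the answer's example
print(paper_name(595.2, 841.44))    # page size from the first comment
```

With a ±1 pt tolerance, the first call matches A4 and the second matches nothing, which is consistent with what the commenter observed.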
61

Another utility worth looking into is exiftool. It might not be the right tool in your specific case as it doesn't report any information on the geometry of the document but in general it is probably the most feature-complete tool for inspecting PDF metadata.

Here's an example of a command that will print all available meta information (-a), sorted by groups (-G1):

exiftool -a -G1 "$File" 
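
For scripting, exiftool can also emit JSON with its -j option, and with -G1 each key comes back as "Group:Tag". The sketch below regroups such a record by group name. `group_tags` and `exiftool_metadata` are hypothetical helpers, the sample record is hand-written for illustration rather than real exiftool output, and the wrapper assumes exiftool is on your PATH.

```python
import json
import subprocess
from collections import defaultdict


def group_tags(record):
    """Split 'Group:Tag' keys from one exiftool JSON record into nested dicts."""
    groups = defaultdict(dict)
    for key, value in record.items():
        group, sep, tag = key.partition(":")
        if sep:
            groups[group][tag] = value
        else:
            # Keys without a group prefix (e.g. SourceFile) go under "".
            groups[""][key] = value
    return dict(groups)


def exiftool_metadata(path):
    """Run exiftool (assumed to be on PATH) and group the tags of one file."""
    out = subprocess.run(
        ["exiftool", "-a", "-G1", "-j", path],
        capture_output=True, text=True, check=True,
    ).stdout
    # exiftool -j prints a JSON array with one object per input file.
    return group_tags(json.loads(out)[0])


# Hand-written sample record for illustration, not real exiftool output.
sample = json.loads(
    '[{"SourceFile": "test.pdf", "PDF:Producer": "pdfTeX-1.40.14", "PDF:PageCount": 1}]'
)
grouped = group_tags(sample[0])
print(grouped["PDF"])
```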

The official documentation offers an overview of the supported PDF-related tags.

You can install exiftool on Debian/Ubuntu with:

sudo apt-get install libimage-exiftool-perl 

If you are more into the GUI side of things you could give my project PDFMtEd a try. It's a set of tools that serve as graphical frontends to exiftool and allow viewing and editing PDF metadata.


  • This is great! In particular I like that it shows all metadata. In my case I was looking for PDF/A metadata, which pdfinfo does not show. Commented Oct 10, 2023 at 15:35
  • 1
    This looks useful, but I'm confused why in Mac Finder's 'Get Info' panel it shows a lengthy URL for "Where from" (googleusercontent.com domain as it's a GDrive generated PDF) that does not appear in output of exiftool -a -G1. So the tool does not appear to report all metadata. Commented Mar 6, 2024 at 10:31
  • 1
    @geotheory The data displayed as "Where from" is stored in the extended attribute com.apple.metadata:kMDItemWhereFroms, which is kept not in the file itself but at the file system level. On macOS you can use xattr to list, read, write, and delete extended attributes. Commented Jan 7 at 22:00
  • Thanks @StefanSchmidt that's useful Commented Jan 16 at 12:04
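
For anyone who wants to read the "Where from" attribute mentioned above from a script, here is a rough sketch. It assumes the value is stored as a binary property list holding a list of strings (the usual case, but treat that as an assumption), that you are on macOS with the xattr tool available, and that `decode_where_froms` and `where_from` are hypothetical helper names. The demo value is made up.

```python
import plistlib
import subprocess


def decode_where_froms(hex_dump):
    """Decode the hex text printed by `xattr -p -x` into Python objects."""
    raw = bytes.fromhex(hex_dump)  # whitespace between bytes is ignored
    return plistlib.loads(raw)     # plist format is auto-detected


def where_from(path):
    """Read the attribute with the xattr tool (macOS only)."""
    out = subprocess.run(
        ["xattr", "-p", "-x", "com.apple.metadata:kMDItemWhereFroms", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return decode_where_froms(out)


# Round-trip with a made-up value to show the decoding step on any platform.
fake = plistlib.dumps(["https://example.com/file.pdf"], fmt=plistlib.FMT_BINARY)
decoded = decode_where_froms(fake.hex(" "))
print(decoded)
```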
