In 2007, a PDF submitted by the UK Ministry of Defence about the death of an Iraqi citizen was found to contain the full name of an undercover intelligence officer in the metadata, information that had been carefully avoided in the visible text. Incidents like this happen regularly. Here is what your PDFs might be hiding.
What Is PDF Metadata?
PDF metadata is data about the document stored inside the file but not rendered on any page. There are two metadata systems in the PDF specification:
Document Information Dictionary
The original PDF metadata format. A key-value dictionary with fields like Title, Author, Subject, Keywords, Creator, Producer, CreationDate, and ModDate.
XMP (Extensible Metadata Platform)
An Adobe standard that embeds an XML packet in the file. It can carry the same fields as the Document Information Dictionary, and can also hold custom namespaces with arbitrarily detailed data, including GPS coordinates, rights management, and workflow history.
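To make the two systems concrete, here is a minimal Python sketch (standard library only) that looks for both in raw file bytes. It is an illustration under stated assumptions, not a real PDF parser: the function names and the SAMPLE bytes are invented for this example, and real PDFs often compress or encrypt these structures, in which case a dedicated tool is the better choice.

```python
import re
import xml.etree.ElementTree as ET

# Keys of the Document Information Dictionary listed above.
INFO_KEYS = ("Title", "Author", "Subject", "Keywords",
             "Creator", "Producer", "CreationDate", "ModDate")

DC_NS = "{http://purl.org/dc/elements/1.1/}"
RDF_NS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"


def info_fields(pdf_bytes: bytes) -> dict:
    """Find literal-string entries such as `/Author (John Smith)`.

    Assumptions: values are plain literal strings without nested or escaped
    parentheses, and the dictionary is neither compressed nor encrypted.
    A key like /Title can also appear in bookmarks, so treat hits as hints.
    """
    found = {}
    for key in INFO_KEYS:
        pattern = rb"/" + key.encode() + rb"\s*\(([^()\\]*)\)"
        match = re.search(pattern, pdf_bytes)
        if match:
            found[key] = match.group(1).decode("latin-1")
    return found


def xmp_creators(pdf_bytes: bytes) -> list:
    """Return the dc:creator entries of the first uncompressed XMP packet."""
    start = pdf_bytes.find(b"<x:xmpmeta")
    end = pdf_bytes.find(b"</x:xmpmeta>")
    if start == -1 or end == -1:
        return []
    root = ET.fromstring(pdf_bytes[start:end + len(b"</x:xmpmeta>")])
    creators = []
    for creator in root.iter(DC_NS + "creator"):
        creators.extend(li.text or "" for li in creator.iter(RDF_NS + "li"))
    return creators


# Hypothetical, heavily simplified file content for demonstration only.
SAMPLE = (
    b"%PDF-1.7\n"
    b"1 0 obj\n<< /Author (John Smith) /Producer (Microsoft Word 16.0) >>\nendobj\n"
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">'
    b"<dc:creator><rdf:Seq><rdf:li>John Smith</rdf:li></rdf:Seq></dc:creator>"
    b"</rdf:Description></rdf:RDF></x:xmpmeta>\n"
)

print(info_fields(SAMPLE))
print(xmp_creators(SAMPLE))
```

The point of the sketch is that the same name can live in two places at once, so clearing only one of the two systems leaves the other intact.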
What Metadata Can Reveal
| Field | Example value | Privacy risk |
|---|---|---|
| Author | John Smith | Reveals the real identity in anonymous submissions, whistleblower documents, or confidential proposals. |
| Company | Acme Legal LLP | Exposes the submitting organisation in sealed bids, anonymous feedback, or NDA-protected drafts. |
| Creator / Producer | Microsoft Word 16.0 | Reveals the software stack and version. Word may also carry the Office user name (often the Windows account name) into the PDF, typically in the Author field rather than Producer. |
| CreationDate | 2024-03-15T09:47:22 | Can contradict stated timelines in legal disputes or reveal when a "final" document was actually created. |
| Revision history | Title: CONFIDENTIAL DRAFT v3 | Previous document titles or subject fields can contain information stripped from the visible document. |
| Embedded thumbnails | Page 1 preview image | The embedded thumbnail image may show a version of the page before redactions were applied. |
| GPS / scan metadata | 51.5074° N, 0.1278° W | Scanned documents processed via mobile PDF apps can embed GPS coordinates of where the scan was taken. |
| Tracked changes (via Word) | Deleted text: "the fee is $450,000" | If a Word file with tracked changes is saved as PDF, the deleted or changed text may be embedded in the PDF structure. |
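A note on the CreationDate entry: inside the Document Information Dictionary, dates are stored in the PDF date format (D:YYYYMMDDHHmmSS followed by an optional Z or UTC offset) rather than in the ISO form shown above. The following Python sketch converts the full-precision form so timestamps can be compared against a stated timeline. The function name is made up for this example, and shorter date forms that the format also permits are deliberately rejected.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSS, then optionally "Z" or an offset such as +01'00'
PDF_DATE = re.compile(
    r"D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})'?)?"
)


def parse_pdf_date(raw: str) -> datetime:
    """Parse a full-precision PDF date string into a datetime.

    Assumption: every field down to seconds is present. Truncated forms
    (for example only a year) raise ValueError in this sketch.
    """
    match = PDF_DATE.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"unsupported PDF date: {raw!r}")
    groups = match.groups()
    year, month, day, hour, minute, second = (int(g) for g in groups[:6])
    zulu, sign, off_h, off_m = groups[6:]
    tz = None  # no zone information in the string: return a naive datetime
    if zulu:
        tz = timezone.utc
    elif sign:
        offset = timedelta(hours=int(off_h), minutes=int(off_m))
        tz = timezone(offset if sign == "+" else -offset)
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


print(parse_pdf_date("D:20240315094722").isoformat())  # 2024-03-15T09:47:22
```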
How to View Your PDF's Metadata Right Now
Before removing metadata, it's useful to see what is actually embedded in your file:
Adobe Acrobat Reader
File > Properties > Description tab. Also check the Custom tab for any extra fields.
Browser (Chrome / Firefox / Edge)
Open the PDF in the browser's built-in viewer and look for a Document Properties entry in the viewer's menu (Firefox lists it in the viewer's tools menu, Chrome under the More actions menu). Built-in viewers generally show only the basic fields, not custom XMP data.
Command line (ExifTool)
Run exiftool filename.pdf to list every embedded field, including XMP namespaces. ExifTool is free and cross-platform.
EditoraPDF: Edit Metadata tool
Open your PDF in the Edit Metadata tool. All current field values are displayed before you make any changes.
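If you need to inspect many files, the ExifTool route can be scripted. The Python sketch below wraps the command line; it assumes ExifTool is installed and on your PATH, and the function names are invented for this example. The -j flag asks for JSON output, -a keeps duplicate tags, and -G1 prefixes each tag with its group so you can tell Info dictionary entries from XMP ones.

```python
import json
import shutil
import subprocess


def exiftool_command(path: str) -> list:
    """Build the ExifTool invocation for one file."""
    return ["exiftool", "-j", "-a", "-G1", path]


def read_metadata(path: str) -> dict:
    """Run ExifTool and return its tags for `path` as a dict.

    Assumption: ExifTool is installed. With -j it prints a JSON array
    containing one object per input file, so we take the first element.
    """
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool not found on PATH")
    result = subprocess.run(
        exiftool_command(path), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)[0]


if __name__ == "__main__" and shutil.which("exiftool"):
    import sys
    for pdf_path in sys.argv[1:]:
        for tag, value in read_metadata(pdf_path).items():
            print(f"{tag}: {value}")
```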
How to Remove Metadata from a PDF
Use the Sanitize PDF tool
Open EditoraPDF > Sanitize PDF. This strips all Document Information Dictionary fields, XMP metadata, embedded thumbnails, hidden annotations, and scripts.
Or manually clear individual fields
If you want to keep some metadata (like the title) while removing others, use Edit Metadata to clear or update specific fields.
Verify the result
Open the sanitized PDF in Acrobat or run ExifTool on it. Author, Creator, and XMP data should now be absent or blank.
For maximum privacy, also apply redaction
If the document has sensitive visible content, combine sanitization with Redact PDF to ensure neither the visible content nor the hidden metadata leaks information.
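The verification step can be automated once you have the sanitized file's metadata as a dictionary. The sketch below is one possible check, not part of any tool mentioned here: it assumes ExifTool-style keys of the form "Group:Tag" (plain tag names also work), and the function and constant names are hypothetical. It flags a non-blank Author or Creator and anything left in an XMP group, matching the expectation stated in the steps above.

```python
# Tag names that should be absent or blank after sanitizing.
CHECKED_NAMES = ("Author", "Creator")


def leftover_fields(metadata: dict) -> dict:
    """Return entries that should have been removed but still hold a value.

    Assumption: keys look like "PDF:Author" or "XMP-dc:Creator"; a key
    without a group prefix is treated as a bare tag name.
    """
    leftovers = {}
    for key, value in metadata.items():
        group, _, name = key.rpartition(":")
        in_xmp_group = group.upper().startswith("XMP")
        if (in_xmp_group or name in CHECKED_NAMES) and str(value).strip():
            leftovers[key] = value
    return leftovers


# Hypothetical metadata read from a file after sanitizing.
after = {"PDF:Author": "", "PDF:PageCount": 3, "XMP-dc:Title": "Draft v3"}
print(leftover_fields(after))  # {'XMP-dc:Title': 'Draft v3'}
```

An empty result means the named fields are gone; it does not prove the file is free of other hidden content, so the redaction step still applies to sensitive visible text.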
When Should You Remove PDF Metadata?
Frequently Asked Questions
What is PDF metadata?
Hidden data embedded in the file structure: author name, creation date, software used, keywords, GPS location, revision history, and more. None of it is visible on the page.
Can PDF metadata reveal sensitive information?
Yes. It has exposed undercover agent identities, leaked real authors of anonymous documents, revealed company names in sealed bids, and shown deleted text through tracked changes.
How do I remove metadata from a PDF?
Use EditoraPDF's Sanitize PDF tool. It strips all Document Info fields, XMP metadata, embedded thumbnails, hidden annotations, and scripts, entirely in your browser with no server upload.
Does printing to PDF remove metadata?
Mostly, but not reliably. The print driver may embed its own metadata. Dedicated sanitization is the only way to guarantee a clean file.
Strip Hidden Data from Your PDF
Use the free Sanitize PDF tool, which removes all metadata, hidden annotations, and embedded thumbnails locally in your browser.
Free · No signup · No server uploads