
PDF Metadata: What It Is and How to Remove It

Your file might be leaking your name, company, and revision history without you knowing

In 2007, a PDF submitted by the UK Ministry of Defence about the death of an Iraqi citizen was found to contain the full name of an undercover intelligence officer in the metadata, information that had been carefully avoided in the visible text. Incidents like this happen regularly. Here is what your PDFs might be hiding.

What Is PDF Metadata?

PDF metadata is data about the document stored inside the file but not rendered on any page. There are two metadata systems in the PDF specification:

Document Information Dictionary

The original PDF metadata format. A key-value dictionary with fields like Title, Author, Subject, Keywords, Creator, Producer, CreationDate, and ModDate.
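
CreationDate and ModDate are stored in the PDF date-string form (D:YYYYMMDDHHmmSS plus an optional UTC offset) rather than the ISO form most tools display. The following is a minimal sketch of a converter, assuming the string has already been extracted from the file; the function name parse_pdf_date is illustrative and not part of any library.

```python
import re
from datetime import datetime, timedelta, timezone

# PDF date strings look like D:YYYYMMDDHHmmSS followed by an optional UTC
# offset such as +01'00' or Z. Every field after the year is optional.
_PDF_DATE = re.compile(
    r"^(?:D:)?(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)(?:00'?00'?)?|([+-])(\d{2})'?(\d{2})?'?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Convert a PDF date string to a datetime (naive if no offset is given)."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    year, month, day, hour, minute, second, zulu, sign, off_h, off_m = match.groups()

    tz = None
    if zulu:
        tz = timezone.utc
    elif sign:
        offset = timedelta(hours=int(off_h), minutes=int(off_m or 0))
        tz = timezone(offset if sign == "+" else -offset)

    # Missing month/day default to 1, missing time fields default to 0.
    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )


print(parse_pdf_date("D:20240315094722+01'00'").isoformat())
# → 2024-03-15T09:47:22+01:00
```

The offset matters for privacy as well: it can hint at the time zone the document was produced in.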

XMP (Extensible Metadata Platform)

An Adobe standard using embedded XML. It can carry the same fields as the Document Information Dictionary, and can also hold custom namespaces with arbitrarily detailed data, including GPS coordinates, rights management, and workflow history.
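
To make the XML structure concrete, here is a rough sketch that reads a few common properties from an XMP packet using only the Python standard library. The packet in SAMPLE_XMP is written by hand for illustration and does not come from a real file, and summarize_xmp is a hypothetical helper name. Real packets may also store simple properties as attributes on rdf:Description, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF, Dublin Core, the XMP basic schema and the PDF schema.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
    "pdf": "http://ns.adobe.com/pdf/1.3/",
}

# Hypothetical XMP packet, hand-written for this example only.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xmp="http://ns.adobe.com/xap/1.0/"
    xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
   <dc:creator><rdf:Seq><rdf:li>John Smith</rdf:li></rdf:Seq></dc:creator>
   <xmp:CreatorTool>Microsoft Word 16.0</xmp:CreatorTool>
   <xmp:CreateDate>2024-03-15T09:47:22</xmp:CreateDate>
   <pdf:Producer>Example PDF Library</pdf:Producer>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def summarize_xmp(packet: str) -> dict:
    """Collect author, tool, date and producer values from an XMP packet."""
    root = ET.fromstring(packet)
    summary = {}

    # dc:creator is an ordered list (rdf:Seq) of names.
    authors = [li.text for li in root.findall(".//dc:creator/rdf:Seq/rdf:li", NS)]
    if authors:
        summary["Author"] = authors

    # The remaining properties are simple text elements.
    for label, path in (
        ("CreatorTool", ".//xmp:CreatorTool"),
        ("CreateDate", ".//xmp:CreateDate"),
        ("Producer", ".//pdf:Producer"),
    ):
        node = root.find(path, NS)
        if node is not None and node.text:
            summary[label] = node.text
    return summary


print(summarize_xmp(SAMPLE_XMP))
```

Because XMP is stored separately from the Document Information Dictionary, clearing one does not clear the other; both need to be checked.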

What Metadata Can Reveal

| Field | Example value | Privacy risk |
| --- | --- | --- |
| Author | John Smith | Reveals the real identity in anonymous submissions, whistleblower documents, or confidential proposals. |
| Company | Acme Legal LLP | Exposes the submitting organisation in sealed bids, anonymous feedback, or NDA-protected drafts. |
| Creator / Producer | Microsoft Word 16.0 | Reveals the software stack and version. The user name itself usually ends up in Author, which Word fills from the Office user name by default. |
| CreationDate | 2024-03-15T09:47:22 | Can contradict stated timelines in legal disputes or reveal when a “final” document was actually created. |
| Revision history | Title: CONFIDENTIAL DRAFT v3 | Previous document titles or subject fields can contain information stripped from the visible document. |
| Embedded thumbnails | Page 1 preview image | The embedded thumbnail image may show a version of the page before redactions were applied. |
| GPS / scan metadata | 51.5074° N, 0.1278° W | Scanned documents processed via mobile PDF apps can embed GPS coordinates of where the scan was taken. |
| Tracked changes (via Word) | Deleted text: "the fee is $450,000" | If a Word file with tracked changes is saved as PDF, the deleted/changed text may be embedded in the PDF structure. |

How to View Your PDF's Metadata Right Now

Before removing metadata, it's useful to see what is actually embedded in your file:

Adobe Acrobat Reader

File → Properties → Description tab. Also check the Custom tab for any extra fields.

Browser (Chrome / Firefox / Edge)

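Open the PDF in the browser's built-in viewer and look for a Document Properties entry in the viewer's menu, if one is available (in Firefox it is in the tools menu at the right end of the toolbar). Built-in viewers generally show only the basic fields rather than the full XMP data, so use one of the other methods for a complete picture.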

Command line (ExifTool)

Run: exiftool filename.pdf. ExifTool is free, cross-platform, and shows all embedded fields including XMP namespaces.
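
If you need to check files from a script, ExifTool's -j option prints the same fields as JSON. The sketch below assumes exiftool is installed and on the PATH; the function names are illustrative and not part of ExifTool.

```python
import json
import subprocess


def parse_exiftool_json(output: str) -> dict:
    """Return the first record from ExifTool's -j output.

    With -j, ExifTool prints a JSON array containing one object per input file.
    """
    records = json.loads(output)
    if not records:
        raise ValueError("ExifTool returned no records")
    return records[0]


def read_pdf_metadata(path: str) -> dict:
    """Run `exiftool -j <path>` and return the tag/value pairs for that file."""
    result = subprocess.run(
        ["exiftool", "-j", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if exiftool exits with an error
    )
    return parse_exiftool_json(result.stdout)


# Example usage (requires exiftool and a real file):
#     for tag, value in read_pdf_metadata("filename.pdf").items():
#         print(f"{tag}: {value}")
```

Keeping the parsing step separate from the subprocess call makes it possible to test the logic without ExifTool installed.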

EditoraPDF: Edit Metadata tool

Open your PDF in the Edit Metadata tool. All current field values are displayed before you make any changes.

How to Remove Metadata from a PDF

1. Use the Sanitize PDF tool

Open EditoraPDF → Sanitize PDF. This strips all Document Information Dictionary fields, XMP metadata, embedded thumbnails, hidden annotations, and scripts.

2. Or manually clear individual fields

If you want to keep some metadata (like the title) while removing others, use Edit Metadata to clear or update specific fields.

3. Verify the result

Open the sanitized PDF in Acrobat or run ExifTool on it. Author, Creator, and XMP data should now be absent or blank.

4. For maximum privacy, also apply redaction

If the document has sensitive visible content, combine sanitization with Redact PDF to ensure neither the visible content nor the hidden metadata leaks information.
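
The verification step can be automated once the fields are available as a dictionary, for example from ExifTool's JSON output. This is a minimal sketch: the tag names in SENSITIVE_TAGS follow ExifTool's naming as an assumption and should be adjusted to your own policy, and leftover_sensitive_tags is a hypothetical helper.

```python
# Assumed tag names (ExifTool-style); extend or trim to match your own policy.
SENSITIVE_TAGS = frozenset({
    "Author", "Creator", "Producer", "Title", "Subject",
    "Keywords", "Company", "CreateDate", "ModifyDate",
})


def leftover_sensitive_tags(metadata: dict, sensitive=SENSITIVE_TAGS) -> dict:
    """Return the sensitive tags that are still present with a non-blank value."""
    return {
        tag: value
        for tag, value in metadata.items()
        if tag in sensitive and str(value).strip()
    }


# Hypothetical metadata from a file that was only partly cleaned.
after_cleaning = {"Author": "John Smith", "Creator": "", "PageCount": 3}
print(leftover_sensitive_tags(after_cleaning))
# → {'Author': 'John Smith'}
```

An empty result means none of the listed fields survived. It says nothing about fields outside the list, so a manual look at the full output is still worthwhile.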

When Should You Remove PDF Metadata?

Before submitting a document anonymously or pseudonymously
Before sending to opposing counsel in legal proceedings
Before publishing PDFs on public websites or portals
Before submitting tender, bid, or grant applications
Before sharing any document outside your organisation
Before archiving documents with personal data (GDPR compliance)
After scanning documents with a mobile device (GPS risk)
Before sharing whistleblower or source-protection documents

Frequently Asked Questions

What is PDF metadata?

Hidden data embedded in the file structure: author name, creation date, software used, keywords, GPS location, revision history, and more. None of it is visible on the page.

Can PDF metadata reveal sensitive information?

Yes. It has exposed undercover agent identities, leaked real authors of anonymous documents, revealed company names in sealed bids, and shown deleted text through tracked changes.

How do I remove metadata from a PDF?

Use EditoraPDF's Sanitize PDF tool. It strips all Document Info fields, XMP metadata, embedded thumbnails, hidden annotations, and scripts, entirely in your browser with no server upload.

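Does printing to PDF remove metadata?

Mostly, but not reliably. Printing usually discards the original document's metadata, but the print driver may embed its own, such as a new producer string and creation date. Dedicated sanitization followed by verification is the more reliable way to get a clean file.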

Strip Hidden Data from Your PDF

Use the free Sanitize PDF tool. It removes all metadata, hidden annotations, and embedded thumbnails locally in your browser.

Free · No signup · No server uploads