I made a small tool for sanitizing PDF files!


Some users in our org receive random PDF files, but with the amount of junk, scripts, phishing links and PDF reader exploits that PDF files can contain, I needed a tool to sanitize them before users are allowed to open them.

This tool is extremely simple: it renders each page of the PDF to an image and then generates a new, image-only PDF from those images.
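
For anyone curious what that render-and-rebuild step looks like, here is a rough sketch using PyMuPDF (the Python bindings for MuPDF). This is not the code from the repo; the function names `zoom_for_dpi` and `sanitize_pdf` and the 150 DPI default are my own assumptions for illustration.

```python
def zoom_for_dpi(dpi: int) -> float:
    """PDF user space is 72 units per inch, so the render scale is dpi / 72."""
    return dpi / 72.0


def sanitize_pdf(src_path: str, dst_path: str, dpi: int = 150) -> None:
    """Rasterize every page of src_path and write an image-only PDF to dst_path."""
    # PyMuPDF is imported here so the helper above works without it installed.
    import fitz

    zoom = zoom_for_dpi(dpi)
    with fitz.open(src_path) as src, fitz.open() as dst:
        for page in src:
            # Render the page to a bitmap; only pixels survive this step.
            pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom), alpha=False)
            # Create a blank page of the same size and paint the bitmap onto it.
            new_page = dst.new_page(width=page.rect.width, height=page.rect.height)
            new_page.insert_image(new_page.rect, pixmap=pix)
        dst.save(dst_path)
```

Something like `sanitize_pdf("incoming.pdf", "clean.pdf")` would then produce the image-only copy. A higher DPI gives sharper text at the cost of a bigger output file.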

This removes all scripts, links, external references, Flash files, actions and whatnot while preserving the visual content. Unfortunately, it also makes the text uncopyable and removes any legitimate links.

I included a Firejail profile so you can run it in a sandbox in case MuPDF (the rendering engine) is pwned.
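
If you want to call the sandboxed version from another script, a wrapper along these lines could work. It is only a sketch: the profile path `pdfsanitizer.profile` and the script name `pdfsanitizer.py` are placeholders I made up, so check the repo for the real file names. The `--profile=` option and the `--` end-of-options marker are standard Firejail command-line syntax.

```python
import subprocess
from typing import List


def build_firejail_command(profile: str, tool_args: List[str]) -> List[str]:
    """Build an argv list that runs tool_args inside Firejail with the given profile."""
    # "--" tells Firejail that everything after it is the program to run.
    return ["firejail", f"--profile={profile}", "--"] + list(tool_args)


def run_sandboxed(profile: str, tool_args: List[str]) -> int:
    """Run the command in the sandbox and return its exit code."""
    return subprocess.run(build_firejail_command(profile, tool_args)).returncode


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_firejail_command(
        "pdfsanitizer.profile",
        ["python3", "pdfsanitizer.py", "incoming.pdf", "clean.pdf"],
    )
    print(" ".join(cmd))
```

Passing the arguments as a list (rather than a shell string) means odd file names from untrusted senders never get interpreted by a shell.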

Available here: https://github.com/lacioffi/PDFSanitizer

Posting this here for posterity + looking for any additional tips, suggestions and criticisms on how to make this better!

