ParanoiDF - A PDF Analysis Suite
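ParanoiDF is a PDF analysis suite based on PeePDF. It aims to be the Swiss Army knife of PDF analysis tools, combining password cracking, decryption, DRM removal, JavaScript extraction and redaction analysis in one program.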
Requirements:
- PDFCrack (to crack passwords)
- Calibre's ebook-convert (to remove DRM)
- QPDF (to decrypt PDFs)
- NLTK (Natural Language Toolkit) and Java (to use the redact command)
- lxml (to support XML output)
Usage:
paranoiDF.py [options] InputFile
- -f: Ignores parsing errors. Malicious files often trigger parsing errors, so this option should usually be set when analysing them.
- -l: Enables loose mode, which does not search for the endobj tag because it is not obligatory. Helpful with malformed files.
Simple execution:
- Shows statistics for the file after it has been decoded, decrypted and analyzed:
python paranoiDF.py [options] pdf_file
Interactive console:
- Starts the interactive console, which gives access to the full range of tools:
python paranoiDF.py -i
Batch execution:
- A commands file can specify the commands to be executed in batch mode. This type of execution is useful for automating the analysis of multiple files:
python paranoiDF.py [options] -s script_file
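As a rough illustration of the batch mode above, the following Python sketch builds and runs the same command line for several script files. The helper names (build_batch_command, run_batch) and the example file names are hypothetical and not part of ParanoiDF; only the -f, -l and -s options are taken from the usage described here, and the script is assumed to be started from the directory containing paranoiDF.py.

```python
import subprocess


def build_batch_command(script_file, ignore_errors=True, loose=False, python="python"):
    """Build the argv list for: python paranoiDF.py [options] -s script_file.

    Hypothetical helper; the option names come from the usage text above.
    """
    cmd = [python, "paranoiDF.py"]
    if ignore_errors:
        cmd.append("-f")  # ignore parsing errors
    if loose:
        cmd.append("-l")  # loose mode for malformed files
    cmd += ["-s", str(script_file)]
    return cmd


def run_batch(script_files, **options):
    """Run each script file in turn and return {script_file: exit code}."""
    results = {}
    for script_file in script_files:
        completed = subprocess.run(build_batch_command(script_file, **options))
        results[str(script_file)] = completed.returncode
    return results


# Example (assumed file names):
# run_batch(["sample1_commands.txt", "sample2_commands.txt"], loose=True)
```

Passing the arguments as a list avoids shell quoting problems when a script file name contains spaces.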
Tools/functions available when running ParanoiDF:
- -t Text Display: Uses pdf2txt.py from PDFMiner to parse and render all plain text inside a PDF.
- -u URL: Downloads the PDF from the link and saves it in a new directory named after the website it was obtained from. This option simply makes an OS call to the wget command.
- crackpw: Runs the PDFCrack tool through an OS call. The command allows the user to supply a custom dictionary, run a benchmark or continue from a saved state file. If no custom dictionary is given, this command will attempt to brute force the password using a modifiable charset text file in the "ParanoiDF/pdfcrack" directory.
- decrypt: Makes an OS call to QPDF, which decrypts the PDF document and writes out the decrypted file. This requires the user password.
- encrypt: Encrypts an input PDF document with any password you specify. Uses 128-bit RC4 encryption.
- embedf: Creates a blank PDF document with an embedded file. This is for research purposes, to show how files can be embedded in PDFs. This command imports the Make-pdf-embedded.py script as a module.
- embedjs: Similar to "embedf", but embeds a custom JavaScript file inside a new blank PDF document. If no custom JavaScript file is given, a default app.alert message box is embedded.
- extractJS: Attempts to extract any embedded JavaScript from a PDF document.
- redact: Generates a list of words that fit inside a redaction box in a PDF document. The words (together with a custom sentence) can then be run through a grammar parser, and a chosen number of them displayed according to their score.
- removeDRM: Removes DRM restrictions (editing, copying, etc.) from a PDF document and outputs a new file. This does not need the owner password, but the document may lose some formatting. This command works by calling Calibre's "ebook-convert" tool.
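The decrypt command above is described as an OS call to QPDF. A minimal Python sketch of such a call might look like the following. It is not ParanoiDF's actual code: the function names are hypothetical, and the only QPDF options used are --password and --decrypt.

```python
import shutil
import subprocess


def build_qpdf_decrypt_command(input_pdf, output_pdf, password):
    """Build the argv list for: qpdf --password=... --decrypt in.pdf out.pdf."""
    return [
        "qpdf",
        "--password={}".format(password),
        "--decrypt",
        str(input_pdf),
        str(output_pdf),
    ]


def decrypt_pdf(input_pdf, output_pdf, password):
    """Run QPDF and return its exit code (hypothetical wrapper)."""
    if shutil.which("qpdf") is None:
        raise FileNotFoundError("qpdf was not found on PATH")
    cmd = build_qpdf_decrypt_command(input_pdf, output_pdf, password)
    return subprocess.run(cmd).returncode


# Example (assumed file names and password):
# decrypt_pdf("locked.pdf", "unlocked.pdf", "user-password")
```

Because the password is passed as a single list element, it is not interpreted by a shell, which matters when the password contains spaces or special characters.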
Note: Type "help" to get a list of commands. Type "help [command]" to get a description/usage on a specific command.
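For the -u option described earlier, which saves a download in a directory named after the website, one possible way to derive that directory and the matching wget call is sketched below. The function name and the fallback values are assumptions made for illustration; ParanoiDF's own naming logic may differ. The -P option is wget's standard directory prefix.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def plan_download(url):
    """Return (directory, filename, wget argv) for a PDF URL.

    Hypothetical helper: the directory is taken from the URL's host name.
    """
    parsed = urlparse(url)
    directory = parsed.hostname or "unknown_host"  # assumed fallback
    filename = PurePosixPath(parsed.path).name or "download.pdf"  # assumed fallback
    command = ["wget", "-P", directory, url]  # -P sets the target directory
    return directory, filename, command


directory, filename, command = plan_download("http://www.example.com/docs/sample.pdf")
print(directory)  # www.example.com
print(filename)   # sample.pdf
```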
Source: www.effecthacking.com