pdfDX (TM) - PDF File Text Data Extractor.
Have you ever tried to get text from a PDF file?
The concept of the paperless office is creating many more PDF files. It is DIFFICULT, IF NOT IMPOSSIBLE, to extract complete and accurate information from these files effectively and efficiently.
Today, when large sets of documents are handed over, as in legal discovery, business acquisitions and many other situations, often all you receive is a CD FULL OF HUGE PDF FILES. And sometimes you're given only a short amount of time to peruse all this documentation. How do you do it?
Some PDF files won't even let you copy and paste text. AND when you can copy and paste, the text is scrambled and practically unusable! Other programs promising to convert PDF to text really aren't much better!
pdfDX intelligently scans one or more PDF files and accurately extracts the text data into easily readable, scannable text in a format closely resembling the original document.
Used together with programming or macros and regular expressions, pdfDX allows easier, more effective and more precise searching, data mining and data scraping from most PDF documents.
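As a rough illustration of the regular-expression approach, here is a minimal Python sketch that pulls structured records out of text that has already been extracted from a PDF. The sample text, the field names and the patterns are all hypothetical and chosen only for this example; real extracted text will need patterns written for its own layout.

```python
import re

# Hypothetical sample of already-extracted text; this layout is an
# assumption made for illustration, not actual pdfDX output.
SAMPLE_TEXT = """\
Invoice No: INV-2041    Date: 2009-03-14
Customer: Acme Holdings
Total Due: $1,250.00

Invoice No: INV-2042    Date: 2009-03-19
Customer: Birch & Co
Total Due: $310.75
"""

# Patterns are tied to the sample layout above.
INVOICE_RE = re.compile(
    r"Invoice No:\s*(?P<number>INV-\d+)\s+Date:\s*(?P<date>\d{4}-\d{2}-\d{2})"
)
TOTAL_RE = re.compile(r"Total Due:\s*\$(?P<amount>[\d,]+\.\d{2})")


def scrape_invoices(text):
    """Return one dict per invoice block found in the text."""
    records = []
    # In this sample, blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        invoice = INVOICE_RE.search(block)
        total = TOTAL_RE.search(block)
        if invoice and total:
            records.append({
                "number": invoice.group("number"),
                "date": invoice.group("date"),
                # Drop thousands separators before converting to a number.
                "total": float(total.group("amount").replace(",", "")),
            })
    return records


if __name__ == "__main__":
    for record in scrape_invoices(SAMPLE_TEXT):
        print(record)
```

This kind of scraping depends on the extracted text keeping fields from the same line together, which is why a format closely resembling the original document matters.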
The pdfDX Command Line mode allows batch formatting of as many PDF files as desired. It can format ALL the PDF files in a folder, and it lets you run pdfDX from shell scripts and from custom computer programs.
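To show how a custom program might drive a command-line tool over a whole folder, here is a minimal Python sketch. It does not use the real pdfDX command syntax, which is not given here: the command prefix is a placeholder supplied by the caller, and passing the PDF path as the last argument is an assumption. Check the pdfDX documentation for the actual executable name and options.

```python
from pathlib import Path


def build_batch_commands(folder, command_prefix):
    """Return one argument list per PDF file in `folder`, sorted by path.

    `command_prefix` is a caller-supplied list such as ["some-tool"];
    it is a placeholder, not the documented pdfDX command line.
    """
    pdfs = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    # Assumption: the tool accepts the input file as a trailing argument.
    return [list(command_prefix) + [str(p)] for p in pdfs]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in build_batch_commands(".", ["extract-tool-placeholder"]):
        print(command)
```

Once the real command line is known, each argument list could be passed to `subprocess.run(command, check=True)` to process the files one at a time. Building the commands separately from running them makes it easy to review the batch before anything is executed.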