PDF to plain text
(1)
This workflow will extract the plain text content of PDF files supplied to the input port. You can connect the Load PDF from directory workflow to this workflows input. We recommend you send the output from this workflow to the Clean plain text workflow, because the PDF to text process can add characters into the text that are XML-invalid and therefore can not be sent to most services as plain text. Another way round this problem is to encode the text as Base64 using the handy loc...
Created: 2010-02-19
| Last updated: 2011-12-13
Credits:
James Eales