From PDF to lemmatized text
This workflow uses a web service hosted at JSI (Jožef Stefan Institute, IJS, Slovenia) that is based on LemmaGen, Matjaž Juršič's lemmatization engine.
The workflow accepts a PDF file as input and uses James Eales's workflows to preprocess the data. Because lemmatization is language-specific, the workflow interactively asks the user which language the text is written in. The output is a string in Taverna Workbench.
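The steps described above (clean the extracted text, pick a language, lemmatize) can be illustrated with a minimal Python sketch. It is not the Taverna workflow itself and does not call the JSI web service: the function names, the `TOY_LEMMAS` lookup tables, and the language codes are all hypothetical stand-ins, and the input is assumed to be text already extracted from the PDF. The language is passed as a parameter instead of being asked for interactively.

```python
import re

# Hypothetical stand-in for the remote lemmatization service: tiny
# per-language lookup tables, NOT LemmaGen's real rules or API.
TOY_LEMMAS = {
    "en": {"workflows": "workflow", "uses": "use", "files": "file"},
    "sl": {"besedila": "besedilo"},
}


def clean_plain_text(raw):
    """Rejoin words hyphenated across line breaks and collapse whitespace."""
    joined = re.sub(r"-\s*\n\s*", "", raw)
    return re.sub(r"\s+", " ", joined).strip()


def lemmatize(text, language):
    """Replace each word with its lemma from the table for `language`."""
    if language not in TOY_LEMMAS:
        raise ValueError(f"unsupported language: {language}")
    table = TOY_LEMMAS[language]

    def lemma(match):
        word = match.group(0)
        # Unknown words are passed through unchanged.
        return table.get(word.lower(), word)

    # [^\W\d_]+ matches runs of letters, including non-ASCII ones.
    return re.sub(r"[^\W\d_]+", lemma, text)


def lemmatize_extracted_text(raw_text, language):
    """Clean the extracted text, then lemmatize it for the given language."""
    return lemmatize(clean_plain_text(raw_text), language)
```

The language is an explicit argument because the same surface form can map to different lemmas in different languages, which is why the workflow has to ask for it before calling the lemmatizer.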
Created: 2010-09-16 | Last updated: 2012-01-18
James Eales
PDF to plain text
Clean plain text