Description/summary not set
Other contact details:
Not specified
Interests:
Not specified
Field/Industry:
data mining
Occupation/Role(s):
researcher
Organisation(s):
Jožef Stefan Institute, Ljubljana, Slovenia
Text stemming with Porter Stemmer
(1)
This workflow stems text. Stemming removes the inflected endings of words. It is often used as a preprocessing step in text mining, since stemmed words can be easily matched and counted.
The input to the workflow is the text to be stemmed; the output is the stemmed text.
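To illustrate what suffix stripping looks like, here is a minimal sketch in Python. It implements only the first rule group of the Porter algorithm (step 1a, plural removal), not the full stemmer, and it is not the code the workflow runs. The function names `porter_step_1a` and `stem_text` are made up for this example.

```python
def porter_step_1a(word):
    """Apply only step 1a of the Porter algorithm (plural endings)."""
    if word.endswith("sses"):
        return word[:-2]   # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]   # ponies -> poni
    if word.endswith("ss"):
        return word        # caress -> caress
    if word.endswith("s"):
        return word[:-1]   # cats -> cat
    return word


def stem_text(text):
    """Lower-case the text and apply step 1a to each whitespace-separated word."""
    return " ".join(porter_step_1a(w) for w in text.lower().split())


print(stem_text("Ponies and cats"))
```

A complete Porter stemmer applies several further rule groups after this one, so real output will differ for many words.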
Created: 2011-01-11
| Last updated: 2011-01-11
Credits:
Petra Kralj Novak
Text preprocessing
(1)
The input to this workflow is plain text. The text is preprocessed so that non-alphanumeric symbols are removed, the text is transformed to lower case and stop words are removed.
The workflow first removes the characters from this set: `~!@#$%^&*()_+=-{}|\][":;'?><,./.
Then it transforms the text to lower case. The user is prompted to select a stop-word dictionary from a list, and the workflow removes the stop words in the selected dictionary.
Stop words are...
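The three steps described above (symbol removal, lower-casing, stop-word removal) can be sketched in Python as follows. This is an illustration under assumptions, not the workflow's implementation: the `STOP_WORDS` set is a tiny hypothetical stand-in for the dictionary the user selects, and the function name `preprocess` is made up.

```python
# Characters the description lists for removal.
SYMBOLS = "`~!@#$%^&*()_+=-{}|\\][\":;'?><,./"

# Hypothetical stop-word list; the workflow lets the user pick a dictionary.
STOP_WORDS = {"the", "a", "an", "is", "of", "and"}


def preprocess(text, stop_words=STOP_WORDS):
    """Delete listed symbols, lower-case the text, then drop stop words."""
    table = str.maketrans("", "", SYMBOLS)
    cleaned = text.translate(table).lower()
    return " ".join(w for w in cleaned.split() if w not in stop_words)


print(preprocess("The cat, and the DOG!"))
```

Because the symbols are deleted rather than replaced by spaces, a hyphenated word such as "non-trivial" becomes "nontrivial" in this sketch.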
Created: 2011-01-07
| Last updated: 2011-01-07
Credits:
Petra Kralj Novak
Select from a list of possible web service...
(1)
This workflow lets the user select one value from a list of possible web service parameter values. It has two input ports: the WSDL address of the web service and the variable name. It parses the WSDL description of the web service (using the web service http://ropot.ijs.si/webservices/janez/getvalues.php?wsdl) and then asks the user to select one value from a drop-down menu. This is useful when a web service input expects one value from a fixed list of possible values.
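As a rough idea of how possible values could be read from a WSDL description, the Python sketch below assumes that the values are declared as `xsd:enumeration` restrictions on a named schema element. That is an assumption; how the getvalues service actually extracts them is not stated here. `SAMPLE_WSDL` is a hypothetical fragment and `possible_values` is a made-up name.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical WSDL fragment, used only for illustration.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <types>
    <xsd:schema>
      <xsd:element name="language">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:enumeration value="en"/>
            <xsd:enumeration value="sl"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
    </xsd:schema>
  </types>
</definitions>"""


def possible_values(wsdl_text, variable_name):
    """Return the enumerated values declared for the named schema element."""
    root = ET.fromstring(wsdl_text)
    for element in root.iter(XSD + "element"):
        if element.get("name") == variable_name:
            return [e.get("value") for e in element.iter(XSD + "enumeration")]
    return []  # variable not found, or it has no enumerated values


print(possible_values(SAMPLE_WSDL, "language"))
```

The returned list is what would be offered to the user in the drop-down menu.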
Created: 2010-12-23
| Last updated: 2010-12-23
Credits:
Petra Kralj Novak
Janez Kranjc
Lemmatization
(3)
The workflow lemmatizes the text in the input port.
It takes text as input and returns (language-dependent) lemmatized text as output. All the words in the resulting text are in the same order as in the original text, but they are transformed to their dictionary form.
The workflow asks for the language of lemmatization. Currently, 12 languages are supported: en, sl, ge, bg, cs, et, fr, hu, ro, sr, it, sp.
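The input/output behaviour can be pictured with a toy dictionary lookup in Python. This is only a sketch: the `LEMMAS` table is a hypothetical three-word example, and the workflow's own lemmatizer is not a simple lookup like this one. The language codes are the ones listed above.

```python
SUPPORTED = ["en", "sl", "ge", "bg", "cs", "et", "fr", "hu", "ro", "sr", "it", "sp"]

# Hypothetical lemma table; a real lemmatizer covers the whole vocabulary.
LEMMAS = {"en": {"mice": "mouse", "were": "be", "running": "run"}}


def lemmatize(text, language):
    """Replace each word by its dictionary form, keeping the word order."""
    if language not in SUPPORTED:
        raise ValueError("unsupported language: " + language)
    table = LEMMAS.get(language, {})
    # Words missing from the table are passed through unchanged.
    return " ".join(table.get(w.lower(), w) for w in text.split())


print(lemmatize("The mice were running", "en"))
```

Unlike stemming, the result consists of dictionary words ("mouse", not a truncated stem), which is why the language must be chosen first.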
Created: 2010-12-17
| Last updated: 2010-12-23
Credits:
Petra Kralj Novak
Attributions:
Select from a list of possible web service parameter values