myExperiment - Groups

RapidMiner

Uploader

Simon Fischer

This is an image mining process that uses the image mining Web service provided by NHRF within e-LICO. It first uploads a set of images found in a directory, then preprocesses the images and visualizes the result. References to the uploaded images are also stored in the local RapidMiner repository, so they can be reused for further processing without uploading the images a second time.

Created: 2010-04-28 | Last updated: 2012-01-16

Creator

James Eales

Core text mining workflows

Created: 2010-02-19 10:12:33 | Last updated: 2011-12-13 16:03:17

This pack contains workflows we have created to support core text mining tasks. We currently provide workflows for the following tasks: loading documents (text or PDF); PDF to text conversion; sentence splitting; text cleaning (ASCII or XML-valid); term recognition (using the NaCTeM service TerMine).

7 items in this pack

Comments: 0 | Viewed: 727 times | Downloaded: 136 times

Taverna 2

Uploader

James Eales

Termine with c-value threshold (1)

This workflow accepts a list of sentences from a single document and returns the terms found by the TerMine web service. It also lets you set a threshold c-value score, so that only terms with a user-controlled likelihood of being real terms are returned as output. To obtain sentences for this workflow, you can use the sentence splitting workflow. The TerMine service (used in this workflow) only accepts text in ASCII encoding, so you should also use the Clean p...

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: James Eales
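
The thresholding step described above can be sketched in a few lines of Python. This is an illustration only, not the Taverna workflow itself: the function name `filter_terms`, the (term, score) pair layout, the sample scores, and the choice to keep terms scoring exactly at the threshold are all assumptions.

```python
from typing import Iterable, List, Tuple


def filter_terms(
    scored_terms: Iterable[Tuple[str, float]], threshold: float
) -> List[Tuple[str, float]]:
    """Keep only (term, score) pairs whose score is at least `threshold`.

    Assumption: a term scoring exactly at the threshold is kept.
    """
    return [(term, score) for term, score in scored_terms if score >= threshold]


# Hypothetical candidate terms with made-up scores, for illustration only.
candidates = [("text mining", 4.0), ("web service", 2.5), ("single document", 1.0)]
kept = filter_terms(candidates, threshold=2.0)
print(kept)
```

Raising the threshold returns fewer, higher-scoring candidates; lowering it returns more.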

Taverna 2

Uploader

James Eales

PDF to plain text (1)

This workflow extracts the plain text content of PDF files supplied to the input port. You can connect the Load PDF from directory workflow to this workflow's input. We recommend sending the output of this workflow to the Clean plain text workflow, because PDF to text conversion can introduce characters that are XML-invalid and therefore cannot be sent to most services as plain text. Another way around this problem is to encode the text as Base64 using the handy loc...

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: James Eales
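
The Base64 workaround mentioned above can be illustrated with Python's standard library. This is a sketch of the general idea, not the component the workflow uses: the helper names and the choice of UTF-8 as the intermediate byte encoding are assumptions.

```python
import base64


def encode_text(text: str) -> str:
    """Encode text as Base64 so it contains only safe ASCII characters."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_text(encoded: str) -> str:
    """Reverse encode_text, recovering the original text exactly."""
    return base64.b64decode(encoded.encode("ascii")).decode("utf-8")


# A form feed (\x0c) is not a valid XML 1.0 character; it stands in here
# for the kind of character a PDF to text step can introduce.
raw = "page one\x0cpage two"
encoded = encode_text(raw)
print(encoded)
```

Unlike cleaning, this keeps every original character, but the receiving service has to decode the text before using it.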

Taverna 2

Uploader

James Eales

Sentence splitting (1)

This workflow will attempt to split up text into sentences, returning a list of sentences to the output port. The sentence splitting service makes use of the OpenNLP sentence detector and has been trained to work on English text. This workflow can be used to provide input to the Termine with c-value threshold workflow. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: James Eales

Taverna 2

Uploader

James Eales

Terms from collection of PDF files (2)

This workflow will give you a set of candidate terms for each PDF document in a user-specified directory. You can also specify a c-value threshold that will restrict the terms to those with higher scores. This workflow was created using only nested workflows. These workflow components work on their own and can be linked together to form more complex workflows such as this. You can view the text mining workflow components in this pack. If you receive errors when running this workflow t...

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: James Eales

Creator

Simon Fischer

Who Wants to be a Data Miner?

Created: 2011-11-02 17:54:07 | Last updated: 2013-09-09 16:22:11

One of the most fun events at the annual RapidMiner Community Meeting and Conference (RCOMM) is the live data mining process design competition "Who Wants to be a Data Miner?". In this competition, participants must design RapidMiner processes for a given goal within a few minutes. The tasks are related to data mining and data analysis, but are rather uncommon. In fact, most of the challenges ask for things RapidMiner was never supposed to do. This pack contains solutions for these...

12 items in this pack

Comments: 0 | Viewed: 260 times | Downloaded: 142 times

Taverna 2

Uploader

James Eales

Terms from collection of text files (1)

This workflow will give you a set of candidate terms for each text file in a user-specified directory. You can also specify a c-value threshold that will restrict the terms to those with higher scores. This workflow was created using only nested workflows. These workflow components work on their own and can be linked together to form more complex workflows such as this. You can view the text mining workflow components in this pack. If you receive errors when running this workflow then...

Created: 2010-02-22 | Last updated: 2011-12-13

Credits: James Eales

Taverna 2

Uploader

James Eales

Load PDF from directory (1)

This workflow will automate the reading of a set of PDF files stored in a single directory (the path to which should be supplied as a single input value). This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: James Eales

Taverna 2

Uploader

James Eales

Load plain text from directory (1)

This workflow will automate the reading of a set of text files stored in a single directory (the path to which should be supplied as a single input value). It will assume that the text files are saved using the default character encoding for the system that Taverna is running on. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: James Eales
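
For readers who want the same behaviour outside Taverna, a rough Python equivalent is sketched below. It is not the workflow's implementation: the function name, the `*.txt` filename pattern, and the use of Python's locale-based default encoding (standing in for the default character encoding of the system Taverna runs on) are assumptions.

```python
import locale
from pathlib import Path
from typing import Dict


def load_text_files(directory: str, pattern: str = "*.txt") -> Dict[str, str]:
    """Read every matching file in one directory, keyed by file name.

    Files are decoded with the system's preferred encoding, mirroring the
    "default character encoding" assumption described above.
    """
    encoding = locale.getpreferredencoding(False)
    texts: Dict[str, str] = {}
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            texts[path.name] = path.read_text(encoding=encoding)
    return texts
```

If the files were saved on a different system, passing an explicit encoding instead of the system default would avoid garbled characters.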

Taverna 2

Uploader

James Eales

Clean plain text (ASCII) (1)

This workflow will remove any XML-invalid and non-ASCII characters (e.g. for sending to the ASCII-only Termine service) from any text supplied to the input port. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: James Eales
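
The two cleaning steps described above can be sketched in Python as follows. The character ranges come from the XML 1.0 definition of a valid character; the function names, and the decision to drop non-ASCII characters silently rather than transliterate them, are assumptions and may differ from what the workflow does.

```python
import re

# Matches any character outside the XML 1.0 "Char" production:
# tab, line feed, carriage return, and the ranges U+0020-U+D7FF,
# U+E000-U+FFFD and U+10000-U+10FFFF are valid; everything else is not.
_XML_INVALID = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def remove_xml_invalid(text: str) -> str:
    """Delete characters that are not allowed in an XML 1.0 document."""
    return _XML_INVALID.sub("", text)


def clean_ascii(text: str) -> str:
    """Delete XML-invalid characters, then drop anything outside ASCII."""
    text = remove_xml_invalid(text)
    return text.encode("ascii", errors="ignore").decode("ascii")


print(repr(clean_ascii("caf\u00e9 \x0cterm\t")))
```

Both steps are needed for an ASCII-only service, because some ASCII control characters (such as the form feed) are themselves XML-invalid.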

Creator

Matko Bošnjak

e-LICO recommender workflows

Created: 2011-03-15 15:33:48 | Last updated: 2012-01-28 19:39:06

This pack contains recommender system workflows created for the e-LICO project.

6 items in this pack

Comments: 0 | Viewed: 305 times | Downloaded: 162 times

Taverna 2

Uploader

James Eales

Clean plain text (1)

This workflow will remove any XML-invalid characters (these characters often appear in the output of PDF to text software) from any text supplied to the input port. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: James Eales

Creator

Ninoaf

Recommender systems workflow templates 2012

Created: 2012-01-08 12:27:43 | Last updated: 2012-06-03 19:48:22

The Recommender Extension can be downloaded from the Rapid-I Marketplace at: http://rapidupdate.de:8180/UpdateServer/faces/product_details.xhtml?productId=rmx_irbrecommender . More details can be found at: http://elico.rapid-i.com/recommender-extension.html

12 items in this pack

Comments: 0 | Viewed: 476 times | Downloaded: 145 times

Taverna 2

Uploader

Simon Jupp

miRNA GFF to entrez gene (1)

This workflow reads a GFF file of miRNA coordinates and uses BioMart to search human Ensembl genes for the gene that codes for each miRNA. The workflow returns a list of miRNAid, chromosome, start, stop, strand, entrez gene id, gene name, gene strand. Example input file here: ftp://mirbase.org/pub/mirbase/CURRENT/genomes/hsa.gff

Created: 2011-01-26 | Last updated: 2012-01-11
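
The first step of this workflow, reading coordinates from a GFF file, can be sketched in Python as below. It relies only on the general GFF layout of nine tab-separated columns; the record type, the function name and the sample line are hypothetical, the attributes column is kept as a raw string because its exact layout in the miRBase file is not described here, and the BioMart lookup is not shown.

```python
from typing import Iterable, Iterator, NamedTuple


class GffRecord(NamedTuple):
    seqid: str         # chromosome or sequence name (column 1)
    feature_type: str  # feature type (column 3)
    start: int         # start coordinate (column 4)
    end: int           # end coordinate (column 5)
    strand: str        # "+" or "-" (column 7)
    attributes: str    # raw attributes column (column 9), left unparsed


def parse_gff(lines: Iterable[str]) -> Iterator[GffRecord]:
    """Yield one record per data line, skipping comments and malformed lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue
        seqid, _source, ftype, start, end, _score, strand, _phase, attrs = fields
        yield GffRecord(seqid, ftype, int(start), int(end), strand, attrs)


# Made-up example line, not taken from the miRBase file.
sample = ["# comment", "chrX\t.\tmiRNA\t100\t180\t.\t-\t.\tID=example-mir-1"]
for record in parse_gff(sample):
    print(record.seqid, record.start, record.end, record.strand)
```

The chromosome, start, stop and strand values extracted this way are the inputs a gene lookup would need.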

RapidMiner

Uploader

Sebastian Land

Using Remember / Recall for "tunneling" re... (1)

This process shows how the Remember and Recall operators can be used to pass results from one position in the process to another when a direct connection is impossible. It also introduces another advanced RapidMiner technique: macro handling. We use the predefined macro a, accessed by %{a}, which gives the apply count of the operator. So we are remembering each application of the models that are generated in the learning subprocess of the Split validation. Af...

Created: 2010-04-29 | Last updated: 2012-01-16

Uploader

Lawrynka

Data supplementary to meta-mining workflows

Created: 2012-03-05 22:22:33 | Last updated: 2012-03-05 22:23:51

Credits: Lawrynka

License: Creative Commons Attribution-Share Alike 3.0 Unported License

- Repositories of RapidMiner baseline workflows and the datasets used
- DMOP ontology files from ver5.2
- Input files to meta-mining workflows

File type: ZIP archive

Comments: 0 | Viewed: 74 times | Downloaded: 39 times

Uploader

Lawrynka

Digital Multimedia Repositories Ontology (DMRO) and ...

Created: 2012-01-29 16:35:26 | Last updated: 2012-01-29 16:38:25

Credits: Lawrynka

License: Creative Commons Attribution-Share Alike 3.0 Unported License

For information on the ontology, see: http://www.e-lico.eu/?q=node/288 For information on the original dataset, see: http://www.ecmlpkdd2011.org/challenge.php The ontology and KB files are zipped into one file.

File type: ZIP archive

Comments: 0 | Viewed: 226 times | Downloaded: 44 times

RapidMiner

Uploader

Lawrynka

Loading OWL files (RDF version of videolec... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/). The "Build knowledge base" operator is responsible for collecting data from OWL files, SPARQL endpoints or RDF repositories and providing it to the subsequent operators in a workflow. In this workflow it is parametrized so that it builds a Sesame/OWLIM repository from the files specified in the "Load file" operators. Paths to OWL files are specified as parameter va...

Created: 2012-01-29 | Last updated: 2012-01-29

RapidMiner

Uploader

Lawrynka

Semantic clustering (with alpha-clustering... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/) to cluster SPARQL query results based on a chosen semantic similarity measure. The measure used in this particular workflow is a kernel that exploits membership of clustered individuals to OWL classes from a background ontology ("Epistemic" kernel from [1]). Since the semantics of the background ontology is used in this way, we use the name "semantic clustering". This ...

Created: 2012-01-29 | Last updated: 2012-01-30

Content from the e-LICO group