RCOMM 2013 Challenge: 2. Solution (Re-infe...
(1)
This process is the solution for one of the RCOMM 2013 data mining challenge tasks which participants had to solve within 10 minutes. The task was this: Given
(1) a variant of the Golf data set (found in the //Samples/data folder) where the attribute Outlook is missing,
(2) a decision tree model built on the complete Golf data set, and
(3) a utility data set containing only the three distinct values of Golf,
create an example set based on the incomplete data set from (1) containing all po...
Created: 2013-09-09
RCOMM 2013 Challenge: 1. Generate input data
(1)
This process generates the input for one of the RCOMM 2013 data mining challenge tasks. For a description, please refer to the solution process.
Created: 2013-09-09
| Last updated: 2013-09-09
RCOMM 2012 Sudoku Challenge: 4 - Solve Sudoku
(1)
Process from the http://rcomm2012.org live data mining challenge. The task was to partially solve a Sudoku puzzle by solving three subtasks. Processes 1-3 in this pack are the actual solutions to the tasks, process 0 loads the required data into your repository and process 4 is the bonus that solves the entire puzzle.
Finally, this process solves the entire Sudoku. It uses process 3 to add inferred cell values to the set of predefined values. It does so repeatedly until 81 values are defined...
Created: 2012-09-04
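The repeated inference described above can be sketched in Python. This is only an illustration of the idea, not the RapidMiner process itself: the function names, the 0-based coordinates, and the rule "a cell with exactly one remaining candidate is inferred" are assumptions, and the demo grid is a synthetic pattern rather than the challenge puzzle.

```python
from itertools import product


def box(x, y):
    """Index of the 3x3 sub-field for 0-based coordinates (assumed numbering)."""
    return 3 * (y // 3) + x // 3


def drop_impossible(given):
    """Return the candidates (x, y, v) that clash with no predefined value."""
    keep = set()
    for x, y, v in product(range(9), range(9), range(1, 10)):
        if (x, y) in given:
            continue
        clash = any(
            gv == v and (gx == x or gy == y or box(gx, gy) == box(x, y))
            for (gx, gy), gv in given.items()
        )
        if not clash:
            keep.add((x, y, v))
    return keep


def solve(given):
    """Add inferred values repeatedly until 81 cells are defined or nothing changes."""
    given = dict(given)
    while len(given) < 81:
        per_cell = {}
        for x, y, v in drop_impossible(given):
            per_cell.setdefault((x, y), []).append(v)
        inferred = {cell: vs[0] for cell, vs in per_cell.items() if len(vs) == 1}
        if not inferred:
            break  # this simple rule cannot make further progress
        given.update(inferred)
    return given


# Synthetic demo: a valid filled grid with the diagonal cells removed.
full = {(x, y): (3 * (y % 3) + y // 3 + x) % 9 + 1
        for x in range(9) for y in range(9)}
puzzle = {cell: v for cell, v in full.items() if cell[0] != cell[1]}
solved = solve(puzzle)
```

Harder puzzles may need stronger inference rules than this single-candidate check, in which case the loop stops early and returns a partial grid.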
RCOMM 2012 Sudoku Challenge: 3 - Drop impo...
(1)
Process from the http://rcomm2012.org live data mining challenge. The task was to partially solve a Sudoku puzzle by solving three subtasks. Processes 1-3 in this pack are the actual solutions to the tasks, process 0 loads the required data into your repository and process 4 is the bonus that solves the entire puzzle.
IMPORTANT: Save this process as "RCOMM 2012 Sudoku Challenge: 3 - Drop impossible values (loop)" since it is included from process 4.
This process uses process 2 repeatedly to ...
Created: 2012-09-04
RCOMM 2012 Sudoku Challenge: 2 - Drop impo...
(1)
Process from the http://rcomm2012.org live data mining challenge. The task was to partially solve a Sudoku puzzle by solving three subtasks. Processes 1-3 in this pack are the actual solutions to the tasks, process 0 loads the required data into your repository and process 4 is the bonus that solves the entire puzzle.
IMPORTANT: Save this process as "RCOMM 2012 Sudoku Challenge: 2 - Drop impossible values" since it is included from process 3.
This process eliminates impossible rows from the...
Created: 2012-09-04
RCOMM 2012 Sudoku Challenge: 1 - Generate ...
(1)
Process from the http://rcomm2012.org live data mining challenge. The task was to partially solve a Sudoku puzzle by solving three subtasks. Processes 1-3 in this pack are the actual solutions to the tasks, process 0 loads the required data into your repository and process 4 is the bonus that solves the entire puzzle.
This process loads the numbers 1..9 and uses two Cartesian Product operators to compute the set of all combinations x, y, v. Finally we compute the index of the 3x3 sub-field a...
Created: 2012-09-04
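As a rough Python analogue of the step described above, the following sketch builds all combinations x, y, v with two Cartesian products and then derives a sub-field index. The 1-based coordinates and the row-major numbering of the sub-fields (0 to 8) are assumptions, since the original description is cut off.

```python
from itertools import product

digits = range(1, 10)  # the numbers 1..9

# First product: all (x, y) cells. Second product: each cell with each value v.
candidates = [(x, y, v) for (x, y), v in product(product(digits, digits), digits)]


def subfield(x, y):
    """Index 0..8 of the 3x3 sub-field containing cell (x, y); numbering is assumed."""
    return 3 * ((y - 1) // 3) + (x - 1) // 3


# One row per candidate: x, y, v and the sub-field index.
rows = [(x, y, v, subfield(x, y)) for x, y, v in candidates]
```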
RCOMM 2012 Sudoku Challenge: 0 - Get Data
(1)
Process from the http://rcomm2012.org live data mining challenge. The task was to partially solve a Sudoku puzzle by solving three subtasks. Processes 1-3 in this pack are the solutions to the actual tasks, process 0 loads the required data into your repository and process 4 is the bonus that solves the entire puzzle.
This process loads the two required data sets from rapid-i.com and stores them in your repository. The first data set contains the numbers 1 to 9 and the second contains the pr...
Created: 2012-09-04
RCOMM 2011 Challenge 3: RapidDraw
(1)
This is a solution for Challenge 3 of the live data mining process design competition "Who Wants to be a Data Miner" held at RCOMM 2011 in Dublin.
The task was to generate a dataset that looks like a spiral when viewed in an appropriate plotter.
This process opens a file with three initial data points and subsequently adds more points to the data set in a loop, using macros to extract data values of the predecessors and a "Generate Attributes" operator to add new data points.
To view the...
Created: 2011-11-02
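The idea of growing a spiral point by point from its predecessors can be sketched in Python as follows. The recurrence used here (rotate the previous point by a fixed angle and scale it slightly) and all parameter values are assumptions chosen for illustration; the actual process starts from three initial points and may use a different rule.

```python
import math


def spiral(n_points, angle=0.3, growth=1.05, start=(1.0, 0.0)):
    """Generate points where each one is derived from its predecessor."""
    points = [start]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    for _ in range(n_points - 1):
        x, y = points[-1]  # value of the predecessor
        points.append((growth * (x * cos_a - y * sin_a),
                       growth * (x * sin_a + y * cos_a)))
    return points


pts = spiral(100)
```

Plotting the x and y columns of `pts` in a scatter plot should show the spiral shape.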
RCOMM 2011 Challenge 2: Vodka or President?
(1)
This is a solution for Challenge 2 of the live data mining process design competition "Who Wants to be a Data Miner" held at RCOMM 2011 in Dublin.
Those of you who loved "You Don't Know Jack" will remember this task: To tell whether a certain word is the name of a vodka or the name of a leader of the Soviet Union. The RapidMiner process was allowed to download data from Wikipedia to make this decision.
One input file contains a list of words for which two attributes "Vodka" or "Leader" wi...
Created: 2011-11-02
RCOMM 2011 Challenge 1: Hobbit Genealogy
(1)
This is a solution for Challenge 1 of the live data mining process design competition "Who Wants to be a Data Miner" held at RCOMM 2011 in Dublin.
As you certainly know, Balbo Baggins is the common ancestor of Bilbo and Frodo Baggins. The file opened by the operator "Open Ancestor" contains a table with details about parentship in the Baggins family (insert a breakpoint after Read CSV). Each example contains a parent and a child. Of course, the same parent can be contai...
Created: 2011-11-02
| Last updated: 2011-11-02
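A minimal Python sketch of finding common ancestors in a parent/child table might look like this. The pairs below are a small hand-written sample, not the contents of the file used by the process, and the function name is hypothetical.

```python
# (parent, child) pairs; illustrative sample only.
pairs = [
    ("Balbo", "Mungo"), ("Mungo", "Bungo"), ("Bungo", "Bilbo"),
    ("Balbo", "Largo"), ("Largo", "Fosco"), ("Fosco", "Drogo"),
    ("Drogo", "Frodo"),
]

# Map each child to its parents; the same parent may appear in several pairs.
parents_of = {}
for parent, child in pairs:
    parents_of.setdefault(child, []).append(parent)


def ancestors(person):
    """Collect every ancestor reachable by following parent links."""
    found, stack = set(), [person]
    while stack:
        for parent in parents_of.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


common = ancestors("Bilbo") & ancestors("Frodo")
```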
POSTing CSV file to RapidAnalytics Web ser...
(1)
This is the second demo process used in the RapidAnalytics video on creating Web services. This process is the actual scoring process and uses the model generated by the first process. The first input must be a CSV blob in the repository. Once the process is exposed as a Web service in RapidAnalytics, this input will be replaced by the body of the HTTP POST request.
Created: 2011-11-02
| Last updated: 2011-11-02
POSTing CSV file to RapidAnalytics Web ser...
(1)
This is the first demo used in the RapidAnalytics video on creating Web services. It downloads three data sets provided by data.gov.uk and generates a regression model and stores it as "RegressionTree" in the repository.
Created: 2011-11-02
RCOMM Challenge 3: Fibonacci Numbers (Inte...
(1)
At the RComm 2010 (www.rcomm2010.org), an unusual competition was held. Titled "Who Wants to Be a Data Miner", three challenges were issued to the participants of the conference. In all challenges, participants had to design RapidMiner processes as quickly as possible. This is the original solution I had in mind for Challenge 3: "Fibonacci Numbers". It defines a macro n, recurses by applying itself using an "Embed Process" operator on n-1 and n-2, appends the results (so the length is F(n-1)...
Created: 2010-09-17
| Last updated: 2010-09-17
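The recursive idea described above can be mimicked in Python: the result for n is the result for n-1 with the result for n-2 appended, so its length follows the Fibonacci recurrence. The base cases (no rows for n = 0, one row for n = 1) and the placeholder row content are assumptions made for this sketch.

```python
def fib_rows(n):
    """Return a list whose length is F(n), built by appending the two sub-results."""
    if n == 0:
        return []          # F(0) = 0 rows (assumed base case)
    if n == 1:
        return ["row"]     # F(1) = 1 row (assumed base case)
    return fib_rows(n - 1) + fib_rows(n - 2)


lengths = [len(fib_rows(n)) for n in range(11)]
```

Like the naive recursion it imitates, this recomputes sub-results and takes exponential time, so it is only practical for small n.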
RCOMM Challenge 3: Fibonacci Numbers (Impr...
(1)
At the RComm 2010 (www.rcomm2010.org), an unusual competition was held. Titled "Who Wants to Be a Data Miner", three challenges were issued to the participants of the conference. In all challenges, participants had to design RapidMiner processes as quickly as possible. This is the winning process of Challenge 3: "Fibonacci Numbers" by Matko Bošnjak. This was the task:
The n-th Fibonacci number is F(n)=F(n-1)+F(n-2), and F(0)=0, F(1)=1. Create a process that creates an example set with F(n)...
Created: 2010-09-17
| Last updated: 2010-09-17
RCOMM Challenge 2: Broken Iris
(1)
At the RComm 2010 (www.rcomm2010.org), an unusual competition was held. Titled "Who Wants to Be a Data Miner", three challenges were issued to the participants of the conference. In all challenges, participants had to design RapidMiner processes as quickly as possible. This is the winning process of Challenge 2: "Broken Iris" by Nico Piatkowski. This was the task:
You are given a decision tree model (M) designed on the well-known Iris data set and unlabelled data (U) on which the model is t...
Created: 2010-09-17
RCOMM Challenge 1: 99 bottles of beer
(1)
At the RComm 2010 (www.rcomm2010.org), an unusual competition was held. Titled "Who Wants to Be a Data Miner", three challenges were issued to the participants of the conference. In all challenges, participants had to design RapidMiner processes as quickly as possible. This is the winning process of Challenge 1: "99 bottles of beer" by Sebastian Land. This was the task:
Design a process that produces an example set the rows of which form the lyrics of the well-known song "99 bottles of beer...
Created: 2010-09-17
Optimizing Discretization
(1)
This process generates a decision tree on the Iris data set. Before the decision tree is generated, the input attributes are discretized so that we only work on nominal attributes. We use a combination of "Select Subprocess" and "Optimize Parameters" to select the best of five different discretization methods independently for each of the attributes. The process shows that the resulting accuracy depends heavily on the choice of method: it varies between 64% and 94%.
Created: 2010-06-18
| Last updated: 2010-06-18
Transaction Analysis Demo from RM 5 Intro Day
(1)
This is the demo process presented at the RapidMiner 5 Intro Day. It combines customer segmentation with direct mailing.
It loads some transaction data, then aggregates and pivots the data so it can be used by a clustering step to perform customer segmentation. Then, additional data is joined with the clustered data. First, response/no-response data is joined, and then some additional information about the users is added. Finally, customers are classified into response/no-response classes.
The dat...
Created: 2010-04-30
| Last updated: 2010-05-05
Stacking
(1)
RapidMiner supports meta learning by embedding one or several basic learners as children into a parent meta learning operator. Here, we use three base learners inside the stacking operator: decision tree induction, linear regression, and a nearest neighbours classifier. Finally, a Naive Bayes learner is used as the stacking learner, which uses the predictions of the preceding three learners to make a combined prediction.
Created: 2010-04-29
Crossvalidation with SVM
(1)
Performs a cross-validation on a given data set with a nominal label, using a Support Vector Machine as the learning algorithm. Inside the cross-validation, the first subprocess generates an SVM model, and the second subprocess evaluates it by applying it to a so-far unused subset of the data and counting the misclassifications.
Created: 2010-04-29
Image Mining with RapidMiner
(1)
This is an image mining process using the image mining Web service provided by NHRF within e-Lico. It first uploads a set of images found in a directory, then preprocesses the images and visualizes the result. Furthermore, references to the uploaded images are stored in the local RapidMiner repository so they can later be used for further processing without uploading images a second time.
Created: 2010-04-28
| Last updated: 2012-01-16