Feature selection with background knowledge
A sample workflow that exploits background knowledge in the form of links connecting attributes across the different measurement data sets.
The input data is assumed to be labeled and to consist of several interconnected biological levels. Additionally, a number of files defining links between attributes in the different measurement data sets are provided. This data was used in the KUP data mining challenge (http://tunedit.org/challenge/ON). Here we focus on two biological levels: protein data measured by LCMS, and miRNA data measured using specific pan-miRNA arrays. The goal is to build a regression model for predicting Pelvic Diameter that performs well on held-out miRNA data.
This workflow first selects 20 proteins using ReliefF from the LCMS data labeled with Differential Renal Function, then selects 20 features from the miRNA data labeled with Pelvic Diameter. It then extends the 20 selected miRNAs with those miRNAs that are linked to the 20 selected proteins from LCMS, which gives 54 miRNAs in total. Finally, we compare the performance of a simple linear SVM, evaluated on held-out miRNA data, when it is trained on the miRNA data with three different feature sets: the 20 selected miRNA features, the 54 features obtained from both miRNA and LCMS, and 54 features selected from miRNA only.
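The link-based extension step and the three feature sets being compared can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow's actual implementation: it assumes the features have already been ranked (for example by ReliefF scores), and that the link files have been loaded into a dictionary mapping each protein to the miRNAs it is linked to. The function names, the dictionary layout, and the toy identifiers are all hypothetical.

```python
from typing import Dict, List, Sequence, Set


def extend_with_links(
    selected_mirnas: Sequence[str],
    selected_proteins: Sequence[str],
    protein_to_mirnas: Dict[str, Set[str]],
) -> List[str]:
    """Return the selected miRNAs plus every miRNA linked to a selected protein."""
    # Keep the original order and drop duplicates.
    extended = list(dict.fromkeys(selected_mirnas))
    seen = set(extended)
    for protein in selected_proteins:
        # Sort the linked miRNAs so the result is deterministic.
        for mirna in sorted(protein_to_mirnas.get(protein, ())):
            if mirna not in seen:
                seen.add(mirna)
                extended.append(mirna)
    return extended


def build_feature_sets(
    ranked_mirnas: Sequence[str],
    ranked_proteins: Sequence[str],
    protein_to_mirnas: Dict[str, Set[str]],
    k: int = 20,
) -> Dict[str, List[str]]:
    """Build the three miRNA feature sets that the comparison uses."""
    base = list(ranked_mirnas[:k])
    linked = extend_with_links(base, ranked_proteins[:k], protein_to_mirnas)
    # Control set: as many features as the extended set, but chosen from miRNA only.
    same_size = list(ranked_mirnas[: len(linked)])
    return {
        "mirna_top_k": base,
        "mirna_plus_linked": linked,
        "mirna_only_same_size": same_size,
    }


# Toy data (hypothetical identifiers), using k=2 instead of 20 for brevity.
ranked_mirnas = ["miR-a", "miR-b", "miR-c", "miR-d", "miR-e"]
ranked_proteins = ["P1", "P2", "P3"]
links = {"P1": {"miR-b", "miR-x"}, "P2": {"miR-y"}, "P3": {"miR-z"}}

feature_sets = build_feature_sets(ranked_mirnas, ranked_proteins, links, k=2)
for name, features in feature_sets.items():
    print(name, features)
```

The third set serves as a control: it has the same number of features as the link-extended set, so any difference in performance on the held-out data can be attributed to the background knowledge rather than to the larger feature count.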
Version 1 (of 1)