My role as a biologist and bioinformatician in e-science is to help increase the usefulness of emerging information technologies for biology, while experimenting with new ways to increase insight into mechanisms related to structure and function of DNA in the cell. I experiment with technologies such as workflow, knowledge extraction from text, semantic web and virtual research environments such as myExperiment.

More information on the blog below (originally uploaded as an example for the 'NBIC on workflows' workshop in Lunteren, the Netherlands, March 2008).

http://www.myexperiment.org/blogs/15

Other contact details:

Also see my web page at the University of Amsterdam: http://home.medewerker.uva.nl/m.roos1

Interests:

Structure/function relationship of DNA in the cell, e-science, automated support for modeling biological mechanisms by knowledge extraction and semantic web technology.

Field/Industry: Biology

Occupation/Role(s): PhD, e-(bio)scientist (biology 'power-user'), biology e-Science liaison for NBIC and e-Science organisations

Organisation(s):

Leiden University Medical Centre
University of Amsterdam
NBIC
OMII-UK / myGrid

Taverna 2

Uploader

Marco Roos

HPO-UMLS-ConceptID mapping (1)

Generate HPO-Concept profiles via HPO-UMLS mappings. The result is a list of Concept IDs corresponding to Concept profiles for UMLS concepts that approximate HPO concepts. The output is a table of UMLS-ID, HPO-ID, Concept-ID rows.

Created: 2014-10-20

Credits: Marco Roos BioSemantics

Taverna 2

Uploader

Marco Roos

Get HPO concept label and synonym (1)

This workflow queries BioPortal for the label and synonyms of Human Phenotype Ontology concepts. Note: this workflow requires a BioPortal API key to work. It can be requested from bioportal.bioontology.org

Created: 2014-10-20 | Last updated: 2014-10-20

Credits: Rajireturn BioSemantics

Taverna 2

Uploader

Marco Roos

Match concept to HPO profiles (1)

This workflow matches a query concept to the list of Human Phenotypes. The Human Phenotypes are the subset of the Human Phenotype Ontology for which we have a mapped UMLS concept available and a concept profile. HPO-UMLS mapping: Winnenburg, R., & Bodenreider, O. (2014). Coverage of Phenotypes in Standard Terminologies. In Proceedings of the ISMB’2014 SIG meeting “BioLINK.” Retrieved from http://phenoday2014.bio-lark.org/pdf/5.pdf Concept Profile Database: July 2012

Created: 2014-10-20

Credits: Marco Roos BioSemantics

Attributions: Match concept profiles Get concept information

Taverna 2

Uploader

Marco Roos

OPS REST services (1)

Library of REST services developed for the Open Pharmacological Space (OPS) by the OpenPHACTS project. Usage: 1. Copy-paste a service into your workflow 2. Add an output to responseBody 3. Run the workflow (this will produce output in XML) 4. Copy the XML output 5. Go back to the design window and add the XPath widget to the canvas 6. Link the responseBody output to the XPath widget input 7. Paste the XML output to the example window in the XPath configure window 8. Select the desired eleme...
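Steps 4 to 8 above amount to selecting elements from the XML in responseBody with an XPath expression. Outside Taverna, the same idea can be sketched in Python as below. The XML layout and the element names (result, primaryTopic, prefLabel, organism) are hypothetical placeholders, since the real OPS response format is not shown here; note also that Python's ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body: the real OPS XML layout is an assumption here.
RESPONSE_BODY = """<result>
  <primaryTopic>
    <prefLabel>Example target</prefLabel>
    <organism>Homo sapiens</organism>
  </primaryTopic>
</result>"""


def select_elements(xml_text, xpath):
    """Return the text of every element matched by a (limited) XPath expression."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(xpath)]


# Select the desired element, as one would in the XPath widget.
labels = select_elements(RESPONSE_BODY, "./primaryTopic/prefLabel")
print(labels)
```

Pasting an example response into such a script first, as in step 7, is a convenient way to check an expression before wiring it into the workflow.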

Created: 2013-07-06

Credits: Marco Roos Open PHACTS

Taverna 2

Uploader

Marco Roos

OPS_FreetextToTargetInfo (1)

Workflow to retrieve target information for concepts as referred to by humans (the input). Known issues: It produces error values for concepts returned by ConceptWiki that are apparently not present in OPS (e.g. for "ezh2" and limit=10, it gives 7/10 error values vs "ezh2 (homo sapiens)" giving 2 valid values).

Created: 2013-06-18

Credits: Marco Roos Katy Wolstencroft Paul Groth

Taverna 2

Uploader

Marco Roos

Mining the Kegg pathway database with the ... (1)

Genome-Wide Association studies (GWAS) with metabolomic phenotypes yield several statistically significant SNP-metabolite associations. To understand the biological basis of the association, scientists typically dwell on identifying genes in the vicinity of the SNP and the possible pathways that the gene participates in. The information needed to arrive at an understanding of the mechanistic basis of the association requires integration of disparate data sources. The purpose of this workflow ...

Created: 2013-01-30

Credits: Marco Roos

Taverna 2

Uploader

Marco Roos

DatabaseID to ConceptID (7)

Purpose: This workflow maps input Identifiers, common database identifiers, to the Concept Identifiers from the EMC ontology. Result: Concept Identifiers from the EMC ontology. Comments: Database: one of CAS, DRUG, etc. The supported databases are listed below (database, description, example). CAS, Chemical Abstracts Service registry number, 64-17-5. DRUG, Drug Bank, DB00316. AF, Affymetrix, 200007_at. CHEB, ChEBI, 16236. CHID, ChemIDplus, 0000050000. EG, Entrez-Gene, 3064. GO, Gene Ontol...

Created: 2012-06-25 | Last updated: 2014-07-14

Credits: Marco Roos Martijn Schuemie Reinout van Schouwen BioSemantics

Taverna 2

Uploader

Marco Roos

Match gene lists based on information in l... (7)

[THIS WORKFLOW IS IN BETA STAGE] This workflow computes the match between two lists of Entrez Gene Identifiers by means of concept profile matching (Jelier et al., van Haagen et al.). The result of this is a list of concepts ordered by their matching score (the length of the list set by maxMatchNr). Of this list the summed scores are explained by computing the concepts that contribute most to the combination of the matching genes. Example to explain (by analogy): When a group of informatic...

Created: 2012-04-17 | Last updated: 2012-04-25

Credits: Marco Roos Reinout van Schouwen Eleni Kristina Hettne BioSemantics

Attributions: Match concept profiles Explain concept scores

Taverna 2

Uploader

Marco Roos

Match concept profiles (6)

Purpose of workflow: The workflow can be used to match a set of concept profiles with another set of concept profiles. Result: A list of concepts ordered by their match to the query concept profiles.

Created: 2011-12-02 | Last updated: 2014-07-14

Credits: Marco Roos Kristina Hettne Martijn Schuemie Reinout van Schouwen

Taverna 2

Uploader

Marco Roos

Gene expression interpretation by the Glob... (1)

This workflow adds meaning to gene expression values by performing a standard and a literature-weighted Global Test. Gene expression is expected to be from Affymetrix microarrays, for which an RMA normalization and Entrez Gene ID mapping/summation is performed. Original workflow is by Dennis Leenheer, edits by Marco Roos. Scripts by Kristina Hettne, acknowledging Rob Jelier, Jelle Goeman, and Peter-Bram 't Hoen. The workflow was created for the LUMC BioSemantics group, part of the Human Gen...

Created: 2011-04-26 | Last updated: 2011-04-26

Credits: Marco Roos

Taverna 2

Uploader

Marco Roos

BioAID_ProteinDiscovery (8)

The workflow extracts protein names from documents retrieved from MedLine based on a user Query (cf Apache Lucene syntax). The protein names are filtered by checking if there exists a valid UniProt ID for the given protein name.

Created: 2010-05-10 | Last updated: 2013-08-16

Credits: Marco Roos AID

Taverna 1

Uploader

Marco Roos

BioAID_ProteinDiscovery_filterOnHumanUnipr... (11)

This workflow finds proteins relevant to the query string via the following steps: 1. A user query: a single gene/protein name. E.g.: (EZH2 OR "Enhancer of Zeste"). 2. Retrieve documents: finds 'maximumNumberOfHits' relevant documents (abstract+title) based on query (the AIDA service inside is based on Apache's Lucene) 3. Discover proteins: extract proteins discovered in the set of relevant abstracts with a 'named entity recognizer' trained on genomic terms using a Bayesian approach; the AIDA serv...

Created: 2009-05-28

Credits: Marco Roos Martijn Schuemie AID AID_myGrid_collaboration

Attributions: BioAID_DiseaseDiscovery_RatHumanMouseUniprotFilter

Taverna 1

Uploader

Marco Roos

BioAID_EnirchBioModelWithProteinsFromText (7)

This workflow is for demonstration purposes only. Please contact the authors if you wish to try it. We will gladly collaborate with you. Summary: This workflow extracts proteins and protein relations from Medline. Extracted protein names (symbols of at least 3 characters) are validated against mouse, rat, and human UniProt symbols, so the results are limited to these species. This workflow takes the following basic steps: it retrieves documents relevant for the query string i...

Created: 2009-05-16 | Last updated: 2009-05-16

Credits: Marco Roos Sophia Katrenko Andrew Gibson M. Scott Marshall Willem van Hage Edgar Martijn Schuemie AID

Taverna 1

Uploader

Marco Roos

BioAID_DiseaseDiscovery_RatHumanMouseUnipr... (4)

This workflow finds diseases relevant to the query string via the following steps: 1. A user query: a list of terms or boolean query; see the Apache Lucene project for all details. E.g.: (EZH2 OR "Enhancer of Zeste" +(mutation chromatin) -clinical); consider adding 'ProteinSynonymsToQuery' in front of the input if your query is a protein. 2. Retrieve documents: finds 'maximumNumberOfHits' relevant documents (abstract+title) based on query (the AIDA service inside is based on Apa...

Created: 2008-12-15 | Last updated: 2011-08-11

Credits: Marco Roos AID

Taverna 1

Uploader

Marco Roos

Demo_DiseaseDiscovery_byHumanUniprot_scaffold (1)

This workflow finds diseases relevant to the query string via the following steps: 1. A user query: a list of terms or boolean query; see the Apache Lucene project for all details. E.g.: (EZH2 OR "Enhancer of Zeste" +(mutation chromatin) -clinical); consider adding 'ProteinSynonymsToQuery' in front of the input if your query is a protein. 2. Retrieve documents: finds 'maximumNumberOfHits' relevant documents (abstract+title) based on query (the AIDA service inside is based on Apache's Lucene)...

Created: 2007-12-10

Credits: Marco Roos AID

Taverna 1

Uploader

Marco Roos

Retrieve_documents_MR1 (1)

This workflow applies the search web service from the AIDA toolbox. Comments: This search service is based on Lucene defaults; it may be necessary to optimize the query string to adapt the behaviour to what is most relevant in a particular domain (e.g. for Medline, prioritizing based on publication date is useful). Lucene favours shorter sentences, which may be bad for subsequent information extraction.

Created: 2007-12-10

Credits: Marco Roos Edgar AID

Taverna 1

Uploader

Marco Roos

Retrieve_bio_documents (2)

This workflow retrieves relevant documents, based on a query optimized by adding a string to the original query that will rank the search output according to the most recent years. The added string adds years with priorities (most recent is highest); it starts at 2007.
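The query optimization described above can be sketched roughly as follows, using Lucene's `term^boost` and `+` (required clause) query syntax. This is an illustration under assumptions, not the workflow's actual string: the field name `year`, the number of years and the boost values are hypothetical.

```python
def add_year_priorities(query, latest_year=2007, n_years=5, field="year"):
    """Append boosted year terms so that more recent documents rank higher.

    The original query is made a required clause (+), while the year terms
    are optional and only influence ranking. The field name is an assumption.
    """
    boosts = [
        f"{field}:{latest_year - offset}^{n_years - offset}"
        for offset in range(n_years)
    ]
    return f"+({query}) ({' '.join(boosts)})"


print(add_year_priorities("EZH2", n_years=3))
```

With these settings the most recent year (2007) receives the highest boost and each earlier year one less.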

Created: 2007-12-10 | Last updated: 2007-12-10

Credits: Marco Roos Edgar AID

Taverna 1

Uploader

Marco Roos

Lucene_bioquery_optimizer_MR1 (1)

This workflow does four things: 1. it retrieves documents relevant for the query string 2. it discovers entities in those documents; these are considered relevant entities 3. it filters proteins from those entities (on the tag protein_molecule) 4. it removes all terms from the list produced by 3 (query terms temporarily considered proteins) ToDo: Replace step 4 by the following procedure: 1. remove the query terms from the output of NER (probably by a regexp matching on what is inside the tag, ...

Created: 2007-12-10

Credits: Marco Roos AID

Taverna 1

Uploader

Marco Roos

Link_protein_to_OMIM_disease (1)

No description

Created: 2007-12-10

Credits: Marco Roos AID

Taverna 1

Uploader

Marco Roos

Flatten_and_make_unique (1)

No description

Created: 2007-12-10

Credits: Marco Roos AID

Taverna 1

Uploader

Marco Roos

Extract_proteins (2)

This workflow filters protein_molecule-labeled terms from an input string(list). The result is a tagged list of proteins (disregarding false positives in the input). Internal information: This workflow is a copy of 'filter_protein_molecule_MR3' used for the NBIC poster (now in Archive).
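A minimal sketch of this kind of filtering, assuming the input marks entities with XML-like tags such as `<protein_molecule>...</protein_molecule>` (the exact tag layout of the real input is an assumption). It uses plain string matching rather than an XML parser, so input that is not well-formed XML does not break it.

```python
import re

TAG = "protein_molecule"
# Non-greedy match of everything between an opening and a closing tag.
PATTERN = re.compile(rf"<{TAG}>(.*?)</{TAG}>", re.DOTALL)


def extract_proteins(tagged_text):
    """Return the unique protein names tagged in the text, in order of appearance."""
    proteins = []
    for match in PATTERN.findall(tagged_text):
        name = match.strip()
        if name and name not in proteins:
            proteins.append(name)
    return proteins


# Hypothetical tagged input; other tags are ignored.
sample = (
    "<protein_molecule>EZH2</protein_molecule> binds <DNA>chromatin</DNA> with "
    "<protein_molecule>EED</protein_molecule> and <protein_molecule>EZH2</protein_molecule>"
)
print(extract_proteins(sample))
```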

Created: 2007-12-10 | Last updated: 2007-12-10

Credits: Marco Roos

Taverna 1

Uploader

Marco Roos

Discover_entities (2)

This workflow contains the 'Named Entity Recognize' web service from the AIDA toolbox, created by Sophia Katrenko. It can be used to discover entities of a certain type (determined by 'learned_model') in documents provided in a Lucene output format. Known issues: The output of NErecognize contains concepts with / characters, breaking the XML. For post-processing its results it is better to use string manipulation than XML manipulation. The output is per document, which means entities will ...

Created: 2007-12-10 | Last updated: 2007-12-10

Credits: Marco Roos Sophia Katrenko AID

Taverna 1

Uploader

Marco Roos

TestIteratorStrategy_withCloning (2)

This workflow implements a strategy for this problem:

> I would like to perform an iteration including a dot product between a list and a list of lists; example:
> Input:
> [1] (1)
> [A,B,C] (2)
> [[a,b],[c,d],[e,f]] (3)
> Desired output:
> [1Aa, 1Ab, 1Bc, 1Bd, 1Ce, 1Cf]

In this implementation a Java BeanShell script is used to clone the items in list 2 as many times per item as there are items in the sublists of list 3. The iteration stra...
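The cloning idea can be illustrated outside Taverna with a short Python sketch. This is not the BeanShell script from the workflow; the function names are made up for illustration. Each item of list 2 is cloned once per item in the matching sublist of list 3, after which the two structures have the same shape and can be combined pairwise.

```python
def clone_items(items, copy_numbers):
    """Clone each item as many times as the matching copy number."""
    return [[item] * n for item, n in zip(items, copy_numbers)]


def dot_with_sublists(prefixes, outer, nested):
    """Combine every prefix with the dot product of outer items and nested sublists."""
    counts = [len(sublist) for sublist in nested]
    cloned = clone_items(outer, counts)  # same shape as `nested`
    return [
        prefix + outer_item + inner_item
        for prefix in prefixes
        for cloned_sublist, sublist in zip(cloned, nested)
        for outer_item, inner_item in zip(cloned_sublist, sublist)
    ]


result = dot_with_sublists(["1"], ["A", "B", "C"], [["a", "b"], ["c", "d"], ["e", "f"]])
print(result)  # ['1Aa', '1Ab', '1Bc', '1Bd', '1Ce', '1Cf']
```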

Created: 2007-11-29 | Last updated: 2007-11-29

Credits: Marco Roos

Taverna 1

Uploader

Marco Roos

CloneItemsInList (1)

Utility workflow that clones an item copy_number times. You can use this to work around standard iteration strategies, e.g. in combination with the CountListItems workflow. Workflow examples: TestIterationStrategy_withClones. For an alternative approach see TestIterationStrategy_withNesting.

Example I/O:

input: A copy_number: 3 result: [A,A,A]
input: [A,B,C] copy_number: 3 result: [[A,A,A],[B,B,B],[C,C,C]]
input: [A,B,C] copy_number: [3,2] result: [[[A,A,A],[A,A]],[[B,B,B],[B,B]],[[C,C,C],...
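The example I/O is consistent with a cross-product style iteration over both inputs, which can be mimicked with a small recursive function. This is a sketch of the observable behaviour only, not the workflow's own implementation.

```python
def clone(item, copy_number):
    """Clone `item` copy_number times, iterating over list inputs on either side."""
    if isinstance(item, list):
        # Iterate over the items first, keeping the list structure.
        return [clone(element, copy_number) for element in item]
    if isinstance(copy_number, list):
        # Then iterate over the copy numbers for a single item.
        return [clone(item, n) for n in copy_number]
    return [item] * copy_number


print(clone("A", 3))              # ['A', 'A', 'A']
print(clone(["A", "B", "C"], 3))  # [['A', 'A', 'A'], ['B', 'B', 'B'], ['C', 'C', 'C']]
print(clone(["A", "B"], [3, 2]))  # [[['A', 'A', 'A'], ['A', 'A']], [['B', 'B', 'B'], ['B', 'B']]]
```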

Created: 2007-11-29

Credits: Marco Roos

Taverna 1

Uploader

Marco Roos

TestIteratorStrategy_withNesting (1)

Implementation of the iteration workaround by Tom Oinn, following the Q&A below. The nested workflow is called 'NestedProcessor' to conform to Tom's explanation. For an alternative solution using a Java BeanShell to clone list items see 'TestIteratorStrategy_withCloning'. This workflow implements the following Q&A: Marco Roos wrote: > Dear Taverna user, > > Issue 1: Complex iteration > > I would like to perform an iteration including a dot product between > a list and a list of li...

Created: 2007-11-29

Credits: Marco Roos Tomoinn

Taverna 1

Uploader

Marco Roos

TestIterator (1)

Workflow to experiment with list iteration strategies. Look at metadata of nested workflow 'Concatenate' to see the current iteration strategy.

Created: 2007-11-28

Credits: Marco Roos

Taverna 1

Uploader

Marco Roos

BioAID_Discover_proteins_from_text_plus_sy... (1)

This workflow discovers proteins from plain text and adds synonyms using Martijn Schuemie's protein synonym service. Proteins are discovered with the AIDA 'Named Entity Recognize' web service by Sophia Katrenko (service based on LingPipe), from whose output the workflow extracts the proteins. The 'Named Entity Recognize' service uses the pre-learned genomics model, named 'MedLine', to find genomics concepts in plain text.

Created: 2007-11-15

Credits: Marco Roos Martijn Schuemie AID

Taverna 1

Uploader

Marco Roos

Discover_proteins_from_text (2)

This workflow discovers proteins from plain text. It is built around the AIDA 'Named Entity Recognize' web service by Sophia Katrenko (service based on LingPipe), from whose output the workflow extracts the proteins. The 'Named Entity Recognize' service uses the pre-learned genomics model, named 'MedLine', to find genomics concepts in plain text.

Created: 2007-11-15 | Last updated: 2007-11-15

Credits: Marco Roos AID

Taverna 1

Uploader

Marco Roos

BioAID_ProteinToDiseases (1)

This workflow was based on BioAID_DiseaseDiscovery (changes: expects only one protein name, adds protein synonyms). This workflow finds diseases relevant to the query string via the following steps: 1. A user query: a single protein name 2. Add synonyms (service courtesy of Martijn Schuemie, Erasmus University Rotterdam) 3. Retrieve documents: finds relevant documents (abstract+title) based on query 4. Discover proteins: extract proteins discovered in the set of relevant abstracts 5. Link proteins ...

Created: 2007-11-14 | Last updated: 2007-11-15

Credits: Marco Roos Martijn Schuemie AID

Attributions: BioAID_DiseaseDiscovery_RatHumanMouseUniprotFilter

Taverna 1

Uploader

Marco Roos

CountListElements (5)

Very simple workflow to count the number of items in a list (top level only in case of nested lists). Does no more than count = list.size();

Created: 2007-10-17 | Last updated: 2007-10-17

Taverna 1

Uploader

Marco Roos

DiscoverProteinLink (2)

COMPETITION: For friends only: If you find any two topics that return true positives with this workflow I will buy you a bottle of wine (or equivalent). Terms: if we confirm that the protein was indeed never mentioned together with both input topics in one article, we will publish this together. ---- This workflow implements Swanson's principle with services from the AIDA toolbox. It tries to find proteins that link two topics, while they are never mentioned together with both topics in ...
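The linking step can be sketched with plain set operations. This is a simplified reading of Swanson's principle, not the workflow's actual logic: it assumes that document retrieval and protein discovery have already produced, for each topic, a mapping from document ID to the proteins found in that document (the data below is made up).

```python
def linking_proteins(docs_a, docs_b):
    """Proteins found with topic A and with topic B, but never in a document on both.

    docs_a, docs_b: dict mapping document ID -> set of protein names discovered
    in that document, for documents retrieved on topic A and topic B respectively.
    """
    shared_ids = docs_a.keys() & docs_b.keys()  # documents about both topics
    seen_with_both = set().union(*(docs_a[doc_id] for doc_id in shared_ids))
    in_a = set().union(*docs_a.values())
    in_b = set().union(*docs_b.values())
    return (in_a & in_b) - seen_with_both


# Hypothetical example data.
docs_topic_a = {"d1": {"P1", "P2"}, "d2": {"P3"}}
docs_topic_b = {"d3": {"P1"}, "d2": {"P3"}, "d4": {"P4"}}
print(linking_proteins(docs_topic_a, docs_topic_b))  # {'P1'}
```

Here P3 is excluded because it occurs in a document (d2) retrieved for both topics, so the link it suggests is already in the literature.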

Created: 2007-10-03 | Last updated: 2007-11-15

Credits: Marco Roos AID

Taverna 1

Uploader

Marco Roos

ProteinSynonymsToQuery (2)

This workflow uses Martijn Schuemie's protein synonym service to produce synonyms and a new query string from the input query term. The service is limited to proteins, enzymes and genes. An input query that is a boolean string will be split and processed, but the boolean logic of the input query will be lost. Workflow URL: http://rdf.adaptivedisclosure.org/~marco/BioAID/Public/Workflows/BioAID/ProteinSynonymsToQuery.xml
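The loss of boolean logic can be illustrated with a small sketch. The synonym lookup below is a hypothetical stand-in for the synonym service, and joining all terms with OR is an assumption about the form of the new query string.

```python
import re

# Hypothetical stand-in for the protein synonym service.
SYNONYMS = {"EZH2": ["EZH2", "Enhancer of Zeste homolog 2"]}
OPERATORS = {"AND", "OR", "NOT"}


def split_terms(query):
    """Split a boolean query into bare terms, dropping operators and grouping."""
    tokens = re.findall(r'"[^"]+"|[^\s()]+', query)
    terms = []
    for token in tokens:
        if token in OPERATORS:
            continue
        term = token.strip('"').lstrip("+-")
        if term:
            terms.append(term)
    return terms


def quote(term):
    """Quote multi-word terms so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def synonyms_to_query(query, lookup=SYNONYMS):
    """Expand every term with its synonyms and join everything with OR."""
    expanded = []
    for term in split_terms(query):
        for synonym in lookup.get(term, [term]):
            if synonym not in expanded:
                expanded.append(synonym)
    return " OR ".join(quote(s) for s in expanded)


print(synonyms_to_query("EZH2 AND -clinical"))
```

In this sketch the excluded term `-clinical` ends up as just another OR clause, which is the kind of information loss the description warns about.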

Created: 2007-10-03 | Last updated: 2007-11-13

Credits: Marco Roos Martijn Schuemie AID
