DiscoverProteinLink
This workflow implements Swanson's principle with services from the AIDA toolbox. It tries to find proteins that link two topics even though they are never mentioned together with both topics in any of the top-ranking papers related to either topic 1 or topic 2.
It uses the following logic:
Discovered Protein Link = (Protein[Topic1 AND NOT Topic2] AND Protein[Topic2 AND NOT Topic1]) AND NOT Protein[Topic1 AND Topic2]
where 'Protein[Topic1 OPERATOR Topic2]' represents a protein discovered in abstracts returned from Medline using 'Topic1 OPERATOR Topic2' as the query.
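The logic above can be read as plain set operations on the proteins extracted for each of the three queries. The following is a minimal sketch of that reading in Python, not the workflow's actual implementation; the function name and the example protein names are hypothetical and chosen only for illustration.

```python
def discover_protein_links(only_topic1, only_topic2, both_topics):
    """Return proteins found for each topic on its own but never for both together.

    only_topic1: proteins from abstracts matching 'Topic1 AND NOT Topic2'
    only_topic2: proteins from abstracts matching 'Topic2 AND NOT Topic1'
    both_topics: proteins from abstracts matching 'Topic1 AND Topic2'
    """
    shared = set(only_topic1) & set(only_topic2)  # candidate links
    return shared - set(both_topics)              # drop already-known links


# Hypothetical extraction results, for illustration only.
only_topic1 = {"P53", "BRCA1", "TNF"}
only_topic2 = {"TNF", "BRCA1", "IL6"}
both_topics = {"BRCA1"}

links = discover_protein_links(only_topic1, only_topic2, both_topics)
print(sorted(links))
```

In this made-up example only TNF survives: it appears for each topic separately, whereas BRCA1 is removed because it already co-occurs with both topics.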
Comments:
- It may be useful to optimize the queries for the topics by experimenting with a DiscoverProteins subworkflow first. For example, 'cancer' surprisingly does not return any proteins, possibly because clinical papers dominate the retrieval results. The query '+cancer -(therapy clinic) +(protein^10.0 proteins^10.0 gene^9 genes^9)' performs much better. It contains the Lucene priority (boost) operator '^[priority]', where priority=1 is the default.
- The nature of the Swanson algorithm makes it much more likely that this workflow returns no results or false positives than that it returns true positives.
- True positives returned by this workflow are true only with respect to the results of the information retrieval step and the information extraction step.
Limits:
1. Information retrieval: a limited number of documents is returned, searching uses indexes, and only abstracts are searched and returned.
2. Entity recognition: not guaranteed to recognize all instances of proteins.
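As an illustration of the query tuning described in the first comment, the sketch below assembles a Lucene-style query string from a required topic, excluded terms, and boosted terms. The helper name and its parameters are assumptions made for this example; they are not part of the AIDA toolbox or of Lucene itself.

```python
def build_topic_query(topic, exclude=(), boosted=None):
    """Assemble a Lucene-style query string (hypothetical helper).

    topic:   term that must occur (prefixed with '+')
    exclude: terms that must not occur (grouped and prefixed with '-')
    boosted: mapping of term -> priority, rendered as 'term^priority'
    """
    parts = ["+" + topic]
    if exclude:
        parts.append("-(" + " ".join(exclude) + ")")
    if boosted:
        terms = " ".join(f"{term}^{weight}" for term, weight in boosted.items())
        parts.append("+(" + terms + ")")
    return " ".join(parts)


query = build_topic_query(
    "cancer",
    exclude=["therapy", "clinic"],
    boosted={"protein": 10.0, "proteins": 10.0, "gene": 9, "genes": 9},
)
print(query)
```

With these inputs the helper reproduces the example query from the comment, which makes it easy to try other topics or priorities in a DiscoverProteins subworkflow before running the full workflow.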
Run this Workflow in the Taverna Workbench...
Option 1:
Copy and paste this link into File > 'Open workflow location...'
http://myexperiment.org/workflows/31/download?version=1