Lucene_bioquery_optimizer_MR1
This workflow does four things:
1. it retrieves documents relevant to the query string
2. it discovers entities in those documents; these are considered the relevant entities
3. it filters the proteins from those entities (on the tag protein_molecule)
4. it removes all query terms from the list produced by step 3 (the query terms are temporarily considered proteins)
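Steps 3 and 4 can be sketched in Python as follows. This is a minimal illustration, not the workflow's actual implementation: the (text, tag) tuple representation of entities, the function names filter_proteins and remove_query_terms, the sample data, and the tag other_name are assumptions made for this example. Only the tag name protein_molecule comes from the description above.

```python
def filter_proteins(entities, tag="protein_molecule"):
    """Step 3: keep the text of entities that carry the protein tag."""
    return [text for text, entity_tag in entities if entity_tag == tag]


def remove_query_terms(proteins, query):
    """Step 4: drop entries that equal a query term (case-insensitive)."""
    query_terms = {term.lower() for term in query.split()}
    return [p for p in proteins if p.lower() not in query_terms]


# Hypothetical entities discovered in the retrieved documents (step 2).
# The tag "other_name" is invented for illustration.
entities = [
    ("EZH2", "protein_molecule"),
    ("chromatin", "other_name"),
    ("HDAC1", "protein_molecule"),
]

proteins = filter_proteins(entities)
result = remove_query_terms(proteins, "EZH2 histone")
```

Here the query term EZH2 is first kept as a protein by step 3 and then removed by step 4, which leaves only the proteins that were discovered rather than asked for.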
ToDo
Replace step 4 with the following procedure:
1. remove the query terms from the output of NER (probably by a regexp matching on what is inside the tag, ...
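The regexp idea in item 1 might look like the sketch below. It assumes that the NER output marks entities with inline tags of the form <protein_molecule>...</protein_molecule>; the actual output format is not specified here, and the function name remove_tagged_query_terms is hypothetical.

```python
import re


def remove_tagged_query_terms(ner_output, query_terms, tag="protein_molecule"):
    """Delete tagged entities whose content is exactly one of the query terms."""
    if not query_terms:
        return ner_output
    # Escape each term so characters such as "+" or "(" are matched literally.
    alternatives = "|".join(re.escape(term) for term in query_terms)
    pattern = re.compile(
        rf"<{re.escape(tag)}>\s*(?:{alternatives})\s*</{re.escape(tag)}>",
        re.IGNORECASE,
    )
    return pattern.sub("", ner_output)


# Assumed NER output format, for illustration only.
ner_output = (
    "<protein_molecule>EZH2</protein_molecule> binds "
    "<protein_molecule>HDAC1</protein_molecule>"
)
cleaned = remove_tagged_query_terms(ner_output, ["EZH2"])
```

Matching on the full content of the tag, rather than on a substring, avoids removing an entity that merely contains a query term.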
Created: 2007-12-10
Credits:
Marco Roos
AID