CPGP Peregrine Get Abstracts Create tables
Created: 2014-05-27 11:29:40
Last updated: 2014-06-30 09:15:55
No description has been set
Workflow Components
Authors (1)
Titles (1)
Concept Profile Generation Pipeline |
Descriptions (1)
Requirements:
Have MySQL installed with a database called mydb, or create one.
Have a MySQL connector file. This can be downloaded from mysql.com.
Rename the downloaded file to mysql-connector-java.jar and place it in Taverna's local lib folder, as listed on the Dependencies tab.
Select this dependency by right-clicking a Beanshell script and choosing it for the whole workflow, using the system classloader.
Download the Peregrine SKOS CLI from: https://trac.nbic.nl/biosemantics/downloads
Install LVG2013Lite. See: https://trac.nbic.nl/biosemantics/wiki/Peregrine%20SKOS%20CLI
Edit lvg.properties and set:
LVG_DIR=/home/path/to/lvg2013lite/
This file can be found at:
/home/path/to/lvg2013lite/data/config/lvg.properties
Copy the production.properties file, which can be obtained from: https://trac.nbic.nl/biosemantics/wiki/Peregrine%20SKOS%20CLI
In the copy, change the normalizer.lvg.properties and normalizer.lvg.binaryCache properties to point to the LVG installation path.
Have the Peregrine indexer installed locally and set its path in the component named "Indexer_tool", which can be found in the nested workflow named Peregrine indexer.
If using a Windows machine, install Cygwin and add it to the PATH environment variable.
In Taverna, go to File --> Preferences --> Tool invocation. Under explicit locations, choose Modify, edit the default local location to use the shell C:\Windows\system32\cmd.exe /c, and save. Then, under symbolic locations, choose Modify --> Edit, select the default local location, and save.
On Windows, set the JAVA_HOME environment variable to the JDK installation directory, for example C:\Program Files\jdk 1.7.0.
The nested workflow Get_abstracts uses an SQL command to retrieve abstracts and assumes the mydb database. Change this command when using another SQL database. |
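The two normalizer paths are easy to get wrong, so a quick check before running the workflow may save a failed run. The following is a minimal sketch in plain Java, not part of the workflow itself: the class name LvgPathCheck and the method findProblems are hypothetical, and the sketch only assumes that the Peregrine properties file uses the two keys named above in standard Java properties syntax.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not shipped with Peregrine or Taverna.
class LvgPathCheck {
    // The two keys named in the requirements above.
    static final String[] KEYS = {
        "normalizer.lvg.properties",
        "normalizer.lvg.binaryCache"
    };

    // Returns one message per key that is unset or points to a missing path.
    static List<String> findProblems(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalArgumentException("Could not parse properties text", e);
        }
        List<String> problems = new ArrayList<String>();
        for (String key : KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                problems.add(key + " is not set");
            } else if (!Files.exists(Paths.get(value.trim()))) {
                problems.add(key + " points to a missing path: " + value.trim());
            }
        }
        return problems;
    }
}
```

An empty result means both keys are set and both paths exist. Note that Java properties syntax treats a backslash as an escape character, so Windows paths in such a file are safer written with forward slashes.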
Dependencies (1)
Inputs (2)
Name | Description
Ontology_input | A SKOS-formatted dictionary/thesaurus of predefined concepts.
db_properties | A properties file, as in the example, stating the user, password, url (the database to use), and driver.
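As an illustration of what a db_properties file might contain and how it can be read, here is a minimal sketch. The key names (user, password, url, driver) follow the description above, but their exact spelling in the workflow's example file is an assumption, as are the sample values and the class name DbPropertiesSketch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper, not part of the workflow.
class DbPropertiesSketch {
    // Assumed key names, taken from the input description.
    static final String[] REQUIRED = {"user", "password", "url", "driver"};

    // Parses the properties text and fails early if a required key is missing.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalArgumentException("Could not parse db_properties", e);
        }
        for (String key : REQUIRED) {
            if (props.getProperty(key) == null) {
                throw new IllegalArgumentException("Missing key: " + key);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Sample content with placeholder credentials.
        String sample =
              "user=taverna\n"
            + "password=secret\n"
            + "url=jdbc:mysql://localhost:3306/mydb\n"
            + "driver=com.mysql.jdbc.Driver\n";
        Properties props = parse(sample);
        System.out.println(props.getProperty("url"));
    }
}
```

The url value selects the database, so this is the place to point the workflow at a schema other than mydb.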
Processors (3)
Name |
Type |
Description |
Peregrine indexer |
workflow |
The nested workflow Peregrine indexer uses the Java Peregrine SKOS CLI .jar file to index the occurrence of concepts in documents.
The Peregrine indexer program can be downloaded from trac.nbic.nl/biosemantics/downloads.
Install LVG2013Lite. See: https://trac.nbic.nl/biosemantics/wiki/Peregrine%20SKOS%20CLI
Copy the production.properties file, which can be downloaded from: https://trac.nbic.nl/biosemantics/wiki/Peregrine%20SKOS%20CLI
Edit lvg.properties and set:
LVG_DIR=/home/path/to/lvg2013lite/
This file can be found in:
/home/path/to/lvg2013lite/data/config/lvg.properties
Change the normalizer.lvg.properties and normalizer.lvg.binaryCache properties to point to the LVG installation path.
normalizer.lvg.properties=/home/path/to/lvg2013lite/data/config/lvg.properties
normalizer.lvg.binaryCache=/home/path/to/lvg2013lite/standartNormCache2013.bin
Have MySQL installed with a database called mydb, or create one.
Have mysql-connector-java-5.1.28-bin.jar or later placed in the local JAR files.
|
Create Tables |
workflow |
This nested workflow creates the tables needed for inserting the generated values. The following tables are created:
- Co-occurrence table
This table will consist out of
- Occurrence table
- Uncertainty coefficient table
- Inner product table
- Big table |
Get_abstracts |
workflow |
A nested workflow to retrieve abstracts from an SQL table. |
Beanshells (9)
Name |
Description |
Inputs |
Outputs |
Beanshell_SQL_generate_table_UC |
A Beanshell script that generates a table for the uncertainty coefficient values. |
time_stamp
|
output
|
Beanshell_SQL_CREATE_last_big_table |
|
timestamp
|
out1
|
Beanshell_SQL_query_generator_Concept_Doc_occurence |
A Beanshell script that generates an SQL statement to INSERT
the concept URI and document URI into the table named after the timestamp (table name) at which the workflow started.
It uses a database called mydb. If there is no database/schema named mydb, please create one. |
table_name
concept_occurence_document
|
out1
|
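The kind of statement this script produces can be sketched as follows. This is not the workflow's actual Beanshell code: the class and method names are hypothetical, the column names are borrowed from the table description elsewhere on this page, and the sketch emits ? placeholders (to be bound through a JDBC PreparedStatement) rather than inlining the URI values, which is a design choice of the sketch.

```java
// Hypothetical sketch, not the workflow's own script.
class InsertSqlSketch {
    // Builds a multi-row INSERT with one "(?, ?)" group per row.
    static String insertSql(String tableName, int rowCount) {
        // Table names cannot be bound as parameters, so restrict them
        // to a safe character set before concatenating.
        if (!tableName.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Unsafe table name: " + tableName);
        }
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be at least 1");
        }
        StringBuilder sql = new StringBuilder("INSERT INTO `mydb`.`")
            .append(tableName)
            .append("` (concept_uri, doc_uri) VALUES ");
        for (int i = 0; i < rowCount; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append("(?, ?)");
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(insertSql("t20140527112940", 2));
    }
}
```

Binding the values instead of concatenating them avoids quoting problems when a URI contains an apostrophe or backslash.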
Beanshell_extract_uri_doc_and_abstract |
A Beanshell script which loops through the input to separate the document URIs from the abstracts. |
Input
|
doc_uri
abstracts
|
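The layout of the Input port is not documented on this page. Purely as an illustration of the loop described above, the sketch below assumes each input row is a document URI followed by a tab and the abstract text. That layout, the class name, and the field names are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real input layout may differ.
class AbstractSplitSketch {
    final List<String> docUris = new ArrayList<String>();
    final List<String> abstracts = new ArrayList<String>();

    // Splits each row at its first tab into a URI and an abstract.
    static AbstractSplitSketch split(List<String> rows) {
        AbstractSplitSketch result = new AbstractSplitSketch();
        for (String row : rows) {
            int tab = row.indexOf('\t');
            if (tab < 0) {
                continue; // row does not match the assumed layout
            }
            result.docUris.add(row.substring(0, tab).trim());
            result.abstracts.add(row.substring(tab + 1).trim());
        }
        return result;
    }
}
```

The two lists stay aligned by index, so entry i of docUris belongs to entry i of abstracts, matching the two output ports doc_uri and abstracts.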
Beanshell_remove_dot_and_put_correct_doc_uri_in_con_occ_doc |
A Beanshell script that replaces the name of the temporary file created by Taverna with the correct document URI. |
concept_occurence_document
doc_uri
|
out1
|
Beanshell_generate_SQL_occurence_concept_doc_table |
A Beanshell script which generates an SQL statement to CREATE a TABLE.
The table name is the timestamp given as input.
The table consists of 4 columns: concept_occ_doc_id, concept_uri, occurence_uri and doc_uri.
concept_occ_doc_id is the PRIMARY KEY and is AUTO_INCREMENT.
Every other column is a VARCHAR(200).
|
timeStamp
|
Table_sql
|
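A statement matching this description might look like the output of the sketch below. The column names and VARCHAR(200) widths come from the description above. The INT type of the key column, the IF NOT EXISTS clause, the mydb schema prefix, and the class name are assumptions of the sketch, not confirmed details of the script.

```java
// Hypothetical sketch, not the workflow's own script.
class CreateTableSketch {
    static String createTableSql(String timeStamp) {
        // The table name is concatenated into the SQL, so restrict it
        // to letters, digits and underscores.
        if (!timeStamp.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Unsafe table name: " + timeStamp);
        }
        // Backticks are needed because an all-digit name is not a valid
        // unquoted identifier in MySQL.
        return "CREATE TABLE IF NOT EXISTS `mydb`.`" + timeStamp + "` ("
            + "concept_occ_doc_id INT NOT NULL AUTO_INCREMENT, "
            + "concept_uri VARCHAR(200), "
            + "occurence_uri VARCHAR(200), "
            + "doc_uri VARCHAR(200), "
            + "PRIMARY KEY (concept_occ_doc_id))";
    }

    public static void main(String[] args) {
        System.out.println(createTableSql("20140527112940"));
    }
}
```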
Beanshell_get_timestamp |
A Beanshell script which retrieves the time at which the program started,
given as year, month, day, hour, minutes and seconds in 24-hour time. |
|
timeStamp
|
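A timestamp of this kind can be produced with java.text.SimpleDateFormat. The sketch below is an illustration, not the script itself: the pattern yyyyMMddHHmmss is an assumption (the page only lists the fields, not their layout), and the class and method names are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch, not the workflow's own script.
class TimestampSketch {
    // Formats a moment as year, month, day, hour (00-23), minutes, seconds.
    static String format(Date when, TimeZone zone) {
        // Locale.ROOT keeps the digits ASCII regardless of the system locale.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmmss", Locale.ROOT);
        fmt.setTimeZone(zone);
        return fmt.format(when);
    }

    public static void main(String[] args) {
        // Uses the current time in the machine's default time zone.
        System.out.println(format(new Date(), TimeZone.getDefault()));
    }
}
```

Because the timeStamp is later used as a table name, a value made only of digits has to be quoted with backticks in MySQL statements.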
Beanshell_generate_SQL_table_inner_product |
|
timestamp
|
out1
|
Beanshell_SQL_create_table_co_occurence |
|
timestamp
|
out1
|
Outputs (2)
Name |
Description |
end |
|
indexer_output |
|
Datalinks (8)
Source | Sink
Create Tables:timeStamp | Peregrine indexer:table_name
Ontology_input | Peregrine indexer:ontology
Get_abstracts:db_results | Peregrine indexer:Input
db_properties | Peregrine indexer:propertyString
db_properties | Create Tables:db_properties
db_properties | Get_abstracts:propertyString
Peregrine indexer:output | end
Peregrine indexer:indexer_output | indexer_output
Version 1 (earliest)
(of 3)