Concept profile calculation CPGP
Created: 2014-05-27 11:40:24
Last updated: 2014-06-30 09:21:05
Part of the CPGP workflow. It builds concept pairs, uses them to calculate a contingency table and the uncertainty coefficient, and inserts the resulting values into a pre-made MySQL table.
Workflow Components
Authors (1)
Titles (1)
Concept Profile Generation Pipeline |
Descriptions (1)
Requirements:
Have MySQL installed with a database called mydb; create one if it does not exist.
Have a MySQL connector file. This can be downloaded from mysql.com.
Rename the downloaded file to mysql-connector-java.jar and place it in Taverna's local lib directory, as listed on the Dependencies tab.
Select this dependency by right-clicking a Beanshell script and choosing it for the whole workflow, with the system classloader.
Download the Peregrine SKOS CLI from: https://trac.nbic.nl/biosemantics/downloads
Install LVG2013Lite. See: https://trac.nbic.nl/biosemantics/wiki/Peregrine%20SKOS%20CLI
In lvg.properties, set:
LVG_DIR=/home/path/to/lvg2013lite/
This file can be found at:
/home/path/to/lvg2013lite/data/config/lvg.properties
Copy the properties file from production.properties, which can be obtained from: https://trac.nbic.nl/biosemantics/wiki/Peregrine%20SKOS%20CLI
In the copied file, change the normalizer.lvg.properties and normalizer.lvg.binaryCache properties to point to the LVG installation path.
Have the Peregrine indexer installed locally and enter its path in the component named "Indexer_tool", which can be found in the nested workflow named Peregrine indexer.
On a Windows machine, install Cygwin and add it to the PATH environment variable.
In Taverna, go to File --> Preferences --> Tool invocation. Set Modify on to explicit locations, edit the default local location so that Shell is C:\Windows\system32\cmd.exe /c, and save. Then set Modify on to symbolic locations, choose edit, select the default local location, and save.
On Windows, set the JAVA_HOME environment variable (here C:\Program Files\jdk 1.7.0).
The nested workflow Get_abstract uses an SQL command to retrieve abstracts and presumes that the mydb database is used. Change this command when using another SQL database. |
Dependencies (0)
Inputs (1)
Name |
Description |
db_properties |
Load a file, as in the example, that states the user, password, url (the database to use), and driver.
|
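The example file itself is not reproduced on this page. As a rough illustration only, the Python sketch below parses a key=value properties text and checks that the four entries named above are present. The exact key spelling, the sample values, and the function names are assumptions made for this sketch, not taken from the workflow.

```python
# Hypothetical sketch: read a db_properties text and check its required keys.
# The key names follow the description above; their exact spelling is assumed.
REQUIRED_KEYS = ("user", "password", "url", "driver")


def parse_db_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        # Split on the first "=" only, so a URL may itself contain "=".
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_keys(props):
    """Return the required keys that are absent from the parsed properties."""
    return [key for key in REQUIRED_KEYS if key not in props]


# Sample content with placeholder values (assumed, not from the workflow).
EXAMPLE = """
# database settings
user=taverna
password=secret
url=jdbc:mysql://localhost:3306/mydb
driver=com.mysql.jdbc.Driver
"""
```

Calling missing_keys(parse_db_properties(EXAMPLE)) returns an empty list when all four entries are present, which makes it easy to check a file before starting the workflow.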
Processors (3)
Name |
Type |
Description |
Calculate Uncertainty Coefficient |
workflow |
A nested workflow that calculates the uncertainty coefficient from the contingency values of the concept pairs.
The calculated values are then inserted into a SQL table.
|
Calculate Contincengy table |
workflow |
A nested workflow that retrieves the values of a 2x2 contingency table and stores the co-occurrence of the generated concept pairs in the SQL table TableName_co_occurence.
The concept pairs are created from the concepts retrieved from the SQL table TableName_occurence.
The contingency table is a matrix with four cells: both concepts co-occur, concept A occurs without concept B,
concept B occurs without concept A, and neither concept occurs in the literature.
          B      Not B
A         AB     A
not A     B      not A not B
This is calculated for every possible concept pair. |
table_name |
stringconstant |
Value: alpha |
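The pairing and counting steps described for the contingency table processor can be sketched in Python as follows. This is a simplified in-memory illustration, not the workflow's SQL or Beanshell code: the layout of TableName_occurence is not given on this page, so the sketch assumes a mapping from a document id to the set of concepts found in that document, and it assumes that pairs are unordered. All names are hypothetical.

```python
from itertools import combinations


def make_concept_pairs(concepts):
    """Return every unordered pair of distinct concepts, in sorted order."""
    return list(combinations(sorted(set(concepts)), 2))


def contingency_table(occurrences, concept_a, concept_b):
    """Count documents containing both concepts, only A, only B, and neither.

    `occurrences` maps a document id to the set of concepts found in it
    (an assumed stand-in for the occurrence table).
    """
    both = only_a = only_b = neither = 0
    for concepts in occurrences.values():
        has_a = concept_a in concepts
        has_b = concept_b in concepts
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
        else:
            neither += 1
    return both, only_a, only_b, neither
```

The four returned counts correspond to the cells AB, A, B, and not A not B of the matrix above, and the function would be called once for each pair returned by make_concept_pairs.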
Beanshells (6)
Name |
Description |
Inputs |
Outputs |
Beanshell_SQL_INSERT_UC_values_INTO_table |
A Beanshell script that generates an SQL statement to insert the calculated uncertainty coefficient of each concept pair into the table. |
timestamp
concept_pairs
UC_value
|
output_SQL
|
Beanshell_calculate_Uncertainty_Coefficient |
A Beanshell script that calculates the Uncertainty coefficient based on the script of Herman van Haagen. |
corrected_contingency_values
|
UC_value
|
Beanshell_SQL_generate_contingency_table |
A Beanshell script that generates an SQL statement to retrieve how often two concepts co-occur, concept A occurs without concept B, concept B occurs without concept A, and neither concept occurs in the literature. The retrieved values for concept A only and concept B only still include the co-occurrences, so these must be subtracted. This correction is done in the Beanshell script called "Beanshell_correct_contincengy_table". |
concept_pairs
table_name
|
out1
|
Beanshell_correct_contincengy_table |
A Beanshell script that corrects the contingency table retrieved by the SQL query by subtracting the co-occurrence value from the concept-A-only and concept-B-only values. |
contingency_values
|
corrected_contingency_values
|
Beanshell_generate_SQL_statement_Get_destinct_concept |
A Beanshell script that generates an SQL statement to retrieve every distinct concept from the SQL table TableName_occurence. |
table_name
|
out1
|
Beanshell_make_concept_pairs |
A Beanshell script that takes a list of concepts as input and creates concept pairs from it. |
concept_B
concept_A
|
concept_pair
|
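The correction step and the uncertainty coefficient calculation listed above can be sketched in Python as follows. This is not the workflow's Beanshell code, nor the script of Herman van Haagen, neither of which is shown on this page. The sketch assumes the symmetric form of the uncertainty coefficient, U = 2 I(A;B) / (H(A) + H(B)); the workflow may use a different variant. The function names are hypothetical.

```python
import math


def correct_contingency(both, total_a, total_b, neither):
    """Subtract the co-occurrences from the raw A and B counts.

    The raw counts for A and B still include documents where both occur,
    so removing `both` leaves the A-only and B-only cells.
    """
    return both, total_a - both, total_b - both, neither


def entropy(probabilities):
    """Shannon entropy (natural logarithm), ignoring zero probabilities."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def uncertainty_coefficient(both, only_a, only_b, neither):
    """Symmetric uncertainty coefficient of a 2x2 table (assumed variant)."""
    n = both + only_a + only_b + neither
    if n == 0:
        return 0.0
    joint = [both / n, only_a / n, only_b / n, neither / n]
    marginal_a = [(both + only_a) / n, (only_b + neither) / n]
    marginal_b = [(both + only_b) / n, (only_a + neither) / n]
    h_a, h_b, h_ab = entropy(marginal_a), entropy(marginal_b), entropy(joint)
    if h_a + h_b == 0:
        # One outcome has probability 1 for both concepts: nothing to predict.
        return 0.0
    mutual_information = h_a + h_b - h_ab
    return 2.0 * mutual_information / (h_a + h_b)
```

Under this definition the value is 0 when the two concepts occur independently and 1 when the occurrence of one fully determines the other, so a typical call is uncertainty_coefficient(*correct_contingency(both, total_a, total_b, neither)).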
Datalinks (7)
Source | Sink
Calculate Contincengy table:outputlist | Calculate Uncertainty Coefficient:concept_pairs
Calculate Contincengy table:corrected_contingency_values | Calculate Uncertainty Coefficient:corrected_contingency_values
db_properties | Calculate Uncertainty Coefficient:propertyString
table_name:value | Calculate Uncertainty Coefficient:timestamp
db_properties | Calculate Contincengy table:propertyString
table_name:value | Calculate Contincengy table:table_name
Calculate Uncertainty Coefficient:output | end
Version 1 (earliest) of 2