compare_pubmed_results_geographically
Created: 2015-05-05 12:35:42
Last updated: 2015-05-05 12:44:14
This workflow analyzes scientific output, as recorded in PubMed, by geography. It takes as input PubMed data in XML and the ISO 3166-1 and ISO 3166-3 country lists. The XML file can contain any subset of the results of a specific PubMed search.
An XPath component extracts the author affiliations and feeds them to a series of Beanshell components that match each affiliation to a country in the ISO standard. The resulting counts are then passed to an Rshell component, which uses the rworldmap R package to plot the affiliation counts on a world map.
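The extraction step can be illustrated outside Taverna. The following is a minimal sketch, assuming the PubMed XML is available as a string and using the standard `javax.xml.xpath` API; the class name `AffiliationExtractor` and the sample record are hypothetical, and only the XPath expression is taken from the workflow.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the workflow itself.
final class AffiliationExtractor {
    // Same expression as the get_authors_affiliations processor.
    static final String AFFILIATION_PATH =
        "/PubmedArticleSet/PubmedArticle/MedlineCitation/Article"
        + "/AuthorList/Author/AffiliationInfo/Affiliation";

    // Returns the text of every Affiliation element, in document order.
    static List<String> extract(String pubmedXml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            AFFILIATION_PATH,
            new InputSource(new StringReader(pubmedXml)),
            XPathConstants.NODESET);
        List<String> affiliations = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            affiliations.add(nodes.item(i).getTextContent().trim());
        }
        return affiliations;
    }

    public static void main(String[] args) throws Exception {
        // Made-up record with a single author, for illustration only.
        String xml = "<PubmedArticleSet><PubmedArticle><MedlineCitation><Article>"
            + "<AuthorList><Author><AffiliationInfo>"
            + "<Affiliation>Example Institute, Leiden, The Netherlands.</Affiliation>"
            + "</AffiliationInfo></Author></AuthorList>"
            + "</Article></MedlineCitation></PubmedArticle></PubmedArticleSet>";
        for (String affiliation : extract(xml)) {
            System.out.println(affiliation);
        }
    }
}
```

Real PubMed exports usually begin with a DOCTYPE declaration, so a production parser may need to be configured for DTD handling; the sketch ignores this.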
Workflow Components
Authors (1)
Magnus Palmblad and Arzu Tugce Guler
Leiden University Medical Center
2015 |
Dependencies (0)
Inputs (3)
Name |
Description |
pubmed_output |
|
countries |
|
former_countries |
|
Processors (6)
Name |
Type |
Description |
get_authors_affiliations |
xpath |
XPath Expression: /PubmedArticleSet/PubmedArticle/MedlineCitation/Article/AuthorList/Author/AffiliationInfo/Affiliation |
count_affiliations_by_country |
beanshell |
Script:
/*
Inputs: country_code (1)
country_name (1)
affiliations (1)
Outputs: country_list (1)
count_list (1)
*/
country_list = new ArrayList(); //list of all countries in 3 letter codes
count_list = new ArrayList(); //list of counts, corresponding to countries
matched_countries = new ArrayList(); //countries are matched for a single affiliation
string_index = new ArrayList(); //keeps indexes of matched countries for each affiliation
string_length = new ArrayList(); //keeps the string length of matched countries for each affiliation
country_matches = new ArrayList(); //matched countries after index filter (which picks the last) for each affiliation
int index1,index2;
int length1,length2;
//key->3 letter country code, value->affiliation count
map = new Hashtable();
//creates unified country list with default 0s
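for(int i=0; i<country_code.size(); i++){
map.put(country_code.get(i), 0);
}
for(int i=0; i<affiliations.size(); i++){
for(int j=0; j<country_name.size(); j++){
si = affiliations.get(i).indexOf(country_name.get(j));
if(si>=0){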
//add the corresponding 3-letter country code to matched_countries list of that affiliation
matched_countries.add(country_code.get(j));
//add the index of the country_name to string_index list of that affiliation
string_index.add(si);
//add the length of the matched country_name to string_length list of that affiliation
string_length.add(country_name.get(j).length());
}
}
try{
//get the first element of matched_countries (Exception if the list is empty)
country_matches.add(matched_countries.get(0));
//get the index of the first country_name that is appearing in the affiliation
index1 = string_index.get(0);
//set country the value which has the largest index (the country_name match that appears near the end)
for(int k=1;k<matched_countries.size();k++){
index2 = string_index.get(k);
if(index2>index1){
country_matches.clear();
country_matches.add(matched_countries.get(k));
}
else if(index2==index1){
country_matches.add(matched_countries.get(k));
}
index1=index2;
}
//get the first element of country_matches (Exception if the list is empty)
country = country_matches.get(0);
//get the length of the first country_name that is appearing in the affiliation
length1 = string_length.get(0);
//set country the value which is the longest match (US vs USSR)
for(int k=1;k<country_matches.size();k++){
length2 = string_length.get(k);
if(length2>length1){
country = country_matches.get(k);
}
length1=length2;
}
//increase the count for the country which is matched for the affiliation
temp1=map.get(country)+1;
map.put(country,temp1);
}
catch (Exception e){
//The affiliation is not matched with any of the countries
}
//Reset the arraylists for the next iteration of affiliation
matched_countries.clear();
string_index.clear();
string_length.clear();
country_matches.clear();
}
//get the counts for each country and copy them to the output arraylist
for(int i=0; i |
import_country_list |
spreadsheet |
|
map_former_countries |
beanshell |
Script:
/*
Inputs: meta_country_list (1)
meta_count_list (1)
former_countries_list (1)
current_countries_list (1)
Outputs: country_list (1)
count_list (1)
*/
country_list = new ArrayList();
count_list = new ArrayList();
successor_countries = new ArrayList();
//creates hashtable with country codes as keys and counts as values
country_count = new Hashtable();
for(int i=0; i |
import_former_countries_list |
spreadsheet |
|
draw_world_map |
rshell |
Script:
#Inputs: countries (String vector)
# counts (Integer vector)
#
#Outputs: output (String)
library(rworldmap)
affiliation<-as.data.frame(matrix(nrow=length(countries), ncol=2));
affiliation[,1]<-data.matrix(countries);
affiliation[,2]<-data.matrix(counts);
total<-sum(counts);
sPDF<-joinCountryData2Map(affiliation, joinCode="ISO3", nameJoinColumn="V1");
output<-"output.png";
png(output);
par(mai=c(0,0,0.2,0),xaxs="i",yaxs="i");
mapCountryData(sPDF, nameColumnToPlot="V2", catMethod = exp(seq(from=0, to=log(max(affiliation[,2])), length.out=100)), colourPalette = "heat", addLegend = TRUE, borderCol ='black', mapTitle ='Publications on mass spectrometry in PubMed');
dev.off();
R Server: localhost:6311 |
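The matching rule described in the comments of count_affiliations_by_country (prefer the country name that appears latest in the affiliation string, and among names starting at the same position prefer the longest) can be sketched in plain Java. This is an illustrative rewrite under those stated assumptions, not the workflow's own script: the class `CountryMatcher`, its method names and the use of `lastIndexOf` are all hypothetical choices.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating the matching heuristic.
final class CountryMatcher {
    // Returns the code of the best-matching country, or null if no name occurs.
    static String match(String affiliation, List<String> codes, List<String> names) {
        String bestCode = null;
        int bestIndex = -1;   // position of the best match so far
        int bestLength = -1;  // length of the best-matching name so far
        for (int j = 0; j < names.size(); j++) {
            String name = names.get(j);
            int index = affiliation.lastIndexOf(name);
            if (index < 0) {
                continue; // this country name does not occur
            }
            // Later position wins; at equal position the longer name wins.
            if (index > bestIndex || (index == bestIndex && name.length() > bestLength)) {
                bestCode = codes.get(j);
                bestIndex = index;
                bestLength = name.length();
            }
        }
        return bestCode;
    }

    // Counts affiliations per country code; every code starts at zero.
    static Map<String, Integer> count(List<String> affiliations,
                                      List<String> codes, List<String> names) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String code : codes) {
            counts.put(code, 0);
        }
        for (String affiliation : affiliations) {
            String code = match(affiliation, codes, names);
            if (code != null) {
                counts.merge(code, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Preferring the latest match reflects the convention that the country is usually written at the end of an affiliation; the length tie-break keeps a short name from winning when it is a prefix of a longer one.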
Beanshells (2)
Name |
Description |
Inputs |
Outputs |
count_affiliations_by_country |
|
affiliations
country_name
country_code
|
country_list
count_list
|
map_former_countries |
|
meta_count_list
meta_country_list
former_countries_list
current_countries_list
|
count_list
country_list
|
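The inputs of map_former_countries suggest that counts recorded under former country codes are reassigned to current ones. A minimal sketch of that idea follows, assuming each former code maps to exactly one current code; the class `FormerCountryFolder`, the `successorOf` map and that one-to-one assumption are hypothetical and may differ from what the workflow's script actually does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: folds counts of former countries into their successors.
final class FormerCountryFolder {
    // counts: country code -> affiliation count
    // successorOf: former country code -> current country code (assumed one-to-one)
    static Map<String, Integer> fold(Map<String, Integer> counts,
                                     Map<String, String> successorOf) {
        Map<String, Integer> folded = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // Codes without a successor entry are kept unchanged.
            String code = successorOf.getOrDefault(entry.getKey(), entry.getKey());
            folded.merge(code, entry.getValue(), Integer::sum);
        }
        return folded;
    }
}
```

A former country with several successors would need an explicit policy (for example, assigning its counts to a single designated successor), which this sketch leaves out.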
Outputs (1)
Name |
Description |
world_map |
|
Datalinks (13)
Source |
Sink |
pubmed_output |
get_authors_affiliations:xml_text |
get_authors_affiliations:nodelist |
count_affiliations_by_country:affiliations |
import_country_list:A |
count_affiliations_by_country:country_code |
import_country_list:B |
count_affiliations_by_country:country_name |
countries |
import_country_list:fileurl |
count_affiliations_by_country:count_list |
map_former_countries:meta_count_list |
count_affiliations_by_country:country_list |
map_former_countries:meta_country_list |
import_former_countries_list:A |
map_former_countries:former_countries_list |
import_former_countries_list:B |
map_former_countries:current_countries_list |
former_countries |
import_former_countries_list:fileurl |
map_former_countries:count_list |
draw_world_map:counts |
map_former_countries:country_list |
draw_world_map:countries |
draw_world_map:output |
world_map |
Version 1
(of 1)