Ecological niche modelling workflow

Created: 2013-01-07 18:29:13
Last updated: 2015-06-11 13:32:35

This workflow takes a file of species occurrence points as input and creates a model with the openModeller Web Service. The algorithm, environmental layers and mask are selected interactively while the workflow runs. The model is tested (an internal test plus an optional external 10-fold cross validation, which reports the mean AUC) and projected one or more times. All points in the input file are used to create a single model, so make sure that the records refer to the same species unless you are interested in some sort of multi-species model.

For more information about the input file format, see the documentation of the input_points parameter. The default occurrence points are for Gammarus tigrinus, an aquatic species, so if you use them you need to choose marine environmental layers during the modelling procedure.

Workflow requirements: When running on the Taverna Workbench, this workflow requires an Internet connection and the Taverna interaction plugin to be installed.

More information and documentation about this workflow can be found here: https://wiki.biovel.eu/display/doc/Ecological+Niche+Modelling+Workflow

Run

Run this Workflow in the Taverna Workbench...

Option 1:

Copy and paste this link into File > 'Open workflow location...'
http://www.myexperiment.org/workflows/3355/download?version=9

Workflow Components

Authors (1)

Titles (1)

Descriptions (1)

Dependencies (0)

Inputs (1)

Name	Description
input_points	This input takes a text file containing species occurrence points. The file must be formatted as follows: each line corresponds to a different record, values are separated by commas, and the first line must be a header (also comma-separated) containing the column names. The following column names are mandatory to run this workflow: occurrenceID, nameComplete, decimalLongitude and decimalLatitude. All records will be used to generate a single model regardless of the species name.
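
The format rules above can be checked before the file is handed to the workflow. The following is a minimal Python sketch, not the workflow's own parsing code: the function name parse_occurrence_csv, the returned record keys and the sample coordinates are all made up for illustration. Stripping a leading byte order mark is likewise an assumption about what a tolerant reader might do.

```python
import csv
import io

# Mandatory column names, as listed in the input_points description.
REQUIRED_COLUMNS = (
    "occurrenceID",
    "nameComplete",
    "decimalLongitude",
    "decimalLatitude",
)


def parse_occurrence_csv(text):
    """Return (column_indexes, records) for comma-separated occurrence text.

    Hypothetical helper: it only mirrors the format rules described above.
    """
    # Drop a leading UTF-8 byte order mark, if present (assumption).
    text = text.lstrip("\ufeff")
    # Skip completely empty lines.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        raise ValueError("input is empty; a header line is required")
    header = [name.strip() for name in rows[0]]
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError("missing mandatory columns: " + ", ".join(missing))
    indexes = {name: header.index(name) for name in REQUIRED_COLUMNS}
    records = []
    for row in rows[1:]:
        records.append({
            "id": row[indexes["occurrenceID"]].strip(),
            "name": row[indexes["nameComplete"]].strip(),
            "longitude": float(row[indexes["decimalLongitude"]]),
            "latitude": float(row[indexes["decimalLatitude"]]),
        })
    return indexes, records


# Invented sample data, for illustration only.
sample = (
    "occurrenceID,nameComplete,decimalLongitude,decimalLatitude\n"
    "1,Gammarus tigrinus,4.5,52.1\n"
    "2,Gammarus tigrinus,-1.25,50.75\n"
)
indexes, records = parse_occurrence_csv(sample)
print(indexes["decimalLatitude"], len(records))
```

Because the header is looked up by name, extra columns and a different column order are accepted in this sketch; only the four mandatory names must be present.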

Processors (25)

Name	Type	Description
create_model	workflow	This part is responsible for creating the model.
select_algorithm	workflow	This part displays interfaces so that users can select algorithm and parameter values.
internal_test	workflow	This part is responsible for performing the internal test.
parse_input_points	workflow	This part is responsible for parsing the input points, determining column indexes and returning the records as a single string (same original format) without the header.
make_configuration_of_create_model	workflow	This part creates the XML configuration for model creation.
select_layers	workflow	This part displays an interface so that users can select layers for model creation.
make_configuration_of_test_model	workflow	This part creates the XML configuration for model testing (internal test).
extract_model_algorithm	workflow	This part extracts the resulting model in XML.
show_test_results	workflow	This part is responsible for showing the results of the internal test.
allocate_points	workflow	This part is responsible for transforming the CSV lines into a list of XML points. It returns a list of all points as well as two lists with 10 elements containing training and testing points to be used in 10-fold cross validation.
run_cross_validation	workflow	This part is responsible for performing 10-fold cross validation.
extract_aucs	workflow	This part extracts the AUC value from all test results.
Flatten_AUC_List	localworker	Script flatten(inputs, outputs, depth) { for (i = inputs.iterator(); i.hasNext();) { element = i.next(); if (element instanceof Collection && depth > 0) { flatten(element, outputs, depth - 1); } else { outputs.add(element); } } } outputlist = new ArrayList(); flatten(inputlist, outputlist, 1);
calculate_mean	workflow	This part calculates the mean AUC value from the cross validation.
retrieve_algorithms	workflow	This part is responsible for retrieving all available algorithms from the niche modelling service.
retrieve_layers	workflow	This part is responsible for retrieving all available layers from the niche modelling service and from the GeoServer repository at Fraunhofer.
input_mask_selection	workflow	This part displays an interface so that users can select or create a mask for model creation.
update_biostif_layers	beanshell	Script ArrayList new_biostif_layers_xml_list = new ArrayList(); if (created.equals("0")) { new_biostif_layers_xml_list = biostif_layers_xml_list; } else { String[] parts = mask_id.split(">"); String url = parts[1]; String layer_id = parts[2]; String[] subparts = layer_id.split(":"); workspace = subparts[0]; layer_name = subparts[1]; String xml_piece = ""+layer_id+""+url+""; for (int i = 0; i < biostif_layers_xml_list.size(); ++i) { xml = biostif_layers_xml_list.get(i); if (xml.contains(""+workspace+":")) { xml = xml.replaceFirst("", xml_piece); } new_biostif_layers_xml_list.add(xml); } }
run_projection	workflow	This part is responsible for running one or more projections, including visualization of the result.
upload_csv_data	workflow	This part is responsible for uploading the species occurrence points to the BioSTIF service.
show_projections	workflow	This part is responsible for displaying the projections in BioSTIF.
Merge_String_List_to_a_String	localworker	Script String seperatorString = "\n"; if (seperator != void) { seperatorString = seperator; } StringBuffer sb = new StringBuffer(); for (Iterator i = stringlist.iterator(); i.hasNext();) { String item = (String) i.next(); sb.append(item); if (i.hasNext()) { sb.append(seperatorString); } } concatenated = sb.toString();
comma	stringconstant	Value ,
empty_list	beanshell	Script ArrayList empty_list = new ArrayList();
decide_cross_validation	workflow	This part is responsible for asking whether the optional cross validation should be run.
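
The allocate_points processor above is described as returning two lists of 10 elements with training and testing points for 10-fold cross validation. The Python sketch below shows one way such a split could be made. The round-robin assignment rule and the function name allocate_folds are assumptions for illustration; the listing does not say how the workflow actually assigns points to folds.

```python
def allocate_folds(points, folds=10):
    """Split points into `folds` pairs of (training, testing) lists.

    Assumed rule: point i is tested in fold i % folds and is used for
    training in every other fold.
    """
    if folds < 2:
        raise ValueError("at least two folds are needed")
    if len(points) < folds:
        raise ValueError("need at least one point per fold")
    training, testing = [], []
    for k in range(folds):
        testing.append([p for i, p in enumerate(points) if i % folds == k])
        training.append([p for i, p in enumerate(points) if i % folds != k])
    return training, testing


# 25 placeholder points stand in for parsed occurrence records.
points = ["p%d" % i for i in range(25)]
training, testing = allocate_folds(points)
print(len(testing[0]), len(training[0]))  # prints: 3 22
```

Each point lands in exactly one testing list, so every record is used once for an external test and nine times for training.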

Beanshells (25)

Name	Inputs	Outputs
update_biostif_layers	biostif_layers_xml_list created mask_id	new_biostif_layers_xml_list
empty_list		empty_list
create_algorithm_xml	algorithm_id algorithm_version parameter_names parameter_values	createmodel_algorithm
create_xpath_to_get_algorithm	in1	out1
check_parameters	xml_parameter_list	has_parameters no_parameters
parse_header	csv_content	name_idx id_idx long_idx lat_idx
extract_taxon_points	csv_content name_idx taxon_name	taxon_points
get_first_taxon	csv_content name_idx	taxon_name
check_service_ok	status	status_failed status_ok
flatten_outputs	stif_layerdescription png_url raster_layername	out_stif_layerdescription out_pngurl out_layername
trimRESTurlResult	url	resultUrl
checkDataUpload	status	dataUpload_ok dataUpload_failed uploadStatus
append_values	projection_log in_logs img_url in_urls area_statistics in_statistics stif_layer in_stif_layers	out_logs out_urls out_statistics out_stif_layers
mask_was_selected	selected_mask	yes no
extract_first_element_1	inlist	first_element
extract_first_element_2	inlist	first_element
parse_json_raster_shim_results	json	layername wmsurl pngurl error serverurl formats resolution boundingbox srs nativeFormat
mask_was_selected	selected_mask	yes no
extract_first_element_1	inlist	first_element
extract_first_element_2	inlist	first_element
allocate_points	all_points folds	training_points testing_points
csv_to_xml_list	csv_points id_idx long_idx lat_idx	all_points num_points
calculate_mean	auc_list	mean_auc
format_string	input	output
clear_list	input_list	empty_list

Outputs (16)

Name	Description
serialized_final_model	The serialization of the model that was created with all points. This is an XML content specific to openModeller.
create_final_model_log	The log from creating the final model. This is only output for information.
internal_test_model_log	The log from testing the model. This is only output for information.
internal_test_model_statistics	An XML document containing statistics for the result of the testing of the model.
area_statistics	List of projection statistics as XML content returned from openModeller.
img_result	List of projected models as IMG content (Erdas Imagine).
project_model_output_log	List of logs for each model projection.
BioSTIF_output	Output from the BioSTIF interface.
mean_auc	Mean AUC value of the external tests performed during cross validation.
external_auc_list	List of AUCs that resulted from the cross validation.
xval_create_model_requests	A list of create model requests prepared during cross validation. One for each replicate. This output will likely be removed from future versions of the workflow.
xval_create_model_log	A list of logs for each model creation during cross validation. One log for each replicate. This output is only expected to be used to give more details of an eventual problem during cross validation.
xval_serialized_model	A list of serialized models (XML content specific to openModeller) created as part of the cross validation. One model for each replicate. This output is only expected to be used to give more details of an eventual problem during cross validation.
xval_test_model_log	A list of logs for each external test during cross validation. One log for each replicate. This output is only expected to be used to give more details of an eventual problem during cross validation.
xval_test_model_statistics	A list of test results (XML content specific to openModeller) created as part of the cross validation. One result for each replicate. This output is mainly expected to be used to give more details of an eventual problem during cross validation.
answer	This is here just for flow control.
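
The mean_auc and external_auc_list outputs above come from flattening the per-fold AUC lists by one level and averaging the result. A rough Python equivalent is sketched below: flatten mirrors the logic of the Flatten_AUC_List script shown in the Processors table, while mean_auc and the sample AUC values are invented for illustration.

```python
def flatten(items, depth=1):
    """Flatten nested lists by at most `depth` levels."""
    out = []
    for item in items:
        if isinstance(item, (list, tuple)) and depth > 0:
            out.extend(flatten(item, depth - 1))
        else:
            out.append(item)
    return out


def mean_auc(auc_values):
    """Arithmetic mean of AUC values given as numbers or numeric strings."""
    values = [float(v) for v in auc_values]
    if not values:
        raise ValueError("no AUC values to average")
    return sum(values) / len(values)


# Invented per-fold results, one single-element list per fold.
per_fold = [["0.5"], ["0.75"], ["1.0"]]
external_auc_list = flatten(per_fold, depth=1)
print(external_auc_list, mean_auc(external_auc_list))
```

Values are converted with float because workflow ports commonly carry strings; that conversion is an assumption of this sketch.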

Datalinks (70)

Source	Sink
make_configuration_of_create_model:outputString	create_model:createModel_configuration
retrieve_algorithms:algorithms_xml	select_algorithm:algorithms_xml
make_configuration_of_test_model:outputString	internal_test:testModel_configuration
input_points	parse_input_points:csv_content
select_algorithm:algorithm	make_configuration_of_create_model:algorithm_xml
select_layers:selected_layers_ids	make_configuration_of_create_model:layers
allocate_points:all_points	make_configuration_of_create_model:points
input_mask_selection:mask_id	make_configuration_of_create_model:mask_id
retrieve_layers:om_layers_xml	select_layers:om_layers_xml
retrieve_layers:biostif_layers_xml_list	select_layers:biostif_layers_xml_list
select_layers:selected_layers_ids	make_configuration_of_test_model:layers
extract_model_algorithm:model_algorithm	make_configuration_of_test_model:algorithm_xml
allocate_points:all_points	make_configuration_of_test_model:points
input_mask_selection:mask_id	make_configuration_of_test_model:mask_id
create_model:serialized_model	extract_model_algorithm:serialized_model
internal_test:test_statistics	show_test_results:statistics
parse_input_points:id_idx	allocate_points:id_idx
parse_input_points:long_idx	allocate_points:long_idx
parse_input_points:lat_idx	allocate_points:lat_idx
parse_input_points:taxon_points	allocate_points:csv_points
select_layers:selected_layers_ids	run_cross_validation:layers_str
allocate_points:training_points	run_cross_validation:testing_points
allocate_points:testing_points	run_cross_validation:training_points
select_algorithm:algorithm	run_cross_validation:algorithm_xml
decide_cross_validation:answer	run_cross_validation:sentinel
run_cross_validation:xval_test_model_statistics	extract_aucs:test_statistics_xml
extract_aucs:auc	Flatten_AUC_List:inputlist
Flatten_AUC_List:outputlist	calculate_mean:auc_list
select_layers:selected_layers_labels	input_mask_selection:selected_layers_labels
select_layers:selected_layers_ids	input_mask_selection:selected_layers_ids
retrieve_layers:om_layers_xml	input_mask_selection:om_layers_xml
retrieve_layers:biostif_layers_xml_list	input_mask_selection:biostif_layers_xml_list
upload_csv_data:csvDataURI	input_mask_selection:csvDataURI
retrieve_layers:biostif_layers_xml_list	update_biostif_layers:biostif_layers_xml_list
input_mask_selection:created	update_biostif_layers:created
input_mask_selection:mask_id	update_biostif_layers:mask_id
extract_model_algorithm:model_algorithm	run_projection:algorithm_xml
show_test_results:answer	run_projection:sentinel
select_layers:selected_layers_ids	run_projection:model_layers_ids
select_layers:selected_layers_labels	run_projection:model_layers_labels
input_mask_selection:mask_id	run_projection:model_mask_id
update_biostif_layers:new_biostif_layers_xml_list	run_projection:biostif_layers_xml_list
retrieve_layers:om_layers_xml	run_projection:om_layers_xml
parse_input_points:first_taxon_name	run_projection:default_label
empty_list:empty_list	run_projection:img_result
empty_list:empty_list	run_projection:area_statistics
empty_list:empty_list	run_projection:STIF_layerdescription
empty_list:empty_list	run_projection:output_log
input_points	upload_csv_data:csvDataContent
upload_csv_data:csvDataURI	show_projections:csvDataURI
Merge_String_List_to_a_String:concatenated	show_projections:user_layer_definition
run_projection:STIF_layerdescription	Merge_String_List_to_a_String:stringlist
comma:value	Merge_String_List_to_a_String:seperator
show_test_results:answer	decide_cross_validation:sentinel
create_model:serialized_model	serialized_final_model
create_model:output_log	create_final_model_log
internal_test:output_log	internal_test_model_log
internal_test:test_statistics	internal_test_model_statistics
run_projection:area_statistics	area_statistics
run_projection:img_result	img_result
run_projection:output_log	project_model_output_log
show_projections:csvResultData	BioSTIF_output
calculate_mean:mean_auc	mean_auc
Flatten_AUC_List:outputlist	external_auc_list
run_cross_validation:xval_create_model_requests	xval_create_model_requests
run_cross_validation:xval_create_model_log	xval_create_model_log
run_cross_validation:xval_serialized_model	xval_serialized_model
run_cross_validation:xval_test_model_log	xval_test_model_log
run_cross_validation:xval_test_model_statistics	xval_test_model_statistics
run_projection:answer	answer

Coordinations (6)

Controller	Target
run_cross_validation	extract_aucs
decide_cross_validation	run_projection
parse_input_points	select_algorithm
select_algorithm	select_layers
select_layers	input_mask_selection
run_projection	show_projections
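
Each Controller/Target pair above can be read as "the target starts only after the controller has finished". Under that reading, a valid run order for the processors named in the table can be derived with a topological sort. This Python sketch uses the standard graphlib module (Python 3.9 or later); the function name execution_order is hypothetical, and the datalinks, which also constrain ordering, are left out.

```python
from graphlib import TopologicalSorter

# Controller/Target pairs copied from the Coordinations table.
COORDINATIONS = [
    ("run_cross_validation", "extract_aucs"),
    ("decide_cross_validation", "run_projection"),
    ("parse_input_points", "select_algorithm"),
    ("select_algorithm", "select_layers"),
    ("select_layers", "input_mask_selection"),
    ("run_projection", "show_projections"),
]


def execution_order(pairs):
    """Return one processor order that respects every control link."""
    sorter = TopologicalSorter()
    for controller, target in pairs:
        # The target depends on its controller.
        sorter.add(target, controller)
    return list(sorter.static_order())


order = execution_order(COORDINATIONS)
for name in order:
    print(name)
```

Several orders satisfy the same constraints, so the printed sequence is only one of the valid possibilities.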

Workflow Type

Taverna 2

Uploader

Renato De Giovanni

License

All versions of this Workflow are licensed under:

Version 9 (of 28)

Credits (6)

Attributions (1)

Private item

Tags (27)

Shared with Groups (2)

Featured In Packs (2)

Attributed By (1)

Private item

Favourited By (6)

Statistics

22099 viewings

3821 downloads

Citations (0)

None

Version History

In chronological order:

Generic ENM workflow with interaction

Created by Renato De Giovanni on Monday 07 January 2013 18:29:08 (UTC)
Generic ENM workflow with interaction

Created by Renato De Giovanni on Wednesday 09 January 2013 16:41:20 (UTC)
Generic ENM workflow with interaction

Created by Renato De Giovanni on Thursday 10 January 2013 01:31:11 (UTC)
Generic ENM workflow with interaction

Created by Renato De Giovanni on Thursday 10 January 2013 17:31:05 (UTC)
Generic ENM workflow with interaction

Created by Renato De Giovanni on Monday 14 January 2013 17:35:43 (UTC)
Generic ENM workflow with interaction

Created by Renato De Giovanni on Thursday 17 January 2013 16:20:07 (UTC)
Generic ENM workflow with interaction

Created by Renato De Giovanni on Monday 28 January 2013 11:16:58 (UTC)
Generic ENM workflow with interaction

Created by Renato De Giovanni on Friday 01 February 2013 13:49:24 (UTC)
Generic ENM workflow with interaction

Created by Renato De Giovanni on Thursday 14 February 2013 16:33:33 (UTC)
Generic ENM workflow with interaction

Created by Renato De Giovanni on Thursday 18 April 2013 20:22:32 (UTC)

Revision comment:

When a model is created using a BioSTIF layer, now all BioSTIF layers appear as options when selecting the corresponding projection layer.
Generic ENM workflow with interaction

Created by Renato De Giovanni on Wednesday 15 May 2013 23:01:04 (UTC)

Revision comment:

Models are now projected as GeoTiff. Mask creation interface now checks if the mask was created or not by the user before proceeding.
Generic ENM workflow with interaction

Created by Renato De Giovanni on Thursday 16 May 2013 15:19:56 (UTC)

Revision comment:

Bugfix: cross-validation now uses the specified input mask for model creation and testing (it was previously using always the first layer selected for model creation).
Generic ENM workflow with interaction

Created by Renato De Giovanni on Monday 24 June 2013 14:48:23 (UTC)

Revision comment:

Changed raster style parameter for proper display of GeoTiff maps in BioSTIF and improved workflow documentation.
Generic ENM workflow with interaction

Created by Renato De Giovanni on Friday 20 September 2013 19:56:02 (UTC)

Revision comment:

Included output containing the full BioSTIF link.
Generic ENM workflow with interaction

Created by Renato De Giovanni on Friday 04 October 2013 08:43:51 (UTC)

Revision comment:

Compatibility with the new openModeller service version.
Ecological niche modelling workflow

Created by Renato De Giovanni on Tuesday 08 October 2013 17:21:27 (UTC)

Revision comment:

Renamed workflow and revised documentation.
Ecological niche modelling workflow

Created by Renato De Giovanni on Wednesday 09 October 2013 16:52:25 (UTC)

Revision comment:

Replaced all XML input splitters for getProgress calls to fix a bug introduced in version 15.
Ecological niche modelling workflow

Created by Renato De Giovanni on Wednesday 23 October 2013 22:03:05 (UTC)

Revision comment:

No new features or bugfixes, but several changes were made in preparation for creating workflow components in future versions.
Ecological niche modelling workflow

Created by Renato De Giovanni on Friday 29 November 2013 11:09:05 (UTC)

Revision comment:

Number of replicates in cross-validation can be specified now.
Ecological niche modelling workflow

Created by Renato De Giovanni on Monday 02 December 2013 16:49:39 (UTC)

Revision comment:

Included possibility to calculate omission error in cross validation.
Ecological niche modelling workflow

Created by Renato De Giovanni on Wednesday 29 January 2014 13:12:41 (UTC)

Revision comment:

This version ignores the output from the last BioSTIF interaction so that the workflow can be used with parameter data sweeps.
Ecological niche modelling workflow

Created by Renato De Giovanni on Wednesday 26 March 2014 13:03:45 (UTC)

Revision comment:

New parameter to indicate if BioSTIF layers will be used or not, and new code to parse CSV content.
Ecological niche modelling workflow

Created by Renato De Giovanni on Wednesday 25 June 2014 14:26:46 (UTC)

Revision comment:

Included BOM removal beanshell to handle certain types of input point files in UTF-8 and updated authors list.
Ecological niche modelling workflow

Created by Renato De Giovanni on Tuesday 25 November 2014 11:46:22 (UTC)

Revision comment:

Initial version of the workflow to be based on ENM components.
Ecological niche modelling workflow

Created by Renato De Giovanni on Thursday 04 December 2014 16:54:03 (UTC)

Revision comment:

Updated components versions.
Ecological niche modelling workflow

Created by Renato De Giovanni on Friday 27 March 2015 12:45:11 (UTC)

Revision comment:

Updated get_available_layers component to order BioSTIF layers alphabetically.
Ecological niche modelling workflow

Created by Renato De Giovanni on Saturday 04 April 2015 21:17:06 (UTC)

Revision comment:

Replaced googlecode pages with github pages.
Ecological niche modelling workflow

Created by Renato De Giovanni on Thursday 11 June 2015 13:32:35 (UTC)

Revision comment:

Replaced BioSTIF domain with "biostif.at.biovel.eu" when creating links to the BioSTIF interface.