Ecological niche modelling workflow

Created: 2013-01-07 18:29:13 Last updated: 2015-06-11 13:32:35

This workflow takes as input a file containing species occurrence points and creates a model with the openModeller Web Service. The algorithm, environmental layers and mask are selected during the workflow. The model is tested (an internal test plus an optional external test by cross validation) and then projected one or more times. All points from the input file are used to create a single model, even if the scientific names differ. Cross validation calculates the mean AUC. Model projections can be downloaded from the links in the workflow output. They are GeoTIFF files with suitability values ranging from 0 to 254 (nodata=255).
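When post-processing a projection, the 0 to 254 scale with 255 as nodata can be converted into fractional suitability values. The following is a minimal Python sketch, not part of the workflow. It assumes the pixel values have already been read from the GeoTIFF into a list of integers, and that the scale is linear, which the description does not state explicitly. The helper name `to_suitability` is hypothetical.

```python
from typing import List, Optional

NODATA = 255           # nodata marker given in the workflow description
MAX_SUITABILITY = 254  # highest suitability value given in the description


def to_suitability(pixels: List[int]) -> List[Optional[float]]:
    """Map raw 0-254 pixel values to 0.0-1.0; nodata pixels become None.

    The linear mapping is an illustrative assumption.
    """
    result: List[Optional[float]] = []
    for value in pixels:
        if value == NODATA:
            result.append(None)
        elif 0 <= value <= MAX_SUITABILITY:
            result.append(value / MAX_SUITABILITY)
        else:
            raise ValueError(f"pixel value out of range: {value}")
    return result


# Made-up pixel values, for illustration only.
print(to_suitability([0, 127, 254, 255]))
```

Keeping nodata as `None` rather than 0 avoids treating unmodelled cells as unsuitable habitat.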

For more information about the input file format, see the documentation of the input_points parameter. The default occurrence points are from a marine species, Gammarus tigrinus, so marine environmental layers must be chosen during the modelling procedure when the default points are used.

Workflow requirements: When run on the Taverna Workbench, this workflow requires an Internet connection and the Taverna interaction plugin installed (for versions < 2.5).

Please note that ecological niche modelling experiments can take a long time to run, sometimes several hours, depending on the parameters. This may happen with high-resolution environmental layers, thousands of occurrence points and computationally heavy algorithms such as ANN and GARP BS. Cancelling a workflow run may not cancel the corresponding job on the server side, so repeatedly cancelling and restarting runs may overload the server.

More information and documentation about this workflow can be found here: https://wiki.biovel.eu/display/doc/Ecological+Niche+Modelling+%28ENM%29+Workflow

Run

Run this Workflow in the Taverna Workbench...

Option 1:

Copy and paste this link into File > 'Open workflow location...'
http://www.myexperiment.org/workflows/3355/download?version=23

Workflow Components

Inputs (2)

Name	Description
input_points	This input takes a text file containing species occurrence points in CSV format. Each line in the file corresponds to a different record, with values separated by commas. The first line must be a header containing the column names, also separated by commas. The following columns are mandatory to run this workflow AND must be spelled EXACTLY as follows: occurrenceID, nameComplete, decimalLongitude and decimalLatitude. Other columns can be present in the file. Columns can be in any order, but the header order must match the order of the corresponding values. All records are used to generate a single model regardless of the species name.
use_biostif_layers	Indicates if the BioSTIF service should be queried for available layers, in case they will be used in the modelling procedure. Possible values are "yes" or "no" (string without double quotes).
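
The input_points format described above can be checked before a run is started. The following Python sketch is illustrative only and is not the workflow's own parser. It strips a leading byte order mark (as the remove_possible_BOM processor does), verifies the four mandatory column names, and looks up their positions, since columns may appear in any order. The function name `read_occurrences` and the sample record are made up for the example.

```python
import csv
import io

# Mandatory column names, spelled exactly as required by input_points.
REQUIRED_COLUMNS = ("occurrenceID", "nameComplete",
                    "decimalLongitude", "decimalLatitude")


def read_occurrences(csv_text):
    """Return (column_indexes, records) for CSV text in the input_points format."""
    # Drop a leading UTF-8 byte order mark, if present.
    if csv_text.startswith("\ufeff"):
        csv_text = csv_text[1:]
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        raise ValueError("empty input")
    header = rows[0]
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError("missing mandatory columns: " + ", ".join(missing))
    # Columns can be in any order, so resolve each index from the header.
    indexes = {name: header.index(name) for name in REQUIRED_COLUMNS}
    records = []
    for row in rows[1:]:
        records.append({
            "id": row[indexes["occurrenceID"]],
            "name": row[indexes["nameComplete"]],
            "longitude": float(row[indexes["decimalLongitude"]]),
            "latitude": float(row[indexes["decimalLatitude"]]),
        })
    return indexes, records


# Made-up sample with an extra column and a BOM, for illustration only.
sample = ("\ufeffdecimalLatitude,occurrenceID,nameComplete,decimalLongitude,note\n"
          "54.5,1,Gammarus tigrinus,10.25,example\n")
print(read_occurrences(sample))
```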

Processors (28)

Name	Type	Description
select_algorithm	workflow	This part displays interfaces so that users can select algorithm and parameter values.
parse_input_points	workflow	This part is responsible for parsing the input points, determining column indexes and returning the records as a single string (same original format) without the header.
create_model	workflow	This part creates the XML configuration for model creation.
select_layers	workflow	This part displays an interface so that users can select layers for model creation.
test_model	workflow	This part is responsible for testing a model.
show_test_results	workflow	This part is responsible for showing the results of the internal test.
run_cross_validation	workflow	This part is responsible for performing 10-fold cross validation.
extract_values	workflow	This part extracts the AUC value from all test results.
Flatten_AUC_List	localworker	Script flatten(inputs, outputs, depth) { for (i = inputs.iterator(); i.hasNext();) { element = i.next(); if (element instanceof Collection && depth > 0) { flatten(element, outputs, depth - 1); } else { outputs.add(element); } } } outputlist = new ArrayList(); flatten(inputlist, outputlist, 1);
retrieve_algorithms	workflow	This part is responsible for retrieving all available algorithms from the niche modelling service.
retrieve_layers	workflow	This part is responsible for retrieving all available layers from the niche modelling service and from the GeoServer repository at Fraunhofer.
input_mask_selection	workflow	This part displays an interface so that users can select or create a mask for model creation.
update_biostif_layers	beanshell	Script ArrayList new_biostif_layers_xml_list = new ArrayList(); if (created.equals("0")) { new_biostif_layers_xml_list = biostif_layers_xml_list; } else { String[] parts = mask_id.split(">"); String url = parts[1]; String layer_id = parts[2]; String[] subparts = layer_id.split(":"); workspace = subparts[0]; layer_name = subparts[1]; String xml_piece = ""+layer_id+""+url+""; boolean not_found = true; for (int i = 0; i < biostif_layers_xml_list.size(); ++i) { xml = biostif_layers_xml_list.get(i); if (xml.contains(""+workspace+":")) { not_found = false; xml = xml.replaceFirst("", xml_piece); } new_biostif_layers_xml_list.add(xml); } if (not_found) { new_biostif_layers_xml_list = biostif_layers_xml_list; new_biostif_layers_xml_list.add(xml_piece + ""); } }
run_projection	workflow	This part is responsible for running one or more projections, including visualization of the result.
upload_csv_data	workflow	This part is responsible for uploading the species occurrence points to the BioSTIF service.
show_projections	workflow	This part is responsible for displaying the projections in BioSTIF.
Merge_String_List_to_a_String	localworker	Script String seperatorString = "\n"; if (seperator != void) { seperatorString = seperator; } StringBuffer sb = new StringBuffer(); for (Iterator i = stringlist.iterator(); i.hasNext();) { String item = (String) i.next(); sb.append(item); if (i.hasNext()) { sb.append(seperatorString); } } concatenated = sb.toString();
comma	stringconstant	Value ,
empty_list	beanshell	Script ArrayList empty_list = new ArrayList();
decide_cross_validation	workflow	This part is responsible for deciding about external test.
make_biostif_url	localworker	Script String seperatorString = "\n"; if (seperator != void) { seperatorString = seperator; } StringBuffer sb = new StringBuffer(); for (Iterator i = stringlist.iterator(); i.hasNext();) { String item = (String) i.next(); sb.append(item); if (i.hasNext()) { sb.append(seperatorString); } } concatenated = "http://biovel.iais.fraunhofer.de/biostif/main.jsp?debug=true&layers="+sb.toString()+"&label=species_points&contenttype=csv&url="+csv_url;
constant_values	workflow	This is just a convenient way to group together constants that are needed by most ENM components.
calculate_mean_omission	workflow	This part calculates the mean omission value from the cross validation.
Flatten_omission_List	localworker	Script flatten(inputs, outputs, depth) { for (i = inputs.iterator(); i.hasNext();) { element = i.next(); if (element instanceof Collection && depth > 0) { flatten(element, outputs, depth - 1); } else { outputs.add(element); } } } outputlist = new ArrayList(); flatten(inputlist, outputlist, 1);
calculate_mean_auc	workflow	This part calculates the mean AUC value from the cross validation.
flatten_cross_validation_outputs	workflow
terminate	beanshell	This script was created as a workaround to a Taverna 2.4 bug. It is used to receive the output of a nested workflow, so that the nested workflow terminates. Script
remove_possible_BOM	beanshell	Script if (sIn.startsWith("\uFEFF")) { sOut = sIn.substring(1); } else{ sOut = sIn; }
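
The Flatten_AUC_List and Flatten_omission_List scripts above flatten a nested list by one level before the mean is calculated. A Python rendering of the same depth-limited recursion is sketched below for readers unfamiliar with BeanShell. The `mean` helper assumes a plain arithmetic mean over values that may arrive as strings. The internals of calculate_mean_auc are not shown on this page, so that part is an assumption, and the AUC values are made up.

```python
def flatten(items, depth=1):
    """Flatten nested lists up to `depth` levels, mirroring the Flatten_*_List scripts."""
    flat = []
    for element in items:
        if isinstance(element, (list, tuple)) and depth > 0:
            # Descend one level and spend one unit of depth.
            flat.extend(flatten(element, depth - 1))
        else:
            flat.append(element)
    return flat


def mean(values):
    """Arithmetic mean of values that may be numeric strings (assumed behaviour)."""
    numbers = [float(v) for v in values]
    if not numbers:
        raise ValueError("cannot average an empty list")
    return sum(numbers) / len(numbers)


# Made-up AUC values grouped per replicate, for illustration only.
nested_auc = [["0.5"], ["1.0"]]
auc_list = flatten(nested_auc)
print(auc_list, mean(auc_list))
```

With a depth of 1, only the outermost level of nesting is removed, so deeper lists pass through unchanged.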

Beanshells (40)

Name	Description	Inputs	Outputs
update_biostif_layers		biostif_layers_xml_list created mask_id	new_biostif_layers_xml_list
empty_list			empty_list
terminate	This script was created as a workaround to a Taverna 2.4 bug. It is used to receive the output of a nested workflow, so that the nested workflow terminates.	input
remove_possible_BOM		sIn	sOut
make_xml		mask_id layers model_xml presence_points_xml absence_points_xml srs species_label	xml
calculate_threshold		model_values threshold	threshold_value
control_biostif		use_biostif_layers	flag empty_list
make_xml		environmentally_unique spatially_unique mask_id layers algorithm_xml presence_points_xml absence_points_xml srs species_label	xml
clear_list		input_list	empty_list
calculate_mean		values_list	mean_value
append_values		projection_log in_logs img_url in_urls area_statistics in_statistics stif_layer in_stif_layers	out_logs out_urls out_statistics out_stif_layers
make_xml		mask_id layers model_xml threshold template_id output_format	xml
router		measure_auc calculate_matrix	extract_auc skip_auc extract_omission skip_omission
check_service_ok		status	status_failed status_ok
flatten_outputs		stif_layerdescription png_url raster_layername	out_stif_layerdescription out_pngurl out_layername
allocate_points		all_points folds choice	training_points testing_points flag
count_points		all_points	num_points
trimRESTurlResult		url	resultUrl
checkDataUpload		status	dataUpload_ok dataUpload_failed uploadStatus
make_xml		calculate_matrix calculate_roc mask_id layers model_xml presence_points_xml threshold num_background_points absence_points_xml srs species_label	xml
format_string		input	output
parse_json_raster_shim_results		json	layername wmsurl pngurl error serverurl formats resolution boundingbox srs nativeFormat
create_algorithm_xml		algorithm_id algorithm_version parameter_names parameter_values	algorithm_xml
create_xpath_to_get_algorithm		in1	out1
check_parameters		xml_parameter_list	has_parameters no_parameters
mask_was_selected		selected_mask	yes no
extract_first_element_1		inlist	first_element
extract_first_element_2		inlist	first_element
parse_header		csv_content	name_idx id_idx long_idx lat_idx
get_first_taxon		csv_content name_idx	taxon_name
extract_taxon_points		csv_content name_idx taxon_name id_idx long_idx lat_idx	all_points num_points
mask_was_selected		selected_mask	yes no
extract_first_element_1		inlist	first_element
extract_first_element_2		inlist	first_element
assign_zero			zero
check_if_threshold_must_be_calculated		threshold	calculate_threshold repeat_value
repeat_value		input_threshold flag	output_threshold
get_first_item		list	single_value
calculate_mean		values_list	mean_value
assign_zero			zero
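
The allocate_points beanshell takes all_points, folds and choice, and returns training and testing points for cross validation. Its script is not listed on this page, so the following Python sketch only illustrates one common way to do such a split. The round-robin assignment and the 0-based `choice` are assumptions, and the real script may shuffle or partition the points differently.

```python
def allocate_points(all_points, folds, choice):
    """Split points into (training, testing) for fold number `choice` (0-based).

    Round-robin assignment is an assumption made for this sketch.
    """
    if folds < 2:
        raise ValueError("at least two folds are needed")
    if not 0 <= choice < folds:
        raise ValueError("choice must be between 0 and folds - 1")
    training, testing = [], []
    for index, point in enumerate(all_points):
        # Every `folds`-th point, offset by `choice`, is held out for testing.
        if index % folds == choice:
            testing.append(point)
        else:
            training.append(point)
    return training, testing


# Made-up point identifiers, for illustration only.
points = ["p%d" % i for i in range(10)]
for fold in range(5):
    train, test = allocate_points(points, 5, fold)
    print(fold, test)
```

In this scheme each point is used for testing in exactly one fold and for training in all the others.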

Outputs (18)

Name	Description
serialized_final_model	The serialization of the model that was created with all points. This is XML content specific to openModeller.
create_final_model_log	The log from creating the final model. This is only output for information.
internal_test_model_log	The log from testing the model. This is only output for information.
internal_test_model_statistics	An XML document containing statistics for the result of the testing of the model.
area_statistics	List of projection statistics as XML content returned from openModeller.
projection_url	List of projected models as URLs from where the corresponding files can be downloaded.
project_model_output_log	List of logs for each model projection.
mean_auc	Mean AUC value of the external tests performed during cross validation.
external_auc_list	List of AUCs that resulted from the cross validation.
xval_create_model_log	A list of logs for each model creation during cross validation. One log for each replicate. This output is only expected to be used to give more details about any problem that occurs during cross validation.
xval_test_model_log	A list of logs for each external test during cross validation. One log for each replicate. This output is only expected to be used to give more details about any problem that occurs during cross validation.
xval_test_model_statistics	A list of test results (XML content specific to openModeller) created as part of the cross validation. One result for each replicate. This output is mainly expected to be used to give more details about any problem that occurs during cross validation.
answer	This is here just for flow control - you can ignore this value.
BioSTIF_csv_data_url
BioSTIF_link
xval_threshold
mean_omission
external_omission_list

Datalinks (97)

Source	Sink
retrieve_algorithms:algorithms_xml	select_algorithm:algorithms_xml
remove_possible_BOM:sOut	parse_input_points:csv_content
select_algorithm:algorithm_xml	create_model:algorithm_xml
select_layers:selected_layers_ids	create_model:layers
input_mask_selection:mask_id	create_model:mask_id
constant_values:no	create_model:environmentally_unique
constant_values:no	create_model:spatially_unique
constant_values:default_species_label	create_model:species_label
constant_values:default_srs	create_model:srs
constant_values:empty_value	create_model:absence_points_xml
parse_input_points:all_points_xml	create_model:presence_points_xml
retrieve_layers:om_layers_xml	select_layers:om_layers_xml
retrieve_layers:biostif_layers_xml_list	select_layers:biostif_layers_xml_list
select_layers:selected_layers_ids	test_model:layers
input_mask_selection:mask_id	test_model:mask_id
constant_values:yes	test_model:calculate_roc
constant_values:yes	test_model:calculate_matrix
constant_values:empty_value	test_model:absence_points_xml
constant_values:default_srs	test_model:srs
constant_values:default_threshold	test_model:threshold
constant_values:default_num_points	test_model:num_background_points
constant_values:default_species_label	test_model:species_label
create_model:model_xml	test_model:model_xml
parse_input_points:all_points_xml	test_model:presence_points_xml
test_model:test_statistics	show_test_results:statistics
select_layers:selected_layers_ids	run_cross_validation:layers_str
select_algorithm:algorithm_xml	run_cross_validation:algorithm_xml
decide_cross_validation:flag	run_cross_validation:sentinel
input_mask_selection:mask_id	run_cross_validation:mask
decide_cross_validation:testing_points	run_cross_validation:testing_points
decide_cross_validation:training_points	run_cross_validation:training_points
decide_cross_validation:calculate_matrix	run_cross_validation:calculate_matrix
decide_cross_validation:measure_auc	run_cross_validation:measure_auc
decide_cross_validation:threshold	run_cross_validation:threshold
decide_cross_validation:measure_auc	extract_values:measure_auc
decide_cross_validation:calculate_matrix	extract_values:calculate_matrix
flatten_cross_validation_outputs:xval_test_model_statistics	extract_values:test_statistics_xml
extract_values:auc	Flatten_AUC_List:inputlist
use_biostif_layers	retrieve_layers:use_biostif_layers
select_layers:selected_layers_labels	input_mask_selection:selected_layers_labels
select_layers:selected_layers_ids	input_mask_selection:selected_layers_ids
retrieve_layers:om_layers_xml	input_mask_selection:om_layers_xml
retrieve_layers:biostif_layers_xml_list	input_mask_selection:biostif_layers_xml_list
upload_csv_data:csvDataURI	input_mask_selection:csvDataURI
retrieve_layers:biostif_layers_xml_list	update_biostif_layers:biostif_layers_xml_list
input_mask_selection:created	update_biostif_layers:created
input_mask_selection:mask_id	update_biostif_layers:mask_id
show_test_results:answer	run_projection:sentinel
select_layers:selected_layers_ids	run_projection:model_layers_ids
select_layers:selected_layers_labels	run_projection:model_layers_labels
input_mask_selection:mask_id	run_projection:model_mask_id
update_biostif_layers:new_biostif_layers_xml_list	run_projection:biostif_layers_xml_list
retrieve_layers:om_layers_xml	run_projection:om_layers_xml
parse_input_points:first_taxon_name	run_projection:default_label
empty_list:empty_list	run_projection:area_statistics
empty_list:empty_list	run_projection:STIF_layerdescription
empty_list:empty_list	run_projection:output_log
create_model:model_xml	run_projection:model_xml
empty_list:empty_list	run_projection:projection_url
remove_possible_BOM:sOut	upload_csv_data:csvDataContent
upload_csv_data:csvDataURI	show_projections:csvDataURI
Merge_String_List_to_a_String:concatenated	show_projections:user_layer_definition
run_projection:STIF_layerdescription	Merge_String_List_to_a_String:stringlist
comma:value	Merge_String_List_to_a_String:seperator
show_test_results:answer	decide_cross_validation:sentinel
parse_input_points:all_points_xml	decide_cross_validation:all_points
comma:value	make_biostif_url:seperator
run_projection:STIF_layerdescription	make_biostif_url:stringlist
upload_csv_data:csvDataURI	make_biostif_url:csv_url
Flatten_omission_List:outputlist	calculate_mean_omission:values_list
extract_values:omission	Flatten_omission_List:inputlist
Flatten_AUC_List:outputlist	calculate_mean_auc:values_list
run_cross_validation:xval_test_model_log	flatten_cross_validation_outputs:xval_test_model_log
run_cross_validation:xval_threshold	flatten_cross_validation_outputs:xval_threshold
run_cross_validation:xval_test_model_statistics	flatten_cross_validation_outputs:xval_test_model_statistics
run_cross_validation:xval_serialized_model	flatten_cross_validation_outputs:xval_serialized_model
run_cross_validation:xval_create_model_log	flatten_cross_validation_outputs:xval_create_model_log
show_projections:csvResultData	terminate:input
input_points	remove_possible_BOM:sIn
create_model:full_serialized_final_model	serialized_final_model
create_model:log	create_final_model_log
test_model:log	internal_test_model_log
test_model:test_statistics	internal_test_model_statistics
run_projection:area_statistics	area_statistics
run_projection:projection_url	projection_url
run_projection:output_log	project_model_output_log
calculate_mean_auc:mean_value	mean_auc
Flatten_AUC_List:outputlist	external_auc_list
flatten_cross_validation_outputs:xval_create_model_log	xval_create_model_log
flatten_cross_validation_outputs:xval_test_model_log	xval_test_model_log
flatten_cross_validation_outputs:xval_test_model_statistics	xval_test_model_statistics
run_projection:answer	answer
upload_csv_data:csvDataURI	BioSTIF_csv_data_url
make_biostif_url:concatenated	BioSTIF_link
flatten_cross_validation_outputs:xval_threshold	xval_threshold
calculate_mean_omission:mean_value	mean_omission
Flatten_omission_List:outputlist	external_omission_list
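
Each datalink row pairs a source with a sink, written as processor:port, while workflow-level inputs and outputs appear as a bare port name. To see which processors feed a given processor, the rows can be grouped by sink. The Python sketch below does this for a few rows copied from the table. The helper names `processor_of` and `upstream_processors` are hypothetical.

```python
def processor_of(endpoint):
    """Return the processor part of 'processor:port', or None for a workflow port."""
    name, separator, _port = endpoint.partition(":")
    return name if separator else None


def upstream_processors(datalinks):
    """Map each processor to the set of processors that send it data."""
    dependencies = {}
    for source, sink in datalinks:
        source_processor = processor_of(source)
        sink_processor = processor_of(sink)
        # Skip links that start or end at a workflow-level port.
        if source_processor is None or sink_processor is None:
            continue
        dependencies.setdefault(sink_processor, set()).add(source_processor)
    return dependencies


# A few rows copied from the Datalinks table.
links = [
    ("retrieve_algorithms:algorithms_xml", "select_algorithm:algorithms_xml"),
    ("input_points", "remove_possible_BOM:sIn"),
    ("select_algorithm:algorithm_xml", "create_model:algorithm_xml"),
    ("select_layers:selected_layers_ids", "create_model:layers"),
    ("create_model:log", "create_final_model_log"),
]
print(upstream_processors(links))
```

The control links listed under Coordinations add ordering constraints on top of these data dependencies and are not covered by this sketch.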

Coordinations (7)

Controller	Target
decide_cross_validation	run_projection
select_algorithm	select_layers
parse_input_points	select_algorithm
flatten_cross_validation_outputs	extract_values
run_projection	show_projections
select_layers	input_mask_selection
run_cross_validation	flatten_cross_validation_outputs

Workflow Type

Taverna 2

Uploader

Renato De Giovanni

License

All versions of this Workflow are licensed under:

Version 23 (of 28)

Statistics

24325 viewings

6576 downloads

Version History

In chronological order:

Version 1: Generic ENM workflow with interaction
Created by Renato De Giovanni on Monday 07 January 2013 18:29:08 (UTC)

Version 2: Generic ENM workflow with interaction
Created by Renato De Giovanni on Wednesday 09 January 2013 16:41:20 (UTC)

Version 3: Generic ENM workflow with interaction
Created by Renato De Giovanni on Thursday 10 January 2013 01:31:11 (UTC)

Version 4: Generic ENM workflow with interaction
Created by Renato De Giovanni on Thursday 10 January 2013 17:31:05 (UTC)

Version 5: Generic ENM workflow with interaction
Created by Renato De Giovanni on Monday 14 January 2013 17:35:43 (UTC)

Version 6: Generic ENM workflow with interaction
Created by Renato De Giovanni on Thursday 17 January 2013 16:20:07 (UTC)

Version 7: Generic ENM workflow with interaction
Created by Renato De Giovanni on Monday 28 January 2013 11:16:58 (UTC)

Version 8: Generic ENM workflow with interaction
Created by Renato De Giovanni on Friday 01 February 2013 13:49:24 (UTC)

Version 9: Generic ENM workflow with interaction
Created by Renato De Giovanni on Thursday 14 February 2013 16:33:33 (UTC)

Version 10: Generic ENM workflow with interaction
Created by Renato De Giovanni on Thursday 18 April 2013 20:22:32 (UTC)
Revision comment: When a model is created using a BioSTIF layer, all BioSTIF layers now appear as options when selecting the corresponding projection layer.

Version 11: Generic ENM workflow with interaction
Created by Renato De Giovanni on Wednesday 15 May 2013 23:01:04 (UTC)
Revision comment: Models are now projected as GeoTIFF. The mask creation interface now checks whether the mask was created by the user before proceeding.

Version 12: Generic ENM workflow with interaction
Created by Renato De Giovanni on Thursday 16 May 2013 15:19:56 (UTC)
Revision comment: Bugfix: cross-validation now uses the specified input mask for model creation and testing (it previously always used the first layer selected for model creation).

Version 13: Generic ENM workflow with interaction
Created by Renato De Giovanni on Monday 24 June 2013 14:48:23 (UTC)
Revision comment: Changed the raster style parameter for proper display of GeoTIFF maps in BioSTIF and improved the workflow documentation.

Version 14: Generic ENM workflow with interaction
Created by Renato De Giovanni on Friday 20 September 2013 19:56:02 (UTC)
Revision comment: Included an output containing the full BioSTIF link.

Version 15: Generic ENM workflow with interaction
Created by Renato De Giovanni on Friday 04 October 2013 08:43:51 (UTC)
Revision comment: Compatibility with the new openModeller service version.

Version 16: Ecological niche modelling workflow
Created by Renato De Giovanni on Tuesday 08 October 2013 17:21:27 (UTC)
Revision comment: Renamed workflow and revised documentation.

Version 17: Ecological niche modelling workflow
Created by Renato De Giovanni on Wednesday 09 October 2013 16:52:25 (UTC)
Revision comment: Replaced all XML input splitters for getProgress calls to fix a bug introduced in version 15.

Version 18: Ecological niche modelling workflow
Created by Renato De Giovanni on Wednesday 23 October 2013 22:03:05 (UTC)
Revision comment: No new features or bugfixes, but several changes were made in preparation for creating workflow components in future versions.

Version 19: Ecological niche modelling workflow
Created by Renato De Giovanni on Friday 29 November 2013 11:09:05 (UTC)
Revision comment: The number of replicates in cross-validation can now be specified.

Version 20: Ecological niche modelling workflow
Created by Renato De Giovanni on Monday 02 December 2013 16:49:39 (UTC)
Revision comment: Added the option to calculate omission error in cross validation.

Version 21: Ecological niche modelling workflow
Created by Renato De Giovanni on Wednesday 29 January 2014 13:12:41 (UTC)
Revision comment: This version ignores the output from the last BioSTIF interaction so that the workflow can be used with parameter data sweeps.

Version 22: Ecological niche modelling workflow
Created by Renato De Giovanni on Wednesday 26 March 2014 13:03:45 (UTC)
Revision comment: New parameter to indicate whether BioSTIF layers will be used, and new code to parse CSV content.

Version 23: Ecological niche modelling workflow
Created by Renato De Giovanni on Wednesday 25 June 2014 14:26:46 (UTC)
Revision comment: Included a BOM removal beanshell to handle certain types of input point files in UTF-8 and updated the authors list.

Version 24: Ecological niche modelling workflow
Created by Renato De Giovanni on Tuesday 25 November 2014 11:46:22 (UTC)
Revision comment: Initial version of the workflow to be based on ENM components.

Version 25: Ecological niche modelling workflow
Created by Renato De Giovanni on Thursday 04 December 2014 16:54:03 (UTC)
Revision comment: Updated component versions.

Version 26: Ecological niche modelling workflow
Created by Renato De Giovanni on Friday 27 March 2015 12:45:11 (UTC)
Revision comment: Updated the get_available_layers component to order BioSTIF layers alphabetically.

Version 27: Ecological niche modelling workflow
Created by Renato De Giovanni on Saturday 04 April 2015 21:17:06 (UTC)
Revision comment: Replaced googlecode pages with github pages.

Version 28: Ecological niche modelling workflow
Created by Renato De Giovanni on Thursday 11 June 2015 13:32:35 (UTC)
Revision comment: Replaced the BioSTIF domain with "biostif.at.biovel.eu" when creating links to the BioSTIF interface.