Ecological niche modelling workflow
This workflow takes as input a file containing species occurrence points to create a model with the openModeller Web Service. Algorithm, environmental layers and mask are selected during the workflow. The model is tested (internal test and optional cross validation external test) and then projected one or more times. All points from the input file are used to create a single model, even if there are differences in the scientific names. Cross validation calculates the mean AUC. Model projections can be downloaded from the links in the workflow output. They are geotiff files with suitability values ranging from 0 to 254 (nodata=255).
For more information about the input file format, see the documentation of the input_points parameter. The default occurrence points are for Gammarus tigrinus, a marine species, so marine environmental layers must be selected when modelling with the default points.
Workflow requirements: when run in the Taverna Workbench, this workflow requires an Internet connection and the Taverna interaction plugin.
Please note that ecological niche modelling experiments can take a long time to run depending on the parameters - sometimes several hours. This may happen with high resolution environmental layers, thousands of occurrence points and heavy algorithms, such as ANN and GARP BS. Cancelling a workflow run may not cancel the corresponding job on the server side, so if this procedure is repeated the server may get overloaded.
More information and documentation about this workflow can be found here: https://wiki.biovel.eu/display/doc/Ecological+Niche+Modelling+%28ENM%29+Workflow
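As a minimal sketch of how the projection values described above might be interpreted, the following Java snippet rescales a raster cell value (0 to 254, with 255 as nodata) to a suitability between 0 and 1. The linear rescaling and the class and method names are assumptions made for illustration; they are not part of the workflow or of openModeller.

```java
/** Illustrative helper (hypothetical name): rescales projection cell values. */
final class SuitabilityScale {
    static final int NODATA = 255;     // nodata value stated in the workflow description
    static final int MAX_VALUE = 254;  // highest suitability value stated in the description

    /** Returns a suitability in [0, 1], or NaN for nodata cells (assumed linear scale). */
    static double toSuitability(int pixel) {
        if (pixel < 0 || pixel > 255) {
            throw new IllegalArgumentException("pixel out of byte range: " + pixel);
        }
        if (pixel == NODATA) {
            return Double.NaN;
        }
        return pixel / (double) MAX_VALUE;
    }

    public static void main(String[] args) {
        int[] samples = {0, 127, 254, 255};
        for (int p : samples) {
            System.out.println(p + " -> " + toSuitability(p));
        }
    }
}
```

Returning NaN for nodata keeps masked cells from being mistaken for zero suitability in later arithmetic.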
Run this Workflow in the Taverna Workbench...
Option 1:
Copy and paste this link into File > 'Open workflow location...'
http://myexperiment.org/workflows/3355/download?version=21
Taverna is available from http://taverna.sourceforge.net/
If you are having problems downloading it in Taverna, you may need to provide your username and password in the URL so that Taverna can access the Workflow:
Replace http:// in the link above with http://yourusername:yourpassword@
Workflow Components
Authors: Alan R Williams, Renato De Giovanni, Vera Hernandez & Robert Kulawik
Name | Description |
---|---|
input_points | A text file containing species occurrence points in CSV format. Each line corresponds to one record, with values separated by commas. The first line must be a header of comma-separated column names. The following columns are mandatory and must be spelled exactly as shown: occurrenceID, nameComplete, decimalLongitude and decimalLatitude. Other columns may be present in the file. Columns can be in any order, provided the values in each record follow the same order as the header. All records are used to generate a single model regardless of the species name. |
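The header rules for input_points can be sketched as follows. This is only an illustration of the stated requirements (four mandatory column names, any column order), not the code used by the workflow's parse_header script. The class name, the trimming of surrounding whitespace, the removal of a leading byte order mark and the plain comma split (no quoted fields) are all assumptions.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative helper (hypothetical name): locates the mandatory CSV columns. */
final class OccurrenceHeader {
    // Mandatory column names, spelled exactly as in the input_points documentation.
    static final List<String> REQUIRED = Arrays.asList(
            "occurrenceID", "nameComplete", "decimalLongitude", "decimalLatitude");

    /** Maps each mandatory column name to its zero-based index in the header line. */
    static Map<String, Integer> columnIndexes(String headerLine) {
        // Assumption: drop a leading byte order mark if one is present.
        if (headerLine.startsWith("\uFEFF")) {
            headerLine = headerLine.substring(1);
        }
        // Assumption: simple comma split, quoted fields are not handled.
        String[] names = headerLine.split(",", -1);
        Map<String, Integer> indexes = new LinkedHashMap<String, Integer>();
        for (String required : REQUIRED) {
            int found = -1;
            for (int i = 0; i < names.length; i++) {
                if (names[i].trim().equals(required)) {
                    found = i;
                    break;
                }
            }
            if (found < 0) {
                throw new IllegalArgumentException("Missing mandatory column: " + required);
            }
            indexes.put(required, found);
        }
        return indexes;
    }

    public static void main(String[] args) {
        System.out.println(columnIndexes(
                "occurrenceID,nameComplete,decimalLongitude,decimalLatitude"));
    }
}
```

Checking the header before submitting a job avoids waiting on a run that would fail on a misspelled column name.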
Name | Type | Description |
---|---|---|
select_algorithm | workflow | This part displays interfaces so that users can select algorithm and parameter values. |
parse_input_points | workflow | This part is responsible for parsing the input points, determining column indexes and returning the records as a single string (same original format) without the header. |
create_model | workflow | This part creates the XML configuration for model creation. |
select_layers | workflow | This part displays an interface so that users can select layers for model creation. |
test_model | workflow | This part is responsible for testing a model. |
show_test_results | workflow | This part is responsible for showing the results of the internal test. |
allocate_points | workflow | This part is responsible for transforming the CSV lines into a list of XML points. It returns a list of all points as well as two lists with 10 elements containing training and testing points to be used in 10-fold cross validation. |
run_cross_validation | workflow | This part is responsible for performing 10-fold cross validation. |
extract_values | workflow | This part extracts the AUC value from all test results. |
Flatten_AUC_List | localworker | Script: flatten(inputs, outputs, depth) { for (i = inputs.iterator(); i.hasNext();) { element = i.next(); if (element instanceof Collection && depth > 0) { flatten(element, outputs, depth - 1); } else { outputs.add(element); } } } outputlist = new ArrayList(); flatten(inputlist, outputlist, 1); |
retrieve_algorithms | workflow | This part is responsible for retrieving all available algorithms from the niche modelling service. |
retrieve_layers | workflow | This part is responsible for retrieving all available layers from the niche modelling service and from the GeoServer repository at Fraunhofer. |
input_mask_selection | workflow | This part displays an interface so that users can select or create a mask for model creation. |
update_biostif_layers | beanshell | Script: ArrayList new_biostif_layers_xml_list = new ArrayList(); if (created.equals("0")) { new_biostif_layers_xml_list = biostif_layers_xml_list; } else { String[] parts = mask_id.split(">"); String url = parts[1]; String layer_id = parts[2]; String[] subparts = layer_id.split(":"); workspace = subparts[0]; layer_name = subparts[1]; String xml_piece = " |
run_projection | workflow | This part is responsible for running one or more projections, including visualization of the result. |
upload_csv_data | workflow | This part is responsible for uploading the species occurrence points to the BioSTIF service. |
show_projections | workflow | This part is responsible for displaying the projections in BioSTIF. |
Merge_String_List_to_a_String | localworker | Script: String seperatorString = "\n"; if (seperator != void) { seperatorString = seperator; } StringBuffer sb = new StringBuffer(); for (Iterator i = stringlist.iterator(); i.hasNext();) { String item = (String) i.next(); sb.append(item); if (i.hasNext()) { sb.append(seperatorString); } } concatenated = sb.toString(); |
comma | stringconstant | Value: , |
empty_list | beanshell | Script: ArrayList empty_list = new ArrayList(); |
decide_cross_validation | workflow | This part is responsible for deciding about external test. |
make_biostif_url | localworker | Script: String seperatorString = "\n"; if (seperator != void) { seperatorString = seperator; } StringBuffer sb = new StringBuffer(); for (Iterator i = stringlist.iterator(); i.hasNext();) { String item = (String) i.next(); sb.append(item); if (i.hasNext()) { sb.append(seperatorString); } } concatenated = "http://biovel.iais.fraunhofer.de/biostif/main.jsp?debug=true&layers="+sb.toString()+"&label=species_points&contenttype=csv&url="+csv_url; |
constant_values | workflow | This is just a convenient way to group together constants that are needed by most ENM components. |
calculate_mean_omission | workflow | This part calculates the mean omission value from the cross validation. |
Flatten_omission_List | localworker | Script: flatten(inputs, outputs, depth) { for (i = inputs.iterator(); i.hasNext();) { element = i.next(); if (element instanceof Collection && depth > 0) { flatten(element, outputs, depth - 1); } else { outputs.add(element); } } } outputlist = new ArrayList(); flatten(inputlist, outputlist, 1); |
calculate_mean_auc | workflow | This part calculates the mean AUC value from the cross validation. |
flatten_cross_validation_outputs | workflow | |
terminate | beanshell | This script was created as a workaround to a Taverna 2.4 bug. It receives the output of a nested workflow so that the nested workflow terminates. |
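The allocate_points and run_cross_validation parts listed above rely on splitting the occurrence points into training and testing sets for each fold. The sketch below shows one common way to do this. The round-robin assignment, the class name and the method names are assumptions for illustration; the workflow's own allocation strategy (for example, whether points are shuffled first) is not described in this listing.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper (hypothetical name): k-fold partitioning of points. */
final class FoldAllocator {

    /** Splits the points into the given number of testing sets, round-robin (assumed). */
    static <T> List<List<T>> testingFolds(List<T> points, int folds) {
        if (folds < 2 || folds > points.size()) {
            throw new IllegalArgumentException("folds must be between 2 and the number of points");
        }
        List<List<T>> testing = new ArrayList<List<T>>();
        for (int k = 0; k < folds; k++) {
            testing.add(new ArrayList<T>());
        }
        for (int i = 0; i < points.size(); i++) {
            testing.get(i % folds).add(points.get(i));
        }
        return testing;
    }

    /** Training points for one fold: every point that is not in that fold's testing set. */
    static <T> List<T> trainingFor(List<List<T>> testing, int fold) {
        List<T> training = new ArrayList<T>();
        for (int k = 0; k < testing.size(); k++) {
            if (k != fold) {
                training.addAll(testing.get(k));
            }
        }
        return training;
    }

    public static void main(String[] args) {
        List<Integer> points = new ArrayList<Integer>();
        for (int i = 0; i < 25; i++) {
            points.add(i);
        }
        List<List<Integer>> testing = testingFolds(points, 10);
        System.out.println("fold 0 testing:  " + testing.get(0));
        System.out.println("fold 0 training: " + trainingFor(testing, 0));
    }
}
```

Each point appears in exactly one testing set, so every record is used once for external testing across the folds.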
Name | Description | Inputs | Outputs |
---|---|---|---|
update_biostif_layers | | biostif_layers_xml_list, created, mask_id | new_biostif_layers_xml_list |
empty_list | | | empty_list |
terminate | This script was created as a workaround to a Taverna 2.4 bug. It is used to receive the output of a nested workflow, so that the nested workflow terminates. | input | |
format_string | | input | output |
parse_header | | csv_content | name_idx, id_idx, long_idx, lat_idx |
extract_taxon_points | | csv_content, name_idx, taxon_name | taxon_points |
get_first_taxon | | csv_content, name_idx | taxon_name |
make_xml | | environmentally_unique, spatially_unique, mask_id, layers, algorithm_xml, presence_points_xml, absence_points_xml, srs, species_label | xml |
check_if_threshold_must_be_calculated | | threshold | calculate_threshold, repeat_value |
repeat_value | | input_threshold, flag | output_threshold |
get_first_item | | list | single_value |
assign_zero | | | zero |
calculate_mean | | values_list | mean_value |
make_xml | | calculate_matrix, calculate_roc, mask_id, layers, model_xml, presence_points_xml, threshold, num_background_points, absence_points_xml, srs, species_label | xml |
make_xml | | mask_id, layers, model_xml, presence_points_xml, absence_points_xml, srs, species_label | xml |
check_service_ok | | status | status_failed, status_ok |
flatten_outputs | | stif_layerdescription, png_url, raster_layername | out_stif_layerdescription, out_pngurl, out_layername |
append_values | | projection_log, in_logs, img_url, in_urls, area_statistics, in_statistics, stif_layer, in_stif_layers | out_logs, out_urls, out_statistics, out_stif_layers |
calculate_mean | | values_list | mean_value |
create_algorithm_xml | | algorithm_id, algorithm_version, parameter_names, parameter_values | algorithm_xml |
create_xpath_to_get_algorithm | | in1 | out1 |
check_parameters | | xml_parameter_list | has_parameters, no_parameters |
mask_was_selected | | selected_mask | yes, no |
extract_first_element_1 | | inlist | first_element |
extract_first_element_2 | | inlist | first_element |
allocate_points | | all_points, folds, choice | training_points, testing_points, flag |
count_points | | all_points | num_points |
mask_was_selected | | selected_mask | yes, no |
extract_first_element_1 | | inlist | first_element |
extract_first_element_2 | | inlist | first_element |
clear_list | | input_list | empty_list |
make_xml | | mask_id, layers, model_xml, threshold, template_id, output_format | xml |
router | | measure_auc, calculate_matrix | extract_auc, skip_auc, extract_omission, skip_omission |
assign_zero | | | zero |
csv_to_xml_list | | csv_points, id_idx, long_idx, lat_idx | all_points, num_points |
trimRESTurlResult | | url | resultUrl |
checkDataUpload | | status | dataUpload_ok, dataUpload_failed, uploadStatus |
calculate_threshold | | model_values, threshold | threshold_value |
parse_json_raster_shim_results | | json | layername, wmsurl, pngurl, error, serverurl, formats, resolution, boundingbox, srs, nativeFormat |
Name | Description |
---|---|
serialized_final_model | The serialization of the model that was created with all points. This is an XML content specific to openModeller. |
create_final_model_log | The log from creating the final model. This is only output for information. |
internal_test_model_log | The log from testing the model. This is only output for information. |
internal_test_model_statistics | An XML document containing statistics for the result of the testing of the model. |
area_statistics | List of projection statistics as XML content returned from openModeller. |
projection_url | List of projected models as URLs from where the corresponding files can be downloaded. |
project_model_output_log | List of logs for each model projection. |
mean_auc | Mean AUC value of the external tests performed during cross validation. |
external_auc_list | List of AUCs that resulted from the cross validation. |
xval_create_model_log | A list of logs for each model creation during cross validation, one log per replicate. This output is intended only to help diagnose a possible problem during cross validation. |
xval_test_model_log | A list of logs for each external test during cross validation, one log per replicate. This output is intended only to help diagnose a possible problem during cross validation. |
xval_test_model_statistics | A list of test results (XML content specific to openModeller) created as part of the cross validation, one result per replicate. This output is mainly intended to help diagnose a possible problem during cross validation. |
answer | This is here just for flow control - you can ignore this value. |
BioSTIF_csv_data_url | |
BioSTIF_link | |
xval_threshold | |
mean_omission | |
external_omission_list |
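The mean_auc and mean_omission outputs are derived by flattening the per-replicate results and averaging them. A minimal Java sketch of that idea follows, mirroring the one-level flatten used by the Flatten_AUC_List script. The class name and the assumption that values arrive as text that parses as decimal numbers are illustrative, and the workflow's calculate_mean script may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Illustrative helper (hypothetical name): flattens and averages test results. */
final class CrossValidationSummary {

    /** Flattens nested collections down to the given depth, keeping element order. */
    static List<Object> flatten(Collection<?> input, int depth) {
        List<Object> out = new ArrayList<Object>();
        for (Object element : input) {
            if (element instanceof Collection && depth > 0) {
                out.addAll(flatten((Collection<?>) element, depth - 1));
            } else {
                out.add(element);
            }
        }
        return out;
    }

    /** Arithmetic mean of values given as numbers or numeric strings (assumed format). */
    static double mean(List<?> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("no values to average");
        }
        double sum = 0.0;
        for (Object value : values) {
            sum += Double.parseDouble(value.toString());
        }
        return sum / values.size();
    }

    public static void main(String[] args) {
        List<Object> nested = Arrays.<Object>asList(
                Arrays.asList("0.5", "1.0"), Arrays.asList("0.75", "0.75"));
        List<Object> flat = flatten(nested, 1);
        System.out.println("values: " + flat + ", mean: " + mean(flat));
    }
}
```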
Source | Sink |
---|---|
retrieve_algorithms:algorithms_xml | select_algorithm:algorithms_xml |
input_points | parse_input_points:csv_content |
select_algorithm:algorithm_xml | create_model:algorithm_xml |
select_layers:selected_layers_ids | create_model:layers |
input_mask_selection:mask_id | create_model:mask_id |
constant_values:no | create_model:environmentally_unique |
constant_values:no | create_model:spatially_unique |
constant_values:default_species_label | create_model:species_label |
constant_values:default_srs | create_model:srs |
constant_values:empty_value | create_model:absence_points_xml |
allocate_points:all_points | create_model:presence_points_xml |
retrieve_layers:om_layers_xml | select_layers:om_layers_xml |
retrieve_layers:biostif_layers_xml_list | select_layers:biostif_layers_xml_list |
select_layers:selected_layers_ids | test_model:layers |
input_mask_selection:mask_id | test_model:mask_id |
constant_values:yes | test_model:calculate_roc |
constant_values:yes | test_model:calculate_matrix |
allocate_points:all_points | test_model:presence_points_xml |
constant_values:empty_value | test_model:absence_points_xml |
constant_values:default_srs | test_model:srs |
constant_values:default_threshold | test_model:threshold |
constant_values:default_num_points | test_model:num_background_points |
constant_values:default_species_label | test_model:species_label |
create_model:model_xml | test_model:model_xml |
test_model:test_statistics | show_test_results:statistics |
parse_input_points:id_idx | allocate_points:id_idx |
parse_input_points:long_idx | allocate_points:long_idx |
parse_input_points:lat_idx | allocate_points:lat_idx |
parse_input_points:taxon_points | allocate_points:csv_points |
select_layers:selected_layers_ids | run_cross_validation:layers_str |
select_algorithm:algorithm_xml | run_cross_validation:algorithm_xml |
decide_cross_validation:flag | run_cross_validation:sentinel |
input_mask_selection:mask_id | run_cross_validation:mask |
decide_cross_validation:testing_points | run_cross_validation:testing_points |
decide_cross_validation:training_points | run_cross_validation:training_points |
decide_cross_validation:calculate_matrix | run_cross_validation:calculate_matrix |
decide_cross_validation:measure_auc | run_cross_validation:measure_auc |
decide_cross_validation:threshold | run_cross_validation:threshold |
decide_cross_validation:measure_auc | extract_values:measure_auc |
decide_cross_validation:calculate_matrix | extract_values:calculate_matrix |
flatten_cross_validation_outputs:xval_test_model_statistics | extract_values:test_statistics_xml |
extract_values:auc | Flatten_AUC_List:inputlist |
select_layers:selected_layers_labels | input_mask_selection:selected_layers_labels |
select_layers:selected_layers_ids | input_mask_selection:selected_layers_ids |
retrieve_layers:om_layers_xml | input_mask_selection:om_layers_xml |
retrieve_layers:biostif_layers_xml_list | input_mask_selection:biostif_layers_xml_list |
upload_csv_data:csvDataURI | input_mask_selection:csvDataURI |
retrieve_layers:biostif_layers_xml_list | update_biostif_layers:biostif_layers_xml_list |
input_mask_selection:created | update_biostif_layers:created |
input_mask_selection:mask_id | update_biostif_layers:mask_id |
show_test_results:answer | run_projection:sentinel |
select_layers:selected_layers_ids | run_projection:model_layers_ids |
select_layers:selected_layers_labels | run_projection:model_layers_labels |
input_mask_selection:mask_id | run_projection:model_mask_id |
update_biostif_layers:new_biostif_layers_xml_list | run_projection:biostif_layers_xml_list |
retrieve_layers:om_layers_xml | run_projection:om_layers_xml |
parse_input_points:first_taxon_name | run_projection:default_label |
empty_list:empty_list | run_projection:area_statistics |
empty_list:empty_list | run_projection:STIF_layerdescription |
empty_list:empty_list | run_projection:output_log |
create_model:model_xml | run_projection:model_xml |
empty_list:empty_list | run_projection:projection_url |
input_points | upload_csv_data:csvDataContent |
upload_csv_data:csvDataURI | show_projections:csvDataURI |
Merge_String_List_to_a_String:concatenated | show_projections:user_layer_definition |
run_projection:STIF_layerdescription | Merge_String_List_to_a_String:stringlist |
comma:value | Merge_String_List_to_a_String:seperator |
show_test_results:answer | decide_cross_validation:sentinel |
allocate_points:all_points | decide_cross_validation:all_points |
comma:value | make_biostif_url:seperator |
run_projection:STIF_layerdescription | make_biostif_url:stringlist |
upload_csv_data:csvDataURI | make_biostif_url:csv_url |
Flatten_omission_List:outputlist | calculate_mean_omission:values_list |
extract_values:omission | Flatten_omission_List:inputlist |
Flatten_AUC_List:outputlist | calculate_mean_auc:values_list |
run_cross_validation:xval_test_model_log | flatten_cross_validation_outputs:xval_test_model_log |
run_cross_validation:xval_threshold | flatten_cross_validation_outputs:xval_threshold |
run_cross_validation:xval_test_model_statistics | flatten_cross_validation_outputs:xval_test_model_statistics |
run_cross_validation:xval_serialized_model | flatten_cross_validation_outputs:xval_serialized_model |
run_cross_validation:xval_create_model_log | flatten_cross_validation_outputs:xval_create_model_log |
show_projections:csvResultData | terminate:input |
create_model:full_serialized_final_model | serialized_final_model |
create_model:log | create_final_model_log |
test_model:log | internal_test_model_log |
test_model:test_statistics | internal_test_model_statistics |
run_projection:area_statistics | area_statistics |
run_projection:projection_url | projection_url |
run_projection:output_log | project_model_output_log |
calculate_mean_auc:mean_value | mean_auc |
Flatten_AUC_List:outputlist | external_auc_list |
flatten_cross_validation_outputs:xval_create_model_log | xval_create_model_log |
flatten_cross_validation_outputs:xval_test_model_log | xval_test_model_log |
flatten_cross_validation_outputs:xval_test_model_statistics | xval_test_model_statistics |
run_projection:answer | answer |
upload_csv_data:csvDataURI | BioSTIF_csv_data_url |
make_biostif_url:concatenated | BioSTIF_link |
flatten_cross_validation_outputs:xval_threshold | xval_threshold |
calculate_mean_omission:mean_value | mean_omission |
Flatten_omission_List:outputlist | external_omission_list |
Controller | Target |
---|---|
select_layers | input_mask_selection |
flatten_cross_validation_outputs | extract_values |
parse_input_points | select_algorithm |
decide_cross_validation | run_projection |
run_projection | show_projections |
run_cross_validation | flatten_cross_validation_outputs |
select_algorithm | select_layers |
Workflow Type
Version 21 (of 28)
- Private item
- artificial neural network
- bio-oracle
- bioclim
- biostif
- climate space model
- consensus
- ecological niche modelling
- enfa
- envelope score
- environmental distance
- eurolst
- garp
- harmonized world soil database
- hcafv4
- hcafv5
- hwsd
- incofish
- mahalanobis
- maximum entropy
- niche mosaic
- openmodeller
- random forests
- species distribution modelling
- support vector machines
- svm
- virtual niche generator
- worldclim
In chronological order:
- Created by Renato De Giovanni on Monday 07 January 2013 18:29:08 (UTC)
- Created by Renato De Giovanni on Wednesday 09 January 2013 16:41:20 (UTC)
- Created by Renato De Giovanni on Thursday 10 January 2013 01:31:11 (UTC)
- Created by Renato De Giovanni on Thursday 10 January 2013 17:31:05 (UTC)
- Created by Renato De Giovanni on Monday 14 January 2013 17:35:43 (UTC)
- Created by Renato De Giovanni on Thursday 17 January 2013 16:20:07 (UTC)
- Created by Renato De Giovanni on Monday 28 January 2013 11:16:58 (UTC)
- Created by Renato De Giovanni on Friday 01 February 2013 13:49:24 (UTC)
- Created by Renato De Giovanni on Thursday 14 February 2013 16:33:33 (UTC)
- Created by Renato De Giovanni on Thursday 18 April 2013 20:22:32 (UTC)
  Revision comment: When a model is created using a BioSTIF layer, now all BioSTIF layers appear as options when selecting the corresponding projection layer.
- Created by Renato De Giovanni on Wednesday 15 May 2013 23:01:04 (UTC)
  Revision comment: Models are now projected as GeoTiff. Mask creation interface now checks if the mask was created or not by the user before proceeding.
- Created by Renato De Giovanni on Thursday 16 May 2013 15:19:56 (UTC)
  Revision comment: Bugfix: cross-validation now uses the specified input mask for model creation and testing (it was previously always using the first layer selected for model creation).
- Created by Renato De Giovanni on Monday 24 June 2013 14:48:23 (UTC)
  Revision comment: Changed raster style parameter for proper display of GeoTiff maps in BioSTIF and improved workflow documentation.
- Created by Renato De Giovanni on Friday 20 September 2013 19:56:02 (UTC)
  Revision comment: Included output containing the full BioSTIF link.
- Created by Renato De Giovanni on Friday 04 October 2013 08:43:51 (UTC)
  Revision comment: Compatibility with the new openModeller service version.
- Created by Renato De Giovanni on Tuesday 08 October 2013 17:21:27 (UTC)
  Revision comment: Renamed workflow and revised documentation.
- Created by Renato De Giovanni on Wednesday 09 October 2013 16:52:25 (UTC)
  Revision comment: Replaced all XML input splitters for getProgress calls to fix a bug introduced in version 15.
- Created by Renato De Giovanni on Wednesday 23 October 2013 22:03:05 (UTC)
  Revision comment: No new features or bugfixes, but several changes were made in preparation for creating workflow components in future versions.
- Created by Renato De Giovanni on Friday 29 November 2013 11:09:05 (UTC)
  Revision comment: Number of replicates in cross-validation can be specified now.
- Created by Renato De Giovanni on Monday 02 December 2013 16:49:39 (UTC)
  Revision comment: Included possibility to calculate omission error in cross validation.
- Created by Renato De Giovanni on Wednesday 29 January 2014 13:12:41 (UTC)
  Revision comment: This version ignores the output from the last BioSTIF interaction so that the workflow can be used with parameter data sweeps.
- Created by Renato De Giovanni on Wednesday 26 March 2014 13:03:45 (UTC)
  Revision comment: New parameter to indicate if BioSTIF layers will be used or not, and new code to parse CSV content.
- Created by Renato De Giovanni on Wednesday 25 June 2014 14:26:46 (UTC)
  Revision comment: Included BOM removal beanshell to handle certain types of input point files in UTF-8 and updated authors list.
- Created by Renato De Giovanni on Tuesday 25 November 2014 11:46:22 (UTC)
  Revision comment: Initial version of the workflow to be based on ENM components.
- Created by Renato De Giovanni on Thursday 04 December 2014 16:54:03 (UTC)
  Revision comment: Updated components versions.
- Created by Renato De Giovanni on Friday 27 March 2015 12:45:11 (UTC)
  Revision comment: Updated get_available_layers component to order BioSTIF layers alphabetically.
- Created by Renato De Giovanni on Saturday 04 April 2015 21:17:06 (UTC)
  Revision comment: Replaced googlecode pages with github pages.
- Created by Renato De Giovanni on Thursday 11 June 2015 13:32:35 (UTC)
  Revision comment: Replaced BioSTIF domain with "biostif.at.biovel.eu" when creating links to the BioSTIF interface.