parse_csv_points
Created: 2013-12-24 11:41:36
Last updated: 2014-11-18 18:08:26
Parses CSV content containing species occurrence points in the DarwinCore archive format, determining the column indexes and returning the records as a list of points in openModeller format (XML). No distinction is made between presences and absences.
Workflow Components
Authors (1)
Alan R Williams & Renato De Giovanni

Titles (1)
Parse csv content with species occurrence points.

Descriptions (1)
Parses the csv content, determining column indexes and returning the records as a list of points in openModeller format (XML). All points are considered presence points.
Dependencies (0)
Inputs (1)

csv_content
Comma-separated list of values. Each line corresponds to a different record. The first line must be a header containing column names, also separated by commas. The following columns are mandatory and must be spelled exactly as follows: occurrenceID, nameComplete, decimalLongitude and decimalLatitude. Other columns may be present. Columns can appear in any order, but the header order must match the order of the corresponding values on the following lines. An example is shown below.
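For illustration, a minimal csv_content value that satisfies these requirements might look like this (hypothetical records; the four mandatory columns plus one extra):

occurrenceID,nameComplete,decimalLongitude,decimalLatitude,country
occ1,Puma concolor,-68.85,-11.15,BR
occ2,Puma concolor,-64.10,-12.32,BR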
Processors (4)

Merge_String_List_to_a_String (localworker)
Script:
// Joins the incoming string list into a single string, using "\n" unless a
// separator is supplied. ("seperator" is the port name as spelled in the
// Taverna local worker.)
String seperatorString = "\n";
if (seperator != void) {
    seperatorString = seperator;
}
StringBuffer sb = new StringBuffer();
for (Iterator i = stringlist.iterator(); i.hasNext();) {
    String item = (String) i.next();
    sb.append(item);
    if (i.hasNext()) {
        sb.append(seperatorString);
    }
}
concatenated = sb.toString();
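For readers following along outside Taverna, a rough standalone Java equivalent of this local worker (class and method names here are illustrative, not part of the workflow):

import java.util.Arrays;
import java.util.List;

public class MergeStringList {
    // Joins list items with the separator, mirroring the local worker above.
    static String concatenate(List<String> stringlist, String separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < stringlist.size(); i++) {
            sb.append(stringlist.get(i));
            if (i < stringlist.size() - 1) {
                sb.append(separator);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println(concatenate(items, "\n"));
        // Since Java 8, String.join("\n", items) gives the same result.
    }
}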
parse_header (beanshell)
Script:
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import au.com.bytecode.opencsv.CSVReader;

CSVReader reader = new CSVReader(new StringReader(csv_content), ',', '"');
int name_idx = -1;
int id_idx = -1;
int long_idx = -1;
int lat_idx = -1;
String[] header = reader.readNext();
if (header != null) {
    List terms = Arrays.asList(header);
    name_idx = terms.indexOf("nameComplete");
    id_idx = terms.indexOf("occurrenceID");  // optional: no error raised if missing
    long_idx = terms.indexOf("decimalLongitude");
    lat_idx = terms.indexOf("decimalLatitude");
} else {
    throw new RuntimeException("The input file provided for species occurrence points is empty.");
}
if (name_idx == -1) {
    throw new RuntimeException("The column nameComplete is missing from the header of the input points file.");
}
if (long_idx == -1) {
    throw new RuntimeException("The column decimalLongitude is missing from the header of the input points file.");
}
if (lat_idx == -1) {
    throw new RuntimeException("The column decimalLatitude is missing from the header of the input points file.");
}
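A quick illustration of the index lookup this script performs; indexOf returns -1 for absent columns, which is why only the mandatory ones trigger errors (the sample header is hypothetical):

import java.util.Arrays;
import java.util.List;

public class HeaderIndexDemo {
    public static void main(String[] args) {
        String[] header = {"occurrenceID", "nameComplete", "decimalLongitude", "decimalLatitude", "country"};
        List terms = Arrays.asList(header);
        System.out.println(terms.indexOf("nameComplete"));     // 1
        System.out.println(terms.indexOf("decimalLongitude")); // 2
        System.out.println(terms.indexOf("basisOfRecord"));    // -1 (absent)
    }
}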
get_first_taxon (beanshell)
Script:
import java.io.StringReader;
import au.com.bytecode.opencsv.CSVReader;

CSVReader reader = new CSVReader(new StringReader(csv_content), ',', '"');
String taxon_name = "";
int name_idx_int = Integer.parseInt(name_idx);
String[] first_line = reader.readNext();   // header line, read only to skip it
String[] second_line = reader.readNext();  // first data record
if (second_line != null) {
    if (name_idx_int >= 0 && second_line.length > name_idx_int) {
        taxon_name = second_line[name_idx_int];
    }
} else {
    throw new RuntimeException("The input file provided for species occurrence points has no other lines after the header.");
}
extract_taxon_points (beanshell)
Script:
import java.io.StringReader;
import java.util.ArrayList;
import au.com.bytecode.opencsv.CSVReader;

CSVReader reader = new CSVReader(new StringReader(csv_content), ',', '"');
int name_idx_int = Integer.parseInt(name_idx);
int id_idx_int = Integer.parseInt(id_idx);
int long_idx_int = Integer.parseInt(long_idx);
int lat_idx_int = Integer.parseInt(lat_idx);
int max_idx = Math.max(name_idx_int, Math.max(id_idx_int, Math.max(long_idx_int, lat_idx_int)));
ArrayList all_points = new ArrayList();
String id;
int i = 0;
String[] line;
while ((line = reader.readNext()) != null) {
    i++;
    if (i == 1) {
        continue;  // skip the header line
    }
    if (line.length > max_idx) {
        if (id_idx_int == -1) {
            id = String.valueOf(i);  // no occurrenceID column: use the line number as id
        } else {
            id = line[id_idx_int];
        }
        if (taxon_name == void || line[name_idx_int].equals(taxon_name)) {
            // The XML point string built here was not captured in this page
            // rendering (only an empty literal survives); see the sketch after
            // this script for its likely shape.
            all_points.add("");
        }
    }
}
num_points = all_points.size();
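The string passed to all_points.add(...) was lost in the HTML rendering of this page. Given the workflow description, each matching record presumably becomes an openModeller XML point element built from the id, longitude and latitude columns. A hedged, self-contained sketch of that construction; the exact element and attribute names are an assumption, not confirmed by this page:

public class PointXmlSketch {
    // Assumed shape of an openModeller point element; not confirmed by this page.
    static String toPointXml(String id, String lon, String lat) {
        return "<Point Id=\"" + id + "\" X=\"" + lon + "\" Y=\"" + lat + "\"/>";
    }

    public static void main(String[] args) {
        System.out.println(toPointXml("occ1", "-68.85", "-11.15"));
        // prints: <Point Id="occ1" X="-68.85" Y="-11.15"/>
    }
}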
Beanshells (3)

parse_header
Inputs: csv_content
Outputs: name_idx, id_idx, long_idx, lat_idx

get_first_taxon
Inputs: csv_content, name_idx
Outputs: taxon_name

extract_taxon_points
Inputs: csv_content, name_idx, taxon_name, id_idx, long_idx, lat_idx
Outputs: all_points, num_points
Outputs (6)

id_idx
Index of the occurrenceID field in the header (starting at 0).

long_idx
Index of the decimalLongitude field in the header (starting at 0).

lat_idx
Index of the decimalLatitude field in the header (starting at 0).

first_taxon_name
First taxon name found in the csv content.

all_points_xml
List of all points (separated by newlines), already in XML format for openModeller.

num_points
Number of points.
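Assuming the point element shape sketched above, a two-record all_points_xml value would look like this (hypothetical values):

<Point Id="occ1" X="-68.85" Y="-11.15"/>
<Point Id="occ2" X="-64.10" Y="-12.32"/>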
Datalinks (15)

extract_taxon_points:all_points -> Merge_String_List_to_a_String:stringlist
csv_content -> parse_header:csv_content
csv_content -> get_first_taxon:csv_content
parse_header:name_idx -> get_first_taxon:name_idx
csv_content -> extract_taxon_points:csv_content
parse_header:id_idx -> extract_taxon_points:id_idx
parse_header:lat_idx -> extract_taxon_points:lat_idx
parse_header:long_idx -> extract_taxon_points:long_idx
parse_header:name_idx -> extract_taxon_points:name_idx
parse_header:id_idx -> id_idx
parse_header:long_idx -> long_idx
parse_header:lat_idx -> lat_idx
get_first_taxon:taxon_name -> first_taxon_name
Merge_String_List_to_a_String:concatenated -> all_points_xml
extract_taxon_points:num_points -> num_points
Version 3 (of 4)