AIT Matchbox Scenario All
Created: 2012-11-24 22:39:18
Last updated: 2012-11-24 22:39:19
In this scenario, matchbox finds duplicates in the digital collection passed to it. All matchbox workflow steps are executed automatically in a single run, and the user receives a list of duplicates as the result. Matchbox is installed on a remote Linux VM; the digital collection is stored on a Windows machine.
Workflow Components
Authors (1)
Titles (1)
AIT Matchbox Scenario All
Descriptions (1)
In this scenario, matchbox finds duplicates in the digital collection passed to it. All matchbox workflow steps are executed automatically in a single run, and the user receives a list of duplicates as the result. Matchbox is installed on a remote Linux VM; the digital collection is stored on a Windows machine.
Dependencies (0)
Inputs (1)
Name | Description
orig_dirlist_file_path | Path to the directory on the server that contains the digital collection to be analysed.
Processors (2)
Name | Type | Description
matchbox | externaltool | This command starts the duplicate-finding process using the FindDuplicates Python script of the matchbox tool. The matchbox tool supports Python 2.7. Execution starts from the directory where the Python scripts are located; if you use the source code from GitHub, this is the scape/pc-qa-matchbox/Python/ directory. The script accepts different parameters. In this workflow the default parameter is 'all', which executes all workflow steps.
parse_matchbox_stdout | beanshell | Script:
#!Pairtree pt = new Pairtree();
#!String id = pt.mapToId("/mnt/abonas/linktree/", barcode_path.substring(0,barcode_path.lastIndexOf("/")));
// Both outputs start with the collection path.
String duplicates_result = target_collection_path + ":\n";
String duplicates_matches = target_collection_path + "\t";
StringTokenizer st = new StringTokenizer(matchbox_stdout, "\n");
boolean startDuplicates = false;
boolean hasDuplicates = false;
while (st.hasMoreTokens()) {
    String token = st.nextToken();
    // Once the header line has been seen, every line containing "=>" is a duplicate pair.
    if (startDuplicates && token.contains("=>")) {
        duplicates_result += token + "\n";
        hasDuplicates = true;
    }
    if (token.contains("=== List of detected duplicates ===")) {
        startDuplicates = true;
    }
}
// Flag the collection: 1 if at least one duplicate pair was listed, otherwise 0.
duplicates_matches += hasDuplicates ? "1" : "0";
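The parsing step above can be tried outside Taverna. The following is a minimal standalone Java sketch of the same logic, not the workflow's own code: the class name ParseMatchboxStdout, the Parsed holder, the collection path, and the sample stdout text are all hypothetical and chosen only for illustration. Only the header string and the "=>" marker are taken from the script above.

```java
import java.util.StringTokenizer;

// Hypothetical standalone version of the parse_matchbox_stdout Beanshell step.
final class ParseMatchboxStdout {

    /** Holds the two values the Beanshell step writes to its output ports. */
    static final class Parsed {
        final String duplicatesResult;
        final String duplicatesMatches;

        Parsed(String duplicatesResult, String duplicatesMatches) {
            this.duplicatesResult = duplicatesResult;
            this.duplicatesMatches = duplicatesMatches;
        }
    }

    static Parsed parse(String targetCollectionPath, String matchboxStdout) {
        StringBuilder result = new StringBuilder(targetCollectionPath).append(":\n");
        boolean inDuplicateList = false;
        boolean hasDuplicates = false;

        StringTokenizer st = new StringTokenizer(matchboxStdout, "\n");
        while (st.hasMoreTokens()) {
            String line = st.nextToken();
            // Lines with "=>" only count once the header has been seen.
            if (inDuplicateList && line.contains("=>")) {
                result.append(line).append("\n");
                hasDuplicates = true;
            }
            if (line.contains("=== List of detected duplicates ===")) {
                inDuplicateList = true;
            }
        }
        // Tab-separated flag: 1 if any duplicate pair was listed, otherwise 0.
        String matches = targetCollectionPath + "\t" + (hasDuplicates ? "1" : "0");
        return new Parsed(result.toString(), matches);
    }

    public static void main(String[] args) {
        // Made-up stdout; real matchbox output may look different.
        String stdout = "extracting features\n"
                + "a.png => b.png (printed before the header, so ignored)\n"
                + "=== List of detected duplicates ===\n"
                + "img_001.png => img_017.png\n"
                + "done\n";
        Parsed p = parse("/data/collection", stdout);
        System.out.print(p.duplicatesResult);
        System.out.println(p.duplicatesMatches);
    }
}
```

In this sketch a StringBuilder replaces the repeated string concatenation of the script; the observable behaviour is intended to be the same. Lines containing "=>" that appear before the header are skipped, which mirrors the startDuplicates check in the script.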
Beanshells (1)
Name | Description | Inputs | Outputs
parse_matchbox_stdout | | target_collection_path, matchbox_stdout | duplicates_result, duplicates_matches
Outputs (4)
Name | Description
results |
stderr |
stdout |
matches |
Datalinks (7)
Source | Sink
orig_dirlist_file_path | matchbox:target_collection_path
matchbox:STDOUT | parse_matchbox_stdout:matchbox_stdout
orig_dirlist_file_path | parse_matchbox_stdout:target_collection_path
parse_matchbox_stdout:duplicates_result | results
matchbox:STDERR | stderr
matchbox:STDOUT | stdout
parse_matchbox_stdout:duplicates_matches | matches
Version 1 (of 1)