ARC to WARC Migration and CDX Index Comparison

Created: 2014-04-23 13:31:02

Workflow for migrating ARC to WARC and comparing the CDX index files (Linux).

The workflow has two input ports: “input_directory”, a local path to the directory containing the ARC files, and “output_directory”, the directory where the workflow outputs are created. The files in the input directory are migrated by the “arc2warc_migration_cli” tool service component. The “cdx_creator_arc” and “cdx_creator_warc” tool service components then create CDX index files for the original ARC file and the migrated WARC file, respectively. Finally, the “cdx_comparison” tool service component compares the two index files using the CSV file comparison tool csvdiff, which compares defined columns of the two CSV files.
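The comparison step can be illustrated with a minimal Python sketch. It is not the csvdiff tool and not the workflow's actual implementation; it only shows the idea of comparing selected columns of two index files. The column layout (URL, timestamp, MIME type, status, digest, offset, file name), the sample records, and the function names read_cdx and compare_columns are all assumptions made for illustration. Offset and file name are left out of the comparison here on the assumption that they legitimately differ between an ARC file and its migrated WARC file.

```python
# Hypothetical sample index data; the header names and records are made up.
ARC_CDX = """ CDX url timestamp mime status digest offset filename
example.org/ 20140423133102 text/html 200 AAAA 0 sample.arc
example.org/a 20140423133105 image/png 200 BBBB 512 sample.arc
"""

WARC_CDX = """ CDX url timestamp mime status digest offset filename
example.org/ 20140423133102 text/html 200 AAAA 380 sample.warc
example.org/a 20140423133105 image/png 404 BBBB 1024 sample.warc
"""


def read_cdx(lines, key_column):
    """Index whitespace-separated records by the value in key_column."""
    records = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header line that starts with "CDX".
        if not fields or fields[0] == "CDX":
            continue
        records[fields[key_column]] = fields
    return records


def compare_columns(arc_records, warc_records, columns):
    """Return (key, column, arc_value, warc_value) for every mismatch.

    A record present in only one index is reported with column None and
    the whole row (or None) in place of the values.
    """
    differences = []
    for key in sorted(arc_records.keys() | warc_records.keys()):
        arc_row = arc_records.get(key)
        warc_row = warc_records.get(key)
        if arc_row is None or warc_row is None:
            differences.append((key, None, arc_row, warc_row))
            continue
        for column in columns:
            if arc_row[column] != warc_row[column]:
                differences.append((key, column, arc_row[column], warc_row[column]))
    return differences


arc = read_cdx(ARC_CDX.splitlines(), key_column=0)
warc = read_cdx(WARC_CDX.splitlines(), key_column=0)

# Compare only MIME type (2), status (3) and digest (4).
differences = compare_columns(arc, warc, columns=[2, 3, 4])
for difference in differences:
    print(difference)
```

In this sketch an empty result would mean that the selected columns agree for every record, which is the outcome one would hope for after a lossless migration.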

Command line applications used by the tool service components:

arc2warc_migration_cli:
cdx_creator_(w)arc:
diff_cdx:

Workflow Components

Authors (1)
Titles (1)
Descriptions (1)
Dependencies (0)
Inputs (2)
Processors (11)
Beanshells (0)
Outputs (5)
Datalinks (19)
Coordinations (5)

Workflow Type

Taverna 2

Version 1 (of 1)
