Reviews (0)
For Workflow: Find Duplicates using Matchbox command line tool
The workflow takes a list of digital documents as input, extracts SIFT features using image processing algorithms, creates a dictionary of visual words, generates BoW (Bag of Words) histograms, and finds duplicates. The number of parallel threads can be passed as a parameter. Finally, the search results are stored in a text file that contains a list of possible duplicates with their associated similarity scores. These scores range from 0 (low similarity) to 1 (high similarity). Image compariso...
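The BoW comparison step can be illustrated with a minimal Python sketch. It is not Matchbox code: the function names (`bow_histogram`, `similarity`, `find_duplicates`), the `threshold` parameter, and the use of histogram intersection as the similarity measure are all assumptions made for illustration, since the description does not say which measure Matchbox uses. The toy descriptors are 2-dimensional for readability, whereas standard SIFT descriptors have 128 dimensions.

```python
from collections import Counter
from math import dist


def bow_histogram(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word and
    return a histogram normalized to sum to 1."""
    counts = Counter(
        min(range(len(vocabulary)), key=lambda i: dist(d, vocabulary[i]))
        for d in descriptors
    )
    total = len(descriptors)
    return [counts[i] / total if total else 0.0 for i in range(len(vocabulary))]


def similarity(hist_a, hist_b):
    """Histogram intersection (an assumed measure, not necessarily Matchbox's).
    For normalized histograms the result lies between 0 and 1."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def find_duplicates(histograms, threshold=0.9):
    """Compare every pair of documents and keep pairs scoring at or
    above the (hypothetical) threshold."""
    names = sorted(histograms)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = similarity(histograms[first], histograms[second])
            if score >= threshold:
                pairs.append((first, second, score))
    return pairs


# Toy data: a two-word vocabulary and three documents.
vocabulary = [(0.0, 0.0), (10.0, 10.0)]
histograms = {
    "a": bow_histogram([(0.0, 1.0), (9.0, 9.0)], vocabulary),
    "b": bow_histogram([(1.0, 0.0), (10.0, 9.0)], vocabulary),
    "c": bow_histogram([(0.0, 0.0), (1.0, 1.0)], vocabulary),
}
duplicates = find_duplicates(histograms)
```

In this toy data, documents "a" and "b" produce the same histogram and are reported as a candidate pair, while "c" uses only one visual word and falls below the threshold. A real run would build the vocabulary by clustering descriptors from the whole collection instead of hard-coding it.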
Created: 2012-07-31 | Last updated: 2012-07-31
Credits: Roman