
Searches for duplicate files in folders (recursively, if needed)


Search Duplicates (searchdups)

This is a simple application that searches for duplicate files in a set of folders. To decide whether two files are identical, it compares their MD5 or SHA-256 hashes. To improve performance, it uses a "smart hash": it first calculates a partial hash and finishes the full calculation only if needed.
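The partial-hash idea can be sketched as follows. This is only an illustration under stated assumptions, not the tool's actual code: the function names (`file_digest`, `find_duplicates`) and the 4096-byte prefix size are hypothetical, and the sketch assumes "only if needed" means "only when two files share the same partial hash".

```python
import hashlib
from collections import defaultdict

# Assumed prefix size for the partial hash; the real tool may use another value.
PARTIAL_BYTES = 4096


def file_digest(path, algorithm="md5", limit=None):
    """Hash the whole file, or only its first `limit` bytes when given."""
    h = hashlib.new(algorithm)
    remaining = limit
    with open(path, "rb") as fh:
        while True:
            size = 65536 if remaining is None else min(65536, remaining)
            if size == 0:
                break  # the requested prefix has been read
            chunk = fh.read(size)
            if not chunk:
                break  # end of file
            h.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return h.hexdigest()


def find_duplicates(paths, algorithm="md5"):
    """Return {full_digest: [paths]} for every group of two or more files."""
    # Pass 1: cheap hash of the first bytes of each file.
    by_partial = defaultdict(list)
    for path in paths:
        by_partial[file_digest(path, algorithm, PARTIAL_BYTES)].append(path)

    duplicates = {}
    for candidates in by_partial.values():
        if len(candidates) < 2:
            continue  # unique prefix, so the rest of the file is never read
        # Pass 2: full hash, only for files whose prefixes collided.
        by_full = defaultdict(list)
        for path in candidates:
            by_full[file_digest(path, algorithm)].append(path)
        for digest, group in by_full.items():
            if len(group) > 1:
                duplicates[digest] = group
    return duplicates
```

In this sketch, a file whose first bytes are unique is read only up to `PARTIAL_BYTES`, which is where the performance gain would come from on large files.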

Additionally, the application includes a pseudo hash that only compares file names. With this "hash algorithm", two files that have the same name are considered identical even if their contents differ.
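A name-based pseudo hash amounts to grouping files by their base name. The sketch below is a hypothetical illustration (the function `group_by_name` is not part of the tool), and it assumes the comparison is case-sensitive and includes the extension, which the description does not specify.

```python
import os
from collections import defaultdict


def group_by_name(paths):
    """Group paths by base file name, ignoring file contents entirely."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.path.basename(path)].append(path)
    # Keep only names that occur more than once.
    return {name: group for name, group in groups.items() if len(group) > 1}


print(group_by_name(["a/x.jpg", "b/x.jpg", "b/y.jpg"]))
# prints {'x.jpg': ['a/x.jpg', 'b/x.jpg']}
```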

Basic usage:

$ searchdups -r . 
> 8f8db820d89c39029a0629094e0f18c9*
/Users/calfonso/Programacion/norepo/searchdups/a1.jpg
/Users/calfonso/Programacion/norepo/searchdups/a11.jpg

Some other features are:

  • Select the hash algorithm (using parameter -H).
  • Search in subfolders (using flag -r).
  • Include hidden folders and files (using flag -a).
  • Show a progress bar during the process (using flag -p).
  • Select which files are processed (using parameter -f for sh-like filters, or parameter -e for regular expressions).
  • Exclude files from processing (using parameter -F for sh-like filters, or parameter -E for regular expressions).
  • Summarize the number of files and folders considered (using flag -s).
  • Write the result to a file (using parameter -o).
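To illustrate the difference between sh-like filters and regular expressions, here is a minimal sketch using Python's standard `fnmatch` and `re` modules. It is not the tool's implementation: the function `is_selected` and its parameters are hypothetical, and the sketch assumes that filters are matched against the base file name and that exclusions take precedence over inclusions.

```python
import fnmatch
import os
import re


def is_selected(path, include_globs=(), include_regexes=(),
                exclude_globs=(), exclude_regexes=()):
    """Decide whether a file would be processed under the given filters."""
    name = os.path.basename(path)  # assumption: filters apply to the file name

    def matches(globs, regexes):
        return (any(fnmatch.fnmatchcase(name, g) for g in globs)
                or any(re.search(r, name) for r in regexes))

    # With no inclusion filter, every file is a candidate.
    has_includes = bool(include_globs or include_regexes)
    if has_includes and not matches(include_globs, include_regexes):
        return False
    # Assumption: an exclusion filter always wins over an inclusion filter.
    return not matches(exclude_globs, exclude_regexes)


print(is_selected("photos/a1.jpg", include_globs=["*.jpg"]))             # True
print(is_selected("photos/a1.jpg", include_regexes=[r"\.(png|gif)$"]))   # False
```

A sh-like pattern such as `*.jpg` must match the whole name, while a regular expression such as `\.(png|gif)$` is searched for anywhere in the name unless it is anchored.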

Please check the CLI help for up-to-date information about the usage of this tool.

Installation

To install the tool, clone the code and run the following command inside the cloned folder:

$ pip install .

or install it from PyPI:

$ pip install searchdups


Source distribution: searchdups-1.0.0.tar.gz (9.1 kB)
