PyDirDuplicateFinder 0.3.0

Analyse all files in one or more directories and manage duplicate files (the same file present with different names)

Introduction

This application helps you clean duplicate files out of your filesystem. Here, duplicates are two or more files that have the same content but may have different names.
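
To make the definition above concrete, here is a minimal sketch of detecting files with the same content but different names. It is not the tool's actual implementation: the function names `file_digest` and `find_duplicates` are hypothetical, and grouping by size and then by SHA-256 digest is an assumed strategy chosen only for illustration.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directory):
    """Group the regular files directly inside `directory` by content.

    Returns a list of lists; each inner list holds paths that share
    both a size and a SHA-256 digest. Empty files are skipped.
    """
    by_size = defaultdict(list)
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(g for g in by_digest.values() if len(g) > 1)
    return groups
```

Comparing sizes first means that only files that could possibly be equal are read in full.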

You can use it in this way:

Usage: duplicatefinder.py [options] [directories]

Analyse all files in one or more directories and manage duplicate files (the
same file present with different names)

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -a ACTION, --action=ACTION
                        choose an action to do when a duplicate is found.
                        Valid options are print,rename,move,ask; print is the
                        default
  -r, --recursive       also check files in subdirectories recursively
  -p PREFIX, --prefix=PREFIX
                        prefix used for renaming duplicated files when the
                        'rename' action is chosen. Default is "DUPLICATED"
  -m PATH, --move-path=PATH
                        the directory where duplicate will be moved when the
                        'move' action is chosen
  -v, --verbose         more verbose output
  -q, --quiet           do not print any messages at all

  Filters:
    Use those options to limit and filter directories and files to check.
    Options belowe that rely on file or directory name support usage of
    jolly characters and can also be used multiple times

    -s MIN_SIZE, --min-size=MIN_SIZE
                        indicate the min size in bytes of a file for being
                        checked. Default is 128. Empty file are always ignored
    --include-dir=INCLUDE_DIR
                        only check directories with this name
    --exclude-dir=EXCLUDE_DIR
                        do not check directories with this name
    --include-file=INCLUDE_FILE
                        limit the search inside file with that name
    --exclude-file=EXCLUDE_FILE
                        ignore the search inside file with that name

Report bugs (and suggestions) to <luca@keul.it>.
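
The name filters described in the help text above accept wildcard ("jolly") characters and can be repeated. The following sketch shows one plausible way such filters could be evaluated. It assumes shell-style patterns as implemented by Python's `fnmatch` module and assumes that exclusion takes precedence over inclusion; neither detail is confirmed by the help text, and `name_allowed` is a hypothetical helper, not part of the tool.

```python
import fnmatch


def name_allowed(name, include=(), exclude=()):
    """Decide whether a file or directory name passes the filters.

    Assumed semantics: an exclude match always rejects the name; if no
    include patterns are given, every non-excluded name passes.
    """
    if any(fnmatch.fnmatchcase(name, pattern) for pattern in exclude):
        return False
    if include:
        return any(fnmatch.fnmatchcase(name, pattern) for pattern in include)
    return True


# Passing the same option several times is modelled here as a list of patterns.
print(name_allowed("photo.jpg", include=["*.jpg", "*.png"]))
print(name_allowed("notes.txt", include=["*.jpg", "*.png"]))
```

`fnmatchcase` is used instead of `fnmatch.fnmatch` so that matching is case-sensitive on every platform.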

TODO

  • More test coverage (maybe some tests can be merged together).
  • Control the maximum recursion depth.
  • Internationalization (at least Italian).
  • A "move to trash" action (a dependency on trash-cli could be a great idea).
  • Release this as a Debian/Ubuntu/Kubuntu package (I would really love this).

Credits

  • Thanks to Lord Epzylon for sending me some code and modifications.

Subversion and other

The SVN repository is hosted at Keul's Python Libraries.

Changelog

0.3.0

  • The runnable script name has been changed to duplicatefinder.py.
  • You can now pass multiple target directories as parameters.
  • Added the --action=ask option for choosing, at every duplicate, which action to perform (interactive mode).
  • Added the --include-dir option to limit the search to specific directories.
  • Added the --exclude-dir option to skip some directories during the search.
  • Added the --include-file option to match only some files in the search.
  • Added the --exclude-file option to skip files in the search, based on file name.
  • An invalid directory name is now handled; previously it only caused an abnormal termination.
  • More graceful handling of the user's break action (CTRL+C).
  • Added the --verbose option to print some more informational messages.
  • Added the --quiet option to output nothing at all.
  • Removed the _same_file function. Python already has a filecmp module (hoping this is faster)!
  • Added an environment for automated tests, and tests too (use --action=tests).
  • Some fixes to the command line help.
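
The 0.3.0 switch to the standard `filecmp` module can be illustrated with a short sketch. How the tool actually calls `filecmp` is not documented here, so the wrapper name `same_content` and the choice of `shallow=False` are assumptions.

```python
import filecmp


def same_content(first, second):
    """Return True if the two files are equal according to filecmp.

    With shallow=False the file contents are compared, instead of
    trusting matching os.stat() signatures (type, size, modification time).
    """
    return filecmp.cmp(first, second, shallow=False)
```

With the default `shallow=True`, two files whose type, size, and modification time match are reported as equal without being read, which is faster but less strict.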

0.2.0

  • Added the move action.
  • Added the --recursive option, to walk an entire tree of folders (thanks to Lord Epzylon).
  • Added the --min-size option, to specify a minimum size of the files to be checked.
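
As an illustration of the --recursive and --min-size options introduced in 0.2.0, the sketch below collects candidate files from a directory. It is an assumed approach built on `os.walk`, not the tool's code: `candidate_files` is a hypothetical name, and treating the minimum size as inclusive is a guess.

```python
import os


def candidate_files(directory, recursive=False, min_size=128):
    """Yield paths of regular files of at least `min_size` bytes.

    Empty files are never yielded, even when min_size is 0.
    """
    for root, dirs, files in os.walk(directory):
        if not recursive:
            dirs[:] = []  # clearing the list in place stops os.walk from descending
        for name in sorted(files):
            path = os.path.join(root, name)
            if os.path.isfile(path) and os.path.getsize(path) >= max(min_size, 1):
                yield path
```

For example, `list(candidate_files(".", recursive=True, min_size=1024))` would list every non-empty file of 1024 bytes or more below the current directory.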

0.1.2

  • Bad bug in the setup.py. Code was ok but the 0.1.1 egg was not installable. Thanks to the everywhere present A. Jung.

0.1.1

  • Fix to the setup.py script.
  • Added doc infos.
  • First egg official release.

0.1.0 - Unreleased

  • First (un)release
 
File                                  Type        Py Version  Uploaded on  Size  # downloads
PyDirDuplicateFinder-0.3.0-py2.5.egg  Python Egg  2.5         2009-08-15   31KB  541
PyDirDuplicateFinder-0.3.0.tar.gz     Source                  2009-08-15   10KB  514