When using fdupes to delete duplicate files across three directories, such as a, b, and c, and you want to keep a first, then b, and delete duplicates from c first, the key is not a complex rule. It is the order of directory arguments.
In non-interactive delete mode, fdupes keeps the first file it sees in each duplicate group and deletes later duplicates. Therefore, directory arguments should be arranged from highest retention priority to lowest.
In other words, to achieve “delete from c first, then b, and keep a as much as possible”, write the command like this:
```shell
fdupes -rdN a b c
```
The scan order is a -> b -> c. When the same file exists in all three directories, the file in a is found first and kept, while duplicates in b and c are deleted. If only b and c contain duplicates, b is kept and c is deleted.
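The keep-first behavior can be verified safely before touching real data. The sketch below emulates it in a throwaway sandbox, using sha256sum as a stand-in for fdupes' content comparison; the sandbox layout and file names are illustrative, not part of fdupes itself:

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b" "$demo/c"
echo "same content" > "$demo/a/file.txt"
echo "same content" > "$demo/b/file.txt"
echo "same content" > "$demo/c/file.txt"

# Walk a, then b, then c; delete any file whose content hash was already seen.
seen=""
for f in "$demo/a"/* "$demo/b"/* "$demo/c"/*; do
  h=$(sha256sum "$f" | cut -d' ' -f1)
  case "$seen" in
    *"$h"*) rm "$f" ;;         # later duplicate: delete
    *)      seen="$seen $h" ;; # first occurrence: keep
  esac
done

ls "$demo/a" "$demo/b" "$demo/c"   # only a/file.txt survives
```

Swapping the order of the three glob patterns in the loop changes which copy survives, which is exactly the effect of reordering directory arguments to fdupes.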
Parameter Meaning
Common parameters are:
- -r: recursively scan subdirectories.
- -d: delete duplicate files.
- -N: when used with -d, skip interactive confirmation, keep the first file in each duplicate group, and delete the rest.
Therefore, the basic format for automatic duplicate deletion is:
```shell
fdupes -rdN dir1 dir2 dir3
```
The earlier a directory appears, the higher its retention priority. The later it appears, the more likely its duplicate files are to be deleted.
Preview Before Deleting
Using -dN deletes files directly, so it is better to preview duplicate groups first:
```shell
fdupes -r a b c
```
The output is grouped by duplicate files. In each group, the file shown earlier is the one more likely to be kept in non-interactive deletion mode.
You can also view summary information:
```shell
fdupes -rm a b c
```
If the data is important, save the result and inspect it manually:
```shell
fdupes -r a b c > duplicates.txt
```
After confirming that the order within each duplicate group matches your expectations, run:
```shell
fdupes -rdN a b c
```
How Subdirectories Are Handled
As long as -r is enabled, fdupes recursively scans all files under the directories you pass in. Retention priority is still determined by the order in which paths appear in the command.
For example:
```shell
fdupes -rdN dir_a dir_b dir_c
```
This means:
- dir_a has the highest priority.
- dir_b comes next.
- dir_c has the lowest priority.
If dir_a/sub1/file.txt and dir_c/sub1/file.txt have identical content, the file under dir_a is kept. If dir_a/x/y/file.txt and dir_c/file.txt have identical content, the file under dir_a is still kept first. fdupes compares file content; filenames and directory depth do not need to match.
Precisely Controlling Subdirectory Priority
If you only pass parent directories, the scan order inside subdirectories is determined by fdupes traversal behavior. This is enough in most cases. But if you want a specific subdirectory to have higher priority, write it explicitly before its parent directory.
For example, suppose you want to keep dir_a first, then keep dir_b/special, then process the rest of dir_b, and finally process dir_c:
```shell
fdupes -rdN dir_a dir_b/special dir_b dir_c
```
This makes fdupes scan dir_b/special before dir_b. By the time dir_b is scanned, files under special have already been recorded, so that subdirectory effectively has higher priority than the rest of dir_b.
This pattern is useful when:
- a is the most important baseline directory.
- A subdirectory inside b is more important than the rest of b.
- c is mainly a low-priority backup directory.
The path order can be extended further:
```shell
fdupes -rdN dir_a/critical dir_a dir_b/special dir_b dir_c
```
The rule is still the same: the earlier it appears, the more likely it is to be kept.
Use a List for Many Directories
If there are many directories and subdirectories, manually writing a long command is error-prone. You can write paths into a text file such as folders.txt, ordered by priority:
```
dir_a
dir_b/special
dir_b
dir_c
```
Then pass them to fdupes with xargs:
```shell
xargs fdupes -rdN < folders.txt
```
If paths may contain spaces, use null-separated input for better safety:
```shell
tr '\n' '\0' < folders.txt | xargs -0 fdupes -rdN
```
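To check this plumbing without risking any deletions, echo can stand in for fdupes. The directory names below are illustrative, and the entry containing spaces shows why the null-separated form is safer:

```shell
# Dry run: substitute echo for fdupes so nothing is scanned or deleted.
list=$(mktemp)
printf '%s\n' "dir_a" "dir_b/special" "dir with spaces" > "$list"
tr '\n' '\0' < "$list" | xargs -0 echo fdupes -rdN
# prints: fdupes -rdN dir_a dir_b/special dir with spaces
```

With plain newline-separated xargs, "dir with spaces" would be split into three separate arguments; the -0 form passes it through intact.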
Important Boundaries
First, fdupes compares file content, not filenames. Two files with completely different names can still be treated as duplicates if their content is identical.
Second, if directory a contains duplicates internally, fdupes -rdN a b c may also delete later duplicates inside a. This command means “keep the first file according to the overall scan order”, not “never delete anything under a”.
Third, by default, fdupes does not follow symbolic links. If you need to handle files behind symlinks, confirm whether -s is needed and whether that matches your data-safety expectations.
Fourth, fdupes only deletes duplicate files. It does not clean up empty directories. After deletion, if b and c contain empty folders, you can run:
```shell
find b c -type d -empty -delete
```
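The find invocation can likewise be rehearsed on a throwaway tree first (paths illustrative). With GNU find, -delete implies depth-first traversal, so empty subdirectories are removed before their parents are examined, and a directory that still contains files is left alone:

```shell
# Rehearse the cleanup on a disposable tree before pointing it at real data.
tree=$(mktemp -d)
mkdir -p "$tree/b/empty_sub" "$tree/c"
echo keep > "$tree/b/file.txt"
find "$tree/b" "$tree/c" -type d -empty -delete
ls "$tree"   # only b remains; its file.txt protected it from deletion
```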
Safer Operating Habit
If the directories contain important data, do not start with -rdN. A safer workflow is:
- Run fdupes -r a b c first to view duplicate groups.
- Confirm that the first file in each group is the one you want to keep.
- Then run fdupes -rdN a b c for automatic deletion.
- After deletion, check whether empty directories need cleanup.
If you are very worried about accidentally deleting files under a, first clean a smaller range of low-priority directories, or export the results and filter them manually. The directory order in fdupes is useful, but it is not an access-control rule. Once a path is included in the scan, duplicate files inside it may participate in deletion decisions.
Summary
To delete duplicate files with fdupes by priority, put the directories you want to keep earlier and the directories you want to delete from later.
To keep a, then b, and delete from c first:
```shell
fdupes -rdN a b c
```
To give a subdirectory higher priority, write it before its parent directory:
```shell
fdupes -rdN dir_a dir_b/special dir_b dir_c
```
The key sentence is simple: fdupes -dN keeps duplicate files that appear first and deletes duplicates that appear later. Directory order is your retention priority.