Bat - Find Duplicates and Delete Older File

Microsoft Windows 7 ultimate 64-bit
May 18, 2010 at 12:37:33
Specs: Windows 7
I have a directory that holds a bunch of TV show video files along with nfo and image files. Each nfo and image file is named after the associated TV show file. For example:

Show Name - 1x01 - Episode Name.mkv, Show Name - 1x01 - Episode Name.nfo, Show Name - 1x01 - Episode Name.tbn

There are multiple files all set up in a similar manner in the directory.

There are cases where there will be duplicate video files whose filenames are identical up to a certain point. For example:

Show Name - 1x01 - Episode Name.mkv, Show Name - 1x01.mkv

So as you can see above, the videos will be duplicates, and they will have the same name up until after the 1x01 (these numbers change based upon the season and episode number of the TV show).

What I am trying to do is to have a batch script scan the current directory, find any video files (either mkv or avi) that have the same filename up until after the numbers, and then delete the older of the two files.

Is this possible?

June 4, 2010 at 13:41:11
Here's my stab at it:

:: Delete.Duplicate.Video.Files.bat
:: Delayed expansion is required for the !variable! references below.
SETLOCAL ENABLEDELAYEDEXPANSION

SET file_path=E:\Temp\Test Video
FOR /F "delims=" %%a IN ('DIR /B /S /O:N "%file_path%\*.mkv" "%file_path%\*.avi"') DO (
    CALL :get_file_info "%%a"
    REM Shift the previous file into old_file, then load the current one.
    SET old_file=!new_file!
    SET old_file_date=!new_file_date!
    SET new_file=!file_name!
    SET new_file_date=!file_date!
    REM FIND /C prints 1 when the previous name contains the current name.
    FOR /F %%b IN ('@ECHO "!old_file!"^|FIND /C "!new_file!"') DO (
        IF "%%b" EQU "1" (
            IF !old_file_date! GTR !new_file_date! (
                ECHO older: !new_file_date! !new_file!
                ECHO newer: !old_file_date! !old_file!
                ECHO DEL "%file_path%\!old_file!.*"
            )
            IF !new_file_date! GTR !old_file_date! (
                ECHO older: !old_file_date! !old_file!
                ECHO newer: !new_file_date! !new_file!
                ECHO DEL "%file_path%\!new_file!.*"
            )
            IF !new_file_date! EQU !old_file_date! (
                ECHO same: !old_file_date! !old_file!
                ECHO same: !new_file_date! !new_file!
                ECHO Same timestamps. No files deleted...
            )
        )
    )
)
EXIT /B

:: Subroutine: sets file_name and file_date for the file passed as the first argument.
:get_file_info
SET file_date=%~t1
FOR /F "tokens=1-4 delims=: " %%c IN ("%file_date%") DO (
    SET hours=%%d
    IF %%d LSS 12 (
        IF "%%f" EQU "PM" (
            SET /A hours=%%d+12
        )
    )
    SET file_date=%%c !hours!:%%e
)
SET file_name=%~n1
GOTO :EOF

Change "file_path=" to the full path of your video folder, and
run as-is to see if the files identified are to your expectations.
Then change "ECHO DEL" to just "DEL" where appropriate to
do the actual deleting of files.

Caveat: This assumes there is only one duplicate
per episode:

Show Name - 1x01.mkv
Show Name - 1x01 - Episode Name.mkv

and no more. If there are two or more duplicates, it is unlikely
that the oldest episode file(s) will be retained.

June 4, 2010 at 17:08:10
Thanks for your help with this, it works great!

When I set the video directory, it locates any duplicates and will delete the older file.

However, I am running into a problem when I use this script in conjunction with my automated TV sorting program.

This program passes the full path of the current working directory to the script as the parameter %1, which I am assuming is where the script should run. I tried replacing the path in your test script with %1, but that didn't work.

Here is the link that shows the output for the script when it is run from my program:

Thanks again for your help.

June 4, 2010 at 17:55:46
If you're using SABnzbd, set the argument passed as %~1.
This should strip off surrounding quotes if they are present in
the provided path. I leave the surrounding quotes off the path, so I can concatenate file names and extensions if need be.
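
As a rough sketch of that suggestion (not the exact script from this thread), the top of the script could take the folder from the first argument instead of a hard-coded path. The empty-argument guard, the usage message and the exit code 1 are illustrative additions:

```batch
@ECHO OFF
:: %~1 is the first argument with any surrounding quotes removed.
SET "file_path=%~1"
IF "%file_path%"=="" (
    REM Illustrative guard, not part of the original script.
    ECHO Usage: %~nx0 "full path to video folder"
    EXIT /B 1
)
:: The quotes are added back only where the path is actually used.
ECHO Scanning "%file_path%" for mkv and avi files
```

Because file_path holds the bare path, "%file_path%\*.mkv" can be built by adding quotes around the whole string at the point of use.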

And it looks like "SET /A hours=%%d+12"
is causing the problem. I hate writing kludgy code, but I seem
to be good at it :-\

Try changing the SETs in this section, which should strip off the
leading zero, keeping Windows from treating the hour as an octal
number (08 and 09 are not valid octal):

FOR /F "tokens=1-4 delims=: " %%c IN ("%file_date%") DO (
    SET hours=0%%d
    SET hours=!hours:00=!
    SET hours=!hours:~-2!
    IF %%d LSS 12 (

The first SET adds a leading 0 to the hour.
The second SET strips out any double zero, so 003 becomes 3.
The third SET keeps only the last two digits, so you don't end up with 010, 011 and 012 :-)
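
To see the three SETs in isolation, here is a small stand-alone sketch that runs them over a handful of sample hour values. The list of hours is chosen for illustration only:

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
:: Sample hour tokens, chosen for illustration.
FOR %%h IN (01 03 08 09 10 11 12) DO (
    SET hours=0%%h
    SET hours=!hours:00=!
    SET hours=!hours:~-2!
    ECHO %%h becomes !hours!
)
```

Single-digit hours such as 08 come out as 8, while 10, 11 and 12 pass through unchanged.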

June 4, 2010 at 18:27:00
Thank you again for your help with this, I really appreciate it.

I am using sabnzbd and I changed what you suggested, however the script does not appear to be working.

Here is the script in its current form with your changes, maybe I missed something:

It appears that I am getting the same output in sab as well:

I also don't know if this is the problem, but instead of the newer file name being just 'Show Name - 1x01.mkv' it is being sorted as 'Show Name - 1x01 - (garbage text)'. I don't think this is the problem, since that doesn't seem to affect the script when I input a specific path.

Again, thank you very much for your help.

June 4, 2010 at 18:52:33
This should do it:
    IF %%d LSS 12 (
        IF "%%f" EQU "PM" (
            SET /A hours=!hours!+12

That was poor programming on my part. I formatted the
"hours" variable to remove the leading zero, but the original
code still used the unaltered "%%d" variable when doing the
addition! :-o
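
Putting the pieces from the last few replies together, the conversion could look like the sketch below. The sample timestamp is an assumed value standing in for %~t1. The final re-padding step is not from this thread: it is added on the assumption that the IF comparisons treat these values as text, in which case 8:41 would otherwise sort after 20:41.

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
:: Assumed sample value; in the real script this comes from the file timestamp.
SET file_date=06/04/2010 08:41 PM
FOR /F "tokens=1-4 delims=: " %%c IN ("%file_date%") DO (
    REM Strip the leading zero so SET /A does not read 08 or 09 as octal.
    SET hours=0%%d
    SET hours=!hours:00=!
    SET hours=!hours:~-2!
    IF %%d LSS 12 (
        IF "%%f" EQU "PM" (
            SET /A hours=!hours!+12
        )
    )
    REM Not from the thread: pad back to two digits for text comparison.
    SET hours=0!hours!
    SET hours=!hours:~-2!
    SET file_date=%%c !hours!:%%e
)
ECHO !file_date!
```

With the sample value above, this prints 06/04/2010 20:41.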

June 4, 2010 at 19:37:17
Thanks, that solved most of the errors that are showing up in the output.

I think I have found out however why the duplicates aren't being deleted.

I spoke too soon earlier when I said that the extra text after the 'Show Name - 1x01 - Garbage Text.mkv' wasn't having an effect. It turns out that if there is any text after the '1x01', the script will not recognize the 2 files as being duplicates.

I have posted a question on the sab forums on how to remove the extra text, however, I wasn't sure if there was something in the script that could be changed to recognize the duplicate files even with the extra text.

thanks again.

June 5, 2010 at 06:59:47
How about this (added /I to FIND command):
FOR /F %%b IN ('@ECHO "!old_file!"^|FIND /I /C "!new_file!"') DO (

which makes the search case insensitive...
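
Another angle, sketched here under assumptions and not tested against SABnzbd output: compare only the part of each name up to the season and episode number. This assumes every name follows the "Show Name - 1x01 - anything" pattern and that the show name itself contains no " - ". The sample names and the key variable are made up for illustration:

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
FOR %%i IN ("Show Name - 1x01 - Episode Name" "Show Name - 1x01 - Garbage Text" "Show Name - 1x01") DO (
    SET "name=%%~i"
    REM Swap each space-hyphen-space separator for a pipe, which cannot occur in a Windows file name.
    SET "key=!name: - =|!"
    REM Keep only the first two fields: show name and episode number.
    FOR /F "tokens=1,2 delims=|" %%j IN ("!key!") DO SET "key=%%j - %%k"
    ECHO !key!
)
```

All three sample names reduce to the same key, Show Name - 1x01, so two files whose keys match could be treated as duplicates in place of the FIND /C test.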

June 5, 2010 at 16:48:40
Hmm, I tried adding /I to the FIND command, to no avail.

I believe, however, that I have found a workaround. I am using sab to make sure that any text after the 1x01 is stripped out, so this seems to be working great now. Thank you so much for your help.

June 7, 2010 at 14:35:57
I have one more question... I am trying to have this script run 2 additional programs that will rename and scrape TV show information from

I am just pointing to the exe's to do this, which has usually worked for me, however I noticed that the command you have at the end of your script 'GOTO :EOF' seems to be ending the bat file before the remaining exe's are run.

This is the setup I am using, with no luck:

"C:\Program Files (x86)\TVRename\TVRename.exe" /doall /hide /quit
"C:\Users\Kevin\Other Stuff\XBMC\Media Companion\mc_com.exe" -e

I would run these in 2 separate bat files, but unfortunately sab only allows 1 script to run for post processing. Is there any way to add those two lines to the script at the end?

June 8, 2010 at 09:25:04
I actually was able to figure out this problem after a few trial and error attempts. I was misreading the script, thinking that GOTO :EOF was the end of the script, not realizing that it was the end of the subroutine and that EXIT /B was the end of the script.

I inserted those two lines of code before the Exit and everything is working great now.

Thanks so much again for your help, orangeboy.

June 8, 2010 at 11:51:47
Glad to hear it! Yes, the GOTO :EOF is acting like a "Return"
from the CALL, and as you found, the script doesn't really end
until the EXIT /B.
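
A small illustration of that flow, with a made-up :say_hello label and messages standing in for the real subroutine:

```batch
@ECHO OFF
CALL :say_hello "first call"
CALL :say_hello "second call"
:: Anything placed here still runs, because each CALL has returned.
ECHO Back in the main body
EXIT /B

:: Subroutine: GOTO :EOF hands control back to the line after the CALL.
:say_hello
ECHO In subroutine: %~1
GOTO :EOF
```

The two subroutine lines print first, then the main-body line, which is why extra programs placed just before EXIT /B do get run.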
