Removing both sets of duplicates in EXCEL

November 26, 2009 at 11:00:21
Specs: Windows XP, ?
I have 2 files of several thousand CDs. The first has everything listed (about 5,000 lines), the second has several thousand titles less (about 3,500 lines - basically, the ones I have sold).

I have merged the two files to create a file where the unsold items are duplicated. What I am trying to do is create a 'sold' file, containing ONLY the lines that are are NOT duplicated. In other words, where the lines are duplicated, I need to remove BOTH duplicated lines.

In other words, I want the end result to be the approx. 1,500 lines that are in the first file, but not the second.

Any idea how I can do this.

As requested, further information.
Each record takes up 1 row, and over a dozen columns. The easiest to find duplicates would be the barcode (which is unique).

Just to make it absolutely clear....assuming the first file contained the following :

CD Dion, Celine Unison 5 099746 720322
CD Duffy Rockferry 6 02517 56423 7
CD Duran Duran Decade 0 077779317820
CD Dylan, Bob Another Side Of Bob Dylan 5 09975123542
CD Fleetwood Mac Live In London '68 5 014293 666923
CD Furtado, Nelly Whoa, Nelly! 6 00445 03292 7

The second file contained :

CD Dion, Celine Unison 5 099746 720322
CD Duran Duran Decade 0 077779317820
CD Fleetwood Mac Live In London '68 5 014293 666923
CD Furtado, Nelly Whoa, Nelly! 6 00445 03292 7

The merged file would be thus :

CD Dylan, Bob Another Side Of Bob Dylan 5 09975123542
CD Dion, Celine Unison 5 099746 720322
CD Dion, Celine Unison 5 099746 720322
CD Duffy Rockferry 6 02517 56423 7
CD Duran Duran Decade 0 077779317820
CD Duran Duran Decade 0 077779317820
CD Fleetwood Mac Live In London '68 5 014293 666923
CD Fleetwood Mac Live In London '68 5 014293 666923
CD Furtado, Nelly Whoa, Nelly! 6 00445 03292 7
CD Furtado, Nelly Whoa, Nelly! 6 00445 03292 7

Then getting rid of the duplicates would leave the desired result, which would be :

CD Dylan, Bob Another Side Of Bob Dylan 5 09975123542
CD Duffy Rockferry 6 02517 56423 7

Again, any idea how to actually do this?


See More: Removing both sets of duplicates in EXCEL

Report •


#1
November 26, 2009 at 14:43:21
Why are people so set in using the wrong tools?
A database is needed.

Report •

#2
November 26, 2009 at 16:19:49
Yes - I agree. If I had Dbase II Plus on here, I could do it in 2 minutes flat.

The reality is that...

1 - Not many people will (or can) spend a large amount of money on a relational database (the ideal tool for this...

2. Sadly, Excel is the one - and just about only package that is universally recognised. Personally, I would love to never use Excel again (especially since they added the ribbon), but, the reality is that this is just not practical, as I deal with several sites and while they all accept files from database packages, they accept files from different pagages to each other - but, they all accept Excel files.


Report •

#3
November 26, 2009 at 17:56:37
Assuming your data is already sorted so that the duplicates follow one another as in your example, try this code:

What it does is check each item in column B (the bar code) and compare it to the item above it. If they match, it deletes both rows.

When you are deleting items from a list in Excel you have to start at the bottom and work up because if you delete a row on the way down, the counter will increment and skip the row that was moved up after the deletion.

The error checking is used to handle the error that will occur when you reach Row 1. The code will try to check Row (1-1) (Row 0) and throw up an error. The code assumes that once it throws up the error, it must be at Row 1 and therefore done.

Sub DelDups()
'Determine the number of items in column B
 lastRow = Cells(Rows.Count, 2).End(xlUp).Row
On Error GoTo Done
'Loop through rows, starting at the bottom
  For delRow = lastRow To 1 Step -1
'If duplicates are found, delete both rows
  If Cells(delRow, 1) = Cells(delRow - 1, 1) Then
    Cells(delRow, 1).EntireRow.Delete
    Cells(delRow - 1, 1).EntireRow.Delete
   End If
 Next
Done:
End Sub


Report •

Related Solutions

#4
November 26, 2009 at 19:31:49
Brilliant :) Thank you :)

Report •

#5
November 26, 2009 at 22:34:55
On the availability of a database program, there are quite a selection of freely available programs from sqlite to mysql and db2.

Report •

#6
November 27, 2009 at 11:50:36
re: "A database is needed."

Needed is a pretty strong word.

It might be helpful, but it's not needed.

re: If I had Dbase II Plus on here, I could do it in 2 minutes flat

If I hadn't bothered to copy/paste/format the OP's example data just to ensure that I got the same results that the OP posted, I would have been done in less than that.


Report •


Ask Question