remove duplicate entries

Original Message
Name: cleasse
Date: February 17, 2007 at 22:06:59 Pacific
Subject: remove duplicate entries
OS: unix
CPU/Ram: 512
Model/Manufacturer: hp
Comment:

How can I remove duplicate entries with awk?

input file:
1
1
2
3
3
4
4

expected output:
2

Basically, I want to drop every value that appears more than once and print only the values that appear exactly once.

Thanks for the help!

Regards,



Response Number 1
Name: cleasse
Date: February 17, 2007 at 22:39:54 Pacific
Subject: remove duplicate entries
Reply:

It works with
sort <inputfile> | uniq -u

but how do I do it with awk?

Regards,



Response Number 2
Name: nails
Date: February 18, 2007 at 00:17:11 Pacific
Subject: remove duplicate entries
Reply:

If the data file is already sorted, this works:

cat data.file|uniq -u

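Under the same assumption (data.file is already sorted), here is an awk version:

awk '
{
    if (NR == 1) {          # first record: start the first group
        fr = $1
        cnt = 1
        next                # skip to the next input record
    }

    if (fr == $1)
        cnt++               # same value as the previous record
    else {
        if (cnt == 1)       # previous value occurred exactly once
            print fr
        fr = $1             # start a new group
        cnt = 1
    }
}
END {
    if (cnt == 1)           # flush the last group
        print fr
}' data.file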
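
If data.file is not sorted, one possible awk-only approach is to read the file twice: count each value on the first pass, then print the values seen exactly once on the second. This is only a minimal sketch and not part of the script above; the temporary file and the names tmp, result and count are made up for illustration.

```shell
# Hypothetical unsorted sample, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 3 1 4 2 1 3 4 > "$tmp"

# The file is named twice so awk reads it twice.
# Pass 1 (NR == FNR): count how often each first field occurs.
# Pass 2: print only the records whose first field was counted once.
result=$(awk 'NR == FNR { count[$1]++; next } count[$1] == 1' "$tmp" "$tmp")

echo "$result"    # prints 2
rm -f "$tmp"
```

The output keeps the input order, and the whole set of distinct values is held in memory, which may matter for very large files.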



Response Number 3
Name: cleasse
Date: February 18, 2007 at 02:14:26 Pacific
Subject: remove duplicate entries
Reply:

it works! thanks nails!

Regards,



Response Number 4
Name: thepubba1
Date: February 19, 2007 at 17:25:39 Pacific
Subject: remove duplicate entries
Reply:

In a shell scripting class, you would have to explain this:

sort <inputfile> | uniq -u

since

sort -n -u < inputfile

does the same thing. Two commands should never be used when one will do.



Response Number 5
Name: thepubba1
Date: February 19, 2007 at 17:30:22 Pacific
Subject: remove duplicate entries
Reply:

Nails:

UUOC (useless use of cat) award for this suggestion:

cat data.file|uniq -u

uniq < data.file

Not sure why uniq has a -u flag, since it only returns unique lines if invoked without arguments.





Response Number 6
Name: nails
Date: February 19, 2007 at 19:51:56 Pacific
Subject: remove duplicate entries
Reply:

Jerry:

Yup, that's a UUOC alright. I should have picked it up.

You might take a closer look at the original requirement. The command:

sort -n -u data.file

only collapses duplicates, keeping one copy of each value. uniq's -u option suppresses every line that is repeated, so only the values that occur exactly once are printed.
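
A quick way to see the difference, assuming the sample input from the original message (the variable names sample, collapsed, plain and unrepeated are just for illustration):

```shell
# Sample input from the original message, one value per line.
sample=$(printf '%s\n' 1 1 2 3 3 4 4)

# sort -n -u keeps one copy of every distinct value.
collapsed=$(printf '%s\n' "$sample" | sort -n -u | paste -s -d ' ' -)

# Plain uniq on sorted input also keeps one copy of each value.
plain=$(printf '%s\n' "$sample" | sort -n | uniq | paste -s -d ' ' -)

# uniq -u prints only the lines that are not repeated.
unrepeated=$(printf '%s\n' "$sample" | sort -n | uniq -u | paste -s -d ' ' -)

echo "sort -n -u:        $collapsed"     # 1 2 3 4
echo "sort -n | uniq:    $plain"         # 1 2 3 4
echo "sort -n | uniq -u: $unrepeated"    # 2
```

Only the last pipeline matches the expected output in the original message.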

