to remove duplicate lines from file

Name: Palsanthi
Date: July 8, 2005 at 00:29:55 Pacific
OS: unix
CPU/Ram: 1.8 GHZ / 256 MB
Comment:

Hi,
I want to remove duplicate lines from anywhere in a file. The uniq command removes only adjacent duplicate lines. sort with the -u option removes duplicate lines from anywhere in the file, but it also sorts the file. I don't want the file to be sorted; I only want the duplicate lines removed, wherever they occur. If anybody knows how to do this, please reply.



Response Number 1
Name: Jim Boothe
Date: July 8, 2005 at 06:45:18 Pacific
Reply:

The awk script below should work as long as your file is not extremely large, since it keeps every distinct line in memory.

awk '{
# print a line only the first time it is seen
if (!($0 in stored_lines)) {
   print
   stored_lines[$0] = 1
}
}' filein > fileout
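
A shorter form of the same idea, as a sketch: the common awk idiom `!seen[$0]++` prints a line only the first time it appears. The function name `dedup_keep_order`, the array name `seen`, and the sample data are illustrative assumptions, not part of the script above. It still holds every distinct line in memory.

```shell
# dedup_keep_order FILE: print FILE with later duplicate lines dropped,
# keeping the original order. (The function name is hypothetical.)
dedup_keep_order() {
    # seen[$0]++ evaluates to 0 the first time a line appears, so
    # !seen[$0]++ is true only for first occurrences; awk prints those.
    awk '!seen[$0]++' "$1"
}

# Sample data, made up for illustration.
tmpdir=$(mktemp -d)
printf '%s\n' pear apple pear fig apple > "$tmpdir/filein"

dedup_keep_order "$tmpdir/filein" > "$tmpdir/fileout"
cat "$tmpdir/fileout"
```

Because the whole program fits on one line, it also avoids any trouble with quoted strings that span several lines.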


Response Number 2
Name: Palsanthi
Date: July 10, 2005 at 21:59:20 Pacific
Reply:

Hi Jim,

Thanks a lot for your response. The script you gave works, and I got what I wanted.
Thank you very much.


Response Number 3
Name: VKJAIN
Date: July 11, 2005 at 06:32:26 Pacific
Reply:

My requirement is a little different. I have two files: one has 800 lines and the other has 50. I want to check whether any lines of the second file also appear in the first file. If a line does appear in both, the output file should be the 800 lines with that line removed. It should not add any lines from file 2; those lines are only checked against the first file.


Response Number 4
Name: Jim Boothe
Date: July 11, 2005 at 07:34:30 Pacific
Reply:

awk 'BEGIN{
while ((getline < "file50") > 0)
   list50[$1] = 1}
!list50[$1] {print}' file800 > file800new
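
Note that the script above keys on `$1`, the first whitespace-separated field, so it compares whole lines only when the lines contain no spaces. A sketch that compares entire lines instead, assuming that is what is wanted: the file names follow the thread, while the variable names `list`, `line`, and `drop` and the sample contents are made up for illustration.

```shell
# Sample data, made up for illustration; file names follow the thread.
tmpdir=$(mktemp -d)
printf '%s\n' 'alpha one' 'beta two' 'gamma three' 'beta two' > "$tmpdir/file800"
printf '%s\n' 'beta two' 'delta four' > "$tmpdir/file50"

# BEGIN: read every line of the short file into the drop[] array.
# Main rule: print a line of file800 only if it is not a key of drop[].
awk -v list="$tmpdir/file50" \
    'BEGIN { while ((getline line < list) > 0) drop[line] = 1 } !($0 in drop)' \
    "$tmpdir/file800" > "$tmpdir/file800new"

cat "$tmpdir/file800new"
```

Lines that exist only in file50 ('delta four' here) never reach the output, since only file800 is read as input.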


Response Number 5
Name: VKJAIN
Date: July 12, 2005 at 06:49:55 Pacific
Reply:

When I use this script, it says:

> awk 'BEGIN
Unmatched '.


Response Number 6
Name: Jim Boothe
Date: July 12, 2005 at 07:22:38 Pacific
Reply:

Some shells (csh and tcsh, for example) require each end-of-line to be escaped with a backslash when a quoted string spans multiple lines. Try this:

awk 'BEGIN{ \
while ((getline < "file50") > 0) \
   list50[$1] = 1} \
!list50[$1] {print}' file800 > file800new
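
Another way around the quoting problem, as a sketch: keep the awk program in its own file and run it with `awk -f`, so no multi-line quoted string ever reaches the shell. The file name `filter.awk` and the sample data are assumptions made for illustration; the program text is the one from this thread.

```shell
# Sample data, made up for illustration; file names follow the thread.
tmpdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$tmpdir/file800"
printf '%s\n' beta > "$tmpdir/file50"

# Save the awk program to a file (the name filter.awk is hypothetical).
# Any text editor would do; a here-document is used to keep this runnable.
cat > "$tmpdir/filter.awk" <<'EOF'
BEGIN {
    while ((getline < "file50") > 0)
        list50[$1] = 1
}
!list50[$1] { print }
EOF

# Run from the data directory, because the program opens "file50"
# by a relative name.
(cd "$tmpdir" && awk -f filter.awk file800 > file800new)
cat "$tmpdir/file800new"
```

The command line is then a single short line in any shell, with nothing to escape.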


Response Number 7
Name: VKJAIN
Date: July 12, 2005 at 07:23:19 Pacific
Reply:


It worked well after I switched to bash.

Thanks a lot, Jim.

