How to remove duplicate lines from a file
Name: Palsanthi Date: July 8, 2005 at 00:29:55 Pacific OS: unix CPU/Ram: 1.8 GHZ / 256 MB
Comment:
Hi, I want to remove duplicate lines from any part of a file. The uniq command removes only adjacent duplicate lines. The sort command with the -u option removes duplicate lines from any part of the file, but it also sorts the file. I do not want the file to be sorted; I only want the duplicate lines removed, wherever they occur. That is my problem. If anybody knows the answer, please reply.
Name: Jim Boothe Date: July 8, 2005 at 06:45:18 Pacific
Reply:
If your file is not extremely large, the awk script below should work.
awk '!($0 in stored_lines) { print; stored_lines[$0] = 1 }' filein > fileout
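To see the approach on a small sample, something like the following could be tried. The file contents and the array name seen are made up for the demonstration; only the idea (remember each line in an awk array and print it the first time it appears) comes from the script above.

```shell
# Sample input with non-adjacent duplicates (contents are invented for this demo).
printf '%s\n' pear apple pear banana apple > filein

# Print a line only the first time it is seen; seen[] remembers every line so far.
awk '!($0 in seen) { print; seen[$0] = 1 }' filein > fileout

# Show the result: original order is kept, later repeats are dropped.
cat fileout
# pear
# apple
# banana
```

The array holds every distinct line in memory, which is why this suits files that are not extremely large.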
Response Number 2
Name: Palsanthi Date: July 10, 2005 at 21:59:20 Pacific
Reply:
Hi Jim,
Thanks a lot for your response. The script you gave works, and I got what I wanted. Thank you very much.
Response Number 3
Name: VKJAIN Date: July 11, 2005 at 06:32:26 Pacific
Reply:
My requirement is a little different. I have two files: the first has 800 lines and the second has 50. I want to check whether any lines of the first file also appear in the second file. If a line does, the output file should omit that line from the 800 lines. The output should not gain any lines from file2; file2 is only used to check against the first file.
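One way this two-file requirement might be sketched with awk is shown below. The file names file1, file2 and file1.filtered, the sample contents, and the array name in_file2 are all assumptions for illustration; file1 stands in for the 800-line file and file2 for the 50-line file.

```shell
# file1 stands in for the 800-line file, file2 for the 50-line file (invented data).
printf '%s\n' alpha beta gamma delta > file1
printf '%s\n' gamma omega beta > file2

# While reading the first argument (NR == FNR), record each line of file2.
# While reading file1, print only the lines that were not recorded.
awk 'NR == FNR { in_file2[$0] = 1; next } !($0 in in_file2)' file2 file1 > file1.filtered

# Lines of file2 are never printed, so nothing from file2 is added to the output.
cat file1.filtered
# alpha
# delta
```

Note that file2 is listed first on the command line so that its lines are loaded before file1 is read. If file2 is empty, the NR == FNR test also matches file1 and nothing is printed, so that case would need separate handling.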
Response Number 4
Name: Jim Boothe Date: July 11, 2005 at 07:34:30 Pacific