Hi all,
I have written a shell script whose output file contains SQL output.
From that file, I want to extract the rows that appear more than once (duplicate rows).
For example, the output file looks like this:
===============================================================
<SH12_MC30_CE_VS_NY_HIST_T>
===============================================================
397 44847
400 33653
401 46455
===============================================================
<SH12_MC30_CE_VS_NY_HIST_T_BKP>
===============================================================
397 44847
398 40107
399 39338
400 33653
In this output, I want only the duplicate numeric rows. The file also contains separator lines, and since those repeat they would be counted as duplicate rows too. So I want only the lines that appear more than once and that contain numbers.

Raghunadh

This awk code only considers lines that start with a digit; every other line is ignored. If valid lines might have leading white space, the pattern would need a slight adjustment.
The code collects each distinct line in an array. A huge number of distinct lines could exhaust memory, in which case you would need a different approach, such as sorting all the lines first.
This solution prints each line that was found more than once, followed by the number of times it was found. If you want just the lines without the count, the print command should be just "print i".
awk '/^[0-9]/ {lsum[$0]++} END { for (i in lsum) if (lsum[i] > 1) print i, "(" lsum[i] ")" }' myfile

Output:
397 44847 (2)
400 33653 (2)
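
For a file too large to hold in an awk array, the sorting approach mentioned above could look roughly like the sketch below. It is only one possible way to do it: the function name dup_numeric_lines is made up for illustration, and it assumes, as the awk one-liner does, that valid lines start with a digit with no leading white space. sort puts identical lines next to each other, uniq -c counts each run, and a small awk filter keeps only the runs seen more than once.

```shell
# Hypothetical helper: print each duplicated numeric line with its count.
dup_numeric_lines() {
    # Keep lines that start with a digit, sort so duplicates are adjacent,
    # count each run with uniq -c, then keep only runs seen more than once.
    grep '^[0-9]' "$1" | sort | uniq -c |
        awk '$1 > 1 { n = $1; sub(/^ *[0-9]+ /, ""); print $0, "(" n ")" }'
}
```

Called as "dup_numeric_lines myfile", this should give the same lines and counts as the awk array version, in sorted order. If the counts are not needed, "grep '^[0-9]' myfile | sort | uniq -d" prints one copy of each duplicated line.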
