Need sed solution to detect blank lines

March 21, 2011 at 12:00:36
Specs: Linux i686
A colleague had 804 ASCII files and needed to extract the tabular data from the first part and the last part of each file into Excel spreadsheets. He couldn't figure out a way to do this using Windows tools. I told him he needed Linux tools and that sed was his answer. Since I could see that the useful data in the first part of each file was always in lines 2-44, I was able to get the data into new files using a shell script like:
for file in *.lst
do
sed '2,44!d' "$file" > "../Listout/$file"
done

And I got the last part of the file by keying on the string 'Clock':
for file in *.tab
do
sed '/Clock/,$!d' "$file" > "./Tableout/$file"
done

But some of these files have additional useful data in the middle of the file. It sits under the line that starts with 'Supplemental' and begins at the header line that starts with a blank character followed by 'Record'. The number of data lines that follow the header line varies, which makes it tricky. I thought I could get sed to stop deleting at 'Record' and resume at the first blank line after it, but this approach didn't work for me:
for file in *.tab
do
sed '/Record/,/ /!d' "$file" > "../sup/$file.sup"
done

In the result, I got the lines from 'Record' to the end of the file.

The data in this area looks like:
Velocity 1.8 %^M
Width 0.1 %^M
^M
Supplemental_Data ^M
Record Date Time Location(ft) Gauge_Height(ft) Rated_Flow(cfs) Comments^M
01 2011/01/31 10:15:46 0.000 1.870 108.0063 ^M
^M
St Clock Loc Depth IceD %Dep MeasD Npts Spike Vel SNR Angle Verr Bnd Temp CorrFact MeanV Area Flow %Q ^M
() () (ft) (ft) (ft) (*D) (ft) () () (ft/s) (dB) (deg) (ft/s) () (degF) () (ft/s) (ft^2) (cfs) (%) ^M
00 10:16 6.00 0.000 0.000 0.0 0.000 0 0 0.0000 0.0 0 0.0000 0 0.00 1.00 0.0000 0.000 0.0000 0.0^M
01 10:16 6.80 1.680

I'm trying to copy just the lines from the 'Record' line to the next blank line. But that line is not really blank; it has a ^M in it. So this did not work either:
for file in *.tab
do
sed '/Record/,/^[ ]*$/!d' "$file" > "../sup/$file.sup"
done

That gave me the same result: the lines from 'Record' to the end of the file. The blank line isn't exactly blank; it contains only a ^M. I can't key on ^M alone because every line has one at the end.
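
One possible way around the ^M, sketched here and not tested against the real files: the POSIX character class [[:space:]] matches a carriage return as well as spaces and tabs, so a line holding only ^M can be treated as blank. The function name extract_supplemental is made up for illustration, and the start pattern assumes the header line is optional whitespace followed by 'Record'.

```shell
# Hypothetical helper (name is illustrative only).
# Prints from the 'Record' header line through the next line that holds
# nothing but whitespace. [[:space:]] includes the carriage return (^M),
# so a line containing only ^M ends the range.
extract_supplemental() {
    sed -n '/^[[:space:]]*Record[[:space:]]/,/^[[:space:]]*$/p' "$1"
}
```

It could be called as extract_supplemental "$file" > "../sup/$file.sup" inside the loop. The output keeps the ^M on each line, and the closing blank line is included.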

It would be extra tricky to use grep first to find only the files that contain the string 'Supplemental' and pass just those to sed.
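
As a rough sketch of the grep idea, grep -q can serve as the test inside the loop, so that only files mentioning 'Supplemental' produce a .sup file. The function name and its two directory arguments are assumptions for illustration; the sed range uses [[:space:]] so that a line holding only ^M counts as blank.

```shell
# Hypothetical wrapper (name and arguments are illustrative only).
# $1 = directory holding the .tab files, $2 = directory for the .sup output.
extract_all_supplemental() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for file in "$src_dir"/*.tab
    do
        # Skip the unexpanded pattern when no .tab files exist.
        [ -e "$file" ] || continue
        # grep -q prints nothing; its exit status says whether the string occurs.
        if grep -q 'Supplemental' "$file"
        then
            sed -n '/^[[:space:]]*Record[[:space:]]/,/^[[:space:]]*$/p' "$file" \
                > "$out_dir/$(basename "$file").sup"
        fi
    done
}
```

For the layout in the question this might be run as extract_all_supplemental . ../sup.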

Or is there even a better approach than sed in a shell script?



#1
March 21, 2011 at 13:17:50
Have you considered removing the control-M (^M) characters? The dos2unix command will do that, or there are scripts. Look here:
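
If dos2unix is not installed, the carriage returns can also be stripped with tr before sed sees the data. This is only a sketch of that idea: the function name is hypothetical, and it assumes the files contain no carriage returns other than the ones at the ends of lines. With the ^M gone, the blank-line pattern from the question matches as intended.

```shell
# Hypothetical helper (name is illustrative only).
# tr -d '\r' deletes every carriage return, so the separator line becomes
# truly empty and the original /^[ ]*$/ pattern can end the range.
extract_after_strip() {
    tr -d '\r' < "$1" | sed -n '/Record/,/^[ ]*$/p'
}
```

The output has Unix line endings, which may or may not suit the later Excel import.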
