Solved Extract data from different lines to a single row

July 16, 2015 at 11:42:11
Specs: Windows 7
So here is my issue. I need to extract a few bits of data from a large output file. All the output is set up the same way, so what I need it to do is (assuming space delimiters):
1. Find the string INTERPOLATED HYDROGRAPH
2. Copy next data token 4 to first column of new “MyOutput” (CAC40 in this case)
3. Go down 6 rows and copy token 2 (1223) to column 2 of new “MyOutput”
4. Go down 2 more rows (8 total) and copy tokens 2-5 to columns 3-6 of new “MyOutput”
In the end my output looks like (repeated a thousand times):
CAC40 1223 441 1166 1456 1456
I have bolded these values in the sample input at the end. The first part is simple:

@echo off
setLocal EnableDelayedExpansion
for /f "tokens=4 delims= " %%a in ('find "INTERPOLATED HYDROGRAPH" ^< INPUT.txt') do (echo %%a >> MyOutput.out
)

That gets me the "CAC40", but I don’t know how to get the batch to go down 6 rows and extract token 2, then go 2 more rows and extract tokens 2-5, put all the data on a single line in my new output file, and then continue to the next one. Any thoughts?
Here is a snippet of what I am trying to extract from. Everything I need is at the very bottom. Thanks
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***


**************
* *
3018 KK * CAC40 * COMBINE
* *
**************
Combine basin BAC40 and routed flow from CAC49

3020 HC HYDROGRAPH COMBINATION
ICOMP 2 NUMBER OF HYDROGRAPHS TO COMBINE

***

*** *** *** *** ***

HYDROGRAPH AT STATION CAC40
TRANSPOSITION AREA .0 SQ MI

PEAK FLOW TIME MAXIMUM AVERAGE FLOW
6-HR 24-HR 72-HR 166.58-HR
+ (CFS) (HR)
(CFS)
+ 1434. 12.67 979. 641. 269. 116.
(INCHES) .170 .444 .560 .560
(AC-FT) 485. 1272. 1603. 1603.

CUMULATIVE AREA = 53.67 SQ MI


*** *** *** *** ***

HYDROGRAPH AT STATION CAC40
TRANSPOSITION AREA 10.0 SQ MI

PEAK FLOW TIME MAXIMUM AVERAGE FLOW
6-HR 24-HR 72-HR 166.58-HR
+ (CFS) (HR)
(CFS)
+ 1350. 12.67 944. 617. 260. 112.
(INCHES) .163 .428 .540 .540
(AC-FT) 468. 1224. 1545. 1545.

CUMULATIVE AREA = 53.67 SQ MI


*** *** *** *** ***

HYDROGRAPH AT STATION CAC40
TRANSPOSITION AREA 30.0 SQ MI

PEAK FLOW TIME MAXIMUM AVERAGE FLOW
6-HR 24-HR 72-HR 166.58-HR
+ (CFS) (HR)
(CFS)
+ 1266. 12.67 908. 597. 250. 108.
(INCHES) .157 .414 .519 .519
(AC-FT) 450. 1184. 1485. 1485.

CUMULATIVE AREA = 53.67 SQ MI


*** *** *** *** ***

HYDROGRAPH AT STATION CAC40
TRANSPOSITION AREA 60.0 SQ MI

PEAK FLOW TIME MAXIMUM AVERAGE FLOW
6-HR 24-HR 72-HR 166.58-HR
+ (CFS) (HR)
(CFS)
+ 1215. 12.67 887. 586. 244. 105.
(INCHES) .154 .406 .507 .507
(AC-FT) 440. 1163. 1450. 1450.

CUMULATIVE AREA = 53.67 SQ MI


*** *** *** *** ***

HYDROGRAPH AT STATION CAC40
TRANSPOSITION AREA 90.0 SQ MI

PEAK FLOW TIME MAXIMUM AVERAGE FLOW
6-HR 24-HR 72-HR 166.58-HR
+ (CFS) (HR)
(CFS)
+ 1184. 12.67 874. 580. 240. 104.
(INCHES) .151 .402 .499 .499
(AC-FT) 433. 1150. 1429. 1429.

CUMULATIVE AREA = 53.67 SQ MI


*** *** *** *** ***

HYDROGRAPH AT STATION CAC40
TRANSPOSITION AREA 120.0 SQ MI

PEAK FLOW TIME MAXIMUM AVERAGE FLOW
6-HR 24-HR 72-HR 166.58-HR
+ (CFS) (HR)
(CFS)
+ 1166. 12.67 866. 576. 238. 103.
(INCHES) .150 .399 .495 .495
(AC-FT) 429. 1142. 1416. 1416.

CUMULATIVE AREA = 53.67 SQ MI


*** *** *** *** ***

HYDROGRAPH AT STATION CAC40
TRANSPOSITION AREA 150.0 SQ MI

PEAK FLOW TIME MAXIMUM AVERAGE FLOW
6-HR 24-HR 72-HR 166.58-HR
+ (CFS) (HR)
(CFS)
+ 1150. 12.67 860. 573. 236. 102.
(INCHES) .149 .397 .491 .491
(AC-FT) 426. 1136. 1405. 1405.

CUMULATIVE AREA = 53.67 SQ MI


*** *** *** *** ***

HYDROGRAPH AT STATION CAC40
TRANSPOSITION AREA 300.0 SQ MI

PEAK FLOW TIME MAXIMUM AVERAGE FLOW
6-HR 24-HR 72-HR 166.58-HR
+ (CFS) (HR)
(CFS)
+ 1106. 12.67 842. 563. 231. 100.
(INCHES) .146 .390 .480 .480
(AC-FT) 417. 1117. 1375. 1375.

CUMULATIVE AREA = 53.67 SQ MI


*** *** *** *** ***

HYDROGRAPH AT STATION CAC40
TRANSPOSITION AREA 500.0 SQ MI

PEAK FLOW TIME MAXIMUM AVERAGE FLOW
6-HR 24-HR 72-HR 166.58-HR
+ (CFS) (HR)
(CFS)
+ 1067. 12.67 826. 555. 227. 98.
(INCHES) .143 .385 .471 .471
(AC-FT) 410. 1102. 1349. 1349.

CUMULATIVE AREA = 53.67 SQ MI

*** *** *** *** ***

INTERPOLATED HYDROGRAPH AT CAC40

PEAK FLOW TIME MAXIMUM AVERAGE FLOW
6-HR 24-HR 72-HR 166.58-HR
+ (CFS) (HR)
(CFS)
+ 1223. 12.67 890. 588. 245. 106.
(INCHES) .154 .408 .509 .509
(AC-FT) 441. 1166. 1456. 1456.

CUMULATIVE AREA = 53.67 SQ MI


*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***




#1
July 16, 2015 at 19:12:58
✔ Best Answer
@echo off>myoutput & setlocal
set oryan=input.txt
for /f "tokens=1,5 delims=[] " %%a in ('find /n "INTERPOLATED HYDROGRAPH"^<%oryan%') do (
set /a x=%%a+6
set k0=%%b
echo line !x!
set /a y=%%a+8
call :xx
)
goto :eof

:xx
echo incoming: %k0% %x% %y%
find /n /v ""<%oryan%|findstr "^\[!x!\]"
rem pause
for /f "tokens=2 delims=: " %%a in ('find /n /v ""^<%oryan%^|findstr "^\[!x!\] "') do set k=%%a
for /f "tokens=2-6 delims=: " %%a in ('find /n /v ""^<%oryan%^|findstr "^\[!y!\] "') do set k2=%%a %%b %%c %%d

echo %k0% %k% %k2%
>>myoutput echo %k0% %k% %k2%
::==== end batchscript - left some debugging display activated.
Not optimized for speed, so that may become an issue, but it seemed to work in my tests.

message edited by nbrane



#2
July 28, 2015 at 09:05:32
Thanks for the reply, I was hoping to complete this before I left for a camp with my boy last week, but I was unable to, thus the long delay.

I have run this and it doesn't quite seem to work.

... echo incoming: %k0% %x% %y% gives me the first text value and then the line numbers (%x% and %y%) that it should extract the text from, but the code after that does not seem to extract the data I need; those fields are just blank. My output file ends up with:

CAC40

instead of:
CAC40 1223 441 1166 1456 1456

Any thoughts on how to utilize those line #'s to get the right data out? thanks



#3
July 28, 2015 at 19:29:59
I forgot to enable expansion. (My system has it defaulted, so I forget to add it in line one).
I think this is the source of this problem. Just change line one of the script from:
@echo off>myoutput & setlocal
to:
@echo off>myoutput & setlocal enabledelayedexpansion

which will allow the exclams to work in our favor. There may be other problems, but this script worked in my tests as far as it goes. Hope this formula worked:
Kit+Camp=Fun-Injuries
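
For anyone wondering why the missing switch left the fields blank: inside a parenthesized block, %x% is substituted once, when the whole block is parsed, while !x! is substituted each time the line runs, and only when delayed expansion is enabled. A minimal sketch of the difference (the loop values 1 2 3 are made up for illustration and are not from the script above):

```bat
@echo off
setlocal enabledelayedexpansion
set x=0
rem Percent expansion happens once, when the whole block is parsed.
rem Delayed expansion happens every time a line inside the block runs.
for %%n in (1 2 3) do (
    set /a x=%%n+6
    echo percent: %x%   delayed: !x!
)
```

Run as written, the percent form should stay at 0 on every pass while the delayed form follows the updated value. Without enabledelayedexpansion the !x! text is not expanded at all, so findstr would be handed the literal characters !x! instead of a line number, which would explain the empty columns.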



#4
July 28, 2015 at 23:31:59
thanks, that worked perfectly. I love it when it is a simple issue like that. I did have a quick question about the code though:

When it is writing to the file, why do you use delims=: " when it is a space-delimited file, as opposed to omitting the colon? It seems to work both ways, but I was curious if there was an advantage one way or the other.

It is a touch slow, but beats the hell out of trying to extract things by hand. Thanks again. My final code with some comments:

@echo off>output.txt & setlocal enabledelayedexpansion
set input=input.txt

rem this finds the text INTERPOLATED HYDROGRAPH in the input file
rem sets k0 = to the 5th token of that line, the station name
rem sets x = to that line# + 6 and
rem sets y = to that line# + 8
for /f "tokens=1,5 delims=[] " %%a in ('find /n "INTERPOLATED HYDROGRAPH"^<%input%') do (
set /a x=%%a+6
set k0=%%b
echo line !x!
set /a y=%%a+8
call :xx
)
goto :eof
:xx

rem this line would echo the line #'s for each instance in the file
:: echo incoming: %k0% %x% %y%

rem this line takes the line #s and extracts the following:
rem 2nd column of line !x! and the 3rd, 4th & 5th columns of line !y!

find /n /v ""<%input%|findstr "^\[!x!\]"
for /f "tokens=2 delims=: " %%a in ('find /n /v ""^<%input%^|findstr "^\[!x!\] "') do set k=%%a
for /f "tokens=3-5 delims=: " %%a in ('find /n /v ""^<%input%^|findstr "^\[!y!\] "') do set k2=%%a %%b %%c

rem this line would echo the extracted values for each instance
::echo %k0% %k% %k2%

rem this writes the values to a text file
>>output.txt echo %k0% %k% %k2%
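
On the speed point: the script above re-reads the whole input with find /n /v "" for every match. One possible alternative, sketched here and not something posted in this thread, is to walk the file once and remember which line numbers to pick up. The sketch assumes the same input.txt and output.txt names, that INTERPOLATED HYDROGRAPH is the first two words on its line, and that the data contains no exclamation marks or double quotes. The names name, peak, peakLine and volLine are made up for illustration.

```bat
@echo off
setlocal enabledelayedexpansion
set "input=input.txt"
set "output=output.txt"
rem Start with an empty output file.
type nul > "%output%"
rem peakLine and volLine are illustrative names for the two target line numbers.
set "peakLine=0"
set "volLine=0"

rem findstr /n numbers every line as N:text, blank lines included,
rem so the +6 and +8 offsets still count blank lines.
for /f "tokens=1* delims=:" %%n in ('findstr /n "^" "%input%"') do (
    for /f "tokens=1-5" %%a in ("%%o") do (
        if "%%a %%b"=="INTERPOLATED HYDROGRAPH" (
            set "name=%%d"
            set /a peakLine=%%n+6
            set /a volLine=%%n+8
        )
        if "%%n"=="!peakLine!" set "peak=%%b"
        if "%%n"=="!volLine!" (
            >>"%output%" echo !name! !peak! %%b %%c %%d %%e
        )
    )
)
```

This writes the station name, token 2 of the line six rows down, and tokens 2-5 of the line eight rows down, as in the original request. Values are copied as they appear in the file, trailing periods included. It should be faster on a large file because the input is read only once, but that has not been measured here.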



#5
July 29, 2015 at 18:47:01
Very perceptive! You nailed my slop right off the bat. The colon delimiter is indeed superfluous: it was intended for FINDSTR's /n output, which prefixes each line with "N:", and not for FIND's, which uses "[N]". Since FINDSTR was not run with the /n option here, the output never carries a colon prefix. You can remove the DELIMS clause completely from those two lines. Also, I left some unnecessary feedback in the display:

echo incoming: %k0% %x% %y%
find /n /v ""<%oryan%|findstr "^\[!x!\]"
rem pause

The middle line, especially, would slow things down in practical execution since it has to browse the entire file. I employed it for my own debugging and it slipped its way into the forum by stealth and deceit.
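
To make the two prefix styles concrete, here is a small sketch that prints both and then splits the FIND /N form into tokens. The file prefix_demo.tmp and its two lines are hypothetical, created only for the demonstration and deleted at the end.

```bat
@echo off
setlocal
rem Hypothetical two-line sample file, created only for this demonstration.
set "demo=prefix_demo.tmp"
>"%demo%" echo INTERPOLATED HYDROGRAPH AT CAC40
>>"%demo%" echo (AC-FT) 441. 1166. 1456. 1456.

echo FIND /N output, prefix is [N]:
find /n /v "" < "%demo%"

echo FINDSTR /N output, prefix is N:
findstr /n "^" "%demo%"

rem With [ ] and space as delimiters, token 1 is the line number
rem and the following tokens are the words of the line.
for /f "tokens=1-3 delims=[] " %%a in ('find /n /v "" ^< "%demo%"') do echo line=%%a first=%%b second=%%c

del "%demo%"
```

With FIND /N there is no colon to split on, so a colon in DELIMS changes nothing for this data. If the DELIMS clause is dropped entirely, FOR /F falls back to its default delimiters, space and tab.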


