extract data from file using script

April 28, 2011 at 13:38:08
Specs: Windows Vista
Hi, I have a problem I can't get my head around and need help, please. I have a large data file in which each record begins with "9" and is 21 lines long, with each line ending with \. I need to extract data from the same places in each record, using only a DOS batch script, and end up with a new file.

Here is an example of 2 records from the large file:

"9" 0000000000001 00 15 12 00 00 0 00021 \
0 00031 0200010000000 04 "????" 0 0 9 000000 00000999 \
00000000 00 01 "????" "????" 000 000 00000000 0 000000 \
0 0 000000 000000 000000 \
"TEST LABEL TEXT 1" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"TEXT 2" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"TEXT 3" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"????" \
"????" "????" 0000 "????" "????" \
"????" \
00 0 \
"????" 0 "????" "????" "????" \
"????" "????" "????" "????" "????" "????" "????" "????" "????" "????" \
"????" "????" "????" "????" "????" "????" "????" "????" "????" "????" 0 1 0 00
"9" 0000000000002 00 20 11 00 00 0 00071 \
0 00081 0200020000000 01 "????" 0 0 8 000000 00000888 \
00000000 00 00 "????" "????" 000 000 00000000 0 000000 \
0 0 000000 000000 000000 \
"TEST TEXT 8" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"TEST TEXT 9" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"" "" "" "" "" "" "" "" "" "" \
"????" \
"????" "????" 0000 "????" "????" \
"????" \
00 0 \
"????" 0 "????" "????" "????" \
"????" "????" "????" "????" "????" "????" "????" "????" "????" "????" \
"????" "????" "????" "????" "????" "????" "????" "????" "????" "????" 0 1 0 00
"9" 0000000000999 00 20 00 00 00 3 00000 \

I am looking to get an output file like this. These fields are located in the same place in every record. Any help would be greatly appreciated. Thanks in advance!

0200010000000,15,00000999,00021,00031,12,04,9,TEST TEXT 1,TEST TEXT 2
0200020000000,20,00000888,00071,00081,11,01,8,TEST TEXT 8,TEST TEXT 9





#1
April 29, 2011 at 01:43:00
This might get done if you specify which parts to extract.


=====================================
Life is too important to be taken seriously.

M2



#2
April 29, 2011 at 12:15:29
Thanks for the prompt reply. From each record I need to extract the following:
from 1st line, positions 4,5 & 8
from 2nd line, positions 2,3,4,8 & 10
from 5th line, position 1
from 7th line, position 1

I am looking to get an output file like this. These fields are located in the same place in every record. Every record is 21 lines long and begins with "9". Thanks.

0200010000000,15,00000999,00021,00031,12,04,9,TEST TEXT 1,TEST TEXT 2
0200020000000,20,00000888,00071,00081,11,01,8,TEST TEXT 8,TEST TEXT 9



#3
April 29, 2011 at 12:19:01
Sorry, 1 mistake:
from 1st line, positions 4,5 & 9, not 8
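
To make the position counting concrete, here is a minimal sketch of how the fields on a line could be numbered. It is written in Python purely to illustrate the logic and is not the DOS batch script asked for. It assumes that fields are separated by whitespace, that a double-quoted string counts as one field, and that the trailing \ only marks the end of the line. The names split_fields and pick are hypothetical helpers.

```python
import re

# A field is either a double-quoted string (quotes dropped) or a run of non-blanks.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')

def split_fields(line):
    """Return the fields of one line; a quoted string counts as a single field."""
    line = line.rstrip()
    # Assumption: a trailing backslash is a line terminator, not a field.
    if line.endswith("\\"):
        line = line[:-1]
    return [m.group(1) if m.group(1) is not None else m.group(2)
            for m in TOKEN.finditer(line)]

def pick(line, positions):
    """Join the fields at the given 1-based positions with commas."""
    f = split_fields(line)
    return ",".join(f[p - 1] for p in positions)

line1 = '"9" 0000000000001 00 15 12 00 00 0 00021 \\'
line2 = '0 00031 0200010000000 04 "????" 0 0 9 000000 00000999 \\'

print(pick(line1, (4, 5, 9)))         # 15,12,00021
print(pick(line2, (2, 3, 4, 8, 10)))  # 00031,0200010000000,04,9,00000999
```

Counted this way, 0200010000000 is position 3 on the 2nd line of the record.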



#4
April 30, 2011 at 01:50:01
I don't get 0200010000000 as pos 4.


=====================================
Life is too important to be taken seriously.

M2



#5
April 30, 2011 at 04:32:00
Sorry, to clarify, here is what I need from the 1st record:

From 1st line in 1st record (positions 4,5 & 9)
"9" 0000000000001 00 15 12 00 00 0 00021 \
15,12,00021,

From 2nd line in 1st record (positions 2,3,4,8 & 10)
0 00031 0200010000000 04 "????" 0 0 9 000000 00000999 \
00031,0200010000000,4,9,00000999,

From 5th line in first record (position 1)
"TEST LABEL TEXT 1" "" "" "" "" "" "" "" "" "" \
TEST LABEL TEXT1

From 7th line in 1st record (position 1)
"TEXT 2" "" "" "" "" "" "" "" "" "" \
TEXT 2

I need to get the 1st record in this format:
15,12,00021,00031,0200010000000,4,9,00000999,TEST LABEL TEXT1,TEXT 2
I need the second and subsequent records in the same format. The second record would be as follows:
20,11.00071,00081,0200020000000,1,8,00000888,TEST TEXT8,TEST TEXT9.

Only the text fields are variable length.
Thanks for your interest; I hope you can help me out.
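
The extraction described above could be sketched as follows. This is Python, used only to illustrate the logic, not the batch script requested. It assumes every record is exactly 21 non-blank lines, that the first field of a record is "9", and that fields are copied verbatim (so 04 stays 04 and the label text keeps its spacing). The names WANTED, fields, extract and convert are hypothetical, and an incomplete record at the end of the file is dropped.

```python
import re

RECORD_LINES = 21  # per the thread: every record is 21 lines long
TOKEN = re.compile(r'"([^"]*)"|(\S+)')

# (line within the record, field positions on that line), all 1-based.
WANTED = [(1, (4, 5, 9)), (2, (2, 3, 4, 8, 10)), (5, (1,)), (7, (1,))]

def fields(line):
    """Split a line into fields; quoted strings are one field, quotes dropped."""
    line = line.rstrip()
    if line.endswith("\\"):  # assumption: trailing backslash ends the line
        line = line[:-1]
    return [m.group(1) if m.group(1) is not None else m.group(2)
            for m in TOKEN.finditer(line)]

def extract(lines):
    """Yield one comma-separated row per complete 21-line record."""
    record = []
    for raw in lines:
        if not raw.strip():
            continue  # ignore blank lines
        record.append(fields(raw))
        if len(record) == RECORD_LINES:
            if record[0][:1] != ["9"]:
                raise ValueError('record does not begin with "9"')
            yield ",".join(record[ln - 1][p - 1]
                           for ln, positions in WANTED
                           for p in positions)
            record = []
    # Any incomplete record left over at the end is dropped.

def convert(src_path, dst_path):
    """Read the data file line by line and write one output row per record."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for row in extract(src):
            dst.write(row + "\n")
```

For the first sample record this would give 15,12,00021,00031,0200010000000,04,9,00000999,TEST LABEL TEXT 1,TEXT 2. A text field that itself contains a comma would need quoting in the output, which this sketch does not handle.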




#6
May 1, 2011 at 16:00:44
Dosmann, I have the feeling that your post is being somewhat ignored by the scripting gurus because you changed the output specs. In your first post you indicate that you want the extracted output to be:
0200010000000,15,00000999,00021,00031,12,04,9,TEST TEXT 1,TEST TEXT 2
0200020000000,20,00000888,00071,00081,11,01,8,TEST TEXT 8,TEST TEXT 9
despite the string TEST TEXT 1 not appearing in the input file you show.

Now it appears that you want the extracted output to be :

15,12,00021,00031,0200010000000,4,9,00000999,TEST LABEL TEXT1,TEXT 2
20,11.00071,00081,0200020000000,1,8,00000888,TEST TEXT8,TEST TEXT9.
(also note the removal of leading zero from 04 and 01 and the use of . {fullstop} which probably should have been , {comma})

It's just not possible to give you what you want unless you specify exactly what that is.


Please come back & tell us if your problem is resolved.



#7
May 2, 2011 at 03:50:17
Hi Wahine, I was looking to extract data from a large data file, i.e. 10 pieces of data from each record, to be compiled into a new file. It's not really important which order the fields are extracted in, provided they are put in the same location in the new file, preferably with the text fields at the end of the line, because that makes it easier to read as the text fields are variable in length. The reason I had to change the format was that one of the replies I received needed further explanation to accurately point to where the data was coming from in the data file. The original post had TEST LABELTEXT 1, and I corrected that in the later posts; sorry, my mistake. The full stop is also a typing error. I have managed to write a script to extract the data, but it needs minor adjustments. It's a bit big to post, so if anyone is interested in helping me over the final hurdle I would be very thankful.

