Awk to parse and output
Original Message
Name: DeeDogg
Date: September 12, 2005 at 09:55:42 Pacific
Subject: Awk to parse and output
OS: Sun OS
CPU/Ram: sparc9, 2 GB RAM
Comment: Hi, I have a list of items separated by semicolons. Is it possible to output each item to a separate file using awk (or nawk)? Thanks in advance. -Jean
Example: item#1 aaa bbb ccc ; item#2 aaa bbb ccc ; item#3 aaa bbb ccc ; ....
Response Number 1
Name: Luke Chi
Date: September 12, 2005 at 10:55:07 Pacific
Reply: program:
    awk -F";" '{
        print $1 > "item1.txt"
        print $2 > "item2.txt"
        print $3 > "item3.txt"
    }' input.txt
input.txt:
    1;2;3
    a;b;c
    first;second;third
output:
    item1.txt:
    1
    a
    first
    item2.txt:
    2
    b
    second
    item3.txt:
    3
    c
    third
Luke Chi
Response Number 2
Name: DeeDogg
Date: September 12, 2005 at 11:36:14 Pacific
Reply: Hi Luke, thanks for the reply. I was looking for:
    item1.txt: aaa bbb ccc ;
then
    item2.txt: aaa bbb ccc ;
etc. I don't want to separate the words on the lines, but rather the paragraphs, which are separated by the semicolon.
Thanks, Jean
Response Number 3
Name: Jim Boothe
Date: September 12, 2005 at 12:15:15 Pacific
Reply: The csplit command can come in very handy at times. The following command will split infile into separate files named outfile00, outfile01, etc., using each line that consists of only a semicolon as the separator:
    csplit -f outfile infile '/^;$/' '{*}'
The {*} following the pattern says to use that pattern repeatedly. The problem with this solution is that the matched separator line becomes the first line of each new output file. I think you want to throw that line away, and I do not find a csplit option to do so.
So here is an awk solution. I coded it two different ways, but the contents of the output files will be identical.
    awk '
    BEGIN { fileout = "outfile.001"; seq = 1 }
    {
        if ($0 == ";") {
            # separator line: finish the current file and start the next one
            close(fileout)
            seq++
            fileout = sprintf("outfile.%3.3d", seq)
            next
        }
        print > fileout
    }' infile

    awk '
    BEGIN { seq = 1; fileout = "fileout.001" }
    /^;$/ { close(fileout); seq++; fileout = sprintf("fileout.%3.3d", seq); next }
    { print > fileout }' infile
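The awk versions above work line by line, so they suit input where the semicolon sits on a line of its own. If the semicolon can also trail the last line of an item, one alternative is to make the semicolon the record separator. The following is only a sketch under that assumption: the rs_demo directory, the item prefix, and the sample data are made up for illustration, and it needs a POSIX awk (nawk on Solaris rather than the old /usr/bin/awk).

```shell
# Hypothetical sample input: each item ends with ";" at the end of a line.
mkdir -p rs_demo
printf 'item#1 aaa bbb ccc ;\nitem#2 aaa\nbbb ccc ;\nitem#3 aaa bbb ccc ;\n' > rs_demo/infile

awk -v prefix="rs_demo/item" '
BEGIN { RS = ";" }                 # one record per semicolon-terminated item
{
    # Trim leading and trailing blanks and newlines from the record.
    sub(/^[ \t\n]+/, "")
    sub(/[ \t\n]+$/, "")
    if ($0 == "") next             # skip the empty record after the last ";"
    fileout = sprintf("%s%03d.txt", prefix, ++seq)
    print > fileout
    close(fileout)                 # keep the number of open files small
}
' rs_demo/infile

ls rs_demo
```

Because each output file is closed right after it is written, this should not run into the open-file limit even with several hundred items.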
Response Number 4
Name: Luke Chi
Date: September 12, 2005 at 12:41:12 Pacific
Reply: On Redhat Linux:
    $ csplit -f item input.txt '/;/+1' '{*}'
input.txt:
    1 2 3
    ;
    a b c
    ;
    first second third
    ;
output:
    item00:
    1 2 3
    ;
    item01:
    a b c
    ;
    item02:
    first second third
    ;
Note: csplit on my Solaris machine has trouble with {*}.
Luke Chi
Response Number 5
Name: Luke Chi
Date: September 12, 2005 at 13:12:21 Pacific
Reply: The following is for Solaris:
    CT=`grep '^;$' input.txt | wc -l`
    csplit -f item input.txt '/;/+1' "{`expr $CT - 2`}"
My HP is down, so I can't test it on an HP machine at the moment.
Luke Chi
Response Number 6
Name: Jim Boothe
Date: September 13, 2005 at 07:00:21 Pacific
Reply: But still, the csplit +1 operand does nothing to get rid of the delimiter line. The +1 says to split at the line following the one that matches the pattern, instead of at the matching line itself. So instead of each output file beginning with the delimiter line, each output file will end with a delimiter line.
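One way around this, sketched here as an assumption rather than anything csplit does for you, is to let csplit keep the delimiter and strip it from each piece afterwards with sed. The csplit_demo directory and the sample data are made up for illustration; the explicit repeat count of 1 fits this three-item sample only.

```shell
# Hypothetical sample input: three items, each followed by a lone ";" line.
mkdir -p csplit_demo
printf '%s\n' 1 2 3 ';' a b c ';' first second third ';' > csplit_demo/input.txt

# Three separator lines, so the pattern is applied twice (once plus one
# repeat); whatever is left over becomes the last piece.
csplit -s -f csplit_demo/item csplit_demo/input.txt '/^;$/+1' '{1}'

# Each piece now ends with the ";" line, as described above. Delete it.
for f in csplit_demo/item[0-9][0-9]; do
    sed '/^;$/d' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

cat csplit_demo/item01
```

The sed expression removes every line that consists of only a semicolon, so it would also drop such a line from the middle of a piece; with csplit splitting on those same lines, that case should not arise.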
Response Number 7
Name: DeeDogg
Date: September 13, 2005 at 07:06:02 Pacific
Reply: Hi, thanks for the help. I went with the csplit command:
    csplit -k -n{3} -fpin input.file '/^-/' '/;/+1' '{99}'
I told it where each file starts ('-') and where it ends (';'). The problem is the limit of 99, since I have about 578 items in the list. Before, I would get an out of range error, so I added the -n option.
Thanks for the help, Luke and Jim.
Jean
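Instead of hard-coding the repeat count, it could be computed from the number of separator lines, in the same spirit as the Solaris workaround earlier in the thread. This is a sketch only: it assumes a POSIX shell, that every item ends with a lone ";" line (including the last one), and that the local csplit accepts a repeat count above 99, which is worth checking. The count_demo directory and the generated 150-item input are made up for illustration, and the '/^-/' start pattern is left out for simplicity.

```shell
# Hypothetical input: 150 items, each followed by a lone ";" line.
mkdir -p count_demo
awk 'BEGIN { for (i = 1; i <= 150; i++) printf "item#%d\naaa\n;\n", i }' > count_demo/input.file

# Number of separator lines in the input.
count=$(grep -c '^;$' count_demo/input.file)

# The pattern itself is applied once, and the last item is written as the
# leftover piece, so the repeat count is two less than the separator count.
repeat=$((count - 2))

# -n 3 asks for three-digit suffixes (pin000, pin001, ...), which lifts the
# 100-file ceiling of the default two-digit suffix.
csplit -s -n 3 -f count_demo/pin count_demo/input.file '/^;$/+1' "{$repeat}"

ls count_demo/pin* | wc -l
```

With about 578 items the same arithmetic would give a repeat count of 576, which still fits in three-digit suffixes.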