Split row into meaningful records

Original Message
Name: nimi
Date: April 19, 2007 at 00:45:33 Pacific
Subject: Split row into meaningful records
OS: Unix
CPU/Ram: N/A
Model/Manufacturer: N/A
Comment:
Hi,

Please help me to convert the following:

..cnai #Generated on Tue Mar 20 15:47:23 2007 by CNAI R12T02, user echuyau
..capabilities BASIC
.utctime 2007-03-20 07:47:23
.subnetwork ONRM_RootMo:MAXIS2G
.domain NREL
.set BSC2:AIA1B10:*:HAFID11
BSC_NAME="BSC2"
CELL_NAME="AIA1B10"
NREL_NAME="HAFID11"
AWOFFSET=5
BQOFFSET=3
BQOFFSETAFR=3
CAND="BOTH"
CS="NO"
HIHYST=5
KHYST=3
KOFFSET=0
LHYST=3
LOHYST=3
LOFFSET=0
OFFSET=0
TRHYST=2
TROFFSET=0
.set BSC2:ANTAM16:BSC2:ANTAM17
BSC_NAME="BSC2"
CELL_NAME="ANTAM16"
NREL_NAME="ANTAM17"
AWOFFSET=3
BQOFFSET=3
BQOFFSETAFR=3
CAND="BOTH"
CS="YES"
HIHYST=5
KHYST=3
KOFFSET=0
LHYST=3
LOHYST=3
LOFFSET=0
OFFSET=0
TRHYST=2
TROFFSET=0

into this:

AIA1B10|HAFID11|BSC2|AIA1B10|HAFID11|5|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
ANTAM16|ANTAM17|BSC2|ANTAM16|ANTAM17|3|3|3|BOTH|YES|5|3|0|3|3|0|0|2|0

Please help


Nimi


Response Number 1
Name: nails
Date: April 19, 2007 at 17:10:41 Pacific
Subject: Split row into meaningful records
Reply:
One way: when a line starts with .set, grab the second and the last colon-separated fields. For every line after a .set line has been found, grab the last field, using the equal sign as the separator:

#!/bin/ksh

s_set=0
recset=""
# get rid of the double quotes
tr -d '\"' < myfile |
while read line
do
    case "$line" in
    .set*)
        if [[ -n "$recset" ]]
        then
            # get rid of the last pipe
            recset=$(echo "$recset" |sed 's/|$//g')
            echo $recset
        fi
        recset=$(echo "$line"|awk ' BEGIN { FS=":" } { print $2"|"$NF"|" } ')
        s_set=1
        ;;

    *)
        if [[ $s_set -eq 1 ]]
        then
            nl=$(echo "$line"|awk ' BEGIN { FS="=" } { print $NF"|" } ')
            recset=${recset}${nl}
        fi
        ;;
    esac

done
if [[ -n "$recset" ]]
then
    recset=$(echo "$recset" |sed 's/|$//g')
    echo $recset
fi


Response Number 2
Name: nimi
Date: April 19, 2007 at 20:21:16 Pacific
Subject: Split row into meaningful records
Reply:
Hi,

Thank you for the solution. Where in the above code do I put my input file name, and where do I put my output file name?

Thanks
Nimi


Response Number 3
Name: nimi
Date: April 19, 2007 at 20:38:51 Pacific
Subject: Split row into meaningful records
Reply:
What I meant was: how do I change the echo so that it writes to a file?

Nimi


Response Number 4
Name: nails
Date: April 20, 2007 at 07:41:29 Pacific
Subject: Split row into meaningful records
Reply:
First, I called the input myfile:

tr -d '\"' < myfile ....

Change myfile to your file.

Second, there are several ways of writing to an output file. The two 'echo $recset' lines can be changed to this:

echo $recset >> outputfile

Perhaps the easiest is to send the output of the script to a file. If the script is called 'myscript', the command is:

myscript > outputfile
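
The two options above behave slightly differently: `>>` appends each time it runs, while redirecting the whole script with `>` empties the file first. A minimal sketch of the difference, assuming a hypothetical function `emit_records` that stands in for the script's loop and an assumed temporary file name:

```sh
#!/bin/sh
# emit_records is a hypothetical stand-in for the loop that echoes each record.
emit_records() {
    echo 'BSC2|AIA1B10|HAFID11'
    echo 'BSC2|ANTAM16|ANTAM17'
}

outfile=${TMPDIR:-/tmp}/records.$$    # assumed output location

# Redirect once around everything: the file is truncated, then written.
emit_records > "$outfile"

# Append: running the same thing again adds the records a second time.
emit_records >> "$outfile"

wc -l < "$outfile"     # 4 lines: 2 from the first run plus 2 appended
rm -f "$outfile"
```

With `>>` inside the script, remember to remove or empty the output file before a rerun, otherwise old and new records end up in the same file.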


Response Number 5
Name: ghostdog
Date: April 21, 2007 at 21:18:25 Pacific
Subject: Split row into meaningful records
Reply:
you can do everything in awk
[code]
#!/bin/sh


awk 'BEGIN { FS = "="; c = 0 }
NR <= 5 { next }
/\.set/ { c = c + 1; next }
{ array[c] = array[c] $2 "|" }
END {
    for (e in array) {
        b = gensub(/(\")|(\|$)/, "", "g", array[e])
        n = split(b, f, "|")
        printf("%s|%s|", f[2], f[3])

        for (i = 1; i <= n; i++) {
            printf("%s|", f[i])
        }
        print ""
    }
}' "file" > "outputfile"
[/code]


Response Number 6
Name: nimi
Date: April 22, 2007 at 19:26:52 Pacific
Subject: Split row into meaningful records
Reply:
Hi Ghostdog,

Tried your code but I am getting the following error message:

awk: Syntax error near line 7
awk : illegal statement near line 7
awk : new line in string near line 7

FYI, line 7 is END {

I replaced the file with my input file name.


Nimi


Response Number 7
Name: nimi
Date: April 22, 2007 at 20:17:41 Pacific
Subject: Split row into meaningful records
Reply:
Nails,

My file is approximately 6823009 bytes. Your script takes a very long time to complete on it. Please suggest a way to improve this.

Thanks
Nimi
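
One likely reason for the slowness is that the ksh loop starts an echo-plus-awk (or sed) pipeline for every input line. A minimal sketch of the same idea using only shell built-ins inside the loop follows; the function name `join_records` and the sample input are assumptions for illustration, and it is a sketch rather than a drop-in replacement for nails' script:

```sh
#!/bin/sh
# join_records: read CNAI-style text on stdin and print one "|"-joined line
# per .set block. Only shell built-ins run inside the loop, so no awk, sed
# or echo process is started per input line.
join_records() {
    rec=""
    while IFS= read -r line
    do
        case "$line" in
        .set*)
            # A new block starts: print the previous record, if there is one.
            if [ -n "$rec" ]; then printf '%s\n' "$rec"; fi
            rest=${line#*:}                  # drop everything up to the first ":"
            rec="${rest%%:*}|${line##*:}"    # second field, then last field
            ;;
        *=*)
            if [ -n "$rec" ]
            then
                val=${line#*=}               # text after the first "="
                val=${val#\"}                # strip a leading double quote
                val=${val%\"}                # strip a trailing double quote
                rec="$rec|$val"
            fi
            ;;
        esac
    done
    if [ -n "$rec" ]; then printf '%s\n' "$rec"; fi
}

join_records <<'EOF'
.domain NREL
.set BSC2:AIA1B10:*:HAFID11
BSC_NAME="BSC2"
CELL_NAME="AIA1B10"
NREL_NAME="HAFID11"
AWOFFSET=5
..end
EOF
# prints: AIA1B10|HAFID11|BSC2|AIA1B10|HAFID11|5
```

Lines without an equal sign, such as ..end, match neither case branch and are ignored. For a real file the call would look like `join_records < myfile > outputfile`.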


Response Number 8
Name: nimi
Date: April 22, 2007 at 20:54:24 Pacific
Subject: Split row into meaningful records
Reply:
Nails,

Please explain this code to me
if [[ -n "$recset" ]]
then
# get rid of the last pipe
recset=$(echo "$recset" |sed 's/|$//g')
echo $recset
fi


Nimi
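
For readers with the same question, here is a small sketch of what those lines do, wrapped in a hypothetical function `flush_record` so it can be run on its own. It uses `[ ... ]` in place of ksh's `[[ ... ]]` so that it also runs in plain sh. Note that the test needs the variable written as "$recset": a bare `recset` is just a non-empty word, so the test would always succeed.

```sh
#!/bin/sh
# flush_record: print the record without its trailing "|",
# but only if the record is not empty.
flush_record() {
    recset=$1
    if [ -n "$recset" ]      # true only when the string has at least one character
    then
        # s/|$// removes one "|" anchored at the end of the line
        recset=$(echo "$recset" | sed 's/|$//')
        echo "$recset"
    fi
}

flush_record 'AIA1B10|HAFID11|BSC2|'   # prints AIA1B10|HAFID11|BSC2
flush_record ''                        # prints nothing
```

In the script, the empty case matters for the very first .set line, when no record has been collected yet.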


Response Number 9
Name: ghostdog
Date: April 22, 2007 at 21:01:33 Pacific
Subject: Split row into meaningful records
Reply:
use gawk instead of awk.

Response Number 10
Name: nimi
Date: April 22, 2007 at 22:33:30 Pacific
Subject: Split row into meaningful records
Reply:
ghostdog,

I replaced awk with gawk in your script and I got this:


a.sh: gawk: not found

Please help


Nimi


Response Number 11
Name: ghostdog
Date: April 22, 2007 at 23:14:19 Pacific
Subject: Split row into meaningful records
Reply:
Well, gensub is a gawk extension. Since you don't have gawk, you have to use gsub instead. Try this:

[code]
awk 'BEGIN { FS = "="; c = 0 }
NR <= 5 { next }
/\.set/ { c = c + 1; next }
{ array[c] = array[c] $2 "|" }
END {
    for (e in array) {
        gsub(/(\")|(\|$)/, "", array[e])
        n = split(array[e], f, "|")
        printf("%s|%s|", f[2], f[3])
        for (i = 1; i <= n; i++) {
            printf("%s|", f[i])
        }
        print ""
    }
}' "file" > "outputfile"
[/code]


Response Number 12
Name: nimi
Date: April 22, 2007 at 23:44:19 Pacific
Subject: Split row into meaningful records
Reply:
Hi Ghostdog/nails,

I still get the same error. Furthermore, I just realised that I don't need the values following .set; the records should only start from BSC_NAME.

old row
AIA1B10|HAFID11|BSC2|AIA1B10|HAFID11|5|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0

new row should be
BSC2|AIA1B10|HAFID11|5|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0

TQ


Nimi


Response Number 13
Name: ghostdog
Date: April 23, 2007 at 00:06:02 Pacific
Subject: Split row into meaningful records
Reply:
OK, can you try nawk? If you are using Solaris, another option is /usr/xpg4/bin/awk.
nawk works for me too.

If you don't need AIA1B10|HAFID11|, then remove printf("%s|%s|" , f[2] , f[3]) from my code.
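
Rather than editing the script for each machine, one option is to probe for an awk that understands gsub() before running. This is a sketch only: the function name `pick_awk` is made up for illustration, the candidate list is just the names mentioned in this thread, and it assumes a POSIX shell with `command -v`.

```sh
#!/bin/sh
# pick_awk: print the first awk candidate that understands gsub(), or fail.
pick_awk() {
    for candidate in gawk nawk /usr/xpg4/bin/awk awk
    do
        if command -v "$candidate" >/dev/null 2>&1 &&
           [ "$(echo 'a|b|' | "$candidate" '{ gsub(/\|$/, ""); print }' 2>/dev/null)" = 'a|b' ]
        then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

if AWK=$(pick_awk)
then
    echo "using $AWK"
else
    echo "no awk with gsub() found" >&2
fi
```

The script can then call `"$AWK"` wherever it currently calls awk.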


Response Number 14
Name: nimi
Date: April 23, 2007 at 00:43:39 Pacific
Subject: Split row into meaningful records
Reply:
Hi, Ghostdog,

Finally it worked (with nawk) and it was super fast. Thanks for the code. I need help with one more thing: the end of my file has the line ..end, as follows:

.set SHTI25:ZIGYMR8:SHTI25:ZIGYMR7
BSC_NAME="SHTI25"
CELL_NAME="ZIGYMR8"
NREL_NAME="ZIGYMR7"
AWOFFSET=3
BQOFFSET=3
BQOFFSETAFR=3
CAND="BOTH"
CS="YES"
HIHYST=5
KHYST=3
KOFFSET=0
LHYST=3
LOHYST=3
LOFFSET=0
OFFSET=0
TRHYST=2
TROFFSET=0
..end

How do I ignore the ..end line? I realise that this set of records causes the output to go haywire:

RWNG05|KIRAM13|PLT1B17|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0| (ok)
SHTI25|ZIGYMR8|ZIGYMR7|3|3|3|BOTH|YES|5|3|0|3|3|0|0|2|0|| (not ok)
RWNG05|KIRAM13|RIS1U11|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0| (ok)

Ghostdog, can you explain your script to me? I am really interested in picking up awk. How can I get good at it?


Nimi


Response Number 15
Name: ghostdog
Date: April 23, 2007 at 01:28:05 Pacific
Subject: Split row into meaningful records
Reply:
if you want to skip lines beginning with ..end, you can use

...
/^\.\.end/ { next }
...


explanation:
FS is the field separator. I set it to '=', so that the first field $1 will be BSC_NAME, CELL_NAME etc. and the second field $2 will be "BSC2", "ANTAM16" and so on.
I set c=0 so that I can use it as the array index; more on that below.
NR <=5 { next} : skip the first 5 lines (the header lines before the first .set).
/\.set/{ c=c+1 ;next} : when awk sees a line containing .set, it increments c (so c = 1 for the first block) and moves on to the next line.
{ array[c] = array[c]$2"|"} : this appends to the array element. array[c] is "" (empty) the first time; since c is 1, array[1] gets $2 (field 2) and then a "|" appended to it. The result is one line with all the $2 values of the block concatenated.
END { : after awk has processed every line and before it exits, it runs the code inside END {}.
for (e in array) { : loop over the array elements.
gsub(/(\")|(\|$)/,"",array[e]) : replace every double quote, and the pipe at the end of the line, with "".
printf("%s|\n", array[e]) : after the substitution, print the result followed by a pipe.
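
To see the field splitting described above in isolation, here is a small sketch; the function name `show_fields` and the three sample lines are only for illustration:

```sh
#!/bin/sh
# show_fields: with FS set to "=", $1 is the parameter name and $2 its value.
show_fields() {
    awk 'BEGIN { FS = "=" } { printf("name=[%s] value=[%s]\n", $1, $2) }'
}

printf '%s\n' 'BSC_NAME="BSC2"' 'AWOFFSET=5' '..end' | show_fields
# name=[BSC_NAME] value=["BSC2"]
# name=[AWOFFSET] value=[5]
# name=[..end] value=[]
```

The last line shows why ..end caused trouble: it has no "=", so $2 is empty, and appending $2 followed by "|" adds a bare pipe to the record.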

finally, the final code can be like this:
[code]
awk 'BEGIN { FS = "="; c = 0 }
NR <= 5 { next }
/^\.\.end/ { next }
/\.set/ { c = c + 1; next }
{ array[c] = array[c] $2 "|" }
END {
    for (e in array) {
        gsub(/(\")|(\|$)/, "", array[e])
        printf("%s|\n", array[e])
    }

}' "file"

[/code]


As for an awk reference, you can google for GNU awk, then look through the tutorial on its website. Happy awking.


Response Number 16
Name: nimi
Date: April 23, 2007 at 01:51:39 Pacific
Subject: Split row into meaningful records
Reply:
Thanks Ghostdog. I owe you a treat. :)

Nimi


Response Number 17
Name: nimi
Date: April 23, 2007 at 02:29:45 Pacific
Subject: Split row into meaningful records
Reply:
Ghostdog,

Is it all right if the contents of the output file are not in order?

This is the actual sequence in the input file:

BSC2|AIA1B10|HAFID11|5|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
BSC2|ANTAM16|ANTAM17|3|3|3|BOTH|YES|5|3|0|3|3|0|0|2|0
BSC2|ANTAM16|ANTAM18|3|3|3|BOTH|YES|5|3|0|3|3|0|0|2|0
BSC2|ANTAM16|EESTM18|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
BSC2|ANTAM16|FOOYM16|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
BSC2|ANTAM16|HAWPM16|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
BSC2|ANTAM16|JPN1U11|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
BSC2|ANTAM16|PIN1U12|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
BSC2|ANTAM16|TOAHU15|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
BSC2|ANTAM17|ANTAM16|3|3|3|BOTH|YES|5|3|0|3|3|0|0|2|0
BSC2|ANTAM17|ANTAM18|3|3|3|BOTH|YES|5|3|0|3|3|0|0|2|0
BSC2|ANTAM17|EESTM18|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
BSC2|ANTAM17|FOOYM16|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
BSC2|ANTAM17|FOOYM18|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
BSC2|ANTAM17|GMB1U11|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0
BSC2|ANTAM17|GMB1U12|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0


and the output after running your script
BSC2|AIA1B10|HAFID11|5|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|ANTAM16|ANTAM17|3|3|3|BOTH|YES|5|3|0|3|3|0|0|2|0|
BSC2|ANTAM16|ANTAM18|3|3|3|BOTH|YES|5|3|0|3|3|0|0|2|0|
BSC2|ANTAM16|EESTM18|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|ANTAM16|FOOYM16|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|ANTAM16|HAWPM16|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|ANTAM16|JPN1U11|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|ANTAM16|PIN1U12|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|ANTAM16|TOAHU15|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|ANTAM17|ANTAM16|3|3|3|BOTH|YES|5|3|0|3|3|0|0|2|0|
BSC2|YOWCM28|KRU1B11|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|YOWCM28|NCSBM11|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|YOWCM28|NIN1U10|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|YOWCM28|PIREB11|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|YOWCM28|SEPAU11|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|
BSC2|YOWCM28|SQU1B11|3|3|3|BOTH|NO|5|3|0|3|3|0|0|2|0|

Nimi


Response Number 18
Name: ghostdog
Date: April 23, 2007 at 07:16:12 Pacific
Subject: Split row into meaningful records
Reply:
Can you amend the END block to something like the below and try again?

END {
    for (e = 1; e <= c; e++) {
        gsub(/(\")|(\|$)/, "", array[e])
        printf("%s|\n", array[e])
    }
}


Response Number 19
Name: nimi
Date: April 23, 2007 at 18:03:16 Pacific
Subject: Split row into meaningful records
Reply:
Ghostdog.

It worked. Explanation please?

Nimi


Response Number 20
Name: ghostdog
Date: April 23, 2007 at 18:37:56 Pacific
Subject: Split row into meaningful records
Reply:
With reference to the previous code:
/\.set/{ c=c+1 ;next}
{ array[c] = array[c]$2"|"}

This says the items are stored in the array, indexed by the value of c, which is just a number counting up from 1. So the array values are:
array[1], array[2]....
However,
when we displayed the array elements using this form of the for loop:
for (e in array) { }, it lists the items in arbitrary order. If you want them in order, you have to go by the indices. Fortunately the array indices are numbers starting from 1, so it's easy to read them back in order using this form of the for loop:
for (e=1;e<=c;e++) { }
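
Pulling the pieces of this thread together, a consolidated version might look like the sketch below. It is not ghostdog's exact script: the function name `cnai_to_rows` is made up, the header is skipped by waiting for the first .set line instead of counting five lines, every line without "=" is skipped (which covers ..end), and no trailing pipe is printed. Those three changes are assumptions about what is wanted, based on the row format nimi asked for in Response Number 12.

```sh
#!/bin/sh
# cnai_to_rows: one "|"-joined output line per .set block, in input order.
# Reads the files given as arguments, or stdin when there are none.
cnai_to_rows() {
    awk 'BEGIN { FS = "="; c = 0 }
    /^\.set/ { c++; next }             # start a new record
    c == 0   { next }                  # still in the header lines
    !/=/     { next }                  # skip ..end and anything without a value
    {
        v = $2
        gsub(/"/, "", v)               # drop the double quotes
        if (n[c]++) rec[c] = rec[c] "|"
        rec[c] = rec[c] v
    }
    END {
        for (e = 1; e <= c; e++)       # numeric loop keeps the input order
            print rec[e]
    }' "$@"
}

cnai_to_rows <<'EOF'
.domain NREL
.set BSC2:AIA1B10:*:HAFID11
BSC_NAME="BSC2"
CELL_NAME="AIA1B10"
NREL_NAME="HAFID11"
AWOFFSET=5
..end
EOF
# prints: BSC2|AIA1B10|HAFID11|5
```

For a real file the call would be `cnai_to_rows myfile > outputfile`. On systems where plain awk is the old Solaris awk, the awk inside the function would need to be nawk or /usr/xpg4/bin/awk, as discussed earlier in the thread.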


All content ©1996-2007 Computing.Net, LLC