ksh data file manipulation

Original Message
Name: STN
Date: March 1, 2007 at 21:41:19 Pacific
Subject: ksh data file manipulation
OS: SunOS 5.8
CPU/Ram: E250
Model/Manufacturer: Sun
Comment:

I have a file which has patterns of repeated lines. The pattern should be:

a = name
b = var1
c = var2
d = var3

Sometimes, though, one or more of these lines is missing (the c line is missing from the second record below):

a = name1
b = 100
c = 33
d = 800
a = name2
b = 100
d = 800
a = name3
b = 100
c = 33
d = 800

When this happens I want to inject a default value (say 35) for the missing variable.
I don't necessarily need it inserted into the file as displayed above, because the next step is to produce a file that looks like:

name1 800 100 33
name2 800 100 35
name3 800 100 33

Note above that 35 was inserted to maintain consistency (and has meaning). So this value could be inserted as the values are put into the column format. This would then allow me to sort the file on column 2 first, then 3, then 4:

name1 800 100 33
name3 800 100 33
name2 800 100 35

Then, using uniq -d, I could display any duplicates, so the end result would be the script alerting me to duplicate lines (ignoring the name column, since the names will always be unique):

#script output:
namex 100 33 800

Not sure which dup line would be displayed, but the important thing would be that one of the two dup lines was displayed.

I have some commands to do the last couple of parts:

# sort based on column2, then column3, then column4
sort +1 -2 +2 -3 +3 -4
# display any duplicates lines not including column 1
uniq -d -1
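
The `+pos -pos` form of sort and the `-1` form of uniq are the older, obsolescent spellings, and newer implementations may reject them. A minimal sketch of the same two steps using the POSIX `-k` and `-f` options follows; the function name `find_dups` is hypothetical and only there so the pipeline can be reused, and the input is assumed to be the four-column file described above.

```shell
# find_dups (hypothetical name): reads "name col2 col3 col4" lines on stdin
# and prints one line from each group that matches on columns 2 to 4.
find_dups() {
    # -k2,2 -k3,3 -k4,4 sorts on column 2, then column 3, then column 4.
    # uniq -f 1 skips the first field (the name) when comparing lines,
    # and -d prints one copy of each repeated line.
    sort -k2,2 -k3,3 -k4,4 | uniq -d -f 1
}

# Sample run on the example data from this post.
find_dups <<'EOF'
name1 800 100 33
name2 800 100 35
name3 800 100 33
EOF
```

The sort here is lexical rather than numeric, which is enough for bringing identical rows next to each other so that uniq can spot them.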


What I'm struggling with is setting an expected pattern of lines, and then inserting a value when part of the pattern is missing. Any help that could be provided is appreciated.



Response Number 1
Name: nails
Date: March 2, 2007 at 14:31:58 Pacific
Subject: ksh data file manipulation
Reply:

The following script creates 3 output lines like so:

name1 100 33 800
name2 100 35 800
name3 100 33 800

This awk script is tailored to fit your data structure and is easily broken if you change the data requirements:

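awk ' BEGIN { nb=nc=nd=0; defalt="35" }
{
    # An "a = nameN" line starts a new record.
    if($1 == "a" && $3 ~ /^name/)
    {
        # First record: nothing to flush yet, just print the name.
        if(NR == 1)
        {
            print $3
            next
        }

        # Flush the previous record; a value still at 0 means
        # that line was missing, so print the default instead.
        if(nb == 0)
            print defalt
        else
            print nb

        if(nc == 0)
            print defalt
        else
            print nc

        if(nd == 0)
            print defalt
        else
            print nd

        # Reset for the new record and print its name.
        nb=nc=nd=0
        print $3
        next
    }

    if($1 == "b" && $2 == "=")
        nb=$3

    if($1 == "c" && $2 == "=")
        nc=$3

    if($1 == "d" && $2 == "=")
        nd=$3

}
END {
    # Flush the last record.
    if(nb == 0)
        print defalt
    else
        print nb

    if(nc == 0)
        print defalt
    else
        print nc

    if(nd == 0)
        print defalt
    else
        print nd

}
' data.file|xargs -n 4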



Response Number 2
Name: STN
Date: March 2, 2007 at 16:47:55 Pacific
Subject: ksh data file manipulation
Reply:

That is fantastic. Thanks nails, it works like a charm.

One follow-up on the default: how would I set separate defaults for b, c and d?

Example defaults:
b = 1
c = 35
d = 1

I've tried a couple of things but couldn't quite get it to work.

Thanks again.


Response Number 3
Name: STN
Date: March 2, 2007 at 16:53:18 Pacific
Subject: ksh data file manipulation
Reply:

Actually I may have just stumbled on it. Perhaps you could verify it looks like how you would do it:

Assuming b,d default is 1 and c default is 35:

awk ' BEGIN { nb=nc=nd=0; defalt="35"; defalt2="1" }
{
    # An "a = nameN" line starts a new record.
    if($1 == "a" && $3 ~ /^name/)
    {
        # First record: nothing to flush yet, just print the name.
        if(NR == 1)
        {
            print $3
            next
        }

        # Flush the previous record: b and d fall back to defalt2,
        # c falls back to defalt.
        if(nb == 0)
            print defalt2
        else
            print nb

        if(nc == 0)
            print defalt
        else
            print nc

        if(nd == 0)
            print defalt2
        else
            print nd

        # Reset for the new record and print its name.
        nb=nc=nd=0
        print $3
        next
    }

    if($1 == "b" && $2 == "=")
        nb=$3

    if($1 == "c" && $2 == "=")
        nc=$3

    if($1 == "d" && $2 == "=")
        nd=$3

}
END {
    # Flush the last record.
    if(nb == 0)
        print defalt2
    else
        print nb

    if(nc == 0)
        print defalt
    else
        print nc

    if(nd == 0)
        print defalt2
    else
        print nd

}
' data.file|xargs -n 4
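
For comparison, here is a table-driven sketch of the same per-field defaults. It is not the script from this thread: it keeps the defaults in an array, tracks which keys were actually seen (so a genuine value of 0 is not replaced), and prints each record on one line without xargs. The names `fill_defaults`, `emit`, `def`, `val` and `order` are hypothetical. It uses a user-defined function, so on SunOS 5.8 it would need nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk.

```shell
# fill_defaults (hypothetical name): reads "key = value" lines on stdin
# and prints one "name b c d" line per record.
fill_defaults() {
    awk '
    BEGIN {
        # Per-field defaults from the example: b=1, c=35, d=1.
        def["b"] = 1; def["c"] = 35; def["d"] = 1
        nkeys = split("b c d", order, " ")
    }
    # Print the record collected so far, using the default for any
    # key that was never seen, then forget the collected values.
    function emit(   i, k, line) {
        if (name == "") return
        line = name
        for (i = 1; i <= nkeys; i++) {
            k = order[i]
            line = line " " ((k in val) ? val[k] : def[k])
            delete val[k]
        }
        print line
    }
    # An "a = ..." line closes the previous record and starts a new one.
    $1 == "a" && $2 == "=" { emit(); name = $3; next }
    # Remember b, c and d values as they appear.
    ($1 in def) && $2 == "=" { val[$1] = $3 }
    END { emit() }
    '
}

# Usage: fill_defaults < data.file
```

Adding a fourth variable under this layout would mean one more entry in `def` and in the `"b c d"` list, rather than another if/else block in two places.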


Response Number 4
Name: nails
Date: March 2, 2007 at 21:12:54 Pacific
Subject: ksh data file manipulation
Reply:

Glad you like it. Yes, I'd say your change is perfectly reasonable.


Response Number 5
Name: STN
Date: March 3, 2007 at 14:10:37 Pacific
Subject: ksh data file manipulation
Reply:

Thanks a lot. I appreciate your time and expertise.

