need help on SED script

Original Message
Name: esu (by Raj)
Date: July 20, 2006 at 15:03:36 Pacific
Subject: need help on SED script
OS: linux-gnu
CPU/Ram: 64 bit
Model/Manufacturer: linux-gnu
Comment:

How can I write a sed script for the following requirement?

The file Test1.text has the following data:

Che is CLA and XP is //event[tag = cast and line = 29]
Che is CLA and XP is //event[tag = cast and line = 42]
Che is NUL and XP is //event[tag = equal and line = 48]

I want to delete the first part of each line, up to the //. In other words, the file should look like:

//event[tag = cast and line = 29]
//event[tag = cast and line = 42]
//event[tag = equal and line = 48]

The other requirement: I have a file test.txt which has the following data:
28 //event[tag = cast and line = xxx]
41 //event[tag = cast and line = xxx]
47 //event[tag = equal and line = xxx]

I want to replace xxx on each line with the number that appears at the front of that line. For instance, on the first line xxx should be replaced with 28. After the replacement, I want to delete everything up to the //. So the output for the above file should look like:

//event[tag = cast and line = 28]
//event[tag = cast and line = 41]
//event[tag = equal and line = 47]

I look forward to your reply.




Response Number 1
Name: nails
Date: July 20, 2006 at 22:06:07 Pacific
Subject: need help on SED script
Reply:

Sorry, but I don't know sed well enough to answer your questions. However, this awk program answers your first question:

nawk ' BEGIN { FS="/" }
{
printf("//%s\n", $3)
} ' data.txt
# end stub 1

And this one answers your 2nd question:

nawk ' BEGIN { FS="/" }
{
# $1 is "28 " with a trailing space; adding 0 leaves just the number
gsub("xxx", $1+0, $3)
printf("//%s\n", $3)
} ' data1.txt
# end stub2

Since my OS is Solaris, I'm using nawk. You probably have to change that to awk.

Let me know if you have any questions.
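
To see why the scripts above print $3, it can help to dump the fields awk produces when FS is "/". The following is only an illustrative sketch: show_fields is a made-up helper name, and plain awk is assumed to split fields the same way nawk does.

```shell
# show_fields: hypothetical helper that prints every field of each input
# line when the field separator is "/".
show_fields() {
    awk 'BEGIN { FS = "/" }
    {
        for (i = 1; i <= NF; i++)
            printf("$%d=[%s]\n", i, $i)
    }'
}

printf '28 //event[tag = cast and line = xxx]\n' | show_fields
```

Field 1 keeps the space that follows the number, and field 2 is empty because nothing sits between the two slashes, so the text after // lands in field 3.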




Response Number 2
Name: esu (by Raj)
Date: July 21, 2006 at 09:42:32 Pacific
Subject: need help on SED script
Reply:

Thanks, nails.

Yes, I tried it using gawk and it worked.

However, I see the following output:
//event[tag = cast and line = XXX]
//event[tag = cast and line = 42]
//event[tag = equal and line = 48]

I wonder why xxx wasn't replaced with the line number on the first line.

Actually, my requirement is: no matter what is at the end of each line, whether xxx or any number, it has to be replaced with (line number + 1). For instance, the test file has the following data:
28 //event[tag = cast and line = xxx]
41 //event[tag = cast and line = 41]
47 //event[tag = equal and line = xxx]

The script should change that to the following output:
//event[tag = cast and line = 29]
//event[tag = cast and line = 42]
//event[tag = equal and line = 48]



Response Number 3
Name: nails
Date: July 21, 2006 at 11:50:53 Pacific
Subject: need help on SED script
Reply:

I didn't have any trouble with the first line. However, the script is case sensitive: 'xxx' is not the same as 'XXX'. Change it to this:

gsub("xxx|XXX", $1, $3)

For the change in the spec, create a variable myf by adding 1 to field 1. Then perform two global substitutions: the first replaces xxx or XXX, and the second replaces the value of $1 wherever it appears in field 3:

nawk ' BEGIN { FS="/" }
{
myf=$1+1
gsub("xxx|XXX", myf, $3)
gsub($1, myf, $3)
printf("//%s\n", $3)
} ' data1.txt




Response Number 4
Name: esu (by Raj)
Date: July 21, 2006 at 13:35:18 Pacific
Subject: need help on SED script
Reply:

nails,

I used this script, parseAnnotations.sh, which contains:

#!/bin/sh

gawk ' BEGIN { FS="/" }
{
myf=$1+1
gsub("xxx|XXX", myf, $3)
gsub($1, myf, $3)
printf("//%s\n", $3)
} ' annotations.out

After I ran this shell script, I received the following output:

//event[tag = cast and line = 29]
//event[tag = nullptr and line = 30]
//event[tag = cast and line = 42]
//event[tag = equal and line = 48]

The input file I used was:
28 CLASSCAST //event[tag = cast and line = XXX]
29 CLASSCAST //event[tag = nullptr and line = XXX]
42 CLASSCAST //event[tag = cast and line = 42]
48 CLASSCAST //event[tag = equal and line = 48]

So I wonder why the number at the end of the last two lines wasn't incremented by 1. Am I doing anything wrong here?

The output I expect is:
//event[tag = cast and line = 29]
//event[tag = nullptr and line = 30]
//event[tag = cast and line = 43]
//event[tag = equal and line = 49]

The requirement is that, no matter what the last string on each line is, it should be replaced with (line number + 1).



Response Number 5
Name: nails
Date: July 21, 2006 at 14:56:44 Pacific
Subject: need help on SED script
Reply:

You keep changing the structure of the data. With the Field Separator being /, the first field of line 1 is:

28 CLASSCAST

In the other data file, you didn't have CLASSCAST. Add a line to remove the non-numerics from field one:


gawk ' BEGIN { FS="/" }
{
gsub("[^0-9]", "", $1) # remove non-numbers
myf=$1+1
gsub("xxx|XXX", myf, $3)
gsub($1, myf, $3)
printf("//%s\n", $3)
} ' data1.txt
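
The requirement as restated in this thread is that whatever follows "line = " gets replaced by the leading number plus one. A sketch that does that directly, rather than searching for xxx or for the old number, might look like the following. The name bump_lines is made up for illustration, and the sketch assumes the number is the first whitespace-separated word and that every record ends in "line = ...]".

```shell
# bump_lines: hypothetical helper. Replaces whatever follows "line = " with
# (leading number + 1) and drops the text before the //.
bump_lines() {
    awk '
    {
        n = $1 + 1                          # leading number, plus one
        rest = $0
        sub(/^.*\/\//, "//", rest)          # drop everything before the //
        sub(/line = [^]]*\]$/, "line = " n "]", rest)
        print rest
    }' "$@"
}

printf '%s\n' \
    "28 CLASSCAST //event[tag = cast and line = XXX]" \
    "42 CLASSCAST //event[tag = cast and line = 42]" | bump_lines
```

Because the default field separator is used here, extra words such as CLASSCAST between the number and the // do not affect the arithmetic.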




Response Number 6
Name: esu (by Raj)
Date: July 21, 2006 at 15:08:15 Pacific
Subject: need help on SED script
Reply:

I've been playing with Java, Python, and sed to get this done but couldn't so far. That's why I may have made a mistake when giving you the data file. I apologize for that.

This time the script worked great and I got what I needed. Thanks for your time; I really appreciate your cooperation.

Have a nice day !



Response Number 7
Name: esu (by Raj)
Date: July 25, 2006 at 12:24:57 Pacific
Subject: need help on SED script
Reply:

nails,

I want to use the same shell script, which is a wrapper for the awk script. The input for this script is Datafile1. The same script should work for other data files by passing the data file as a command-line argument. What changes need to be made?

#!/bin/sh
gawk ' BEGIN { FS="/" }
{
gsub("[^0-9]", "", $1) # remove non-numbers
myf=$1+1
gsub("xxx|XXX", myf, $3)
gsub($1, myf, $3)
printf("//%s\n", $3)
} ' Datafile1



Response Number 8
Name: nails
Date: July 25, 2006 at 15:12:10 Pacific
Subject: need help on SED script
Reply:

You could put your script within a for loop and process each argument:

#!/bin/sh

for i in "$@"
do
gawk ' BEGIN { FS="/" }
{
gsub("[^0-9]", "", $1) # remove non-numbers
myf=$1+1
gsub("xxx|XXX", myf, $3)
gsub($1, myf, $3)
printf("//%s\n", $3)
} ' "$i"
done
# end script

You also might want to verify the command line arguments are actually files that exist.
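
A minimal sketch of that check, under some assumptions: process_file and process_all are names invented here, the awk program is the one from this thread, and awk is used in place of gawk on the assumption that either is installed.

```shell
#!/bin/sh
# process_file: the awk program from this thread, wrapped in a function.
process_file() {
    awk ' BEGIN { FS="/" }
    {
        gsub("[^0-9]", "", $1) # remove non-numbers
        myf=$1+1
        gsub("xxx|XXX", myf, $3)
        gsub($1, myf, $3)
        printf("//%s\n", $3)
    } ' "$1"
}

# process_all: hypothetical wrapper that skips arguments that are not
# readable regular files and reports them on stderr.
process_all() {
    status=0
    for f in "$@"
    do
        if [ -f "$f" ] && [ -r "$f" ]; then
            process_file "$f"
        else
            echo "skipping $f: not a readable file" >&2
            status=1
        fi
    done
    return $status
}

process_all "$@"
```

Returning a non-zero status when any argument is skipped lets a caller such as a build tool notice the problem instead of silently getting partial output.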



Response Number 9
Name: esu (by Raj)
Date: July 25, 2006 at 15:35:43 Pacific
Subject: need help on SED script
Reply:

I want to run the above script so that it reads the input data file from a command-line argument, but I'm getting the following error:


-bash-3.00$ ./parse2Format.sh ClassCastAnnotations.out
gawk: cmd. line:7: } ClassCastAnnotations.out
gawk: cmd. line:7: ^ syntax error




Response Number 10
Name: esu (by Raj)
Date: July 25, 2006 at 15:41:08 Pacific
Subject: need help on SED script
Reply:

I'm running this script from Ant, where I need to pass the input data file as a command-line argument.

<exec executable="/bin/bash">
<arg line='-c "scripts/parse2Format.sh datafile1.out > formatedDataFile1.out"'/>
<arg line='-c "scripts/parse2Format.sh datafile2.out > formatedDataFile2.out"'/>
<arg line='-c "scripts/parse2Format.sh datafile1.out > formatedDataFile3.out"'/>



Response Number 11
Name: esu (by Raj)
Date: July 26, 2006 at 11:08:48 Pacific
Subject: need help on SED script
Reply:

nails,

How do I pass the input data file from the command line to the shell script above? What changes need to be made? Please suggest.

#!/bin/sh

for i in $*
do
gawk ' BEGIN { FS="/" }
{
gsub("[^0-9]", "", $1) # remove non-numbers
myf=$1+1
gsub("xxx|XXX", myf, $3)
gsub($1, myf, $3)
printf("//%s\n", $3)
} ' "$i"
done
# end script



Response Number 12
Name: nails
Date: July 26, 2006 at 14:55:55 Pacific
Subject: need help on SED script
Reply:

I don't understand your question. If the script above is called myscript.ss, call it with each data file to be processed:

myscript.ss datafile1.txt datafile2.txt

If you want the user to enter the file name, eliminate the for loop and add a prompt asking the user to enter the filename:

echo "enter the file name to process"
read i
# end stub

Obviously, you'd place this before your gawk script.



Response Number 13
Name: esu (by Raj)
Date: July 26, 2006 at 16:17:24 Pacific
Subject: need help on SED script
Reply:

I saved the following script in a file called parse2Format.sh and ran it with the following command:
./parse2Format.sh ClassCastAnnotations.txt

I got following error message:

-bash-3.00$ ./parse2Format.sh ClassCastAnnotations.txt
gawk: cmd. line:7: } ClassCastAnnotations.txt
gawk: cmd. line:7: ^ syntax error


This is what parse2Format.sh contains:

#!/bin/sh

for i in $*
do
gawk ' BEGIN { FS="/" }
{
gsub("[^0-9]", "", $1) # remove non-numbers
myf=$1+1
gsub("xxx|XXX", myf, $3)
gsub($1, myf, $3)
printf("//%s\n", $3)
} ' "$i"
done
# end script



Response Number 14
Name: nails
Date: July 26, 2006 at 17:06:35 Pacific
Subject: need help on SED script
Reply:

Does your awk script work if the filename is called directly without the loop?

I don't know what to tell you. This little stub proves that a loop substitutes the filename correctly:

for i in "$@"
do
awk ' { } END { print FILENAME } ' "$i"
done
# end stub.

It prints the name of each valid filename on the command line.
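
One thing worth checking for the syntax error reported in this thread: gawk quotes the filename as part of program line 7, which suggests the filename reached awk as program text rather than as a separate argument. The poster's actual file is not shown, so the following is only a guess at one way that can happen, namely a missing space between the closing single quote and "$i". The function names glued and separated are made up for this sketch.

```shell
# glued: hypothetical reproduction. There is no space between the closing
# single quote and "$1", so the shell joins both pieces into ONE argument
# and awk sees the filename as part of its program text.
glued() {
    awk ' { print $1 } '"$1" </dev/null
}

# separated: the intended form. The space makes the filename a second
# argument, which awk treats as the input file.
separated() {
    awk ' { print $1 } ' "$1"
}
```

Calling glued with a name such as data.out should fail with an awk syntax error, while separated reads the file normally. A stray or non-ASCII quote character in the script file could plausibly have a similar effect.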



Response Number 15
Name: esu (by Raj)
Date: July 27, 2006 at 08:46:14 Pacific
Subject: need help on SED script
Reply:

The answer to your question is "No": the awk script didn't work with the filename given directly, either with or without the loop.
I was getting a syntax error when I executed it:

-bash-3.00$ ./parse2Format.sh ClassCastAnnotations.txt
gawk: cmd. line:7: } ClassCastAnnotations.txt
gawk: cmd. line:7: ^ syntax error

nails, there's a slight change in the requirement, and this is tied to my job. Please help me with this final change.

28 CLASSCAST //event[tag = 'cast' and line = XXX]
29 CLASSCAST //event[tag= 'nullptr' and line = XXX]
41 CLASSCAST //event[tag = 'cast' and line = XXX]
47 CLASSCAST //event[tag = 'equal' and line = XXX]
48 CLASSCAST //event[tag = 'rm' and line = XXX]

Look at the input file above: the first column is a line number. If line numbers come in sequence (28 and 29 in the example above) and the tags are different (here 'cast' and 'nullptr'), then XXX on both lines should be replaced by 30, that is, the last line number in the sequence plus 1 (29+1=30).
There is another sequence, 47 and 48, whose tags are also different, so the last column should be replaced with 49 on both lines. The expected output is:

28 CLASSCAST //event[tag = 'cast' and line = 30]
29 CLASSCAST //event[tag= 'nullptr' and line = 30]
41 CLASSCAST //event[tag = 'cast' and line = 42]
47 CLASSCAST //event[tag = 'equal' and line = 49]
48 CLASSCAST //event[tag = 'rm' and line = 49]

Please help



Response Number 16
Name: nails
Date: July 27, 2006 at 09:01:23 Pacific
Subject: need help on SED script
Reply:

Regarding your first question. I don't know why you are getting a syntax problem. The script works correctly on my Red Hat 7.1 system and my Solaris 7 system.

Regarding your second question. I'll see what I can do tonight. What you are asking is not particularly easy.



Response Number 17
Name: esu (by Raj)
Date: July 27, 2006 at 10:41:40 Pacific
Subject: need help on SED script
Reply:

Thank you, sir. I really appreciate your time.



Response Number 18
Name: nails
Date: July 27, 2006 at 22:52:02 Pacific
Subject: need help on SED script
Reply:

Sorry, but your last requirement is extremely difficult. I'll see what I can do over the weekend, but I can't make any promises.




Response Number 19
Name: lchi2000g
Date: July 28, 2006 at 16:50:35 Pacific
Subject: need help on SED script
Reply:

nails spent too much time helping already. I would like to help. It's really difficult. There's no simple way to do it. The script 1.sh below is not pretty, but it works.

==> input file: file.txt

/home/oracle/luke/tmp$ cat file.txt
27 CLASSCAST //event[tag = 'cast' and line = XXX]
28 CLASSCAST //event[tag = 'cast' and line = XXX]
29 CLASSCAST //event[tag = 'nullptr' and line = XXX]
41 CLASSCAST //event[tag = 'cast' and line = XXX]
47 CLASSCAST //event[tag = 'equal' and line = XXX]
48 CLASSCAST //event[tag = 'rm' and line = XXX]

==> script: 1.sh

/home/oracle/luke/tmp$ cat 1.sh
#!/bin/ksh

TMPINFILE=tmpinfile.txt
OUTFILE=out.txt
TMPFILE=tmp.txt
INDEX=0

cp file.txt $TMPINFILE
echo -1 >> $TMPINFILE

> $OUTFILE

cat $TMPINFILE | while read LINE
do
INDEX=`expr $INDEX + 1`

if [ "$LINE" = -1 ]; then
if [ ! -z $TMPFILE ]; then
sed "s/$/ `expr $PREV_ID + 1`]/" $TMPFILE >>$OUTFILE
exit
fi
fi

if [ $INDEX -eq 1 ]; then
set $LINE
echo $LINE | cut -d " " -f1-8 > $TMPFILE
PREV_ID=$1
continue
else
set $LINE
ID=$1
TAG=$5
if [ $ID -eq `expr $PREV_ID + 1` ]; then
grep $TAG $TMPFILE 2> /dev/null
if [ $? -eq 0 ]; then
sed "s/$/ `expr $PREV_ID + 1`]/" $TMPFILE >>$OUTFILE
echo $LINE | cut -d " " -f1-8 > $TMPFILE
PREV_ID=$ID
continue
else
echo $LINE | cut -d " " -f1-8 >> $TMPFILE
PREV_ID=$ID
continue
fi
else
sed "s/$/ `expr $PREV_ID + 1`]/" $TMPFILE >>$OUTFILE
echo $LINE | cut -d " " -f1-8 > $TMPFILE
PREV_ID=$ID
continue
fi
fi
done

==> run it

/home/oracle/luke/tmp$ ./1.sh

==> output file: out.txt

/home/oracle/luke/tmp$ cat out.txt
27 CLASSCAST //event[tag = 'cast' and line = 28]
28 CLASSCAST //event[tag = 'cast' and line = 30]
29 CLASSCAST //event[tag = 'nullptr' and line = 30]
41 CLASSCAST //event[tag = 'cast' and line = 42]
47 CLASSCAST //event[tag = 'equal' and line = 49]
48 CLASSCAST //event[tag = 'rm' and line = 49]

Luke Chi



Response Number 20
Name: lchi2000g
Date: July 29, 2006 at 20:27:50 Pacific
Subject: need help on SED script
Reply:

A better version of 1.sh:

/home/oracle/luke/tmp$ cat 1.sh
#!/bin/ksh

OUTFILE=out.txt
TMPFILE=tmp.txt
INDEX=0

> $TMPFILE
> $OUTFILE

while read LINE
do
INDEX=`expr $INDEX + 1`

if [ $INDEX -eq 1 ]; then
set $LINE
echo $LINE | cut -d " " -f1-8 > $TMPFILE
PREV_ID=$1
continue
else
set $LINE
ID=$1
TAG=$5
if [ $ID -eq `expr $PREV_ID + 1` ]; then
grep $TAG $TMPFILE > /dev/null 2>&1
if [ $? -eq 0 ]; then
sed "s/$/ `expr $PREV_ID + 1`]/" $TMPFILE >>$OUTFILE
echo $LINE | cut -d " " -f1-8 > $TMPFILE
PREV_ID=$ID
continue
else
echo $LINE | cut -d " " -f1-8 >> $TMPFILE
PREV_ID=$ID
continue
fi
else
sed "s/$/ `expr $PREV_ID + 1`]/" $TMPFILE >>$OUTFILE
echo $LINE | cut -d " " -f1-8 > $TMPFILE
PREV_ID=$ID
continue
fi
fi
done < file.txt

# -s tests that the temp file exists and is non-empty
if [ -s "$TMPFILE" ]; then
sed "s/$/ `expr $PREV_ID + 1`]/" $TMPFILE >>$OUTFILE
fi


Luke Chi
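
The same grouping rule can also be sketched as a single awk pass with no temporary files. This is not the script posted above, just an alternative under stated assumptions: the names renumber and flush are invented here, the line number is the first word of each record, the tag is written as "tag = value" (with or without a space before the =), and every record ends in "line = ...]".

```shell
# renumber: hypothetical one-pass version of the grouping rule.
renumber() {
    awk '
    # Print the buffered group: every line gets (last line number + 1).
    function flush(    i, line) {
        for (i = 1; i <= n; i++) {
            line = buf[i]
            sub(/line = [^]]*\]$/, "line = " (prev + 1) "]", line)
            print line
        }
        n = 0
        split("", seen)                 # portable way to empty an array
    }
    {
        id = $1 + 0
        tag = ""
        if (match($0, /tag *= *[^ ]+/)) {
            tag = substr($0, RSTART, RLENGTH)
            sub(/tag *= */, "", tag)    # keep only the tag value
        }
        # Close the current group when the number is not consecutive
        # or the tag was already seen in this group.
        if (n > 0 && (id != prev + 1 || (tag in seen)))
            flush()
        buf[++n] = $0
        seen[tag] = 1
        prev = id
    }
    END { if (n > 0) flush() }
    ' "$@"
}

renumber <<'EOF'
28 CLASSCAST //event[tag = 'cast' and line = XXX]
29 CLASSCAST //event[tag= 'nullptr' and line = XXX]
41 CLASSCAST //event[tag = 'cast' and line = XXX]
EOF
```

Buffering a group until it is known to be complete is what lets every line in the group receive the number of the group's last line plus one.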



Response Number 21
Name: esu (by Raj)
Date: July 31, 2006 at 10:06:04 Pacific
Subject: need help on SED script
Reply:

I know nails spent a good amount of time on this.

I also want to thank you for the time you took to make this happen. The script does what it is supposed to. Thank you very much. I'm proud of your skills.

After running the script we get the following output:
27 CLASSCAST //event[tag = 'cast' and line = 28]
28 CLASSCAST //event[tag = 'cast' and line = 30]
29 CLASSCAST //event[tag = 'nullptr' and line = 30]
41 CLASSCAST //event[tag = 'cast' and line = 42]
47 CLASSCAST //event[tag = 'equal' and line = 49]
48 CLASSCAST //event[tag = 'rm' and line = 49]

The only change required in this output file is to chop every line up to the "//", so the resulting output file will look like this:

//event[tag = 'cast' and line = 28]
//event[tag = 'cast' and line = 30]
//event[tag = 'nullptr' and line = 30]
//event[tag = 'cast' and line = 42]
//event[tag = 'equal' and line = 49]
//event[tag = 'rm' and line = 49]



Response Number 22
Name: lchi2000g
Date: July 31, 2006 at 14:28:28 Pacific
Subject: need help on SED script
Reply:

cut -d" " -f3- input.txt

Luke Chi



Response Number 23
Name: xsuo
Date: August 7, 2006 at 11:22:52 Pacific
Subject: need help on SED script
Reply:

There is an easy way to do it with the sed command:

1. Your request #1:
cat <your_first_file> | sed -e 's/.*\(\/\/.*\)/\1/g'

2. Your request #2:
cat <your_second_file> | sed -e 's/\([0-9]*\) \(\/\/.* = \)xxx\]/\2\1]/g'
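
A variation on the two commands above, offered as a sketch: it reads the file directly instead of going through cat, uses | as the delimiter so the slashes need no escaping, and accepts XXX as well as xxx. The function names strip_prefix and fill_line are made up for illustration. sed has no arithmetic, so the later "line number + 1" requirement from this thread is not covered here.

```shell
# strip_prefix: delete everything before the last // on each line.
strip_prefix() {
    sed 's|.*\(//.*\)|\1|' "$@"
}

# fill_line: copy the leading number over xxx or XXX, then drop the prefix.
fill_line() {
    sed 's|^\([0-9][0-9]*\) .*\(//.* = \)[xX][xX][xX]\]$|\2\1]|' "$@"
}

printf 'Che is CLA and XP is //event[tag = cast and line = 29]\n' | strip_prefix
printf '28 //event[tag = cast and line = xxx]\n' | fill_line
```

Both functions pass their arguments straight to sed, so they can be given file names or used at the end of a pipe.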

