Sed or Awk help

Name: tsuhsen
Date: August 26, 2004 at 08:41:38 Pacific
OS: Linux
CPU/Ram: redhat
Comment:

I am new to Bourne shell scripting. There are two things I am trying to accomplish in my script.

1) I am assuming I need to use awk or sed to evaluate a file (xyz.xml) for the number of occurrences of <abc>.

2) Then I need to copy from the first occurrence of <abc> to the last occurrence of <abc>.

Can anyone help me with this?

Thanks



Response Number 1
Name: fpmurphy
Date: August 26, 2004 at 22:22:21 Pacific
Reply:

If you want to handle XML files, you are probably better off using a command line XML parser such as xsltproc.

(http://xmlsoft.org/XSLT/xsltproc.html)



Response Number 2
Name: Wolfbone
Date: August 27, 2004 at 09:15:38 Pacific
Reply:

Can applying a stylesheet do this? If not, you could use this ugly hack:

tac xyz.xml | sed -ne '/<abc>/,$p' | tac | sed -ne '/<abc>/,$p' | awk '/<abc>/ {l=$0; while (match(l,"<abc>")) {sub("<abc>","",l); c++}} {print > "xyz.xml.out"} END {print c}'

If you need to remove any stuff preceding/trailing the 1st/last <abc>, call in again.
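
The one-liner above can be read stage by stage. What follows is only a sketch: the same command split across lines with comments. The sample xyz.xml contents and the scratch directory are made up for illustration, since the real file is never shown in this thread, and the final grep -o line is an extra cross-check that is not part of the original command (it assumes a grep that supports -o, as GNU grep on Linux does).

```shell
# Work in a scratch directory so no real files are overwritten (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input; the real xyz.xml is not shown in the thread.
printf '%s\n' '<root>' '<abc>one</abc>' '<x/>' \
  '<abc>two</abc> <abc>three</abc>' '</root>' > xyz.xml

tac xyz.xml |                  # reverse the line order
  sed -ne '/<abc>/,$p' |       # keep from the last <abc> line (now first) to the end
  tac |                        # restore the original order
  sed -ne '/<abc>/,$p' |       # keep from the first <abc> line to the end
  awk '/<abc>/ { line = $0
                 # remove one <abc> at a time, counting each removal
                 while (match(line, "<abc>")) { sub("<abc>", "", line); c++ } }
       { print > "xyz.xml.out" }   # every remaining line goes to the output file
       END { print c }' > count.txt

count=$(cat count.txt)
echo "occurrences: $count"

# Cross-check (not from the original command): one match per output line.
check=$(grep -o '<abc>' xyz.xml | wc -l)
echo "grep -o count: $check"
```

With this sample, xyz.xml.out keeps the three lines from the first <abc> line through the last one, including the <x/> line in between.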



Response Number 3
Name: tsuhsen
Date: August 27, 2004 at 11:44:24 Pacific
Reply:

It looks like I have to learn more about stylesheets or some other solution. Thanks for all the help.



Response Number 4
Name: Wolfbone
Date: August 27, 2004 at 16:54:46 Pacific
Reply:

"It looks like I have to learn more about stylesheets or some other solution."

No it doesn't - It looks like you didn't even bother to try the solution I supplied. If you did try it and it didn't work then you should say so and explain exactly what happened.

It is not unusual for people unfamiliar with the shell and cli utilities to fail to specify their problem in a sufficiently comprehensible way for any solution offered to immediately work as required.



Response Number 5
Name: tsuhsen
Date: August 30, 2004 at 06:58:04 Pacific
Reply:

Wolfbone... If I offended you, that wasn't my intention. I did try the solution and received a message of:

$ tac xyz.xml | sed -ne '/<abc>/,$p' | tac | sed -ne '/<abc>/,$p' | awk '/<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out") END {print c}'

The response that I received back was.....
awk: cmd. line:1: /<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out") END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:1: /<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out") END {print c}
awk: cmd. line:1: ^ parse error

I really didn't know where to go with this, so I was going to see if I could solve this with Java. However, if you can help me with this, I would appreciate it.



Response Number 6
Name: Wolfbone
Date: August 30, 2004 at 10:03:54 Pacific
Reply:

Well I wasn't offended - it's just very annoying when people don't respond at all: Try sticking a ';' before the 'END'.



Response Number 7
Name: tsuhsen
Date: August 30, 2004 at 11:37:01 Pacific
Reply:

I am getting more parsing errors. Here is the response I received.

awk: cmd. line:1: /<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out"); END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:1: /<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out"); END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:1: /<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out"); END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:2: (END OF FILE)
awk: cmd. line:2: parse error



Response Number 8
Name: Wolfbone
Date: August 30, 2004 at 12:32:51 Pacific
Reply:

I didn't notice before but you have got '1=$0' instead of 'l=$0' and have substituted the number 'one' for the letter 'ell' like this elsewhere in the command too.

I have trouble seeing the difference in my browser sometimes too but usually this is easily avoided by using cut and paste rather than reading and copying the code fragments etc.

Next time maybe I'll use 'line' instead of 'l' but for now I hope it's clear what you need to do :-)



Response Number 9
Name: tsuhsen
Date: August 30, 2004 at 13:33:00 Pacific
Reply:

Thanks for the correction. I made all those changes and still received some parsing errors. I then replaced all "l" with line and this is what I ran.

tac xyz.xml | sed -ne '/<abc>/,$p' | tac | sed -ne '/<abc>/,$p' | awk '/<abc>/ {line=$0; while (match(line,"<abc>")) {sub("<abc>","",line); c++}} {print > "xyz.out"); END {print c}'

I am getting the following error still. It looks like it's having problems with the END command.

awk: cmd. line:1: /<abc>/ {line=$0; while (match(line,"<abc>")) {sub("<abc>","",line); c++}} {print > "xyz.out"); END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:1: /<abc>/ {line=$0; while (match(line,"<abc>")) {sub("<abc>","",line); c++}} {print > "xyz.out"); END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:2: (END OF FILE)
awk: cmd. line:2: parse error



Response Number 10
Name: Wolfbone
Date: August 30, 2004 at 14:46:02 Pacific
Reply:

Somehow you have managed to put a ')' where a '}' should be after the '"xyz.out"'



Response Number 11
Name: Wolfbone
Date: August 30, 2004 at 18:34:35 Pacific
Reply:

Tip: Use a text editor capable of coloured syntax highlighting such as vim to store code in temporarily before you try it out. I pasted your code into the terminal after running 'vim temp.awk' and the faulty ')' was highlighted in red.

Of course I didn't think to do this myself before discovering the problem the hard way ;-) but it's a lot quicker and cleaner.
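
Taking the tip one step further (this goes beyond what was suggested above, so treat it as a sketch): once the awk program is saved in a file such as temp.awk, it can be run with awk -f instead of being pasted back into a quoted one-liner. That keeps the shell quoting out of the way and puts each brace on its own line, where a stray ')' is easier to spot. The sample xyz.xml contents and the scratch directory below are assumptions for illustration only.

```shell
# Scratch directory so nothing real is overwritten (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The awk program from the thread, saved to a file and laid out one step per line.
cat > temp.awk <<'EOF'
# Count every <abc> on lines that contain at least one.
/<abc>/ {
    line = $0
    while (match(line, "<abc>")) {
        sub("<abc>", "", line)
        c++
    }
}
# Copy every input line to the output file.
{ print > "xyz.xml.out" }
# Report the total once all input has been read.
END { print c }
EOF

# Hypothetical sample input; the real file is not shown in the thread.
printf '%s\n' '<doc>' '<abc>a</abc>' '<abc>b</abc>' '</doc>' > xyz.xml

total=$(tac xyz.xml | sed -ne '/<abc>/,$p' | tac | sed -ne '/<abc>/,$p' | awk -f temp.awk)
echo "occurrences: $total"
```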



Response Number 12
Name: tsuhsen
Date: August 31, 2004 at 08:13:33 Pacific
Reply:

Sorry about that. It's a lot easier to find these mistakes with color output. I made the changes and xyz.xml.out was created. Thanks for the help.

I have one more problem that I hope you can help with. This strips out all lines that don't have <abc>, but my XML output is a file containing one long XML string. Is there any way to strip out everything in a line that is not part of the <abc> elements? For example, the sample line below has values before the first <abc> and after the last one. Can I strip out the <123> and <454> before the first <abc>, and strip out <get>, </get>, <old>, and </old> after the last <abc> element?

<123> <454> <abc> </abc> <abc> </abc> <get> </get> <old> </old>

Thanks for everything.



Response Number 13
Name: Wolfbone
Date: August 31, 2004 at 11:15:13 Pacific
Reply:

sed 's/\(.*<\/abc>\).*/\1/' | awk '{$0=substr($0,match($0,"<abc>")); print}'
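
As a quick check, here is the pipeline above applied to the sample line from the previous post. The variable names sample and trimmed are only for illustration. The sed stage relies on the greedy .* to keep everything up to the last </abc>, and the awk stage then drops everything before the first <abc>.

```shell
# The sample line quoted in the question.
sample='<123> <454> <abc> </abc> <abc> </abc> <get> </get> <old> </old>'

trimmed=$(printf '%s\n' "$sample" |
  sed 's/\(.*<\/abc>\).*/\1/' |
  awk '{$0=substr($0,match($0,"<abc>")); print}')

echo "$trimmed"   # prints: <abc> </abc> <abc> </abc>
```

To run it on the real file, feed xyz.xml to the sed stage instead of the sample variable, for example: sed 's/\(.*<\/abc>\).*/\1/' xyz.xml | awk '...'. Note that sed and awk work line by line, so this trims each line separately.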


