I am a new user of bourne shell scripting. There are two things that I am trying to accomplish within my script.
1) I am assuming I need to use awk or sed to evaluate a file (xyz.xml) for the number of occurrences of <abc>.
2) Then I need to copy from the first occurrence of <abc> to the last occurrence of <abc>.
Can anyone help me with this?
Thanks

If you want to handle XML files, you are probably better off using a command line XML parser such as xsltproc.
(http://xmlsoft.org/XSLT/xsltproc.html)

Can applying a stylesheet do this? If not, you could use this ugly hack:
tac xyz.xml | sed -ne '/<abc>/,$p' | tac | sed -ne '/<abc>/,$p' | awk '/<abc>/ {l=$0; while (match(l,"<abc>")) {sub("<abc>","",l); c++}} {print > "xyz.xml.out"} END {print c}'
If you need to remove any stuff preceding/trailing the 1st/last <abc>, call in again.
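For readability, the same pipeline can be split over several lines. This is only a sketch: the function name extract_abc and its two arguments are made up for illustration, and it assumes GNU tac is available.

```shell
# Hypothetical helper: print the number of <abc> tags and copy the lines
# from the first line containing <abc> to the last one into "$2".
extract_abc() {
    src=$1
    dst=$2
    tac "$src" |                # reverse the file line by line
      sed -n '/<abc>/,$p' |     # drop the lines after the last <abc> line
      tac |                     # restore the original order
      sed -n '/<abc>/,$p' |     # drop the lines before the first <abc> line
      awk -v out="$dst" '
        # sub() returns 1 for each <abc> it removes from the copy of the line
        /<abc>/ { line = $0; while (sub("<abc>", "", line)) c++ }
        { print > out }         # copy every remaining line to the output file
        END { print c + 0 }     # print 0 rather than an empty line if none found
      '
}
```

Usage would be something like: extract_abc xyz.xml xyz.xml.out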

"It looks like I have to learn more about stylesheets or some other solution."
No it doesn't - It looks like you didn't even bother to try the solution I supplied. If you did try it and it didn't work then you should say so and explain exactly what happened.
People unfamiliar with the shell and CLI utilities often fail to specify their problem clearly enough for an offered solution to work straight away.

Wlfbone... if I offended you, that wasn't my intention. I did try the solution. This is what I ran:
$ tac xyz.xml | sed -ne '/<abc>/,$p' | tac | sed -ne '/<abc>/,$p' | awk '/<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out") END {print c}'
The response that I received back was.....
awk: cmd. line:1: /<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out") END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:1: /<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out") END {print c}
awk: cmd. line:1: ^ parse error
I really didn't know where to go with this, so I was going to see if I could solve this with Java. However, if you can help me with this, I would appreciate it.

Well, I wasn't offended; it's just very annoying when people don't respond at all. Try sticking a ';' before the 'END'.

I am getting more parsing errors. Here is the response I received.
awk: cmd. line:1: /<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out"); END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:1: /<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out"); END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:1: /<abc>/ {1=$0; while (match(1,"<abc>")) {sub("<abc>","",1); c++}} {print > "xyz.xml.out"); END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:2: (END OF FILE)
awk: cmd. line:2: parse error

I didn't notice before but you have got '1=$0' instead of 'l=$0' and have substituted the number 'one' for the letter 'ell' like this elsewhere in the command too.
I have trouble seeing the difference in my browser sometimes too but usually this is easily avoided by using cut and paste rather than reading and copying the code fragments etc.
Next time maybe I'll use 'line' instead of 'l' but for now I hope it's clear what you need to do :-)

Thanks for the correction. I made all those changes and still received some parsing errors. I then replaced all "l" with line and this is what I ran.
tac xyz.xml | sed -ne '/<abc>/,$p' | tac | sed -ne '/<abc>/,$p' | awk '/<abc>/ {line=$0; while (match(line,"<abc>")) {sub("<abc>","",line); c++}} {print > "xyz.out"); END {print c}'
I am getting the following error still. It looks like it's having problems with the END command.
awk: cmd. line:1: /<abc>/ {line=$0; while (match(line,"<abc>")) {sub("<abc>","",line); c++}} {print > "xyz.out"); END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:1: /<abc>/ {line=$0; while (match(line,"<abc>")) {sub("<abc>","",line); c++}} {print > "xyz.out"); END {print c}
awk: cmd. line:1: ^ parse error
awk: cmd. line:2: (END OF FILE)
awk: cmd. line:2: parse error

Tip: Use a text editor capable of coloured syntax highlighting such as vim to store code in temporarily before you try it out. I pasted your code into the terminal after running 'vim temp.awk' and the faulty ')' was highlighted in red.
Of course I didn't think to do this myself before discovering the problem the hard way ;-) but it's a lot quicker and cleaner.
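As a sketch of that workflow, the awk part can be kept in a file and run with awk -f, which also makes the braces easier to check in an editor. The file name count_abc.awk and the variable name out are assumptions for illustration, not something from the earlier posts.

```shell
# Hypothetical location for the awk program; any writable path works.
prog=${TMPDIR:-/tmp}/count_abc.awk

cat > "$prog" <<'EOF'
# Count every <abc> on the line, using a scratch copy of the line.
/<abc>/ { line = $0; while (sub("<abc>", "", line)) c++ }
# Copy each input line to the file named by the variable "out".
# Note that the block closes with "}", not ")".
{ print > out }
END { print c + 0 }
EOF

# Example run on one line of input; the copied line goes to xyz.out
# in a scratch directory and the count goes to standard output.
printf '%s\n' '<abc> </abc> <abc> </abc>' |
  awk -v out="${TMPDIR:-/tmp}/xyz.out" -f "$prog"
# prints: 2
```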

Sorry about that. It's a lot easier to find these mistakes with color output. I made the changes and xyz.xml.out was created. Thanks for the help.
I have one more problem that I hope you can help with. This strips out all lines that don't have <abc>, but my XML output is in a file that has one long XML string. Is there any way to strip out everything that is not <abc> within a line? For example, the sample line below has values before the first <abc> and after the last <abc>. Can I strip out the <123> and <454> before the first <abc>, and strip out <get>, </get>, <old>, and </old> after the last <abc>?
<123> <454> <abc> </abc> <abc> </abc> <get> </get> <old> </old>
Thanks for everything.
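
One possible approach for the single-line case, as a minimal sketch rather than a tested solution for the real file: awk's match() finds the leftmost, longest match, so the pattern <abc>.*</abc> starts at the first <abc> and runs to the last </abc> on the line. The function name trim_to_abc is made up for illustration, and lines containing no <abc>...</abc> pair are skipped.

```shell
# Hypothetical helper: for each input line, keep only the text from the
# first <abc> to the last </abc>; lines without such a span are dropped.
trim_to_abc() {
    awk 'match($0, /<abc>.*<\/abc>/) { print substr($0, RSTART, RLENGTH) }' "$@"
}

printf '%s\n' '<123> <454> <abc> </abc> <abc> </abc> <get> </get> <old> </old>' |
  trim_to_abc
# prints: <abc> </abc> <abc> </abc>
```

It reads standard input or file arguments, so trim_to_abc xyz.xml > xyz.xml.out should also work. This is plain text matching, not XML parsing, so it will not cope with <abc> tags that carry attributes.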
