Urgent Python help PLEASE

January 26, 2010 at 09:17:28
Specs: Windows XP

I am trying to parse an XML file using Python
but without using any parsing modules. The XML
file is not written in proper format and has
multiple elements of the same name, though
they do not have the same properties. I'm
parsing the file almost like a text file. I
read the entire file, look for certain
values, and append the lines that have those
values in them. The problem is, when the
script finds a newline in one of those lines, the
script fails. I've looked at re.DOTALL but don't
know how to use it. If you want to see my
script I can post it here (I don't know if it's
allowed or not). I really need some help with
this. I would appreciate everyone's help.

Thanks in advance,
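
Since the post asks how re.DOTALL is used: it is a flag passed to the re functions, not a function to call. With it, "." also matches a newline, so a pattern can span line breaks. Below is a minimal sketch, assuming the whole file has been read into one string; the sample string and the pattern are made up for illustration, and only the names Return1 and Return2 come from the thread.

```python
import re

# Hypothetical sample: the first element is split across two lines.
sample = "<Return1 id='a'>first\nvalue</Return1>\n<Return2>second</Return2>\n"

# Match <Return1 ...>text</Return1> or <Return2 ...>text</Return2>.
# \1 refers back to whichever tag name the first group matched.
pattern = r"<(Return1|Return2)\b[^>]*>(.*?)</\1>"

# Without the flag, "." stops at a newline, so the split element is missed.
without_flag = re.findall(pattern, sample)

# With re.DOTALL, "." matches newlines too, so both elements are found.
with_flag = re.findall(pattern, sample, re.DOTALL)

print(without_flag)
print(with_flag)
```

This only works on a single string holding the whole file (for example from read()), not on the separate lines returned by readlines().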


January 26, 2010 at 12:41:48
Yeah, it's allowed. You'll probably have to post your script in order to get any effective assistance.
(I don't know Python, but others on here do... or you might consider VBScript or a batch script as an alternative.)


January 27, 2010 at 06:33:03
My Python isn't too good, so I'll try to keep this simple.

You can easily take the linebreaks out of play: read the whole file (provided it isn't HUGE), kill off the linebreaks, do your parsing, add a linebreak back in after every ">", then write the result to a file.

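# "infile" rather than "file", which shadows a built-in name in Python 2
infile = open("yourfile.xml", "r")
content = infile.read()
infile.close()
content = content.replace('\n', '')
print(content)

# your stuff goes here

content = content.replace('>', ">\n")
print(content)

# could be yourfile.xml but make
# sure you have a backup!

newfile = open("newfile.xml", "w")
newfile.write(content)
newfile.close()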
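
To see what the two replace() calls above do without touching a file, here is a small sketch of the same idea on an in-memory string. The function name and the sample text are hypothetical, and stripping '\r' as well is an added assumption for files that carry Windows line endings.

```python
def flatten_then_split(text):
    # Drop carriage returns and newlines so no element is broken across lines.
    flat = text.replace('\r', '').replace('\n', '')
    # Put a newline back after every ">" as in the post above.
    return flat.replace('>', '>\n')


# Hypothetical sample: the first element's text contains a newline.
sample = "<Return1>first\nvalue</Return1>\n<Return2>second</Return2>\n"
result = flatten_then_split(sample)
print(result)
```

Note that deleting the newline joins "first" and "value" into one word; replacing '\n' with a space instead would keep them apart, if that matters for the data.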


January 27, 2010 at 07:42:53
Here's the script:

f = open("Myfile.xml", "r")

line = f.readlines()

xmllist = ['Return1', 'Return2']
linematch = []

#Appending lines that matched with values from 'xmllist'
for l in line:
    for x in xmllist:
        if x in l:
            linematch.append(l)

#Stripping the first appended line linematch
trim = linematch[0]
trim = trim.split('<')

#Stripping the second appended line from linematch
FirstValue = linematch[1].split('>')[1].split('<')[0]

print(trim)
print(FirstValue)

The problem is that when I print trim, I see that the script
doesn't read the rest of the element when it finds a newline. I have
tried the script with a file that doesn't have newlines and it
works well. I don't know how to get it to work with '\n'. Any
help would be greatly appreciated.

Judago, I will definitely give your suggestion a try. Judago and
nbrane, thank you both for your time.
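
One possible reading of the problem: readlines() returns one list entry per line, so an element that spans a newline is cut into separate entries and only the first part is matched. Below is a rough sketch that works on the whole file as one string and uses plain str.find, with no parsing module and no regular expressions. The function name and sample data are hypothetical, and it assumes each element has a matching closing tag.

```python
def element_texts(content, names):
    """Return (name, text) pairs for each <name ...>text</name> in content."""
    found = []
    for name in names:
        open_tag = '<' + name
        close_tag = '</' + name + '>'
        pos = content.find(open_tag)
        while pos != -1:
            start = content.find('>', pos) + 1    # just past the opening tag
            end = content.find(close_tag, start)  # start of the closing tag
            if start == 0 or end == -1:
                break                             # malformed element, give up
            found.append((name, content[start:end]))
            pos = content.find(open_tag, end)
    return found


# Hypothetical sample; in the real script this would be f.read().
sample = "<Return1 id='a'>first\nvalue</Return1>\n<Return2>second</Return2>\n"
matches = element_texts(sample, ['Return1', 'Return2'])
print(matches)
```

A known limitation of this sketch: searching for '<Return1' would also hit a tag named Return10, so the names need to be distinct prefixes.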


January 27, 2010 at 15:37:55
What exactly are you trying to do?

I can see you are trying to strip out tag data; is that all you're after? Do you want to replace it and rebuild the file?

I probably should have asked before, but you have me a little lost.

Can you show a sample of the file? It would make things so much easier.
