Solved: Can't delete unwanted text from HTML file

January 11, 2013 at 03:15:06
Specs: Windows Vista, 1.6 gb 4gb
I seem to collect quite a number of HTML files which, for one reason or another, have additional text at the end of the file, after the closing </body></html> tags. Currently, to remove it, I have to load each file in Notepad and delete the unwanted text by hand, because the HTML tag-removing software I have found just ignores it, as you would expect. Consequently, I am looking for a VBScript script that will loop through a directory of such HTML files and remove anything that comes after the </body></html> tags in each one.

I have drawn a blank on anything like this having been discussed so far, so I would be grateful for any help. I suspect any script might involve the use of regex, a somewhat daunting prospect for me.

Thank you


#1
January 11, 2013 at 12:43:29
I'll recycle some code from a few posts ago:
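'===== begin vbscript
set fso=createobject("scripting.filesystemobject")
set folder=fso.getfolder(".")                  ' the folder the script is run from
for each file in folder.files
  n=file.shortname                             ' 8.3 short name of the file
  p=instrrev(n,".")
  if p>0 then x=lcase(mid(n,p+1)) else x=""    ' x = lower-case extension, "" if none
  if left(x,3)="htm" and file.size>0 then      ' skip empty files (readall fails on them)
    c=fso.opentextfile(n,1).readall            ' 1 = ForReading
    p=instr(lcase(c),"</html>")                ' first closing tag, case-insensitive
    if p>0 then
      c=left(c,p+6)
      fso.createtextfile(n).write c
    end if
  end if
next
'===== end vbscript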
Be sure to use protection: this is live ammo. It WILL alter files, and it may very well render them totally f.u.b.a.r.
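
One possible way to take that precaution, sketched here as a separate pass to run in the same folder before the script above: copy every HTML file to a backup first. The ".bak" suffix is an assumption chosen for illustration and is not part of the original script.

```vbscript
' Sketch: back up every .htm/.html file in the current folder before editing.
' The ".bak" suffix is an arbitrary choice for this example.
set fso=createobject("scripting.filesystemobject")
for each file in fso.getfolder(".").files
  n=file.name
  p=instrrev(n,".")
  if p>0 then x=lcase(mid(n,p+1)) else x=""
  if left(x,3)="htm" then
    fso.copyfile file.path, file.path & ".bak", true   ' true = overwrite an older backup
  end if
next
```

If a file does get mangled, copying its .bak file back over it should restore the original.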

#2
January 15, 2013 at 11:38:48
nbrane,

Many thanks for your script; it did the business just the way I wanted it! I have heeded your caution, but so far no disasters!

Many grateful thanks

PS If you can spare the time, could you explain what lines 11 and 12 involve?


#3
January 16, 2013 at 17:17:21
✔ Best Answer
these?
'this is what hacks off everything after the </html> tag (p is where "</html>" starts and the tag is 7 characters long, so p+6 is the position of its final ">")
c=left(c,p+6)
'this writes the html content out to file (overwriting the original, instead of using a temp file then renaming it)
fso.createtextfile(n).write c
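
To see the p+6 arithmetic on something small, here is a worked example using a made-up string (the sample text is hypothetical, not taken from any real file). It can be saved as a .vbs file and run with cscript:

```vbscript
' Worked example of the offset, using a made-up string.
c = "<html><body>hi</body></html>junk"
p = InStr(LCase(c), "</html>")   ' position of the "<" in "</html>"
WScript.Echo p                   ' prints 22
WScript.Echo Left(c, p + 6)      ' prints <html><body>hi</body></html>
```

The tag is 7 characters long and starts at position 22, so it ends at position 28, which is p+6. Left keeps those first 28 characters and drops the trailing "junk".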

Also learned this from some code Razor posted, to get a file's extension:
x=fso.getextensionname(file)
reducing my script's footprint by a net two lines of code.
(Why in hell didn't MS put this under the file properties? That's why I could never find it when I looked. I would have assumed: x=file.extension)
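
For reference, a minimal sketch of how the loop might look with GetExtensionName swapped in. This is one possible arrangement rather than the script as originally posted: the explicit file closes and the empty-file check are additions, and the variable ts is introduced here for illustration.

```vbscript
' Sketch: same idea as the original loop, but using GetExtensionName.
set fso=createobject("scripting.filesystemobject")
for each file in fso.getfolder(".").files
  x=lcase(fso.getextensionname(file.name))      ' "html" for "page.html", "" if none
  if (x="htm" or x="html") and file.size>0 then ' readall fails on an empty file
    set ts=fso.opentextfile(file.path,1)        ' 1 = ForReading
    c=ts.readall
    ts.close
    p=instr(lcase(c),"</html>")
    if p>0 then
      set ts=fso.createtextfile(file.path,true) ' true = overwrite the original
      ts.write left(c,p+6)                      ' keep up to and including "</html>"
      ts.close
    end if
  end if
next
```

Like the original, this overwrites files in place and reads them as plain ANSI text, so the same warning about working on copies applies.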


#4
January 17, 2013 at 03:39:03
nbrane

Thank you for taking the time to reply. This helps enormously as I prefer to learn and understand rather than just slavishly copy someone else's script.

More power to your elbow!

