Solved Can't Delete unwanted txt from HTML file

January 11, 2013 at 03:15:06
Specs: Windows Vista, 1.6 gb 4gb
I seem to collect quite a number of HTML files which, for one reason or another, have additional text at the end of the file, after the closing </body></html> tags. Currently, to remove it, I have to load each file in Notepad and delete the unwanted text by hand, as I have found that HTML tag-removing software simply ignores it, as you would expect. Consequently, I am looking for a VBScript that will loop through a directory of such HTML files and remove anything added after the </body></html> tags in each one.

I have drawn a blank searching for earlier discussions of anything like this, so I would be grateful for any help. I suspect any script might involve the use of regex, a somewhat daunting prospect for me.

Thank you


January 11, 2013 at 12:43:29
I'll recycle some code from a few posts ago:
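'===== begin vbscript
set fso=createobject("scripting.filesystemobject")
set folder=fso.getfolder(".")
for each file in folder.files
n=file.name 'file name, relative to the current folder
p=instrrev(n,".") 'position of the last dot, 0 if there is none
if p>0 then x=lcase(mid(n,p+1)) else x="" 'x = lower-cased extension
if left(x,3)="htm" then 'matches both .htm and .html
c=fso.opentextfile(n).readall 'read the whole file into c
p=instr(1,c,"</html>",1) 'case-insensitive search for the closing tag
if p>0 then
c=left(c,p+6) 'keep everything up to and including </html>
fso.createtextfile(n).write c 'overwrite the original file
end if
end if
next
'===== end vbscript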
Be sure to use protection (back up the folder first). This is live ammo: it WILL alter files, and it may very well render them totally f.u.b.a.r.
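
In the spirit of that warning, here is a rough sketch of a more cautious variant that copies each file before overwriting it. The ".bak" suffix, the variable names, and the check that skips empty or already-clean files are my own assumptions, not part of the script above. Like the original, it cuts at the first </html> it finds and reads files as plain ANSI text.

```vbscript
' Sketch only: same truncation idea, but with a backup copy made first.
' The ".bak" suffix is an assumption for illustration.
Option Explicit
Dim fso, folder, file, path, content, pos
Set fso = CreateObject("Scripting.FileSystemObject")
Set folder = fso.GetFolder(".")
For Each file In folder.Files
    path = file.Path
    ' Skip non-HTML files and empty files (ReadAll fails on an empty file)
    If Left(LCase(fso.GetExtensionName(path)), 3) = "htm" And file.Size > 0 Then
        content = fso.OpenTextFile(path).ReadAll
        pos = InStr(1, content, "</html>", vbTextCompare)
        ' Rewrite only when something actually follows the closing tag
        If pos > 0 And pos + 6 < Len(content) Then
            fso.CopyFile path, path & ".bak", True
            fso.CreateTextFile(path, True).Write Left(content, pos + 6)
        End If
    End If
Next
```

Run it from the folder holding the HTML files, for example with cscript. Files that are already clean are left untouched, so running it a second time should not create new backups.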


January 15, 2013 at 11:38:48

Many thanks for your script; it did the business just the way I wanted it! I have heeded your caution, but so far no disasters!

Many grateful thanks

PS If you can spare the time, could you explain what lines 11 and 12 involve?


January 16, 2013 at 17:17:21
✔ Best Answer
'this is what hacks off everything after the </html> tag (p is where "</html>" starts and the tag is 7 characters long, which accounts for the 6-byte offset from var. p)
c=left(c,p+6)
'this writes the html content out to file (overwriting the original, instead of using a temp file then renaming it)
fso.createtextfile(n).write c
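
To make the offset concrete, here is a small sketch on a made-up string (the sample markup and the trailing junk are assumptions for illustration). "</html>" is 7 characters long, so if p is where its "<" sits, its ">" sits at p+6.

```vbscript
' Hypothetical sample: a complete page followed by unwanted trailing text
c = "<html><body>hi</body></html>junk added later"
p = InStr(1, c, "</html>", 1)     ' final 1 = case-insensitive (vbTextCompare)
WScript.Echo p                    ' 22, the position of the "<" in "</html>"
WScript.Echo Len("</html>") - 1   ' 6, the offset from p to the closing ">"
WScript.Echo Left(c, p + 6)       ' <html><body>hi</body></html>
```

Running this with cscript prints the three values to the console; under wscript each one pops up in its own dialog.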

Also learned this from some code Razor posted, to get file's extension:
x=lcase(fso.getextensionname(n))
reducing my script's footprint by net two lines of code.
(why in hell didn't MS put this under file properties? That's why I could never find it when I looked. I would have assumed: x=file.extension)
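
For anyone wanting to try it, a minimal sketch, assuming the method meant here is the FileSystemObject's GetExtensionName. The file names are made up; the method only parses the string it is given, so the files need not exist.

```vbscript
Set fso = CreateObject("Scripting.FileSystemObject")
WScript.Echo fso.GetExtensionName("page.HTML")       ' HTML (case is preserved, hence the lcase)
WScript.Echo fso.GetExtensionName("archive.tar.gz")  ' gz (only the text after the last dot)
WScript.Echo fso.GetExtensionName("README")          ' empty string, there is no extension
```

Because the result keeps the original case, wrapping it in lcase before comparing against "htm" still matters.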


January 17, 2013 at 03:39:03

Thank you for taking the time to reply. This helps enormously as I prefer to learn and understand rather than just slavishly copy someone else's script.

More power to your elbow!
