Solved: How to read an HTML file and extract the data from it?

September 9, 2015 at 06:28:15
Specs: Windows 7
I want to read an HTML file and then extract the data from specific tags, for example the values 8 0 0 0 from the snippet below.


September 9, 2015 at 07:17:26
This should work, assuming the tag is called "td" and the values to be extracted are only one character long.

Also, if the code structure is as follows:


The script will not work. It has to be on one line, as you stated in your example.

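@echo off
setlocal enabledelayedexpansion
REM Read the file line by line, skipping the first line
for /f "skip=1 tokens=*" %%A in (yourFile.html) do (
	set "raw=%%A"
	REM Take everything before the first ">" and drop the leading "<"
	for /f "tokens=1 delims=>" %%B in ("!raw!") do (
		set "tag=%%B"
		set "tag=!tag:~1!"
	)
	if "!tag!"=="td" (
		REM The value is the single character right after "<td>"
		set "value=!raw:~4,1!"
		REM Assumed: the original post is cut off here, so printing the value is a guess
		echo !value!
	)
)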

99 little bugs in the code,
99 little bugs.
Take one down, patch it around,
129 little bugs in the code.

message edited by RainBawZ


September 9, 2015 at 10:52:47
Thanks for your quick reply, but at runtime the number between the <td> and </td> tags can vary from one character to several (3, 4, 5, etc.), and more than one pair of tags can appear on a line, e.g.:

1) A single line can contain <td>4</td><td>126</td><td>10</td><td>34</td>

message edited by intudvg


September 10, 2015 at 10:32:49
✔ Best Answer
This might work as a crude HTML extraction tool, but HTML and XML are notorious for tanking batch scripts because of the ">" and "<" tag delimiters.
:-------- begin batchscript HTMEX.BAT
@echo off & setlocal enabledelayedexpansion
set inp=%1
if not defined inp (
    echo USAGE: %0 inputfile [outputfile]
    echo outputfile defaults to 'RESULT' if not provided
    goto :eof
)
set outp=%2
if not defined outp set outp=result
:: might want to put file-destruct safeguard here, depending
del %outp% 2>nul
for /f "tokens=*" %%a in (%1) do (
    set xx=%%a
    REM turn every "<" into "," so the line splits into one item per tag
    set xx=!xx:^<=,!
    for %%b in (!xx!) do (
        REM %%c is the tag label, %%d is the text that follows the ">"
        for /f "tokens=1,2 delims=>" %%c in ("%%b") do (
            REM debugging display: echo c: %%c d: %%d
            if /i "%%c" equ "td" if "%%d" neq "" >> %outp% echo %%d
        )
    )
)
echo output was stored in %outp%
::======= end batch
This suffers from the same limitation RainBawZ mentions: tags cannot be split across lines of text (although it could be tweaked to handle that if the situation called for it).
It does have the potential to extract other tags, since it gathers both the tag's label and the tag's value for each tag element. But I would recommend using a better parsing language for critical applications.
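
For anyone who does go the "better parsing language" route, here is a minimal sketch in Python using the standard library's html.parser module. This is not the batch approach above, just one possible alternative. The names CellExtractor and extract_td_values are made up for illustration, and it assumes the values you want are the plain text inside <td> elements. Because the parser works on the whole document rather than line by line, a tag split over several lines is not a problem here.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collect the text found inside every <td>...</td> element."""

    def __init__(self):
        super().__init__()
        self.in_td = False  # True while we are between <td> and </td>
        self.parts = []     # text fragments of the cell being read
        self.cells = []     # finished cell values, in document order

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TD> also arrives as "td"
        if tag == "td":
            self.in_td = True
            self.parts = []

    def handle_data(self, data):
        if self.in_td:
            self.parts.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self.in_td:
            self.in_td = False
            self.cells.append("".join(self.parts).strip())


def extract_td_values(html_text):
    """Return the stripped text of every <td> cell in html_text."""
    parser = CellExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.cells


# Sample line taken from the question above
sample = "<td>4</td><td>126</td><td>10</td><td>34</td>"
print(extract_td_values(sample))  # prints ['4', '126', '10', '34']
```

To run it against a real file, read the file into a string first (for example with open("yourFile.html").read()) and pass that string to extract_td_values.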

message edited by nbrane


September 10, 2015 at 23:47:22
Thank you guys for your support. My script is running smoothly without any problems. Once again, thank you.
