Hi All,
Can someone suggest a way to parse an HTML file for specific text using Perl, or any other programming language?
I have a file, abc.html, which is the output of an automated test suite. It contains the status of every test in the suite. I want to extract each test name and its status (pass/fail) from the HTML file and write them to a text file called xyz.txt. Each line of xyz.txt should contain the test name, then the status, then the description of the test failure, separated by spaces. The expected format is shown below under xyz.txt.
==================================
abc.html file looks like this:
==================================
<html>
<body>
<table class="details" border="0" cellpadding="5" cellspacing="2" width="95%">
<tr valign="top">
<th width="80%">Name</th><th>Tests</th><th>Errors</th><th>Failures</th><th nowrap="nowrap">Time(s)</th>
</tr>
<tr valign="top" class="Error">
<td>TestSuite</td><td>34</td><td>1</td><td>1</td><td>321.625</td>
</tr>
</table>
<h2>Tests</h2>
<table class="details" border="0" cellpadding="5" cellspacing="2" width="95%">
<tr valign="top">
<th>Name</th><th>Status</th><th width="80%">Type</th><th nowrap="nowrap">Time(s)</th>
</tr>
<tr valign="top" class="TableRowColor">
<td>testLoginInitialLoad</td><td>Success</td><td></td><td>7.938</td>
</tr>
<tr valign="top" class="TableRowColor">
<td>testFailedSignOn</td><td>Success</td><td></td><td>7.156</td>
</tr>
<tr valign="top" class="TableRowColor">
<td>testLoginSignOn</td><td>Success</td><td></td><td>8.078</td>
</tr>
<td>testComponents</td><td>Failure</td><td>null
<code>junit.framework.AssertionFailedError: null at tests.Components.Components(ProductComponents.java:44)</code></td><td>7.625</td>
</tr>
<tr valign="top" class="TableRowColor">
<td>testProductColumnSort</td><td>Success</td><td></td><td>20.484</td>
</tr>
<tr valign="top" class="Error">
<td>testCompare</td><td>Error</td><td>Product missmatch: To compare , please select exactly two comp of the same product.
<code>tests.ProductMissmatchException: Product missmatch: To compare , please select exactly two comp of the same product. at tests.Compare.testCompare(Compare.java:45)</code></td><td>7.141</td>
</tr>
</body>
</html>
==================================
xyz.txt should look like this:
==================================
testLoginInitialLoad Success
testFailedSignOn Success
testLoginSignOn Success
testComponents Failure junit.framework.AssertionFailedError: null at tests.Components.Components(ProductComponents.java:44)
testCompare Error Product missmatch: To compare , please select exactly two comp of the same product. tests.ProductMissmatchException: Product missmatch: To compare , please select exactly two comp of the same product. at tests.Compare.testCompare(Compare.java:45)

Try this:
cat abc.html | sed -e 's/<td>/\%/g' -e 's/<[^>]*>//g' | egrep -v '^$' | tr "%" " " | egrep -i '(Suc|Fail|Err)'
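The one-liner above works line by line, so a failure's stack trace ends up on its own line, because the description cell spans two lines in abc.html. As a rough sketch, and not the only way to do it, the awk function below buffers lines until it sees </tr> and then prints the name, status, and description of each four-cell row. The function name html_to_results is made up for illustration, and the sketch assumes every test row ends with </tr> and has exactly four <td> cells, as in the sample. A real HTML parser would be more robust.

```shell
#!/bin/sh
# Sketch only: assumes each test row closes with </tr> and holds exactly
# four <td> cells (name, status, description, time), as in the sample.
# html_to_results is a hypothetical helper name.
html_to_results() {
    awk '
    function clean(s) {
        gsub(/<[^>]*>/, " ", s)     # replace any remaining tags with a space
        gsub(/[ \t\r]+/, " ", s)    # collapse runs of whitespace
        gsub(/^ | $/, "", s)        # trim both ends
        return s
    }
    { buf = buf " " $0 }            # a row may span several lines
    /<\/tr>/ {
        n = split(buf, cell, /<td>/)
        if (n == 5) {               # text before the first <td> plus four cells
            line = clean(cell[2]) " " clean(cell[3])
            desc = clean(cell[4])
            if (desc != "") line = line " " desc
            print line
        }
        buf = ""                    # start collecting the next row
    }' "$1"
}

# Run only when a file name is given on the command line.
if [ "$#" -gt 0 ]; then
    html_to_results "$1"
fi
```

Saved under a name such as parse_results.sh (hypothetical), it could be run as `sh parse_results.sh abc.html > xyz.txt`. The five-cell summary row (TestSuite) and the header rows are skipped because they do not have exactly four <td> cells.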

Great, this is what I'm looking for. Thank you very much.
I want to make this an executable so I can call it from another program or script. I'm not sure why the following snippet gives me an error.
#!/usr/bin/perl -w
$File="0_abc.html";
cat $File|sed -e 's/<td>/\%/g'-e 's/<[^>]*>//g'|egrep -v '^$'|tr "%" " " |egrep -i '(Suc|Fail|Err)' >> test.txt
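Two things in the snippet above look like probable causes of the error: the pipeline is shell syntax but the shebang asks for Perl, and there is no space between the first sed expression and the second -e. One possible fix, sketched here under the assumption that a plain shell script is acceptable, is to wrap the pipeline in a shell function. The name extract_results is invented for this example.

```shell
#!/bin/sh
# Sketch: the sed pipeline from the earlier reply, wrapped in a function.
# extract_results is a hypothetical name; the file is passed as an argument.
extract_results() {
    sed -e 's/<td>/%/g' -e 's/<[^>]*>//g' "$1" |   # mark cells, strip tags
        grep -v '^$' |                             # drop empty lines
        tr '%' ' ' |                               # cell markers become spaces
        grep -E -i '(Suc|Fail|Err)'                # keep lines with a status
}

# Run only when a file name is given, appending to test.txt as in the post.
if [ "$#" -gt 0 ]; then
    extract_results "$1" >> test.txt
fi
```

After `chmod +x` on the script, another program could call it with the report file as its only argument, for example `./extract_results.sh 0_abc.html` (the script name is an assumption).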

Hi there,
There was a typo in my script; I corrected it and it works well. However, the script produces the output below, and there are a few things I don't want in it. The first line, NameTestsErrorsFailuresTime(s), should go. There is also whitespace at the beginning of each line that is not required, and extra whitespace after the status, before the time is printed on each line. In short:
1) Get rid of the first line.
2) Remove whitespace from the beginning of each line.
3) Remove the extra whitespace after the status and before the time(s).
================================
This is the current output from the above script:
====================================
NameTestsErrorsFailuresTime(s)
testLoginInitialLoad Success 7.938
testFailedSignOn Success 7.156
testLoginSignOn Success 8.078
testHomeTabNone Success 16.469
==========================
We want the following output:
===========================
testLoginInitialLoad Success 7.938
testFailedSignOn Success 7.156
testLoginSignOn Success 8.078
testHomeTabNone Success 16.469
testDefectProductComponents Failure null
testDefectRunsCompare Error Product missmatch: To compare runs, please select exactly two runs of the same product.
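The three cleanup steps listed above could also be done as a small post-filter on the existing output. This is only a sketch: tidy_results is a made-up name, and it assumes the unwanted header is the line that starts with NameTests, as in the output shown.

```shell
#!/bin/sh
# Sketch: post-filter for the pipeline output. tidy_results is a
# hypothetical name. Reads the named files, or stdin when none are given.
tidy_results() {
    sed -e '/^NameTests/d' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]][[:space:]]*/ /g' "$@"
    # 1st expression: drop the summary header line
    # 2nd expression: strip leading whitespace
    # 3rd expression: squeeze each run of whitespace to one space
}
```

It can be appended to the earlier pipeline with `| tidy_results`, or run on a saved file as `tidy_results test.txt`.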

[code]
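awk '/<td>/,/<\/td>/ {
    if ($0 ~ /TestSuite/) next      # skip the suite summary row
    gsub(/<\/?td>/, " ")            # turn cell tags into spaces
    gsub(/<\/?code>/, " ")          # drop the <code> wrappers
    gsub(/^ +| +$/, "")             # trim leading and trailing spaces
    gsub(/  +/, " ")                # squeeze repeated spaces
    print                           # the original had no print, so it showed nothing
}
' "file"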
[/code]
