Name: esu (by Raj)
Date: June 12, 2007 at 14:03:05 Pacific
Subject: help: parsing html file using perl
OS: linux
CPU/Ram: 512
Model/Manufacturer: 64
Comment:
Hi All,
Can someone suggest a way to parse an HTML file for specific text using Perl, or any other programming language?
I have a file, abc.html, which is the output of one of our automated test suites. This result file contains the status of every test in the suite. All I want is to find each test name and its status (pass/fail) in the HTML file and write them to another text file called xyz.txt. Each line of xyz.txt should contain the test name, a space, the status, and then, separated by another space, the description of the test failure. The expected format is pasted below as xyz.txt.
<code>junit.framework.AssertionFailedError: null at tests.Components.Components(ProductComponents.java:44)</code></td><td>7.625</td> </tr> <tr valign="top" class="TableRowColor"> <td>testProductColumnSort</td><td>Success</td><td></td><td>20.484</td> </tr>
<tr valign="top" class="Error"> <td>testCompare</td><td>Error</td><td>Product missmatch: To compare , please select exactly two comp of the same product.
<code>tests.ProductMissmatchException: Product missmatch: To compare , please select exactly two comp of the same product. at tests.Compare.testCompare(Compare.java:45)</code></td><td>7.141</td> </tr> </body> </html>
===========================================
xyz.txt
======
testLoginInitialLoad Success
testFailedSignOn Success
testLoginSignOn Success
testComponents Failure junit.framework.AssertionFailedError: null at tests.Components.Components(ProductComponents.java:44)
testCompare Error Product missmatch: To compare , please select exactly two comp of the same product. tests.ProductMissmatchException: Product missmatch: To compare , please select exactly two comp of the same product. at tests.Compare.testCompare(Compare.java:45)
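One possible approach, sketched in Python since any language is acceptable here: walk the table rows with the standard-library `html.parser` module and keep the first three cells of each row (name, status, message). This assumes every result row has the shape seen in the HTML excerpt above, i.e. `<td>` cells in the order name, status, message, time, and that the status is one of Success, Failure or Error. The names `ResultTableParser`, `summarize` and `write_summary` are made up for this illustration.

```python
from html.parser import HTMLParser

# Assumption: these are the only status values the report uses.
STATUSES = {"Success", "Failure", "Error"}


class ResultTableParser(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags such as <code> also lands here.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Collapse runs of whitespace (including newlines) to one space.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def summarize(html_text):
    """Return one 'name status message' line per test result row."""
    parser = ResultTableParser()
    parser.feed(html_text)
    parser.close()
    lines = []
    for cells in parser.rows:
        # Skip header rows and anything that is not a result row.
        if len(cells) >= 3 and cells[1] in STATUSES:
            # An empty message (successful test) is simply left out.
            lines.append(" ".join(part for part in cells[:3] if part))
    return lines


def write_summary(html_path, text_path):
    """Read the HTML report and write the summary lines to a text file."""
    with open(html_path, encoding="utf-8", errors="replace") as src:
        lines = summarize(src.read())
    with open(text_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(lines) + "\n")
```

Calling `write_summary("abc.html", "xyz.txt")` would then produce the text file. A real parser is used rather than a regular expression because the failure message spans several lines and contains a nested `<code>` tag, which line-by-line pattern matching tends to miss.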
There was a typo in my script; I corrected it and it works well. However, the script produces the output below. The things I don't want to see in the output are the first line, which is NameTestsErrorsFailuresTime(s), and the white space at the beginning of each line, which is not required. There is also extra white space after the status, before the time is printed on each line.
In short:
1) Get rid of the first line.
2) Remove the white space from the beginning of each line.
3) Remove the extra white space after the status and before the time(s).
================================
This is the current output from the above script:
====================================
NameTestsErrorsFailuresTime(s)
testLoginInitialLoad Success 7.938
testFailedSignOn Success 7.156
testLoginSignOn Success 8.078
testHomeTabNone Success 16.469
==========================
We want the following output:
===========================
testLoginInitialLoad Success 7.938
testFailedSignOn Success 7.156
testLoginSignOn Success 8.078
testHomeTabNone Success 16.469
testDefectProductComponents Failure null
testDefectRunsCompare Error Product missmatch: To compare runs, please select exactly two runs of the same product.
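The three cleanup steps could be sketched like this in Python, as a post-processing pass over the script's text output. It is only an illustration: `clean_report` and `HEADER` are hypothetical names, and the exact whitespace in the sample input is assumed, since the original spacing is not visible in the post. The header is matched by its text rather than by position, so the function still behaves if the header is missing.

```python
import re

# The unwanted header line, as quoted in the post.
HEADER = "NameTestsErrorsFailuresTime(s)"


def clean_report(text):
    """Drop the header line and normalize whitespace on every other line."""
    cleaned = []
    for line in text.splitlines():
        # Steps 2 and 3: collapse whitespace runs, then trim both ends.
        line = re.sub(r"\s+", " ", line).strip()
        # Step 1: skip the header (and any blank lines).
        if not line or line == HEADER:
            continue
        cleaned.append(line)
    return cleaned


if __name__ == "__main__":
    # Assumed sample: leading blanks and extra gaps before the time.
    raw = (
        "NameTestsErrorsFailuresTime(s)\n"
        "   testLoginInitialLoad Success      7.938\n"
        "\ttestFailedSignOn Success   7.156\n"
    )
    print("\n".join(clean_report(raw)))
```

Collapsing every whitespace run to a single space handles the leading blanks and the gap before the time in one pass, at the cost of also squeezing any double spaces inside a failure message.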