Grabbing and parsing HTML

  • nabberuk
    Member
    • May 2010
    • 82

    #1

    I have set up a simple HTTP item that grabs the HTML from a website (shown below). I am trying to grab the number of licenses used, so in the HTML below it will be the 7 in the "Company Name" row.

    HTML Code:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <html>
    <head>
    <title>Licences</title>
    <meta name="vs_defaultClientScript" content="JavaScript" />
    <meta name="vs_targetSchema" content="http://schemas.microsoft.com/intellisense/ie5" />
    <link href="style.css" type="text/css" rel="stylesheet" />
    <meta http-equiv="refresh" content="240" />
    </head>
    <body>
    <div align="right" style="margin-top: 0; margin-bottom: 0; padding-bottom: 0; padding-top: 0">
    <img src="REFRESH.GIF" align="absmiddle" title="Refresh data" alt="Refresh" border="0"
    onclick="javascript:window.location.reload( false );" style="cursor: hand">
    </div>
    <h1 class="Title">
    Current Licences
    </h1>
    <table class="Results" cellspacing="1" cellpadding="3">
    <tr class="ResultHeader">
    <th class="ResultHeader">
    Product
    </th>
    <th class="ResultHeader">
    Licences
    </th>
    <th class="ResultHeader">
    Free
    </th>
    <th class="ResultHeader">
    Used
    </th>
    <th class="ResultHeader">
    Expires after
    </th>
    </tr>
    <tr class=Result><td><img src=excl.gif align=absmiddle>&nbsp;Reporting Services Browser</td><td colspan=5 class=locked align=right>Expired&nbsp;19 Feb 2020</td></tr><tr class=Result2><td><img src=tick.gif align=absmiddle>&nbsp;Emailing Suite</td><td align=center>16</td><td align=center>16</td><td align=center>0</td><td>09 Mar 2022</td></tr><tr class=Result><td><img src=tick.gif align=absmiddle>&nbsp;Company Name</td><td align=center>14</td><td align=center>7</td><td align=center>7</td><td>09 Mar 2022</td></tr><tr class=Result2><td><img src=excl.gif align=absmiddle>&nbsp;MyPortal</td><td colspan=5 class=locked align=right>Expired&nbsp;10 Mar 2021</td></tr><tr class=Result><td><img src=excl.gif align=absmiddle>&nbsp;Data Transformation Tool</td><td colspan=5 class=locked align=right>Expired&nbsp;20 Aug 2021</td></tr>
    </table>
    </body>
    </html>
    <p class=TimeTag>Page refreshed Friday, 17 December 2021 15:56<br>Uptime 0001 12:17<br>Copyright &copy;2006-2021 Company Name.</p>
    I have been playing around with the preprocessors. At first I added the following JavaScript, which I was hoping would grab the value using XPath.

    Code:
    var path = "/html/body/table/tbody/tr[4]/td[3]";
    return XML.query(path, value);
    With this method I get the following error:
    HTML Code:
    Error: cannot parse xml value: Start tag expected, '<' not found
    After this I thought the first line or the last line could be causing issues: the first because of the error message, and the last because it is incorrectly formatted. So I used the trim preprocessor to remove the first and last lines, but I continue to get the error about the '<' not being found.
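
For reference, two details of the attempt above may be worth checking. The Zabbix documentation, as far as I can tell, lists the call as XML.query(data, expression), so passing the path first would make the parser read the XPath string as XML, which would match the "Start tag expected" error. Separately, the page is not well-formed XML (unquoted attributes such as class=Result, the &nbsp; entity, no tbody element in the raw source), so an XML parser may reject it even with the arguments in the documented order. The sketch below sidesteps XML parsing and pulls the cell out with regular expressions instead. It is only a minimal illustration, assuming the row layout shown in the sample; cellForProduct is a hypothetical helper name, not part of Zabbix.

```javascript
// Hypothetical helper (not a Zabbix API): return the text of the n-th <td>
// (1-based) in the row whose first cell contains `product`, or null.
function cellForProduct(html, product, n) {
  // Every data row in the sample starts with "<tr", so split on that.
  var rows = html.split(/<tr\b/i);
  for (var i = 0; i < rows.length; i++) {
    // Collect the contents of each <td ...>...</td> in this row.
    var cells = [];
    var re = /<td\b[^>]*>([\s\S]*?)<\/td>/gi;
    var m;
    while ((m = re.exec(rows[i])) !== null) {
      // Drop nested tags (e.g. <img>) and &nbsp; entities, then trim.
      cells.push(m[1].replace(/<[^>]*>/g, '').replace(/&nbsp;/g, ' ').trim());
    }
    // Header rows use <th>, so they yield no cells and are skipped.
    if (cells.length === 0 || cells[0].indexOf(product) === -1) {
      continue;
    }
    return cells.length >= n ? cells[n - 1] : null;
  }
  return null;
}

// In a JavaScript preprocessing step, where the fetched page is available
// as `value`, the script body would end with something like:
//   return cellForProduct(value, 'Company Name', 4);
// Column 4 is "Used" in the sample (Product, Licences, Free, Used, Expires after).
```

Matching on the product name rather than a fixed row index means the step keeps working if rows are added or reordered, although it will still break if the page changes its markup.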