Parse HTML using jQuery-like syntax in PHP

Quite often you need to parse HTML and extract values from deeply nested tables or similar markup. The most obvious solution is regular expressions, but they cope poorly with nested tags. Another option is XPath, which handles nesting much better but has a less approachable syntax.
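To make the nested-tag problem concrete, here is a minimal sketch using PHP's built-in preg_match(). The sample strings and variable names are made up for illustration. A lazy pattern stops at the first closing tag and cuts a nested element short, while a greedy one runs through to the last closing tag and swallows sibling elements.

```php
<?php
// Illustrative markup (not from a real page): one div nested inside another
$nested = '<div>outer <div>inner</div> tail</div>';

// Lazy match: stops at the FIRST </div>, so the outer element is truncated
preg_match('~<div>(.*?)</div>~s', $nested, $m);
$lazy = $m[1];   // 'outer <div>inner'

// Illustrative markup: two sibling divs
$siblings = '<div>a</div><div>b</div>';

// Greedy match: runs to the LAST </div>, so both elements are glued together
preg_match('~<div>(.*)</div>~s', $siblings, $m);
$greedy = $m[1]; // 'a</div><div>b'
```

Neither quantifier can know which closing tag belongs to which opening tag, which is why a real parser is the safer tool here.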

Nowadays almost every PHP developer knows jQuery, which has become something of a standard in front-end development. Why not use its familiar syntax for HTML parsing?

Of course, jQuery itself is JavaScript and cannot be used for parsing in PHP, but there are PHP implementations. They aim to support the DOM traversal and manipulation that the original jQuery offers.

Let's get to a simple example. From the table below, I need to extract the values into an array.

<table class="oDescTable oJobClient">
<tr>
<th>Total Spent</th>
<td><strong>
Over $10,000
</strong></td>
</tr>
<tr>
<th>Hours Billed</th>
<td><strong>7,575</strong></td>
</tr>
<tr>
<th>Jobs Posted</th>
<td><strong>136</strong></td>
</tr>
<tr>
<th>Hires</th>
<td><strong>51</strong></td>
</tr>
<tr>
<th>Open Jobs</th>
<td><strong>1</strong></td>
</tr>
<tr>
<th>Current Team Size</th>
<td><strong>0</strong></td>
</tr>
</table>

Using phpQuery

The basics of this library are covered in the phpQuery documentation. All the magic is done by one function, pq(), which is the analog of jQuery's $().

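// Include the library
require_once 'phpQuery.php';
// Load the HTML document ($html holds the table markup shown above)
phpQuery::newDocumentHTML($html);
$p = array();
// Call pq() to select every row of the table
foreach (pq('table.oJobClient tr') as $tr) {
    // Iteration yields plain DOM nodes, so wrap each one to get the jQuery-like API
    $tr = pq($tr);
    // Use the header cell as the key and the data cell as the value
    $p[trim($tr->find('th')->text())] = trim($tr->find('td')->text());
}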

The result is an array of values:

array (
  'Total Spent' => 'Over $10,000',
  'Hours Billed' => '7,575',
  'Jobs Posted' => '136',
  // etc.
)

Simple, isn't it? And this is still only a small portion of what the library is capable of.
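For comparison, here is a rough sketch of the same extraction done with XPath, using PHP's bundled DOMDocument and DOMXPath classes instead of phpQuery. It assumes the dom extension is enabled and uses a shortened copy of the table; the variable names are illustrative only.

```php
<?php
// Shortened copy of the sample table (first two rows only)
$html = <<<'HTML'
<table class="oDescTable oJobClient">
<tr>
<th>Total Spent</th>
<td><strong>
Over $10,000
</strong></td>
</tr>
<tr>
<th>Hours Billed</th>
<td><strong>7,575</strong></td>
</tr>
</table>
HTML;

// Collect libxml parse warnings internally instead of printing them
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$p = array();
// XPath has no class selector, so test for "oJobClient" as a whole word in @class
$rows = $xpath->query(
    '//table[contains(concat(" ", normalize-space(@class), " "), " oJobClient ")]//tr'
);
foreach ($rows as $tr) {
    // Evaluate the cell queries relative to the current row
    $th = $xpath->query('./th', $tr)->item(0);
    $td = $xpath->query('./td', $tr)->item(0);
    if ($th !== null && $td !== null) {
        $p[trim($th->textContent)] = trim($td->textContent);
    }
}
```

The logic is the same, but the class test alone shows why the CSS-style selector 'table.oJobClient tr' is easier to read and write.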

 
