Parsing Knol's Search Toolkit Results (Page 2)

Using the classes in Knol's output to code a PHP parsing routine

This article shows you how to parse the Knol Search Toolkit's output in a PHP program by understanding its use of classes, so you can do whatever further manipulation you want with it.


Written 2010 by Will Johnson for Fast Forward Technologies
Creative Commons Attribution 3.0 License



In our first chapter, we laid out the format of the paragraphs on the Knol Search Toolkit's results page without explaining the classes.  Now I will tell you what each class means, as best I can.  Remember that our goal here is to parse the values assigned to the classes, so we are going to treat some classes as dummies, simply because they seem to have no associated value.

 <li class="knol-search-bullet"><div class="knol-search-knol">  Start a new results paragraph
 <div class="knol-search-left">  Left side of paragraph
 <div class="knol-search-knol-image-c">  Thumbnail image
 <div class="knol-search-mid">  Middle of paragraph
 <a class="knol-search-knol-title">  Text Title (link embedded here)
 <div class="knol-search-knol-author">  Author (link to author here)
 <div class="knol-search-knol-snippet">  Snippet from the article
 <div class="knol-search-right">  Right side of paragraph
 <div class="knol-search-knol-info knol-search-knol-info-pageviews"><span class="knol-search-knol-info-details">  Pageview count
 <span class="knol-search-knol-info knol-search-knol-info-version"><span class="knol-search-knol-info-details">  Article version number
 <span class="knol-search-knol-info knol-search-knol-info-edited"><span class="knol-search-knol-info-details">  Last edit timestamp
 <span class="zzAggregateRating">  Star Rating
 <span class="knol-zipit-count-display">  Number of raters
 <class="knol-badge-small knol-sprite-main-top_viewed_badge">  Top Viewed Badge
 <class="knol-badge-small knol-sprite-main-quality_badge">  Top Quality Badge
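
To make the table concrete, here is a minimal sketch of how those classes might be read with PHP's built-in DOM extension. It is not the script linked from this article. The function names firstByClass and parseKnolResults, the sample HTML, the example.com URLs, and the assumption that the pageview count is a number with optional commas are all made up for illustration. Note that knol-search-knol-info-details is shared by the pageviews, version, and edited entries, so the sketch finds the pageviews wrapper first and only then looks inside it.

```php
<?php
// Hypothetical helper: first element under $context that carries the class $class.
function firstByClass(DOMXPath $xpath, DOMNode $context, string $class): ?DOMElement
{
    // Match whole class tokens, since one class attribute can hold several names.
    $query = sprintf(
        './/*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
    $nodes = $xpath->query($query, $context);
    $node = ($nodes !== false && $nodes->length > 0) ? $nodes->item(0) : null;
    return $node instanceof DOMElement ? $node : null;
}

// Hypothetical parser: one associative array per search result.
function parseKnolResults(string $html): array
{
    $doc = new DOMDocument();
    // Scraped pages are rarely clean, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $items = $xpath->query(
        '//li[contains(concat(" ", normalize-space(@class), " "), " knol-search-bullet ")]'
    );

    $rows = [];
    foreach ($items as $item) {
        $title  = firstByClass($xpath, $item, 'knol-search-knol-title');
        $author = firstByClass($xpath, $item, 'knol-search-knol-author');
        $authorLink = $author ? $xpath->query('.//a', $author)->item(0) : null;
        // The details span is reused, so scope it to the pageviews wrapper.
        $viewsBox = firstByClass($xpath, $item, 'knol-search-knol-info-pageviews');
        $views = $viewsBox
            ? firstByClass($xpath, $viewsBox, 'knol-search-knol-info-details')
            : null;

        $rows[] = [
            'title'      => $title ? trim($title->textContent) : '',
            'url'        => $title ? $title->getAttribute('href') : '',
            'author'     => $author ? trim($author->textContent) : '',
            'author_url' => $authorLink instanceof DOMElement
                ? $authorLink->getAttribute('href') : '',
            // Assumption: strip everything but digits, so "1,234" becomes 1234.
            'pageviews'  => $views
                ? (int) preg_replace('/\D/', '', $views->textContent) : 0,
        ];
    }
    return $rows;
}

// Made-up sample that follows the nesting in the table above.
$sample = <<<HTML
<ul>
<li class="knol-search-bullet"><div class="knol-search-knol">
  <div class="knol-search-mid">
    <a class="knol-search-knol-title" href="http://example.com/knol-a">First Knol</a>
    <div class="knol-search-knol-author"><a href="http://example.com/author">A. Author</a></div>
  </div>
  <div class="knol-search-right">
    <div class="knol-search-knol-info knol-search-knol-info-pageviews"><span class="knol-search-knol-info-details">1,234</span></div>
  </div>
</div></li>
</ul>
HTML;

print_r(parseKnolResults($sample));
```

The same pattern should extend to the version, edited, and rating classes by swapping in the wrapper class name, assuming the real markup nests them the way the table suggests.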

Knowing this table, I can now use PHP to build an HTML table of my first 50 Knols, sorted by pageviews and showing just the title with its link, the author with a link, and the pageviews.  You can run the PHP code I just wrote at any time by clicking this link, and it will recreate the table as of that moment.  Don't you want a copy of this code?
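
The linked script is not reproduced here. As a rough idea of what the table-building step might look like, the sketch below takes rows that have already been parsed and prints them as an HTML table, highest pageviews first. The function name renderKnolTable, the array keys (title, url, author, author_url, pageviews), and the sample data are assumptions made for this example, and sorting inside PHP is my choice here rather than something the original script is known to do.

```php
<?php
// Hypothetical renderer: rows in, HTML table out, limited to the first $limit rows.
function renderKnolTable(array $rows, int $limit = 50): string
{
    // $rows is a local copy, so sorting it does not touch the caller's array.
    usort($rows, function (array $a, array $b): int {
        return $b['pageviews'] <=> $a['pageviews']; // descending
    });
    $rows = array_slice($rows, 0, $limit);

    $html = "<table>\n<tr><th>Title</th><th>Author</th><th>Pageviews</th></tr>\n";
    foreach ($rows as $row) {
        // Escape everything that came from the scraped page before echoing it back.
        $html .= sprintf(
            "<tr><td><a href=\"%s\">%s</a></td><td><a href=\"%s\">%s</a></td><td>%d</td></tr>\n",
            htmlspecialchars($row['url'], ENT_QUOTES),
            htmlspecialchars($row['title'], ENT_QUOTES),
            htmlspecialchars($row['author_url'], ENT_QUOTES),
            htmlspecialchars($row['author'], ENT_QUOTES),
            $row['pageviews']
        );
    }
    return $html . "</table>\n";
}

// Made-up rows standing in for parsed search results.
$knols = [
    ['title' => 'First Knol',  'url' => 'http://example.com/knol-a',
     'author' => 'A. Author', 'author_url' => 'http://example.com/author', 'pageviews' => 120],
    ['title' => 'Second Knol', 'url' => 'http://example.com/knol-b',
     'author' => 'A. Author', 'author_url' => 'http://example.com/author', 'pageviews' => 4500],
];

echo renderKnolTable($knols);
```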

Update: I've made an even better version of this code, which runs through the first 550 of my Knols, opening each one and displaying its accurate, up-to-the-minute view count.  Click this link to see it.  I hope to modify this script shortly so that anyone can use it to view their own total view count.

Update: Nov 2010.  Well, I was planning to mod this so that anyone could use it, but I went off in a new direction instead and added a column to the display for Pageviews This Week, which you can see at this link. (I've now moved this link from the slightly irrelevant site to my site.)

There is still something wrong in those cases where Knol thinks the article is written in a language other than English.  I need to review those cases before I take the next step, and see if I can figure out what's wrong.


Update: Dec 2010.  I've now made a new PHP script which parses and displays up-to-the-minute statistics for my 50 newest Knols.  You can view the results page at this link.  I'm also working on a version of the Pageviews This Week code which can format its output for MediaWiki installations, since MediaWiki has its own markup language which only partly mimics HTML.
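
For a sense of how the MediaWiki output might differ from the HTML version, here is a minimal sketch that writes rows as a wikitext table with external links. The function names wikiEscape and renderWikiTable, the array keys (including week_views), and the sample data are hypothetical. Escaping brackets and pipes with numeric entities is a precaution I am assuming is wanted, not something taken from the actual script.

```php
<?php
// Hypothetical helper: neutralise characters that have meaning in wikitext links and tables.
function wikiEscape(string $text): string
{
    return strtr($text, ['[' => '&#91;', ']' => '&#93;', '|' => '&#124;']);
}

// Hypothetical renderer: rows in, MediaWiki table markup out.
function renderWikiTable(array $rows): string
{
    // "{|" opens a table, "!" marks header cells, "|-" starts a row, "|}" closes the table.
    $out = "{| class=\"wikitable sortable\"\n! Title !! Author !! Pageviews This Week\n";
    foreach ($rows as $row) {
        $out .= "|-\n";
        // External links are written as [url label]; "||" separates cells on one line.
        $out .= sprintf(
            "| [%s %s] || [%s %s] || %d\n",
            $row['url'],
            wikiEscape($row['title']),
            $row['author_url'],
            wikiEscape($row['author']),
            $row['week_views']
        );
    }
    return $out . "|}\n";
}

// Made-up rows standing in for the Pageviews This Week data.
$rows = [
    ['title' => 'First Knol', 'url' => 'http://example.com/knol-a',
     'author' => 'A. Author', 'author_url' => 'http://example.com/author', 'week_views' => 37],
    ['title' => 'Pipes | and [brackets]', 'url' => 'http://example.com/knol-b',
     'author' => 'A. Author', 'author_url' => 'http://example.com/author', 'week_views' => 5],
];

echo renderWikiTable($rows);
```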

Update: May 2011.  I've now modded it further so that the columns are sortable, provided you have JavaScript turned on in your browser.  For this newest version, click here.

Update: Sep 2011.  I've run into the 800-Knol limit on my PHP script.  I've fixed that temporarily by adding my 50 newest Knols to the bottom, but that will only work until I reach 850 Knols, so I'll have to figure out a better solution before then.
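
One wrinkle with the stopgap is that, below 850 Knols, some of the 50 newest will already appear in the main list. A minimal sketch of merging the two lists without duplicates follows. The function name mergeKnolLists, the use of the Knol URL as the unique key, and the sample data are assumptions for illustration, and this does nothing about the underlying limit.

```php
<?php
// Hypothetical merge: append $newest to $main, keeping the first row seen for each URL.
function mergeKnolLists(array $main, array $newest): array
{
    $byUrl = [];
    foreach (array_merge($main, $newest) as $row) {
        // Assumption: the URL identifies a Knol uniquely.
        if (!isset($byUrl[$row['url']])) {
            $byUrl[$row['url']] = $row;
        }
    }
    // Drop the URL keys again so callers get a plain list.
    return array_values($byUrl);
}

// Made-up data: knol-b shows up in both lists.
$main = [
    ['title' => 'First Knol',  'url' => 'http://example.com/knol-a', 'pageviews' => 120],
    ['title' => 'Second Knol', 'url' => 'http://example.com/knol-b', 'pageviews' => 4500],
];
$newest = [
    ['title' => 'Second Knol', 'url' => 'http://example.com/knol-b', 'pageviews' => 4500],
    ['title' => 'Third Knol',  'url' => 'http://example.com/knol-c', 'pageviews' => 3],
];

$merged = mergeKnolLists($main, $newest);
echo count($merged), " unique Knols\n";
```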