PHP: Lesson 4 Grab a Web Page into PHP

Programming tutorial

Part of a multi-page tutorial on the PHP Programming Language.

Authors

Written 2010 by Will Johnson for Fast Forward Technologies
Email Fast Forward Technologies at fft2001@aol.com
or post your comments for public view far below.
Creative Commons Attribution 3.0 License

<-- Start at PHP: Lesson 1 "Hello World"
<-- Back to PHP: Lesson 3 Forms
--> Forward to PHP: Lesson 5 Send Email using a PHP Script

This Knol was featured on Knol's front page on 22 Feb 2010.

In this lesson we're going to learn two ways to grab a webpage and pull it into our PHP program so we can do something more with it.  There is one caveat, however.  Some webpages pull in pieces of other pages as they build themselves, for example through frames or scripts that run in the browser.  You see the final result without realizing that it came from five different places.  Fetching such a page from PHP gives you only the source that the server sends for that one address, not the assembled result.  What we can do is pull in pages which don't do this weird stuff.  If you do pull a page and realize it's missing something, look through its source for any code which seems to load another page, with that page's address as an argument.  That will tell you where to get the rest of the result.  Okay, let's see how to get our simple pages.
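
As a rough sketch of that "look through it" step, the snippet below scans a page's source for frame and script tags and lists the addresses they load. It assumes the source is already in a string (the rest of this lesson shows how to get it). The function name find_embedded_sources() and the sample page are made up for illustration, and the simple pattern only catches the common src="..." form, so treat it as a starting point rather than a complete detector.

```php
<?php
// Hypothetical helper (not part of the lesson): list the addresses that a
// page's frames and scripts load, so you can see where missing pieces come from.
function find_embedded_sources($html)
{
    // Look for <frame>, <iframe> and <script> tags that carry a src="..." attribute.
    $pattern = '/<(?:frame|iframe|script)\b[^>]*\bsrc\s*=\s*["\']([^"\']+)["\']/i';
    preg_match_all($pattern, $html, $matches);
    return $matches[1]; // the captured addresses, in page order
}

// Made-up page source, standing in for a page you have already fetched.
$html = '<html><body>'
      . '<iframe src="http://www.penguins.com/mypages/menu.html"></iframe>'
      . '<script src="/scripts/news.js"></script>'
      . '<p>Hello</p>'
      . '</body></html>';

foreach (find_embedded_sources($html) as $address) {
    echo $address . "\n";
}
```

Each address it prints is a candidate for a second fetch if the first page came back incomplete.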

There are two ways to get a webpage into PHP.  The most straightforward way is to use the fopen() function.  Typically fopen() is used to open a local file, that is, one sitting on the same machine as your webserver and PHP interpreter.  For larger businesses, this will be a file on your own computer or network.  For small businesses and individuals, this will be a file on the web host, which is typically some local company you pay thirty bucks a month to host your site.  They will have told you how to upload files by FTP, or you can ask them.  At any rate, the files you open with fopen() are normally there on that same server.  The "w" option on the function means that we're opening the file for "writing" (this creates the file if it doesn't exist and empties it if it does), so the normal way to use fopen() might be like this:

<?php
$file = fopen("myfile.txt","w");
?>
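
To see what a file pointer is good for, here is a minimal sketch that writes one line to a local file and then reads it back. The file name lesson4-example.txt and its location in the system's temporary directory are assumptions chosen for the example. fwrite() and fgets() are the standard PHP functions for writing to and reading a line from an open file.

```php
<?php
// Hypothetical file name in the system's temporary directory.
$path = sys_get_temp_dir() . "/lesson4-example.txt";

// "w" opens for writing: it creates the file, or empties it if it already exists.
$file = fopen($path, "w") or exit("Unable to open file for writing");
fwrite($file, "Hello from PHP\n");
fclose($file);

// "r" opens the same file for reading only.
$file = fopen($path, "r") or exit("Unable to open file for reading");
$firstLine = fgets($file);
fclose($file);

echo $firstLine;
```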

However, fopen() also allows you to open an internet location, provided the allow_url_fopen setting is turned on in your PHP configuration, like so:

<?php
$file = fopen("http://www.penguins.com/mypages/something.html","r");
?>
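
Opening an internet location this way only works when PHP's allow_url_fopen setting is on, and some web hosts turn it off. The sketch below is one way to check before relying on it. The helper name url_fopen_enabled() is made up for this example.

```php
<?php
// Hypothetical helper: report whether this PHP installation lets fopen() open URLs.
function url_fopen_enabled()
{
    // ini_get() returns the setting as a string; interpret it as a boolean.
    return filter_var(ini_get("allow_url_fopen"), FILTER_VALIDATE_BOOLEAN);
}

if (url_fopen_enabled()) {
    echo "fopen() can open internet locations on this server.\n";
} else {
    echo "allow_url_fopen is off; fopen() cannot open http:// addresses here.\n";
}
```

If the setting is off and you cannot change it, the cURL approach later in this lesson is the usual fallback.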


The "r" option states that we're opening this for "read" only.  You cannot open an internet location for writing, but you can open a local file for writing.  You can then read one line at a time out of the file by using the fgets() function:

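<?php
// Open the page for reading, or stop with a message if that fails
$file = fopen("http://www.penguins.com/mypages/something.html", "r") or exit("Unable to Open File");
// Output one line of the file at a time until the end is reached
while (!feof($file))
  {
  echo fgets($file) . "<br />";
  }
// Close the file when we are done with it
fclose($file);
?>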

The "!" operator means "not", and the feof() function means "we are at the end of the file."  So what this code says, in English, is: open the file for reading or, if you can't, exit with an error message.  If you can open it, assign the file pointer to the $file variable.  Now, while we are not at the end of the file, echo (to the screen) one line out of the file and follow it with a <br /> (line break) HTML tag.  When we are finally at the end of the file, we drop out of the while loop and hit the fclose() function, which closes the file and releases the file pointer.

That, in English, is what the code is doing.  Code is a lot briefer than English, isn't it!
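
A variation some programmers prefer is to test what fgets() returns instead of calling feof(), and to collect the lines in an array rather than echoing them straight away. The sketch below shows that idea. The helper name read_all_lines() is made up, and the demonstration reads a small temporary local file so it can run without an internet connection. An "http://..." address opened with fopen() would be read the same way.

```php
<?php
// Hypothetical helper: read every line of an already opened file into an array.
function read_all_lines($file)
{
    $lines = array();
    // fgets() returns false when there is nothing left to read (or on error),
    // so the loop stops without a separate feof() test.
    while (($line = fgets($file)) !== false) {
        $lines[] = rtrim($line, "\r\n"); // drop the line ending
    }
    return $lines;
}

// Demonstration on a local temporary file with made-up contents.
$path = tempnam(sys_get_temp_dir(), "lesson4");
file_put_contents($path, "first line\nsecond line\n");

$file = fopen($path, "r") or exit("Unable to Open File");
$lines = read_all_lines($file);
fclose($file);
unlink($path);

echo count($lines) . " lines read\n";
```

Having the lines in an array makes it easy to "do something more" with the page, such as searching each line for a word.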


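The other way to get a Web page into PHP is by using cURL, which must be enabled as an extension in your PHP installation.  The curl_init() function starts a session, curl_setopt() sets options on it (here the address to fetch, and an instruction to return the page as a string instead of printing it), curl_exec() performs the request, and curl_close() ends the session.  Here is the code: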

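<?php
    // Start a cURL session
    $ch = curl_init();
    // Tell cURL which page to fetch
    curl_setopt($ch, CURLOPT_URL, "penguins.com/mypages/something.html");
    // Return the page as a string instead of printing it directly
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // Fetch the page; $output now holds its contents (or false on failure)
    $output = curl_exec($ch);
    // End the session
    curl_close($ch);
?>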
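
The code above does not check whether the fetch worked. As a sketch of one way to do that, the function below wraps the same calls and reports a failure instead of silently returning nothing. The name fetch_page(), the $error parameter and the timeout values are assumptions made for this example. curl_error(), CURLOPT_CONNECTTIMEOUT and CURLOPT_TIMEOUT are standard parts of PHP's cURL extension.

```php
<?php
// Hypothetical helper: fetch a page with cURL and return its text,
// or false if the request failed ($error then holds the reason).
function fetch_page($url, &$error = "")
{
    if (!function_exists("curl_init")) {
        $error = "The cURL extension is not loaded";
        return false;
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the page instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);    // give up connecting after 5 seconds
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up on the whole request after 15 seconds

    $output = curl_exec($ch);      // false on failure
    if ($output === false) {
        $error = curl_error($ch);  // human-readable reason for the failure
    }
    curl_close($ch);

    return $output;
}

// Example use (commented out so that loading this file makes no network request):
// $page = fetch_page("penguins.com/mypages/something.html", $error);
// if ($page === false) {
//     echo "Could not fetch the page: " . $error . "\n";
// } else {
//     echo strlen($page) . " bytes received\n";
// }
```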

And remember, you will not be able to change someone else's Web page and have others see your changes.  Fetching a page only gives your program its own copy.  So don't think you're going to be able to be sneaky and silly here.  In order to change someone else's Web pages, you need the password to log in to their FTP server.

