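This piece of code retrieves the HTML source of a page over a plain socket connection, which you can then process and manipulate as you see fit. Note that the result is the raw HTTP response, so the response headers come before the HTML itself. In the example below, the source code of my Last.fm page is retrieved.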
<?php
// This variable will hold the code
$content = "";

// You have to divide host and the rest of the url,
// so in this example, the full url you wanted to get
// would be http://www.last.fm/user/CaseuS_
// As you can see, you don't have to specify the
// protocol used.
$host = "www.last.fm";
$url = "/user/CaseuS_";

// Open the connection on port 80
// $errno and $errstr can be used to
// get the error number and error message
// in case something went wrong.
// 30 stands for the timeout, in seconds,
// that we use for the connection attempt.
$fp = fsockopen($host, 80, $errno, $errstr, 30);

if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "GET $url HTTP/1.1\r\n";
    $out .= "Host: $host\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);

    while (!feof($fp)) {
        $content .= fgets($fp, 128);
    }
    fclose($fp);
}

// Do something with $content, e.g. output the whole code:
echo $content;
?>
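Because the socket returns the whole HTTP response, $content starts with the status line and the headers, not with the HTML. Here is a minimal sketch of how the two could be separated. The function name split_http_response and the sample response are my own inventions for illustration, and the sketch assumes the body is not sent with chunked transfer encoding, which an HTTP/1.1 server is allowed to use.

```php
<?php
// Hypothetical helper (not part of the original snippet): splits a raw
// HTTP response into status line, headers and body.
// Assumption: the body is NOT chunked; chunked bodies need extra decoding.
function split_http_response(string $raw): array
{
    // The header block ends at the first empty line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);

    // The first line of the header block is the status line.
    $headerLines = explode("\r\n", $parts[0]);
    $statusLine = array_shift($headerLines);

    // Turn "Name: value" lines into an array keyed by lowercase name.
    $headers = [];
    foreach ($headerLines as $line) {
        $pair = explode(":", $line, 2);
        if (count($pair) === 2) {
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }

    return [
        "status"  => $statusLine,
        "headers" => $headers,
        // If no blank line was found, there is no body.
        "body"    => $parts[1] ?? "",
    ];
}

// Made-up sample response, standing in for $content from the code above.
$sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>";
$result = split_http_response($sample);

echo $result["status"], "\n";
echo $result["body"], "\n";
```

In the original script you would call split_http_response($content) after fclose($fp) and work with the "body" entry instead of the full response.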