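This piece of code retrieves the HTML source of a page over a raw socket connection, which you can then process and manipulate as you see fit. Note that the response it collects also contains the HTTP headers, which come before the HTML. In the example below, the source code of my Last.fm page is retrieved.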
<?php
// This variable will hold the code
$content = "";
// Split the URL into the host and the path.
// In this example, the full URL you want to fetch
// is http://www.last.fm/user/CaseuS_
// The protocol is left out because the request
// below is written by hand as plain HTTP.
$host = "www.last.fm";
$url = "/user/CaseuS_";
// Open the connection on port 80
// $errno and $errstr can be used to
// get the error number and error message
// in case something went wrong.
// 30 stands for the timeout, in seconds,
// that we use for the connection attempt.
$fp = fsockopen($host, 80, $errno, $errstr, 30);
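if (!$fp)
{
    // The connection failed: report why.
    echo "$errstr ($errno)<br />\n";
}
else
{
    // Build the request by hand. "Connection: Close" asks the server
    // to close the socket after responding, so feof() becomes true.
    $out = "GET $url HTTP/1.1\r\n";
    $out .= "Host: $host\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    // Read the response in pieces of up to 127 bytes (or one line,
    // whichever is shorter) until the server closes the connection.
    while (!feof($fp))
    {
        $content .= fgets($fp, 128);
    }
    fclose($fp);
}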
// $content now holds the full response: the HTTP headers,
// a blank line, and then the HTML.
// Do something with it, e.g. output the whole response:
echo $content;
?>
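
Because $content holds the whole HTTP response, the headers come before the HTML, and a server answering an HTTP/1.1 request may send the body with chunked transfer encoding. Below is a minimal sketch of how the parts could be separated. The helper names split_http_response and decode_chunked are hypothetical (they are not part of the snippet above), the sample response is made up for illustration, and the code assumes PHP 7 or later. It ignores edge cases such as chunk trailers and repeated header names.

```php
<?php
// Hypothetical helper: splits a raw HTTP response into its
// status line, header fields, and body.
function split_http_response(string $raw): array
{
    // The headers end at the first blank line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    $statusLine = array_shift($headerLines);

    $headers = [];
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, so store them in lowercase.
        $pair = explode(":", $line, 2);
        if (count($pair) === 2) {
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }

    return [
        "status"  => $statusLine,
        "headers" => $headers,
        "body"    => $parts[1] ?? "",
    ];
}

// Hypothetical helper: decodes a body that was sent with
// "Transfer-Encoding: chunked".
function decode_chunked(string $body): string
{
    $decoded = "";
    $pos = 0;
    $length = strlen($body);
    while ($pos < $length && ($eol = strpos($body, "\r\n", $pos)) !== false) {
        // Each chunk starts with its size in hexadecimal,
        // optionally followed by ";extensions".
        $sizeField = substr($body, $pos, $eol - $pos);
        $size = (int) hexdec(trim(explode(";", $sizeField)[0]));
        if ($size === 0) {
            break; // A zero-sized chunk marks the end of the body.
        }
        $decoded .= substr($body, $eol + 2, $size);
        // Skip the size line, the chunk data, and the CRLF after it.
        $pos = $eol + 2 + $size + 2;
    }
    return $decoded;
}

// Made-up sample response, standing in for $content.
$sample = "HTTP/1.1 200 OK\r\n"
        . "Content-Type: text/html\r\n"
        . "Transfer-Encoding: chunked\r\n\r\n"
        . "5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n";

$response = split_http_response($sample);
$body = $response["body"];
if (strtolower($response["headers"]["transfer-encoding"] ?? "") === "chunked") {
    $body = decode_chunked($body);
}
echo $body; // prints "Hello, world"
```

To use this with the script above, pass $content to split_http_response in place of $sample. Sending "HTTP/1.0" instead of "HTTP/1.1" in the request line is another common way to avoid chunked responses, since chunked encoding is an HTTP/1.1 feature.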