There are six commonly used ways to get a webpage's content (full HTML) in PHP. The methods are:
- using file() function
- using file_get_contents() function
- using fopen()->fread()->fclose() functions
- using curl
- using fsockopen() socket mode
- using a third-party library (such as “Snoopy”)
1. file()
<?php
$url = 'http://blog.oscarliang.net';

// use file() to read the page into an array of lines
$lines_array = file($url);

// join the array into a single string
$lines_string = implode('', $lines_array);

// output; you can also save it locally on the server
echo $lines_string;
?>
2. file_get_contents()
To read a URL with file(), file_get_contents(), or fopen(), “allow_url_fopen” must be enabled. Check your php.ini file and set allow_url_fopen = On. When allow_url_fopen is off, these functions cannot open remote URLs, although they still work on local files.
<?php
$url = 'http://blog.oscarliang.net';

// file_get_contents() reads the remote webpage content into a string
$lines_string = file_get_contents($url);

// output (escaped, so the HTML source is shown); you can also save it locally on the server
echo htmlspecialchars($lines_string);
?>
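Because the setting decides whether this method works at all, it can help to check it at runtime before fetching. The following is a minimal sketch; url_fopen_enabled() and fetch_page() are hypothetical helper names chosen for illustration, not built-in PHP functions.

```php
<?php
// Hypothetical helper: reports whether allow_url_fopen is enabled.
function url_fopen_enabled(): bool
{
    // ini_get() returns the setting as a string such as "1", "On", or "".
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

// Hypothetical helper: returns the page content, or null when it cannot be fetched.
function fetch_page(string $url): ?string
{
    if (!url_fopen_enabled()) {
        return null; // remote URLs cannot be opened while the setting is off
    }
    // @ suppresses the warning; failure is reported through the return value instead
    $html = @file_get_contents($url);
    return $html === false ? null : $html;
}
```

A call such as fetch_page('http://blog.oscarliang.net') would then return either the HTML or null, so the caller can switch to another method (curl, for example) when null comes back.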
3. fopen()->fread()->fclose()
<?php
$url = 'http://blog.oscarliang.net';

// fopen() opens the webpage in binary read mode
$handle = fopen($url, "rb");
if ($handle === false) {
    die("Unable to open $url");
}

// initialize
$lines_string = "";

// read the content in 1024-byte chunks until nothing is left
do {
    $data = fread($handle, 1024);
    if ($data === false || strlen($data) == 0) {
        break;
    }
    $lines_string .= $data;
} while (true);

// close the handle to release resources
fclose($handle);

// output; you can also save it locally on the server
echo $lines_string;
?>
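The read loop can be wrapped in a small reusable function that works on any open stream. This is only a sketch: read_all() is a hypothetical name, and the 1024-byte default simply mirrors the chunk size used in this section. PHP's built-in stream_get_contents() does the same job in a single call.

```php
<?php
// Hypothetical helper: read everything from an open stream in fixed-size chunks.
function read_all($handle, int $chunkSize = 1024): string
{
    $contents = '';
    while (!feof($handle)) {
        $data = fread($handle, $chunkSize);
        if ($data === false || $data === '') {
            break; // read error, or nothing more to read
        }
        $contents .= $data;
    }
    return $contents;
}
```

With a handle returned by fopen($url, "rb"), read_all($handle) would return the whole page as one string; the caller still closes the handle with fclose().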
4. curl
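You need to have the curl extension enabled to use it. On Windows, edit the php.ini file and uncomment this line: extension=php_curl.dll (extension=curl in PHP 7.2 and later). On Linux, install the curl package for PHP (for example, php-curl on Debian or Ubuntu).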
<?php
$url = 'http://blog.oscarliang.net';
$ch = curl_init();
$timeout = 5;

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);

// get the URL content
$lines_string = curl_exec($ch);

// close the handle to release resources
curl_close($ch);

// output; you can also save it locally on the server
echo $lines_string;
?>
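Since curl is an optional extension and curl_exec() returns false on failure, a slightly more defensive version may be useful. The sketch below assumes a hypothetical helper name, curl_fetch(); following redirects and setting an overall timeout are illustrative choices, not requirements.

```php
<?php
// Hypothetical helper: fetch a URL with curl, returning null on any failure.
function curl_fetch(string $url, int $timeout = 5): ?string
{
    if (!extension_loaded('curl')) {
        return null; // the curl extension is not enabled
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // seconds allowed for connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout * 2);    // seconds allowed for the whole transfer
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);     // follow HTTP redirects

    $body = curl_exec($ch);
    // The handle is released automatically when $ch goes out of scope.
    return $body === false ? null : $body;
}
```

The caller can test the result with `=== null` to tell a failed request apart from a page that is simply empty.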
5. fsockopen() socket mode
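<?php
// open a TCP connection to port 80 with a 30-second timeout
$fp = fsockopen("t.qq.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)\n";
} else {
    // build the HTTP request; each line must end with \r\n
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: t.qq.com\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);

    // read the response (headers and body) until the server closes the connection
    while (!feof($fp)) {
        echo fgets($fp, 128);
    }
    fclose($fp);
}
?>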
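Unlike the other methods, the socket approach returns the raw HTTP response, so the output starts with the status line and headers rather than the HTML. A minimal sketch of separating the two is shown here; split_http_response() is a hypothetical helper, and it assumes the whole response has first been collected into one string.

```php
<?php
// Hypothetical helper: split a raw HTTP response into status line, headers, and body.
function split_http_response(string $raw): array
{
    // Headers and body are separated by the first blank line (\r\n\r\n).
    $parts = explode("\r\n\r\n", $raw, 2);
    $body = $parts[1] ?? '';

    $lines = explode("\r\n", $parts[0]);
    $statusLine = array_shift($lines);

    $headers = [];
    foreach ($lines as $line) {
        $pair = explode(':', $line, 2);
        if (count($pair) === 2) {
            // Header names are case-insensitive, so store them in lowercase.
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }

    return ['status' => $statusLine, 'headers' => $headers, 'body' => $body];
}
```

Note that this sketch does not decode a body sent with "Transfer-Encoding: chunked", which HTTP/1.1 servers may use; sending the request as HTTP/1.0 is one common way to avoid chunked responses.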
6. Snoopy library
The Snoopy library has become quite popular. It’s very simple to use: it simulates a web browser from your server.
<?php
// include the Snoopy library
require('Snoopy.class.php');

// initialize the Snoopy object
$snoopy = new Snoopy;
$url = "http://t.qq.com";

// read the webpage content
$snoopy->fetch($url);

// save it to $lines_string
$lines_string = $snoopy->results;

// output; you can also save it locally on the server
echo $lines_string;
?>