Using cURL with PHP

  

Today, I’ll give you some hints on how to use cURL in PHP.

cURL is a library that allows a web server to transfer files to and from a remote computer using a variety of Internet protocols. The name stands for “Client URL”. cURL is also a command-line tool for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, GOPHER, LDAP, DICT, TELNET and FILE.

Once you’ve compiled PHP with cURL support, you can begin using the cURL functions.

The basic idea behind the cURL functions is simple: you initialize a cURL session with curl_init(), set all the options for the transfer with curl_setopt(), execute the session with curl_exec(), and finish it off with curl_close().
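To make the four steps concrete, here is a minimal sketch. The helper name fetch_url() and the 30-second timeout are my own assumptions for illustration, not something from php.net. The demo reads a temporary file through the FILE protocol so it runs without a network connection; an http:// URL is handled the same way.

```php
<?php
// Hypothetical helper (the name is an assumption): fetch a URL and
// return its body as a string, or null if the transfer failed.
function fetch_url(string $url): ?string
{
    $ch = curl_init();                               // 1. initialize the session
    curl_setopt($ch, CURLOPT_URL, $url);             // 2. set the transfer options
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  //    return the body instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);           //    give up after 30 seconds (arbitrary choice)
    $body = curl_exec($ch);                          // 3. execute the session
    curl_close($ch);                                 // 4. finish the session (has no effect since PHP 8.0)

    return $body === false ? null : $body;
}

// Demo: write a temporary file and read it back through cURL's FILE protocol.
$path = tempnam(sys_get_temp_dir(), 'curl');
file_put_contents($path, "hello from cURL\n");
$result = fetch_url('file://' . $path);
unlink($path);
```

After the demo runs, $result holds the contents of the temporary file, or null if the transfer failed.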

Using cURL from the command line is easy, and so is using it from PHP. Here is an example that uses the PHP cURL functions to fetch data from the PowerOptions website and process its output. The data I’m going to harvest is the company name and symbol.

Here is the function I’ve used:

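public function scrape() {
    $ch = curl_init();

    // Identify as a desktop browser.
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.10) Gecko/2009042523 Ubuntu/9.04 (jaunty) Firefox/3.0.10");
    curl_setopt($ch, CURLOPT_FAILONERROR, true);     // treat HTTP status codes >= 400 as failures
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);     // set the Referer header when redirected
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 60);    // seconds to wait for the connection
    curl_setopt($ch, CURLOPT_VERBOSE, true);         // write debug output to stderr
    curl_setopt($ch, CURLOPT_URL, "http://www.poweropt.com/optionable.asp?fl=A");
    curl_setopt($ch, CURLOPT_TIMEOUT, 120);          // seconds allowed for the whole transfer
    curl_setopt($ch, CURLOPT_PROXY, "192.168.20.206:3128"); // my local proxy; remove if you have none
    $html = curl_exec($ch);

    // curl_exec() returns false on failure; stop here instead of parsing nothing.
    if ($html === false) {
        echo "cURL error number: " . curl_errno($ch) . "\n";
        echo "cURL error: " . curl_error($ch) . "\n";
        curl_close($ch);
        return;
    }

    curl_close($ch);

    $dom = new DOMDocument();
    @$dom->loadHTML($html); // @ hides the warnings that real-world HTML triggers
    $xpath = new DOMXPath($dom);
    $nodeLists = $xpath->evaluate("//*[contains(text(), 'Company Name')]/ancestor::table[1]//tr");

    // Assumes $this->db is a PDO connection opened elsewhere in this class
    // (a hypothetical property). The prepared statement keeps quotes in
    // company names from breaking the query.
    $insert = $this->db->prepare("INSERT INTO company (company_name, symbol) VALUES (?, ?)");

    // Start at 1 to skip the first row, which holds the column headings.
    for ($i = 1; $i < $nodeLists->length; $i++) {
        $node = $nodeLists->item($i)->firstChild;    // first cell: company name
        if ($node === null || $node->firstChild === null) {
            continue;
        }
        $name = $node->firstChild->nodeValue;
        $node = $node->nextSibling;
        $node = $node ? $node->nextSibling : null;   // third child of the row: symbol
        if ($node === null) {
            continue;
        }
        $symbol = $node->nodeValue;
        //echo $name . " | " . $symbol . "\n";
        //echo '<span style="font-family: Tahoma; color: black;">' . $name . "|" . $symbol . '</span>';
        // Saving the data automatically after scraping.
        $insert->execute([$name, $symbol]);
    }
}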

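In the above example, curl_exec($ch) runs the transfer. Because CURLOPT_RETURNTRANSFER is set, it returns the data sent back by the remote server as a string, which is stored in the variable $html; on failure it returns false.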
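The difference between a failed transfer and an empty response is easy to miss, so here is a small sketch of how the return value can be checked. The helper name fetch_or_error() is hypothetical, and the demo deliberately asks for a file that does not exist so that it fails without touching the network.

```php
<?php
// Hypothetical helper (the name is an assumption): returns [body, error],
// where exactly one of the two entries is null.
function fetch_or_error(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);

    // With CURLOPT_RETURNTRANSFER set, only false means failure.
    // An empty string is a valid (empty) response, so compare with ===.
    $error = ($body === false)
        ? curl_errno($ch) . ': ' . curl_error($ch)
        : null;
    curl_close($ch); // has no effect since PHP 8.0

    return [$body === false ? null : $body, $error];
}

// Demo: request a file that does not exist, so the transfer fails.
$missing = 'file://' . sys_get_temp_dir() . '/no-such-file-' . uniqid();
[$body, $error] = fetch_or_error($missing);
```

Here $body ends up null and $error holds the cURL error number and message. Without CURLOPT_RETURNTRANSFER, curl_exec() would print the response directly and return true instead of the data.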

For a more advanced technique, I used PHP’s DOMDocument class to extract the harvested data instead of the preg_match_all() function.
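To show the DOM approach on its own, here is a small self-contained sketch that runs on a made-up HTML table instead of a live page. The company names and symbols are invented sample data. Note that it picks out the cells with a relative XPath query, which is a slightly different technique from the firstChild/nextSibling walk in the function above.

```php
<?php
// Made-up sample markup standing in for a downloaded page.
$html = <<<HTML
<table>
  <tr><th>Company Name</th><th>Symbol</th></tr>
  <tr><td>Example Corp</td><td>EXC</td></tr>
  <tr><td>Sample Industries</td><td>SMPL</td></tr>
</table>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true);  // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$companies = [];

// Select only rows that contain <td> cells, which skips the header row.
foreach ($xpath->query('//table//tr[td]') as $row) {
    $cells = $xpath->query('td', $row);  // cells relative to this row
    $companies[] = [
        'name'   => trim($cells->item(0)->textContent),
        'symbol' => trim($cells->item(1)->textContent),
    ];
}
```

Each entry in $companies is an array with a name and a symbol. Compared with a regular expression, this keeps working when attributes or whitespace in the markup change, because it follows the table structure and not the exact text of the tags.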

Source:  php.net

Comments (4)

steffan January 30th, 2010 at 3:28 AM    

Hello, could you tutor me? How much would it cost?

steffan January 30th, 2010 at 3:29 AM    

How can I contact you?

admin January 30th, 2010 at 3:38 AM    

Hi Steffan, thanks for your interest, but I’m not offering tutoring. You can learn programming on the internet; there are lots of sources. ^^

Science and Religion March 27th, 2010 at 7:29 PM    

I never would have believed I would need to know this. Thank goodness for the web.
