PHP for Web scraping and bot development

Web scraping is a technique for extracting information and data from websites. Collecting and analysing such information is a common topic in data mining research. In practice, web scraping is necessary when you want to build a web application that shows customised information drawn from various websites: you first scrape the data from the sites and then apply some logic to filter it.
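To make the "scrape first, then filter" idea concrete, here is a minimal sketch in PHP using the built-in cURL and DOM extensions. It is not the code from the book's library; the function names fetch_page and extract_links and the bot name in the user agent are my own assumptions for illustration.

```php
<?php
// Download a page with PHP's cURL extension.
// Returns the response body, or null if the request failed.
function fetch_page(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,   // follow HTTP redirects
        CURLOPT_TIMEOUT        => 15,     // give up after 15 seconds
        CURLOPT_USERAGENT      => 'ExampleNewsBot/0.1', // hypothetical bot name
    ]);
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}

// Filter step: keep only links that have both visible text and a target.
// Returns a list of ['title' => ..., 'url' => ...] arrays.
function extract_links(string $html): array
{
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        $text = trim($a->textContent);
        if ($href !== '' && $text !== '') {
            $links[] = ['title' => $text, 'url' => $href];
        }
    }
    return $links;
}
```

You would typically call fetch_page() on a news front page and pass the result to extract_links(). A real scraper would narrow the filter further, for example to links inside a particular section of the page, and should respect each site's terms of use and robots.txt.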

You can write the program that automatically searches for and collects the information in many different languages. But if you are a PHP expert and want to use PHP for this kind of work, here I am recommending a book that comes with a PHP library. I found the book very helpful for learning the topic, and its library is easy to use.

Check out the following book:

 Webbots, Spiders, and Screen Scrapers: A Guide to Developing Internet Agents with PHP/CURL


Check out the library:

Download Code

Here is a simple example of the web scraping idea. Suppose you want to develop a web application that shows current news from different newspapers, classified by topic, so that people who are interested in a particular kind of news can find links to all the newspapers' stories under that category.

Figure: Web scraping for news classification

As the diagram shows, the application retrieves the news from different websites, then classifies the information into categories based on interest, such as politics, sports, and entertainment. The book and library mentioned above will help you build this kind of application.
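The classification step can be sketched very simply with keyword matching. This is only one possible approach and not the method from the book; the keyword lists, the function names classify_headline and group_by_category, and the 'title'/'url' item shape are all assumptions I made up for the example.

```php
<?php
// Hypothetical keyword lists; a real application would need much richer ones.
const CATEGORY_KEYWORDS = [
    'politics'      => ['election', 'parliament', 'minister'],
    'sports'        => ['match', 'cricket', 'football'],
    'entertainment' => ['film', 'movie', 'music'],
];

// Pick the category whose keywords appear most often in the headline.
// Headlines that match no keyword fall into 'other'.
function classify_headline(string $title): string
{
    // Lowercase the title and split it into plain words.
    $words = preg_split('/[^a-z0-9]+/', strtolower($title), -1, PREG_SPLIT_NO_EMPTY);

    $best = 'other';
    $bestScore = 0;
    foreach (CATEGORY_KEYWORDS as $category => $keywords) {
        $score = count(array_intersect($words, $keywords));
        if ($score > $bestScore) {
            $best = $category;
            $bestScore = $score;
        }
    }
    return $best;
}

// Group scraped items (each with a 'title' and a 'url') by category.
function group_by_category(array $items): array
{
    $grouped = [];
    foreach ($items as $item) {
        $grouped[classify_headline($item['title'])][] = $item;
    }
    return $grouped;
}
```

For example, group_by_category() on a list of scraped headlines returns an array keyed by category name, which the web application can render as one section per category. Keyword matching is crude, so treat it as a starting point rather than a finished classifier.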

You can also extend this idea to develop an iPhone or Android news application. In that case your scraping script should be installed on a server, where it automatically collects and classifies the information, and the smartphone application simply retrieves the data from your server.
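One common way for the phone application to retrieve the data is a small JSON endpoint on the server. The sketch below assumes the classified news is already available as an array keyed by category; the function name build_feed_json and the field names generated_at and categories are hypothetical.

```php
<?php
// Turn the classified news into a JSON document for the mobile app.
// $grouped maps a category name to a list of ['title' => ..., 'url' => ...] items.
// $generatedAt is a Unix timestamp; it defaults to the current time.
function build_feed_json(array $grouped, ?int $generatedAt = null): string
{
    $payload = [
        'generated_at' => gmdate('c', $generatedAt ?? time()),
        // Cast to object so an empty result is encoded as {} rather than [].
        'categories'   => (object) $grouped,
    ];
    return json_encode($payload, JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES);
}
```

In the endpoint script you would send a Content-Type: application/json header and echo the result of build_feed_json(). The scraping and classification can then run on a schedule (for example from cron) and store their results, so the endpoint only has to read and serve them.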

About mahmud ahsan

Founder And Lead Programmer at iThinkdiff.net


2 Responses to PHP for Web scraping and bot development

  1. graham July 11, 2013 at 10:48 am #

    you can also check PHP web scraping ebook from here:

    https://leanpub.com/web-scraping

  2. Jasmine May 9, 2014 at 7:58 am #

    Thanks for sharing your thoughts on bot. Regards