RSS crawler for different languages and categories fetched from database

  • Status Closed
  • Budget €30 - €250 EUR
  • Total Bids 16

Project Description

I have a MySQL database that needs to be fed with news from RSS feeds.

The RSS links are stored in the database. Each link belongs to a category, and each category belongs to a country, for example:

-USA

-- Politics

--- [url removed, login to view]

---- [url removed, login to view]

---- [url removed, login to view]

---- [url removed, login to view]

--- [url removed, login to view]

---- [url removed, login to view]

---- [url removed, login to view]

---- [url removed, login to view]

etc...

We need a clean, clear, pure PHP script for PHP 7 that reads the RSS feeds and inserts their items into the article table. I will provide the database script, which contains the following tables:

===Database news

== Table structure for table article

|------
|Column|Type|Null|Default
|------
|//**article_id**//|int(255)|No|
|category_id|int(11)|No|
|language_id|int(11)|No|
|rss_id|int(100)|No|
|article_title|tinytext|No|
|article_date_time|datetime|No|
|article_details|text|No|
|article_image|varchar(255)|No|
|article_video|varchar(255)|No|
|article_page_name|varchar(255)|Yes|NULL
|article_importance_level|int(10)|No|1

== Table structure for table category

|------
|Column|Type|Null|Default
|------
|//**category_id**//|int(10)|No|
|category_name|varchar(255)|No|
|language_id|int(4)|No|
|category_color|varchar(10)|No|
|category_order|int(3)|No|1

== Table structure for table language

|------
|Column|Type|Null|Default
|------
|//**language_id**//|int(4)|No|
|language_name|varchar(100)|No|
|language_flag|varchar(255)|No|
|language_country_name|varchar(255)|No|
|language_order|int(11)|Yes|NULL

== Table structure for table rss

|------
|Column|Type|Null|Default
|------
|//**rss_id**//|int(100)|No|
|rss_link|tinytext|No|
|category_id|int(10)|No|
|language_id|int(4)|No|
|rss_frequency_minutes|int(4)|No|
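
To make the target of the crawl concrete, here is a minimal sketch of writing one row into the article table with a PDO prepared statement. The function name insertArticle is hypothetical, the sketch assumes article_id is generated by the database (the table export does not say so), and an in-memory SQLite table stands in for MySQL only so the example runs without a server.

```php
<?php
/**
 * Insert one article row; every value travels as a bound parameter.
 * article_importance_level (default 1) and article_page_name (nullable)
 * are left to the table definition.
 * Assumption: $article holds exactly the eight keys named in the query.
 */
function insertArticle(PDO $db, array $article): int
{
    $sql = 'INSERT INTO article
                (category_id, language_id, rss_id, article_title,
                 article_date_time, article_details, article_image, article_video)
            VALUES
                (:category_id, :language_id, :rss_id, :article_title,
                 :article_date_time, :article_details, :article_image, :article_video)';

    $statement = $db->prepare($sql);
    $statement->execute($article);

    return (int) $db->lastInsertId();
}

// Stand-in database (assumption): SQLite in memory instead of the real MySQL schema.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE article (
    article_id INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL,
    language_id INTEGER NOT NULL,
    rss_id INTEGER NOT NULL,
    article_title TEXT NOT NULL,
    article_date_time TEXT NOT NULL,
    article_details TEXT NOT NULL,
    article_image TEXT NOT NULL,
    article_video TEXT NOT NULL,
    article_page_name TEXT NULL,
    article_importance_level INTEGER NOT NULL DEFAULT 1
)');

// Hypothetical sample values.
$articleId = insertArticle($db, [
    'category_id'       => 10,
    'language_id'       => 1,
    'rss_id'            => 100,
    'article_title'     => 'Sample headline',
    'article_date_time' => '2017-03-06 10:00:00',
    'article_details'   => 'Short summary of the story.',
    'article_image'     => 'https://example.com/photo.jpg',
    'article_video'     => '',
]);

echo "Inserted article {$articleId}\n";
```

Against the real database only the PDO connection string would change; the query itself uses nothing specific to SQLite.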

Once we agree, I will send you the MySQL script to create the database on your PC.

The script should also capture the video links, the image links, the article title, the description, and so on (essentially every column in the "article" table should be filled by your script).
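
As an illustration of the extraction described above, the following sketch maps one RSS item to the article columns using PHP's SimpleXML extension. The sample feed, the function name itemToArticle, and the choice to read images and videos from enclosure and Media RSS media:content elements are assumptions; real feeds vary and may need more cases.

```php
<?php
// Hypothetical sample feed; a real crawler would download the XML from rss_link.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Sample headline</title>
      <link>https://example.com/news/sample-headline</link>
      <description>&lt;p&gt;Short summary of the story.&lt;/p&gt;</description>
      <pubDate>Mon, 06 Mar 2017 10:00:00 +0000</pubDate>
      <enclosure url="https://example.com/clip.mp4" type="video/mp4" length="0"/>
      <media:content url="https://example.com/photo.jpg" medium="image"/>
    </item>
  </channel>
</rss>
XML;

/**
 * Map one RSS <item> to an array keyed by article column names.
 */
function itemToArticle(SimpleXMLElement $item): array
{
    $image = '';
    $video = '';

    // Standard RSS 2.0 enclosures carry a MIME type we can sort on.
    foreach ($item->enclosure as $enclosure) {
        $type = (string) $enclosure['type'];
        $url  = (string) $enclosure['url'];
        if ($image === '' && strpos($type, 'image/') === 0) {
            $image = $url;
        } elseif ($video === '' && strpos($type, 'video/') === 0) {
            $video = $url;
        }
    }

    // Media RSS elements live in their own namespace.
    $media = $item->children('http://search.yahoo.com/mrss/');
    foreach ($media->content as $content) {
        $attrs  = $content->attributes();
        $url    = (string) $attrs['url'];
        $medium = (string) $attrs['medium'];
        $type   = (string) $attrs['type'];
        if ($image === '' && ($medium === 'image' || strpos($type, 'image/') === 0)) {
            $image = $url;
        } elseif ($video === '' && ($medium === 'video' || strpos($type, 'video/') === 0)) {
            $video = $url;
        }
    }

    // Assumption: fall back to the crawl time when pubDate is missing or unparsable.
    $timestamp = strtotime((string) $item->pubDate);
    if ($timestamp === false) {
        $timestamp = time();
    }

    return [
        'article_title'     => trim((string) $item->title),
        'article_date_time' => gmdate('Y-m-d H:i:s', $timestamp), // stored as UTC
        'article_details'   => trim(strip_tags((string) $item->description)),
        'article_image'     => $image,
        'article_video'     => $video,
    ];
}

$feed = simplexml_load_string($xml);
foreach ($feed->channel->item as $item) {
    print_r(itemToArticle($item));
}
```

The category_id, language_id and rss_id columns do not come from the feed itself; they would be copied from the rss row that supplied the link.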

1- The first script will be called from a cron job. It will loop over the languages/countries (country and language are the same to us), call the second script for each one, and pass it the language_id as a parameter.

2- The second script will be called by the first script. It will receive the language_id as a parameter, fetch the relevant categories from the database and, for each category, fetch the relevant RSS links and send each RSS link to a third script as a request parameter.

3- The third script will take the RSS link (it should accept all kinds of links, such as .xml and .rss), crawl the data, filter it, and insert it into article (that is the objective here).
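
The three steps above can be sketched roughly as follows. This is a simplification under stated assumptions: the steps are shown as functions in one file rather than as three separate scripts, the names crawlAllLanguages and crawlLanguage are hypothetical, step 3 is passed in as a callable, and an in-memory SQLite database with trimmed-down tables stands in for MySQL so the example runs on its own.

```php
<?php
/**
 * Step 1: loop over every language and hand its id to step 2.
 * Returns the number of feeds dispatched.
 */
function crawlAllLanguages(PDO $db, callable $crawlFeed): int
{
    $count = 0;
    $languages = $db->query('SELECT language_id FROM language ORDER BY language_order');
    foreach ($languages->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $count += crawlLanguage($db, (int) $row['language_id'], $crawlFeed);
    }
    return $count;
}

/**
 * Step 2: fetch the categories of one language, then the feeds of each
 * category, and pass every feed row to step 3.
 */
function crawlLanguage(PDO $db, int $languageId, callable $crawlFeed): int
{
    $categories = $db->prepare(
        'SELECT category_id FROM category WHERE language_id = ? ORDER BY category_order'
    );
    $feeds = $db->prepare(
        'SELECT rss_id, rss_link, category_id, language_id FROM rss WHERE category_id = ?'
    );

    $count = 0;
    $categories->execute([$languageId]);
    foreach ($categories->fetchAll(PDO::FETCH_ASSOC) as $category) {
        $feeds->execute([$category['category_id']]);
        foreach ($feeds->fetchAll(PDO::FETCH_ASSOC) as $feed) {
            $crawlFeed($feed); // step 3: download, filter, insert
            $count++;
        }
    }
    return $count;
}

// Stand-in database (assumption): only the columns the sketch needs.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE language (language_id INTEGER PRIMARY KEY, language_name TEXT, language_order INTEGER)');
$db->exec('CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT, language_id INTEGER, category_order INTEGER)');
$db->exec('CREATE TABLE rss (rss_id INTEGER PRIMARY KEY, rss_link TEXT, category_id INTEGER, language_id INTEGER)');

// Hypothetical sample rows.
$db->exec("INSERT INTO language VALUES (1, 'English', 1)");
$db->exec("INSERT INTO category VALUES (10, 'Politics', 1, 1)");
$db->exec("INSERT INTO category VALUES (11, 'Sports', 1, 2)");
$db->exec("INSERT INTO rss VALUES (100, 'https://example.com/politics.xml', 10, 1)");
$db->exec("INSERT INTO rss VALUES (101, 'https://example.com/sports.rss', 11, 1)");

// Placeholder for step 3: it only records which links it was given.
$seen = [];
$total = crawlAllLanguages($db, function (array $feed) use (&$seen) {
    $seen[] = $feed['rss_link'];
});

echo $total . " feeds dispatched\n";
```

Preparing the two statements once per language and re-executing them per category keeps the number of query compilations low, which fits the speed requirement below.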

The scripts must be optimized for speed, the code should be commented for easy editing later on, and functions should not be nested too deeply, so the flow of the task stays easy to follow. No frameworks should be used, just simple PHP classes; no complex OOP is required, and MVC is not needed.

The project should be completed within a couple of days of the moment we agree.

I would like to see a sample of the code via a Hangout with the candidate.

Thank you.
