Write 2 Python web crawlers using the Scrapy framework to read Wikipedia data

$30-250 SGD

Completed
Posted about 8 years ago

Paid on delivery
Implement two web crawlers in Python using the Scrapy (1.0.5) framework.

1) Get the full list of countries and territories from here [login to view URL] and write
   a. Country / territory name
   b. Wikipedia URL
   c. Status (Membership)
   d. Dispute status
   e. Further information
   f. Polling date
   into a MariaDB-based database. DB schema: id (auto-increment), createdate (timestamp), all other fields are type text.

2) Get the list of URLs in 1b) and crawl each country's page to extract the following information:
   a. Abstract
   b. VCard data:
      i. Name
      ii. URL for flag
      iii. URL for emblem
      iv. Motto
      v. Anthem
      vi. URL to location on globe
      vii. URL to map
      viii. Capital(s)
         1. Name
         2. URL
      ix. Official language(s)
         1. Name
         2. URL
      x. Religion(s)
         1. Name
         2. URL
      xi. Demonym(s)
         1. Name
         2. URL
      xii. Government
         1. Name
         2. URL
      xiii. Establishment(s)
         1. Name
         2. Date
      xiv. Area
         1. Total km2
         2. Water km2
      xv. Population
         1. Total estimate
         2. Date of counting / estimate
      xvi. GDP
         1. Total
         2. Per capita
      xvii. HDI index
         1. Total
         2. Rank
      xviii. Currency
         1. Name
         2. 3 letter code
      xix. TimeZone(s)
         1. Name
         2. Deviation from GMT
         3. URL to timezone
      xx. Driving on left or right?
      xxi. Calling code(s)
      xxii. ISO code
      xxiii. Internet TLD(s)
   c. Date of polling
   Write the data above into a MariaDB-based database. DB schema: id (auto-increment), createdate (timestamp), all other fields are type text.

In case of multiple entries (e.g. languages), write a comma-separated list in the DB text field. Make sure the original text is comma free.
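The comma rule at the end of the specification (multiple entries joined by commas, with the original text itself comma free) could be handled by a small helper along these lines. This is only a sketch: the function names `clean_value` and `join_values` are hypothetical, and replacing embedded commas with a space is an assumption, since the posting does not say what should take their place.

```python
import re


def clean_value(text):
    """Remove commas from one scraped value and collapse runs of whitespace.

    Assumption: an embedded comma is replaced by a single space.
    """
    without_commas = text.replace(",", " ")
    return re.sub(r"\s+", " ", without_commas).strip()


def join_values(values):
    """Join several scraped values into one comma-separated text field.

    Values that are empty after cleaning are skipped, so the result never
    contains two separators in a row.
    """
    cleaned = [clean_value(v) for v in values]
    return ",".join(v for v in cleaned if v)


if __name__ == "__main__":
    # A country with several capitals, one of which contains a comma.
    print(join_values(["Pretoria (executive)", "Cape Town, Western Cape", "  "]))
```

Because every value is cleaned before joining, splitting the stored field on "," gives back exactly one entry per original value.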
Requirements:
- Running on Ubuntu 14.04 LTS (x64) standard installation (Scrapy 1.0.5 installation [login to view URL])
- MariaDB 5.5
- Needs to run failsafe with correct results for all countries, especially for countries with several entries for capital, time zones, languages, etc.

Copyright:
- All code belongs to employer

Delivery:
- 2 web crawlers with pipeline into MariaDB
- Code with comments / documentation to be maintained
- After uploading the full code, I will run it on my system and proof-read code before payment
- No milestones, payment in full after successful test
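The "pipeline into MariaDB" deliverable might look roughly like the sketch below for the first crawler. Scrapy item pipelines are plain classes exposing `process_item(item, spider)`, with optional `open_spider` and `close_spider` hooks, so the class needs no Scrapy import. Several things here are assumptions and not part of the posting: the table name `countries`, the column names in `FIELDS`, and the use of the standard-library `sqlite3` module as a stand-in so the example runs without a database server. Against MariaDB one would pass a different `connect` callable and placeholder (for example `%s` with a MySQL-compatible DB-API driver) and declare the id column as `INT AUTO_INCREMENT PRIMARY KEY`.

```python
import sqlite3

# Hypothetical column names; the posting only gives the field labels.
FIELDS = ("name", "wikipedia_url", "status", "dispute_status",
          "further_information", "polling_date")

# sqlite3 dialect, used only so the sketch runs without a server.
# Schema as requested: id (auto-increment), createdate (timestamp), rest TEXT.
CREATE_SQL = (
    "CREATE TABLE IF NOT EXISTS countries ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "createdate TIMESTAMP DEFAULT CURRENT_TIMESTAMP, "
    + ", ".join("{} TEXT".format(f) for f in FIELDS) + ")"
)


class CountryPipeline(object):
    """Writes each scraped item as one row of the countries table."""

    def __init__(self, connect=None, placeholder="?"):
        # connect: zero-argument callable returning a DB-API connection.
        self.connect = connect or (lambda: sqlite3.connect(":memory:"))
        self.placeholder = placeholder
        self.conn = None

    def open_spider(self, spider):
        # Called once when the spider starts: connect and ensure the table exists.
        self.conn = self.connect()
        cur = self.conn.cursor()
        cur.execute(CREATE_SQL)
        self.conn.commit()

    def process_item(self, item, spider):
        # Missing fields are stored as empty text; values are passed as
        # query parameters, never formatted into the SQL string.
        row = [item.get(f, "") for f in FIELDS]
        sql = "INSERT INTO countries ({}) VALUES ({})".format(
            ", ".join(FIELDS), ", ".join([self.placeholder] * len(FIELDS)))
        cur = self.conn.cursor()
        cur.execute(sql, row)
        self.conn.commit()
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes.
        self.conn.close()
```

In a Scrapy project the class would be enabled through the `ITEM_PIPELINES` setting. The second crawler could reuse the same pattern with a longer field list.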
Project ID: 10207457

About the project

13 proposals
Remote project
Active 8 yrs ago

Awarded to:
Hi, I can do this job for you. Message me if you want me to get started on it; I'll do it right away.
$83 SGD in 4 days
4.9 (4 reviews)
2.7
13 freelancers are bidding on average $284 SGD for this job
Hello Sir, We've done a number of web scraping projects for our clients. We have scraped many directory websites, including Yellow Pages and Yelp, and e-commerce websites including Amazon, Walmart and many more. We can deliver the data very quickly. We use proxies with IP rotation to avoid being detected as bots. We use Python with wget, Scrapy, urllib and other tools to fetch web pages, and parsers like HtmlXPathSelector, regular expressions, etc. to extract information from the HTML. We have the right skill set to do this job effectively and on time, and would like to discuss this opportunity further. Looking forward to hearing from you. Thanks, Shiv Agrawal SuiGen Solutions
$421 SGD in 3 days
4.8 (88 reviews)
6.6
Hi there. I would be glad to help you out with this project. I am a professional data scraper, with experience creating easy to use scripts to extract data from the web. I can guarantee you an excellent job and deliver asap. However, I do not work without milestone creation, I only ask for them to be created but no payment in advance. Once the job is completed you can release them. Thanks, Daniel
$336 SGD in 3 days
4.7 (94 reviews)
6.9
Hi, I have read the description and would like to discuss. I have good web scraping experience and reviews, and can develop web scraping scripts in Python and C#. Hope we can discuss details.
$200 SGD in 2 days
5.0 (137 reviews)
6.2
Hello! I'm a web scraping expert and I can do your project in 3 days. I use the Python language and the Scrapy framework. My scripts work on Windows, Mac or Linux, but Linux is preferable. I can schedule scripts on a server if required. I have more than 200 finished projects (Google scraping, Facebook scraping, Yellow Pages, LinkedIn, Amazon, webshops and other sites with lists of any items). I can export data into JSON, XML, CSV (Excel), or any database (MySQL, MongoDB, MSSQL, etc.). Message me if you have any questions!
$299 SGD in 5 days
4.8 (110 reviews)
6.5
Hi, I am an expert with Python/Scrapy and have completed many Scrapy projects here. Your project looks OK to me at first glance; please contact me to discuss the requirements in more detail. Thanks
$222 SGD in 5 days
5.0 (25 reviews)
5.2
Hi, I'm a frequent user of Scrapy and I've already written solutions that integrate with MySQL. I have a few questions regarding your project: 1- Are you looking to run the script periodically? If so, it should be aware that the data may already exist, to avoid duplication or unique constraint violations. 2- If you plan to run it periodically, will you do it manually or do you need an additional solution to schedule the spiders? 3- Is the installation of the required software part of the project, or will you be doing that yourself? Thanks.
$200 SGD in 10 days
5.0 (19 reviews)
4.6
Hey there! We're 2 developers with broad knowledge of Python and scripting, specializing in web scraping. We'll gladly do your project, as it seems like something we can pull off with a script. You can look in our profile for previous web scraping projects we've done. Contact us for further details.
$277 SGD in 3 days
5.0 (7 reviews)
4.0
I have 7+ years of work experience in data collection, bulk email campaigns, Excel VBA and Internet research at IT companies here. I can create crawlers and scrape data from directories and Yellow Pages using C++, Python and Perl, as per your requirements, into Excel with multiple IP rotations. I have dealt successfully with presidents, directors and managers of US, UK and Australian companies on web design and development projects, and I have good communication and writing skills. I am well versed in the Internet, MS Office applications and phone etiquette, along with the latest technologies. I can accept your payment terms.
$188 SGD in 2 days
3.9 (6 reviews)
4.4
I am a seasoned python programmer with past experience using scrapy. I can write this software for you and even test it on Ubuntu 14.04 LTS before you receive the code. I frequently comment my code and try to write code that is simple and easy to understand. I require no milestone until you are satisfied with the work. Please let me know if you have any questions. Regards, Daniel
$277 SGD in 15 days
0.0 (0 reviews)
0.0

About the client

Flag of SINGAPORE
Singapore, Singapore
4.9
2
Payment method verified
Member since Apr 10, 2016

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)