web data extraction/web scraping

$10-30 USD

Closed
Posted about 10 years ago

Paid on delivery
Hi -- I need to extract some information from a few different websites and put it into a 'nice', easy-to-read format for myself. It's a relatively easy job, and provided it is done well, I'll have a number of additional sites for you to scrape.

I have some sample code that I can send you, from a previously completed project, which is a sample of the data extraction. It works for the first two sites I will be asking you to do; however, I am going to be asking for a bit more information to be extracted as well.

Criteria:
----------
1. Please only bid if you are American, Canadian, European or Filipino. All other bids will most likely be ignored (simply a matter of coding/quality issue).
2. It should be done in PHP/MySQL. You may need to use cURL (as some sites will have basic password/login authentication).
3. Preferably a good English speaking/reading/writing level.
4. If you have some code samples, that would help. (I am looking for someone who implements 'good' coding standards.)
5. This bid is for 'two' scrapes/websites (in my existing code). However, should you do a good job, I'll be extending this to another 10-15 sites. Some sites will have data scraped from 'WordPress' type websites, while others will simply be 'directory' style websites.

I'm estimating probably 3-5 hours to complete this (especially because I have some sample code I'm sending you, and it mainly requires tweaking/making it look good/etc.).

Other technical details:
Ideally someone will have experience in the following. (I had a previous fellow working on it, but he cancelled due to other commitments.)
--------------------------------------
- PHP version 5.4 or newer
- Framework: Yii
- Scraping library: Goutte
- Database: MySQL
--------------------------------------
I have existing code that you can work off of if you wish.

Actual project:
------------------
1. Please see the attached MS Word document for "complete" details, but basically you will be scraping data from websites via PHP. I'll start off with one site, and provided you do a good job, this will most likely be a job of about 10-15 sites, and maybe more.
2. You'll go to the webpage, download all applicable pages, and scrape the data. You will then 'reformat' this data and insert it into a MySQL table. While it is "extracting", it would be nice to have some kind of counter (e.g., processing page 1/50) as it works, as well as making sure the script doesn't time out (i.e., it's possible some scripts may take, say, 5-10 minutes to process).
3. I'd like a separate link included (PHP) that simply does a 'database' dump in HTML format.
4. In future (separate job from this), there will most likely be a 'maintenance' job. This would of course be arranged in a separate project; probably 1-2x per month I'd want you just to go through the code to ensure everything is working a-ok.
5. Bonus - if you know how to use online PDF-to-text pages (and/or can do that via cURL/etc.), that is a bonus. I'll have a separate project for you for that.

Thanks!
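
For illustration only, the workflow in items 2 and 3 of the project description (fetch each page, extract fields, insert rows, report progress, dump the table as HTML) could be sketched roughly as below. This is not the client's existing code. It is written in Python with the standard library rather than the PHP/Yii/Goutte stack the job calls for. SQLite stands in for MySQL, and the `listings` table, the `<h2>` selector, and the example.com URLs are all hypothetical.

```python
import sqlite3
from html import escape
from html.parser import HTMLParser


class HeadingParser(HTMLParser):
    """Collects the text of every <h2> element (a hypothetical 'listing name')."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
            self.names.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_h2:
            self.names[-1] += data


def scrape(urls, fetch, db, log=print):
    """Fetch each URL, extract <h2> texts, store them, and log progress per page."""
    db.execute("CREATE TABLE IF NOT EXISTS listings (url TEXT, name TEXT)")
    for i, url in enumerate(urls, start=1):
        log(f"processing page {i}/{len(urls)}")
        parser = HeadingParser()
        parser.feed(fetch(url))
        parser.close()
        rows = [(url, n.strip()) for n in parser.names if n.strip()]
        db.executemany("INSERT INTO listings (url, name) VALUES (?, ?)", rows)
        db.commit()  # commit per page so an interrupted run keeps earlier pages


def dump_html(db):
    """Return the whole table as a plain HTML table (the 'database dump' view)."""
    rows = db.execute("SELECT url, name FROM listings ORDER BY rowid").fetchall()
    body = "".join(
        f"<tr><td>{escape(u)}</td><td>{escape(n)}</td></tr>" for u, n in rows
    )
    return f"<table><tr><th>url</th><th>name</th></tr>{body}</table>"


# Usage with canned pages in place of real HTTP requests (hypothetical data).
pages = {
    "http://example.com/1": "<h2>Alpha</h2><p>ignored</p>",
    "http://example.com/2": "<h2>Beta &amp; Co</h2>",
}
db = sqlite3.connect(":memory:")
messages = []
scrape(list(pages), pages.get, db, log=messages.append)
html_dump = dump_html(db)
```

Passing `fetch` in as a parameter keeps the extraction logic separate from the HTTP layer, so login handling or retries can be added there without touching the parsing or database code. In the PHP version, a run of 5-10 minutes would likely also need the script's execution time limit raised.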
Project ID: 5712392

About the project

3 proposals
Remote project
Active 10 yrs ago

About the client

Dhaka, Bangladesh
0.0
0
Payment method verified
Member since Aug 16, 2013
