Closed

Craigslist RSS feed scrape

This project was awarded to msbatechie for $250 USD.

Project Budget: $30 - $250 USD
Total Bids: 7
Project Description

I need a regex script that will parse US postal addresses out of a Craigslist RSS feed. Sample data looks like this:

[url removed, login to view]

These fields are not standardized on the site, so addresses could appear in either the Title or the Description field. I know you are up for the challenge! The [url removed, login to view] script needs to return as many addresses as possible in an array of result objects (some posts may not yield a match).
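As a rough illustration of the address requirement, here is a minimal Python sketch. The language, the names `ADDRESS_RE` and `find_addresses`, and the list of street suffixes are all assumptions made for this example and are not part of the posting. The pattern is a heuristic: it looks for a house number, a few street-name words, and a common suffix, optionally followed by ", City, ST 12345". It will miss unusual addresses and can over-match ordinary phrases that happen to end in a suffix word.

```python
import re

# Assumed, incomplete list of street suffixes; matched case-insensitively.
SUFFIX = r"(?i:street|st|avenue|ave|boulevard|blvd|road|rd|drive|dr|lane|ln|court|ct|way|place|pl)"

ADDRESS_RE = re.compile(
    r"\b\d{1,6}\s+"                # house number
    r"(?:[NSEW]\.?\s+)?"           # optional compass direction
    r"(?:[A-Za-z0-9]+\s+){1,4}?"   # one to four street-name words, as few as possible
    + SUFFIX + r"\b\.?"            # street suffix with optional trailing period
    # optional ", City, ST" with an optional ZIP or ZIP+4
    r"(?:,\s*[A-Z][A-Za-z .]*?,\s*[A-Z]{2}(?:\s+\d{5}(?:-\d{4})?)?)?"
)


def find_addresses(*fields):
    """Return every address-like match found in the given text fields.

    Accepts any number of strings (for example a title and a description);
    None or empty fields are skipped.
    """
    found = []
    for text in fields:
        if text:
            found.extend(match.group(0) for match in ADDRESS_RE.finditer(text))
    return found
```

Passing both the title and the description, as in `find_addresses(title, description)`, covers the case where the address shows up in either field.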

The script also needs to find start and end dates within the text. These also appear in non-standardized formats (Nov 20, 11/20 through 11/22, 11-20-2009, November 20th, Friday 11/20 through Sunday 11/22, etc).
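The date formats listed above could be handled along these lines. This is a sketch in Python under several assumptions that the posting does not state: dates without a year take a caller-supplied `default_year`, two-digit years are read as 20xx, and the first date found is treated as the start and the last as the end. The names `DATE_RE`, `find_dates`, and `start_and_end` are hypothetical.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

DATE_RE = re.compile(
    # Month name (short or full) followed by a day: "Nov 20", "November 20th"
    r"\b(?P<mon>jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|june?|july?"
    r"|aug(?:ust)?|sep(?:t(?:ember)?)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)"
    r"\.?\s+(?P<day>\d{1,2})(?:st|nd|rd|th)?\b"
    # Numeric month/day with optional year: "11/20", "11-20-2009"
    r"|\b(?P<m>\d{1,2})[/-](?P<d>\d{1,2})(?:[/-](?P<y>\d{4}|\d{2}))?\b",
    re.IGNORECASE,
)


def find_dates(text, default_year):
    """Return all recognizable dates in text, in order of appearance."""
    found = []
    for match in DATE_RE.finditer(text):
        if match.group("mon"):
            month = MONTHS[match.group("mon")[:3].lower()]
            day, year = int(match.group("day")), default_year
        else:
            month, day = int(match.group("m")), int(match.group("d"))
            raw_year = match.group("y")
            if raw_year is None:
                year = default_year
            else:
                # Assumption: two-digit years belong to the 2000s.
                year = int(raw_year) + (2000 if len(raw_year) == 2 else 0)
        try:
            found.append(date(year, month, day))
        except ValueError:
            continue  # skip impossible dates such as 13/45
    return found


def start_and_end(text, default_year):
    """Assumption: first date found is the start, last date found is the end."""
    dates = find_dates(text, default_year)
    if not dates:
        return None, None
    return dates[0], dates[-1]
```

A post that mentions only one date gets the same value for start and end. Weekday names such as "Friday" are ignored here, since the numeric date next to them already carries the information.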

Output would be an array of result objects with the following properties (taken from each "item" XML node and its children):

title
description
start date
end date
address
url to the source post (from "rdf:about" attribute on the "item" node)
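One way to assemble the result objects listed above is sketched here in Python with the standard library's `xml.etree.ElementTree`. It assumes the feed is RSS 1.0 (RDF), which is consistent with the "rdf:about" attribute mentioned in the list. The function name `parse_feed` and the two extractor parameters are hypothetical: they stand in for whatever address and date matching is used, and by default they find nothing.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for RSS 1.0 and RDF.
NS = {
    "rss": "http://purl.org/rss/1.0/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]


def parse_feed(xml_text,
               extract_address=lambda text: None,
               extract_dates=lambda text: (None, None)):
    """Turn each <item> node of an RSS 1.0 feed into a result dictionary.

    extract_address(text) should return an address string or None.
    extract_dates(text) should return a (start, end) pair.
    Both are placeholders supplied by the caller.
    """
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("{%s}item" % NS["rss"]):
        title = item.findtext("rss:title", default="", namespaces=NS)
        description = item.findtext("rss:description", default="", namespaces=NS)
        # Search title and description together, since either may hold the data.
        combined = title + "\n" + description
        start, end = extract_dates(combined)
        results.append({
            "title": title,
            "description": description,
            "start_date": start,
            "end_date": end,
            "address": extract_address(combined),
            "url": item.get(RDF_ABOUT),
        })
    return results
```

Items with no recognizable address or dates still appear in the output, with `None` in those properties, so every post in the feed maps to exactly one result object.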
