I need a Regex script that will parse out US postal addresses from a Craigslist RSS feed. Sample data looks like this:
[url removed, login to view]
These fields are not standardized on the site; addresses could appear in either the Title or the Description field. I know you are up for the challenge! The [url removed, login to view] script needs to return as many addresses as possible in an array of result objects (some posts may not yield a match).
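As a rough illustration of the address part, here is a minimal Python sketch. The pattern, the list of street suffixes, and the name find_addresses are all assumptions made for this example, not a finished solution. It only catches addresses shaped like "number, street name, suffix" with an optional "City, ST ZIP" tail, so many free-form posts will still slip through.

```python
import re

# Hypothetical, deliberately short list of street suffixes (an assumption).
STREET_SUFFIXES = (
    r"(?:Street|St|Avenue|Ave|Boulevard|Blvd|Road|Rd|Drive|Dr|"
    r"Lane|Ln|Court|Ct|Way|Place|Pl)"
)

ADDRESS_RE = re.compile(
    # House number, e.g. "123"
    r"\b(?P<number>\d{1,6})\s+"
    # One to four words followed by a known suffix, e.g. "Main St"
    r"(?P<street>(?:[A-Za-z0-9]+\s+){1,4}?" + STREET_SUFFIXES + r")\b\.?"
    # Optional tail: city, two-letter state, optional ZIP or ZIP+4
    r"(?:,?\s+(?P<city>[A-Z][a-z]+(?:\s[A-Z][a-z]+)*),?\s+(?P<state>[A-Z]{2})"
    r"(?:\s+(?P<zip>\d{5}(?:-\d{4})?))?)?"
)


def find_addresses(*texts):
    """Return one dict per address-like match found in any of the given texts."""
    results = []
    for text in texts:
        for match in ADDRESS_RE.finditer(text or ""):
            found = match.groupdict()          # number, street, city, state, zip
            found["raw"] = match.group(0)      # the full matched substring
            results.append(found)
    return results


title = "Garage sale at 123 Main St, Springfield, IL 62704 this Saturday"
description = "Lots of furniture and tools."
addresses = find_addresses(title, description)
```

Passing both the Title and Description text to the same function covers the case where the address appears in either field. Fields that are missing from a match (for example, no ZIP) come back as None.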
The script also needs to find start and end dates within the text. These would also be in a non-standardized format (Nov 20, 11/20 through 11/22, 11-20-2009, November 20th, Friday 11/20 through Sunday 11/22, etc).
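For the dates, a sketch along these lines might be a starting point. It is written in Python and covers only the formats listed above; the names DATE_RE, parse_dates, and find_date_range are made up for this example. Two assumptions are baked in: posts that omit the year get a caller-supplied default year, and two-digit years are treated as 20xx.

```python
import re
from datetime import date

MONTH_NAMES = ["january", "february", "march", "april", "may", "june", "july",
               "august", "september", "october", "november", "december"]
# Map three-letter prefixes ("jan", "feb", ...) to month numbers.
MONTHS = {name[:3]: number for number, name in enumerate(MONTH_NAMES, start=1)}
# Accept either the short or the full month name, e.g. "nov" or "november".
MONTH_ALT = "|".join(
    name[:3] + ("(?:" + name[3:] + ")?" if len(name) > 3 else "")
    for name in MONTH_NAMES
)

DATE_RE = re.compile(
    r"\b(?:"
    # "Nov 20", "November 20th"
    r"(?P<mon_name>" + MONTH_ALT + r")\.?\s+(?P<day_a>\d{1,2})(?:st|nd|rd|th)?"
    # "11/20", "11-20-2009"
    r"|(?P<mon_num>\d{1,2})[/-](?P<day_b>\d{1,2})(?:[/-](?P<year>\d{4}|\d{2}))?"
    r")\b",
    re.IGNORECASE,
)


def parse_dates(text, default_year):
    """Return every recognizable date in text, in order of appearance."""
    found = []
    for match in DATE_RE.finditer(text):
        if match.group("mon_name"):
            month = MONTHS[match.group("mon_name")[:3].lower()]
            day = int(match.group("day_a"))
            year = default_year
        else:
            month = int(match.group("mon_num"))
            day = int(match.group("day_b"))
            year = int(match.group("year")) if match.group("year") else default_year
            if year < 100:           # assumption: "09" means 2009
                year += 2000
        try:
            found.append(date(year, month, day))
        except ValueError:           # e.g. "13/45" is not a real date
            continue
    return found


def find_date_range(text, default_year):
    """Return (start, end) as the earliest and latest dates found, or None."""
    dates = parse_dates(text, default_year)
    return (min(dates), max(dates)) if dates else None


date_range = find_date_range("Friday 11/20 through Sunday 11/22", 2009)
```

Taking the earliest and latest match as start and end is a simplification: a post that mentions an unrelated date would skew the range. Weekday names such as "Friday" are ignored rather than validated against the date.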
Output would be an array of result objects with the following properties (from each "item" xml node and children):
url to the source post (from "rdf:about" attribute on the "item" node)
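To show how the result objects could be built from each "item" node, here is a small Python sketch using the standard library's xml.etree.ElementTree. It assumes the feed is RSS 1.0 (RDF), which is what an "rdf:about" attribute on "item" suggests. The sample feed, the example.org URL, and the function name parse_items are invented for illustration. Only the url property named above is filled in, alongside the raw title and description text that the address and date matching would run over.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RSS 1.0 and RDF (assumes the feed is RSS 1.0 / RDF).
RSS_NS = "http://purl.org/rss/1.0/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = {"rss": RSS_NS, "rdf": RDF_NS}


def parse_items(feed_xml):
    """Return one result dict per <item> node in the feed."""
    root = ET.fromstring(feed_xml)
    results = []
    for item in root.iter("{%s}item" % RSS_NS):
        results.append({
            # URL of the source post, from the rdf:about attribute
            "url": item.get("{%s}about" % RDF_NS),
            # Raw text that address and date matching would be applied to
            "title": item.findtext("rss:title", default="", namespaces=NS),
            "description": item.findtext("rss:description", default="", namespaces=NS),
        })
    return results


# Made-up sample feed for illustration only.
SAMPLE_FEED = """<rdf:RDF xmlns="http://purl.org/rss/1.0/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <item rdf:about="http://example.org/post/1.html">
    <title>Yard sale 123 Main St</title>
    <description>Nov 20 through Nov 22</description>
  </item>
  <item rdf:about="http://example.org/post/2.html">
    <title>Estate sale</title>
  </item>
</rdf:RDF>"""

results = parse_items(SAMPLE_FEED)
```

Items with a missing title or description get an empty string for that field, so downstream matching does not need to check for None.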
additional data source sample:
Obviously we won't be able to scrape every post successfully because of the non-standardized user input, but hopefully 80% or more will be usable.
6 freelancers are bidding on average $175 for this job
Hi, we are quite comfortable working on your project with our excellent team of programmers and designers. Please see the PMB for our company profile and experience. Thanks.
Hi! I have gone through your requirements and I am glad to say that I can accomplish this task. I would like to speak with you on IM. Please give us an opportunity to work with you.
I have done this kind of project before, so please consider me for this one. I have scraped sites using C#. I can show you that work if you wish to see it. Thanks.