Closed

Scrape Data from Websites

This project received 10 bids from talented freelancers with an average bid price of $86 USD.

Employer working
Project Budget: N/A
Total Bids: 10
Project Description

I need XPath selection templates/paths, or C# or VB code, to use in the website scraping software that I use.

Here are a few things I need to do:

1:
Find the link to the contact us page. This page is usually called...
-> Contact Us
-> Contact
-> Get In Touch
So it is almost always some variation of the words above.
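
A minimal sketch of step 1, assuming the link text is the only signal used. It is written in Python with the standard library only, so it is an illustration rather than something that drops into the scraping tool directly; the names `CONTACT_TEXT`, `ContactLinkFinder` and `find_contact_links` are hypothetical. The XPath 1.0 expression in the comment is a roughly equivalent selector, and whether a given tool accepts it as written is an assumption.

```python
import re
from html.parser import HTMLParser

# Roughly equivalent XPath 1.0 selector (assumed, not tested in any specific tool):
#   //a[contains(translate(normalize-space(.),
#                'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'contact')
#       or contains(translate(normalize-space(.),
#                'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'get in touch')]/@href

# Assumed variations of the link text; extend the alternation as needed.
CONTACT_TEXT = re.compile(r"\b(?:contact(?:\s+us)?|get\s+in\s+touch)\b", re.IGNORECASE)


class ContactLinkFinder(HTMLParser):
    """Collects href values of <a> tags whose visible text looks like a contact link."""

    def __init__(self):
        super().__init__()
        self.links = []      # matching hrefs, in document order
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace so "Contact <b>Us</b>" reads as "Contact Us".
            text = " ".join("".join(self._text).split())
            if CONTACT_TEXT.search(text):
                self.links.append(self._href)
            self._href = None


def find_contact_links(html):
    """Return the hrefs of all links whose text matches CONTACT_TEXT."""
    finder = ContactLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.links
```

Matching on the visible text rather than the URL keeps the rule independent of how each site names its contact page file.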

This can easily be done through XPath. The two items below are the ones I am having a bit of trouble with.

2:
Find a phone number on the contact us page. The phone numbers posted typically follow this format...
(XXX) XXX-XXXX
XXX XXX-XXXX
XXX-XXX-XXXX
[url removed, login to view]
And sometimes they are preceded by a 1, like the examples below...
1 (XXX) XXX-XXXX
1 XXX XXX-XXXX
1 XXX-XXX-XXXX
1 [url removed, login to view]
Words like Tel, Telephone, Phone, Contact Number, Local, Toll Free etc. may appear right before the phone number.
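
A possible regex for step 2, shown as a Python sketch. It covers only the formats spelled out above (parenthesised or plain area code, space or hyphen separators, optional leading 1); any other separator would need to be added to the `[\s-]` classes. The names `PHONE`, `find_phones` and `normalize_phone` are hypothetical, and the pattern uses only constructs that should also be available in .NET regex, although that port is untested here.

```python
import re

PHONE = re.compile(
    r"""
    (?<!\d)                        # do not start inside a longer run of digits
    (?:1[\s-]?)?                   # optional leading 1
    (?:\(\d{3}\)\s?|\d{3}[\s-])    # area code: (XXX) or XXX followed by a separator
    \d{3}[\s-]\d{4}                # XXX-XXXX (a space is also accepted)
    (?!\d)                         # do not stop inside a longer run of digits
    """,
    re.VERBOSE,
)


def find_phones(text):
    """Return every phone-like substring of text, as written on the page."""
    return [m.group(0) for m in PHONE.finditer(text)]


def normalize_phone(raw):
    """Reduce a matched number to its 10 digits, dropping a leading 1."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits
```

The label words (Tel, Phone, Toll Free and so on) are not required by the pattern; they could be used afterwards to rank candidates when a page contains several numbers.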

3:
And finally extract an e-mail address. The e-mail address usually has these words in it...
@, {at}, [at], (at), at, dot, ., com, .com (of course in addition to the actual e-mail address).
Words like e-mail, Email, E-mail, email, Email Address, Toll Free etc. may be used right before the email address as well.
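
One way step 3 could be approached, again as a hedged Python sketch: treat `@`, a bracketed "at", or a bare " at " as the separator, and treat `.`, a bracketed "dot", or a bare " dot " as the domain separator, then rebuild the plain address. The names `AT`, `DOT`, `EMAIL` and `find_emails` are hypothetical. Accepting the bare word "at" is an assumption that trades precision for recall, since a phrase like "look at example.com" would also match.

```python
import re

# Separator between the local part and the domain: @, {at}, [at], (at), or " at ".
AT = r"(?:\s*@\s*|\s*[\[({]\s*at\s*[\])}]\s*|\s+at\s+)"
# Separator between domain labels: ".", {dot}, [dot], (dot), or " dot ".
DOT = r"(?:\.|\s*[\[({]\s*dot\s*[\])}]\s*|\s+dot\s+)"

EMAIL = re.compile(
    r"([A-Za-z0-9._%+-]+)"                                      # local part
    + AT
    + r"([A-Za-z0-9-]+(?:" + DOT + r"[A-Za-z0-9-]+)+)",         # domain with 2+ labels
    re.IGNORECASE,
)


def find_emails(text):
    """Return de-obfuscated addresses found in text, in order of appearance."""
    found = []
    for match in EMAIL.finditer(text):
        local, domain = match.group(1), match.group(2)
        # Rewrite every obfuscated "dot" in the domain as a real dot.
        domain = re.sub(DOT, ".", domain, flags=re.IGNORECASE)
        found.append(local + "@" + domain)
    return found
```

Because the domain must contain at least one dot separator, ordinary sentences such as "Contact us at info@example.com" do not produce a false match on "us at info".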

The program I use, Visual Web Ripper, has support for [url removed, login to view], regex and C#, in case you can find a better way of getting the required info.
