[url removed, login to view] crawler

I need a custom crawler that can accept a range of documents from [url removed, login to view]

example: [url removed, login to view] to [url removed, login to view]

and return a CSV file with these fields (the output could also go to MongoDB; I am open to suggestions):

Clinical Study ID

Study Status

Start Date

Start Enroll

End Date

Primary Comp Date

Study Completion Date

# of Sites

Enrollment (Actual #s where available)

List of Countries

Study Design

# of Study ARMs

It can be written in Python or Java, or it can be based on an open-source crawler such as Nutch, Heritrix, the Bixo web mining toolkit, Mechanize for Python, Crawler4j, etc.
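
To make the expected output concrete, here is a minimal Python sketch of the CSV side of the job, not a finished crawler. The column names follow the field list above. Everything else is an assumption for illustration: the document ID format (a prefix plus a zero-padded number), the helper names `id_range` and `write_csv`, and the `fetch_record` callable, which stands in for whatever code downloads and parses one document from the site.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, Optional, TextIO

# Column names taken from the field list in this posting.
FIELDS = [
    "Clinical Study ID", "Study Status", "Start Date", "Start Enroll",
    "End Date", "Primary Comp Date", "Study Completion Date", "# of Sites",
    "Enrollment", "List of Countries", "Study Design", "# of Study ARMs",
]


def id_range(prefix: str, first: int, last: int, width: int = 8) -> Iterator[str]:
    """Yield document IDs from first to last, inclusive.

    The prefix + zero-padded number format is an assumption; adjust it
    to whatever ID scheme the target site actually uses.
    """
    for n in range(first, last + 1):
        yield f"{prefix}{n:0{width}d}"


def write_csv(
    ids: Iterable[str],
    fetch_record: Callable[[str], Optional[Dict[str, str]]],
    out: TextIO,
) -> int:
    """Write one CSV row per document and return the number of rows written.

    fetch_record is a hypothetical callable that returns a dict keyed by
    FIELDS for one document, or None if the document does not exist.
    """
    # Missing fields become empty cells; unknown keys are dropped.
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval="", extrasaction="ignore")
    writer.writeheader()
    written = 0
    for doc_id in ids:
        record = fetch_record(doc_id)
        if record is None:
            continue  # skip IDs with no document behind them
        writer.writerow(record)
        written += 1
    return written


def fake_fetch(doc_id: str) -> Optional[Dict[str, str]]:
    """Stand-in for the real download-and-parse step (made-up data)."""
    return {
        "Clinical Study ID": doc_id,
        "Study Status": "Completed",
        "List of Countries": "United States; Canada",
    }


buffer = io.StringIO()
row_count = write_csv(id_range("ID", 1, 3, width=4), fake_fetch, buffer)
print(buffer.getvalue())
```

Keeping the fetch step behind a single callable means the same writer could sit on top of Mechanize, a Nutch export, or plain HTTP requests, and swapping the CSV writer for a MongoDB insert would only touch `write_csv`.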

Skills: Java, NoSQL Couch & Mongo, Python, Web Scraping, XML


About the Employer:
( 7 reviews ) Raleigh, United States

Project ID: #4174668