Closed

extract data from web database and paste it to excel - Repost - open to bidding

This project received 18 bids from talented freelancers with an average bid price of $4 USD / hour.

Employer working
Project Budget
$2-$8 USD / hour
Total Bids
18
Project Description

I want the terms to be searched in the expert search field on www.clinicaltrials.gov. This expert search is difficult to find: first go to advanced search, type in anything, click on search, then click on modify the search results. You'll find the link for expert search in the upper right part of the website.
Expert search is characterized by a large search field...
Just delete the prior search string, then copy and paste the search strings from the attached Excel file (input data sheet). There are about 500 search strings. As you'll notice, data is not available for every year.

Then please download the entire results list. You'll find the download link above the results list.
I want to have all 21 data fields.
Select the number of studies to download: click on the option at the bottom of the dropdown menu, where you'll find the number of total hits.

There are multiple download formats. Please choose comma-separated (CSV) and XML. I want the XML only as a backup; only the CSV files will be processed. Please name the downloaded XML files as follows:
a combination of the firm name from column A and the year from row 1 (years 2003-2010).
(What you do with the CSV files is up to you: if you extract the content directly, that's fine. If you prefer to download all files first, please make sure that you name them properly!)
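
As an illustration of the naming rule, here is a minimal Python sketch. The function name xml_filename, the underscore separator, and the sample firm name are assumptions made for the example; the posting only asks for a combination of firm name and year.

```python
import re


def xml_filename(firm_name: str, year: int) -> str:
    """Build a file name from a firm name and a year (assumed pattern: Firm_Year.xml)."""
    # Replace every run of characters that is not a letter, digit or hyphen
    # with a single underscore so the name is safe on common file systems.
    safe = re.sub(r"[^A-Za-z0-9-]+", "_", firm_name.strip()).strip("_")
    return f"{safe}_{year}.xml"


# Hypothetical firm name, used only to show the result.
print(xml_filename("Acme Pharma, Inc.", 2005))  # prints Acme_Pharma_Inc_2005.xml
```

The same pattern could be reused for the CSV files if they are downloaded first and processed later.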

I want the data to be in columns, as in the CSV files.
The final layout should be as follows:
Column A contains the firm name, column B contains the year (2003-2010). Columns C, D, E, ... contain the data you downloaded.
Please have a look at the example which, at the same time, is the file to be completed...
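
The target layout (firm name in column A, year in column B, downloaded fields from column C onward) could be produced along these lines. This is only a sketch: the function name tag_rows and the three sample column names are made up for the example, and the real download has 21 fields.

```python
import csv
import io


def tag_rows(csv_text, firm, year, keep_header=False):
    """Prefix every data row of a downloaded CSV with the firm name and the year."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    out = []
    if keep_header:
        # Column A = firm, column B = year, then the original columns.
        out.append(["Firm", "Year"] + header)
    out.extend([[firm, year] + row for row in data])
    return out


# Hypothetical sample with three columns instead of the full 21.
sample = 'Rank,Title,Status\n1,Study A,Completed\n2,"Study B, extended",Recruiting\n'
for row in tag_rows(sample, "Acme Pharma", 2005, keep_header=True):
    print(row)
```

Using the csv module rather than splitting on commas keeps quoted fields that contain commas in a single column.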

If there are too many entry terms in the first field (in which case you'll get an error message), please proceed with the other search strings and notify me.

You may notice that the data overlaps somewhat (I actually search for 2001-2003, 2002-2004, 2003-2005, and so on, so entries for the year 2003 appear three times - that's correct).
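
The overlap can be checked with a short Python sketch. It assumes that each year label from 2003 to 2010 stands for a three-year window ending in that year, which matches the windows listed above but is not stated explicitly; the function name three_year_windows is made up for the example.

```python
def three_year_windows(first_label, last_label, span=3):
    """Return (start, end) windows, one per label year, each ending in that year."""
    return [(year - span + 1, year) for year in range(first_label, last_label + 1)]


windows = three_year_windows(2003, 2010)
# Windows whose range includes the calendar year 2003.
covering_2003 = [w for w in windows if w[0] <= 2003 <= w[1]]
print(windows[:3])
print(len(covering_2003))
```

Under that assumption the first three windows are 2001-2003, 2002-2004 and 2003-2005, and exactly those three include 2003.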

Summarizing:
Go to that website and use the expert search
Copy the search terms for the companies and years listed in the attached Excel file (input data sheet) and paste them into the search field of the web database.
Then download the files (CSV + XML) for ALL hits and ALL fields (careful: it is easy to forget to adjust these settings when downloading hundreds of files).
Open the CSV file, copy its content and paste it into the file (results sheet) that I attached.
Rename the xml files as described.
Finally, please provide the excel file plus the xml files (the latter all inside one or a few ZIP files).
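
For the final delivery step, the renamed XML files could be bundled with Python's standard zipfile module. This is a sketch only; the function name zip_xml_files and the folder and archive names in the usage line are placeholders.

```python
import zipfile
from pathlib import Path


def zip_xml_files(folder, archive_path):
    """Pack every .xml file in folder into one ZIP archive and return the names added."""
    xml_files = sorted(Path(folder).glob("*.xml"))
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in xml_files:
            # Store only the file name so the archive has no nested folders.
            archive.write(path, arcname=path.name)
    return [path.name for path in xml_files]
```

A call such as zip_xml_files("downloads", "xml_backup.zip") would then produce a single archive; splitting the files across a few archives only requires calling it once per subfolder.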

Please provide an estimate of how many hours you'll need and the day by which you'll be able to complete the task.
Thank you very much.
