Data Mining of Wikipedia Pages


The goal of this project is to gather information about historic events and people from Wikipedia and Freebase (which has all of the Wikipedia data, but in database form) and import it into a spreadsheet.

I hope to use Google Refine for the project, but I am open to suggestions.

[url removed, login to view]

After you watch the 3 video clips on Google Refine, you will see exactly what we are doing: using Freebase to gather the Wikipedia data by searching on the names, events, or Wikipedia links found in the respective base resources.

There are 3 basic groups of data:

#1) 1,000 of the most important people (see attached text doc)

#2) Universal history from the 4 timeline graphics from [url removed, login to view]

#3) Events: from the wiki pages below

- [url removed, login to view]

- [url removed, login to view]

- [url removed, login to view]

- [url removed, login to view]

In all cases we are collecting names, alphabetic names, titles, the Wikipedia page, birth/start date, death/end date, an image, the Wikipedia summary, a few tag words, and a couple of other items depending on the group.
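To make the target spreadsheet layout concrete, here is a minimal sketch of one row and a CSV export in Python. The column names, the `Record` class, and the `alphabetic_name` and `to_csv` helpers are all illustrative assumptions, not part of the project spec; the group-specific extra items are left out, and the "last word first" rule for alphabetic names is a naive guess that will not suit every name.

```python
import csv
import io
from dataclasses import dataclass, field, fields, asdict


@dataclass
class Record:
    """One spreadsheet row; field names are assumptions based on the list above."""
    name: str
    alphabetic_name: str = ""
    title: str = ""
    wikipedia_page: str = ""
    start_date: str = ""  # birth date for people, start date for events
    end_date: str = ""    # death date for people, end date for events
    image: str = ""
    summary: str = ""
    tags: list = field(default_factory=list)


def alphabetic_name(name: str) -> str:
    """Naive sort key: move the last word to the front, e.g. "Isaac Newton" -> "Newton, Isaac"."""
    parts = name.split()
    if len(parts) < 2:
        return name
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


def to_csv(records) -> str:
    """Serialize records to CSV text with a header row; tags are joined with ';'."""
    buffer = io.StringIO()
    columns = [f.name for f in fields(Record)]
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for record in records:
        row = asdict(record)
        row["tags"] = ";".join(record.tags)
        writer.writerow(row)
    return buffer.getvalue()
```

For example, `to_csv([Record(name="Isaac Newton", alphabetic_name=alphabetic_name("Isaac Newton"), tags=["physics"])])` returns CSV text that can be saved to a file and opened in a spreadsheet program or imported into Google Refine.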

I'm sure this will generate some detailed questions, but this should be enough to decide if this is something you are interested in.

I will want a copy of your work at various steps (i.e. a copy of your Google Refine projects) to make sure we are not missing anything or wrongly filtering items.

It is a plus if you have worked on similar projects. Please let me know if you have any suggestions or ideas that may be better.


Skills: Data Mining, Web Scraping


About the Employer:
(52 reviews) San Diego, United States

Project ID: #1488379

Awarded to:


Dear Sir, I'm an expert in web scraping and data processing. Please check the details in PM. Thank you.

$100 USD in 7 days
(2 Reviews)

7 freelancers are bidding on average $150 for this job


Dear gutekunstb, Greetings! Here is a placeholder bid with default values. I shall get back to you soon with any queries. Thank you.

$48 USD in 1 day
(19 Reviews)

Dear Sir/Madam, I am Tushar from Ozone Solutions India. Ozone Solutions has been providing administrative support for the last 3 years. We are one of the top service providers in Admin Support on one of the big freelancer webs...

$200 USD in 10 days
(11 Reviews)

See my PM.

$150 USD in 7 days
(0 Reviews)

Please check PMB.

$225 USD in 6 days
(0 Reviews)

Hi, I am a BTech Computer Science graduate. I believe I have the necessary skills to execute the project. Please check the PMB for additional details.

$125 USD in 2 days
(0 Reviews)

Hi, I have over 5 years of experience in Data Warehousing and Data Mining. Please share the details of the project. Thanks, Amit

$200 USD in 7 days
(0 Reviews)