Closed

Categorizing 40,000 Wikipedia Articles

Greetings. I have collected about 40,000 URLs of random, uncategorized Wikipedia articles.

I'd like to sort these URLs and assign each one to its respective category.

I have established some general parent categories which I feel the articles should fall under.

Architecture, Arts, Film and Music

Communication, Education and Literature

Companies and Organizations

Economics and Finance

Energy and Environment

Food and Drink

Geography and Places

Health and Medicine

Law and Politics

Mathematics

Media (Books, Movies and TV)

People

Philosophy, Religion and Spirituality

Psychology

Recreation and Sports

Science and Technology

Social Science (Anthropology, History and Sociology)

These are just the parent categories; each article should then be sorted by its sub-categories as well (Example: in Mathematics - Probability, Geometry, etc.; in Geography - Cities, National Parks, Islands, etc.; in Religion - Buddhism, Judaism, etc.; in Technology - Networking, AI, etc.; in People - Business, Sports, Politics, etc.)

The URLs are in a plain text format (.txt) and the output can be the same.
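
Since the input is a plain-text list of URLs, a first step could be to read the file and recover each article title from its URL. The following is a minimal Python sketch, assuming one URL per line and the usual `/wiki/<Title>` path layout. The function names are illustrative and not part of the original brief.

```python
from urllib.parse import unquote, urlparse


def title_from_url(url: str) -> str:
    """Return the article title encoded in a Wikipedia article URL."""
    path = urlparse(url.strip()).path
    # Assumption: article URLs look like https://<lang>.wikipedia.org/wiki/<Title>.
    prefix = "/wiki/"
    if not path.startswith(prefix):
        raise ValueError(f"not a Wikipedia article URL: {url!r}")
    # Undo percent-encoding and turn underscores back into spaces.
    return unquote(path[len(prefix):]).replace("_", " ")


def read_urls(lines):
    """Return unique, non-empty URLs from an iterable of text lines, in input order."""
    seen = set()
    urls = []
    for line in lines:
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

With a hypothetical input file named `urls.txt`, this could be used as `with open("urls.txt", encoding="utf-8") as f: urls = read_urls(f)`. Removing duplicates up front avoids categorizing the same article twice.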

------------------------

Example

Uncategorized:

[login to view URL]

[login to view URL](philosophy)

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

------------------------

Categorized:

[Communication - Journalism]

[login to view URL]

[Communication - Literature]

[login to view URL]

[Companies - Financial]

[login to view URL]

[Companies - Technology]

[login to view URL]

[Companies - Transport]

[login to view URL]

[Finance - Foreign Exchange]

[login to view URL]

[Finance - Insurance]

[login to view URL]

[Finance - Options]

[login to view URL]

[Finance - Taxation]

[login to view URL]

[Health - Diseases]

[login to view URL]

[Health - Sleep]

[login to view URL]

[Geography - Parks]

[login to view URL]

[Geography - Salt Flats]

[login to view URL]

[Geography - Valleys]

[login to view URL]

[People - Artist]

[login to view URL]

[People - Businessmen]

[login to view URL]

[People - Philosopher]

[login to view URL]

[People - Politics]

[login to view URL]

[Philosophy]

[login to view URL]

[Philosophy - Concepts]

[login to view URL](philosophy)

[Religion - Buddhism]

[login to view URL]

[Religion - Hinduism]

[login to view URL]

------------------------
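
The categorized layout above (a bracketed header, then the URLs that belong under it) could be produced by a small formatter such as the sketch below. It assumes each article has already been assigned a parent category and an optional sub-category. The `(url, parent, sub)` triple and the function name are illustrative choices, and the alphabetical ordering is an assumption (the example above is only roughly alphabetical).

```python
from collections import defaultdict


def format_categorized(assignments):
    """Render (url, parent, sub) triples as bracketed category blocks.

    `sub` may be None, which yields a parent-only header such as [Philosophy].
    """
    groups = defaultdict(list)
    for url, parent, sub in assignments:
        groups[(parent, sub or "")].append(url)

    blocks = []
    # Sort by parent, then sub-category; a parent-only group sorts first.
    for parent, sub in sorted(groups):
        header = f"[{parent} - {sub}]" if sub else f"[{parent}]"
        blocks.append("\n\n".join([header, *sorted(groups[(parent, sub)])]))
    return "\n\n".join(blocks) + "\n"
```

The returned string can be written straight to a `.txt` file, which matches the requested output format.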

The above example has to be applied to 40,000 URLs. Avoiding overcategorization is a must. Strive to keep the sub-categories broad.
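
One possible way to guard against overcategorization is to fold any sub-category that ends up with very few articles back into its parent category. The sketch below assumes assignments are held as a list of `(url, parent, sub)` triples. The function name and the `min_size` threshold are hypothetical and would need tuning against the real data.

```python
from collections import Counter


def fold_small_subcategories(assignments, min_size=25):
    """Move URLs from rarely used sub-categories up to their parent category.

    `assignments` is a list of (url, parent, sub) triples; `sub` may be None.
    `min_size` is an illustrative threshold, not a figure from the brief.
    """
    counts = Counter((parent, sub) for _, parent, sub in assignments)
    folded = []
    for url, parent, sub in assignments:
        if sub is not None and counts[(parent, sub)] < min_size:
            sub = None  # too few articles to justify a separate sub-category
        folded.append((url, parent, sub))
    return folded
```

A higher `min_size` gives fewer, broader sub-categories at the cost of more articles listed only under their parent category.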

I came across a few links which may be useful:

[login to view URL]

[login to view URL]

[login to view URL]

Skills: Data Analytics, Data Processing, Natural Language, Web Scraping, Wikipedia



Project ID: #18827984

14 freelancers are bidding on average $651 for this job

sobujprantor

Hi, I have a big team, so I think this work will be easy for me. If you are interested in working with me, please message me. Thanks.

$250 USD in 15 days
(243 Reviews)
7.5
zhangyingtai

Hello sir, I have 9 years of experience in web scraping. I have extensive knowledge of NLP and am very familiar with categorizing Wikipedia articles. I can write a Python script to categorize the articles using th...

$250 USD in 3 days
(30 Reviews)
6.3
devi222

Hi Sir, I’m an expert in categorisation and have scraped Wikipedia many times. I can sort the 40k URLs into their respective main and sub-categories. Thanks.

$750 USD in 6 days
(143 Reviews)
6.7
anashcisoft

Hello, I have more than 10 years of experience in machine learning, natural language processing, data mining and other AI-related fields. I have worked on many similar projects and can do this project in a perfec...

$555 USD in 10 days
(17 Reviews)
6.3
EmmaWat

Hi, I hope you are doing great. I just came across your project, which states that you are looking for a Wikipedia expert to create and publish the page for you. I would like to tell you that we are a team of expert...

$1000 USD in 10 days
(3 Reviews)
5.6
TaffyAU

Hi, I'm an active Wikipedia editor and am aware of all of Wikipedia's policies and terms of reference. Wikipedia is more than just throwing pages up; it requires a proper strategy and methods to bring them live. I have...

$2500 USD in 60 days
(13 Reviews)
5.3
revival786

Greetings! I hope you are doing great. I am highly professional in managing Wikipedia projects. Please contact me so I may assist you. Wikipedia Sample Work: [login to view URL]...

$250 USD in 5 days
(9 Reviews)
5.2
Alexod

Hello, I can build a program to categorize all your articles. The results will be fast and accurate! I know what I'm talking about, because I have experience in this field.

$555 USD in 10 days
(4 Reviews)
3.6
MissCrissy

Hi! I am an IT consultant and a virtual assistant, so this project would fit me perfectly. I can start ASAP and work flexible hours according to your needs. Please contact me for further information and let's get this pro...

$250 USD in 10 days
(3 Reviews)
2.9
tabonk82

Hello, I would really like to work with you on this one if possible! I do have a couple of questions, but first I would like to make you an offer and give you some background so you can check my work out. I am a profession...

$250 USD in 30 days
(5 Reviews)
0.9
jmtamez2203

Hi there! My partner and I would love to work on your project. I offer article classification based on machine learning algorithms such as a Random Forest classifier, with natural language processing from NLTK, all thi...

$700 USD in 13 days
(0 Reviews)
0.0
vakarsh

Hello, I hope you are doing well. I just wanted to share that I am good with probability and statistics. I am fairly comfortable with Python and R for a variety of data science and statistical analysis tasks...

$250 USD in 14 days
(1 Review)
0.0
distilledinfo

Hi, I am ready to categorize the 40,000 Wikipedia articles. Please initiate a chat to discuss further. Please check the articles done by me: [login to view URL] [login to view URL] [login to view URL] http...

$1000 USD in 30 days
(0 Reviews)
0.0
expertswriting45

I will be happy to help you with 100% original work. I have been working as an academic for the last 15 years. I deliver to a high standard and am willing to make any changes until you are fully satisfied.

$555 USD in 10 days
(0 Reviews)
0.0