PHP script to build a database of the most common words on the internet

In Progress

Implement a PHP class that takes a single web page address (URL) as input. When called, it should download the page, extract its text content (i.e. remove all HTML tags, JavaScript, etc.) and update a database table with information about the most common words found, in relation to the web page's TLD. For example, if the system is called with the input URL "[url removed, login to view]", the TLD would be ".com"; if it is called with "[url removed, login to view]", the TLD would be ".[url removed, login to view]". The list of possible TLDs can be found, for example, at: [url removed, login to view]
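A minimal sketch of the two helper steps described above, TLD extraction and HTML-to-text conversion. The function names `extractTld` and `htmlToText` are hypothetical and not part of the posting. The sketch assumes the TLD is simply the last dot-separated label of the host name; multi-label suffixes (the second example above may be one, but its URL is not shown) would need a lookup against a TLD list instead.

```php
<?php
declare(strict_types=1);

/**
 * Return the last dot-separated label of the URL's host, lowercased,
 * or null when the URL has no usable host name (assumption: TLD = last label).
 */
function extractTld(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // e.g. a URL without a scheme has no parsed host
    }
    $host = strtolower(rtrim($host, '.'));
    if (filter_var($host, FILTER_VALIDATE_IP) !== false) {
        return null; // an IP address has no TLD
    }
    $pos = strrpos($host, '.');
    if ($pos === false) {
        return null; // single-label host such as "localhost"
    }
    return substr($host, $pos + 1);
}

/**
 * Reduce an HTML document to plain text: drop script/style blocks,
 * strip the remaining tags, decode entities, collapse whitespace.
 */
function htmlToText(string $html): string
{
    // Remove script and style elements together with their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1\s*>#is', ' ', $html) ?? $html;
    // Put a space before every tag so adjacent words do not merge.
    $html = str_replace('<', ' <', $html);
    $text = strip_tags($html);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return trim(preg_replace('/\s+/', ' ', $text) ?? $text);
}
```

The HTML passed to `htmlToText` could come from `file_get_contents($url)` or a cURL request; the download step is left out here so the helpers stay free of network access. A regex-based cleanup like this is a simplification, and a DOM parser may be more robust on malformed pages.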

In other words, the idea is that the system will generate a huge database table that contains information about the most common words found in web pages of different TLDs. Word matching is, of course, not case sensitive. A word is defined as a string of 2-100 characters consisting only of the letters A to Z.
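The word definition above can be sketched as a single regular expression. This is one possible reading, with a hypothetical function name `extractWords`: a word is a maximal run of ASCII letters, and runs shorter than 2 or longer than 100 letters are skipped entirely rather than truncated. Any non-ASCII letter acts as a separator under this reading.

```php
<?php
declare(strict_types=1);

/**
 * Extract words as defined in the posting: 2-100 letters, A to Z only,
 * case-insensitive. Returns lowercase words, one entry per occurrence.
 *
 * @return string[]
 */
function extractWords(string $text): array
{
    $text = strtolower($text);
    // The lookarounds force a match to cover a whole run of letters,
    // so a run of 101+ letters yields no match instead of a 100-letter slice.
    preg_match_all('/(?<![a-z])[a-z]{2,100}(?![a-z])/', $text, $matches);
    return $matches[0];
}
```

For instance, `extractWords('Foo bar, a FOO-baz x2y')` yields `foo`, `bar`, `foo`, `baz`: the single letters `a`, `x` and `y` are too short to count.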

The table must record how many times each word has been found and the date on which it was last found.

So, the fields of the table could be:

id - integer - auto increment

word - varchar(100)

hits - integer (number of times this word has been found in different web pages)

last_hit - date (the date when this word was last found)

country - varchar(2) (the two-letter ISO code of the country of the web page in which this word was found)
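The field list above could translate into a MySQL table along these lines. This is a sketch, not a confirmed schema: the table name `common_words`, the function name `wordTableSchema`, the `UNSIGNED`/`NOT NULL` modifiers, and the unique key on (word, country) are all assumptions. The unique key is there so that each word has one row per country, which makes a later "insert or increment" query possible.

```php
<?php
declare(strict_types=1);

/**
 * Build the CREATE TABLE statement for the word table.
 * The default table name is a hypothetical choice.
 */
function wordTableSchema(string $table = 'common_words'): string
{
    // The name is interpolated into SQL, so accept plain identifiers only.
    if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $table)) {
        throw new InvalidArgumentException('Invalid table name');
    }
    return <<<SQL
CREATE TABLE IF NOT EXISTS `{$table}` (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    word VARCHAR(100) NOT NULL,
    hits INT UNSIGNED NOT NULL DEFAULT 0,
    last_hit DATE NOT NULL,
    country VARCHAR(2) NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uq_word_country (word, country)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
SQL;
}
```

With a PDO connection in `$pdo`, the table could then be created with `$pdo->exec(wordTableSchema());`.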

The hits value is increased each time the same word is found on a different page. In other words, if a web page contains the word "foobar" 10 times, it is still added to the table with a hit count of 1. When the word "foobar" is found on another web page from the same country, the hits counter is increased by one.
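The counting rule above can be sketched in two steps: collapse repeated words within one page, then insert or increment one row per word. The function names and the `common_words` table name are hypothetical, and the query assumes a unique key on (word, country), without which MySQL's `ON DUPLICATE KEY UPDATE` would never fire. The sketch also assumes PDO is configured to throw exceptions on errors, which is the default since PHP 8.

```php
<?php
declare(strict_types=1);

/**
 * Collapse repeated words so each word counts at most once per page.
 *
 * @param string[] $words
 * @return string[]
 */
function uniqueWordsForPage(array $words): array
{
    return array_values(array_unique($words));
}

/**
 * Record one page's words: insert new rows with hits = 1, or add one hit
 * and refresh last_hit for rows that already exist.
 * Returns the number of distinct words recorded.
 *
 * @param string[] $words
 */
function recordPageHits(PDO $pdo, array $words, string $country): int
{
    $stmt = $pdo->prepare(
        'INSERT INTO common_words (word, hits, last_hit, country)
         VALUES (:word, 1, CURDATE(), :country)
         ON DUPLICATE KEY UPDATE hits = hits + 1, last_hit = CURDATE()'
    );
    $unique = uniqueWordsForPage($words);
    $pdo->beginTransaction();
    try {
        foreach ($unique as $word) {
            $stmt->execute([':word' => $word, ':country' => $country]);
        }
        $pdo->commit();
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
    return count($unique);
}
```

Note that this does not remember which pages were already processed, so calling it twice for the same URL would count that page twice. Whether that matters is not specified in the posting.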

I will use the hits and last_hit data to prune the database table so that it does not grow too big. I want to build a table of the most common words found online, not of all words. The job must be implemented with an object-oriented design using PHP classes. For the database, use MySQL. You must develop the script on your own server; you will not be given any access to the production server.

Skills: PHP


Project ID: #5071936

Awarded to:


Hello Jouni, I'm interested in building this script. My php guy would love to do it :) Thanks, Looking forward ;) Adnan

$247 USD in 3 days
(409 Reviews)

7 freelancers are bidding on average $232 for this job


Hello sir, kindly have a look at our portfolio. We, Moons Micros Systems, are pleased to introduce ourselves as a leading offshore IT development company in Rajasthan, India. I am gla...

$247 USD in 11 days
(15 Reviews)

Hi, I understand what you want and I have one question about "build a table of all the most common words found online, not all words": do you have a list of words? Please talk to me. I'm fast and an expert. Thank you!

$236 USD in 5 days
(208 Reviews)

Hello, I have experience developing similar scripts. As far as I can see, you will need a module to manage the "words", right? I am ready to start working on it right now. Thanks for reading my proposal, Bes...

$205 USD in 3 days
(108 Reviews)

Hello sir. I am very interested in your job and I am very experienced with these skills. If you want to see my skills, please visit my portfolio site. I am waiting for your offer. Thank you.

$154 USD in 10 days
(36 Reviews)

Hi, how exactly do you know which infrequently used words to prune? A word may increase significantly later, but if you have already removed it from the DB it cannot be counted anymore; that way you just decrease...

$250 USD in 8 days
(9 Reviews)

Hi jv, so you need to start on one page, get all valid TLDs ordered by relevance, save them to the DB, then follow these URLs and repeat the same step, right? It is easy. (I can do it in PHP + MySQL + Bootstrap + AJAX.) I'm perfect for the jo...

$200 USD in 7 days
(0 Reviews)

I can do this fast and I also have a server to try the code. I will also provide you with an admin to see all the data you requested in my server.

$333 USD in 2 days
(0 Reviews)

Hi, I have already worked on a similar project and can deliver as you have mentioned. We have similar work experience, have worked on similar projects in the past, and can deliver as you have s...

$242 USD in 12 days
(0 Reviews)