web filter algo - index and classify


We need an algorithm that scans, in real time, any web page URL that is not already in our "Bad" list DB.

Your algorithm needs to scan page content on the fly, based on a set of search criteria (the heart of your algorithm).

For example, you need to detect an adult web site, so you need to build an algorithm that can tell that a site is adult and classify the page as adult.

Idea: scan the page title, URL, textual content, and images (with some way to tell adult images from non-adult ones).
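To make the idea above concrete, here is a minimal sketch of weighted keyword scoring over the URL, title, and visible text. The term list, the per-field multipliers, and the threshold are all hypothetical placeholders that would need tuning on labelled pages, and image analysis is not covered here.

```python
import re
from html.parser import HTMLParser

# Hypothetical term weights; a real list would be far larger and data-driven.
TERM_WEIGHTS = {"porn": 3.0, "xxx": 3.0, "nude": 2.0, "adult": 1.0}
# Hypothetical multipliers: a hit in the URL or title counts more than body text.
FIELD_WEIGHTS = {"url": 3.0, "title": 2.0, "text": 1.0}
THRESHOLD = 6.0  # assumed cut-off, to be calibrated


class PageText(HTMLParser):
    """Collects <title> text and visible text, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.title, self.text = [], []
        self._current = None  # special tag we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "script", "style"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title.append(data)
        elif self._current is None:
            self.text.append(data)


def tokens(s):
    """Lower-case alphabetic tokens."""
    return re.findall(r"[a-z]+", s.lower())


def score_page(url, html):
    """Sum of term weight times field weight over every token in every field."""
    parser = PageText()
    parser.feed(html)
    fields = {"url": url, "title": " ".join(parser.title), "text": " ".join(parser.text)}
    score = 0.0
    for name, content in fields.items():
        for tok in tokens(content):
            score += FIELD_WEIGHTS[name] * TERM_WEIGHTS.get(tok, 0.0)
    return score


def is_adult(url, html):
    return score_page(url, html) >= THRESHOLD
```

Note that the raw sum grows with page length, so a production version would probably normalise the score or cap the contribution of each field.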

You can see how Google and Bing SafeSearch classify bad images. Basically, look at the "Strict" level: given any URL of a web page or web site, you should be able to classify the site in real time as an adult site or not an adult site.

We want a detection rate close to 99.999%.

Your algorithm can work in two modes. As an offline crawler, it indexes all adult web sites out there and builds a DB of them for our company under the category "adult", running daily and adding any new adult web sites it finds.
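As a rough illustration of the offline mode, the sketch below does a breadth-first crawl from a set of seed URLs and records every page the classifier flags under the category "adult" in a SQLite table. The `sites` table layout, the `fetch` and `classify` callables, and the `max_pages` limit are assumptions made for this example; a real crawler would also need politeness delays, robots.txt handling, and scheduling for the daily run.

```python
import sqlite3
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute link targets from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links and drop any #fragment.
                self.links.append(urldefrag(urljoin(self.base_url, href))[0])


def open_db(path=":memory:"):
    """Hypothetical schema: one row per classified URL."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS sites (url TEXT PRIMARY KEY, category TEXT NOT NULL)"
    )
    return db


def crawl(seeds, fetch, classify, db, max_pages=1000):
    """Breadth-first crawl; fetch(url) returns HTML or None, classify(url, html) returns bool."""
    queue, seen, visited = deque(seeds), set(seeds), 0
    while queue and visited < max_pages:
        url = queue.popleft()
        html = fetch(url)
        visited += 1
        if html is None:
            continue  # fetch failed; nothing to classify or follow
        if classify(url, html):
            db.execute(
                "INSERT OR IGNORE INTO sites (url, category) VALUES (?, 'adult')", (url,)
            )
        extractor = LinkExtractor(url)
        extractor.feed(html)
        for link in extractor.links:
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append(link)
    db.commit()
    return visited
```

Passing `fetch` and `classify` in as parameters keeps the crawl loop independent of the HTTP client and of whichever detection method is finally chosen.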

It also works in real time when a user requests a URL: before we fetch the URL (if it is not in the blocked list), we send the URL to your algorithm and you perform a quick analysis to decide whether it is an adult site or not.

If it is adult, you send a BLOCKED message to our calling script and also push the URL to the DB that your crawler builds offline on a daily basis.

That way we want to quickly build a massive adult-site database for fast cached access, without calling your algorithm in real time, so we can return results faster.
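The real-time flow described above could look roughly like the sketch below: look the URL up in the DB first, classify only on a miss, and write positives back so the next request takes the fast path. The `sites` table, the `fetch` and `classify` callables, and the `ALLOWED` return value are assumptions for illustration (the brief only names the BLOCKED message), and the DB lookup is folded into the function here even though the calling script may do it itself.

```python
import sqlite3

BLOCKED = "BLOCKED"
ALLOWED = "ALLOWED"  # assumed name; the brief only specifies BLOCKED


def open_db(path=":memory:"):
    """Hypothetical schema shared with the offline crawler."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS sites (url TEXT PRIMARY KEY, category TEXT NOT NULL)"
    )
    return db


def check_url(url, db, fetch, classify):
    """Return BLOCKED or ALLOWED for a URL, caching positive results in the DB."""
    # Fast path: the URL is already in the blocked list.
    if db.execute("SELECT 1 FROM sites WHERE url = ?", (url,)).fetchone() is not None:
        return BLOCKED
    # Slow path: fetch the page and run the quick analysis.
    html = fetch(url)
    if html is not None and classify(url, html):
        db.execute(
            "INSERT OR IGNORE INTO sites (url, category) VALUES (?, 'adult')", (url,)
        )
        db.commit()
        return BLOCKED
    # Assumption: a failed fetch is treated as allowed (fail open).
    return ALLOWED
```

Whether a failed fetch should fail open or fail closed is a policy decision that the brief leaves open.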

Later we want to do this for 10 more categories, such as gambling, P2P, social networking, etc.

But this project is for 99.99% adult detection, including images and videos (like SafeSearch results), on the page, both in real time and as an offline process.

We want someone who truly gets it and can build something great.

Give us some thoughts, your method, the time to create it, and your bid.


related ideas/resources:

P.S. Possibly use a Google API for indexing if it helps (just a suggestion).

Bayesian filtering - optional to implement


Wikipedia entry on Bayesian classifiers

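For the optional Bayesian filtering idea, a minimal sketch of a multinomial naive Bayes text classifier with add-one (Laplace) smoothing is shown below. The class and method names and the label strings are illustrative choices, not part of the brief, and a usable model would need a large labelled training set rather than a handful of examples.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lower-case alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, text, label):
        self.doc_counts[label] += 1
        counts = self.word_counts.setdefault(label, Counter())
        for tok in tokens(text):
            counts[tok] += 1
            self.vocab.add(tok)

    def log_scores(self, text):
        """Log prior plus summed log likelihoods for each label."""
        total_docs = sum(self.doc_counts.values())
        toks = tokens(text)
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in toks:
                # Add-one smoothing keeps unseen words from zeroing the product.
                score += math.log((counts[tok] + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)
```

Working in log space avoids numeric underflow on long pages, where multiplying many small probabilities would otherwise round to zero.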

Skills: Algorithm, PHP, Software Architecture, Web Security


About the Employer:
( 333 reviews ) Valencia, United States

Project ID: #1271204

Awarded to:


Hello edangeller. I'm a software engineer and software architect. Please see my private message for justification of my bid. Thanks in advance and good luck with your project. Best regards! Esteban

$4800 USD in 20 days
(5 Reviews)

Hello there! I've sent you a message with the solution I propose for your problem. Hope to hear back from you!

$3000 USD in 30 days
(3 Reviews)

I have written high-performance crawlers. I also have a background in email delivery and spam filtering techniques that could be applied to this project. In Africa, but with 10 years of programming experience in the US - Boston…

$3000 USD in 30 days
(3 Reviews)

15 freelancers are bidding on average $4007 for this job


Let's start!

$4500 USD in 45 days
(13 Reviews)

Hi, I am confident I can handle your project. Please check your inbox, thank you.

$4000 USD in 10 days
(52 Reviews)

Everyone bids here, but what's the difference? We have coded complete search engines and we are open-source coders who have released many projects. [url removed, login to view] nearly 4 million videos indexed there. That was…

$5000 USD in 15 days
(3 Reviews)

Hello Sir, we are clear on the given requirements and we have PHP/CakePHP experts. Kindly check the PMB to see details of our previous work. Thanks, Sajin

$4000 USD in 60 days
(6 Reviews)

Please check PM. Work with us and you will have a high-standard project, and in future you will have free consulting for your next project at a small price. At present my company is partnered with: cPanel, Dell, Kaspersk…

$4200 USD in 45 days
(2 Reviews)

Please check private message.

$4000 USD in 45 days
(2 Reviews)

Interesting project, I can help.

$3500 USD in 21 days
(0 Reviews)

Hi, please see PMB.

$5000 USD in 40 days
(0 Reviews)

Hi, this project is suited to my knowledge.

$3600 USD in 25 days
(0 Reviews)

Dear Edangeller, I am a freelancer specializing in Natural Language Processing. I have attached a detailed bid for your project on classifying text and image content. Please let me know your thoughts. Regards, Sande…

$5000 USD in 60 days
(0 Reviews)

Hi. I have strong experience in text mining and pattern recognition technologies. I am interested in participating in your project. Thank you.

$3500 USD in 60 days
(0 Reviews)

Dear Sir, we have made something similar, indexing images from Google Images, but not for adult content. In the background many indexing algorithms are running and creating an inverted index. Please contact us t…

$3500 USD in 15 days
(0 Reviews)

Hi, I am ready to get started on this project. I have already created similar software, with a spider and indexing engine that scanned many millions of adult photos and images on the web and collated…

$4000 USD in 30 days
(0 Reviews)