PHP Spider - Auto Learning AI approach

We wish to build an intelligent spider which can learn from drag and drop user actions. Our objective is to provide a GUI for learning spidering rules.

The rules will be based on a combination of knowing the HTML page structure and being able to extract elements, e.g. a table of values and thereafter the fields within the table, which may repeat.
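As a rough illustration of extracting a table and its repeating fields, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath. The sample HTML, the table id "prices" and the function name extractRows are assumptions made up for this example, not part of the project.

```php
<?php
// Hypothetical sample page; a real spider would fetch this from a URL.
$html = <<<HTML
<html><body>
<table id="prices">
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Widget</td><td>12.50</td></tr>
  <tr><td>Gadget</td><td>7.00</td></tr>
</table>
</body></html>
HTML;

// Return the text of every data row (rows that contain <td> cells)
// inside the tables matched by $tableXPath.
function extractRows(string $html, string $tableXPath): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query($tableXPath . '//tr[td]') as $tr) {
        $cells = [];
        foreach ($xpath->query('td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

$rows = extractRows($html, '//table[@id="prices"]');
```

Each inner array is one repeating record, so the same rule covers any number of rows on the page.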

Extracted data should be written to a database, i.e. we should be able to drag and drop the extracted field values onto a database structure (auto map).
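One possible way to represent the auto map is a simple array from extracted field position to database column. The sketch below builds a parameterised INSERT from such a mapping. The table name products, the column names, and the functions assertIdentifier and buildInsert are all hypothetical.

```php
<?php
// Accept only simple SQL identifiers, because table and column names
// cannot be bound as query parameters.
function assertIdentifier(string $name): void
{
    if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*\z/', $name)) {
        throw new InvalidArgumentException("Unsafe identifier: $name");
    }
}

// $mapping: extracted field index => database column name.
// Returns the SQL text and the named parameters to bind.
function buildInsert(string $table, array $mapping, array $row): array
{
    assertIdentifier($table);
    $columns = [];
    $params = [];
    foreach ($mapping as $fieldIndex => $column) {
        assertIdentifier($column);
        $columns[] = $column;
        // A missing field is stored as NULL rather than failing.
        $params[':' . $column] = $row[$fieldIndex] ?? null;
    }
    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES (%s)',
        $table,
        implode(', ', $columns),
        implode(', ', array_keys($params))
    );
    return [$sql, $params];
}

// Hypothetical mapping produced by a drag and drop session.
$mapping = [0 => 'item_name', 1 => 'price'];
[$sql, $params] = buildInsert('products', $mapping, ['Widget', '12.50']);
// With PDO this could then be run as: $pdo->prepare($sql)->execute($params);
```

Keeping the mapping as plain data means the GUI only has to save an array, and the same mapping can be replayed for every extracted row.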

There may be several rows of data and HTML commits per HTML page parsed.

The extraction rules are twofold:

1) The HTML structure itself, tagged to support drag and drop and selecting groups of elements

2) Automatic regex pattern learning based on examples taken from 1), or combined with 1)
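Regex learning from examples could start from something as simple as the sketch below. It generalises each example into character classes and accepts the pattern only if all examples agree. This is a deliberately naive heuristic chosen for illustration, not an established learning algorithm, and the function names generalise and learnPattern are made up.

```php
<?php
// Turn one example string into a pattern: runs of digits become \d+,
// runs of letters become [A-Za-z]+, whitespace becomes \s+, and any
// other character is kept as an escaped literal.
function generalise(string $example): string
{
    preg_match_all('/\d+|[A-Za-z]+|\s+|./su', $example, $m);
    $pattern = '';
    foreach ($m[0] as $token) {
        if (preg_match('/^\d+\z/', $token)) {
            $pattern .= '\d+';
        } elseif (preg_match('/^[A-Za-z]+\z/', $token)) {
            $pattern .= '[A-Za-z]+';
        } elseif (preg_match('/^\s+\z/', $token)) {
            $pattern .= '\s+';
        } else {
            $pattern .= preg_quote($token, '/');
        }
    }
    return $pattern;
}

// Return an anchored regex if every example generalises to the same
// pattern, or null if the examples disagree and more input is needed.
function learnPattern(array $examples): ?string
{
    $patterns = array_unique(array_map('generalise', $examples));
    return count($patterns) === 1 ? '/^' . reset($patterns) . '$/' : null;
}

// Hypothetical field values the user dragged out of a price column.
$regex = learnPattern(['$12.50', '$7.00', '$120.99']);
```

A null result is a useful signal for the GUI: it can ask the user for more examples or fall back to the structural rule from 1).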

Given a URL, the page should be parsed so that its HTML elements are ready for drag and drop in the GUI.

One of the key concerns is to be able to detect any ambiguity in the rules. For example, a table may occur 20 times on a page; we may need only the 4th table, or only the table with a particular style (<table columns=30 style="example">).
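Ambiguity can be checked mechanically by counting how many elements a rule matches. The sketch below treats a rule as an XPath expression, which is an assumption of this example, and shows the two ways of narrowing it mentioned above: by position and by attribute. The sample HTML and the function names countMatches and isAmbiguous are hypothetical.

```php
<?php
// Hypothetical page with three tables, one of which carries a style.
$html = '<html><body>'
    . '<table><tr><td>A</td></tr></table>'
    . '<table style="example"><tr><td>B</td></tr></table>'
    . '<table><tr><td>C</td></tr></table>'
    . '</body></html>';

// Count how many nodes an XPath rule selects on the page.
function countMatches(string $html, string $rule): int
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    return (new DOMXPath($doc))->query($rule)->length;
}

// A rule is treated as ambiguous unless it selects exactly one node.
function isAmbiguous(string $html, string $rule): bool
{
    return countMatches($html, $rule) !== 1;
}

$tooBroad    = isAmbiguous($html, '//table');                   // matches all three tables
$byPosition  = isAmbiguous($html, '(//table)[2]');              // second table only
$byAttribute = isAmbiguous($html, '//table[@style="example"]'); // styled table only
```

When a rule is ambiguous, the GUI could offer the positional and attribute variants and let the user choose whichever is more likely to survive page changes.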

It is important that recurring elements are supported.

Grouping and nested elements:

It should be possible to first group elements using a rubber-banding GUI technique (e.g. select a table) and drag and drop them from the resulting page into a box. The system should then be able to visit the page URL and follow the drag and drop example, i.e. extract the table.

Based on an example, the system should know which HTML elements to extract from a page, given a user's drag and drop action. Regex rules and wildcard patterns should be self-learning so that elements can be extracted from within a table.
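One simple way to turn a user's selection into a replayable rule is to record the selected node's path and run it against later pages. The sketch below uses DOMNode::getNodePath() for this. The two sample pages are made up, and picking the first table stands in for the real drag and drop selection.

```php
<?php
// Parse an HTML string, tolerating the malformed markup found in the wild.
function loadDom(string $html): DOMDocument
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    return $doc;
}

// Hypothetical example page shown to the user in the GUI.
$examplePage = '<html><body><div><p>Intro</p></div>'
    . '<table><tr><td>X</td></tr></table></body></html>';

// Stand-in for the user's selection: the first table on the page.
$selected = loadDom($examplePage)->getElementsByTagName('table')->item(0);

// The learned rule is the XPath location of the selected node.
$rule = $selected->getNodePath();

// Later, replay the rule on another page with the same layout.
$otherPage = '<html><body><div><p>Other</p></div>'
    . '<table><tr><td>Y</td></tr></table></body></html>';
$match = (new DOMXPath(loadDom($otherPage)))->query($rule)->item(0);
$extracted = $match !== null ? trim($match->textContent) : null;
```

An absolute path like this is brittle if the page layout shifts, so a fuller design would probably generalise it using attributes or the regex rules described earlier.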


If you have strong experience in some of the above coupled with solid GUI development, please respond.

PHP Simple HTML DOM Parser


Or a similar DOM parser to tag elements for drag and drop.

Other Blurb : Related

1. Ontology learning and population: bridging the gap between text ... - Google Books Result

Buitelaar, Philipp Cimiano - 2008 - Computers - 273 pages

However, it relies on a set of manually written regular expressions, ... critical requirement of the method is the availability of sound core ontologies, ...

2. Ontology learning for the semantic Web - Google Books Result

Maedche - 2002 - Computers - 244 pages

Figure 7.12 depicts the view of pattern engineering that allows to development and debugging of regular expression patterns for ontology learning. ...

Skills: AJAX, Javascript, Perl, PHP, Script Install


About the Employer:
( 82 reviews ) karachi, Pakistan

Project ID: #1298146

7 freelancers are bidding on average $657 for this job


We can help in your project, please check PMB and our ratings/reviews to get idea of our experience.

$500 USD in 10 days
(239 Reviews)

Just click the attachment, where I have linked our best projects. We have 7 years of experience in web development. Please check PMB for more details.

$750 USD in 20 days
(11 Reviews)

I have worked on many complex spiders for my company in the real estate and automotive domains, so I think I am qualified to help you.

$750 USD in 10 days
(2 Reviews)


$750 USD in 7 days
(0 Reviews)

Hello Sir, we can confidently complete the project. Please check PMB for listing. Warm regards

$600 USD in 7 days
(0 Reviews)

Dear Sir, thank you for posting your project. We have more than 14 years of experience in the software development, web development and multimedia industries. We have skilled developers having expertise web 2.0 d...

$700 USD in 17 days
(0 Reviews)

please check PMB for our portfolio

$550 USD in 8 days
(0 Reviews)