Visual web scraping framework in Delphi - repost

Status: CLOSED
Bids: 3
Avg Bid (AUD): $597
Project Budget (AUD): $250 - $750

Project Description:
Visual web scraping framework in Delphi.

I would like a generic scraping framework written in Delphi, which I can use as a basis for building web scrapers for various websites.

An implementation for a specified website will be required to demonstrate the functionality of the framework.

Framework functionality must include:
* full programmatic navigation of a website (via visual recognition of components, which I specify programmatically through the framework)
* handling/navigation of pop-up windows, Java and otherwise (visual)
* emulation of website clicks/actions (preferably within the application, rather than emulating clicks on the computer)
* ability to translate text data from a visual area (i.e. visual translation of the text)
* non-visual translation of text data would be useful too (if it can be done simply, e.g. by highlighting and copying to the clipboard)
* a simple framework to store the data, reflecting the structure of the data on the source website

More description will be added soon.

Additional Project Description:
09/26/2013 at 23:30 CLST
Scope:
- solution must be in Delphi.
- I expect the developer to make use of available Delphi libraries.
- I am open to ideas and suggestions from the developer, but want to keep it simple.


Objective: to build a visual web scraper framework in Delphi, so that web scraper applications can be built using the framework.

I want the ability to build the web scraper applications myself. However, if I am happy with the project, the likelihood of future work is high.

**To clarify, by "visual" I mean that text data is extracted "visually", i.e. translated from a visual image (a JPG screen grab or equivalent). Website navigation is also to be "visual".


Core requirements:

general:
- working Delphi source code of the framework.
- compatible with Delphi XE4.
- extracts text data visually**.
- navigates website visually**.
- emulates basic website interaction, e.g. mouse clicks, typing, etc.
- parts of the website to click or type into are identified visually**.
- handles navigation of pop-up windows visually**, Java/Flash and otherwise.
- the design must maximise scraping speed; multithreading would be good if you have experience with it.
- the design must also be efficient in its use of computing resources.
- working Delphi source code of a scraper for a given website, using the framework (to demonstrate functionality).

specific:
- I suggest framework allows you to specify areas, where text data, a navigation node, or interaction area exists.
- I suggest a navigation tree or equivalent be dynamically generated, where any data etc can be related to the relevant navigation node.
- framework allows all data extractions, navigations, and interactions to be saved and loaded. (to/from a mapping or equivalent).
- any saved mapping/commands can be replicated programmatically to emulate website use.
- a method to test that the visual area for a mapping/command is correct.
- the framework should compile to a working executable, where you can create/save new mapping/commands through the framework by manually clicking and interacting with a website.
- there must be a visual representation of an individual mapping/command (e.g. highlight area specified).
- navigation includes scrolling.
- emulation of website clicks/actions should be done within the application, rather than emulating clicks on the computer.
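
As a rough illustration only (not a required design), saving and loading a single mapping could look something like the sketch below. It assumes an INI file as the storage format and uses the standard TMemIniFile class; the TMapping record, the TMappingKind enum, the SaveMapping/LoadMapping routines and the 'login_button' name are all hypothetical and just show the idea of a named visual area that can be stored and read back.

```delphi
program MappingStoreSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Types, System.IniFiles;

type
  // Hypothetical names: what kind of area a mapping describes.
  TMappingKind = (mkData, mkNavigation, mkInteraction);

  // Hypothetical record: one named visual area on a page.
  TMapping = record
    Name: string;
    Kind: TMappingKind;
    Area: TRect;
  end;

// Writes one mapping into its own INI section (assumed storage format).
procedure SaveMapping(const AFileName: string; const AMapping: TMapping);
var
  Ini: TMemIniFile;
begin
  Ini := TMemIniFile.Create(AFileName);
  try
    Ini.WriteInteger(AMapping.Name, 'Kind', Ord(AMapping.Kind));
    Ini.WriteInteger(AMapping.Name, 'Left', AMapping.Area.Left);
    Ini.WriteInteger(AMapping.Name, 'Top', AMapping.Area.Top);
    Ini.WriteInteger(AMapping.Name, 'Right', AMapping.Area.Right);
    Ini.WriteInteger(AMapping.Name, 'Bottom', AMapping.Area.Bottom);
    Ini.UpdateFile; // flush the in-memory INI to disk
  finally
    Ini.Free;
  end;
end;

// Reads a mapping back by name; missing values fall back to defaults.
function LoadMapping(const AFileName, AName: string): TMapping;
var
  Ini: TMemIniFile;
begin
  Ini := TMemIniFile.Create(AFileName);
  try
    Result.Name := AName;
    Result.Kind := TMappingKind(Ini.ReadInteger(AName, 'Kind', Ord(mkData)));
    Result.Area := Rect(
      Ini.ReadInteger(AName, 'Left', 0),
      Ini.ReadInteger(AName, 'Top', 0),
      Ini.ReadInteger(AName, 'Right', 0),
      Ini.ReadInteger(AName, 'Bottom', 0));
  finally
    Ini.Free;
  end;
end;

var
  FileName: string;
  Saved, Loaded: TMapping;
begin
  // Store the mappings next to the executable (an arbitrary choice).
  FileName := ChangeFileExt(ParamStr(0), '.ini');

  Saved.Name := 'login_button';
  Saved.Kind := mkInteraction;
  Saved.Area := Rect(400, 300, 520, 340);
  SaveMapping(FileName, Saved);

  Loaded := LoadMapping(FileName, 'login_button');
  Writeln(Loaded.Name, ': ', Loaded.Area.Left, ',', Loaded.Area.Top,
    ' - ', Loaded.Area.Right, ',', Loaded.Area.Bottom);
end.
```

The developer is free to propose a different storage format; the point is only that a mapping is a small, named piece of data that can be saved, reloaded and replayed.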

- data should be stored according to where it exists on the website, i.e. data structure must be created dynamically with the navigation of the website.
- store history of extracted data in working memory with timestamps.
- have a simple way to look at history data while scraper app is running, i.e. using a string grid or equivalent.
- I suggest (for example) you use the generic TList<T> and TObjectList<T> classes, so that navigation and storage of data can be dynamic and generic. Data, navigation and interaction nodes could each have their own class inheriting from a generic node class, and sit as objects in the navigation tree lists.
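
To make the node suggestion more concrete, here is a minimal sketch of what I have in mind, not a prescribed design. It assumes System.Generics.Collections is used; the names TScrapeNode, TNavigationNode, TDataNode and TSample are hypothetical. Each node holds a visual area and owns its child nodes, and a data node also keeps a timestamped history of extracted values in memory.

```delphi
program NodeTreeSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Types, System.Generics.Collections;

type
  // Hypothetical base class: a named visual area that owns its children.
  TScrapeNode = class
  private
    FName: string;
    FArea: TRect;
    FChildren: TObjectList<TScrapeNode>;
  public
    constructor Create(const AName: string; const AArea: TRect);
    destructor Destroy; override;
    function AddChild(ANode: TScrapeNode): TScrapeNode;
    property Name: string read FName;
    property Area: TRect read FArea;
    property Children: TObjectList<TScrapeNode> read FChildren;
  end;

  // Hypothetical node for a place the scraper can navigate to.
  TNavigationNode = class(TScrapeNode)
  end;

  // One extracted value together with the time it was read.
  TSample = record
    Timestamp: TDateTime;
    Value: string;
  end;

  // Hypothetical node for an area that text data is extracted from.
  TDataNode = class(TScrapeNode)
  private
    FHistory: TList<TSample>;
  public
    constructor Create(const AName: string; const AArea: TRect);
    destructor Destroy; override;
    procedure AddSample(const AValue: string);
    property History: TList<TSample> read FHistory;
  end;

constructor TScrapeNode.Create(const AName: string; const AArea: TRect);
begin
  inherited Create;
  FName := AName;
  FArea := AArea;
  FChildren := TObjectList<TScrapeNode>.Create(True); // list owns the nodes
end;

destructor TScrapeNode.Destroy;
begin
  FChildren.Free; // also frees every child node
  inherited;
end;

function TScrapeNode.AddChild(ANode: TScrapeNode): TScrapeNode;
begin
  FChildren.Add(ANode);
  Result := ANode;
end;

constructor TDataNode.Create(const AName: string; const AArea: TRect);
begin
  inherited Create(AName, AArea);
  FHistory := TList<TSample>.Create;
end;

destructor TDataNode.Destroy;
begin
  FHistory.Free;
  inherited;
end;

procedure TDataNode.AddSample(const AValue: string);
var
  S: TSample;
begin
  S.Timestamp := Now; // when the value was extracted
  S.Value := AValue;
  FHistory.Add(S);
end;

var
  Root: TScrapeNode;
  Price: TDataNode;
begin
  Root := TNavigationNode.Create('home', Rect(0, 0, 1024, 768));
  try
    Price := TDataNode.Create('price', Rect(100, 200, 300, 230));
    Root.AddChild(Price);
    Price.AddSample('42.50'); // placeholder value, not real scraped data
    Writeln(Root.Name, ' has ', Root.Children.Count, ' child node(s)');
    Writeln(Price.Name, ' history entries: ', Price.History.Count);
  finally
    Root.Free; // frees the whole tree
  end;
end.
```

The history list on each data node is what a string grid (or equivalent) would read from while the scraper is running.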

working website scraper source code:
- built using framework.
- I will give you a specific website to base this on.
- I suggest it is built as an extension of the framework application, where a new unit contains the code specific to the website. I.e. it is essentially the same application, where the core framework saves and executes commands and stores the data.

Skills required:
Delphi
About the employer:
Verified


$750 in 7 days
$515 in 10 days
$526 in 12 days