Cancelled

QtWebKit URL rect extractor shell script

Task:

A command-line shell script that, given the URL of a web page, returns a CSV list of URLs and bounding rectangles (in page, not screen, coordinates) for every link on the page (whether text or image), or an appropriate error status and message.
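
The "CSV or error status and message" contract could be wrapped roughly as below. This is only a sketch: the exit code values, the `run` function and the `extract` callable (anything that takes a URL and returns CSV text, or raises on failure) are assumptions for illustration, not part of the brief.

```python
import sys

# Hypothetical exit codes; the brief only asks for "an appropriate error status".
EXIT_OK = 0
EXIT_FAILURE = 1
EXIT_USAGE = 2


def run(argv, extract, out=None, err=None):
    """Write CSV to out on success, or a message to err plus a non-zero status."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    prog = argv[0] if argv else "extract-rects"
    if len(argv) != 2:
        err.write("usage: %s URL\n" % prog)
        return EXIT_USAGE
    try:
        csv_text = extract(argv[1])  # assumed to raise on load/parse failure
    except Exception as exc:
        err.write("error: %s\n" % exc)
        return EXIT_FAILURE
    out.write(csv_text)
    return EXIT_OK
```

A script would typically end with `sys.exit(run(sys.argv, some_extractor))`, so that a calling shell script can test `$?` and read the message from stderr.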

Expected approach:

Load the web page in a server-side browser (QtWebKit), inspect the DOM to retrieve each link and its bounding rectangle, and output the list as 5-element CSV rows, each consisting of the URL and 4 rectangle values (x,y,w,h).
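
A minimal sketch of this approach in Python, assuming the PyQt4 bindings under Python 3 (the posting does not say which QtWebKit binding is installed). The names `extract_links` and `rows_to_csv`, the 1024-pixel viewport width and the 30-second timeout are all illustrative assumptions. `QWebElement.geometry()` is relative to the containing frame, which is assumed here to match page coordinates for links in the main frame.

```python
import csv
import io


def rows_to_csv(rows):
    """Format (url, x, y, w, h) tuples as 5-element CSV lines."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for url, x, y, w, h in rows:
        writer.writerow([url, int(x), int(y), int(w), int(h)])
    return buf.getvalue()


def extract_links(url, width=1024, timeout_ms=30000):
    """Load url in QtWebKit and return (href, x, y, w, h) for each link."""
    # Assumption: PyQt4 on Python 3. Imported lazily so that rows_to_csv
    # can be used and tested on a machine without Qt installed.
    from PyQt4.QtCore import QEventLoop, QSize, QTimer, QUrl
    from PyQt4.QtGui import QApplication
    from PyQt4.QtWebKit import QWebPage

    app = QApplication.instance() or QApplication([])  # needs an X display
    page = QWebPage()
    page.setViewportSize(QSize(width, 768))  # viewport width affects layout
    loop = QEventLoop()
    status = {"ok": False}

    def on_finished(ok):
        status["ok"] = ok
        loop.quit()

    page.loadFinished.connect(on_finished)
    QTimer.singleShot(timeout_ms, loop.quit)  # stop waiting after the timeout
    page.mainFrame().load(QUrl(url))
    loop.exec_()
    if not status["ok"]:
        raise RuntimeError("could not load %s" % url)

    frame = page.mainFrame()
    rows = []
    for el in frame.findAllElements("a[href]").toList():
        rect = el.geometry()
        # Resolve relative hrefs against the page's base URL.
        href = frame.baseUrl().resolved(QUrl(el.attribute("href")))
        rows.append((href.toString(), rect.x(), rect.y(),
                     rect.width(), rect.height()))
    return rows
```

Using the `csv` module rather than joining with commas means a URL that itself contains a comma is quoted instead of silently producing a 6-element row.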

Code:

The program may be written in Perl, Python or PHP, with as few external dependencies as possible. It must use the QtWebKit library (it needs pixel-perfect compatibility with other tools built on it) and must be callable from a shell script.

Environment:

The target server runs Ubuntu Linux and has the QtWebKit Python libraries available. The program must run without any graphical output (it's not a desktop app); use of xvfb is allowed. If you have particular requirements, please check with me first. You may find the code in the [url removed, login to view] tool useful.
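
One way to satisfy the "no graphical output" requirement is to wrap the real command in `xvfb-run` whenever no X display is available. The sketch below only builds the command line; `headless_command` is a hypothetical helper, and it assumes the `xvfb-run` wrapper (shipped with the xvfb package on Ubuntu) is on the PATH.

```python
import os
import shutil


def headless_command(argv, environ=None, which=shutil.which):
    """Return argv unchanged if a display exists, else prefixed with xvfb-run."""
    env = os.environ if environ is None else environ
    if env.get("DISPLAY"):
        return list(argv)  # an X server is already available
    if which("xvfb-run") is None:
        raise RuntimeError("no DISPLAY set and xvfb-run not found")
    # -a (--auto-servernum) lets xvfb-run pick a free display number.
    return ["xvfb-run", "-a"] + list(argv)
```

The result can be passed to `subprocess.call`, for example with `[sys.executable, "script.py", url]` as `argv`, where `script.py` stands in for the actual extractor script.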

Deliverables:

The script file and example output.

Other notes:

I'd expect this to be doable in a few hours by someone familiar with these technologies.

See: [url removed, login to view]

Example call and output:

# ./[url removed, login to view] [url removed, login to view]

[url removed, login to view],200,200,200,84

[url removed, login to view],200,400,120,20

Skills: Javascript, Perl, PHP, Python, Shell Script

About the Employer:
( 0 reviews ) Brighton, United Kingdom

Project ID: #594347

4 freelancers are bidding on average $120 for this job

vic116

Hello, thank you for the link to the sources provided. I understand your requirements and can code a solution, but since I am a newbie at crawling with WebKit, it will take several days to deliver. Thanks.

$150 USD in 7 days
(7 Reviews)
4.1
pivanush

I have made a similar script; it would perhaps just need some modifications.

$200 USD in 5 days
(7 Reviews)
3.9
crezmer

Hi sir, I am ready to work on this project. I have made many sites in PHP, so please give me a chance to work with you. Thanks.

$100 USD in 2 days
(0 Reviews)
0.0
toronto0011

Hi. We have done a similar job. See PM. Thank you, Project manager

$30 USD in 2 days
(0 Reviews)
0.0