Closed

Need a fast script to parse HTML using wget or curl

Hello,

The script should:

1- Crawl the given webpage.

2- Parse all the URLs in the page using different regular expressions. (They do not even have to start with a href or http.)

For example: parse all URLs with rar, zip, mp3, etc. extensions, and all mediafire, rapidshare, etc. URLs.

3- It should be able to log in, or load cookies to log in, to specific webpages such as forums in order to get the links.

4- It must be as fast as possible and stable :).
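
As a rough sketch of point 2 (not a finished crawler), the filter below reads HTML on stdin, splits it into tokens on quotes, angle brackets and whitespace, and keeps the tokens that end in a wanted extension or mention a wanted host. The function name extract_links, the EXTS and HOSTS variables, and the .com domains are assumptions made for illustration; only rar, zip, mp3, mediafire and rapidshare come from the description above.

```shell
#!/bin/sh
# Sketch only: illustrative lists, extend as needed.
EXTS='rar|zip|mp3'
HOSTS='mediafire\.com|rapidshare\.com'

# extract_links: HTML on stdin, matching links on stdout (one per line).
extract_links() {
    # Turn quotes, angle brackets and whitespace into newlines so that a
    # link does not have to sit inside href="..." or start with http.
    tr -s "\"'<> \t\r\n" '[\n*]' |
    # Keep tokens ending in a wanted extension (optionally followed by a
    # query string or fragment), or containing a wanted host name.
    grep -E -i "\.(${EXTS})(\?[^#]*)?(#.*)?\$|(^|[/.@])(${HOSTS})(/|\$|:)"
}
```

The function only filters text, so it can sit behind any fetcher, for example wget -q -O - or curl -s piped into extract_links. A regex filter like this is fast but approximate: it will miss links that are assembled by JavaScript.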

It can be a shell script, Perl, C, etc. The important part is that it should be fast and not use many resources. Advice about platforms or techniques is welcome.
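
For point 3, one lightweight option that stays in shell is curl's cookie jar: -c writes the cookies received at login to a file, and -b sends them back on later requests (-b also accepts a Netscape-format cookie file exported from a browser). The sketch below assumes a plain form-based login; the function names login_once and fetch_page, the CURL override variable, and any form field names passed in are hypothetical and site-specific.

```shell
#!/bin/sh
# Sketch only. CURL can be overridden to point at a specific curl binary.
CURL=${CURL:-curl}

# login_once JAR LOGIN_URL POST_DATA
# Submit the login form once and store the session cookies in JAR.
# POST_DATA is the url-encoded form body; its field names depend on the site.
login_once() {
    "$CURL" -s -L -c "$1" -d "$3" -o /dev/null "$2"
}

# fetch_page JAR URL
# Fetch URL with the stored cookies and write the HTML to stdout.
fetch_page() {
    "$CURL" -s -L --compressed -b "$1" "$2"
}
```

A possible call sequence would be login_once cookies.txt LOGIN_URL 'user=...&pass=...' followed by fetch_page cookies.txt PAGE_URL piped into the link filter. Logging in once and reusing the jar avoids repeating the login request for every page, which helps with the speed requirement.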

Below is an example of how far I have got on my own; it needs many improvements.

wget -q -U "Mozilla/5.0 (X11; U; Linux i686; pl-PL; rv:1.9.0.2) Gecko/20121223 Ubuntu/9.25 (jaunty) Firefox/3.8" [url removed, login to view] -e robots=off -O - | tr "\t\r\n'" ' "' | grep -i -o '"\(ht\|f\)tps\?:[^"]\+\(.gif\|.apk\|.rar\|.mkv\)"' | sed -e 's/^.*"\([^"]\+\)".*$/\1/g' | uniq
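
One detail worth noting about the pipeline above: uniq only collapses adjacent duplicate lines, so a link that appears again further down the page survives. A minimal sketch of an order-preserving alternative follows; the function name dedupe is an assumption for illustration.

```shell
#!/bin/sh
# dedupe: print each distinct input line once, keeping first-seen order.
# awk remembers every line it has printed, so no sorting is needed.
dedupe() {
    awk '!seen[$0]++'
}
```

If the output order does not matter, sort -u in place of uniq has the same effect. Separately, the unescaped dots in patterns such as .gif and .rar match any character; writing them as \.gif and \.rar would restrict the match to real extensions.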

Thanks in advance.

Skills: Anything Goes, Engineering, Linux, Shell Script


About the Employer:
( 32 reviews ) Istanbul, Turkey

Project ID: #4072695

17 freelancers are bidding on average $154 for this job

SigmaVisual

I can help in your project, please check PMB and our ratings/reviews to get idea of our experience. Please let me know if you have any queries.

$225 USD in 5 days
(48 Reviews)
6.1
ebson

I checked the code. If you only want the regular expression, I can deliver it in less than an hour.

$30 USD in 0 days
(44 Reviews)
5.5
srinichal

I can deliver the project using regex

$180 USD in 5 days
(14 Reviews)
4.4
kandamunlabs

Please see private message.

$250 USD in 4 days
(11 Reviews)
3.7
asmodej

Hello! I would be glad to complete your project using Python. As you can see in my "Past Work" page, I have done very similar projects before (webpage scraping, automated posting). Your application will support adding…

$150 USD in 4 days
(2 Reviews)
3.6
mlambrichs

Obviously it's cool to write a one-liner, but having read your requirements I don't think it's wise. Read my PM and see if you can stand all my insults. ;-)

$225 USD in 3 days
(6 Reviews)
3.3
programer22

PHP5 standalone script. It can be called from crontab.

$250 USD in 10 days
(1 Review)
2.8
morissette

I can do this within 24 hours of bid acceptance.

$120 USD in 1 day
(2 Reviews)
2.2
nithi87cool

Hi Dude, I have enough years of experience to fix your issue and rewrite the script. Please see private message.

$100 USD in 1 day
(1 Review)
2.1
coderz1

I have good exposure to Linux (wget,curl, scripting) and C scripting and I can code your problem in a maximum of 2 days and can deliver it with all the features. I can also provide future support/changes free of cost.

$40 USD in 2 days
(1 Review)
1.2
apwaytechnology

I have worked with regex. I can solve your problem.

$200 USD in 15 days
(0 Reviews)
0.0
Paddy0

Hi, I have written similar scripts to this before, so I'm fully aware of the requirements and the potential issues that would arise, although I would obviously like to communicate with you before I begin coding to ascert…

$225 USD in 4 days
(0 Reviews)
0.0
PerlSQLMaster

Many years of Perl programming experience. Can do the job.

$150 USD in 3 days
(0 Reviews)
0.0
zigler

An expert in shell/Linux/regex from a search engine company. I can finish this job. Please tell me the details. Thanks.

$100 USD in 2 days
(0 Reviews)
0.0
ppan279

Hello, I am a seasoned web scraper using Perl and have extensive experience in shell scripting too. Perl is the easiest and most efficient technology for web scraping jobs because of its robust regular expression s…

$180 USD in 5 days
(0 Reviews)
0.0
tmrlvi

I have written several web scrapers, including ones that log in using browser cookies. With some Python work, it is possible to finish this project in 4 days.

$150 USD in 4 days
(0 Reviews)
0.0
dudytz

I am a professional with 7 years of experience with web extraction and data processing.

$40 USD in 1 day
(0 Reviews)
0.0