The script should:
1- Crawl the given webpage.
2- Parse all the URLs in the page with different regular expressions (they don't even have to start with "a href" or "http").
For example: parse all URLs with rar, zip, mp3, etc. extensions, and all mediafire, rapidshare, etc. URLs.
3- Be able to log in, or load cookies to log in, to specific webpages such as forums, to get the links.
4- Be as fast as possible and stable :).
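To make point 2 concrete, here is a rough sketch of the kind of matching I have in mind, not a finished solution. The function name extract_links, the EXT and HOSTS lists, and the set of characters treated as part of a URL are all assumptions for illustration. It relies on grep supporting -o (GNU and BSD grep do).

```shell
#!/bin/sh
# Hypothetical lists: extend them with the extensions and hosts you care about.
EXT='rar|zip|mp3'
HOSTS='mediafire\.com|rapidshare\.com'

# Characters accepted inside a URL (an assumption; quotes, spaces and <> end a match).
URLCH='[A-Za-z0-9._~:/?#@!$&*+,;=%-]'
# The scheme is optional, so bare links like www.host.com/file.zip also match.
SCHEME='((https?|ftp)://)?'

# extract_links: read HTML or plain text on stdin, print unique matching URLs.
extract_links() {
    grep -Eio \
        -e "${SCHEME}[A-Za-z0-9.-]+\.[A-Za-z]{2,}/${URLCH}*\.(${EXT})" \
        -e "${SCHEME}([A-Za-z0-9-]+\.)*(${HOSTS})/${URLCH}*" |
    sort -u
}
```

It would be used as a filter, for example: wget -q -O - http://rapidog.com | extract_links. One known limitation: the extension pattern has no end-of-word check, so a link ending in .rar.html would be cut at .rar.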
It can be a shell script, Perl, C, etc. The important part is that it should be fast and not use many resources. Advice about platforms or techniques is welcome.
Below is an example of how far I have got; it needs a lot of improvement:
wget -q -U "Mozilla/5.0 (X11; U; Linux i686; pl-PL; rv:188.8.131.52) Gecko/20121223 Ubuntu/9.25 (jaunty) Firefox/3.8" http://rapidog.com -e robots=off -O - | tr "\t\r\n'" ' "' | grep -i -o '"\(ht\|f\)tps\?:[^"]\+\.\(gif\|apk\|rar\|mkv\)"' | sed -e 's/^.*"\([^"]\+\)".*$/\1/' | sort -u
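For point 3, wget itself can store and replay cookies, so one possible approach is sketched below. The function names login and fetch, the cookies.txt file name, and the form field names username and password are assumptions; a real forum will have its own login URL and field names, and credentials containing special characters would need URL-encoding first.

```shell
#!/bin/sh
# File where the session cookies are kept (hypothetical default name).
COOKIES=${COOKIES:-cookies.txt}

# login URL USER PASS: POST the credentials once and save the resulting cookies.
# The field names "username" and "password" are placeholders for the site's own.
login() {
    wget -q -O /dev/null \
        --save-cookies "$COOKIES" --keep-session-cookies \
        --post-data "username=$2&password=$3" \
        "$1"
}

# fetch URL: print the page to stdout, sending the saved cookies.
fetch() {
    wget -q -O - --load-cookies "$COOKIES" "$1"
}
```

After calling login once, the output of fetch can be piped into the same tr/grep/sed chain as above. A cookies.txt exported from a browser in Netscape format should also work with --load-cookies, which would skip the login step.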
Thanks in advance.