I need software that crawls the web (with the option to restrict it to a top-level domain such as .com, .in, .net, or .us) and searches meta tags and site content for text matching a regular expression.
The purpose is, for example, to find WordPress-based sites, including their version. The output will be: Site Name, Site URL, Site Version (which must be extracted from the page, for example with a regular expression), and Date and Time of the extraction.
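To make the requested output concrete, here is a rough sketch in Python of the extraction step only. It assumes the site announces itself through a `<meta name="generator" content="WordPress x.y.z">` tag with `name` appearing before `content`, which many but not all WordPress sites do. The function name `extract_record`, the field names, and the `tld` parameter are illustrative choices, not part of the request. Fetching pages and following links are left out.

```python
import re
from datetime import datetime, timezone
from urllib.parse import urlparse

# Assumption: the version is exposed in a generator meta tag,
# with the name attribute written before the content attribute.
GENERATOR_RE = re.compile(
    r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']WordPress\s*([\d.]+)?["\']',
    re.IGNORECASE,
)
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_record(url, html, tld=None):
    """Return one output row for a WordPress page, or None if it does not match.

    `tld` is a hypothetical filter such as ".com" or "in".
    """
    host = urlparse(url).hostname or ""
    if tld and not host.endswith("." + tld.lstrip(".")):
        return None  # outside the requested top-level domain

    generator = GENERATOR_RE.search(html)
    if generator is None:
        return None  # no WordPress generator tag found

    title = TITLE_RE.search(html)
    return {
        "site_name": title.group(1).strip() if title else "",
        "site_url": url,
        "site_version": generator.group(1) or "unknown",
        "extracted_at": datetime.now(timezone.utc).isoformat(),
    }
```

A crawler would call `extract_record(url, html, tld=".com")` for every page it downloads and write the non-empty results to a CSV file or database. Sites that hide the generator tag would need a different pattern, so the regular expression should stay configurable.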
6 freelancers are bidding on average $87 for this job
Hi. I'm experienced in programming crawlers that, given a top-level site, follow its links and search the linked pages' content to extract the requested data. Regards
I am an expert Perl programmer with a good knowledge of regular expressions (regex). I have already designed similar tools (using Google results to find websites that match the regex). Greetings!