I need software to extract all cached pages from Google's cache for a specific domain.
Example: I input: [login to view URL]
It has to check each cached result on Google one by one and extract it into an HTML file (or a file with any other extension). Each file must be saved using a folder for every slash, ending with a filename that matches its original URL.
Example cached page:
[login to view URL]
So the saved copy under the root folder would be: anyfolder/anyotherfolder/[[[login to view URL]]]
Example cached page:
[login to view URL]
So the saved copy under the root folder would be: anyfolder/anyotherfolder/[[[login to view URL]]]
[[ ]] = filename
It has to create a folder for every slash ("/") and then save the full file name ([login to view URL] for the last example; or [login to view URL] in the first example, since that URL ends with a slash no filename is given, so [login to view URL] must be created in that folder).
What I want to do is simply recover sites from Google's cache, preserving the cached paths.
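The URL-to-path mapping described above could be sketched as follows. This is a minimal illustration, not part of the posting: the function name `local_path_for`, the `output` root folder, and the choice of `index.html` as the fallback filename for slash-terminated URLs are all assumptions for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

def local_path_for(url: str, root: str = "output") -> str:
    """Map a cached page's original URL to a local file path.

    One folder per path segment; URLs ending in a slash (no filename
    given) fall back to a hypothetical index.html in that folder.
    """
    parsed = urlparse(url)
    path = parsed.path
    if path == "" or path.endswith("/"):
        path += "index.html"  # assumed fallback name, not from the posting
    return str(PurePosixPath(root) / parsed.netloc / path.lstrip("/"))
```

For example, `local_path_for("http://example.com/a/b/")` would yield `output/example.com/a/b/index.html`, while a URL ending in a real filename keeps that filename.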
Of course, the Google cache top frame must be removed from every file.
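One way to remove that frame, sketched here under the assumption that Google's cache wrapper prepends its banner markup ahead of the original page's own `<html>` tag (the exact structure of the cache wrapper is not specified in the posting and would need to be verified against real cached pages):

```python
import re

def strip_cache_banner(html: str) -> str:
    """Drop everything before the cached document's own <html> tag.

    Assumes the cache banner precedes the original markup, so the
    second "<html" occurrence marks the start of the real page.
    Adjust the marker if the banner is injected elsewhere.
    """
    starts = [m.start() for m in re.finditer(r"<html\b", html, re.I)]
    if len(starts) >= 2:
        return html[starts[1]:]
    return html  # no wrapper detected; leave the document untouched
```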
Also, it needs proxy support to avoid Google CAPTCHAs after a certain number of requests.
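A simple round-robin rotation over a proxy pool is one way to provide that support; the sketch below returns proxy dicts in the format expected by the `requests` library's `proxies=` argument. The class name and the idea of rotating on every request (rather than only after a CAPTCHA or 429 response) are assumptions for illustration.

```python
from itertools import cycle

class ProxyRotator:
    """Round-robin proxy pool: hand out a different exit proxy on each
    request to spread load and delay CAPTCHA triggers."""

    def __init__(self, proxies):
        self._pool = cycle(proxies)

    def next_proxy(self):
        # Shape matches requests' `proxies=` keyword argument.
        p = next(self._pool)
        return {"http": p, "https": p}
```

Usage would look like `requests.get(url, proxies=rotator.next_proxy(), timeout=30)`; the proxy URLs themselves are placeholders the buyer would supply.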
If anything is unclear, please contact me.
Hi sir,
I am a scraping expert and have completed many similar projects; please check my feedback and you will see.
Can you tell me more details? Then I will provide demo data for you.
Thanks,
Kimi