I need a simple script/program that goes through a list of URLs (stored in a .txt file, one URL per line), and extracts two pieces of data from the source code on each URL. The bits of source code that need identifying/extracting look like this:
2. "url":"https:\/\/[login to view URL]\/ID\/[login to view URL]
NAME, ID and FILENAME in the examples above will be different for each URL. NAME and FILENAME are the values that should be extracted and saved, preferably to a .csv file, along with the source URL, in the following column order:
URL, NAME, FILENAME
One URL plus its extracted values per line.
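A minimal Python sketch of the extraction and CSV step described above. The real site and the first source snippet are not shown in this post, so the `"name":"..."` pattern, the `example.com` host, and the helper names `extract` and `write_rows` are all assumptions for illustration and would need to be adapted once the actual markup is known.

```python
import csv
import io
import re

# ASSUMPTION: the NAME snippet is not shown in the post, so this key is hypothetical.
NAME_RE = re.compile(r'"name":"([^"]*)"')
# Matches "url":"https:\/\/<host>\/<ID>\/<FILENAME>" with JSON-escaped slashes.
# Group 1 is the ID, group 2 is the FILENAME.
FILE_RE = re.compile(r'"url":"https?:\\/\\/[^"]*?\\/([^"\\/]+)\\/([^"\\/]+)"')


def extract(html):
    """Return (name, filename); either is '' when its pattern is not found."""
    name = NAME_RE.search(html)
    file = FILE_RE.search(html)
    return (name.group(1) if name else "", file.group(2) if file else "")


def write_rows(rows, out):
    """Write a header plus one URL, NAME, FILENAME row per source URL."""
    writer = csv.writer(out)
    writer.writerow(["URL", "NAME", "FILENAME"])
    writer.writerows(rows)


# Hypothetical page source, made up for this example.
sample = r'{"name":"Blue Sky","url":"https:\/\/example.com\/12345\/blue-sky.jpg"}'
buf = io.StringIO()
write_rows([("https://example.com/page1", *extract(sample))], buf)
print(buf.getvalue())
```

When writing to a real file instead of `io.StringIO`, opening it with `open(path, "w", newline="", encoding="utf-8")` avoids blank lines between rows on Windows.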
There may be times a URL doesn't exist or gives an error message, in which case it should either be skipped or have an error message written to the .csv file on that URL's line.
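One way the skip-or-report behaviour could look in Python, using only the standard library. This is a sketch under assumptions: the function name `fetch`, the 10-second timeout, and the error-message wording are illustrative choices, not requirements from the post. The `opener` parameter exists only so the function can be exercised without network access.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return (html, error). Exactly one of the two is None."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace"), None
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500.
        return None, "HTTP %d" % exc.code
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # DNS failure, refused connection, timeout, or a malformed URL.
        return None, "ERROR: %s" % exc
```

The caller can then either drop the URL when `error` is set, or write a row such as `[url, error, ""]` so the failure is visible in the .csv file.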
I'd also like there to be an adjustable time limit on crawl frequency, in case the site has request limits. One URL per 0.5 seconds by default.
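The adjustable delay could be sketched as a small throttle object like the one below. The class name `Throttle` and its parameters are hypothetical; `clock` and `sleep` are injectable only to make the behaviour easy to check.

```python
import time


class Throttle:
    """Ensure at least `interval` seconds pass between calls to wait()."""

    def __init__(self, interval=0.5, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                # Sleep only for the part of the interval not already used up.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` before each request keeps the rate at or below one URL per `interval` seconds. At the default of 0.5 seconds, a list of roughly 10,000 URLs needs at least 10,000 × 0.5 = 5,000 seconds, or about 83 minutes, regardless of how fast the script itself is.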
This script/program needs to run on 64-bit Windows 7 and later. A graphical interface isn't necessary as long as it's easy to run; it's more important that the script/program is lightweight and fast, and that the budget isn't too high.
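Without a graphical interface, "easy to run" could mean a single command with sensible defaults. The sketch below shows one possible command-line front end in Python; the option names, the default output file `results.csv`, and the helper names are assumptions. Python 3.8 is the last release that supports Windows 7, so the sketch sticks to standard-library features available there.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(
        description="Extract NAME and FILENAME from a list of URLs.")
    parser.add_argument("input", help="text file with one URL per line")
    parser.add_argument("-o", "--output", default="results.csv",
                        help="CSV file to write (default: results.csv)")
    parser.add_argument("-d", "--delay", type=float, default=0.5,
                        help="seconds between requests (default: 0.5)")
    return parser.parse_args(argv)


def read_urls(path):
    """Return the non-empty lines of the URL list, stripped of whitespace."""
    with open(path, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]
```

With a front end like this, a run could look like `python extract.py urls.txt -o results.csv -d 0.5`, where `extract.py` is a hypothetical script name.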
The .txt file will contain roughly 10,000 URLs. I can divulge the specific site name for the extraction (it will need to be changed in the source code example) upon project start. It's fully legal; it's an open resource site.
Looking forward to your proposals!
14 freelancers are bidding on average $33 for this job
Hello, as an experienced software developer, I can provide you with a C#-based solution to extract the required values from the URLs. Would you please share some example URLs so that I can check? Thanks
I am well experienced with these scripts and can complete this today. Payment will be after completion of the work only. The rest we can discuss on chat. Thanks for your time.