I have a scraper script that collects the available timeslots from a website and saves the results to a database.
1) For some reason, the first timeslots are scraped in 2 to 3 seconds each, whereas the last ones take up to 15 seconds each. This does not make much sense, and I believe there is a fix for it.
2) Second, the scraper adds each timeslot to a cart on the target website, which ends up creating a cart with a huge total. I believe this is not good either, and it may explain the first issue. We should be able to fix that as well.
3) Last, the scraper runs every hour via cron. It deletes the previous database contents and then adds the new results. The problem is that after the delete, the scrape takes up to 30 to 40 minutes (probably less once the first issue is fixed), so during that time the data is missing from the database. I need the database to be updated only once the scraper has finished its job, or some alternative solution.
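
For the third point, one possible approach is to collect all scraped rows in memory first and only then replace the old rows inside a single database transaction, so readers keep seeing the previous data until the new data is committed. The sketch below is only an illustration: it assumes SQLite through Python's standard `sqlite3` module, and the table name `timeslots`, its columns, and the function `replace_timeslots` are hypothetical, since the actual database and schema are not described here.

```python
import sqlite3


def replace_timeslots(conn, rows):
    """Replace all rows of the (hypothetical) timeslots table in one transaction."""
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so a failed run leaves the previous data untouched.
    with conn:
        conn.execute("DELETE FROM timeslots")
        conn.executemany(
            "INSERT INTO timeslots (slot_start, price) VALUES (?, ?)",
            rows,
        )


# Demo with an in-memory database; a real run would open the existing one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE timeslots (slot_start TEXT, price REAL)")

# Result of the previous hourly run (made-up sample values).
replace_timeslots(conn, [("2024-01-01 09:00", 10.0)])

# The scraper finishes first and gathers everything in a list;
# the database is only touched afterwards, in one short transaction.
scraped = [("2024-01-02 09:00", 12.0), ("2024-01-02 10:00", 12.0)]
replace_timeslots(conn, scraped)
```

With this layout the long scraping phase never leaves the table empty, because the delete and the inserts become visible together at commit time. The same idea should carry over to other databases that support transactions, though the exact calls would differ.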