The project is a scraper that continuously crawls vendors' websites, finds the firmware files available for IoT/connected devices, and downloads them, storing their details in a centralized database. It should download only binary firmware files, not text files such as release notes, and not firmware installers except in the case described below. It should download new files as soon as they become available on a vendor's website (the interval between sync requests must be configurable per vendor), and it should support supplying credentials or any other information a vendor requires in order to download a firmware file or retrieve its details.
If a vendor provides an installer for a specific firmware rather than the firmware itself, and the firmware download link can be extracted from the installer, the scraper should extract the link and download the firmware. If there is no way to obtain the firmware without installation, the installer itself should simply be downloaded and marked as "installer" in the database (this requires adding another binary field called "installer" to the database design). The database must also make clear which vendor and firmware each installer belongs to. Note that:
1. The code should be written in Python
2. The system on which the script will run has no GUI, so the script must not require any GUI application as a dependency.
3. The list of vendors will be provided by us, and the scraper must download all firmware for all devices, including every version of each firmware available on the vendors' websites, without skipping any of them. We are open to suggestions from the Developer regarding new vendors, but developing a scraper for them requires our confirmation.
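The per-vendor sync interval, the optional credentials, and the rule about skipping text files and flagging installers could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `VendorConfig`, `is_due`, and `classify_file`, as well as the extension lists, are hypothetical and not part of the brief, and a real scraper would likely need to inspect file contents rather than rely on extensions alone.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from pathlib import PurePosixPath
from typing import Dict, Optional

# Assumed extension lists (hypothetical); vendors may need content-based checks.
TEXT_EXTENSIONS = {".txt", ".pdf", ".html", ".htm", ".md", ".doc", ".docx"}
INSTALLER_EXTENSIONS = {".exe", ".msi", ".dmg", ".pkg"}


@dataclass
class VendorConfig:
    """Per-vendor settings: where to crawl, how often, and any credentials."""
    name: str
    base_url: str
    sync_interval: timedelta
    credentials: Dict[str, str] = field(default_factory=dict)
    last_sync: Optional[datetime] = None

    def is_due(self, now: datetime) -> bool:
        """Return True if the vendor has never been synced or the interval has elapsed."""
        if self.last_sync is None:
            return True
        return now - self.last_sync >= self.sync_interval


def classify_file(filename: str) -> str:
    """Classify a file name as 'skip' (text), 'installer', or 'firmware'."""
    suffix = PurePosixPath(filename.lower()).suffix
    if suffix in TEXT_EXTENSIONS:
        return "skip"
    if suffix in INSTALLER_EXTENSIONS:
        return "installer"
    return "firmware"
```

A crawl loop would then check `is_due` for each vendor before sending a new request, and use the `classify_file` result to decide whether to download a file and whether to set the "installer" field.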
The system should download the firmware files to a path we define and save their details in a SQLite database. The mandatory database fields are (Manufacturer, Model, Version, Type, Name, Release Date (if available)), e.g. (Cisco, Video Surveillance 6030 IP Camera, 2.7.0, IP Camera, [login to view URL], 21/08/2015). There is also a non-mandatory binary field that indicates whether the device is discontinued, depending on whether the vendor mentions this on the website.
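One possible shape for the SQLite table is sketched below, using Python's standard `sqlite3` module. The table name, column names, uniqueness rule, ISO date format, and the helpers `open_db` and `record_firmware` are all assumptions made for illustration; the brief only fixes the list of fields and the two binary flags.

```python
import sqlite3

# Hypothetical schema: column names mirror the fields listed in the brief.
SCHEMA = """
CREATE TABLE IF NOT EXISTS firmware (
    id            INTEGER PRIMARY KEY,
    manufacturer  TEXT NOT NULL,
    model         TEXT NOT NULL,
    version       TEXT NOT NULL,
    type          TEXT NOT NULL,
    name          TEXT NOT NULL,
    release_date  TEXT,                                   -- optional, stored as ISO 8601
    discontinued  INTEGER CHECK (discontinued IN (0, 1)), -- optional binary flag
    installer     INTEGER NOT NULL DEFAULT 0 CHECK (installer IN (0, 1)),
    UNIQUE (manufacturer, model, version, name)
);
"""


def open_db(path: str) -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_firmware(conn: sqlite3.Connection, row: dict) -> bool:
    """Insert one firmware record; return False if it was already stored."""
    params = {"release_date": None, "discontinued": None, "installer": 0, **row}
    cur = conn.execute(
        "INSERT OR IGNORE INTO firmware "
        "(manufacturer, model, version, type, name, release_date, discontinued, installer) "
        "VALUES (:manufacturer, :model, :version, :type, :name, "
        ":release_date, :discontinued, :installer)",
        params,
    )
    conn.commit()
    return cur.rowcount == 1
```

Because the installer flag sits in the same row as the manufacturer, model, and version, it stays clear which vendor and firmware an installer belongs to. The uniqueness constraint is one way to avoid storing the same version twice on repeated syncs.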
1) Python source code, including comments in the code explaining each function and its details. We should be able to pass any required input as an argument and run the scraper as a one-line command in the Linux terminal.
3) Manual to install, configure and use the scraper
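The one-line command requirement in the deliverables above could be met with a standard `argparse` interface along these lines. The program name and every option shown (`--vendors`, `--download-dir`, `--db`, `--once`) are hypothetical choices for illustration, not names specified in the brief.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build the command-line interface (all option names are assumed)."""
    parser = argparse.ArgumentParser(
        prog="firmware_scraper",
        description="Crawl vendor websites and download firmware files.",
    )
    parser.add_argument("--vendors", required=True,
                        help="path to the per-vendor configuration file")
    parser.add_argument("--download-dir", required=True,
                        help="directory where firmware files are saved")
    parser.add_argument("--db", default="firmware.db",
                        help="path to the SQLite database file")
    parser.add_argument("--once", action="store_true",
                        help="run a single sync pass instead of crawling continuously")
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    """Parse arguments and report the chosen settings; crawling is left out here."""
    args = build_parser().parse_args(argv)
    print(f"vendors={args.vendors} download_dir={args.download_dir} db={args.db}")
    return 0
```

With an entry point that calls `main()`, the whole run fits on one line with no GUI involved, for example `python3 firmware_scraper.py --vendors vendors.json --download-dir /data/firmware --db firmware.db`.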
54 freelancers are bidding on average €1087 for this job
Hi, I'm very interested in your project, as I have extensive experience with Python-based web scraping for features like these. I would like to discuss more details via chat. Thanks.
Hi there. I have read your description carefully and I am very interested in your project. I have strong Python skills and experience with web scraping. Please contact me to discuss further. Thanks.