Project Description:
I need an MP3 crawler / spider that crawls websites across the web for audio files and adds each one to a database along with the information associated with the file: title, artist, size, bitrate, length, source, and so on.
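To illustrate the kind of record I have in mind, here is a minimal sketch only. It assumes SQLite through Python's standard `sqlite3` module, and the table name, column names, and the `open_db` / `add_track` helpers are all hypothetical. The final script may use a different database and layout.

```python
import sqlite3

# Hypothetical schema: one row per audio file found, keyed by its source URL.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tracks (
    id           INTEGER PRIMARY KEY,
    source_url   TEXT NOT NULL UNIQUE,  -- where the file was found
    title        TEXT,
    artist       TEXT,
    size_bytes   INTEGER,
    bitrate_kbps INTEGER,
    length_secs  INTEGER,
    last_checked TEXT                   -- timestamp of the last link check
)
"""

ALLOWED_FIELDS = {"title", "artist", "size_bytes", "bitrate_kbps", "length_secs"}


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_track(conn, source_url, **fields):
    """Insert a track. Return True if a row was added, False if the URL was already stored."""
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    names = sorted(fields)
    columns = ["source_url"] + names
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT OR IGNORE INTO tracks ({', '.join(columns)}) VALUES ({placeholders})"
    cur = conn.execute(sql, [source_url] + [fields[n] for n in names])
    conn.commit()
    return cur.rowcount == 1
```

The `UNIQUE` constraint on `source_url` means a file found twice by the crawler is stored only once.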
It must also capture all of the ID3 tag information. This has to be done remotely; we will NOT be saving audio files to the server.
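One possible way to read tags remotely is to request only the first bytes of each file with an HTTP `Range` header and parse the ID3v2 tag in memory. The sketch below is an assumption about approach, not a required design. The function names are hypothetical, it only handles text frames in ID3v2.3 and ID3v2.4, and it ignores extended headers, unsynchronisation, ID3v2.2, and ID3v1 tags at the end of the file.

```python
import urllib.request

# Text-frame encoding byte as defined by ID3v2.
ENCODINGS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}


def synchsafe(b):
    """Decode a 4-byte synchsafe integer (7 usable bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def id3v2_tag_size(header):
    """Return the total tag length in bytes (10-byte header included), or 0 if absent."""
    if len(header) < 10 or header[:3] != b"ID3":
        return 0
    return 10 + synchsafe(header[6:10])


def parse_text_frames(tag):
    """Return {frame_id: text} for text frames such as TIT2 (title) and TPE1 (artist)."""
    total = id3v2_tag_size(tag)
    if total == 0 or tag[3] not in (3, 4):
        return {}
    frames, pos, end = {}, 10, min(len(tag), total)
    while pos + 10 <= end:
        frame_id = tag[pos:pos + 4]
        if frame_id == b"\x00\x00\x00\x00":  # padding reached
            break
        raw = tag[pos + 4:pos + 8]
        # v2.4 stores frame sizes as synchsafe integers, v2.3 as plain big-endian.
        size = synchsafe(raw) if tag[3] == 4 else int.from_bytes(raw, "big")
        body = tag[pos + 10:pos + 10 + size]
        if frame_id.startswith(b"T") and frame_id != b"TXXX" and body:
            text = body[1:].decode(ENCODINGS.get(body[0], "latin-1"), errors="replace")
            frames[frame_id.decode("ascii", errors="replace")] = text.rstrip("\x00")
        pos += 10 + size
    return frames


def fetch_range(url, start, end, timeout=10):
    """Download bytes start..end (inclusive) without saving anything to disk."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # Cap the read in case the server ignores Range and sends the whole file.
        return resp.read(end - start + 1)


def read_remote_id3(url):
    """Fetch only the tag portion of a remote MP3 and return its text frames."""
    total = id3v2_tag_size(fetch_range(url, 0, 9))
    return parse_text_frames(fetch_range(url, 0, total - 1)) if total else {}
```

With this approach only the tag bytes are transferred, typically a small fraction of the file, and nothing is written to the server's disk.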
The script must also be able to check the links in the database and confirm the files still exist, running as a cron job during off-peak traffic hours so it won't choke the server.
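As a rough sketch of what the link check could look like, the code below sends a `HEAD` request per stored URL so that no file body is downloaded. The function names are hypothetical, and treating any 2xx or 3xx status as "still exists" is an assumption. Some servers reject `HEAD`, so a real implementation may need a ranged `GET` fallback.

```python
import urllib.error
import urllib.request


def status_means_alive(status):
    """Assumption: any 2xx or 3xx response means the file is still there."""
    return 200 <= status < 400


def check_link(url, timeout=10):
    """Return True if the URL still responds, False if it is dead or unreachable."""
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return status_means_alive(resp.status)
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError and carry the status code.
        return status_means_alive(exc.code)
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return False


def find_dead_links(urls, checker=check_link):
    """Return the URLs that failed the check, in their original order."""
    return [url for url in urls if not checker(url)]
```

A crontab entry such as `0 3 * * * /usr/bin/python3 /path/to/check_links.py` would run the check daily at 03:00; the script path and the hour are placeholders, and the actual off-peak window depends on the site's traffic.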
This script should work like Mp3skull.com: very fast and efficient.
The script will be hosted on a high-end dedicated server, so server resources shouldn't be a concern, but I still expect the script to be perfectly optimized, commented, indented, and easy to extend (multiple databases, more features, etc.).
Applicants with previous experience will be preferred. A portfolio or examples of previous jobs will also be required.
This will be integrated into a high-traffic music site, so it should be able to handle huge amounts of traffic.