Currently a user visits the site, uploads music (MP3s in a zip file), and submits. The site is built on WordPress, so each submission becomes a pending post for the admins. We log in and either approve (publish) the content or delete it. At the moment, "publishing" causes the server to take the zip file, unzip it, extract only the MP3s (so people can't upload anything malicious), build a new zip containing only the MP3s, and then upload both the individual MP3s and the zip file to Amazon S3, from which they are served to the site's streaming player.
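As a rough sketch of the unzip-and-repack step described above (assuming Python is acceptable for the processing script; the function name and paths are hypothetical):

```python
import os
import zipfile

def repack_mp3s(src_zip_path, dest_zip_path):
    """Copy only the .mp3 entries from src_zip_path into a fresh zip at
    dest_zip_path, flattening any folder structure. Returns the kept names."""
    kept = []
    with zipfile.ZipFile(src_zip_path) as src, \
         zipfile.ZipFile(dest_zip_path, "w", zipfile.ZIP_DEFLATED) as dest:
        for info in src.infolist():
            name = os.path.basename(info.filename)  # drop any folder paths
            if name.lower().endswith(".mp3"):
                dest.writestr(name, src.read(info))  # copy the entry across
                kept.append(name)
    return kept
```

Filtering on the extension alone is the same check the current script implies; a stricter version could also sniff the MP3 header bytes before keeping a file.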
That all works: the data is fed back to our streaming player and the zip files go to the download buttons. But it is currently a little too simplistic. For a start, all the processing happens on the web servers that serve pages to the public, which uses unnecessary resources. We also want to incorporate FFmpeg into the script to encode the streaming MP3s at 128 kbps (and remove the embedded artwork in the process, as it can make streaming on mobile take up to 40 seconds to start!), leaving the original MP3s intact for individual downloads via the player and for building the new download zip the way it happens now.
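The 128 kbps encode with artwork removal might look like the following sketch. `-vn` drops the attached-picture stream that carries the embedded cover art, and `libmp3lame` is assumed to be present in the server's FFmpeg build:

```python
def ffmpeg_stream_cmd(src_mp3, dest_mp3):
    """Build an ffmpeg command line that re-encodes an MP3 at 128 kbps
    and strips the embedded artwork for the streaming copy."""
    return [
        "ffmpeg", "-y",          # overwrite the output if it exists
        "-i", src_mp3,
        "-vn",                   # discard the attached-picture (cover art) stream
        "-c:a", "libmp3lame",    # assumes this encoder is in the ffmpeg build
        "-b:a", "128k",          # constant 128 kbps for the streaming player
        dest_mp3,
    ]
```

The command list would then be handed to something like `subprocess.run()` on the processing server, leaving the original upload untouched for downloads.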
Most of that script is built, though it could probably be optimised. The biggest change we need is to move this processing off the web server itself. We're running micro EC2 instances that scale up and down with end-user demand. We don't want to install FFmpeg on these, which means we can't keep a spare one for publishing the uploads (currently, when we publish, the processing happens on that server). To keep the uploads in sync across all the servers, we have GlusterFS running on another instance with a 12 GB disk and a failover setup; uploaded files go there, and the volume is automatically mounted on every EC2 instance that launches. This, coupled with the RDS database, means the same data is on every instance. When we publish, we'd like to trigger the unzip, FFmpeg encode, re-zip, and upload steps on the server hosting the Gluster file system (and thus where the zip files are actually located), so all the processing happens away from the servers handling web requests.
We need the script to do this (via cURL? or any other method that works reliably) and to be easy to extend with extra functions once done. For instance, there are plenty of demo scripts that use FFmpeg to build a video from an audio source and supplied images and upload it to YouTube; if that could happen on publish, along with auto tweets, pins, etc., then promotion would happen as and when a post is published.
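One way WordPress could trigger the processing via cURL is to POST a signed payload to a small listener on the Gluster box. Everything here is a placeholder sketch — the endpoint URL, field names, and shared secret are assumptions, and the HMAC signature is just one way to let the listener reject requests that don't come from WordPress:

```python
import hashlib
import hmac
import json

def build_trigger_cmd(endpoint, post_id, zip_path, secret):
    """Build a curl command that asks the processing server to handle one
    upload. The payload is signed so the listener can verify the sender."""
    payload = json.dumps({"post_id": post_id, "zip": zip_path}, sort_keys=True)
    sig = hmac.new(secret.encode(), payload.encode(), hashlib.sha256).hexdigest()
    return [
        "curl", "-s", "-X", "POST", endpoint,
        "-H", "Content-Type: application/json",
        "-H", "X-Signature: " + sig,      # hypothetical header the listener checks
        "-d", payload,
    ]
```

The same request could equally be made from PHP with `wp_remote_post()`; the point is only that the web server sends a small message and the Gluster box does the heavy lifting.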
We also need the script to run as a background task, rather than clicking submit and having the PHP timeout settings maxed out while WordPress waits for the post to go live. If we could submit and have jobs queue up on the processing server (e.g. submit 30 posts at once and have all the jobs queue), with the posts automatically going live once processing is complete, that would be the ideal setup.
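The queueing behaviour could be sketched as a simple job table that a worker on the Gluster box polls. SQLite is used here only to keep the sketch self-contained — in production the table could just as well live in the existing RDS database, with statuses updated as each post's processing finishes:

```python
import sqlite3

SCHEMA = """CREATE TABLE IF NOT EXISTS jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    zip_path TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
)"""

def enqueue(conn, zip_path):
    """Add a pending job; publishing 30 posts just inserts 30 rows."""
    with conn:
        conn.execute("INSERT INTO jobs (zip_path) VALUES (?)", (zip_path,))

def claim_next_job(conn):
    """Claim the oldest pending job and mark it running; None if queue empty."""
    with conn:  # one transaction so the claim is not double-taken
        row = conn.execute(
            "SELECT id, zip_path FROM jobs WHERE status = 'pending' "
            "ORDER BY id LIMIT 1").fetchone()
        if row is None:
            return None
        conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
        return row
```

Once a job finishes, the worker would flip the corresponding WordPress post to published (and update the job row), which is how the posts "go live when processing is complete".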
The basic functions are written and working, so little to nothing needs writing from scratch, and the servers are already set up. To put it into bullet points, we need the following changed in our process:
Optimise the current script.
Execute the script on the GlusterFS instance so processing happens on servers away from the web servers.
Incorporate FFmpeg into the script to encode 128 kbps files for the streaming player.
Have the script work in the background, enabling us to publish multiple posts at once; each goes live when processing is completed on the other server and the info is fed back to the WordPress database.
The uploader's e-mail is collected into a Mailchimp database on upload; they should be notified by e-mail once their post goes live.
We imagine that, for someone with the required skills, adapting the script would take no more than 3-4 hours of work.
For more information, just e-mail. Please quote "Script Optimisation" in your application so we know you've read this through.