We have an S3 bucket where our customers can upload various files, including plist and PDF files.
We need to perform the following operations on a regular basis, using Python on an EC2 instance:
1) Step 1
- Read pending S3 log files, and in each log file:
* List all plist files uploaded by customers
* List all pdf files uploaded by customers
- Archive the S3 log files that have been read (so that they are not processed a second time later)
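
To illustrate what we mean by Step 1, here is a minimal sketch, not a specification. It assumes the logs are standard S3 server access logs, that an upload shows up as a successful REST.PUT.OBJECT or REST.POST.UPLOAD (multipart completion) entry, and that archiving means moving the log under an "archive/" prefix. The names list_uploads, archive_log and archive_prefix are hypothetical, and s3_client is assumed to be a boto3 S3 client.

```python
import re

# Leading fields of an S3 server access log line:
# owner, bucket, [time], remote IP, requester, request ID,
# operation, key, "request-URI", HTTP status
LOG_LINE = re.compile(
    r'^(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+) '
    r'(?P<request_uri>"[^"]*"|-) (?P<status>\S+)'
)

# Assumption: these two operations are what we treat as "an upload".
UPLOAD_OPERATIONS = {"REST.PUT.OBJECT", "REST.POST.UPLOAD"}


def list_uploads(log_text):
    """Return the uploaded plist and pdf keys found in one log file."""
    found = {"plist": [], "pdf": []}
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not a recognizable access log line
        if match["operation"] not in UPLOAD_OPERATIONS:
            continue
        if not match["status"].startswith("2"):
            continue  # ignore failed uploads
        key = match["key"]  # kept as logged; it may need URL-decoding
        lowered = key.lower()
        if lowered.endswith(".plist"):
            found["plist"].append(key)
        elif lowered.endswith(".pdf"):
            found["pdf"].append(key)
    return found


def archive_log(s3_client, bucket, key, archive_prefix="archive/"):
    """Move a processed log file under archive_prefix (copy, then delete)."""
    s3_client.copy_object(
        Bucket=bucket,
        Key=archive_prefix + key,
        CopySource={"Bucket": bucket, "Key": key},
    )
    s3_client.delete_object(Bucket=bucket, Key=key)
```

Keeping the parsing separate from the S3 calls should make it possible to test the log parsing without AWS access.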
2) Step 2
- For each listed pdf file, do the following:
* create an image for each page, in 3 different sizes (sizes and naming conventions to be specified later), and store the images on S3
* list all internal and external links in the pdf, with the position and size of each link's "rect"; put this list in a json file on S3
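
As a rough sketch of Step 2 only: the example below assumes the third-party PyMuPDF library (imported as fitz), which is our assumption and not a requirement. The function names, the JSON field names and the zoom parameter are hypothetical; the actual image sizes and naming conventions are still to be specified. The link kind values mirror PyMuPDF's LINK_GOTO and LINK_URI constants.

```python
import json

# Mirror PyMuPDF's link kind constants (fitz.LINK_GOTO, fitz.LINK_URI).
LINK_GOTO = 1  # internal link to a page of the same pdf
LINK_URI = 2   # external link to a URI


def link_record(page_number, link):
    """Turn one link dict (as returned by page.get_links()) into a
    JSON-friendly record, or None for link kinds we do not list."""
    x0, y0, x1, y1 = tuple(link["from"])  # the link's "rect"
    rect = {"x": x0, "y": y0, "width": x1 - x0, "height": y1 - y0}
    if link.get("kind") == LINK_URI:
        return {"page": page_number, "type": "external",
                "target": link["uri"], "rect": rect}
    if link.get("kind") == LINK_GOTO:
        return {"page": page_number, "type": "internal",
                "target": link["page"], "rect": rect}
    return None


def links_to_json(records):
    """Serialize the list of link records to a JSON string."""
    return json.dumps(records, indent=2)


def extract_links(pdf_path):
    """Collect internal and external links from every page of a pdf."""
    import fitz  # PyMuPDF; imported lazily so the helpers above stay stdlib-only
    records = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            for link in page.get_links():
                record = link_record(page.number, link)  # 0-based page index
                if record is not None:
                    records.append(record)
    return records


def render_page_png(page, zoom):
    """Render one page to PNG bytes; zoom is a placeholder scale factor."""
    import fitz
    pixmap = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
    return pixmap.tobytes("png")
```

Uploading the PNG bytes and the JSON string to S3 is left out here; with boto3 that would typically be a put_object call per file.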
3) Step 3
- For each plist file, convert it to json, and store the json file on S3.
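
For Step 3, a minimal sketch using only the Python standard library could look like the following. The function name plist_to_json is hypothetical, and encoding binary data as base64 and dates as ISO 8601 strings is our assumption, since JSON has no native type for either.

```python
import base64
import json
import plistlib
from datetime import datetime


def _json_default(value):
    """Encode plist types that JSON cannot represent directly."""
    if isinstance(value, bytes):
        return base64.b64encode(value).decode("ascii")  # <data> entries
    if isinstance(value, datetime):
        return value.isoformat()  # <date> entries
    raise TypeError(f"Cannot convert {type(value).__name__} to JSON")


def plist_to_json(plist_bytes):
    """Convert the raw bytes of an XML or binary plist to a JSON string."""
    data = plistlib.loads(plist_bytes)  # format is detected automatically
    return json.dumps(data, default=_json_default, indent=2, sort_keys=True)
```

The input bytes would come from the plist object downloaded from S3, and the returned string would be uploaded next to it as the json file.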
The deliverable is the Python code. We will ask you to use the Git repository on GitHub and to push your work regularly so that we can monitor progress.
Please avoid generic bids such as "Hello, Let me do it for you." (we will not read them).
In order to show that you have read and understood our requirements, please start your bid with "Python Project"; if your bid concerns only 1 or 2 of the steps specified above, please let us know clearly.