I have a custom data feed (CDF) that gets posted to a secure FTP site every hour by my technology partner. Each file has approximately 2 million rows and 30 headers.
1)Read and understand the CDF file and its columns. CDF documentation will be provided.
2)Create a MySQL database schema
3)Create a table RM_CDF_EVENT_LOG. This table should have all the columns defined in the CDF. We also need to add a couple of extra columns, FILENAME and MODIFIED_ON (DEFAULT SYSDATE), to the table
4)Partition the table by the DATETIME column in the CDF and by FILENAME
5)Create a script that does the following
a) Download the file from the FTP server
b) Run GPG on the file to decrypt it (can be stubbed for now)
c) Run unzip/untar on the file to extract its contents
d) Create a new partition in the table for the DATETIME and FILENAME
e) Break apart column data (separated by spaces) into separate rows
f) Insert rows from the file into the table
g) Set this script up as a cron job that runs every 30 minutes
6)The script can be written in any language (Java, PHP, C, etc.)
7)If possible, we would like to use Apache Hadoop to break apart the values that are delimited by spaces inside a column. (This is not a requirement, but it is considered a bonus if you can do it.)
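To make step 5(e) concrete, here is a rough sketch in Python of how one column holding space-separated values could be broken into separate rows, with the FILENAME column attached to each row before insert. This is only an illustration under assumptions: the column name SEGMENT_IDS, the tab delimiter, the sample file name, and the helper names explode_rows and read_cdf are all made up here. The real column names and file layout come from the CDF documentation.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, TextIO

# Hypothetical name: the real multi-valued column is defined in the CDF documentation.
MULTI_VALUE_COLUMN = "SEGMENT_IDS"


def read_cdf(handle: TextIO, delimiter: str = "\t") -> Iterator[Dict[str, str]]:
    """Read a delimited file with a header line (tab delimiter is an assumption)."""
    return csv.DictReader(handle, delimiter=delimiter)


def explode_rows(
    rows: Iterable[Dict[str, str]], column: str, filename: str
) -> Iterator[Dict[str, str]]:
    """Yield one output row per space-separated value found in `column`."""
    for row in rows:
        # split() with no argument splits on any run of whitespace
        values = (row.get(column) or "").split()
        if not values:
            # Assumption: a row whose cell is empty is kept once, with an empty value
            values = [""]
        for value in values:
            out = dict(row)             # copy every other column unchanged
            out[column] = value         # single value instead of the packed list
            out["FILENAME"] = filename  # extra column required by the table
            yield out


if __name__ == "__main__":
    sample = (
        "EVENT_TIME\tUSER_ID\tSEGMENT_IDS\n"
        "2012-01-01 00:00:00\tu1\t10 20 30\n"
    )
    source = read_cdf(io.StringIO(sample))
    for r in explode_rows(source, MULTI_VALUE_COLUMN, "cdf_sample.tsv"):
        print(r)
```

Because both functions are generators, a 2 million row file can be streamed through them without being loaded into memory at once, and the output can be fed to a batched insert.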
17 freelancers are bidding on average $475 for this job
Hello, we understand the project and can create a data feed reader that updates the database as a cron job, written in PHP or ASP.NET. Please check PMB for more details. Thanks, Raj
Hi! I have gone through your requirements and I am glad to say that I can accomplish this task. I would like to speak with you on IM. Please give us an opportunity to work with you.