Build a data merging system - Python + Celery + Redis + MongoDB
Paid on delivery
I am looking for a skilled Python developer to build a database update system using Celery with Redis (or RabbitMQ) as the broker and MongoDB as the database. The purpose of the system is to update our database on demand so that it is always up to date.
The objective is to merge candidate application data from three data sources: LinkedIn Jobs, Naukri, and Naukri RMS. Each job will have a unique job_id, and all applicants from each data source will be tagged to that job_id.
1. Stage 1 - A service will capture the data from each data source and upsert it into a per-source collection, e.g. linkedin_jobs, naukri, naukri_rms. Since this service can run any number of times, the operation will always be an upsert so that no duplicates are created.
2. Stage 2 - Upon completion of the upserts, a collection called merged_data will combine the data from all three collections for a given job_id. The merge script will follow logic that we will provide. At the end of each Stage 1 upsert, the system will trigger a script to update merged_data.
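Since the actual merge logic will be supplied later, the Stage 2 step can be sketched as a pure function over the three sources. Field names (`email` as the dedup key, first-source-wins for conflicting fields) and the function name are assumptions for illustration only:

```python
def merge_job_applicants(job_id, linkedin_jobs, naukri, naukri_rms):
    """Merge applicant records from the three source collections for one job_id.

    Each source is a list of dicts assumed to carry at least 'job_id' and
    'email'. Applicants are deduplicated by email across sources; the first
    source to supply a field wins (placeholder rule until the real logic
    is provided).
    """
    merged = {}
    for source_name, records in (
        ("linkedin_jobs", linkedin_jobs),
        ("naukri", naukri),
        ("naukri_rms", naukri_rms),
    ):
        for rec in records:
            if rec.get("job_id") != job_id:
                continue  # only merge applicants tagged to this job
            key = rec.get("email")  # dedup key is an assumption
            entry = merged.setdefault(key, {"job_id": job_id, "sources": []})
            entry["sources"].append(source_name)
            for field, value in rec.items():
                entry.setdefault(field, value)  # first source wins
    return list(merged.values())
```

Keeping the merge as a pure function makes it easy to unit-test locally before wiring it into Celery and MongoDB.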
Your job will be to build and test these Python + Flask + Celery + Redis + MongoDB scripts locally, then deploy and test them on an AWS EC2 instance.
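One way the pieces could fit together is sketched below. This is a wiring outline under stated assumptions (local Redis and MongoDB, dedup keyed on job_id + email, placeholder Stage 2 logic), not the final implementation; all route, task, and collection names beyond those in the brief are hypothetical:

```python
# Sketch: Flask endpoint enqueues a Stage 1 ingest task per source; the
# ingest task upserts into its collection, then chains the Stage 2 merge
# task for each affected job_id.
from celery import Celery
from flask import Flask, jsonify, request
from pymongo import MongoClient, UpdateOne

celery_app = Celery("merger", broker="redis://localhost:6379/0")
flask_app = Flask(__name__)
db = MongoClient("mongodb://localhost:27017")["applicants"]

@celery_app.task
def ingest(source, records):
    # Upsert keyed on (job_id, email) so reruns never create duplicates.
    ops = [
        UpdateOne(
            {"job_id": r["job_id"], "email": r["email"]},
            {"$set": r},
            upsert=True,
        )
        for r in records
    ]
    if ops:
        db[source].bulk_write(ops)
    # Stage 2: refresh merged_data for every job touched by this batch.
    for job_id in {r["job_id"] for r in records}:
        update_merged.delay(job_id)

@celery_app.task
def update_merged(job_id):
    docs = []
    for source in ("linkedin_jobs", "naukri", "naukri_rms"):
        docs.extend(db[source].find({"job_id": job_id}))
    # Client-supplied merge logic goes here; this placeholder only
    # records the applicant count for the job.
    db.merged_data.update_one(
        {"job_id": job_id},
        {"$set": {"job_id": job_id, "applicant_count": len(docs)}},
        upsert=True,
    )

@flask_app.route("/update/<source>", methods=["POST"])
def trigger(source):
    # On-demand entry point: POST a JSON list of records for one source.
    ingest.delay(source, request.get_json())
    return jsonify({"queued": source}), 202
```

Running it would require a Redis broker, a MongoDB instance, a Celery worker, and the Flask app; the same layout deploys unchanged to an EC2 instance.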
Ideal Skills and Experience:
- Strong proficiency in Python programming
- Experience with Celery, Redis/RabbitMQ, and MongoDB
- Ability to design and implement an efficient and scalable system for small databases
- Familiarity with database optimization techniques
Frequency of Updates:
- Updates are on demand, so the system should process them as soon as they are requested.
Size of Database:
- The database being updated is small (less than 1GB), so the system should be optimized for efficient updates of smaller databases.
Project ID: #36684826