Hello,
I have worked with large databases (over 5 billion rows) in several database systems (PostgreSQL, MySQL, MongoDB, Microsoft SQL Server) and with large data files (for example, CSV files over 10 GB in size).
I have experience with many data formats and structures, both open-source and proprietary.
My programming language of choice is Python, and I believe it is the right tool for this job: its built-in modules and separate packages enable it to do just about anything.
It is difficult to estimate the amount of work needed for your project without knowing the data file format and the details of the validation and data mining process, so I chose a value that seemed reasonable for the budget you advertised.
If you can provide more details about the tasks that need to be performed, it would be very helpful.
Thank you for your time!
Best wishes,
Ionut