I want to create a solution to load a dynamic csv file to SQL table. Following requirements need to be met:
1. The input CSV/TXT file could have any number of columns.
2. Column Delimiter of input file could be comma(,), pipe (|), tab (<tab>), basically it could be different for different file but will be standard in a single file
3. Row delimiter could be [CR][LF] or [LF] or [CR] etc
4. The solution should load all the contents of this file to a SQL server table. (sql version could be 2012/2014 etc)
5. The destination table will be created based on the column names in the header row of the input file
6. All the destination table columns should have sql data type nvarchar(max)
7. Destination table should have PAGE type compression in SQL server
8. Preferably the solution should be a SSIS package or a programatically generated and executed SSIS that loads any files into a table it creates.
9. Solution should just accept a path to the CSV/TXT file and should be able to figure out everything else automatically (number of columns, create table and then load)
10. I should be able to use this to load to SQL 2012, SQL 2014, SQL 2016
1. The input files I have are very heavy 50GB, 70GB, 80GB etc, so the solution needed should be able to work with such files and also load the the data quickly
2. I already have a few solutions:
a) Load csv into a single column table and then parse the column names and data to a table will multiple columns, this process is too slow, so its not acceptable
b) use C# script to read header row from csv file, create a table and then load the data using C#. This process is also slow
So the requirement is to have the solution as fast as a BCP task in SSIS. I am able to build a static BCP where filename is provided, column structure is static, this works fast but need is to make this dynamic.
1. Cannot use any thing that is patented to someone
2. Cannot used a paid library, SSIS addon, every thing should be developed in house
3. you can simply use C#, SSIS, script component etc.... microsoft provided features, so there is no question of using anything copyrighted
1. Everything should provide should be documented, so it can be easily reproduced by me on a new installation of SQL/SSIS/Visual Studio etc
2. Code needs to be commented, so every line to code has an explanation