Python script to automate data combination of CSVs

Closed

Description

This project is to write a Python script and turn it into a .exe that runs on Windows 7 or later and a .app that runs on Mac OS X 10.4 or later. The Python script, the .exe, and the .app are all deliverables.
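Because the bundled program has to work on whatever folder it is copied into, it needs to find its own location at run time. The following is a minimal sketch of one way to do that, not a required design. It assumes a bundler that sets `sys.frozen` (PyInstaller does, for example) and the usual macOS bundle layout of `Name.app/Contents/MacOS/Name`. The function name `app_folder` and its parameters are made up for illustration.

```python
import sys
from pathlib import Path


def app_folder(frozen, executable, script):
    """Guess the folder that holds the data files.

    frozen     -- True when running as a bundled .exe or .app
    executable -- path of the running binary (sys.executable)
    script     -- path of the .py file when running unbundled
    """
    if not frozen:
        # Plain "python combine.py": use the folder of the script itself.
        return Path(script).parent
    exe = Path(executable)
    # Inside a macOS bundle the binary sits several levels down, so walk up
    # until we find the .app directory and return the folder that contains it.
    for parent in exe.parents:
        if parent.suffix == ".app":
            return parent.parent
    # A Windows .exe (or a bare binary) sits directly in the data folder.
    return exe.parent
```

In the real script this would be called as `app_folder(bool(getattr(sys, "frozen", False)), sys.executable, sys.argv[0])`.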

OVERVIEW

We are a renewable energy company that works in buildings. In every building we work in, we get dozens of .CSV, .XLS, and .XLSX files that need to be combined into a single CSV. Each file has its own header information in multiple rows at the top (not necessarily just 2; it could be 20 rows of header info), followed by multiple columns of data. Some files may have only two columns of data, whereas others may have 20 or more.
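Since the number of header rows varies from file to file, the script cannot skip a fixed number of lines. One possible approach, sketched here under assumptions, is to treat the first row whose first cell parses as a date/time as the start of the data. The timestamp layouts in `FORMATS`, the helper names, and the sample rows are all invented for illustration; the real files may keep the stamp in another column or split date and time across two columns.

```python
import csv
import io
from datetime import datetime

# Assumed timestamp layouts; the real files may use others.
FORMATS = ("%m/%d/%Y %H:%M", "%m/%d/%y %H:%M", "%Y-%m-%d %H:%M:%S")


def parse_stamp(text):
    """Return a datetime if text matches an assumed format, else None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def split_header(rows):
    """Split rows into (header_rows, data_rows).

    The data block is assumed to start at the first row whose first
    cell parses as a timestamp.
    """
    for i, row in enumerate(rows):
        if row and parse_stamp(row[0]) is not None:
            return rows[:i], rows[i:]
    return rows, []


# Made-up sample with three header rows and two data rows.
sample = io.StringIO(
    "Logger,ABC\n"
    "Site,Boiler room\n"
    "Date Time,Temp (F)\n"
    "07/24/2012 15:00,71.2\n"
    "07/24/2012 15:05,71.4\n"
)
header, data = split_header(list(csv.reader(sample)))
print(len(header), len(data))  # 3 2
```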

Each file also has a column (or two) of date/time stamps, showing the date and time each measurement in a row was taken. The time stamps from different files do not necessarily start at the same time (e.g. one may start at 3pm on July 24th, whereas another may start at 7pm on July 27th). Furthermore, the time stamps do not increment in the same step size (e.g. one may increase by five minutes per row, one by 15 minutes per row, and a third by an irregular amount per row).
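To make the step-size problem concrete, here is a small sketch that finds the smallest gap between consecutive stamps across several files. It assumes the stamps have already been parsed into `datetime` objects; the function name `smallest_step` and the sample values (including the year) are hypothetical.

```python
from datetime import datetime, timedelta


def smallest_step(stamp_lists):
    """Return the smallest positive gap between consecutive stamps in any list."""
    best = None
    for stamps in stamp_lists:
        ordered = sorted(stamps)
        for earlier, later in zip(ordered, ordered[1:]):
            gap = later - earlier
            # Ignore duplicate stamps, which would give a zero gap.
            if gap > timedelta(0) and (best is None or gap < best):
                best = gap
    return best


# One file logging every five minutes, one logging at irregular intervals.
file_a = [datetime(2012, 7, 24, 15, 0) + timedelta(minutes=5 * i) for i in range(4)]
file_b = [
    datetime(2012, 7, 27, 19, 0),
    datetime(2012, 7, 27, 19, 1),
    datetime(2012, 7, 27, 19, 9),
]
step = smallest_step([file_a, file_b])
print(step)  # 0:01:00
```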

Here is what the program must do:

• Read every CSV, XLS, and XLSX file in whatever folder the .app or .exe is placed in. This way we can copy the app to a new folder full of CSVs to combine.

• Combine all the CSV, XLS, and XLSX files into one CSV, with the header information from each individual file, and the name of that file, preserved above the columns from that file.

• Make a master time stamp column on the left side of the document:

o Find the smallest increment of any date/time stamp column – e.g. if there are time stamps incrementing at 1 minute, 5 minutes, and random minutes, make the master column of date/time stamps on the left increment at 1 minute per row.

o The master date/time stamp should start with the date/time of the earliest data point from any file, and end at the date/time of the latest data point from any file.

• Line up and space out all data columns so they correspond correctly to the master time stamp on the left.

• Delete redundant date/time stamp columns so there is only the master date/time stamp column on the left and no other date/time columns.

• Make sure all header information is preserved over the proper columns

o Add in the name of the original file above the columns!

• Write everything to a new CSV called “[url removed, login to view]”
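The steps above can be sketched roughly as follows. This is one possible shape, not the required implementation. It assumes each file has already been read into a dict with the hypothetical keys `name`, `header`, and `data`, that the file's own date/time column has already been removed, and that every stamp lands exactly on the master grid (irregular stamps would need rounding first, which is not shown). All sample values are made up.

```python
import csv
import io
from datetime import datetime, timedelta


def master_timeline(start, end, step):
    """List every stamp from start to end inclusive, spaced by step."""
    count = (end - start) // step + 1
    return [start + i * step for i in range(count)]


def width_of(f):
    """Number of columns a file needs: its widest header or data row."""
    lengths = [len(v) for v in f["data"].values()] + [len(h) for h in f["header"]]
    return max(lengths, default=0)


def pad(cells, width):
    """Right-pad a row with blank cells up to width."""
    return list(cells) + [""] * (width - len(cells))


def combine(files, step):
    """Build the rows of the combined CSV.

    files -- list of dicts with hypothetical keys:
             "name"   : original file name
             "header" : list of header rows (timestamp column removed)
             "data"   : dict mapping datetime -> list of values
    step  -- timedelta between rows of the master time stamp column
    """
    all_stamps = [s for f in files for s in f["data"]]
    timeline = master_timeline(min(all_stamps), max(all_stamps), step)
    widths = [width_of(f) for f in files]
    depth = max(len(f["header"]) for f in files)

    # File names go on the top row, above each file's columns.
    top = [""]
    for f, w in zip(files, widths):
        top += pad([f["name"]], w)
    rows = [top]

    # Header rows; shorter headers are padded with blank rows at the bottom.
    for r in range(depth):
        row = [""]
        for f, w in zip(files, widths):
            row += pad(f["header"][r] if r < len(f["header"]) else [], w)
        rows.append(row)

    # Data rows: blank cells where a file has no reading at that stamp.
    for stamp in timeline:
        row = [stamp.strftime("%Y-%m-%d %H:%M")]
        for f, w in zip(files, widths):
            row += pad(f["data"].get(stamp, []), w)
        rows.append(row)
    return rows


t0 = datetime(2012, 7, 24, 15, 0)
files = [
    {"name": "a.csv", "header": [["Temp (F)"]],
     "data": {t0: ["71.2"], t0 + timedelta(minutes=5): ["71.4"]}},
    {"name": "b.csv", "header": [["Site X"], ["Flow", "Pressure"]],
     "data": {t0 + timedelta(minutes=1): ["3", "40"],
              t0 + timedelta(minutes=3): ["4", "41"]}},
]
rows = combine(files, timedelta(minutes=1))

# Written to memory here; the real script would open the output file instead.
out = io.StringIO()
csv.writer(out).writerows(rows)
```

With the sample input, the result has one file-name row, two header rows, and six data rows running from 15:00 to 15:05 in one-minute steps.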

I’ve uploaded examples of data files of the type that will need to be combined, as well as an example of the final output file needed so you can see what I’m talking about. The examples are “14836...”, “18102”, and “condensate pump”. The example output file is “combined_data”. (Please note, in “[url removed, login to view]”, columns AD through the end are NOT empty. There is data in row 6615, for example.)

Thank you for your help, and I look forward to working with you!

Best,

Brenden

Skills: Data Processing, Excel, Python, Software Architecture


Project ID: #2645409

Awarded to:

networls

We have a group of Python developers who can complete your task within a few days. On Freelancer we have done similar projects involving CSV files (you can check our Python reviews). Thanks, I hope you will give us t...

$150 USD in 3 days
(3 Reviews)
2.7

8 freelancers are bidding on average $164 for this job

srinichal

I can deliver the script to your specs

$200 USD in 8 days
(56 Reviews)
6.5

gangabass

I can do this for you. See PMB for details.

$250 USD in 2 days
(118 Reviews)
5.9

sofina2006

Hi Sir, it is a simple parsing project. Of course the details must be considered in order to do a good job, but the core is quite easy to implement. Do not hesitate to ask me any questions. Best regards

$150 USD in 4 days
(3 Reviews)
1.8

theonedev

Hi! I've read your (very detailed) project description and I believe I can leverage Python and a few vendor libraries to accomplish what you need in the shortest amount of time necessary. I have sent you a PM about my...

$230 USD in 3 days
(0 Reviews)
0.0

euripedesrocha

Hi, I can solve your task!

$100 USD in 5 days
(0 Reviews)
0.0

mrezam

I have a lot of experience with Python, Windows scripts, and/or Linux shell scripts. I have also done very similar work for a previous employer (which was under Linux but with a much more complex text file structure)...

$200 USD in 2 days
(0 Reviews)
1.0

anisur91

Please see PMB skype id The Administrator removed this message for containing contact details which breaches our Terms of Service

$30 USD in 1 day
(1 Review)
0.0