Enhance Bash Script: CSV parser
This project was successfully completed by PerlIsFun for $59 USD in 1 day.
Project Budget: $30 - $250 USD
Completed In: 1 day
I have a bash script that accepts an input CSV file, supports several command-line options (such as the delimiter), and rolls up details on the level of duplication of fields in each column of data into a report. It can handle files with millions of rows by pulling one column of data into memory at a time and writing a temporary file of the most duplicated fields. For each field in the input file, it does something similar to: cut -d\, -f1 [url removed, login to view] | sort | uniq -ci | sort -nr | head -20
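The per-column rollup described above can be sketched as a small shell function. This is a minimal sketch of the described pipeline; the function name and the input.csv in the usage example are illustrative assumptions, not taken from the actual script:

```shell
# Report the 20 most-duplicated values in one column of a delimited file.
report_column() {
  local file=$1 col=$2 delim=${3:-,}
  # Pull one column, count duplicates case-insensitively (uniq -ci),
  # then sort by count descending and keep the top 20.
  cut -d"$delim" -f"$col" "$file" | sort | uniq -ci | sort -nr | head -20
}

# Example: report_column input.csv 1
```

Processing one column at a time like this keeps memory bounded even for files with millions of rows, since sort spills to temporary files as needed.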
The script requires some modifications and enhancements including:
* Better parsing of CSV files. It handles rows like foo,bar,baz or "foo","bar","baz", but it has issues parsing mixed rows such as "foo",123,"bar".
* Certain fields require special parsing. e.g. I would like the option to treat [url removed, login to view] and [url removed, login to view] and all its variants as the same URL, so they get counted as dupes.
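For the mixed-quoting problem ("foo",123,"bar"), plain cut on the delimiter breaks down once quoted fields can contain the delimiter themselves. One option, sketched below under the assumption that python3 is available, is to delegate the tokenizing to Python's csv module from inside the bash script; the function name is hypothetical:

```shell
# Extract one column (1-based) from a CSV that mixes quoted and
# unquoted fields, e.g. "foo",123,"bar". Python's csv module handles
# quoting and embedded commas correctly, which plain cut does not.
extract_field() {
  local file=$1 col=$2
  python3 - "$file" "$col" <<'PY'
import csv, sys
path, col = sys.argv[1], int(sys.argv[2]) - 1
with open(path, newline="") as fh:
    for row in csv.reader(fh):
        if col < len(row):
            print(row[col])
PY
}

# Example: extract_field data.csv 2 | sort | uniq -ci | sort -nr | head -20
```

The output feeds straight into the existing sort | uniq pipeline, so the rest of the script would not need to change.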
There are a few other tweaks of a similar nature that I would like incorporated into the script, which we can discuss. Looking forward to hearing from a bash guru.
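The URL-variant option above could be handled by normalizing each value before the sort | uniq counting step. The rules below (case-fold, drop the scheme, a leading www., and trailing slashes) are guesses at which variants should collapse together; the actual URLs were removed from the listing, so example.com here is purely illustrative:

```shell
# Normalize URL variants so they count as the same value when piped
# into the sort | uniq -c rollup. The specific rules are assumptions;
# the client's real variant list may differ.
normalize_url() {
  # Case-fold first so the www. rule also matches WWW., then strip
  # the scheme, a leading "www.", and any trailing slashes.
  tr '[:upper:]' '[:lower:]' |
    sed -E 's#^https?://##; s#^www\.##; s#/+$##'
}

# Example: cut -d, -f3 data.csv | normalize_url | sort | uniq -c | sort -nr
```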