Update 3/7/13: I forgot to do a collision search on Google for “csvlib” before releasing this. Apparently everyone calls their CSV handling utilities package “csvlib”. I’ve renamed mine CCSVLIB.
I keep finding myself in situations where I’ve got data in spreadsheet form but I want to perform some analysis or transformation on the data beyond my capabilities with LibreOffice Calc or Microsoft Excel. Sometimes I want to go the opposite direction and have my software produce output in a format that I can convert into a spreadsheet. Fortunately, both Calc and Excel can read and write comma-separated values (CSV) files. CSV files are nice to work with. They’re plain text and thus easy to read or write from my own software.
Well, easy in theory. In practice, CSVs that come from different sources may use different formats (quotes vs. no quotes vs. optional quotes, whether commas are allowed in the data, etc.), which makes reading CSVs a little too painful for use in small one-off scripts and programs. On top of the inconsistent formats, the code necessary to correctly parse a CSV file is often larger than the code that performs whatever analysis or transformation I want.
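To make the quoting problem concrete, here is a minimal sketch of reading one field at a time from a record using the RFC 4180 quoting rules: a field may be wrapped in double quotes, a quoted field may contain commas and line breaks, and a doubled quote inside a quoted field stands for one literal quote. The function name `csv_next_field` and its signature are hypothetical and made up for this illustration; this is not CCSVLIB's API, and a real parser would also need to handle buffering and malformed input more carefully.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of CCSVLIB): copy the next field of a CSV
 * record from `src` into `out` (capacity `cap`, truncating if needed),
 * following RFC 4180 quoting. Returns a pointer just past the field's
 * trailing comma, or NULL when this was the last field of the record. */
const char *csv_next_field(const char *src, char *out, size_t cap)
{
    size_t n = 0;
    int quoted = 0;

    assert(src != NULL && out != NULL && cap > 0);

    if (*src == '"') {              /* a leading quote opens a quoted field */
        quoted = 1;
        src++;
    }
    while (*src != '\0') {
        if (quoted) {
            if (*src == '"') {
                if (src[1] == '"') {
                    src++;          /* "" inside quotes is one literal quote */
                } else {
                    quoted = 0;     /* closing quote: skip it, keep scanning */
                    src++;
                    continue;
                }
            }
        } else if (*src == ',' || *src == '\r' || *src == '\n') {
            break;                  /* an unquoted delimiter ends the field */
        }
        if (n + 1 < cap)
            out[n++] = *src;        /* keep the character if there is room */
        src++;
    }
    out[n] = '\0';
    return (*src == ',') ? src + 1 : NULL;
}
```

Calling it repeatedly on the record `a,"b,c","say ""hi"""` yields the three fields `a`, `b,c`, and `say "hi"`, where a naive split on commas would have produced four pieces and left the quotes in place.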
I deal with CSV files enough that I decided to write my own CSV parsing library. I suppose I could have searched the Internet for someone else’s solution, but I needed a project, and I think parsing is fun. I also decided that if I was going to implement a CSV parsing library then I was going to do it right. I started by looking for a standard for the CSV format and found RFC 4180. After a couple of hours at my keyboard, I had a working parser and a decent data structure for pulling data from RFC 4180 CSVs into memory. My library came up in conversation with my supervisor a few days back (I’m a TA for CS 115 at UK), and she mentioned that she wanted a copy. I decided I wanted to release the software, so I added the capability to write CSV files, polished the API, and wrote some documentation.
What makes CCSVLIB a better choice than any of the other CSV parsing libraries for C/C++? Objectively speaking, nothing, or at least nothing that I know of. I haven’t taken a close look at any of the other stuff that’s available. I can say, based on a cursory Google search, that there aren’t many implementations of RFC 4180. CCSVLIB implements RFC 4180 (well, at least mostly), so it should be able to consume most sane CSV files. Also, CCSVLIB is simple, short, and well documented. The current version is 1051 lines of C, about 400 of which are comments.