Node CSV version 0.2 with streaming API
By David WORMS
Jul 2, 2012
Version 0.2 of the Node CSV parser has just been released. It is a major enhancement because it aligns the parser with Node.js best practices for streams. The CSV parser now behaves as both a Stream Writer and a Stream Reader.
Be careful: achieving this goal required a few changes in the API, which slightly break backward compatibility.
Migration
I’m trying to remember all the changes in the API. I will keep this section updated with your suggestions in case I forget.
The ‘from*’ and ‘to*’ functions are now written as ‘from.*’ and ‘to.*’; for example, fromPath becomes from.path. What used to be the ‘data’ event is now the ‘record’ event. The ‘data’ event still exists, but it now receives a stringified version of what the ‘record’ event receives.
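To make the difference between the two events concrete, here is a toy sketch built on Node's core EventEmitter rather than on the csv module itself. The emitRow helper and the comma-joined string format are assumptions made for illustration only; the point is simply that ‘record’ listeners get the parsed fields while ‘data’ listeners get a string.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper, not part of the csv module: announce one row the way
// the migration notes describe, first as a parsed record, then as a string.
function emitRow(emitter, fields) {
  emitter.emit('record', fields);                 // array of field values
  emitter.emit('data', fields.join(',') + '\n');  // assumed stringified form
}

const parser = new EventEmitter();
const seen = [];
parser.on('record', (record) => seen.push({ event: 'record', value: record }));
parser.on('data', (text) => seen.push({ event: 'data', value: text }));

emitRow(parser, ['a', 'b']);
```

Code that used to listen to ‘data’ in order to get parsed rows would, under this reading, switch its listener to ‘record’.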
The new Stream API
This is the most important enhancement, and it was announced in my last post. This little diagram illustrates the structure of the Node.js stream architecture as applied to the CSV parser:
|-----------|      |---------|---------|       |---------|
|           |      |         |         |       |         |
|           |      |        CSV        |       |         |
|           |      |         |         |       |         |
|  Stream   |      |  Writer |  Reader |       |  Stream |
|  Reader   |.pipe(|   API   |   API   |).pipe(|  Writer |)
|           |      |         |         |       |         |
|           |      |         |         |       |         |
|-----------|      |---------|---------|       |---------|
As you can see, this new version is fully compliant with the stream API. It is both a Stream Writer to send input data and a Stream Reader to access output data.
Example:
var fs = require('fs');
var csv = require('csv');

fs.createReadStream('./in')
  .pipe(csv())
  .pipe(fs.createWriteStream('./out'));
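To show what "both a Stream Writer and a Stream Reader" means in practice, here is a minimal sketch that uses only Node's core stream.Transform class, which is available in current Node.js releases. It is not the csv module's implementation: ToyCsv, splitLines and toRecord are hypothetical names, and the field splitting is deliberately naive (no quoting, no escaping, no "\r\n" handling).

```javascript
const { Transform } = require('stream');

// Pure helper: given leftover text and a new chunk, return the complete
// non-empty lines found so far and the trailing partial line to keep.
function splitLines(leftover, chunk) {
  const parts = (leftover + chunk).split('\n');
  const rest = parts.pop(); // incomplete last line, or '' if the chunk ended with '\n'
  return { lines: parts.filter((line) => line.length > 0), rest };
}

// Naive field split on commas; a real parser must handle quotes and escapes.
function toRecord(line) {
  return line.split(',');
}

// A Transform is writable on one side and readable on the other, so it can
// sit in the middle of a .pipe() chain like csv() does in the example above.
class ToyCsv extends Transform {
  constructor(options) {
    super(options);
    this.rest = ''; // partial line carried over between chunks
  }

  _transform(chunk, encoding, callback) {
    // Assumes chunks do not split a multi-byte character.
    const { lines, rest } = splitLines(this.rest, chunk.toString());
    this.rest = rest;
    for (const line of lines) {
      this.push(JSON.stringify(toRecord(line)) + '\n');
    }
    callback();
  }

  _flush(callback) {
    // Emit the final line if the input did not end with a newline.
    if (this.rest.length > 0) {
      this.push(JSON.stringify(toRecord(this.rest)) + '\n');
    }
    callback();
  }
}
```

An instance of this class could be placed between a readable and a writable stream with .pipe( new ToyCsv() ), in the same position csv() occupies in the example above.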
Convenience functions
Alternatively, it comes with convenience functions accessible through the from and to properties. Some of those functions were already present in the 0.1 release and are simply renamed. For example, the csv.fromPath() function is now csv.from.path(). New functions have been added, such as csv.to.string().
Example:
var csv = require('csv');

csv()
  .from.path('./in')
  .to.string(function (data) { console.log(data); });
Documentation
As I have done in many past projects such as Mecano (now Nikita), the readme content has been reduced to a minimum and the documentation is generated directly from the source code. A small script was written specifically for that purpose. The idea is to document each function with comments written in Markdown syntax. A simple regexp parser reads each file, extracts the comments and writes a Markdown file inside a “./doc” folder. The doc folder is finally copied into the Jekyll directory of the website.
Note that, at the time of this writing, the script needs some improvements and the API documentation needs to be reviewed and enhanced (checking the Markdown syntax, fixing typos). Not being a native English speaker doesn’t help either. As always, your contributions are appreciated.
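The comment extraction idea can be sketched roughly as follows. This is not the project's actual script: the extractMarkdown name and the convention that the Markdown lives in plain /* ... */ block comments are assumptions made for illustration.

```javascript
// Hypothetical sketch: collect the text of every /* ... */ block comment
// found in a source string and join the pieces into one Markdown document.
function extractMarkdown(source) {
  const blockComment = /\/\*([\s\S]*?)\*\//g; // non-greedy, spans multiple lines
  const sections = [];
  let match;
  while ((match = blockComment.exec(source)) !== null) {
    const text = match[1].trim(); // drop the whitespace around the comment body
    if (text.length > 0) {
      sections.push(text);
    }
  }
  // Separate the comments with a blank line so each starts a new Markdown block.
  return sections.join('\n\n');
}
```

The returned string could then be written to a file under the “./doc” folder, for instance with fs.writeFileSync. A regexp like this one also matches comment markers that appear inside string literals, which is one reason such a parser stays "simple".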
Conclusion
Please try the new version and let me know what you think of it.