I’m really excited to be able to talk about CSVTL, because I believe it’s the next big thing, and it’s been in the works for several years.
Basically, I had been in contact with one of my old high-school friends, who is high up in the W3C working group. I had always wanted to join one of the specification groups, because I believe that is where the cutting-edge work in the computer field gets done, but they were always full, or filled with industry empty suits, or something.
But in one of my weekly phone calls to my friend, I was waxing really enthusiastic about CSVTL, and he said something to the effect that I should pursue it. I was really excited — finally, the chance to get in on an industry standards board! I talked for a while about writing books, and starting a CSVTL web site, and discussion groups, and such, and then realized that maybe I was jumping the gun and abusing my power as the soon-to-be-member on an industry standard board. But my friend was very understanding, and said that whatever I did and “whomever you can convince to be on the board is fine with us,” which means I can take it away! He also mentioned that since it is my brainchild, I should get all the credit.
I’ve been studying all the classic standardization efforts, the really successful ones (Ada, C90, Java and JSRs), and I think that I can take their fine examples to heart.
CSVTL is the Comma-Separated-Value Template Language. It allows for the specification of transform operations on a CSV file… and the amazing thing is that these transforms can be specified in CSV format themselves! It allows for massive gains in productivity. Let’s say that you had a CSV file with the following:
First name, Last name, Address, City, State, ZipCode

… and you’d like to transform it into the following:

First name + Last name, Address, City, State, ZipCode

… but only for zip codes 90000-99999.
How many lines of code would that take? Hundreds, right? You’d have to load a CSV parser, maybe a CSV pull parser or a CSV-DOM or one of the hundreds of variants of CSV parsing, and iterate through… but imagine that instead you could write:
/in/, firstname, lastname, city, state, state, zip
/filter/, zip <= 99999 && zip >= 90000
/iterate/, {firstname concat lastname}, city, ...

Imagine the productivity gains! Basically, each line of a CSVTL program starts with a command that determines the format of the rest of the line. Some commonly used commands:
/in/, label, label2, label3, ...
    specify the input map
/out/, label, label2, label3, ...
    specify the output map
/filter/, filter1, filter2, filter3, ...
    specify a filter
/iterate/, column 1 calc, column 2 calc, column 3 calc, ...
    run an iteration with the last specified /in/, /out/, and /filter/
/iterate [, in=name] [, out=name] [, filter=out]/, column 1 calc, column 2 calc, ...
    run an iteration using named parameters

I’m currently trying to find people for the standards committee, and then I’m hoping that it will only take a year or two of discussion before we can come out with our first specification. I realize this is pretty quick work for a standards committee, but I am fully possessed of the importance of this standard.
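To make the command semantics concrete, here is a minimal Python sketch of an interpreter for the /in/, /filter/, and /iterate/ commands. Since there is no specification yet, everything beyond the command names is an assumption made for illustration: the function name `run_csvtl` is hypothetical, a filter is taken to be `label OP integer` clauses joined by `&&`, multiple filters must all pass, `{a concat b}` joins two columns with a space, and the input data has no header row. The /out/ command and the named-parameter form of /iterate/ are not handled.

```python
import csv
import io
import operator
import re

# Comparison operators assumed (not specified anywhere) for /filter/ clauses.
_OPS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq,
        "<": operator.lt, ">": operator.gt}
_CLAUSE = re.compile(r"^\s*(\w+)\s*(<=|>=|==|<|>)\s*(\d+)\s*$")
_CONCAT = re.compile(r"^\{\s*(\w+)\s+concat\s+(\w+)\s*\}$")


def _passes(expr, row):
    """Evaluate 'label OP int && label OP int ...' against one labelled row."""
    for clause in expr.split("&&"):
        match = _CLAUSE.match(clause)
        if match is None:
            raise ValueError(f"unsupported filter clause: {clause!r}")
        label, op, number = match.groups()
        if not _OPS[op](int(row[label]), int(number)):
            return False
    return True


def _calc(expr, row):
    """A column calc is either a plain label or '{a concat b}' (assumed syntax)."""
    match = _CONCAT.match(expr)
    if match:
        return f"{row[match.group(1)]} {row[match.group(2)]}"
    return row[expr]


def run_csvtl(program, data):
    """Run a CSVTL program (text) over CSV data (text); return output rows."""
    labels, filters, output = [], [], []
    for line in csv.reader(io.StringIO(program), skipinitialspace=True):
        if not line:
            continue
        command = line[0].strip()
        args = [arg.strip() for arg in line[1:]]
        if command == "/in/":
            labels = args          # input map: column position -> label
        elif command == "/filter/":
            filters = args         # every listed filter must pass
        elif command == "/iterate/":
            for values in csv.reader(io.StringIO(data), skipinitialspace=True):
                if not values:
                    continue
                row = dict(zip(labels, (value.strip() for value in values)))
                if all(_passes(f, row) for f in filters):
                    output.append([_calc(calc, row) for calc in args])
        else:
            raise ValueError(f"unsupported command: {command!r}")
    return output


if __name__ == "__main__":
    # Made-up sample program and data, with an address column added.
    program = (
        "/in/, firstname, lastname, address, city, state, zip\n"
        "/filter/, zip <= 99999 && zip >= 90000\n"
        "/iterate/, {firstname concat lastname}, address, city, state, zip\n"
    )
    data = (
        "Ada, Example, 1 Main St, Portland, OR, 97201\n"
        "Bob, Sample, 2 Elm St, Boston, MA, 02110\n"
    )
    for out_row in run_csvtl(program, data):
        print(", ".join(out_row))
```

Note that the CSVTL program is itself parsed with the same `csv.reader` as the data, which is the whole point of specifying transforms in CSV format.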
XML is a very “trendy” and “popular” format nowadays. Many people that I corner and talk with about CSVTL bring up the question, “Why not use XML?” This is an excellent, if uninformed, question. As part of the CSV libraries we plan to provide conversion utilities to map from XML to CSV, allowing you to migrate your legacy XML data into the latest CSV specification. For example:
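<proxy_info>
  <socks_version>5</socks_version>
  <socks_server_port>80</socks_server_port>
  <http_server_port>80</http_server_port>
  <aliases>
    <alias>aaa</alias>
    <alias>bbb</alias>
  </aliases>
</proxy_info>

run: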
xml2csv client-setup.xml client-setup.csv

proxy_info/socks_version, 5
proxy_info/socks_server_port, 80
proxy_info/http_server_port, 80
proxy_info/aliases/alias, aaa
proxy_info/aliases/alias, bbb

Obviously, the expressive power of CSV is much greater, as equivalent information is expressed in much less space. Note also that CSV format lends itself to manipulation by standard Unix tools – for example, a hierarchical search that would require an XPath processor in XML can instead be done by grep.
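For the curious, the flattening that xml2csv performs can be sketched in a few lines of Python with the standard library's `xml.etree.ElementTree`. This is only an illustration under stated assumptions: the function name `xml_to_csv_lines` is hypothetical, the sample XML uses element names inferred from the paths in the output above, and attributes, mixed content, and values containing commas are ignored.

```python
import xml.etree.ElementTree as ET


def xml_to_csv_lines(xml_text):
    """Flatten each leaf element into a 'path/to/element, text' line."""
    root = ET.fromstring(xml_text)
    lines = []

    def walk(element, path):
        children = list(element)
        if not children:
            # Leaf element: emit its slash-separated path and its text.
            lines.append(f"{path}, {(element.text or '').strip()}")
        for child in children:
            walk(child, f"{path}/{child.tag}")

    walk(root, root.tag)
    return lines


# Sample document; element names are inferred from the paths shown above.
SAMPLE_XML = """
<proxy_info>
  <socks_version>5</socks_version>
  <socks_server_port>80</socks_server_port>
  <http_server_port>80</http_server_port>
  <aliases>
    <alias>aaa</alias>
    <alias>bbb</alias>
  </aliases>
</proxy_info>
"""

if __name__ == "__main__":
    flattened = xml_to_csv_lines(SAMPLE_XML)
    print("\n".join(flattened))
    # The "hierarchical search by grep" amounts to a prefix match on each line.
    print([line for line in flattened if line.startswith("proxy_info/aliases/")])
```

The prefix match at the end stands in for something like `grep '^proxy_info/aliases/' client-setup.csv` on the generated file.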
As an extension of CSVXML, we also plan to provide a reference implementation of a CSVXHTML server. Imagine a web server where you don’t have to specify pages in old, kludgy HTML or XHTML, but instead can use CSVXHTML modified by CSVTL templates! This CSV-buttressed technology stack will no doubt prove an amazingly effective solution, as elegant and compelling as CORBA.
If you’ve read this far — this is a joke! It’s not as if someone would take a data format and try to make a language out of it. Why, such an abomination would be as bad as… XSLT… oh, never mind.