Wednesday, December 23, 2015

rsyslog and liblognorm will switch to libfastjson as replacement for json-c

We have been using json-c for quite a while now and had good success with it. However, recent problem reports and analysis indicate that we need to replace it in the future. Don't get me wrong: json-c is a solid piece of software, but we most probably use it much more intensely as the json-c developers ever anticipated. That's probably the actual root cause why we need to switch.

A main problem spot is performance: various experiments, profiler runs, code review and experimental implementations have prooven that json-c is a severe bottleneck. For example, in the evaluation of liblognorm v2 performance, we found out that json-c calls dominated processing time by more than 90%. Once we saw this, we dug down into the profiler and saw that the hashtable hash calculation as well as memory allocations took a large amount of overall processing time. We have submitted an initial performance enhancement PR to json-c which also got merged. That already removed processing time considerably. We continued on that path, resulting in a quite large second enhancement PR, which I withdrew due to disagreement with the json-c development lead.

A major problem for our application of json-c is that the hash table implementation beneath it is not a good match to our needs. We have been able to considerably speed it up by providing a new hash function (based on perl's hash function), but to really get the performance we need, we would need to replace that system. However, that is not possible, because json-c considers the hash tables part of its API. Actually, json-c considers each function, even internal ones, as part of the API, so it is very hard to make any changes at all.

Json-c also aims at becoming fully JSON compliant. It currently is not due to improper handling of NUL bytes, but the longer-term plan is to support NUL bytes this. While this is a good thing to do for a general json library, it is a performance killer for our use case. I know, because I faced that same problems with the libee implementation years ago, where we ditched it later in accordance with the CEE standards body board. I admit I also have some doubts if that change in json-c will actually happen, as it IMHO requires a total revamp of the API.

Also, the json-c project releases far to infrequently (have a look at recent json-c releases, the last one was April, 2014). And then, it takes the usual additonal timelag for distros to pick up the new version. So even if we could successfully submit further performance-enhancing PRs to json-c, it would take an awful lot of time before we could actually use them. I would definitely not like to create private packages for rsyslog users, as this could break other parts of a system.

Finally, json-c contains a real bad race bug in reference counting, which can cause rsyslog to segfault under some conditions. A proposed fix was unfortunately not accepted by the json-c development lead, so this is an open issue. Even if it were, it would probably take a long time until the release of the fixed version and its availability in standard packages.

In conclusion and after a lot of thinking, we decided that it is best to fork json-c, which we than did. The new project is named libfastjson. As the name suggests, it's focus is on performance. It will not try to be 100% JSON compliant. We will not support NUL characters in a standards-conformant way. I expect this not to be a big deal, as json-c also never did this, and the number of complaints seem to be very low. So libfastjson will not aim to be  general purpose json library, but one that offers high performance at some functionality cost and works well in highly threaded applications. Note that we will also reduce the number of API functions and especially remove those that we do not need and that cost performance. Also, the data store will probably be changed from the current hashtable-only system to something more appropriate to our tasks.

Libfastjson already includes many performance enhancement changes and a solid fix for the reference counting bug. Up until that bug, we planned to release in the Feb..April 2016 time frame, together with liblognorm v2. Now this has changed, and we actually did a kind of emergency release (0.99.0) because of the race bug. The source tarball is already available. We are working on packages in the rsyslog repositories (Ubuntu is already done). Rsyslog packages are not yet build against it, but we may do an refresh after the holiday period.

Rsyslog 8.15.0 optionally builds against libfastjson (it is preferred if available). Due to the race bug, we have decided that rsyslog 8.16.0 will require libfastjson.

A side-note is due: we have been thinking about a replacement for the variable subsystem since summer or so. We envision that there are capabilities even beyond of what libfastjson can do. So we still consider this project and think it is useful. In regard to liblognorm, however, we need to provide a more generic interface, and libfastjson is a good match here. Also, we do not know how long it will take until we replace the variable system. We don't even know if we actually can do it time-wise.

No comments:

Automating Coverity Scan with a complex TravisCI build matrix

This is how you can automate Coverity Scan using Travis CI - especially if you have a complex build matrix: create an additional matrix en...