Monday, February 12, 2018

simplifying rsyslog JSON generation

With RESTful APIs, like for example ElasticSearch, you need to generate JSON strings. Rsyslog will soon do this in a very easy to use way. The current method is not hard either, but often looks a bit clumsy. The new way of doing things will most probably be part of the 8.33 release.

You now can define a template as follows:

template(name="outfmt" type="list" option.jsonf="on") {
         property(outname="@timestamp"
                  name="timereported" 
                  dateFormat="rfc3339" format="jsonf")
         property(outname="host"
                  name="hostname" format="jsonf")
         property(outname="severity"
                  name="syslogseverity-text" caseConversion="upper" format="jsonf")
         property(outname="facility"
                  name="syslogfacility-text" format="jsonf")
         property(outname="syslog-tag"
                  name="syslogtag" format="jsonf")
         property(outname="source"
                  name="app-name" format="jsonf")
         property(outname="message"
                  name="msg" format="jsonf")

     }

This will generate JSON. Here is a pretty-printed version of the generated output:
{
  "@timestamp": "2018-03-01T01:00:00+00:00",
  "host": "172.20.245.8",
  "severity": "DEBUG",
  "facility": "local4",
  "syslog-tag": "app[1666]",
  "source": "app",
  "message": " this is my syslog message"
}
Note: the actual output will be compact on a single "line", as this is most useful with RESTful APIs.

Future versions of rsyslog may see additional simplifications in generating the JSON. For example, I currently think about removing the need to give format="jsonf" for each property.

The functionality described here is being added via this pull request.

Wednesday, January 31, 2018

experimental debian rsyslog packages

We often receive requests for Debian packages. So far, we did not package for recent Debian, as the Debian maintainer, Michael Biebl, does an excellent job. Other than us, he is a real expert on Debian policies and infrastructure.

Nevertheless, we now took his package sources and gave the Suse Open Build Service a try. In the end result, we now seem to have usable Debian packages (and more) available at:


I would be very interested in your feedback on the first incarnation of this project. Is it useful? Is it something we should continue? Do you have any problems with the packages? Other suggestions? Please let us know.

Please node: should we decide that the project is worth keeping, the above URL will change. However, it we will give sufficiently advance notice. The current version is not suggested for production systems, at least not without trying it out on test-systems first!

Saturday, January 27, 2018

docker group security risk

The Docker doc spells out that there are security concerns of adding a user to the docker group. Unfortunately, they do not precisely give what the concern is. I guess that is a "security-by-obscurity" approach trying to avoid bad things. Practice show this isn't useful: the bad guys know anyways, and the casual user has a bad time understanding the actual risk involved.

It is considerable, so let me explain at least one risk (I have not tried exhaustively check security issues):  The containers are usually defined to run as root user. This permits you to bypass permission checks on the host.

Let’s assume a $USER is inside the docker group and otherwise has just installed docker. So he can run
$ docker run -v/etc:/malicious -ti --rm alpine
# cd /malicious
# vi sudoers
.... edit, write ...
# ctl-D
As such, the user can modify system config that he could not access otherwise. It’s a real risk. If you have a one-person “personal” machine/VM where the user has sudo permissions in any case … I’d say it’s no real issue.

The story is a different one on e.g. a CI machine.  It's easy to inject bad code into public pull requests, and so it'll run on the CI platform. Usually (before spectre/meltdown...), this was guarded by the (low) permissions of the CI worker user (if you run CI with a sudo-enabled user ... nothing changed). When you enable it to use docker, you now get this new class of attack vector. Don't get me wrong: I do NOT advocate against using docker in CI. Right the opposite, it's an excellent tool there. I just want to make you aware that you need to consider and mitigate another attack vector.

Tuesday, November 28, 2017

rsyslog 8.31 - an important release

Today, we release rsyslog 8.31. This is probably one of the biggest releases in the past couple of years. While it also offers great new functionality, what really important about it is the focus on further improved software quality.

Let's get a bit down on it. First let's mention some important new features:

And then we have a set of several hundred commits concerned with improved software quality:
  • testbench dynamic tests have been extended
  • coverage of different compilers and compiler options has been enhanced
  • more modules are automatically scanned by static analysis
  • daily Coverity scans were added to the QA system, which have proofen to be a very useful addition
  • more aggressive and automated testing with threading debuggers (valgrind's helgrind and clang thread sanitizer) has been added, also with great success
  • as a result of these actions, we could find and fix many small software defects.
  • and there also have been some big and important fixes, namely for imjournal, omelasticsearch, mmdblookup and the rsyslog core
Many users were involved in finding and fixing bugs, many of which all I do know from is there github handle. So rather than highlighting just some of them, I would like to refer to the github milestone issue tracker where all can be found. My sincere thanks to everyone for all their support!

I consider rsyslog release 8.31 a major step forward in our QA policy. We already had improved quite a bit and the state was already pretty good. However, with the changes introduced in 8.31, we make a big step forward, into a kind of next-gen QA policy. Of course, QA is a journey and not a "do once" target. So expect more good stuff upcoming in the next releases. We are not done yet!

I would like to use this opportunity to express a personal "thank yo" to Thomas Deutschmann, also know as Whissi for all his good advice and sometimes strong words that drive use towards better quality. While many folks have helped us with this, Whissi is consistenytly insisting on good QA policies for a long time now, and he always was persistent in fighting with me when I did not understand the value or did not want to. Whissi, I admit I sometimes wasn't too fond of you, but believe me, at the end of the day I *really* value what you are doing. You have my deepest respect. Looking forward to many more years of discussions!

Thursday, November 23, 2017

The clang thread sanitizer

Finding threading bugs is hard. Clang thread sanitizer makes it easier. The thread sanitizer instruments the to-be-tested code and emits useful information about actions that look suspicious (most importantly data races). This is a great aid in development and for QA. Thread sanitizer is faster than valgrind's helgind, which makes it applicable to more use cases. Note however that helgrind and thread sanitizer sometimes each detect issues that the other one does not.
This is how thread sanitizer can be activated:
  • install clang package (the OS package is usually good enough, but if you want to use clang 5.0, you can obtain packages from http://apt.llvm.org/)
  • export CC=clang // or CC=clang-5.0 for the LLVM packages
  • export CFLAGS="-g -fsanitize=thread -fno-omit-frame-pointer"
  • re-run configure (very important, else CFLAGS is not picked up!)
  • make clean (important, else make does not detect that it needs to build some files due to change of CFLAGS)
  • make
  • install as usual
If you came to this page trying to debug a rsyslog problem, we strongly suggest to run your instrumented version interactively. To do so:
  • stop the rsyslog system service
  • sudo -i (you usually need root privileges for a typical rsyslogd configuration)
  • execute /path/to/rsyslogd -n ...other options...
    here "/path/to" may not be required and often is just "/sbin" (so "/sbin/rsyslogd")
    "other options" is whatever is specified in your OS startup scripts, most often nothing
  • let rsyslog run; thread sanitizer will spit out messages to stdout/stderr (or nothing if all is well)
  • press ctl-c to terminate rsyslog run

Note that the thread sanitizer will display some false positives at the start (related to pthread_cancel, maybe localtime_r). The stack trace shall contain exact location info. If it does not, the ASAN_SYMBOLIZER is not correctly set, but usually it "just works".
Doc on thread sanitizer ist available here: https://clang.llvm.org/docs/ThreadSanitizer.html

Saturday, November 18, 2017

Automating Coverity Scan with a complex TravisCI build matrix

This is how you can automate Coverity Scan using Travis CI - especially if you have a complex build matrix:
  • create an additional matrix entry you will exclusively use for submission to Coverity
  • make sure you use your regular git master branch for the scans (so you can be sure you scan the real thing!)
  • schedule a Travis CI cron job (daily, if permitted by project size and Coverity submission allowance)
  • In that cron job, on the dedicated Coverity matrix entry:
  • cache Coverity's analysis tools on your own web site, download them from there during Travis CI VM preparation (Coverity doesn't like too-frequent downloads)
  • prepare your project for compilation as usual (autoreconf, configure, ...) - ensure that you build all source units, as you want a full scan
  • run the cov-int tool according to Coverity instructions
  • tar the build result
  • use the "manual" wget upload capability (doc on Coverity submission web form); make sure you use a secure Travis environment variable for your Coverity token
  • you will receive scan results via email as usual - if you like, automate email-to-issue creation for newly found defects
Actual Sample Code: rsyslog uses this method. Have a look at it's .travis.yml (search for "DO_COVERITY"). The scan run can be found here (search for "DO_CRON"), a script that calls another one ultimately doing the steps described above. These two scripts are called from .travis.yml.


Why so complicated? After all, there is a Coverity addon for Travis CI. That's right, and it probably works great for simple build matrices. Rsyslog uses Travis quite intensely. At the time of this writing, we spawn 8 VMs for different tests and test environments plus the one for cron jobs (so 9 in total). With the Coverity addon, this would result in 9 submissions per run, because the addon runs once per VM requested. Given that rsyslog currently has an allowance of 3 scans per day, a single run would use up all of our allowance plus 6 more which would not succeed. Even if it would work, it would be a big waste of Coverity resources and be pretty impolitely both to Coverity and fellow users of the free scan service.

With the above method, we have full control over what we submit to Coverity and how we do it.With our current scan allowance, we use exactly one scan per day, which leaves two left for cases where it is useful to run interim scans during development.

Why do you use master branch for the scan? Coverity recommends a dedicated branch.
In my opinion, software quality assurance works well only if it is automated and is applied to what actually is being "delivered". In that sense, real CI with one Coverity run per pull request (before we merge!) would be best. We can't do this due to Coverity allowance limitations. The next best thing is to do it as soon as possible after merge (daily in our case) and make sure it is actually run against the then-current state. Introducing a different branch just for scan purposes sounds counter-productive:

  • you either need to update it to full master branch immediately before the scan, in which case you can use master directly
  • or you do update the "Coverity branch" only on "special events" (or even manually), which would be counter the idea to have things checked as early as possible. Why would you withhold scanning something that you merged to master, aka "this is good to go"? Sounds like an approach to self-cheating to me...
In conclusion, using scanning master is the most appropriate way to automatically scan your project. Be bold enough to do that.

Does your approach work for non-Travis, e.g. Buildbot?
Of course, it works perfectly with any solution that permits you to run scheduled "builds". We also use Buildbot for rsyslog. We could have placed it there. Simply because of ease of use we have decided to use Travis for automating the scan as well. We could also have used a scheduled Buildbot instance. So the approach described here is universal. The only thing you really want is that the scan is initiated automatically at scheduled intervals - otherwise you could simply use the manual submit.

Wednesday, October 25, 2017

Time for a better Version Numbering Scheme!

The traditional major.minor.patchlevel versioning scheme is no longer of real use:

  • users want new features when they are ready, not when a new major version is crafted
  • there is a major-version-number-increase fear in open source development, thus major version bumps sometimes come very random (see Linux for example)
  • distros fire the major-version-number-increase fear because they are much more hesitant to accept new packages with increased major version
In conclusion, the major version number has become cosmetic for many projects. Take rsyslog as an example: when we switched to rsyslog scheduled releases, we also switched to just incrementing the minor version component, which now actually increments with each release - but nothing else. We still use the 8.x.0 scheme, where x is more or less the real version number. We keep that old scheme for cosmetic reasons, aka "people are used to it".

Some projects, most notably systemd, which probably invented that scheme, are more boldly: the have switched to a single ever-increasing integer as version (so you talk of e.g. version 221 vs 235 of systemd). This needs some adaption from the user side, but seems to be accepted by now.

And different other approach, known from Ubuntu, is to use "wine-versioning": just use the date. So we have 14.04, 16.04, 17.04 representing year and month of release. This also works well, but is unusual yet for single projects.

What made me think about this problem? When Jan started his log anonymizer project, we thought about how it should be versioned. We concluded that a single ever-incrementing version number is the right path to take. For this project, and most probably for all future ones. Maybe even rsyslog will at some time make the switch to a single version digit...

simplifying rsyslog JSON generation

With RESTful APIs, like for example ElasticSearch, you need to generate JSON strings. Rsyslog will soon do this in a very easy to use way. ...