Tuesday, November 22, 2016

Would creating a simple Linux log file shipper make sense?

I am currently thinking about creating a very basic shipper for log files, but I wonder whether it really makes sense. I am especially concerned that good tools may already exist. Being lazy, I thought I would ask for some wisdom from those in the know before investing more time in searching for solutions and weighing their quality.

I've read more than once that logstash is far too heavy for a simple shipper, and I've also heard that rsyslog is sometimes a bit heavy (albeit much lighter) for the purpose. I think with reasonable effort we could create a tool that

  • monitors text files (much like imfile does) and pulls new entries from them
  • does NOT further process or transform these logs
  • sends the resulting entries to a very limited number of destinations (for starters, I'd say syslog protocol only)
  • with the focus on being very lightweight, intentionally not implementing anything complex.
Would this be useful for you? What would be the minimal feature set you need in order to make it useful? Does something like this already exist? Is it really needed or is a stripped-down rsyslog config sufficient?
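
To make the idea a bit more concrete, here is a minimal sketch in Python of what such a shipper could look like. It is only an illustration under stated assumptions, not a design decision: the function names, the fixed priority value of 14 (facility user, severity info), UDP as transport and the one-second polling interval are all my assumptions, and file rotation, truncation and state persistence are deliberately left out.

```python
import socket
import time
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline; a partial line is picked up later.
    end = data.rfind(b"\n")
    if end == -1:
        return [], offset
    complete = data[: end + 1]
    raw_lines = complete.split(b"\n")[:-1]  # drop the empty tail after the last "\n"
    lines = [raw.decode("utf-8", "replace") for raw in raw_lines]
    return lines, offset + len(complete)


def format_syslog(line, hostname, tag, pri=14, now=None):
    """Wrap an unmodified log line in a traditional (RFC 3164 style) syslog frame."""
    now = now or datetime.now()
    timestamp = f"{MONTHS[now.month - 1]} {now.day:2d} {now:%H:%M:%S}"
    return f"<{pri}>{timestamp} {hostname} {tag}: {line}"


def ship(path, target_host, target_port=514, tag="shipper", poll_seconds=1.0):
    """Poll one file forever and forward each new line via UDP syslog."""
    hostname = socket.gethostname()
    offset = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        while True:
            lines, offset = read_new_lines(path, offset)
            for line in lines:
                frame = format_syslog(line, hostname, tag)
                sock.sendto(frame.encode("utf-8"), (target_host, target_port))
            time.sleep(poll_seconds)
```

A call such as ship("/var/log/app.log", "loghost.example.net") would then forward every new line unchanged (both names are placeholders). Everything a real tool needs beyond this, such as remembering the read position across restarts, is exactly the kind of feature where I'd like to hear what the minimal set is.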

I'd be grateful for any thoughts in this direction.

Tuesday, August 23, 2016

rsyslog error reporting improved

Rsyslog provides many to-the-point error messages for config file and operational problems. These help immensely when troubleshooting issues. Unfortunately, many users never see them. The prime reason is that most distros never log syslog.* messages, so they are just thrown away and invisible to the user. While we have been trying to make distros change their defaults, this has not been very successful. The result is a lot of user frustration and fruitless support work for the community. Many things can be resolved very simply if only the error message is seen and acted on.

We have now changed our approach to this. Starting with v8.21, rsyslog by default logs its own messages via the syslog API instead of processing them internally. This is a big plus especially on systems running the systemd journal: messages from rsyslogd will now show up when running

$ systemctl status rsyslog.service

This is the place where error messages are expected nowadays, and it is definitely a place where the typical administrator will see them. So while this change requires some config adjustment on a few exotic installations (more below), we expect it to generally improve the rsyslog user experience.

Along the same lines, we will also work on better error reporting, especially for TLS and queue-related issues, which rank high in rsyslog support discussions.

Some fine details on the change of behaviour:

Note: you can usually skip reading the rest of this post if you run only a single instance of rsyslog and do so with more or less default configuration.

The new behaviour has actually been available for some time, but it needed to be explicitly turned on in rsyslog.conf via

global(processInternalMessages="off")

Of course, distros didn't do that by default. Also, it required rsyslog to be built with liblogging-stdlog, which many distros do not do. While our intent when we introduced this capability was to provide the better error logging we now have, it simply did not work out in practice. The rationale for the original approach was that it was less intrusive. The new method uses the native syslog() API if liblogging-stdlog is not available, so the setting always works (we are even considering moving away from liblogging-stdlog, as we see it wasn't really adopted). In essence, we have primarily changed the default setting for the "processInternalMessages" parameter. This means that by default, internal messages are no longer logged via the internal bridge to rsyslog but via the syslog() API call (either directly or via liblogging). For the typical single-rsyslogd-instance installation this is mostly unnoticeable (except for some additional latency). If multiple instances are run, only the "main" one (the one processing system log messages) will see all messages. To return to the old behaviour, do either of the following:

  1. add in rsyslog.conf:
    global(processInternalMessages="on")
  2. export the environment variable RSYSLOG_DFLT_LOG_INTERNAL=1
    This will set a new default; the value can still be overridden via rsyslog.conf (method 1). Note that the environment variable must be set in your startup script (which one depends on your init system or systemd configuration).

Note that in most cases, even in multiple-instance setups, rsyslog error messages were previously thrown away. So even in this case the new behaviour is superior to the previous state: at least errors are now properly recorded. This also means that even in multiple-instance setups it often makes sense to keep the new default!

Friday, May 13, 2016

rsyslog's master-candidate branch gone away

Thanks to the new, improved CI workflow, we no longer need to manually do a final check of pull requests. I have used the new system for roughly two weeks now without any problems. Consequently, I have just removed the master-candidate branch from our git (with a backup "just in case" currently remaining in the adiscon git repository).

Anyone contributing, please check the CI status of your PRs, as we can only merge things that pass the CI run. Note, though, that there still is a very limited set of tests which may falsely fail. Their number is shrinking, and I usually catch these fairly quickly and restart them. If in doubt, please add a comment to the PR and I'll investigate.

Saturday, April 23, 2016

Improvements in CI environment and workflow change

Roughly one and a half years ago we at the rsyslog project started to get serious with CI, at that time with Travis only. Kudos to Thomas D. "whissi" for suggesting this and helping us to set up the initial system. In aid of CI, we have changed to a purely Pull Request (PR) driven development model, and have had great success with that.

Over time, we have added more CI resources (thanks to Digital Ocean for capacity sponsorship!) and begun to use Buildbot to drive those. Buildbot is a great tool, and has helped us tremendously to further improve software quality. Unfortunately, though, it does not offer as close integration into (GitHub) PRs as Travis does. This resulted in a workflow where we had all PRs initially checked by Travis and, if all went well, I manually merged them into the master-candidate branch, which Buildbot monitored. In those infrequent cases where the Buildbot tests detected problems, I needed to manually contact the PR submitters. This worked well, but required some effort on my part.

Over the past two weeks we designed and implemented a small script that integrates GitHub with Buildbot much like Travis does. In essence, a new PR (or an update to an existing one) now automatically initiates the Buildbot build AND the result is shown right on GitHub inside the PR. That's pretty sweet as it a) keeps submitters informed of everything, b) provides even better coverage of multiple-platform testing and c) saves me from a lot of manual labor. Note that at the moment we see some infrequent quirks from this system (like some buildbot slaves not reporting, probably due to temporary network issues), but it already works much better than the old manual system. Also, I still have the capability to check things manually if there is a quirk.
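
For readers curious how a build result can end up inside a PR, the following Python sketch shows one possible way: GitHub's commit status API accepts a POST to /repos/{owner}/{repo}/statuses/{sha} with a state, a target URL, a description and a context. This is only an illustration of that mechanism and not necessarily how our script works; the function name, the "buildbot/<builder>" context naming and all argument values are assumptions made up for the example.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_status_request(repo, sha, passed, build_url, builder, token):
    """Prepare (but do not send) a commit status update for one builder."""
    payload = {
        "state": "success" if passed else "failure",
        "target_url": build_url,              # link back to the build log
        "description": "build passed" if passed else "build failed",
        "context": f"buildbot/{builder}",     # hypothetical naming scheme
    }
    return urllib.request.Request(
        f"{GITHUB_API}/repos/{repo}/statuses/{sha}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen() would actually send the update. Using one context per builder means each platform shows up as its own line in the PR's list of checks.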

As a consequence, we will change the workflow once again, removing the master-candidate branch from it. Now that each and every PR is run through all the checks we have, there is no need for an interim step before the final merge.