Friday, December 18, 2015

rsyslog release policy issues

The usual end-of-year release policy discussion has begun on the rsyslog mailing list, and I wanted to post some thoughts here for a broader audience and easy access in the future. Enjoy ;)

Up until ~15 months ago, we released when there was a need to. Need was defined as

- important enough (set of bugfixes)
- new functionality

This resulted in various releases. We had the stable/devel releases. Stable releases were rare, devel frequent.

Now, we have scheduled releases. Actually, a release is triggered when we hit a certain calendar date, irrespective of whether or not there is a need to release (there are always one or two minor fixes, so we will probably never experience a totally blank release). We have also switched to stable releases only, and done so without grief (basically because a) we have improved testing and b) users didn't use devel at all).

I just dug into the old discussion. A good entry point is probably this here, where we talk about patches:

The new system works reasonably well. It has its quirks, though. Let's look at a concrete example:

8.14.0, to me, was an absolutely horrible release. The worst we have done in the past 2 to 3 years. I worked hard on fixing some really bad race issues with JSON variables. On the Friday before the release I was ready to release that work, which would be really useful for folks who make heavy use of those variables. Then, over the weekend and Monday, it turned out that we might get unwanted regressions that weren't detected earlier (NO testbench can mimic a heavily used production system, so let's not get into the "we need better tests" blurb). The end result was that I pulled the plug on release day, and what we finally released was 8.13.0 plus a few small things. All problems with variables persisted. If I had had half a week to a week (don't remember exactly) more, we could have done a real release instead of the 8.13 re-incarnation. But, hey, we run on a schedule.

Now 8.15.0 fixes these problems (except for the json-c induced segfault, which we cannot fix in rsyslog). It also has all the other "8.14" enhancements and fixes and so is actually worth 3 months of work. It is a *very heavy* release. Usually, I'd never release such a fat release shortly before the holiday period. Not that I distrust it, and we really got some new testing capabilities (really, really much better), so it is probably the most solid release in a long time (besides the small quirk with the missing testbench files). But in general I don't like to do releases when I know there are very limited resources available to deal with problems. That's the old datacenter guy in me. But, again, hey, we run on a schedule.

There have been similar occasions in the past 14 months. That's the downside. Then again, due to the 6-week cycle, things usually do not get really bad.

The scheduled model has a lot of good things as well. First of all, everyone (users and contributors) knows when the next release will be. This also means you can promise to include something in a specific release. However, users usually know when the release happens, but not what will be part of it, so in a sense it's not much better than before IMO. The new model has advantages for me: fewer releases mean less work. Also, I no longer really need to think about when to do a release, which feature is important enough and so on. I just look at the calendar and know that, for example, on November 15th, 2016 we will have a release, no matter if I am present, no matter what is done code-wise etc. (we actually had, for the first time ever, a release while I was on vacation, and it went really well, as I learned later). That really eases my task.

All of this is based on the "we release every 6 weeks, interim releases happen only for emergencies and anything else may be pulled as patches" policy. If we now begin to say "this problem is inconvenient to ..{pick somebody}, we need to do a re-release", we get into trouble. I wonder which groups of "somebody" are important enough to grant non-emergency releases. Are only distro maintainers important enough? Probably not. So enterprise users? Mmmm.. maybe small enterprises as well? Who judges this? So let's assume every user is as important as every other (an idea I really like). If I then look at my change logs, I think I would need to release more frequently. In essence, I would need to release again when it is needed, which is, surprise, the as-needed schedule.

Rsyslog is not a project big enough to do an even more complex release schedule. To keep things manageable for me, I need to release either

a) as-needed

b) on schedule (except for *true* emergencies)

And *all of that* is the reason for my reluctance to break the release policy just because this time it is distro maintainers, rather than end users, who experience the bug.

I am currently tempted to switch back to "as-needed" mode, even though this means more work for me. 
