Monday, February 08, 2010

The typical logging problem as viewed from syslog

I run into different syslog use cases from time to time, so I thought it would be a good idea to write down what I consider the typical logging problem. Because I consider it the typical problem, syslog (and WinSyslog and rsyslog specifically) addresses most of these needs very well. What they leave out is the analysis and correlation part, but other members of the family (like our log analyzer) and third-party tools take good care of that.

So the typical logging problem, as seen from the syslog perspective, is:

  1. there exist events that need to be logged
  2. a single "higher-level" event E may consist of a
    number of fine-grained lower level events e_i
  3. each of the e_i's may be on different
    systems / proxies
  4. each e_i consists of a subset of properties
    p_j from a set of all possible common properties P
  5. in order to gain higher-level knowledge, the
    high-level event E must be reconstructed from
    e_i's obtained from *various* sources
  6. a transport mechanism must exist to move event
    e_i records from one system to another, e.g., to
    a central correlator
  7. systems from many different suppliers may be involved,
    resulting in different syntax and semantics of
    the higher-level objects
  8. there is potentially a massive number of events
  9. events potentially need to be stored for
    an extended period of time
  10. quick review of at least the current event data
    (today, past week) is often desired
  11. there is a lot of noise data
  12. the data needs to be fed into backend processes,
    like billing systems
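
Points 2 to 5 can be illustrated with a minimal sketch. It assumes that every low-level event e_i carries a shared correlation key, which is a simplifying assumption: as point 7 notes, records from different suppliers often lack any common key or format. The names `LowLevelEvent`, `correlation_id` and `reconstruct` are hypothetical and not part of syslog, WinSyslog or rsyslog.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LowLevelEvent:
    """One fine-grained event e_i as received by a central correlator."""
    source: str          # system or proxy that emitted the record
    correlation_id: str  # hypothetical key tying e_i to one high-level event E
    properties: dict = field(default_factory=dict)  # subset of the property set P


def reconstruct(events):
    """Group e_i records by correlation key and merge their properties.

    Returns a dict mapping each key to the sources that contributed
    and the union of their properties (later records win on conflict).
    """
    grouped = defaultdict(lambda: {"sources": [], "properties": {}})
    for event in events:
        entry = grouped[event.correlation_id]
        if event.source not in entry["sources"]:
            entry["sources"].append(event.source)
        entry["properties"].update(event.properties)
    return dict(grouped)


# Records arriving from various systems, in arbitrary interleaving.
events = [
    LowLevelEvent("firewall", "session-42", {"src_ip": "192.0.2.10"}),
    LowLevelEvent("proxy", "session-42", {"url": "/billing"}),
    LowLevelEvent("app", "session-7", {"user": "alice"}),
    LowLevelEvent("app", "session-42", {"user": "bob"}),
]

high_level = reconstruct(events)
for key, event in high_level.items():
    print(key, event["sources"], event["properties"])
```

In practice the hard part is exactly what this sketch assumes away: finding a reliable key and normalizing properties across suppliers before any grouping can happen.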


John Moehrke said...

In Healthcare this is the problem behind the "Account of Disclosures" problem. I am constantly blogging to help people understand this. Too many influential people think that one can capture at the E level.

Rainer said...

Hi John,

I feel (and share ;)) your pain.

Eric Fitzgerald (then of the Microsoft Event Log "Division") gave a pretty good (and brief) description of this "thin vs. fat log" problem on the loganalysis public mailing list. While the list archive is long gone, I have quoted him in one of my (unfinished...) papers. It is probably a good read:

Search for "thin vs. fat".

I've also visited your blog and it contains some very good posts. I will keep an eye on it.

Thanks for sharing,

simplifying rsyslog JSON generation

With RESTful APIs, such as ElasticSearch, you need to generate JSON strings. Rsyslog will soon do this in a very easy-to-use way. ...