However, it also seems useful to be able to serialize this data as a text string. That string could be stored in a file for later reference or forensics, or fed to some other tool that understands the format. And as the in-memory object model will be CEE based, and CEE defines such serialization formats, it seems obvious that the library should be able to generate serializations in the CEE-defined and supported formats (note that this does not necessarily mean XML; it may be JSON or syslog structured data as well).
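To make the serialization idea concrete, here is a minimal Python sketch that writes one in-memory event out as JSON and as syslog structured data (RFC 5424 syntax). Everything named here is an assumption for illustration only: the field names (`action`, `src_ip`, `user`), the function names and the SD-ID `event@32473` are hypothetical and are not taken from CEE or from the planned library.

```python
import json


def to_json(event):
    """Serialize the event as a JSON object with a stable key order."""
    return json.dumps(event, sort_keys=True)


def escape_sd_value(value):
    """Escape a parameter value for RFC 5424 structured data.

    RFC 5424 requires backslash, double quote and closing bracket to be
    escaped with a backslash. The backslash must be handled first so the
    escapes added for the other two characters are not escaped again.
    """
    out = str(value)
    for ch in ("\\", '"', "]"):
        out = out.replace(ch, "\\" + ch)
    return out


def to_syslog_sd(event, sd_id="event@32473"):
    """Serialize the event as one structured-data element.

    The SD-ID is a made-up placeholder; 32473 is the enterprise number
    reserved for documentation examples.
    """
    params = " ".join(
        f'{name}="{escape_sd_value(value)}"'
        for name, value in sorted(event.items())
    )
    return f"[{sd_id} {params}]"


# Hypothetical normalized event, as the parser might hand it over.
event = {"src_ip": "192.0.2.1", "user": "alice", "action": "login"}

print(to_json(event))
print(to_syslog_sd(event))
```

The point of the sketch is only that one object model can feed several output formats; which formats and field names are actually used would be dictated by CEE.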
Looking at all this, the normalization library seems to consist of two largely independent (but co-operating) parts:
- the parser engine itself, the part that is used to actually normalize the input string according to the provided sample base and CEE definitions
- a CEE support library, which provides the plumbing for everything that is defined in CEE (like tags, field types and serialization formats)
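The two-part split described in the list above could look roughly like the following Python sketch. It is an illustration under loud assumptions: a regular expression with named groups stands in for a sample base entry (the real sample format is not described here), and `normalize` and `serialize` are hypothetical names, not the library's API.

```python
import json
import re

# Hypothetical stand-in for one sample base entry. The named groups play
# the role of the fields that a sample would define.
SAMPLE = re.compile(r"Failed login for (?P<user>\S+) from (?P<src_ip>\S+)")


def normalize(line, samples=(SAMPLE,)):
    """Parser-engine side: turn a raw log line into a field dictionary.

    Returns None when no sample matches the whole line.
    """
    for sample in samples:
        match = sample.fullmatch(line)
        if match:
            return match.groupdict()
    return None


def serialize(event):
    """CEE-support side: knows nothing about parsing, only about output."""
    return json.dumps(event, sort_keys=True)


event = normalize("Failed login for alice from 192.0.2.1")
print(serialize(event))
```

The only thing the two functions share is the event object passed between them, which is why each part can live in its own library and be used without the other.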
The more I think about it, the more useful this separation seems. So I'll probably split the core normalization library from the CEE part. This is not much effort, but opens up additional uses. I'll then call the normalization part liblognorm (or libeventnorm) and the CEE part libcee, or something along these lines. In this light, liblognorm may actually be the better name, because the parser part is more concerned with logs and log files than with generic events (which often come in other formats).
Again, feedback is appreciated!