Logging Syslog to a Database

Zbyszek Sobiecki, <kodz@slonce.com>
0. Problem.

When there's a problem on your system or in your network, the first thing you check is the system logs. You identify which system to check, then locate the logfile. Sometimes you even have to check your syslog configuration, only to discover that what you are looking for is not logged at all because of a misconfiguration. You may then run 'less', 'more' and 'grep' to start digging. It's nice when the answer is in the last few lines of the log, but what if it isn't? What if what you're looking for is more complex, and you have to analyze more and more data, combined with log files from other hosts? What if you have no time to waste digging through the many "useless" syslog messages? What if you have to find the data in a backup that newsyslog has gzipped into many files? This is a horrible waste of your valuable time.

1. Idea.

It would be nice to have all network logs in one central place, such as an SQL database. This could solve many of the problems mentioned above: we have one place where we gather all logs; we have a powerful way to find interesting messages (e.g., complex SQL queries); and it's fast.

Such a solution raises several security questions. For secure communication between the loghost and the other systems we can use SSL (e.g. using stunnel[1]), IPsec or any similar method. We can also use something more advanced, like serial ports and a multi-port serial card on the loghost. Logging via a serial port is totally independent of the network layer, as the serial line provides its own transport -- we still get logs even when TCP/IP is broken or the network is down. You should also consider using VLANs on operating systems that support them.

2. Solution.

It is practically impossible to implement such an idea with the standard tools included in a system distribution. However, there are many programs we can use. The two most powerful and feature-rich syslog daemons I know are syslog-ng[2] and msyslog[3].
Msyslog has built-in support for MySQL and PostgreSQL output (as modules). With syslog-ng you need an external application such as sqlsyslogd[4]. External programs should be avoided where possible because they are less efficient. There is also a syslogd+mysql[5] package available, which is a patched version of FreeBSD's syslogd.

It's a good idea to keep the SQL processing on the central loghost only and feed it raw data over TCP, UDP, or another protocol, using an encryption scheme as mentioned above.

3. Implementation.

In this chapter, I'll describe how to configure everything to work the way we expect it to. I assume that a smart administrator won't have any trouble implementing the solutions described in this article. I would like to point out that I'm just giving some ideas and hints; feel free to contact me if you would like to discuss any of them.

Let's say that we have an imaginary network with some servers - "sv1", "sv2" and "sv3" - and one central loghost called... "loghost". sv1 is running IRIX, sv2 is running FreeBSD, and sv3 is some other device, not capable of anything more advanced than sending logs in UDP packets.
------sv1--------------
syslogd -pipe-> stunnel
-ssl-tcp->
stunnel -loopback-tunnel-> msyslog(im_tcp) -> sql
----------------------loghost--------------------
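To make the last hop in the diagram ("-> sql") more concrete, here is a minimal Python sketch of what a loghost-side writer does with each received line. It is not msyslog's implementation or configuration. It uses SQLite from the Python standard library as a stand-in for MySQL or PostgreSQL, and the table name `syslog`, the column names, and the helper names `parse_pri` and `store_line` are all assumptions made for illustration. The columns follow the layout described in section 4 (time of receipt, host, message, plus an incrementing index). The only protocol detail relied on is the BSD syslog `<PRI>` prefix, where PRI = facility * 8 + severity.

```python
import re
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: time of receipt, sending host, message text,
# plus a self-incrementing index for time-independent lookups.
SCHEMA = """
CREATE TABLE IF NOT EXISTS syslog (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    received TEXT NOT NULL,
    host     TEXT NOT NULL,
    message  TEXT NOT NULL
)
"""

# Optional "<PRI>" prefix of a BSD syslog line.
PRI_RE = re.compile(r"^<(\d{1,3})>")


def parse_pri(line):
    """Split '<PRI>rest' into (facility, severity, rest).

    Returns (None, None, line) when the line carries no priority prefix.
    """
    match = PRI_RE.match(line)
    if not match:
        return None, None, line
    pri = int(match.group(1))
    return pri // 8, pri % 8, line[match.end():]


def store_line(conn, host, line, received=None):
    """Insert one log line and return the index assigned to it."""
    # Facility and severity are parsed but not stored in this sketch.
    _facility, _severity, text = parse_pri(line.rstrip("\r\n"))
    if received is None:
        received = datetime.now(timezone.utc).isoformat(timespec="seconds")
    cursor = conn.execute(
        "INSERT INTO syslog (received, host, message) VALUES (?, ?, ?)",
        (received, host, text),
    )
    conn.commit()
    return cursor.lastrowid
```

A caller would open a connection once (for example `sqlite3.connect("logs.db")`, a hypothetical path), run `conn.execute(SCHEMA)`, and then call `store_line(conn, "sv2", line)` for every line read from the tunnel. Searching is then an ordinary query such as `SELECT * FROM syslog WHERE host = 'sv2' AND message LIKE '%failed%'`.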
4. Management.

Log management is very important. It includes log backups, rotation, and tools for browsing log messages. It's not good for a database to grow continuously. We should dump it every now and then, back up the dump, and keep the possibility of digging into the old records at any time.

The default table structure in msyslog for an SQL database is simple and consists of the date/time of receipt, the host that logged the message, and the message line itself. I've added one more field - a self-incrementing index, which is useful when we need to locate something by message number, offset, or another time-independent criterion.

Generally it's a good idea to write a backup script (you could use pg_dump for PostgreSQL or something similar for other databases) and run it periodically from cron, as often as your local backup policy requires. You should store log backups in a form that can be imported back into the database for browsing, but that is not the subject of this article. You can always have two instances of the database: a primary one, for gathering current logs in real time and searching through recent messages; and a secondary one, for loading older records and digging through them. This may look a bit complex, but with some simple perl/shell scripts the whole thing gives us a fast and powerful log management tool.

5. Security.

While implementing the centralized loghost project, you must remember that logs contain very important information, not only for you as a system/network administrator, but also for an intruder trying to break in. It is even possible to write automated agents that gather network activity and logs and use the information from the collected messages to perform distributed metastasis. You should turn off every other service on the loghost, leaving only those needed to receive log messages; filter ports; accept logs only from specified hosts; set up MAC filtering on switches; and so on - but that's a different story.

6. Notes.

I'm working on a set of scripts and tools to automate log management, but they are not yet finished. You can always contact me for more information at kodz@slonce.com. I would like to thank Kamil Andrusz and Maciej Kozak for their support.
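As an illustration of the primary/secondary arrangement from section 4, the following Python sketch moves records older than a cutoff from one database to another. It is not one of the author's scripts. SQLite stands in for the real database, and the table `syslog`, its columns, and the function name `archive_older_than` are assumptions. It also assumes that `received` holds timestamps in one fixed ISO 8601 format, so that comparing them as strings orders them by time.

```python
import sqlite3

# Hypothetical schema shared by both databases. AUTOINCREMENT keeps the
# primary from reusing an index after old rows have been deleted.
SCHEMA = """
CREATE TABLE IF NOT EXISTS syslog (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    received TEXT NOT NULL,
    host     TEXT NOT NULL,
    message  TEXT NOT NULL
)
"""


def archive_older_than(primary, secondary, cutoff):
    """Move rows received before `cutoff` from primary to secondary.

    Returns the number of rows moved.
    """
    rows = primary.execute(
        "SELECT id, received, host, message FROM syslog "
        "WHERE received < ? ORDER BY id",
        (cutoff,),
    ).fetchall()
    if not rows:
        return 0

    # Copy and commit before deleting: if the run is interrupted between
    # the two steps, rows are duplicated rather than lost, and
    # INSERT OR IGNORE skips the duplicates on the next run.
    secondary.executemany(
        "INSERT OR IGNORE INTO syslog (id, received, host, message) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    secondary.commit()

    primary.executemany(
        "DELETE FROM syslog WHERE id = ?",
        [(row[0],) for row in rows],
    )
    primary.commit()
    return len(rows)
```

Run from cron with a cutoff such as "thirty days ago", this keeps the primary small. The original index is preserved in the secondary, so a message number noted earlier still identifies the same record after it has been archived.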
[1] - stunnel - SSL tunnel - http://www.stunnel.org/