Managing your IT infrastructure can be incredibly difficult. This is why we have centrally managed systems like Active Directory, VMware’s vSphere, and so on. Even under central management, though, each system generates its own log data for its own events. The larger your infrastructure, the more nodes whose logs you will have to go through. Even if you filter the logs, you’re still stuck logging into each machine to do so. Unless, of course, you send the logs to one central location with a SIEM.
Centralized logging has been around for quite a while (see RFC 3164 and RFC 5424). On Linux and other UNIX and UNIX-like systems, we’ve had syslog-ng and rsyslog for years. They do the trick by getting everything into one place for analysis, but what arrives is plain text. We can do much better on both the sending and receiving ends.
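To make the plain-text point concrete, here is a minimal Python sketch of decoding the priority prefix that both RFCs place at the front of each message. The priority value is the facility multiplied by 8 plus the severity. The function name decode_pri is made up for illustration, and the sample line is the example message from RFC 3164.

```python
import re

# Severity names defined by the syslog RFCs; list index = numeric severity.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# A syslog message starts with "<PRI>", where PRI is one to three digits.
PRI_PATTERN = re.compile(r"^<(\d{1,3})>")


def decode_pri(line):
    """Return (facility_number, severity_name), or None if there is no valid PRI."""
    match = PRI_PATTERN.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    if pri > 191:  # 23 * 8 + 7 is the largest valid priority value
        return None
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]


sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
print(decode_pri(sample))  # (4, 'crit')
```

Everything after the prefix is free-form text, which is why the receiving end has to do real parsing work to get anything structured out of it.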
You might be wondering why collecting your log data is important. Many insights can be extracted from logs, even mundane ones. An increase in log volume within a period of time can indicate higher load or usage, errors can inform you of issues within your applications and infrastructure, and logs can help with post-mortem investigations of system failures. Logs are an amazing resource, limited only by the verbosity you configure (or that the particular platform supports).
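As a rough illustration of the volume idea, the Python sketch below counts log lines per minute so that a busy period stands out. It assumes each line begins with an ISO 8601 timestamp, which many logs do not; the sample lines and the function name events_per_minute are invented for the example.

```python
from collections import Counter
from datetime import datetime


def events_per_minute(lines):
    """Count log lines per minute, skipping lines without a leading timestamp."""
    counts = Counter()
    for line in lines:
        stamp = line.split(" ", 1)[0]
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # not a timestamped line, ignore it
        counts[when.replace(second=0, microsecond=0)] += 1
    return counts


# Made-up sample data for illustration only.
sample_lines = [
    "2019-03-01T10:00:05 web01 GET /",
    "2019-03-01T10:00:40 web01 GET /login",
    "2019-03-01T10:01:02 web01 GET /",
    "2019-03-01T10:01:10 web01 ERROR upstream timeout",
    "2019-03-01T10:01:30 web01 GET /",
    "not a log line",
]

busiest_minute, count = events_per_minute(sample_lines).most_common(1)[0]
print(busiest_minute, count)
```

A SIEM does this kind of aggregation for you across every host at once, which is where the real value lies.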
A Security Information and Event Management system (SIEM, pronounced like ‘seem’ or ‘seam’) is a suite that combines the centralization of log data with analysis. More capable products will even incorporate automated security analysis for intrusion detection/prevention, incident correlation, load forecasting, and even fancy visualizations of your logging data.
We’ll take a quick look at a few different SIEM products, but we’ll stick with the hard-hitting open source ones. There are many different SIEM stacks, all with different approaches and focuses. Some will focus on security, others on incident-analysis, or even real-time statistics and reporting. In any case, the open source platforms often require a bit of configuration to get meaningful information out of them, but they’re incredibly flexible.
Graylog is a good place to start. Thanks to its OVA, you can get up and running in just a few minutes without having to worry about installing all the dependencies. Graylog 3 just came out about two weeks ago, so it’s hot off the press, with an abundance of features. Graylog is packaged for Linux, and I assume the tarball would run fine on other UNIX and UNIX-like systems, but despite being written in Java, Graylog cannot run properly on Windows due to the way Windows handles file locking.
Graylog accepts a wide variety of log senders and provides many methods for extracting the data you need from log messages, even if they’re only in plain-text. Additionally, Graylog allows for the creation of ‘pipelines’ to further parse your logs and glean even more information from them.
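To give a feel for what extracting data from a plain-text message means, here is a small standalone Python sketch that pulls named fields out of an SSH failed-login line with a regular expression. This is not Graylog’s API or configuration format, only the underlying idea; the pattern, field names, and sample message are assumptions chosen for illustration.

```python
import re

# Hypothetical pattern for OpenSSH-style "Failed password" messages.
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<src_ip>\S+) port (?P<port>\d+)"
)


def extract_fields(message):
    """Return a dict of structured fields, or an empty dict if nothing matches."""
    match = FAILED_LOGIN.search(message)
    if match is None:
        return {}
    fields = match.groupdict()
    fields["port"] = int(fields["port"])  # store the port as a number
    return fields


message = "sshd[1042]: Failed password for invalid user admin from 203.0.113.7 port 52144 ssh2"
print(extract_fields(message))
# {'user': 'admin', 'src_ip': '203.0.113.7', 'port': 52144}
```

Once a message has fields like these attached, you can search, count, and alert on them instead of grepping raw text.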
There is a learning curve to Graylog, but it’s very easy to get started.
ELK isn’t a single project. Rather, it is a stack of three: Elasticsearch (which Graylog also uses), Logstash, and Kibana. They are all from the same family of projects under Elastic.co. Elasticsearch is the storage backend that underpins the incredible indexing capabilities of both Graylog and the ELK Stack. Logstash is the actual log receiver, and it can also accept a variety of formats. And finally, Kibana is an analysis and reporting engine capable of generating very appealing visuals.
Unlike Graylog, the ELK Stack is not easy to get going with. It requires significant initial setup. The ELK Stack is capable of running on Windows, though, which may appeal to some.
The ELK Stack might appear to be a huge time investment with respect to learning and configuration, but it can be highly beneficial considering the scalability and flexibility of the stack. Companies like Netflix and eBay use the ELK Stack for logging. This flexible platform has a lot to offer and is only benefitting from its ongoing development.
SIEMonster (pronounced ‘sea monster’) is a newer SIEM and is interesting in that it brings a wide variety of independent open source logging and security projects together into an integrated package. SIEMonster also runs on Elasticsearch (seeing a pattern?) but doesn’t stop there. SIEMonster actually uses the whole ELK Stack, which makes for a very powerful base to build atop. In addition to ELK, SIEMonster uses Wazuh for threat intelligence, security analysis, and host-based intrusion detection, along with several components that extend the functionality of Elasticsearch.
Right out of the box, SIEMonster might appear to be a clear winner given that it aims to be a turn-key security analysis suite. It’s an incredibly feature-rich package. While SIEMonster is packaged specifically for Linux, it’s hard to turn your nose up at a package this robust simply because of the platform it does or doesn’t run on.
Shipping Your Logs
So, once you have your server up and running, how do you actually get your logs to it? Most UNIX and UNIX-like operating systems have the ability to send their logs over the network right in their logging daemons. What if your OS doesn’t support the logging you want? Well, you can probably use one of Elastic.co’s Beats, and if none are suitable, you can use their libbeat and create a logging client that perfectly meets your needs.
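If the thing producing logs is your own application rather than the OS, Python’s standard library can already ship messages to a remote syslog listener. The sketch below uses logging.handlers.SysLogHandler, which sends over UDP by default. The address 127.0.0.1, port 514, the ‘myapp’ tag, and the function name build_remote_logger are placeholders; substitute whatever input your logging server actually exposes.

```python
import logging
import logging.handlers


def build_remote_logger(host, port=514, name="shipping-demo"):
    """Return a logger that forwards records to a syslog listener over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # SysLogHandler uses UDP unless a different socktype is passed.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    # The "myapp:" tag is an arbitrary example so the receiver can filter on it.
    handler.setFormatter(logging.Formatter("myapp: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # Placeholder address: replace with your central logging server.
    log = build_remote_logger("127.0.0.1", 514)
    log.info("disk usage at 91%")
```

UDP is fire-and-forget, so nothing here confirms delivery. For anything you cannot afford to lose, a TCP input or a dedicated shipper such as a Beat is the safer choice.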
Additionally, there are OSSEC’s HIDS clients and many, many more clients available, even for Windows.
As you’ve probably noticed, there is a lot of security focus here. If an attacker gains entry to your server with sufficient privilege to delete your logs, you may have little to investigate with afterward. This poses not only a potential blind spot but also a very big security hole. By sending your logs off to a centralized logging server, you still have a copy of those logs for later analysis. Additionally, you can monitor those logs in real time to detect intrusions and alert security staff or even automatically take precautionary measures.
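As a toy version of that kind of detection, the Python sketch below flags any source address that racks up several failed logins within a short window. The threshold of 5 attempts in 60 seconds, the event format, and the function name find_bruteforce_sources are all assumptions for illustration; real SIEM rules are considerably richer.

```python
from collections import defaultdict, deque


def find_bruteforce_sources(events, threshold=5, window_seconds=60):
    """Return the set of source IPs with >= threshold failures inside the window.

    events is an iterable of (unix_timestamp, src_ip) pairs for failed logins.
    """
    recent = defaultdict(deque)  # src_ip -> timestamps still inside the window
    flagged = set()
    for timestamp, src_ip in sorted(events):
        attempts = recent[src_ip]
        attempts.append(timestamp)
        # Drop attempts that have aged out of the sliding window.
        while attempts and timestamp - attempts[0] > window_seconds:
            attempts.popleft()
        if len(attempts) >= threshold:
            flagged.add(src_ip)
    return flagged


# Made-up events: one noisy source and one that fails only occasionally.
events = [(t, "203.0.113.7") for t in range(0, 50, 10)]
events += [(0, "198.51.100.2"), (300, "198.51.100.2")]
print(find_bruteforce_sources(events))  # {'203.0.113.7'}
```

Because the events live on the central server, this check keeps working even if the attacker wipes the logs on the machine being attacked.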
A centralized logging system or SIEM package is an invaluable addition to your IT or application infrastructure and can save you a lot of time and headache.