De-Coder’s Ring

Software doesn’t have to be hard

Suricata Stats to Influx DB/Grafana

For everyone unfamiliar, Suricata is a high performance network IDS (Intrusion Detection System), IPS (Intrusion Prevention System) and NSM (Network Security Monitor).  Suricata is built to monitor high speed networks to look for intrusions using signatures.  The first/original tool in this space was Snort (by Sourcefire, acquired by Cisco).

NSM mode for Suricata accomplishes some pretty fantastic outputs.  In the early days of nPulse’s pivot to a network security company, I built a prototype ‘reassembly’ tool.  It would take a PCAP file, shove the payloads of the packets together, in order by flow, and extract a chunk of data.  Then I had to figure out what was in that jumbo payload.  Finding things like FTP or HTTP were pretty easy…. but then the possibilities became almost endless.    I’ll provide a link at the bottom with suricata and other tools in the space.  Suricata can do this extraction, in real time, on a really fast network.  It’s a multi threaded tool that scales.

Suricata can log alerts, NSM events and statistics to a json log file, eve.json.  On a typical unix box, that file will be in /var/log/suricata.  The stats event type is a nested JSON object with tons of valuable data.   The hierarchy of the object looks something like this:

Suricata Stats Organization

Stats layout for Suricata eve.json log file

For a project I’m working on, I wanted to get this Suricata stats data into a time series database.  For this build out, I’m not interested in a full Elasticsearch installation, like I’m used to, since I’m not collecting the NSM data, but only this time series data.  From my experience, RRD is a rigid pain, and graphite (as promising as it is) can be frustrating as well.  My visualization target was Grafana and it seems one of the favored data storage platforms is InfluxDB, so, I thought I’d give it a shot.  Note, Influx has the ‘tick’ stack, which included a visualization component, but, I really wanted to use Grafana.   So, I dropped Chronograf in favor of Grafana.

Getting Telegraf (machine data, CPU/Memory) injected into Influx and visualized within Grafana took 5 minutes. Seriously.  I followed the instructions and it just worked.  Sweet!  Now it’s time to get the suricata stats file working..

snippet of the above picture:

As you can see, the nested JSON is there.  I really want that “343734” “ipv4” number shown over time, in Grafana.  After I installed Logstash (duh), to read the eve.json file, I had to figure out how to get the data into Influx.  There is a nice plugin to inject the data, but unfortunately, the documentation doesn’t come with good examples. ESPECIALLY good examples using nested JSON.   Well, behold, here’s a working document, which gets all the yummy Suricata stats into Influx.

WHOAH! That’s a LOT of fields.  Are you kidding me?!  Yep, it’s excellent. Tons of data will now be ‘graphable’.  I whipped together a quick python script to read an example of the JSON object, and spit out the data points entries, so I didn’t have to type anything by hand.  I’ll set up a quick gist in github to show my work.

Let’s break it down a little bit.

This snippet tells logstash to read the eve.json file, and tell it that each line is a JSON object:

This section tells logstash to drop every event that does not have “event_type” of “stats”

Suricata has a stats log file, that could probably be used by Logstash, but I may do that on another day. It’s way more complicated than JSON.

The last section is the tough one.  I found documentation that showed the general format of “input field” => “output field”… but that was it.  It took a ton of time over the past working day to figure out exactly how to nail this.  First, fields like ‘host’, ‘user’,’password’,’db’,’measurement’ are very straight forward Influx concepts.  DB is much like a name spacing or a ‘table’ in a traditional sense.   A db contains multiple measurements.  The measurement is the thing we’re going to track over time.  In our case, these are the actual low level details we want to see, for instance, ‘stats-capture-kernel-drops’.

Here’s an example:

On the left ‘stats-decoder-ipv4’ is the measurement name that will end up in InfluxDB.  The right side is how Logstash knows where to find the value based on this event from eve.json.  %{ indicates the value will come from the record.  Then Logstash just follows the chain down the JSON document.  stats->decoder->ipv4

That’s it.   The logstash config, eve.json, little python script, and here’s a picture!

Grafana Screen Shot showing CPU, Memory and Suricata Information

Grafana Screen Shot showing CPU, Memory and Suricata Information

Pretty straight forward.  Example configuration on one of the stats:

Configure graph in grafana

Configure graph in grafana

Please leave me a comment!  What could be improved here?   (Besides my python not being elegant.. it works.. .and sorry Richard, but, I’m  a spaces guy, tabs are no bueno)

  1. GITHUB:  https://github.com/chrisfauerbach/suricata_stats_influx
  2. Suricata: https://suricata-ids.org/
  3. Snort Acquired Sourcefire:  http://www.cisco.com/web/about/ac49/ac0/ac1/ac259/sourcefire.html
  4. Grafana – http://grafana.org/
  5. InfluxDB/TICK stack: https://influxdata.com
  6. Chronograf: https://influxdata.com/time-series-platform/chronograf/
  7. Telegraf: https://influxdata.com/time-series-platform/telegraf/
  8. Logstash:  https://www.elastic.co/products/logstash

Short URL: http://bit.ly/28ZmQ6w

1 Comment

  1. Morgan Kisienya

    July 9, 2016 at 12:40 PM

    Hi Chris, thanks,

    This is very good work.

    Just wondering how you managed to get JSON data to influxDB because the current version of InfluxDB doesn’t support the JSON . The predominant reason it was deprecated was that decoding JSON was the largest performance bottleneck in the system.

    See https://github.com/influxdata/influxdb/pull/2696#issuecomment-106968181

Leave a Reply

Your email address will not be published.

*

© 2017 De-Coder’s Ring

Theme by Anders NorenUp ↑