Building High-Performance, Low-Latency rsyslog for Splunk

This is a brief follow-up to my earlier post. In a very large-scale environment, the write -> monitor -> read cycle between a log-appending source such as rsyslogd and Splunk can add latency before log data reaches the destination environment. Last week I stumbled onto an rsyslog feature, developed a couple of major versions ago, that has been very underappreciated. omprog (the program output module) allows a developer to receive events from rsyslog in any program without first waiting for a disk write. I’ve developed a little bit of code that transfers events directly to Splunk using the HTTP Event Collector; download it and try it out.

What the output module allows for is direct, scalable transfer between rsyslog and Splunk using native protocols. Ideal use cases include dynamically scaling cloud environments and embedded devices where agents are not acceptable.
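To make the idea concrete, here is a minimal sketch of the program side of omprog. By default rsyslog starts the configured binary and writes each templated message to its stdin, one per line, so the program only needs a read loop. This is not the code from the repository; the names read_events and main are hypothetical and chosen only for illustration.

```python
import sys


def read_events(stream):
    """Yield one event per non-empty line that rsyslog writes to the stream."""
    for line in stream:
        event = line.rstrip("\n")
        if event:
            yield event


def main():
    # Hypothetical entry point: rsyslog keeps stdin open for as long as the
    # action is active, so this loop runs until rsyslog closes the pipe.
    for event in read_events(sys.stdin):
        # A real program would forward the event here; this sketch only
        # echoes it to stderr so nothing is written to disk.
        sys.stderr.write("received: %s\n" % event)
```

Calling main() under the usual `__main__` guard would turn this into something rsyslog could launch as an omprog binary.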

Credits

  • The rsyslog dev team for making this possible, and Rainer for the presentation that inspired me
  • The Splunk dev team for the really awesome HTTP Event Collector, and George, who developed the Python class interface
  • The Splunk Stream team, who added direct Event Collector usage in Stream 6.5, proving it works at significant scale

Setup

  • Set up the HTTP Event Collector behind a load balancer
  • Note your token
  • Install requests using apt, yum, or pip: http://docs.python-requests.org/en/master/user/install/
  • If using certificate verification, set up what requests needs for it
  • “git” the code from https://bitbucket.org/rfaircloth-splunk/rsyslog-omsplunk/src
  • Place omsplunkhec.py and splunk_http_event_collector.py in a location where rsyslog can execute them
  • Set up an rsyslog rule set with an action similar to the following:
    module(load="omprog")
    action(type="omprog"
           binary="/opt/rsyslog/hecout.py --source=rsyslog:hec --sourcetype=syslog --index=main" 
           template="RSYSLOG_TraditionalFileFormat")
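For readers curious what happens to the --source, --sourcetype and --index options in the action above, the following is a rough sketch of how a batch of events could be wrapped for the HTTP Event Collector. It is not the code from the repository: the function name build_hec_request and the example hostname are hypothetical, and it uses only the standard library rather than requests. The endpoint path, the "Splunk <token>" authorization header, and the event/source/sourcetype/index JSON keys are the standard HEC conventions.

```python
import json
import urllib.request


def build_hec_request(url, token, events, source, sourcetype, index):
    """Build (but do not send) one HEC POST carrying a batch of events."""
    # HEC accepts several JSON event objects concatenated in a single body.
    body = "".join(
        json.dumps(
            {
                "event": event,
                "source": source,
                "sourcetype": sourcetype,
                "index": index,
            }
        )
        for event in events
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": "Splunk " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical load balancer address and token, for illustration only.
request = build_hec_request(
    "https://hec.example.com:8088/services/collector/event",
    "YOUR-TOKEN",
    ["Oct 11 22:14:15 host1 su: session opened"],
    source="rsyslog:hec",
    sourcetype="syslog",
    index="main",
)
```

Passing the request to urllib.request.urlopen would actually send it; batching several events per request keeps the number of round trips to the load balancer low.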
