RSYSLOG read the docs

OK, I said posts come in threes, so here it is. We all know RSYSLOG config is much more painful than syslog-ng, but for reasons beyond all of our control it is readily available to more customers than syslog-ng is today. Thanks to Splunk users, I want to share a couple of links to better docs to make this not so awful.

  • RedHat https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/System_Administrators_Guide/s1-basic_configuration_of_rsyslog.html
  • Usenix https://www.usenix.org/system/files/login/articles/06_lang-online.pdf

thank you @mattymo and @lowell via Splunk Slack chat

Unbelievably simple (ipfix|(net|j|s)flow) collection

Do blog posts come in threes? Keep watching to find out. Yesterday I gave you the rundown on a new way to collect syslog. Today I’m going to spend some time on a simple, low-cost, and performant way to collect flow data.

  • At least two indexers with HTTP Event Collector (HEC), more = better. For this use case it is not appropriate to utilize dedicated HEC servers.
  • One HTTP load balancer; I use HAProxy. You can certainly use the same one from our rsyslog configuration.
  • Optionally, one UDP load balancer such as NGINX. I am not documenting this setup at this time.
  • One Ubuntu 16.04 VM

Basic Setup

  1. Follow the docs to set up the HTTP Event Collector on your indexers. Note that the docs do not cover clustered indexers; there you must create the configuration manually, and be sure to generate a unique GUID yourself. Clustered environments can use the sample configuration below. IMPORTANT: ensure your data indexes AND _internal are allowed for the token
  2. [http] 
    disabled=0
    port=8088
    #
    [http://streamfwd]
    disabled=0
    index=main
    token=DAA61EE1-F8B2-4DB1-9159-6D7AA5220B21
    indexes=_internal,main
  3. Follow the documentation for your load balancer of choice to create an HTTP VIP with HTTPS back-end servers. HEC listens on 8088 by default.
  4. Install the independent Stream forwarder (streamfwd) per the docs
  5. Kill streamfwd if it is running “killall -9 streamfwd”
  6. Remove the init script
    1. update-rc.d -f streamfwd remove
    2. rm /etc/init.d/streamfwd
  7. Create a new service unit file for systemd /etc/systemd/system/streamfwd.service
    [Unit]
    Description=Splunk Stream Dedicated Service
    After=syslog.target network.target
    [Service]
    Type=simple
    ExecStart=/opt/streamfwd/bin/streamfwd -D
    [Install]
    # Needed so "systemctl enable" has a target to attach the unit to
    WantedBy=multi-user.target
  8. Enable the new service “systemctl enable streamfwd”
  9. Create/update streamfwd.conf, replacing GUID, VIP, and INTERFACE
    1. [streamfwd]
      
      httpEventCollectorToken = <GUID>
      
      indexer.0.uri= <HEC VIP>
      netflowReceiver.0.ip = <INTERFACE TO BIND>
      netflowReceiver.0.port = 9995
      netflowReceiver.0.decoder = netflow
  10. Create/update inputs.conf; ensure the URL is correct for the location of your Stream app
  11. [streamfwd://streamfwd]
    
    splunk_stream_app_location = https://192.168.100.62:8000/en-us/custom/splunk_app_stream/
    
    stream_forwarder_id=infra_netflow
  12. Start the streamfwd “systemctl start streamfwd”
  13. Login to the search head where Splunk App for Stream is Installed
  14. Navigate to Splunk App for Stream –> Configuration –> Distributed Forwarder Management
  15. Click Create New Group
  16. Enter Name as “INFRA_NETFLOW”
  17. Enter a Description
  18. Click Next
  19. Enter “INFRA_NETFLOW” as the rule and click next
  20. Click Finish without selecting options
  21. Navigate to Splunk App for Stream –> Configuration –> Configure Streams
  22. Click New Stream and select netflow as the protocol (this is correct for netflow/sflow/jflow/ipfix)
  23. Enter Name as “INFRA_NETFLOW”
  24. Enter a Description and click next
  25. Select No Aggregation and click next
  26. Deselect any fields NOT interesting for your use case and click next
  27. Optional: develop filters to reduce noise from high-traffic devices and click next
  28. Select the index for this collection and click enable then click next
  29. Select only the INFRA_NETFLOW group and click Create Stream
  30. Configure your NETFLOW generator to send records to the new streamfwd

Validation: search the index selected in step 28.
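
If you have no flow exporter handy, a synthetic record can confirm the receiver is listening. The following is a minimal Python sketch, not part of Stream: it builds one NetFlow v5 datagram (a 24-byte header plus one 48-byte record) and sends it over UDP to the port set in netflowReceiver.0.port. The function names, addresses, and counter values are all made up for illustration.

```python
import socket
import struct
import time


def build_netflow_v5_packet(src_ip="10.0.0.1", dst_ip="10.0.0.2"):
    """Build one NetFlow v5 datagram carrying a single synthetic flow record."""
    now = time.time()
    header = struct.pack(
        "!HHIIIIBBH",
        5,                       # version
        1,                       # number of flow records in this datagram
        60_000,                  # sysUptime in ms (arbitrary)
        int(now),                # unix_secs
        int((now % 1) * 1e9),    # unix_nsecs
        1,                       # flow_sequence
        0, 0,                    # engine_type, engine_id
        0,                       # sampling_interval
    )
    record = struct.pack(
        "!4s4s4sHHIIIIHHBBBBHHBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        socket.inet_aton("0.0.0.0"),  # next hop
        1, 2,                    # input / output interface index
        10, 1_200,               # packets, bytes
        50_000, 59_000,          # flow start / end (sysUptime ms)
        12345, 443,              # source / destination port
        0,                       # pad1
        0x18,                    # TCP flags (PSH+ACK)
        6,                       # protocol (TCP)
        0,                       # ToS
        0, 0,                    # source / destination AS
        24, 24,                  # source / destination mask bits
        0,                       # pad2
    )
    return header + record


def send_test_flow(receiver_ip, port=9995):
    """Send the synthetic datagram to the netflowReceiver address over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_netflow_v5_packet(), (receiver_ip, port))
```

Call send_test_flow() with the address used for netflowReceiver.0.ip from another host, then search the index. Treat this as a connectivity check rather than a substitute for a real exporter.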

Building a more perfect Syslog Collection Infrastructure

A little while back I created a bit of code to help get data from Linux systems in real time where the Splunk Universal Forwarder could not be installed. At the time we had a few limitations, the biggest being that time stamps were never parsed; only the “current” time on the indexer could be used. Want to try out version 2? Let’s get started! First let me explain what we are doing.

If you manage a Splunk environment with high-rate sources such as a Palo Alto firewall or web proxy, you will notice that events are not evenly distributed over the indexers because the data is not evenly balanced across your aggregation tier. The reason boils down to “time based load balancing”: in larger environments the universal forwarder may not be able to split by time to distribute a high load. So what is an admin to do? Let’s look for a connection load balancing solution. We need to find a way to switch from “SYSLOG” to HTTP(S) so we can utilize a proper load balancer. How will we do this?

  1. Using containers, we will dedicate one or more instances of RSYSLOG to each “type” of data
  2. Use a custom plugin to package and forward batches of events over HTTP(S)
  3. Use a load balancer configured for least-connection balancing to distribute the batches of events

What you need

  • At least two indexers with HTTP Event Collector, more = better. The “benefits” of this solution require collection on the indexers; dedicated collectors will not be an adequate substitute.
  • One load balancer; I use HAProxy
  • One syslog collection server with rsyslog 8.24+; I use LXC instances hosted on Proxmox. An optimal deployment will utilize one collector per source technology, for example one instance collecting for Cisco IOS and another for Palo Alto firewalls. Using advanced configuration and filters you can combine several low-volume sources.
  • A GUID. If you need one generated there are many ways; this one is quick and easy: https://www.guidgenerator.com/online-guid-generator.aspx
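
If you would rather not paste a token out of a web page, the Python standard library can generate one locally. This is a small sketch; the function name is just for illustration, and the upper-casing only mirrors the look of the token used in the examples in this post.

```python
import uuid


def new_hec_token():
    """Return a random (version 4) GUID formatted like the tokens in inputs.conf."""
    return str(uuid.uuid4()).upper()


print(new_hec_token())
```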

Basic Setup

  1. Follow the docs to set up the HTTP Event Collector on your indexers. Note that the docs do not cover clustered indexers; there you must create the configuration manually, and be sure to generate a unique GUID yourself. Clustered environments can use the sample configuration below.
  2. Follow the documentation for your load balancer of choice to create an HTTP VIP with HTTPS back-end servers. HEC listens on 8088 by default.
  3. Grab the code and configuration examples from Bitbucket
    1. Deploy the script omsplunkhec.py to /opt/rsyslog/ and ensure the script is executable
    2. Review rsyslogd.d.conf.example and create your configuration in /etc/rsyslog.d/00-splunkhec.conf, replacing the GUID and IP with your correct values
    3. Restart rsyslog

What to expect: my hope is data balance Zen.

HTTP Event Collector inputs.conf example deployed via master-apps

[http] 
disabled=0
port=8088
#
[http://SM_rsyslog_routerboard]
disabled=0
index=main
token=DAA61EE1-F8B2-4DB1-9159-6D7AA5220B21
indexes=main,summary

Example /etc/rsyslog.d/00-splunk.conf

This example listens on 514 TCP and UDP and sends events via HTTP. Be sure to replace the GUID and IP address.

module(load="imudp")
input(type="imudp" port="514" ruleset="default_file")
module(load="imptcp")
input(type="imptcp" port="514" ruleset="default_file")
module(load="omprog")

ruleset(name="default_file"){
    $RulesetCreateMainQueue    
    action(type="omprog"
       binary="/opt/rsyslog/omsplunkhec.py DAA61EE1-F8B2-4DB1-9159-6D7AA5220B21 192.168.100.70 --sourcetype=syslog --index=main" 
       template="RSYSLOG_TraditionalFileFormat")
    stop
}
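
To make the moving parts concrete: omprog pipes each formatted event to the program’s stdin, one per line, and the program is responsible for batching and posting to HEC. The sketch below is NOT the omsplunkhec.py script from the repository; it is a stripped-down illustration using only the standard library. The function names and the fixed batch size of 100 are assumptions, and a real plugin would also flush on a timer so a quiet source does not hold events back.

```python
import json
import urllib.request


def build_hec_payload(lines, sourcetype="syslog", index="main"):
    """Concatenate one JSON event object per line, the batch format HEC accepts."""
    return "".join(
        json.dumps({"event": line, "sourcetype": sourcetype, "index": index})
        for line in lines
    )


def batches(stream, size=100):
    """Yield lists of up to `size` non-empty lines read from `stream`."""
    batch = []
    for raw in stream:
        line = raw.rstrip("\n")
        if not line:
            continue
        batch.append(line)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:
        yield batch


def post_batch(payload, token, url):
    """POST one batch to HEC; `url` is the event endpoint behind the VIP."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Authorization": "Splunk " + token},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def forward(stream, send, size=100):
    """Read events from `stream` and hand each batch payload to `send`."""
    count = 0
    for batch in batches(stream, size):
        send(build_hec_payload(batch))
        count += len(batch)
    return count
```

Wired to rsyslog this would look something like forward(sys.stdin, lambda p: post_batch(p, token, "http://192.168.100.70:8088/services/collector/event")). Because every batch is its own HTTP request, the load balancer gets a fresh balancing decision per batch instead of one per long-lived syslog connection.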

Example HAProxy Configuration 1.7 /etc/haproxy/haproxy.cfg

 

global
        log /dev/log    local0
        log /dev/log    local1 notice
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin
        stats timeout 30s
        user haproxy
        group haproxy
        daemon
        # Default SSL material locations
        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private
        # Default ciphers to use on SSL-enabled listening sockets.
        # For more information, see ciphers(1SSL).
        ssl-default-bind-ciphers kEECDH+aRSA+AES:kRSA+AES:+AES256:RC4-SHA:!kEDH:!LOW:!EXP:!MD5:!aNULL:!eNULL
defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        timeout connect 5000
        timeout client  50000
        timeout server  50000
        errorfile 400 /etc/haproxy/errors/400.http
        errorfile 403 /etc/haproxy/errors/403.http
        errorfile 408 /etc/haproxy/errors/408.http
        errorfile 500 /etc/haproxy/errors/500.http
        errorfile 502 /etc/haproxy/errors/502.http
        errorfile 503 /etc/haproxy/errors/503.http
        errorfile 504 /etc/haproxy/errors/504.http
listen  stats   
        bind            *:1936
        mode            http
        log             global
        maxconn 10
        timeout client  100s
        timeout server  100s
        timeout connect 100s
        timeout queue   100s
        stats enable
        stats hide-version
        stats refresh 30s
        stats show-node
        stats auth admin:password
        stats uri  /haproxy?stats
frontend localnodes
    bind *:8088
    mode http
    default_backend nodes
backend nodes
    mode http
    balance leastconn
    option forwardfor
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request add-header X-Forwarded-Proto https if { ssl_fc }
    option httpchk
    server idx2 192.168.100.52:8088 ssl verify none check 
    server idx1 192.168.100.51:8088 ssl verify none check 

Making Splunk Certified Apps

As a developer of “Apps” for the Splunk platform, I have been very eager to automate more tedious tasks, including build and static code analysis. Today our very awesome development community has access to a new tool, App Inspect. The new Python-based extensible framework allows your automated build process to validate key issues and prepare for formal certification of public apps on Splunk Base, or to assure quality for internally developed apps. The example process can easily be ported to the tools of your choice, allowing for effective version control and testing of applications built on the Splunk platform.

To help you get started I’ve developed an example using our partner’s tools at Atlassian.

  • Bitbucket repository containing the source
  • CMake build script for packaging and versioning
  • Bitbucket pipelines integration using docker to ensure a clean package and execute validation
  • Publish to AWS S3 as a package repository before manual publishing to Splunk Base

To get started, review https://bitbucket.org/Splunk-SecPS/seckit_sa_geolocation which is my first and most complete example.

  • CMakeLists.txt controls the build process
  • src/ contains the applications source
  • src/default/app.conf.in is the template for app.conf; our build will update this file with the correct version tag supplied by git
  • bitbucket-pipelines.yml controls the pipelines automated integration process
    • Retrieve and deploy the latest docker image with build tools and app inspect
    • Package the app
    • Push to S3
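
The app.conf.in templating step is the part people ask about most, so here is the idea in a few lines of Python. This is only an illustration of @NAME@ style substitution as done by CMake’s configure_file; the placeholder name @VERSION@ and the function name are assumptions, so check the repository for the variable names the real template uses. Unlike CMake, this sketch leaves unknown placeholders untouched rather than blanking them.

```python
import re


def render_template(text, values):
    """Replace @NAME@ placeholders, leaving unknown names untouched."""
    def substitute(match):
        return values.get(match.group(1), match.group(0))

    return re.sub(r"@([A-Za-z_][A-Za-z0-9_]*)@", substitute, text)


template = "[launcher]\nversion = @VERSION@\n"
print(render_template(template, {"VERSION": "1.2.3"}))
```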

Try it yourself!

Syncing up shclusterapps

This one is short and sweet. When building a Splunk search head cluster we often create a search head unattached to indexers to “stage” .spl deployments and configure them, THEN update shcluster/apps and push. The following rsync command does this for you and obeys the golden rule to avoid default core apps. The list is correct as of 6.4.1; update as needed for new versions, and be sure to exclude anything like an “app” containing a deployment client configuration.

rsync --verbose --progress --stats --recursive --times --perms \
--exclude alert_logevent \
--exclude launcher \
--exclude SplunkForwarder \
--exclude alert_webhook \
--exclude learned \
--exclude splunk_httpinput \
--exclude appsbrowser \
--exclude legacy \
--exclude SplunkLightForwarder \
--exclude framework \
--exclude sample_app \
--exclude splunk_management_console \
--exclude gettingstarted \
--exclude search \
--exclude '*_deploymentclient*' \
--exclude introspection_generator_addon \
--exclude splunk_archiver \
--exclude user-prefs \
/opt/splunk/etc/apps/* /opt/splunk/etc/shcluster-test/apps

Building High Performance low latency rsyslog for Splunk

This is a brief follow-up on my earlier post. In a very large scale environment, the write -> monitor -> read cycle between a log-appending source such as rsyslogd and Splunk can impact the latency of log data entry into the destination environment. Last week I stumbled onto a feature of rsyslog, developed a couple of major versions ago, that has been very underappreciated: omprog allows a developer to receive events from rsyslog using any program without first waiting for a disk write. I’ve developed a little bit of code allowing direct transfer of events to Splunk using the HTTP Event Collector; download it and try it out.

What the output module allows for is direct, scalable transfer between rsyslog and Splunk in native protocols. Ideal use cases include dynamically scaling cloud environments and embedded devices where agents are not acceptable.

Credits

  • Rsyslog dev team for making this possible and Rainer for this presentation that inspired me
  • Splunk dev team for the really awesome http event collector and George who developed the python class interface
  • Splunk Stream team who added direct event collector usage in stream 6.5 proving significant scale.

Setup

  • Set up the HTTP Event Collector behind a load balancer
  • Note your token
  • Install requests using apt, yum, or pip http://docs.python-requests.org/en/master/user/install/
  • If using certificate verification, set up what is required for requests
  • “git” the code https://bitbucket.org/rfaircloth-splunk/rsyslog-omsplunk/src
  • Place omsplunkhec.py and splunk_http_event_collector.py in a location where rsyslog can execute them
  • Set up an rsyslog ruleset with an action similar to the following
    module(load="omprog")
    action(type="omprog"
           binary="/opt/rsyslog/hecout.py --source=rsyslog:hec --sourcetype=syslog --index=main" 
           template="RSYSLOG_TraditionalFileFormat")

Building reliable rsyslogd infrastructure for Splunk

 

Overview

This post covers preparation of a base infrastructure for high-availability ingestion of syslog data, with a default virtual server and configuration for test data onboarding. Refer to technology-specific onboarding procedures for individual sources.

Requirement

Multiple critical log sources require a reliable syslog infrastructure. The following attributes must be present for the solution:

  • Enterprise-supported Linux such as RHEL, CentOS, or a recent Ubuntu LTS
  • Syslog configuration which will not impact the logging of the host on which syslog is configured
  • External load balancing utilizing DNAT. If enterprise shared-services NLB devices are not available, KEMP offers a free-to-use version of their product, limited to 20 Mbps, which is suitable for many cases

Technical Environment

The following systems will be created utilizing physical or virtual systems. System specifications will vary due to estimated load.

  • servers in n+1 configuration
    • Minimum 2 GB memory
    • Minimum 2 x 2.3 GHz cores
    • Mounts configured per enterprise standard with the following additions
      • /opt/splunk 40 GB XFS
      • /var/splunk-syslog 40 GB XFS
  • Dual interfaced load balancer configured for DNAT support.
  • Subnet with, at minimum, as many addresses as there are unique syslog sources (technologies); additional space for growth is strongly advised
  • Subnet allocated for syslog servers

Solution Prepare the rsyslogd servers

The following procedure will be utilized to prepare the rsyslogd servers

  1. Install the base operating system and harden according to enterprise standards
  2. Provision and mount the application partitions /opt/splunk and /var/splunk-syslog according to the estimates required for your environment.
    1. Note 1: a typical configuration utilizes noatime on both mounts
    2. Note 2: a typical configuration utilizes noexec on the syslog mount
  3. Create the following directories for modular configuration of rsyslogd
    mkdir -p /etc/rsyslog.d/splunk-0-rules
    mkdir -p /etc/rsyslog.d/splunk-1-inputs
  4. Create the Splunk master syslog-configuration /etc/rsyslog.d/splunk.conf
    #
    # Include all config files for splunk /etc/rsyslog.d/
    #
    
    $IncludeConfig /etc/rsyslog.d/splunk-0-rules/*.conf
    $IncludeConfig /etc/rsyslog.d/splunk-1-inputs/*.conf
  5. Create the catch-all syslog collection source /etc/rsyslog.d/splunk-1-inputs/default.conf
    # load the TCP and UDP input modules (skip if already loaded elsewhere)
    module(load="imptcp")
    module(load="imudp")
    # define syslog source
    input(type="imptcp" port="8100" ruleset="default_file")
    input(type="imudp" port="8100" ruleset="default_file")
  6. Define a rule for all incoming data on the default port /etc/rsyslog.d/splunk-0-rules/default.conf
    ruleset(name="default_file"){
        $RulesetCreateMainQueue    
        $template DynaFile,"/var/splunk-syslog/default/%HOSTNAME%.log"
        *.* -?DynaFile
        stop
    }
  7. Ensure splunk can read from the syslog folders. The paths should exist at this point due to the dedicated mount
    chown -R splunk:splunk /var/splunk-syslog
    chmod -R 0755 /var/splunk-syslog
  8. Restart rsyslogd so it reads the new configuration
    systemctl restart rsyslog
  9. Create log rotation configuration /etc/logrotate.d/splunk-syslog
    /var/splunk-syslog/*/*.log
    {
        daily
        compress
        delaycompress
        rotate 4
        ifempty
        maxage 7
        nocreate
        missingok
        sharedscripts
        postrotate
        # rsyslogd PID file: /var/run/syslogd.pid on RHEL, /var/run/rsyslogd.pid on Ubuntu
        /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
        endscript
    }
  10. Allow firewall access to the new ports (RHEL based)
    firewall-cmd --permanent --zone=public --add-port=8100/tcp 
    firewall-cmd --permanent --zone=public --add-port=8100/udp
    firewall-cmd --reload

 

Solution Prepare KEMP Loadbalancer

  • Deploy virtual load balancer to hypervisor with two virtual interfaces
    • #1 Enterprise LAN
    • #2 Private network for front end of syslog servers
  • Login to the load balancer web UI
  • Apply free or purchased license
  • Navigate to network setup
    • Set eth0 external ip
    • Set eth1 internal ip
  • Add the first virtual server (udp)
    • Navigate to Virtual Services –> Add New
    • set the virtual address
    • set port 514
    • set port name syslog-default-8100-udp
    • set protocol udp
    • Click Add this virtual service
    • Adjust virtual service settings
      • Force Layer 7
      • Transparency
      • set persistence mode source ip
      • set persistence time 6 min
      • set scheduling method least connection
      • Use Server Address for NAT
      • Click Add new real server
        • Enter IP of syslog server 1
        • Enter port 8100
  • Add the first virtual server (tcp)
    • Navigate to Virtual Services –> Add New
    • set the virtual address
    • set port 514
    • set port name syslog-default-8100-tcp
    • set protocol tcp
    • Click Add this virtual service
    • Adjust virtual service settings
      • Service type Log Insight
      • Transparency
      • set scheduling method least connection
      • TCP Connection only check port 8100
      • Click Add new real server
        • Enter IP of syslog server 1
        • Enter port 8100
  • Repeat the add virtual server process for additional resource servers

 

Update syslog server routing configuration

Update the default gateway of the syslog servers to utilize the NLB internal interface

Validation procedure

From a Linux host, use the following commands to validate that the NLB and log servers are working together:
logger -P 514 -T -n <vip_ip> "test TCP"
logger -P 514 -d -n <vip_ip> "test UDP"
Verify the messages are logged in /var/splunk-syslog/default.
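
If the test host has no recent logger binary, the same check can be done from Python. This is a minimal sketch with a made-up function name; it sends a bare line with only a PRI prefix (13 is facility user, severity notice, the same default logger uses), which is enough for the catch-all ruleset to write it to disk.

```python
import socket


def send_syslog(host, port, message, use_tcp=False, pri=13):
    """Send one minimal syslog line: <PRI> followed by the message text."""
    data = "<{}>{}\n".format(pri, message).encode("utf-8")
    if use_tcp:
        with socket.create_connection((host, port), timeout=5) as sock:
            sock.sendall(data)
    else:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(data, (host, port))
    return data
```

For example, send_syslog("<vip_ip>", 514, "test TCP", use_tcp=True) followed by send_syslog("<vip_ip>", 514, "test UDP") exercises both virtual services.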

Prepare Splunk Infrastructure for syslog

  • Follow the procedure for deployment of the Universal Forwarder with deployment client; ensure the client has valid outputs and base configuration
  • Create the indexes syslog and syslog_unclassified
  • Deploy input configuration for the default input
[monitor:///var/splunk-syslog/default/*.log]
host_regex = .*\/(.*)\.log
sourcetype = syslog
source = syslog_enterprise_default
index = syslog_unclassified
disabled = false

 

  • Validate the index contains data
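
If the host field looks wrong during validation, it helps to see what the host_regex in the input stanza captures. Splunk takes the first capture group as the host; the quick Python check below applies the same pattern to a sample path (the file name is invented for illustration).

```python
import re

# Same pattern as host_regex in the monitor stanza
HOST_REGEX = re.compile(r".*\/(.*)\.log")


def host_from_path(path):
    """Return the first capture group, or None when the path does not match."""
    match = HOST_REGEX.search(path)
    return match.group(1) if match else None


print(host_from_path("/var/splunk-syslog/default/fw01.example.com.log"))
# prints fw01.example.com
```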

 

Ghost Detector (CVE-2015-7547)


Just in case you need yet another reason to utilize passive DNS analytics, a new significant vulnerability is out for GLIBC. Have Stream? You can monitor your queries for this IOC.

https://sourceware.org/ml/libc-alpha/2016-02/msg00416.html

Update: the attack requires both A and AAAA records, so the search only shows possible attacks with both involved. This should return zero results. If results are returned there “may” be something of interest; drill into the answers involved to determine if they are malicious based on the CVE above.

index=streams sourcetype=stream:dns (query_type=A OR query_type=AAAA)
[
search index=streams sourcetype=stream:dns (query_type=A OR query_type=AAAA)
| rare limit=20 dest
| fields + dest | format
]
| stats max(bytes_in) max(bytes_out) max(bytes) values(query_type) as qt by src,dest,query
| where mvcount(qt)>=2
| sort - max*
| lookup domain_segments_lookup domain as query OUTPUT privatesuffix as domain
| lookup alexa_lookup_by_str domain OUTPUT rank
| where isnull(rank)
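
The heart of that search is the stats ... by src,dest,query followed by where mvcount(qt)>=2, which keeps only the client/server/name combinations seen with both record types. For anyone who finds it easier to read as code, here is the same grouping logic as a small Python sketch over already-parsed events. The field names come from the search; the function name and the dictionary shape of the events are assumptions.

```python
from collections import defaultdict


def queries_with_both_types(events):
    """Return (src, dest, query) keys seen with both A and AAAA query types."""
    seen = defaultdict(set)
    for event in events:
        if event["query_type"] in ("A", "AAAA"):
            key = (event["src"], event["dest"], event["query"])
            seen[key].add(event["query_type"])
    return sorted(key for key, types in seen.items() if len(types) >= 2)
```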

Don’t have stream yet? Deploy in under 20 minutes.
http://www.rfaircloth.com/2015/11/06/get-started-with-splunk-app-stream-6-4-dns/

Dealing with bad threat data

Every now and then a threat data provider will include invalid entries in their threat list, creating loads of false positives in Enterprise Security. For “reasons”, namely performance, ES will append new entries to the internal threat system but does not remove entries no longer present in a source. You can easily clear an entire threat collection, which will allow your system to reload from the current sources.

splunk stop
splunk clean inputdata threatlist
splunk clean inputdata threat_intelligence_manager
splunk start
splunk clean kvstore -app DA-ESS-ThreatIntelligence -collection <collection>

Common values for collection are http_intel and domain_intel