Keeping up with MaxMind for Splunk and Splunk Cloud

This has been a rather long time coming: today version 4.0.0 of SecKit for Geolocation with MaxMind is available for Splunk via Splunkbase and Splunk Cloud. This version includes a built-in solution for keeping the database files up to date for both free and paid subscribers. Free subscribers are of course limited to the basic information equivalent to Splunk’s built-in iplocation command. Commercial subscribers can use all of the available fields from their licensed files. Jump over to Splunkbase to check it out.

To HEC with syslog – All grown up

A few years ago, flying across the Atlantic and unable to sleep, I had an idea to integrate common syslog aggregation servers using Splunk’s new HTTP Event Collector rather than files and the tried and true Universal Forwarder. This little idea, implemented in Python, started solving real problems of throughput and latency while reducing the complexity of configuring syslog aggregation servers. I’m very pleased to say the Python script I created is now obsolete. The upstream projects of the leading syslog servers, syslog-ng and rsyslog, have implemented and maintain native modules. Even more exciting, Mark Bonsack has invested considerable time in further developing modular configuration for both, making it even easier to get data through syslog and into Splunk!
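For readers who have not worked with the HTTP Event Collector, here is a minimal Python sketch of the kind of request a syslog-to-HEC bridge makes. It is not my original script and not what the syslog-ng or rsyslog modules do internally. The host name, token, and helper name are hypothetical; the endpoint path and the "Splunk <token>" authorization header follow the standard HEC interface.

```python
import json
import urllib.request


def build_hec_request(base_url: str, token: str, message: str,
                      sourcetype: str = "syslog") -> urllib.request.Request:
    """Build (but do not send) a POST to the HEC event endpoint."""
    payload = {"event": message, "sourcetype": sourcetype}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Splunk " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical host and token, for illustration only.
req = build_hec_request(
    "https://splunk.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    "<13>Jan  1 00:00:00 host app: hello",
)
# urllib.request.urlopen(req) would send it; omitted so the sketch has no side effects.
```

A production bridge would batch events and reuse connections, which is exactly the work the native modules now take care of.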

Native syslog –> http event collector modules

Modular Configuration Repositories

Quick Public Service Notice

While all Linux distributions include a syslog server, it should NOT be used as the production syslog aggregation solution. Distro packages are often many point releases behind or, worse, selectively backport patches based only on their own customer-reported issues. Before attempting to build a syslog aggregation solution for production, it is critical that you source current upstream binaries or build your own.

Is your LDAP Slow? It might make your Splunk Slow

I’ve had this crop up enough times that I think it’s worth a short post. Most Splunk deployments use local and/or LDAP authentication. LDAP configuration is something of a black art, and often the minimal configuration that works is the first and last time it is considered. It is worth your time as an administrator to optimize your LDAP configuration or, better yet, move to the more secure and reliable SAML standard.

Things to consider

  • Stop using LDAP and use SAML
  • LDAP authentication should never be enabled on indexers. If you have enabled LDAP authentication there, remove it. Indexers rarely require authentication; when it is required, it applies only to admins and then under very strict conditions.
  • Ensure the Group BIND and Group Filters are both in use and limit the groups to only those required for Splunk access management
  • Ensure the User BIND and User Filters are appropriate to limit the potential users to only those users who may log in to Splunk.
  • Validate that the number of users returned by the LDAP query is under 1000, or increase the number of precached users via limits.conf to an appropriate number.
  • Ensure dnsmasq or an alternative DNS caching client is installed on Linux search heads and indexers.

Stages to LDAP Auth

  • Interactive User Login (ad-hoc search) or scheduled report execution
  • Check the user cache; continue only if the user is not cached or the cache entry has expired
  • DNS lookup to resolve the LDAP host to an IP (this is the reason a DNS cache on Linux is important)
  • TCP Connection to LDAP Server
  • Post query for specific user to LDAP
  • Wait for response
  • Process Referrals if applicable by repeating the above sequence

The time taken for each DNS query and LDAP query is added to the time taken to log in to the Splunk UI, execute an ad-hoc search, or run a scheduled report. It’s important to ensure the DNS and LDAP infrastructure is highly available and able to service potentially thousands of requests per second. Proper use of caching ensures resources on the Splunk server, including TCP client sessions, are not exhausted, which would cause users to wait in line for their turn at authentication or, in the worst case, cause timeouts and authorization failures.
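To see how much the first two network stages cost from a given search head, something like the following Python sketch can help. It times the DNS lookup and the TCP connect separately; it does not perform an LDAP bind or query. The function name and the default port 389 are my assumptions for illustration.

```python
import socket
import time


def time_ldap_reachability(host: str, port: int = 389, timeout: float = 5.0) -> dict:
    """Time the DNS lookup and the TCP connect to an LDAP server separately."""
    t0 = time.perf_counter()
    # DNS stage: resolve the host name to one or more socket addresses.
    addrinfo = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    t1 = time.perf_counter()
    family, socktype, proto, _canon, sockaddr = addrinfo[0]
    # TCP stage: connect to the first resolved address, then close.
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
    t2 = time.perf_counter()
    return {"dns_seconds": t1 - t0, "tcp_connect_seconds": t2 - t1}


# Example call (hypothetical host name):
# print(time_ldap_reachability("ldap.example.com"))
```

Running it a few times in a row gives a rough feel for whether a local DNS cache is doing its job: with caching in place, repeated lookups should be noticeably faster than the first.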



Redirecting _internal for a large forwarder deployment

Because there is no license charge associated with the internal logs of Splunk’s Universal Forwarder (and in some cases heavy forwarders), their volume often goes unnoticed. In very large deployments these logs can be a significant portion of the storage used per day. Do you really need to keep those events around as long as the events from your Splunk Enterprise instances? Probably not.

License Warning – Updated

It has been pointed out that this change WILL impact license usage on recent versions of Splunk. On older versions, and for customers with EAA agreements in place, this is OK. If you are on a recent version (I am not sure which version introduced this), this change will impact your license.


The following changes will disable the Splunk Monitoring Console’s built-in forwarder monitoring feature. You can customize the searches, but be aware this is not upgrade safe.

Second Warning!

If you have any custom forwarder monitoring searches/dashboards/alerts they may be impacted.

Define an index

The index we need to define is _internal_forwarder. The following sample configuration keeps about 3 days of data from our forwarders; adjust according to need.

[_internal_forwarder]
maxWarmDBCount = 200
frozenTimePeriodInSecs = 259200
quarantinePastSecs = 459200
homePath = $SPLUNK_DB/$_index_name/db
coldPath = $SPLUNK_DB/$_index_name/colddb
thawedPath = $SPLUNK_DB/$_index_name/thaweddb
maxHotSpanSecs = 43200
maxHotBuckets = 10
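
If you want a different retention, the two time-based values are just seconds. A small Python sketch of the arithmetic (the helper names are mine, not Splunk settings):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retention_seconds(days: int) -> int:
    """Value for frozenTimePeriodInSecs given a retention in days."""
    return days * SECONDS_PER_DAY


def hot_span_seconds(hours: int) -> int:
    """Value for maxHotSpanSecs given a span in hours."""
    return hours * 60 * 60


print(retention_seconds(3))   # 259200, the frozenTimePeriodInSecs used above
print(hot_span_seconds(12))   # 43200, the maxHotSpanSecs used above
```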

Change the index for internal logs

We need to create a new “TA” named “Splunk_TA_splunkforwarder”. We will CAREFULLY use the DS to push this to forwarders only. DO NOT push this to any Splunk Enterprise instance (CM/LM/MC/SH/IDX/deployer/DS), but you may push it to a “heavy” or intermediate forwarder. The app only needs two files in default: app.conf and inputs.conf.

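# app.conf
[install]
state_change_requires_restart = true
is_configured = 0
state = enabled
build = 2

[launcher]
author = Ryan Faircloth
version = 1.0.0

[ui]
is_visible = 0
label = Splunk_UF Inputs

[package]
id = Splunk_TA_splunkforwarder

# inputs.conf (stanza assumed to be the default internal log monitor)
[monitor://$SPLUNK_HOME/var/log/splunk]
index = _internal_forwarder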

Check our Work

First, let’s check the positive: make sure UFs have moved to the new index. We should get results.

index=_internal_forwarder source=*splunkforwarder*

Second, let’s check the negative: make sure only UF logs got moved. We should get no results.

index=_internal_forwarder source=*splunk* NOT source=*splunkforwarder*


  • Index definition example used “_internal” rather than “_internal_uf”
  • renamed app to “Splunk_TA_splunkforwarder”
  • renamed index to _internal_forwarder

Windows TA 6.0 is out!

Splunk released a major update to the Splunk TA for Windows last month. You may not have noticed, but I think you should take a closer look. A few key things:

  • Simplified deployment for new customers: Splunk merged the TA for Microsoft DNS and the TA for Microsoft AD into the Windows TA
  • The improved support for “XML” format Windows events from 5.0.1 is now the default in 6.0.0; there is an upgrade action to accept this switch. XML events allow extraction of additional valuable data, such as the restart reason from event ID 1074
  • Improved CIM compliance for Security events from modern logging channels like Remote Desktop Session
  • Improved extensibility: it’s now much easier to add support for third-party logging via the Windows Event Log
  • Improved support for Windows Event Forwarding. Note that I still strongly discourage this solution for performance, reliability, and audit reasons.

If you are a SecKit for Windows user it is safe to upgrade; just follow Splunk’s upgrade instructions. If you need guidance on good practices for onboarding Windows data to Splunk, be sure to check out SecKit.

But Change!

While this is not a replacement for the upgrade notes, you are probably wondering how this will impact your users.

  • sourcetype changes: To prepare for the upgrade, review use of sourcetype=wineventlog:* and replace it with an appropriate eventtype OR source=. With this TA version we use the source to differentiate between the specific event logs. The sourcetype, which represents the format of the log, becomes a constant regardless of log type. This reduces the memory used at index and search time.
  • License impact: XML is bigger, yes, but classic has white space and that’s not free either, and all that static text is gone. In my travels I have not seen much impact, if any, on license; it seems to be a wash.
  • XML logs are ugly: You are not wrong there. What can I say, it’s Windows.
  • XML parsing is slower: Yes and no. Overall, the switch from Classic to XML is not much slower. The TA uses regex parsing, not “XML” parsing; while you see XML on screen, Splunk treats it like normal text. The changes implemented in the prior release (5.0.1) made improvements compared to 4.8.4; if your prior experience relates to that version, it’s worth a second look.

Five things you can do now to get ready for Splunk Smart Store

Splunk’s SmartStore technology is a game-changing advancement in data retention for Splunk Enterprise, allowing Splunk to move the least used data to AWS S3 for low-cost “colder” storage.

Reduce the maximum size of a bucket

We will review indexes.conf on the indexer and identify any references to the setting maxDataSize. Common historical practice has been to increase this setting from the default of auto to an arbitrarily large value or to auto_high_volume. SmartStore is optimized for, and enforces the use of, “auto” (750 MB) as the max bucket size. This task should be completed at least 7 days prior to cutover to SmartStore.
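
With many indexes this review is easier with a little tooling. The following Python sketch lists the stanzas whose maxDataSize is set to anything other than auto. It assumes a simple indexes.conf that Python's configparser can read (every setting sits under a stanza header); btool output or a real Splunk conf parser is a better source of truth for layered configuration. The function name and sample stanzas are hypothetical.

```python
import configparser


def oversized_bucket_settings(indexes_conf_text: str) -> dict:
    """Return {index_name: maxDataSize} for stanzas not using 'auto'."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep Splunk's case-sensitive setting names
    parser.read_string(indexes_conf_text)
    findings = {}
    for stanza in parser.sections():
        # A missing setting means the default applies, which is auto.
        value = parser.get(stanza, "maxDataSize", fallback="auto").strip()
        if value != "auto":
            findings[stanza] = value
    return findings


# Hypothetical indexes.conf content, for illustration only.
SAMPLE = """
[main]
maxDataSize = auto_high_volume

[oswinsec]
maxDataSize = auto

[netfw]
homePath = $SPLUNK_DB/netfw/db
"""

print(oversized_bucket_settings(SAMPLE))
```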

Reduce the maximum span of a bucket

We will review indexes.conf and identify all indexes which continuously stream data. Common historical practice is to leave maxHotSpanSecs at its default value, which is very wide; this increases the likelihood a user will retrieve buckets from S3 that do not actually meet their needs. We will determine a value of maxHotSpanSecs that allows SmartStore to evict from cache the buckets not in use while keeping available the buckets likely to be used. Often 1 day (86400s) is appropriate.

  • What is the time window a typical search will use for this index relative to now, e.g. 15 min, 1 day, 3 days, 1 week?
  • What span of time would allow a set of buckets to contain the events for the user search without excessive “extra” events? For example, if the span is 90 days and the users primarily work with only 1 day’s worth of events, then 89 days of events will use cache space wastefully.
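
The second question above is simple arithmetic. A tiny Python sketch makes the 90-day example concrete (the function name is mine, and this is a rough way to think about it rather than how SmartStore accounts for cache):

```python
def unused_days_per_bucket(bucket_span_days: float, typical_search_days: float) -> float:
    """Days of events pulled into cache that a typical search never looks at."""
    return max(bucket_span_days - typical_search_days, 0)


print(unused_days_per_bucket(90, 1))   # 89 days of events cached for nothing
print(unused_days_per_bucket(1, 1))    # 0 when the span matches typical use
print(1 * 24 * 60 * 60)                # 86400, maxHotSpanSecs for a 1 day span
```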

Review Getting Data In problems impacting bucket use

Certain oversights in onboarding data into Splunk impact both the usability of the data and performance. Review and resolve any issues identified by the Data Quality page of the Splunk Monitoring Console. The most important indicators of concern are:

  • time stamp extraction
  • time zone detection
  • indexing latency (_indextime - _time)
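
The latency indicator is just a subtraction of two epoch timestamps. A minimal Python sketch, where the 300 second threshold is an arbitrary assumption and not a Splunk default:

```python
def indexing_latency(index_time: float, event_time: float) -> float:
    """Seconds between the event timestamp (_time) and when it was indexed (_indextime)."""
    return index_time - event_time


def is_late(index_time: float, event_time: float, threshold_seconds: float = 300) -> bool:
    """True when an event arrived more than threshold_seconds after it occurred."""
    return indexing_latency(index_time, event_time) > threshold_seconds


# A negative latency means the event claims to be from the future, which
# usually points back at time stamp extraction or time zone detection.
print(indexing_latency(1453290721, 1453290121))  # 600
```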

One common source of “latency” is events from offline endpoints such as Windows laptops. Any endpoint that can spool locally for an undetermined period of time and then forward old events should be routed to an index not used for normal streaming events. For example, “oswinsec” is the normal index I use for Windows Security Events; however, for endpoint monitoring I use “oswinsecep”.

Review bucket roll behavior

After the above activities are done, wait an hour before beginning this work. We should identify premature bucket roll behavior, that is, buckets rolled from hot to warm regularly for poor reasons. Use the following search:

index=_internal source="/opt/splunk/var/log/splunk/splunkd.log"
component=HotDBManager evicting_count="*" 
| stats max(maxHotBuckets) values(count) as reason count by idx
| sort -count

This search identifies indexes which are “high volume” and rolling buckets due to the lack of an available bucket in which to index a new event in correct relative order. For each index where maxHotBuckets is less than 10, increase the value of maxHotBuckets in indexes.conf to no more than 10. For these indexes 10 is a safe value.

Building a Splunk CIM compatible source addon

This walkthrough will build a Splunk CIM compatible source add-on extending the CEF source type from my CEF framework TA. This is part three in a three-part series.

Before you start, note that I will have to gloss over many topics. You should:

  • Read the prior two articles in this series.
  • Be comfortable with the ways Splunk can be used to parse and enrich data, notably TRANSFORMS, REPORT, EXTRACT, EVAL, and LOOKUP. A great cheat sheet is available from Aplura
  • Be familiar with the web data model.
  • Be familiar with the data dictionary for our sample data

In the prior two articles we created a project-style development environment for our add-on and a minimally viable set of field parsing, but we have not yet considered any specific model. Reviewing our samples and vendor documentation, we learn the data is most similar to a web access log, which is covered by the “web” model in the Splunk Common Information Model. In our case we have only two sample events available, plus vendor documentation that describes the events. When replicating this process for a new data source, all unique events should be considered.

  • All events are a web access event
  • A subset of these events are “attack” events which do not have a web model to compare to.

Considering this, we will use the “web” model as our basis and add a number of convenience elements for our users. The following table illustrates how we will map our data.


The mapping is explained in the following table; our implementation of the mapping can be viewed in bitbucket. Review default/props.conf, default/transforms.conf, and the lookups present in lookups/.

CEF Field | Splunk Field | Notes
sip | dest_ip | Not a CIM field in the web model; used by convention. Not present in the sample, so not validated.
spt | dest_port | Not a CIM field in the web model; used by convention. Not present in the sample, so not validated.
qstr | uri_query | The formatting of this field contains escaped equal (=) signs and omits the leading question mark (?); a complex eval is used to adjust it.
cs9 | Rule_Name AND signature | Not a CIM field; however, Rule_Name is similar to an attack signature. An eval is used to split by comma and remove empties.
Attack Severity | CEF_severity | This field requires a lookup to set severity as one of low, medium, high, critical; created “imperva_incapsula_severity.csv”.
act | action | This field requires a lookup to translate act to action, which can be allowed or blocked.
(none) | vendor_product | Constant string “Imperva Incapsula”.
app | vendor_app | Saved for user search; not used in the CIM model.
(none) | app | Constant string “incapsula”.
act | cached | Some values of act can indicate cached, which is set to true using the actions lookup above.
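
To make the qstr and cs9 rows concrete, here is the same logic written out in Python. This only illustrates the transformation the table describes; the real implementation is a Splunk eval in props.conf in the repository, and the function names here are hypothetical.

```python
def qstr_to_uri_query(qstr: str) -> str:
    """Unescape '\\=' and restore the leading '?' that the CEF field omits."""
    unescaped = qstr.replace("\\=", "=")
    return "?" + unescaped if unescaped else ""


def split_rule_names(cs9: str) -> list:
    """Split the comma-separated rule names and drop empty entries."""
    return [name.strip() for name in cs9.split(",") if name.strip()]


# Values taken from the attack sample event.
print(qstr_to_uri_query(r"p\=%2fetc%2fpasswd"))
print(split_rule_names("Block Malicious User,High Risk Resources,"))
```

The trailing comma in the cs9 sample is why the empties have to be removed: a naive split would leave an empty rule name at the end.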

Creating Eventtypes

Fields alone are not enough to include an event in a data model. In fact, incorrect configuration of eventtypes and tags can include data which is invalid for a model, compromising the usefulness of the model.

We will create two eventtypes for this data; our implementation can be viewed in eventtypes.conf using the bitbucket link above:

imperva_incapsula_web | All events matching our source and sourcetype
imperva_incapsula_web_attack | All imperva_incapsula_web events with a signature field

Creating Tags

The final step to include events in a data model is to tag the events. Additional tags can be created; in this case “attack” makes sense for the subset of events that indicate a detection by the Incapsula WAF service. Tags which are not used by the data model are not included by default and are only available to users in search activities.


Testing our work

Using “make package_test” ensure no unexpected errors or warnings are produced.

Using our development environment and EventGen via “make docker_dev” we can interactively validate our mapping is CIM compliant.

Building a CEF source add on for Splunk Enterprise

In my prior post I walked you through setting up a development environment for Splunk Enterprise to allow for an IDE/RAD development experience. In this article we are going to walk through creating an add-on for Imperva’s Incapsula service; the app name will be “ta-cef-imperva-incapsula”. This is a very basic add-on; I’ll write another post focusing on data onboarding and the details that are important. This walkthrough focuses on the fully integrated use of the tools in your development activities.

  1. Create a new project locally
  2. Develop the add-on including an event gen
  3. Build, package and manually test
  4. Create a bitbucket project
  5. Build and package with pipelines for CI/CD
  6. Publish your docs on “read the docs”
  7. Publish your app on Splunkbase

In this walkthrough we will not have time to cover CIM mapping for this event source; stay tuned for a follow-up.

Creating a new project

  • create a new directory “mkdir ta-cef-imperva-incapsula”
  • cd to the directory “cd ta-cef-imperva-incapsula”
  • Initialize git with the flow module “git flow init -d”
  • Add the build tools submodule this does the heavy lifting of make for us “git submodule add -b master”
  • Make a folder for dependencies “mkdir deps”
    • Add eventgen “git submodule add -b master deps/SA-Eventgen”
    • Add our parent TA “git submodule add -b master deps/TA-cef-for-splunk”
  • Copy the make file “cp buildtools/bootstrap/Makefile .”
  • Copy the gitignore file “cp buildtools/bootstrap/.gitignore .gitignore”
  • Copy the sample docs “cp -R buildtools/bootstrap/docs .”
  • Copy the make config file “cp buildtools/bootstrap/ .”
  • Update the file: “MAIN_APP” must be updated at this point; the other configuration can be updated later but must be reviewed and edited before release. This value should be the same as the folder name the app will be published in and conform to the guidance in app.conf.spec. For this walkthrough we will use MAIN_APP=TA-cef-imperva-incapsula-for-splunk
  • create the folder src/$MAIN_APP as updated above ie. “mkdir -p src/TA-cef-imperva-incapsula-for-splunk”
  • copy the add on template to our working directory “cp -R buildtools/bootstrap/addon/* src/TA-cef-imperva-incapsula-for-splunk/”
  • copy the Splunk License “cp buildtools/bootstrap/license-eula.txt .”
  • copy the pipelines configuration “cp buildtools/bootstrap/bitbucket-pipelines.yml .”
  • Check our work with “make package_test”. We expect one failure reported, “Major.Minor.Revision”. This is normal: development builds produce a SEMVER-compliant version number pattern that is allowed by app.conf.spec but not by Splunkbase
  • browse to “out/package/splunkbase” and verify the app is packaged
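
The expected “Major.Minor.Revision” failure comes down to a pattern check on the version string. A rough Python sketch of that idea follows; the exact rule appinspect applies is an assumption on my part, and the development version string is a made-up example of a SEMVER pre-release.

```python
import re

# Assumed release pattern: exactly three dot-separated numbers.
RELEASE_VERSION = re.compile(r"^\d+\.\d+\.\d+$")


def is_release_version(version: str) -> bool:
    """True for plain Major.Minor.Revision versions such as 1.0.0."""
    return RELEASE_VERSION.match(version) is not None


print(is_release_version("1.0.0"))            # True
print(is_release_version("0.1.0-develop.3"))  # False, hypothetical dev build version
```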

Developing our Add-on

Word of Warning

This add-on takes advantage of existing sourcetype definitions in TA-cef-for-splunk; the “big 8” props are addressed in the parent sourcetype. If you are following this to build a totally new add-on, the best practices for your specific sourcetype should be considered.

Creating Samples and eventgen.conf

We are using the samples provided by Imperva here.

  • rename the file in src/TA-cef-imperva-incapsula-for-splunk/samples from future.sample to imperva_incapsula.sample
  • Replace the sample file contents with the two CEF formatted events below
CEF:0|Incapsula|SIEMintegration|1|1|Illegal Resource Access|3| fileid=3412341160002518171 siteid=1509732 suid=50005477 requestClientApplication=Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0 deviceFacility=mia cs2=true cs2Label=Javascript Support cs3=true cs3Label=CO Support src= caIP= ccode=IL cn1=200 in=54 xff= cs1=NOT_SUPPORTED cs1Label=Cap Support cs4=c2e72124-0e8a-4dd8-b13b-3da246af3ab2 cs4Label=VID cs5=de3c633ac428e0678f3aac20cf7f239431e54cbb8a17e8302f53653923305e1835a9cd871db32aa4fc7b8a9463366cc4 cs5Label=clappsigdproc=Browser cs6=Firefox cs6Label=clapp ccode=IL cicode=Rehovot cs7=31.8969 cs7Label=latitude cs8=34.8186 cs8Label=longitude Customer=CEFcustomer123 ver=TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 start=1453290121336 requestmethod=GET qstr=p\=%2fetc%2fpasswd app=HTTP act=REQ_CHALLENGE_CAPTCHA deviceExternalID=33411452762204224 cpt=443 filetype=30037,1001, filepermission=2,1, cs9=Block Malicious User,High Risk Resources, cs9Label=Rule name
CEF:0|Incapsula|SIEMintegration|1|1|Normal|0| siteid=1509732 suid=50005477 requestClientApplication=Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0 deviceFacility=mia src= caIP= ccode=IL cicode=Rehovot cs7=31.8969 cs7Label=latitude cs8=34.8186 cs8Label=longitude Customer=CEFcustomer123 ver=TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 start=1453290121336 requestmethod=GET cn1=200 app=HTTP deviceExternalID=33411452762204224 in=54 xff= cpt=443
  • Create a new file src/TA-cef-imperva-incapsula-for-splunk/default/eventgen.conf. This file tells Eventgen how to replay the samples created above; more advanced configuration is possible but out of scope for this tutorial.
[imperva_incapsula.sample]
#mode = replay
timeMultiple = 2
backfill = -15m

token.0.token = \d{13}
token.0.replacementType = timestamp
token.0.replacement = %s
  • Create a new file src/TA-cef-imperva-incapsula-for-splunk/default/props.conf with the following content. We will do more work on this later; right now all we need to do is set up index-time transforms
TRANSFORMS-zzTACEFimpervaincapsula = ta_cef_imperva_incapsula_for_splunk_v0_source

TRANSFORMS-zzTACEFimpervaincapsula = ta_cef_imperva_incapsula_for_splunk_v0_source
  • Create a new file src/TA-cef-imperva-incapsula-for-splunk/default/transforms.conf with the following content. We will do more work on this later; right now all we need to do is set up index-time transforms
[ta_cef_imperva_incapsula_for_splunk_v0_source]
REGEX = CEF:\d+\|Incapsula\|SIEMintegration\|[^\|]*\|[^\|]*\|[^\|]*\|[^\|]*\|
FORMAT = source::Imperva:Incapsula
  • Check our work so far. Use “make docker_dev” to start a Splunk instance and enable Eventgen. Using search, verify records match the search “source="Imperva:Incapsula" sourcetype=cef”
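
Before starting the container it can be handy to sanity check the two regular expressions from the steps above against a sample line. This Python sketch uses a shortened copy of the attack sample; Python's regex engine is close to, but not the same as, the ones Splunk and Eventgen use, so treat it as a quick smoke test only.

```python
import re

# REGEX from transforms.conf and the token from eventgen.conf, copied as-is.
CEF_HEADER = re.compile(r"CEF:\d+\|Incapsula\|SIEMintegration\|[^\|]*\|[^\|]*\|[^\|]*\|[^\|]*\|")
TIMESTAMP_TOKEN = re.compile(r"\d{13}")

# Shortened version of the attack sample event.
sample = (
    "CEF:0|Incapsula|SIEMintegration|1|1|Illegal Resource Access|3| "
    "fileid=3412341160002518171 start=1453290121336 "
    "deviceExternalID=33411452762204224"
)

header_matches = CEF_HEADER.match(sample) is not None
token_hits = TIMESTAMP_TOKEN.findall(sample)

print(header_matches)  # the source override should apply to this event
print(token_hits)      # every 13-digit run the token regex finds
```

In this check the token pattern also matches the first 13 digits of fileid and deviceExternalID, not just start. If that matters for your generated data, a more specific token such as one anchored on “start=” may be worth considering.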

Creating the first release

We will use git flow to tag and create the first release

  • add working files to git “git add .”
  • add a comment and check in “git commit -m 'Initial work'”
  • Start the release process “git flow release start 0.1.0”
  • edit the version in src/TA-cef-imperva-incapsula-for-splunk/default/app.conf to “0.1.0”
  • git add src/TA-cef-imperva-incapsula-for-splunk/default/app.conf
  • git commit -m 'bump version'
  • git flow release finish '0.1.0' (note: each comment screen must contain some form of comment; “Create release” will do for now)
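
Editing the version by hand is fine for a first release, but it is easy to script. A minimal Python sketch, assuming app.conf contains exactly one “version = …” line; the function name is hypothetical and this is not part of the build tools.

```python
import re


def set_app_version(app_conf_text: str, new_version: str) -> str:
    """Return app.conf text with its single 'version = ...' line updated."""
    updated, count = re.subn(
        r"^(version\s*=\s*).*$",
        lambda m: m.group(1) + new_version,
        app_conf_text,
        flags=re.MULTILINE,
    )
    if count != 1:
        # Refuse to guess when the line is missing or appears more than once.
        raise ValueError(f"expected exactly one version line, found {count}")
    return updated


print(set_app_version("[launcher]\nauthor = Ryan Faircloth\nversion = 1.0.0\n", "0.1.0"))
```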

Add our package to bitbucket

I use bitbucket, but another VCS such as GitHub will do; CI/CD processes will be different and require your own creativity for integration.

  • Using your organization’s account, create a new repository named “ta-cef-imperva-incapsula”. I enable the issue tracker and use a public repository
  • Follow instructions to “Get your local Git repository on Bitbucket”
  • Push your other tags “git push --all --follow-tags”
  • Navigate to bitbucket settings
  • Select Branching Model
    • Select “develop” for development branch
    • Select “master” for main branch
    • Check each of the boxes and click save (keep defaults)
  • Navigate to pipeline/settings
    • Use the toggle to enable
    • Click Configure which should show “Hooray”

Edit our docs

The documentation uses reStructuredText in a similar way to the Python documentation project. Review and update docs/index.rst; view our copy on bitbucket for an up-to-date example.

Publish our docs

Log in to Read the Docs and, following the provided instructions, connect your bitbucket account and publish the docs.

Continue development

Continue development to completion. A future article may elaborate on how to optimize this source for CIM and Enterprise Security.

Publish our 1.0.0 version to our VCS

Once development and testing are complete we are ready to publish 1.0.0.

  • Ensure no working files are dirty “git status”
  • git flow release start 1.0.0
  • edit the version in src/TA-cef-imperva-incapsula-for-splunk/default/app.conf to “1.0.0”
  • git add src/TA-cef-imperva-incapsula-for-splunk/default/app.conf
  • git commit -m 'bump version'
  • git flow release finish '1.0.0' (note: each comment screen must contain some form of comment; “Create release” will do for now)
  • publish the release to bitbucket “git push --all --follow-tags”
  • navigate to bitbucket and select “pipelines” in navigation
  • Remember: “develop” builds will fail due to a fatal error reported by appinspect. This is presently normal
  • Wait for the master build to complete (success)
  • Navigate to downloads and find the “1.0.0” release package and download.

Publish our release to Splunkbase

With each “release” we can download an “app inspected” package ready for Splunkbase. Follow the on-page instructions to publish your app at