We have been dealing with individual parts of data ingestion in Splunk. Now it’s time to put everything we have discussed so far into a helicopter view.
Inputs.conf, indexes.conf and outputs.conf
There are three files on this diagram we have not spoken about yet.
inputs.conf: the file where you define the files, scripts and ports that ingest the data. On forwarders this will mostly be file and script inputs; on an indexer this will contain the splunktcp ports, typically 9997 (unencrypted) and 9998 (encrypted).
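As an illustration, here is a minimal sketch of what such inputs.conf stanzas might look like; the paths, index name and port numbers are examples only, not prescriptions:

```
# On a Universal Forwarder: monitor a log file
[monitor:///var/log/secure]
index = os_linux
sourcetype = linux_secure

# On an indexer: listen for unencrypted forwarder traffic
[splunktcp://9997]
disabled = 0

# On an indexer: SSL listener (also needs certificate settings
# in an [SSL] stanza, omitted here)
[splunktcp-ssl:9998]
disabled = 0
```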
outputs.conf: the file where your outputs, the hosts you will send your data to, are defined. On forwarders this points at the indexers’ splunktcp ports (9997 or 9998). Indexers do not have outputs configured, unless your data needs to be forwarded on to another SIEM. The output configuration will be a little different in an indexer cluster configuration, but let’s not get ahead of ourselves.
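A minimal sketch of a forwarder’s outputs.conf, assuming two example indexers listening on port 9997 (host names and group name are illustrative):

```
# On a forwarder: send data to the indexing tier
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = idx1.example.local:9997, idx2.example.local:9997
```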
indexes.conf: the file where all index parameters are defined. Keep in mind that all data sent to Splunk needs an index; this is the structure in which data is stored on the Splunk platform.
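A minimal sketch of an indexes.conf stanza for a simple event index; the index name, paths and retention values are examples only:

```
[os_linux]
homePath   = $SPLUNK_DB/os_linux/db
coldPath   = $SPLUNK_DB/os_linux/colddb
thawedPath = $SPLUNK_DB/os_linux/thaweddb
# roughly 90 days before data is frozen
frozenTimePeriodInSecs = 7776000
maxTotalDataSizeMB = 512000
```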
The data ingestion process
- The UF and the heavy forwarders collect the data via configurations in the inputs.conf file. The heavy forwarder also collects data from (cloud-based) API-based technologies.
- The data is sent to a full Splunk instance for parsing, where the magic 6 props.conf settings are applied (see the sketch after this list). NOTE: a heavy forwarder is a full Splunk instance, so data parsing will happen there according to its props.conf configuration.
Once the data has gone through this step there is no way to change the timestamp extraction or the event breaking. The only solution is to delete the data and ingest it again with the correct settings.
- The last ‘step’, if you can call it that, is that your data can be searched using SPL queries on the Search Head.
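For reference, a minimal sketch of the magic 6 on the parsing tier (indexer or heavy forwarder), assuming a syslog-style sourcetype; the stanza name and values are illustrative, not a recipe:

```
# props.conf on an indexer or heavy forwarder
[my_syslog]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TRUNCATE = 10000
TIME_PREFIX = ^
TIME_FORMAT = %b %d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 15
```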
On this diagram you see that a props.conf can also reside on a UF. The only settings that will be applied on a UF are the following:
- INDEXED_EXTRACTIONS: if you are ingesting properly structured data, like CSV, TSV or JSON, field extractions already happen at the UF level and are written to disk. Note that this has an impact on the storage needed to index the data.
- EVENT_BREAKER_ENABLE: instructs the UF to already break the data it reads into events. The next setting has to be configured as well for this event breaking to work.
- EVENT_BREAKER: defines, via a regex, how the UF should break the data into events. In 99% of cases this regex is the same as you would use for the LINE_BREAKER setting on an indexer/heavy forwarder (see the sketch after this list).
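A minimal sketch of a props.conf on a Universal Forwarder covering both cases; the sourcetype names are examples only:

```
# props.conf on a Universal Forwarder

# Structured data: extract fields at the UF level
[my_csv_data]
INDEXED_EXTRACTIONS = csv

# Unstructured data: break events at the UF before forwarding
[my_syslog]
EVENT_BREAKER_ENABLE = true
EVENT_BREAKER = ([\r\n]+)
```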
Please note the diagram depicts a very simple, straightforward distributed setup. This is simplified for the sake of education; live production environments will contain more nodes and a more complex setup.