Currently, the well-documented and supported approach for Elastic integrations is to use separate TCP input ports for each dataset. However, there's a lack of guidance on handling multiple datasets from a single TCP input within an integration.
While developing integrations, I've encountered the need to split incoming logs from one TCP port into different datasets, but I haven't found documentation or best practices on how to do this.
I see two potential approaches:
Single dataset filter:
Create one dataset with a TCP input that ingests all traffic.
Use agent processors to adjust the data_stream.dataset field based on the message content.
Create additional datasets with only assets (no working input) to receive and process events.
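For this first approach, here is a rough sketch of what the agent-side configuration could look like (standalone-agent style; the port, dataset names, and match strings are made up for illustration):

```yaml
inputs:
  # Hypothetical catch-all TCP input; all names and values are illustrative.
  - type: tcp
    host: "0.0.0.0:9025"
    data_stream:
      dataset: myintegration.generic
    processors:
      # Reroute events containing "ALPHA:" to the alpha dataset.
      - if:
          contains:
            message: "ALPHA:"
        then:
          - add_fields:
              target: data_stream
              fields:
                dataset: myintegration.alpha
      # Reroute events containing "BETA:" to the beta dataset.
      - if:
          contains:
            message: "BETA:"
        then:
          - add_fields:
              target: data_stream
              fields:
                dataset: myintegration.beta
```

Anything that matches neither condition stays in the generic dataset; the alpha and beta datasets would only ship assets, not inputs.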
Multiple datasets on the same port:
Create multiple datasets, each with its own assets (index templates, ingest pipelines, data streams).
Have all dataset inputs listen on the same port.
Use per-dataset processors to filter on the message field and drop events that are not valid for that dataset.
Add the data_stream.dataset key and value to the input config for each dataset.
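For this second approach, a minimal sketch (same illustrative names as above; whether multiple TCP inputs can actually share one port in practice is part of the open question here):

```yaml
inputs:
  # Hypothetical: two streams bound to the same port, one per dataset.
  - type: tcp
    host: "0.0.0.0:9025"
    data_stream:
      dataset: myintegration.alpha
    processors:
      # Keep only alpha traffic; drop everything else.
      - drop_event:
          when:
            not:
              contains:
                message: "ALPHA:"
  - type: tcp
    host: "0.0.0.0:9025"
    data_stream:
      dataset: myintegration.beta
    processors:
      # Keep only beta traffic; drop everything else.
      - drop_event:
          when:
            not:
              contains:
                message: "BETA:"
```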
At the moment, some integrations solve this problem differently: the f5_bigip integration, for example, uses one dataset that redirects events into different ingest pipelines. This creates a single data stream with many fields, which may hurt performance.
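For reference, that redirection pattern boils down to a default ingest pipeline that hands events off to sub-pipelines, roughly like this (pipeline names and conditions are illustrative, not copied from the actual f5_bigip integration):

```yaml
description: Route events within one data stream to message-specific pipelines.
processors:
  # Hypothetical sub-pipeline names; integration packages resolve them with
  # the IngestPipeline template helper at build time.
  - pipeline:
      name: '{{ IngestPipeline "alpha" }}'
      if: ctx.message != null && ctx.message.contains('ALPHA:')
  - pipeline:
      name: '{{ IngestPipeline "beta" }}'
      if: ctx.message != null && ctx.message.contains('BETA:')
```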
So my questions on this are:
What is the recommended best practice for handling multiple datasets from a single TCP input?
Where is the best place to document this?
Edit: Made the examples clearer