slinkwatch's Introduction

slinkwatch

slinkwatch is the Suricata Link Watcher, a tool that dynamically maintains the interface entries in Suricata's configuration file depending on which network interfaces are connected. It is meant to ease the deployment of identical sensor installations at many heterogeneous sites, so that each sensor can make full use of its resources even as the monitoring volume varies.

Interaction with Suricata

To propagate a changed interface configuration to Suricata, configure Suricata so that the section of the configuration YAML generated from the slinkwatch template is included from a separate file, e.g. via

...

include: interfaces.yaml

...

in, for example, /etc/suricata/suricata.yaml. Then specify /etc/suricata/interfaces.yaml as the --target-file in the slinkwatch call. Note that this file must be writable by the slinkwatch process!

Whenever a link status change occurs, slinkwatch rewrites this file and then attempts to restart Suricata. On systems that run systemd, slinkwatch always tries to restart the service given by the --service-name option (default is suricata.service). On non-systemd systems, it runs a simple command to restart Suricata (default is /etc/init.d/suricata restart). Whether systemd is available is checked at runtime.
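The choice between the two restart paths can be illustrated with a minimal sketch. It is not slinkwatch's implementation: slinkwatch talks to systemd over D-Bus, whereas this sketch substitutes a systemctl call, and the function name restart_argv is hypothetical. The runtime check for the /run/systemd/system directory follows the convention documented for sd_booted(3); whether slinkwatch uses the same check is an assumption.

```python
import os
import shlex


def restart_argv(service_name="suricata.service",
                 restart_command="/etc/init.d/suricata restart",
                 systemd_available=None):
    """Return the argument vector that would restart Suricata.

    Hypothetical helper: systemctl stands in for slinkwatch's D-Bus call.
    """
    if systemd_available is None:
        # sd_booted(3) convention: this directory exists only when systemd is PID 1.
        systemd_available = os.path.isdir("/run/systemd/system")
    if systemd_available:
        return ["systemctl", "restart", service_name]
    # Non-systemd fallback: split the configured command line into arguments.
    return shlex.split(restart_command)


print(restart_argv(systemd_available=True))
print(restart_argv(systemd_available=False))
```

The resulting list could be passed to subprocess.run; building it separately keeps the decision easy to inspect without actually restarting anything.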

Resource assignment

Support for interfaces going online and offline at runtime raises the question of how to assign computing resources (i.e. detection threads) to the individual interfaces. Unless all interfaces on a sensor are identical in terms of supported bandwidth and traffic, it is not sufficient to simply assign an equal number of threads to all interfaces that are connected at a given point in time. For example, one might have a 10Gbit interface eth1 and a 1Gbit interface eth2 on such a machine, and one would almost certainly want more threads assigned to the former than to the latter. Even more importantly, how should an existing assignment be changed to remain efficient when another 10Gbit interface, say eth3, gets a link?

We address this issue using thread weights. That is, each interface i is assigned an integer value w_i that denotes its importance for resource allocation. For example, in the case above we could assign w_eth1 = w_eth3 = 10 and w_eth2 = 2. We also define the active set A as the set of all interfaces that have an active link. On a machine with n threads available for detection, we can then assign to each interface i in A the value

t_i=\lceil n \frac{w_i}{\sum_{j \in A}w_j} \rceil

as the number of threads allocated to detection for traffic on interface i. For example, for n = 40 we would then set

t_{\textup{eth1}} = t_{\textup{eth3}} = \lceil 40 \frac{10}{22} \rceil = 19

and

t_{\textup{eth2}} = \lceil 40 \frac{2}{22} \rceil = 4

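Because each value is rounded up, the allocations can sum to slightly more than n (42 threads for n = 40 in this example). This slight overcommitment of CPU hyperthreads is intentional: we aim to avoid idling CPUs as much as possible.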
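The formula can be checked with a short sketch. The function name allocate_threads and its parameters are illustrative and not taken from slinkwatch's source; the sketch only reproduces the arithmetic described in this section, using integer ceiling division to avoid floating-point rounding.

```python
def allocate_threads(weights, active, n_threads):
    """Return {interface: threads} for the interfaces in the active set.

    weights   -- mapping of interface name to integer thread weight w_i
    active    -- interface names that currently have a link (the set A)
    n_threads -- number of detection threads n available on the machine
    """
    active_weights = {i: weights[i] for i in active if i in weights}
    total = sum(active_weights.values())
    if total <= 0:
        return {}
    # Integer ceiling division: ceil(a / b) == -(-a // b) for b > 0.
    return {i: -(-n_threads * w // total) for i, w in active_weights.items()}


weights = {"eth1": 10, "eth2": 2, "eth3": 10}
# All three links up: eth1 and eth3 get 19 threads each, eth2 gets 4.
print(allocate_threads(weights, ["eth1", "eth2", "eth3"], 40))
# eth3 loses its link: the same 40 threads are redistributed over eth1 and eth2.
print(allocate_threads(weights, ["eth1", "eth2"], 40))
```

When eth3 drops out of the active set, the sum of weights falls from 22 to 12, so eth1 moves to 34 threads and eth2 to 7.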

Installing dependencies via dep

The component of slinkwatch talking to systemd requires a specific godbus version. Please make sure to run

$ dep ensure

before building to make sure the correct version constraints apply.

Usage

Define the interfaces that are available to assign in a YAML file, together with their weights:

# Interfaces available for Suricata
--- 
ifaces:
  eth1: 
    clusterid: 98
    threadweight: 10
  eth2: 
    clusterid: 97
    threadweight: 2
  eth3: 
    clusterid: 96
    threadweight: 10

Adjust the template to fit the desired configuration format:

%YAML 1.1
---
af-packet:{{ range $iface, $vals := . }}
  - interface: {{ $iface }}
    threads: {{ $vals.Threads }}
    cluster-id: {{ $vals.ClusterID }}
    cluster-type: cluster_flow
    defrag: yes
    rollover: yes
    use-mmap: yes
    tpacket-v3: yes
    use-emergency-flush: yes
    buffer-size: 128000
{{ else }}
  - interface: default
    threads: auto
    use-mmap: yes
    rollover: yes
    tpacket-v3: yes
{{ end }}

Finally, run slinkwatch, preferably in the background (there's also a systemd service unit file in the repo).

$ slinkwatch run 

It is possible to specify the locations of the template and config file using the command line parameters:

$ slinkwatch run --help
Run the slinkwatch service

Usage:
  slinkwatch run [flags]

Flags:
  -c, --config string            Configuration file (default "config.yaml")
  -d, --delta-bytes uint         threshold of bytes to be exceeded on interface to be marked as up (default 100)
  -h, --help                     help for run
  -i, --interfaces string        Template file for interfaces (default "interfaces.tmpl")
  -p, --poll-interval duration   poll time for interface changes (default 5s)
  -r, --restart-command string   Suricata restart command (default "/etc/init.d/suricata restart")
  -s, --service-name string      systemd service name for Suricata service (default "suricata.service")
  -t, --target-file string       Target YAML file with interface information (default "/etc/suricata/interfaces.yaml")
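One possible reading of --delta-bytes and --poll-interval is sketched here. It assumes that an interface counts as up only if its byte counter grew by more than the threshold between two polls; whether slinkwatch uses the received-bytes counter, and how it combines this with the link state, is an assumption. The helper names are hypothetical; the sysfs path is the standard Linux location of per-interface statistics.

```python
from pathlib import Path


def read_rx_bytes(iface):
    """Read the kernel's received-bytes counter for an interface (Linux sysfs)."""
    return int(Path(f"/sys/class/net/{iface}/statistics/rx_bytes").read_text())


def exceeds_delta(previous_bytes, current_bytes, delta_bytes=100):
    """True if the counter grew by more than delta_bytes between two polls."""
    return current_bytes - previous_bytes > delta_bytes


# Two samples taken one poll interval apart (values made up for illustration).
print(exceeds_delta(1000, 1101))  # grew by 101 bytes: above the default threshold
print(exceeds_delta(1000, 1100))  # grew by exactly 100 bytes: not exceeded
```

Under this reading, a higher threshold would keep an interface that only sees a trickle of traffic out of the active set.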

Other commands

  • slinkwatch make-config creates an initial YAML file with skeleton config entries for local interfaces (or a subset defined by a regular expression)
  • slinkwatch makeman creates a set of man pages for the tool
  • slinkwatch show-active lists the currently active set of interfaces (useful for debugging)

Dependencies/requirements

slinkwatch needs ifplugo for network change notifications, which introduces a runtime dependency on libdaemon. Using systemd is also highly recommended.

Authors

Sascha Steinbiss

License

GPL2 (due to ifplugo being GPL2).

slinkwatch's People

Contributors

hillu, norg, satta

slinkwatch's Issues

Add an option to manage affinity settings

With cluster_qm and fixed affinity settings between network queues and CPU cores, the Suricata affinity settings and the Suricata interface settings must be aligned.

Example:

enp94s0f0 cluster_qm mode with CPU cores 1-9 assigned on NUMA node 0 via set_irq_affinity and RSS queues
enp94s0f1 cluster_qm mode with CPU cores 21-29 assigned on NUMA node 0 via set_irq_affinity and RSS queues
enp175s0f0 cluster_flow mode with CPU cores 11-19 assigned on NUMA node 1

This requires that the worker-cpu-set is set to [ "1-9", "21-29", "11-19" ] if the interface config has the order enp94s0f0, enp94s0f1, enp175s0f0, as Suricata picks the cores in the order of the interface config.
If [ "1-9", "11-19", "21-29" ] were set instead, enp94s0f1 would use the cores 11-19 although it should use 21-29.

This could change in the future if Suricata changes the behaviour or the interface settings can define the cores directly.
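The ordering constraint from this issue can be sketched as follows. The function worker_cpu_set is hypothetical and not an existing slinkwatch feature; it only shows that the list has to follow the order of the interface config rather than, say, numeric core order.

```python
def worker_cpu_set(interface_order, cores_by_interface):
    """Build the worker-cpu-set list in the same order as the interface config."""
    return [cores_by_interface[iface] for iface in interface_order]


# Core ranges taken from the example in this issue.
cores = {
    "enp94s0f0": "1-9",
    "enp94s0f1": "21-29",
    "enp175s0f0": "11-19",
}
order = ["enp94s0f0", "enp94s0f1", "enp175s0f0"]

print(worker_cpu_set(order, cores))  # ['1-9', '21-29', '11-19']
```

Sorting the ranges numerically would instead yield "1-9", "11-19", "21-29", which is the misaligned case described in this issue.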

add option to define threads hardcoded

With symmetric queues it is recommended to use the same number of threads as queues. This might work with threadweight, but it might be better to force a specific number of threads to ensure it equals the number of queues.

Add an option to set rss queues or call scripts

In cluster_qm mode, changing the number of threads assigned to an interface also requires updating the NIC settings via ethtool and set_irq_affinity. An easy workaround would be an option to call external scripts; alternatively, slinkwatch could take care of this itself.
The latter could be more stable, as checks could be added to verify that the settings were applied correctly.

cover failover use cases

There are scenarios where two (or more) interfaces operate in active/passive mode: interface0 receives the traffic to inspect and interface1 does not. Sometimes, however, interface1 still has its link up and receives a small amount of (management) traffic. In those cases it is best to assign all cores and queues in cluster_qm mode to the interface carrying the traffic and to avoid the other one completely. In the failover case, where interface0 loses the traffic and interface1 receives it, the assignment needs to be swapped.

One part is making sure the swap happens, but it needs to be defined how a failover is detected, since a simple link or traffic check won't work. Some sort of threshold, possibly customizable via a value or the template, could define how much traffic still counts as "inactive" for an interface.

The other part is making sure that this really is a failover scenario and not just a shift in traffic volume. This might be solved by a flag, because it could also be that only part of the traffic is gone or has moved, which is nearly impossible to handle.

Add "notification mode" that only logs interface status changes

For statically configured sensor setups, it makes sense to have slinkwatch running as a notification-only tool that only logs active set state changes instead of modifying the interface configuration and restarting Suricata. Operations engineers could then use this information to investigate loss of visibility events individually.

add an option to set a specific runmode

In some scenarios where a system has both 10GE and 1GE cards, you might want to use the workers runmode for the 10GE cards and the autofp runmode for the 1GE cards (due to performance/wrong_thread issues). This requires detecting which cards are in use and changing the runmode in the generated config accordingly.
