go-pmtud

go-pmtud is a simplified implementation of cloudflare/pmtud in Go.

Problem

ECMP (Equal-Cost Multi-Path) routing on bare-metal Kubernetes clusters makes it possible to share traffic load across nodes (e.g. by using service addresses of type ExternalIP).

Hosts (and Pods) try to use the full MTU derived from their interface configuration (e.g. 9000 bytes).

If (1) the MTU is smaller somewhere on the path between sender and receiver and (2) the packet has the DF (Don't Fragment) bit set, the router sends an ICMP Destination Unreachable message (type 3, code 4) to the sender, i.e. the originator of the too-large packet.

With ECMP, this ICMP message may be delivered to a different node than the original sender. The sender then never learns about the smaller MTU, and the communication breaks.

More details in this blog post by Cloudflare: Path MTU discovery in practice.

go-pmtud replicates ICMP Destination Unreachable packets to all nodes in the same Kubernetes cluster, so that the sender learns that it has to use smaller packets for a particular destination.
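To illustrate why the ICMP message can miss the sender, here is a minimal sketch of hash-based next-hop selection. The `flow` type, the choice of hashed fields, the FNV hash, and the addresses are all assumptions made for illustration; real routers use their own hash functions and field sets.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// flow holds the header fields a router might hash for ECMP.
// Which fields are used is an assumption; real devices differ.
type flow struct {
	SrcIP, DstIP     string
	Proto            uint8
	SrcPort, DstPort uint16
}

// pickNode maps a flow to one of n equal-cost next hops.
func pickNode(f flow, n int) int {
	h := fnv.New32a()
	fmt.Fprintf(h, "%s|%s|%d|%d|%d", f.SrcIP, f.DstIP, f.Proto, f.SrcPort, f.DstPort)
	return int(h.Sum32() % uint32(n))
}

func main() {
	const nodes = 4
	// Client traffic towards the shared service address.
	tcp := flow{SrcIP: "203.0.113.7", DstIP: "192.0.2.10", Proto: 6, SrcPort: 51514, DstPort: 443}
	// ICMP error from a router towards the same service address:
	// different source, different protocol, no ports.
	icmp := flow{SrcIP: "198.51.100.1", DstIP: "192.0.2.10", Proto: 1}

	fmt.Println("TCP flow hashed to node:  ", pickNode(tcp, nodes))
	fmt.Println("ICMP error hashed to node:", pickNode(icmp, nodes))
}
```

Because the two packets differ in every hashed field except the destination address, nothing ties the ICMP error to the node that handles the TCP flow.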

Concept

  1. ICMP Destination Unreachable packets (type 3, code 4) are matched by iptables and sent to a specific NFLOG group.

  2. go-pmtud replicates these ICMP packets to all nodes in the same Kubernetes cluster, so that the sender pod learns that it has to use smaller packets for a particular destination.

To verify, exec into the pod and check the cached route to the destination before and after the ICMP message arrives:

# ip route get 192.168.100.10
192.168.100.10 via 192.100.0.1 dev eth0 src 192.100.0.50
    cache  expires 484sec mtu 9000  <<<< connection is failing

# ip route get 192.168.100.10
192.168.100.10 via 192.100.0.1 dev eth0 src 192.100.0.50
    cache  expires 484sec mtu 8996  <<<< correct MTU information, connection is working
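The MTU value shown in the route cache comes from the ICMP message itself. As a minimal sketch (not go-pmtud's actual code), the next-hop MTU can be read from a type 3, code 4 message using the layout defined in RFC 1191: type, code, checksum, two unused bytes, then a 16-bit next-hop MTU. The function name `nextHopMTU` is hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// nextHopMTU extracts the next-hop MTU from a raw ICMPv4 message.
// The slice must start at the ICMP header, not at the IP header.
func nextHopMTU(icmp []byte) (uint16, error) {
	if len(icmp) < 8 {
		return 0, errors.New("ICMP message shorter than the 8-byte header")
	}
	if icmp[0] != 3 || icmp[1] != 4 {
		return 0, fmt.Errorf("not a fragmentation-needed message: type %d code %d", icmp[0], icmp[1])
	}
	// Bytes 6-7 carry the next-hop MTU in network byte order (RFC 1191).
	return binary.BigEndian.Uint16(icmp[6:8]), nil
}

func main() {
	// type=3, code=4, checksum (zeroed for the example), unused, MTU=0x2324
	msg := []byte{3, 4, 0, 0, 0, 0, 0x23, 0x24}
	mtu, err := nextHopMTU(msg)
	fmt.Println(mtu, err) // prints "8996 <nil>"
}
```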

Build

Build from source:

go mod download
go build -v -o /go-pmtud cmd/go-pmtud/main.go

Build a Docker image:

docker build -t go-pmtud .

go-pmtud options

The following options are available:

  1. peers - list of peers to which ICMP frag-needed packets are resent.
  2. iface - interface that listens for ICMP packets and resends them to the peers.
  3. nodename - node hostname, used as a metric label.
  4. nflog-group - NFLOG group, set to 33 in our case.
  5. metrics-port - port for Prometheus metrics (30040 by default).
  6. ttl - TTL of replicated ICMP packets.
  7. ignore-networks - do not resend ICMP frag-needed packets that originate from the specified networks.

If iface is empty, go-pmtud determines the outgoing interface from the default route.

Example - go-pmtud DaemonSet

go-pmtud can run as a DaemonSet (example).

Example values.yaml:

images:
  iptables:
    repository: sapcc/iptables
    tag: v20191226161919
  pmtud:
    repository: sapcc/go-pmtud
    tag: latest

iptables:
  nflogGroup: 33
  ignoreSourceNetworks: 192.168.100.0/24

pmtud:
  ttl: 10
  metricsPort: 30040
  interface: eth0
  peers: 192.168.100.2, 192.168.100.3, 192.168.100.4, 192.168.100.5, 192.100.0.50

Example - iptables and NFlog

There is an iptables rule on each node that sends ICMP Destination Unreachable packets (type 3, code 4) to NFLOG group 33:

iptables -t raw -A PREROUTING -i <interface> -p icmp -m icmp --icmp-type 3/4 -j NFLOG --nflog-group 33

Important: to avoid resending loops, packets whose source is in the summarized networks of the nodes in the local cluster must be ignored. Use the ignore-networks option for this. As a result, a node does not resend ICMP messages that another node has already replicated; it only resends messages that originate elsewhere, usually from routers on the path.

License

This project is licensed under the Apache 2.0 License; see the LICENSE file for details.
