
piko's Introduction


What Is Piko?

Piko is a reverse proxy that provides a secure way to connect to services that aren’t publicly routable, known as tunneling. Instead of sending traffic directly to your services, your upstream services open outbound-only connections (tunnels) to Piko, then Piko forwards traffic to your services via their established connections.

Piko has two key design goals:

  • Built to serve production traffic by running as a cluster of nodes for fault tolerance, horizontal scaling and zero-downtime deployments
  • Simple to host behind an HTTP(S) load balancer on Kubernetes

Therefore Piko can be used as an open-source alternative to Ngrok.

For example, you may use Piko to expose services in a customer network, to build a bring your own cloud (BYOC) service, or to connect to user devices.

Reverse Proxy

In a traditional reverse proxy, you configure routing rules describing how to route incoming traffic to your upstream services. The proxy will then open connections to your services and forward incoming traffic. This means your upstream services must be discoverable and have an exposed port that's accessible from the proxy.

With Piko, by contrast, your upstreams open outbound-only connections to the Piko server and specify which endpoint they are listening on. Piko then forwards incoming traffic to the correct upstream via its outbound connection.

Therefore your services may run anywhere without requiring a public route, as long as they can open a connection to the Piko server.

Endpoints

Upstream services listen for traffic on a particular endpoint. Piko then manages routing incoming connections and requests to an upstream service listening on the target endpoint. If multiple upstreams are listening on the same endpoint, requests are load balanced among the available upstreams.

No static configuration is required to define endpoints; upstreams can listen on any endpoint they choose.

You can open an upstream listener using the Piko agent, which supports both HTTP and TCP upstreams. For example, to listen on endpoint my-endpoint and forward traffic to localhost:3000:

# HTTP listener.
$ piko agent http my-endpoint 3000

# TCP listener.
$ piko agent tcp my-endpoint 3000

You can also use the Go SDK to listen directly from your application using a standard net.Listener.
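
As a rough illustration, the sketch below serves plain net/http behind a Piko endpoint. The import path and the New/Listen names are hypothetical placeholders rather than the SDK's actual API; the point is only that the SDK hands back a standard net.Listener.

package main

import (
	"context"
	"log"
	"net/http"

	pikoclient "example.com/piko/client" // hypothetical import path
)

func main() {
	client := pikoclient.New() // hypothetical constructor

	// Listen on endpoint "my-endpoint"; the returned listener is assumed to
	// satisfy net.Listener, so it plugs straight into http.Serve.
	ln, err := client.Listen(context.Background(), "my-endpoint")
	if err != nil {
		log.Fatal(err)
	}

	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("hello from an upstream behind Piko\n"))
	})
	log.Fatal(http.Serve(ln, handler))
}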

[Overview diagram]

HTTP(S)

Piko acts as a transparent HTTP(S) reverse proxy.

Incoming HTTP(S) requests identify the target endpoint to connect to using either the Host header or x-piko-endpoint header.

When using the Host header, Piko uses the first segment of the hostname as the endpoint ID. For example, if you're hosting Piko with a wildcard domain at *.piko.example.com, a request to foo.piko.example.com will be routed to an upstream listening on endpoint foo.

To avoid having to set up a wildcard domain, you can instead use the x-piko-endpoint header. For example, if Piko is hosted at piko.example.com, you can send requests to endpoint foo using the header x-piko-endpoint: foo.
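
As a rough illustration, the sketch below sends a request to endpoint foo both ways, reusing the example domain above (the /users path is arbitrary).

package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Route by wildcard domain: the first segment of the Host header
	// ("foo") selects the endpoint.
	resp, err := http.Get("https://foo.piko.example.com/users")
	if err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()
	fmt.Println("via Host header:", resp.Status)

	// Route by header: send to the Piko domain itself and name the
	// endpoint with x-piko-endpoint.
	req, err := http.NewRequest(http.MethodGet, "https://piko.example.com/users", nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("x-piko-endpoint", "foo")
	resp, err = http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()
	fmt.Println("via x-piko-endpoint header:", resp.Status)
}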

TCP

Piko supports proxying TCP traffic, though unlike HTTP it requires using either Piko forward or the Go SDK to map the desired local TCP port to the target endpoint.

Piko forward listens on a local TCP port and forwards connections to the configured upstream endpoint via the Piko server.

For example, to listen on port 3000 and forward connections to endpoint my-endpoint:

piko forward 3000 my-endpoint

Note that unlike with HTTP, there is no way to identify the target endpoint when connecting with raw TCP, which is why you must connect via Piko forward instead of connecting directly to the Piko server. Piko forward can also authenticate with the server and forward connections over TLS.

You can also use the Go SDK to open a net.Conn that's connected to the configured endpoint.
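
As with the listener example earlier, the sketch below is illustrative only: the import path and the New/Dial names are hypothetical placeholders for the SDK, and the upstream is assumed to speak a simple line-based protocol.

package main

import (
	"context"
	"fmt"
	"log"

	pikoclient "example.com/piko/client" // hypothetical import path
)

func main() {
	client := pikoclient.New() // hypothetical constructor

	// Dial is assumed to return a standard net.Conn connected to whichever
	// upstream is listening on "my-endpoint" (method name is illustrative).
	conn, err := client.Dial(context.Background(), "my-endpoint")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	if _, err := conn.Write([]byte("PING\r\n")); err != nil {
		log.Fatal(err)
	}
	buf := make([]byte, 64)
	n, err := conn.Read(buf)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("received: %q\n", buf[:n])
}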

Design Goals

Production Traffic

Piko is built to serve production traffic by running the Piko server as a cluster of nodes, making it fault tolerant, able to scale horizontally, and able to support zero-downtime deployments.

Say an upstream is listening for traffic on endpoint E and connects to node N. Node N will notify the other nodes that it has a listener for endpoint E, so they can route incoming traffic for that endpoint to node N, which then forwards it to the upstream via the upstream's outbound-only connection. If node N fails or is deprovisioned, the upstream listener will reconnect to another node and the new routing information is propagated across the cluster. See How Piko Works for details.
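
Conceptually, each node keeps a routing table mapping endpoints to the peers that report upstream listeners for them. The sketch below is an illustration of that idea only, not Piko's actual data structures.

package cluster

// RoutingTable tracks which peers report upstream listeners for each endpoint.
type RoutingTable struct {
	// endpoint ID -> set of node IDs with at least one upstream listener.
	endpoints map[string]map[string]struct{}
}

func NewRoutingTable() *RoutingTable {
	return &RoutingTable{endpoints: make(map[string]map[string]struct{})}
}

// AddListener records that node has an upstream listener for endpoint.
func (t *RoutingTable) AddListener(endpoint, node string) {
	if t.endpoints[endpoint] == nil {
		t.endpoints[endpoint] = make(map[string]struct{})
	}
	t.endpoints[endpoint][node] = struct{}{}
}

// RemoveNode drops all routes via a node that left the cluster or failed.
func (t *RoutingTable) RemoveNode(node string) {
	for endpoint, nodes := range t.endpoints {
		delete(nodes, node)
		if len(nodes) == 0 {
			delete(t.endpoints, endpoint)
		}
	}
}

// NodesFor returns the candidate nodes for an endpoint.
func (t *RoutingTable) NodesFor(endpoint string) []string {
	var out []string
	for node := range t.endpoints[endpoint] {
		out = append(out, node)
	}
	return out
}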

Piko also has a Prometheus endpoint, access logging, and a status API so you can monitor your deployment and debug issues. See observability for details.

Hosting

Piko is built to be simple to host on Kubernetes. This means it can run as a cluster of nodes (such as a StatefulSet), supports gradual rollouts, and can be hosted behind an HTTP load balancer or Kubernetes Gateway.

Upstream services and downstream clients may connect to any node in the cluster via the load balancer, then the cluster manages routing traffic to the appropriate upstream.

See Kubernetes for details.

Getting Started

See Getting Started.

How Piko Works

See How Piko Works.

Support

Use GitHub Discussions to ask questions, get help, or suggest ideas.

Docs

See Wiki.

Contributing

See CONTRIBUTING.

License

MIT License, please see LICENSE for details.

piko's People

Contributors

andydunstall, dependabot[bot], insanity54, lhpqaq, maxnowack, yquansah


piko's Issues

'No connected upstream' retries

Say a request for endpoint E is sent to node N1, but N1 doesn't have an upstream connection for that endpoint. N1 will then check its local view of the cluster to see if another node has an upstream connection for endpoint E. If N1 finds one or more nodes reporting an upstream for endpoint E, the request is load balanced among those nodes and sent to one of them, say N2.

However, since the cluster state is eventually consistent, N2 may no longer have an upstream connection for endpoint E, such as if the upstream disconnected in the last 100ms and the updated routing information hasn't yet propagated to N1.

In this case N2 will respond that it doesn't have an upstream connection for endpoint E, so N1 can safely back off and retry the request, as it's sure the request never reached an upstream. Since cluster state updates propagate quickly, N1 should quickly learn about the new cluster state.

This minimises disruption when nodes leave the cluster (either gracefully or due to failure), or when upstreams disconnect from the server and reconnect.

If the first attempt fails because N1 doesn't know of any nodes with an upstream connection for endpoint E, the request should fail immediately. It's only worth retrying if there recently was an upstream connection for endpoint E, as it's likely there will be one again once the upstream reconnects. This should be configurable so you can disable retries (for example, you may prefer to retry at the application level).
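
A rough sketch of the policy described above; the knownUpstream and forward callbacks are placeholders for the real routing and forwarding logic.

package proxy

import (
	"errors"
	"time"
)

var errNoUpstream = errors.New("no connected upstream")

// forwardWithRetry sketches the retry policy: if the local node knows of no
// node with an upstream for the endpoint, fail immediately; otherwise back
// off and retry a few times while the cluster state converges.
func forwardWithRetry(endpoint string, knownUpstream func(string) bool,
	forward func(string) error, maxAttempts int) error {

	if !knownUpstream(endpoint) {
		return errNoUpstream // no known upstream; not worth retrying
	}

	backoff := 50 * time.Millisecond
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = forward(endpoint); err == nil {
			return nil
		}
		// The remote node reported it has no upstream, so the request never
		// reached an upstream and it is safe to retry.
		time.Sleep(backoff)
		backoff *= 2
	}
	return err
}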

Cluster netsplit recovery

Say you have a cluster with 6 nodes, and a network partition means one half of the cluster can't talk to the other half.

Currently Piko will end up with 2 smaller clusters, where each considers the other as unreachable or no longer part of the cluster.

To ensure the cluster recovers when the netsplit heals, each node should periodically attempt to gossip with any unknown nodes. For example, when service discovery is configured using DNS (such as a headless service on Kubernetes), the nodes can re-resolve the domain, check whether any of the returned nodes are not considered part of the cluster, and attempt to contact those nodes.
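
A rough sketch of that recovery loop, with knownMember and attemptGossip standing in for the real membership and gossip code.

package gossip

import (
	"net"
	"time"
)

// rejoinUnknownNodes periodically re-resolves the cluster's join domain
// (e.g. a Kubernetes headless service) and attempts to gossip with any
// address not currently considered a cluster member, so the two halves of a
// netsplit can re-merge once the network recovers.
func rejoinUnknownNodes(joinDomain string, interval time.Duration,
	knownMember func(addr string) bool, attemptGossip func(addr string)) {

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for range ticker.C {
		addrs, err := net.LookupHost(joinDomain)
		if err != nil {
			continue // DNS hiccup; try again next tick
		}
		for _, addr := range addrs {
			if !knownMember(addr) {
				attemptGossip(addr)
			}
		}
	}
}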

Upstream connection rebalancing

Say you have a cluster with 3 nodes, where each node has 1000 upstream connections. If those nodes are becoming overloaded you may increase the number of replicas (either manually or with autoscaling).

Currently, if you add 3 more nodes, you'll end up with 3 nodes having 1000 upstream connections and 3 nodes with 0 upstream connections. Therefore Piko should rebalance upstream connections.

As Piko is designed to be hosted behind a load balancer, if a node drops the connection to an upstream service, that service will reconnect to a random node. Therefore, when nodes find they have far more connections than the average for the cluster, they can gradually shed connections to upstreams, which will then reconnect to a random node, rebalancing the cluster.

In the above example, the average number of connections across the 6 nodes is 500, but the first three nodes each have 1000 connections. The threshold and shed rate could be configurable, such as shedding when a node has 20% more connections than the cluster average, and shedding 0.5% of connections every second.
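
A sketch of that policy as a simple calculation, using the example figures above (a 20% threshold and a 0.5% shed rate).

package rebalance

// connectionsToShed returns how many upstream connections a node should drop
// this tick: if the node holds more than threshold (e.g. 1.2 = 20% above)
// times the cluster-average number of connections, it sheds shedRate
// (e.g. 0.005 = 0.5%) of its connections; dropped upstreams reconnect via
// the load balancer to a random node.
func connectionsToShed(local, clusterTotal, clusterNodes int,
	threshold, shedRate float64) int {

	if clusterNodes == 0 {
		return 0
	}
	avg := float64(clusterTotal) / float64(clusterNodes)
	if float64(local) <= avg*threshold {
		return 0 // within tolerance of the cluster average
	}
	n := int(float64(local) * shedRate)
	if n < 1 {
		n = 1
	}
	return n
}

// Example from above: 6 nodes, 3000 total connections (average 500); a node
// with 1000 connections and a 20%/0.5% policy sheds 5 connections this tick:
//
//	connectionsToShed(1000, 3000, 6, 1.2, 0.005) == 5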

Upstream health checks

Say you have 10 upstream connections for endpoint E, but one of the upstreams is failing to serve requests. Piko should stop forwarding requests to that unhealthy upstream.

Therefore Piko should support health checks for upstream services (similar to other reverse proxies like Caddy).
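
A sketch of what an active health check could look like: probe each upstream on an interval and stop routing to it after consecutive failures. This illustrates the idea only; it is not existing Piko behaviour.

package health

import (
	"net/http"
	"sync/atomic"
	"time"
)

// Checker probes an upstream's health URL on an interval and marks the
// upstream unhealthy after `threshold` consecutive failed probes.
type Checker struct {
	url       string
	interval  time.Duration
	threshold int
	healthy   atomic.Bool
}

func NewChecker(url string, interval time.Duration, threshold int) *Checker {
	c := &Checker{url: url, interval: interval, threshold: threshold}
	c.healthy.Store(true)
	return c
}

// Healthy reports whether the proxy should keep forwarding to this upstream.
func (c *Checker) Healthy() bool { return c.healthy.Load() }

// Run probes the upstream until stop is closed.
func (c *Checker) Run(stop <-chan struct{}) {
	failures := 0
	ticker := time.NewTicker(c.interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			resp, err := http.Get(c.url)
			if err != nil || resp.StatusCode >= 500 {
				failures++
			} else {
				failures = 0
			}
			if resp != nil {
				resp.Body.Close()
			}
			c.healthy.Store(failures < c.threshold)
		}
	}
}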

Upstream load balancing

Say you have 10 upstream connections for endpoint E. Piko should attempt to distribute load evenly among those upstream connections.

If node N has upstream connections for the endpoint connected locally, it will distribute requests among those upstreams in a round-robin fashion.

However, if N doesn't have an upstream connection itself and so must forward to another node, it should still attempt to load balance requests evenly among the known upstream connections. For example, if node N1 handles requests for endpoint E, and it knows node N2 has 10 upstream connections for the endpoint while N3 has 2, it should send 5 times more requests to N2 than to N3 (those nodes will then load balance among their connected upstreams).
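
A sketch of that weighted selection, choosing the node to forward to with probability proportional to its reported upstream connections.

package balance

import "math/rand"

// pickNode selects a node with probability proportional to how many upstream
// connections it reports for the endpoint, so a node with 10 upstreams
// receives roughly 5x the requests of a node with 2.
func pickNode(upstreamsByNode map[string]int) string {
	total := 0
	for _, n := range upstreamsByNode {
		total += n
	}
	if total == 0 {
		return "" // no known upstreams for this endpoint
	}
	r := rand.Intn(total)
	for node, n := range upstreamsByNode {
		if r < n {
			return node
		}
		r -= n
	}
	return "" // unreachable
}

// Example: pickNode(map[string]int{"N2": 10, "N3": 2}) returns "N2" roughly
// five times as often as "N3".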

invalid go version '1.21.1': must match format 1.23

Getting started with piko, I run into the following error.

root@vault:~/piko# make piko
mkdir -p bin
go build -o bin/piko main.go
go: errors parsing go.mod:
/root/piko/go.mod:3: invalid go version '1.21.1': must match format 1.23
make: *** [Makefile:7: piko] Error 1

I made sure to apt install -y golang-1.21, not sure what to do beyond that.

root@vault:~/piko# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.4 LTS
Release:        22.04
Codename:       jammy

Helm chart

Congrats on your project. Looks promising.

Have you any plans to add Helm charts or/and an Operator?

Connection closures running `piko` as `StatefulSet`

It seems as though when I run piko as a StatefulSet using tcp, I am running into random connection closures (from the agent?).

These are the logs I see from the agent:

{"level":"info","ts":"2024-07-29T18:18:19.116Z","subsystem":"proxy.tcp.access","msg":"connection opened","endpoint-id":"my-redis-endpoint"}
{"level":"debug","ts":"2024-07-29T18:18:49.116Z","subsystem":"proxy.tcp","msg":"copy to conn closed","endpoint-id":"my-redis-endpoint","error":"writeto tcp [::1]:41612->[::1]:6379: read tcp [::1]:41612->[::1]:6379: use of closed network connection"}
{"level":"info","ts":"2024-07-29T18:18:49.116Z","subsystem":"proxy.tcp.access","msg":"connection closed","endpoint-id":"my-redis-endpoint"}

One interesting thing I noticed is that the connection closures seem to happen at a regular interval from the time a connection is opened (30s). Interestingly, this does not happen when I just run 1 replica of the server, only when I run more than 1 using the gossip protocol.

Here is a repro config, running the workload on kubernetes (minikube cluster)...

apiVersion: v1
kind: Service
metadata:
  name: piko-forward-redis
  labels:
    app: piko-forward-redis
spec:
  type: NodePort
  ports:
    - port: 6001
      protocol: TCP
      name: forward
  selector:
    app: piko-forward-redis
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: piko-forward-redis
spec:
  selector:
    matchLabels:
      app: piko-forward-redis
  template:
    metadata:
      labels:
        app: piko-forward-redis
    spec:
      containers:
        - name: piko-forward-redis
          image: piko-hyperbolic:a587abc
          imagePullPolicy: IfNotPresent
          args:
            - forward
            - tcp
            - --connect.url
            - http://piko.default.svc:8000
            - "0.0.0.0:6001"
            - my-redis-endpoint
          ports:
            - containerPort: 6001
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-server
  namespace: default
spec:
  selector:
    matchLabels:
      app: redis-server
  template:
    metadata:
      labels:
        app: redis-server
    spec:
      containers:
        - name: piko-agent
          image: piko-hyperbolic:a587abc
          imagePullPolicy: IfNotPresent
          args:
            - agent
            - tcp
            - --log.level
            - debug
            - --server.bind-addr
            - 0.0.0.0:5000
            - --connect.url
            - http://piko.default.svc:7000
            - my-redis-endpoint
            - "6379"
          ports:
            - containerPort: 5000
        - name: redis-server
          image: redis:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 6379
---
apiVersion: v1
kind: Service
metadata:
  name: piko
  labels:
    app: piko
spec:
  ports:
    - port: 8000
      name: proxy
    - port: 8002
      name: admin
    - port: 8003
      name: gossip
    - port: 7000
      name: upstream
      targetPort: 7000
  selector:
    app: piko
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: server-config
data:
  server.yaml: |
    cluster:
      node_id_prefix: ${POD_NAME}-
      join:
        - piko.default.svc
    upstream:
      bind_addr: 0.0.0.0:7000
    log:
      level: debug
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: piko
spec:
  selector:
    matchLabels:
      app: piko
  serviceName: "piko"
  replicas: 3
  template:
    metadata:
      labels:
        app: piko
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: piko
          image: piko-hyperbolic:a587abc
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8000
              name: proxy
            - containerPort: 7000
              name: upstream
            - containerPort: 8002
              name: admin
            - containerPort: 8003
              name: gossip
          args:
            - server
            - --config.path
            - /config/server.yaml
            - --config.expand-env
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          volumeMounts:
            - name: config
              mountPath: "/config"
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: server-config
            items:
              - key: "server.yaml"
                path: "server.yaml"

I will try and dig into the gossip protocol tomorrow, but just wanted to raise this issue in case there were any quick hints from your end @andydunstall.

Note

The image names would need to be changed from the above config. The images there were just from my local Docker build.
