
leia's Introduction

leia

What?

A service backed by Ktor, writing incoming HTTP requests to one or several topics on one or more message brokers (Kafka, Redis or Cloud Pub/Sub).

Why?

A call to a webhook is often delivered only once. Writing the HTTP request to a semi-permanent message broker log allows for repeated processing.

How?

By defining routes with the same syntax used for Ktor routing.

Example

title = "My routes config"

[[routes]]
        path = "/status/mail"
        topic = "mail"
        verify = false
        format = "proto"
        methods = [ "POST", "PUT", "HEAD", "GET" ]

[[routes]]
        path = "/api/{id}"
        topic = "api_calls"
        format = "raw_body"
        verify = true
        validateJson = true
        jsonSchema = """
<put your JSON schema here>
"""

Routing is saved in files with the .toml extension, and their location is set with the environment variable CONFIG_DIRECTORY. If this is not set, it defaults to /etc/config/.

Configuration

Configuration consists of a few tables: routes, sink providers and auth providers, described below.

Configuration is loaded on startup and then refreshed every minute.

It can be retrieved either from files or from Kubernetes custom resources.

Configuration in files

Configuration in a file is stored in the TOML file format. See the example configuration here. You can have several configuration files stored in /etc/config/ or in the path defined by the CONFIG_DIRECTORY environment variable.

Configuration in kubernetes

Configuration in Kubernetes requires creating the custom resource definitions first:

kubectl create -f leiaroute_crd.yaml
kubectl create -f leiasinkprov_crd.yaml

Then add your objects:

kubectl create -f object1.yaml
...

Custom resource definitions and sample object definition YAML files can be found here.

Routes

A route defines where a given request should be sent. An annotated example follows the list below.

  • path - is mandatory, the HTTP URL path to match
  • topic - is mandatory, the topic the message will be sent to
  • sink - is optional, the name of the sink the message will be sent to; if not provided, the default sink is used.
  • verify - is optional, default is false. Decides whether only requests with a verified JSON Web Token are allowed. Requires ktor-jwt.
  • format - is optional, default is proto, a format using protobuf defined in zensum/webhook-proto. Also available is raw_body, which writes the HTTP body as-is.
  • methods - is optional, default is all seven HTTP verbs.
  • response - is optional, the HTTP response code to send to the sender on success, default is 204.
  • validateJson - is optional, whether to validate the body of the request as JSON before sending, default is false. 204 is returned on success, 400 on invalid JSON.
  • jsonSchema - is optional, validates the request body against this JSON schema if validateJson is set to true.
  • cors - is optional, a list of hosts to check the incoming request's origin against for CORS. By default all traffic is allowed. If the list contains *, all hosts are allowed. When non-empty, the OPTIONS method is allowed implicitly.
  • hosts - is optional, a list of hosts to check the incoming request's Host header against.
  • authenticateUsing - is optional, a list of auth providers to verify the request against.
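
A hypothetical route using most of the optional fields might look like the following; the path, topic, sink and auth provider names are placeholders, and the sink and auth provider names must match entries in your own sink provider and auth provider tables:

[[routes]]
        path = "/webhook/payment"
        topic = "payments"
        sink = "kafka1"
        methods = [ "POST" ]
        response = 200
        cors = [ "example.com" ]
        hosts = [ "api.example.com" ]
        authenticateUsing = [ "my-jwk-provider" ]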

Sink providers

Leia needs at least one sink configured and marked as default.

  • name - is mandatory, the name identifying the sink in the routes configuration sink field
  • type - is optional, one of these types: kafka, redis, gpubsub, null; default is kafka
  • isDefault - is optional, must be set to true for exactly one sink, default is false
  • options - is optional, additional options to configure the sink provider, given as key/value pairs

For testing purposes you can use the null sink type, which does not forward messages but only logs them.
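
A minimal sketch of a sink provider definition; the table name [[sink-providers]] is an assumption here (check the example configuration for the exact name), and the sink name and host are placeholders:

[[sink-providers]]
        name = "kafka1"
        type = "kafka"
        isDefault = true
        options = { host = "kafka:9092" }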

Options

Kafka

To set the Kafka hostname and port, use the following option in the configuration:

host = "<hostname>:<port>"
Redis

To set the hostname and port for Redis Pub/Sub, use the following options in the configuration:

host = "<hostname>"
port = "<port>"
Cloud Pub/Sub

To use a project other than the default one for Cloud Pub/Sub, use the following option in the configuration:

projectId = "<your-project-id>"

Topics must exist in Cloud Pub/Sub before Leia can write to them, so create them first:

gcloud pubsub topics create <topic> --project <your-project-id>

The GOOGLE_APPLICATION_CREDENTIALS environment variable needs to point to your cloud service account key JSON file (see Create service account key).
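
Putting this together, a hypothetical Cloud Pub/Sub sink provider; the sink name and project ID are placeholders, and the table name is assumed as above:

[[sink-providers]]
        name = "gpubsub1"
        type = "gpubsub"
        options = { projectId = "my-gcp-project" }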

Auth providers

This table is optional in configuration.

  • name - is mandatory, the name identifying the auth provider in the routes configuration authenticateUsing field
  • type - is optional, one of these types: jwk, basic_auth, no_auth; default is no_auth
  • options - is optional, additional options to configure the auth provider, given as key/value pairs

Basic_auth

The option basic_auth_users is a map of users and passwords.

Jwk

The option jwk_config is mandatory; it is a map of key/value pairs. The map must contain the jwk_url and jwk_issuer keys.
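
A hypothetical sketch of the two non-default provider types; the table name [[auth-providers]] is an assumption here (check the example configuration for the exact name), and all names, URLs and credentials are placeholders:

[[auth-providers]]
        name = "my-jwk-provider"
        type = "jwk"
        options = { jwk_config = { jwk_url = "https://example.com/.well-known/jwks.json", jwk_issuer = "https://example.com/" } }

[[auth-providers]]
        name = "internal-basic"
        type = "basic_auth"
        options = { basic_auth_users = { admin = "<password>" } }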

Environment variables

  • CONFIG_DIRECTORY - location of the toml files (default /etc/config/)
  • PORT - port on which leia listens for requests (default 80)
  • KUBERNETES_SERVICE_HOST - hostname of the Kubernetes API (default localhost)
  • KUBERNETES_SERVICE_PORT - port of the Kubernetes API (default 8080)
  • KUBERNETES_ENABLE - whether to read configuration from Kubernetes custom resources (default true). To disable, set it to false.
  • PROMETHEUS_ENABLE - if set to true, enables a /metrics endpoint on port 9090 exposing metrics for Prometheus (uses zensum/ktor-prometheus-feature); default false
  • LOG_REMOTE_HOST - if set to true, enables logging of the remote host (client address or host name); default false. Note that the value may come from headers and could be falsified.

Health check

To check the status of leia, make a request to http://leiahost/leia/health. Sample output:

sink redis1: ERROR
sink kafka1: OK
leia: ERROR

Error messages will be written to the logs. You can also get them by using the verbose parameter in the request. Here is sample output for http://leiahost/leia/health?verbose:

sink redis1: ERROR redis.clients.jedis.exceptions.JedisConnectionException: Failed connecting to host redis:6379
sink kafka1: OK
leia: ERROR

leia's People

Contributors

dependabot-support, mantono, pn, torbacka, williamhogman


leia's Issues

Dependabot couldn't find a build.gradle for this project

Dependabot couldn't find a build.gradle for this project.

Dependabot requires a build.gradle to evaluate your project's current Java dependencies. It had expected to find one at the path: /build.gradle.

If this isn't a Java project, or if it is a library, you may wish to disable updates for it from within Dependabot.

You can mention @dependabot in the comments below to contact the Dependabot team.

Advanced Authentication

Currently Leia supports only a trivial authentication scheme. If the JWK_URL env-var is set, routes with verify = true require a JWT token to be present in the HTTP authorization header. While JWT is a good, interoperable and federated authentication scheme, the current method is lacking in a number of ways. The system does not verify audience or subject claims as specified in the JWT standard, nor does it offer any mechanisms for verifying other parts of the claim.

In addition to the limitations in the implementation of JWT, there is also a need for other authentication schemes such as HTTP basic auth. Finally, introducing configurable authentication and authorisation mechanisms would enable us to expose certain internals to potential admin and monitoring tools when proper authentication is provided.

For these reasons I propose that we introduce a new concept to Leia which, for purposes of discussion, we will call an auth-provider. An auth-provider is a named filter which can be applied to a route so that only authenticated requests match the route. In addition to the filter, the auth-provider may expose additional fields to the data sinks and formats, such as username or isAuthenticated. A single route can be configured to use multiple auth-providers, and only a single auth-provider needs to match for the route to match.

Auth-providers themselves will have a type, e.g. JWK, BasicAuth or HashJWT. Depending on the type, different configuration variables will be available, e.g. URL for JWK or users for BasicAuth. Another idea for an auth-provider type is webhook, which authenticates the request against some other provider before letting it through.

Allow configuration from a folder of Kubernetes objects

Add a config provider that reads a folder of Kubernetes-style objects, reloading them on some interval:

---
      apiVersion: leia/v1
      kind: LeiaRoute 
      name: the_name_of_my_route_for_dedupe
      somevar: somevalue 

This might be a prerequisite for #62

Add support for ensuring that a request is JSON

When making an HTTP endpoint it is useful to be able to reject requests that do not contain valid JSON. To implement this issue, add a configuration field allowing a route to restrict input to valid JSON. If the JSON is invalid, 400 should be returned.

Be careful with what happens if the JSON payload is too big, so that it doesn't crash everything.
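
These fields now exist in the routes configuration (validateJson and jsonSchema, described above). A sketch of a route restricting input to valid JSON matching a minimal schema; the path, topic and schema are illustrative:

[[routes]]
        path = "/events"
        topic = "events"
        validateJson = true
        jsonSchema = """
{
  "type": "object",
  "required": [ "id" ],
  "properties": { "id": { "type": "string" } }
}
"""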

Implement Support for Basic Auth

This is directly related to #13, but since there are some questions directly concerning the basic auth implementation, I created a separate issue.

1. Should we use a simpler hash algorithm such as SHA-512 or a more complex one like PBKDF2WithHmacSHA1? Or should it be configurable?
2. Should we (allow/enforce) the use of salt for hashed basic auth credentials? Or can we expect the inputs to be so unique that salting will not be necessary?
3. If we will use PBKDF2, should the number of hash iterations be possible to configure?

  • Hashing
    • Uses SHA-256
    • No salt
    • Only hash the password, not the username
  • Return WWW-Authenticate: Basic realm="User Visible Realm", charset="UTF-8" with any 401 response.
  • Allow configuration of credentials - example config provided here

Support custom response codes

Some service providers will expect a specific response code and regard every other response as a failure (even another 2** code). For example, Twilio expects 204/No-Content while Mandrill expects 200/OK, which makes it hard to keep both of these services happy. This can be solved by providing an optional response code on success for each entry/route, with entries that omit it defaulting to 204.
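
With the response field from the routes table above, this could look like the following; the paths and topics are placeholders:

[[routes]]
        path = "/webhook/twilio"
        topic = "sms"
        # response omitted: defaults to 204/No-Content, which Twilio expects

[[routes]]
        path = "/webhook/mandrill"
        topic = "mail_events"
        # Mandrill expects 200/OK instead of the default 204
        response = 200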

JSON schema validation

JSON schema is a commonly used format for validating JSON documents. Checking incoming requests against a specified JSON schema would allow consumers of Leia-produced data to skip certain validation steps, and would allow producers to get proper error messages when their data format is invalid.

Log all incoming requests

We should log all incoming calls in order to more easily follow what happens in Leia. We should also log which topic each incoming request is written to.

Connection hangs when awaiting response from Leia

Several possible causes (each can be responsible for this behavior alone or in combination):

  • Status codes, especially No-Content (204) vs OK (200)
  • Header Content-Length (or rather the absence of it) ❗️
  • Header transfer-encoding: chunked - should we really use this? ❗️
  • Google load balancer (error cannot be reproduced without it, but this may still not be the actual problem) ❗️


Add integration test

This test should run everything with docker-compose and Kafka, and try a number of requests, ensuring that they all return 200.

Configuration read only from one file

Leia should be able to read from a directory of configuration files, but when a second file is added it does not affect the app.

It looks like the issue is with computeCurrentState() in TomlRegistry, which does not merge the configurations but returns only the last one.

java.util.concurrent.CompletionException: org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.

08:44:59.625 [nettyCallPool-4-1] ERROR ktor.application - Unhandled: POST - /xxxxxxx
java.util.concurrent.CompletionException: org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
	at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
	at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at franz.producer.kafka_one.FutureCallback.onCompletion(Async.kt:19)
	at org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:201)
	at org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:185)
	at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:599)
	at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:575)
	at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:539)
	at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:474)
	at org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:75)
	at org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:660)
	at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101)
	at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:454)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:446)
	at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:224)
	at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:162)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.

This happened in production, resulting in an event not being recorded to Kafka.
