Comments (10)

jeffhollan commented on July 3, 2024

And to clarify, the behavior of Event Hubs is:

  1. Retrieve in batch (regardless of whether it is a single-message function or a batch-message function)
  2a. If they have written their function to process individual messages:
      • Do a Task.WhenAll() and invoke a function in parallel for each message in the batch. So if you got a batch of 30 messages, you would have 30 function invocation tasks kicked off and potentially processing in parallel.
  2b. If they have written their function to process a batch (recommended and default):
      • Pass the entire batch into a single execution
  3. Wait for the batch to process (either all the concurrent executions or the single batch execution)
  4. Checkpoint and retrieve the next batch
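The steps above could be sketched roughly like this. This is only an illustration in Python, not the Event Hubs implementation: `asyncio.gather` stands in for `Task.WhenAll()`, and `dispatch_batch`, `run`, and the `checkpoints` list are hypothetical names made up for the example.

```python
import asyncio


async def dispatch_batch(batch, handler, batch_mode):
    """Run one retrieved batch through the user function and wait for all of it."""
    if batch_mode:
        # 2b: a single invocation receives the whole batch
        await handler(batch)
        return 1
    # 2a: one invocation per message, all awaited together
    await asyncio.gather(*(handler(message) for message in batch))
    return len(batch)


async def run(batches, handler, batch_mode, checkpoints):
    """Hypothetical listener loop; `checkpoints` is a plain list used as a stand-in store."""
    for batch in batches:                                  # 1: always retrieve in batches
        await dispatch_batch(batch, handler, batch_mode)   # 3: wait for the batch to finish
        checkpoints.append(batch[-1])                      # 4: checkpoint, then next batch


async def demo():
    seen = []

    async def per_message(message):
        seen.append(message)

    checkpoints = []
    await run([[1, 2, 3], [4, 5]], per_message, False, checkpoints)
    return seen, checkpoints


seen, checkpoints = asyncio.run(demo())
```

Note that the checkpoint is written only after every invocation for the batch has completed, in either mode.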

from azure-functions-kafka-extension.

fbeltrao commented on July 3, 2024

Unlike Event Hubs, Kafka provides checkpoint saving internally (offsets are saved in a compacted topic). The idea is to use this, but save depending on how the settings have been defined, right?

jeffhollan commented on July 3, 2024

I think this makes sense. The timing of the checkpoint is a bit important though. We want to have "at least once" delivery, so as long as the Kafka internal checkpointing allows us to "checkpoint" a batch after processing that should work great.
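To illustrate why the timing matters, here is a minimal sketch, assuming an in-memory stand-in for a partition (the `InMemoryPartition` class and `consume_once` function are hypothetical and not part of any Kafka client). Because the commit happens only after the handler returns, a failure leads to a replay rather than a lost batch.

```python
class InMemoryPartition:
    """Hypothetical stand-in for one Kafka partition plus its committed offset."""

    def __init__(self, events):
        self.events = list(events)
        self.committed = 0  # offset of the next event to deliver after a restart

    def poll(self, max_events):
        return self.events[self.committed:self.committed + max_events]

    def commit(self, count):
        self.committed += count


def consume_once(partition, handler, max_events=3):
    """Process one batch and commit only after the handler returns without raising."""
    batch = partition.poll(max_events)
    if not batch:
        return False
    handler(batch)                # may raise; in that case nothing is committed
    partition.commit(len(batch))  # reached only when processing succeeded
    return True


partition = InMemoryPartition(["a", "b", "c", "d"])
delivered = []
attempts = {"count": 0}


def flaky_handler(batch):
    attempts["count"] += 1
    delivered.extend(batch)
    if attempts["count"] == 1:
        raise RuntimeError("simulated failure after processing, before commit")


try:
    consume_once(partition, flaky_handler)
except RuntimeError:
    pass  # offset was not advanced, so the same batch is delivered again

while consume_once(partition, flaky_handler):
    pass
```

In this run the first batch is delivered twice, which is the duplicate that "at least once" allows; committing before processing would instead have skipped it.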

fbeltrao commented on July 3, 2024

Yes, we commit by topic/partition/offset. The current implementation does that after the batch is consumed, but I need to add more customisation and retries.

ryancrawcour commented on July 3, 2024

@fbeltrao does it checkpoint as soon as the batch is read from Kafka, before it is processed?

ryancrawcour commented on July 3, 2024

@fbeltrao let's chat about what "more customisation" we need in this first milestone. We can always come back and make it more customisable in future.

fbeltrao commented on July 3, 2024

My proposal:

  • For triggers with batch (KafkaEventData[]) we commit after successfully triggering
  • For single event triggers we commit every X times

ryancrawcour commented on July 3, 2024

Batch sounds good

For single, I am not sure what you mean by every X times.
If I read 1 event from Kafka and process it, do we not checkpoint then?
Does that mean that if another event fails somewhere later, this one that we've already processed successfully will be replayed?

fbeltrao commented on July 3, 2024

Correct. Now that I think about it, maybe it is better to do it the way you described, saving after each message, which gives the most accurate behaviour out of the box.

Performance is bad, but anything more than trivial should be using batch anyway.
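The trade-off can be shown with a small simulation. This is a sketch under stated assumptions, not the extension's code: `run_trigger` and `fail_on_bad` are hypothetical names, and a "commit" here is just a counter rather than a call to a Kafka client.

```python
def run_trigger(events, handler, batch_mode, batch_size=4):
    """Return (commits, committed_offset) for a hypothetical trigger loop.

    batch_mode=True  -> one commit after each successfully handled batch
    batch_mode=False -> one commit after each successfully handled event
    """
    committed = 0
    commits = 0
    try:
        while committed < len(events):
            batch = events[committed:committed + batch_size]
            if batch_mode:
                handler(batch)
                committed += len(batch)
                commits += 1
            else:
                for event in batch:
                    handler([event])
                    committed += 1
                    commits += 1
    except ValueError:
        pass  # stop at the first failure; everything from `committed` on is replayed
    return commits, committed


def fail_on_bad(batch):
    if "bad" in batch:
        raise ValueError("simulated processing failure")


events = ["e0", "e1", "e2", "bad", "e4", "e5"]
batch_result = run_trigger(events, fail_on_bad, batch_mode=True)
single_result = run_trigger(events, fail_on_bad, batch_mode=False)

clean = ["e%d" % i for i in range(8)]
clean_batch = run_trigger(clean, fail_on_bad, batch_mode=True)
clean_single = run_trigger(clean, fail_on_bad, batch_mode=False)
```

With the failing input, batch mode has committed nothing and replays `e0` to `e2`, while single mode has committed three events and resumes at the failing one. With the clean input, single mode issues one commit per event where batch mode issues one per batch, which is the performance cost mentioned above.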

ryancrawcour commented on July 3, 2024

Implemented as above
