When idempotent HTTP requests are correlated over the commands and replies Kafka topics, duplicate requests with the same idempotency-key header must be handled properly.
Typically, both the commands and replies topics are configured for time-based retention, so the correlated reply is usually already in the replies topic when the client replays the idempotent request.
Note: the time-based retention period for the replies topic effectively dictates the idempotency key expiration window.
Ideally, zilla would detect this scenario and avoid sending the replayed request to the commands topic, simply returning the already correlated response from the replies topic to the client (not currently implemented).
Even if such an enhancement were implemented, we must still consider the race condition where the idempotent request has already been sent to the commands topic but the reply is not yet present in the replies topic when the request is replayed. In that case zilla would not detect the correlated reply and would send the replayed request to the commands topic.
Note: even if zilla retained some in-memory awareness of previously sent idempotent requests, a peer or restarted zilla instance could receive the replayed request with no such historical awareness.
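The note above can be illustrated with a minimal in-memory sketch. It is not zilla code: the GatewayInstance class, its seen set, and the plain list standing in for the commands topic are all hypothetical names chosen for illustration.

```python
class GatewayInstance:
    """Hypothetical gateway that remembers forwarded keys in local memory only."""

    def __init__(self, commands: list):
        self.commands = commands  # shared stand-in for the commands topic
        self.seen = set()         # lost on restart, never shared with peers

    def handle(self, key: str) -> str:
        if key in self.seen:
            return "suppressed"   # this instance already forwarded the key
        self.seen.add(key)
        self.commands.append(key)
        return "forwarded"


commands = []
first = GatewayInstance(commands)
peer = GatewayInstance(commands)  # a peer or a freshly restarted instance

results = [
    first.handle("key-1"),  # original request
    first.handle("key-1"),  # replay reaches the same instance
    peer.handle("key-1"),   # replay reaches an instance with no history
]
```

In this sketch the same instance suppresses the replay, but the peer forwards it, so the commands list ends up with two entries for the same key.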
If the service observes multiple requests with the same idempotency key, it needs to retain knowledge of previously received idempotency keys to prevent duplicate processing.
When such a duplicate request is detected, it can be safely ignored, as long as the correlated response is guaranteed to still be present in the replies topic and has not already been cleaned up by the time-based retention policy.
So the service has to handle this race condition, where the replayed idempotent request arrives on the commands topic after the response on the replies topic has expired.
This implies that the service's historical awareness of previously observed idempotency keys would need to be synchronized perfectly with the time-based retention policy of the replies topic.
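To make the synchronization problem concrete, here is a small sketch of a service-side deduplication store whose key lifetime differs from the replies topic retention. The ServiceDedup class, the reply_retained helper, and the millisecond values are assumptions made for illustration. Real Kafka retention is enforced per log segment, so actual expiry is less exact than this model suggests.

```python
class ServiceDedup:
    """Hypothetical store that remembers idempotency keys for ttl_ms."""

    def __init__(self, ttl_ms: int):
        self.ttl_ms = ttl_ms
        self.first_seen = {}  # idempotency key -> time first processed (ms)

    def should_process(self, key: str, now_ms: int) -> bool:
        seen_at = self.first_seen.get(key)
        if seen_at is not None and now_ms - seen_at < self.ttl_ms:
            return False      # duplicate within the remembered window
        self.first_seen[key] = now_ms
        return True


def reply_retained(reply_ts_ms: int, now_ms: int, retention_ms: int) -> bool:
    """Simplified model of time-based retention on the replies topic."""
    return now_ms - reply_ts_ms < retention_ms


RETENTION_MS = 60_000            # assumed replies topic retention
dedup = ServiceDedup(ttl_ms=90_000)  # service remembers keys longer than that

processed_original = dedup.should_process("key-1", now_ms=0)
# The replay arrives after the reply expired but before the key is forgotten.
processed_replay = dedup.should_process("key-1", now_ms=70_000)
reply_available = reply_retained(reply_ts_ms=0, now_ms=70_000, retention_ms=RETENTION_MS)
```

Here the replay is ignored as a duplicate even though its reply is no longer retained, so the client would never receive a response. With the opposite mismatch (a key lifetime shorter than the retention period), the service would reprocess a request whose reply is still present.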
Rather than placing this edge-case requirement on the service, we can instead implement the proposed enhancement above, where zilla detects the already correlated response in the replies topic and avoids sending the replayed request to the commands topic.
This narrows the time window in which the service can receive a replayed request to roughly the request-response round-trip time, rather than the full time-based retention period of the replies topic. The service can then straightforwardly detect and ignore the replayed request, because the correlated response is still present in the replies topic.
In the http-kafka binding, we can change the sync and async implementations to first attempt to fetch the correlated response from the replies topic before producing the request to the commands topic, rather than always producing the request as we do now.
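The proposed fetch-before-produce flow could look roughly like the sketch below. It is an in-memory model, not the http-kafka binding implementation: InMemoryKafka, fetch_reply, produce_command, and handle_idempotent_request are hypothetical names. It also assumes replies can be looked up directly by idempotency key.

```python
from typing import Optional


class InMemoryKafka:
    """Hypothetical stand-in for the commands and replies topics."""

    def __init__(self):
        self.commands = []  # (idempotency key, request body) in arrival order
        self.replies = {}   # idempotency key -> reply payload

    def fetch_reply(self, key: str) -> Optional[str]:
        return self.replies.get(key)

    def produce_command(self, key: str, body: str) -> None:
        self.commands.append((key, body))


def handle_idempotent_request(kafka: InMemoryKafka, key: str, body: str) -> Optional[str]:
    # Step 1: look for an already correlated reply.
    reply = kafka.fetch_reply(key)
    if reply is not None:
        return reply  # replay answered without touching the commands topic
    # Step 2: no reply visible yet, so produce the command as today.
    kafka.produce_command(key, body)
    return None       # caller then awaits the reply (sync) or polls later (async)


kafka = InMemoryKafka()
first = handle_idempotent_request(kafka, "key-1", "create item")
kafka.replies["key-1"] = "201 Created"  # the service replies
second = handle_idempotent_request(kafka, "key-1", "create item")
```

In this model the replayed request is answered from the replies topic and the commands topic receives only one entry. A replay that arrives before the reply is written would still be produced, which is the narrow race the service has to tolerate.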