Giter VIP home page Giter VIP logo

Comments (16)

osfameron avatar osfameron commented on June 27, 2024

Thanks @ankurchavda, can you confirm which docs URL you're following, and which SDK/Server versions you're on, and we'll take a look!

from docs-sdk-python.

ankurchavda avatar ankurchavda commented on June 27, 2024

@osfameron The one on the official website - https://docs.couchbase.com/python-sdk/current/howtos/error-handling.html.

And the github code for this is

# This an example of error handling for idempotent operations (such as the full-doc op seen here).
def change_email(collection, # type: Collection
maxRetries # type: int
):
try:
result = collection.get("doc_id") # type: GetResult
if not result:
raise couchbase.exceptions.KeyNotFoundException()
else:
content = result.content
content["email"]="[email protected]"
collection.replace("doc_id", content);
except couchbase.exceptions.CouchbaseError as err:
# isRetryable will be true for transient errors, such as a CAS mismatch (indicating
# another agent concurrently modified the document), or a temporary failure (indicating
# the server is temporarily unavailable or overloaded). The operation may or may not
# have been written, but since it is idempotent we can simply retry it.
if err.is_retryable:
if maxRetries > 0:
logging.info("Retrying operation on retryable err " + err)
change_email(collection, maxRetries - 1);
else:
# Errors can be transient but still exceed our SLA.
logging.error("Too many attempts, aborting on err " + err)
raise
# If the err is not isRetryable, there is perhaps a more permanent or serious error,
# such as a network failure.
else:
logging.error("Aborting operation on err " + err);
raise
MAX_RETRIES=5
try:
change_email(collection, MAX_RETRIES)
except RuntimeError as err:
# What to do here is highly application dependent. Options could include:
# - Returning a "please try again later" error back to the end-user (if any)
# - Logging it for manual human review, and possible follow-up with the end-user (if any)
logging.error("Failed to change email")

from docs-sdk-python.

osfameron avatar osfameron commented on June 27, 2024

OK, that page definitely needs some update, thanks for spotting. I've opened https://issues.couchbase.com/browse/DOC-9281 to track this internally, and have added to this sprint.

In the meantime:

  • CouchbaseException is correct
  • is_transient is the original method, is_retryable was indeed added as an alias of this, and later removed

So it looks like you're good to go! Let me know if you have any other questions.

from docs-sdk-python.

ankurchavda avatar ankurchavda commented on June 27, 2024

Thanks for the quick action @osfameron. I am trying to add proper error handling to my code. Can you please check if the below code looks good?

def couchbase_connection(username, password, host, bucket='bucketname', max_retries=5, back_off=0.1):
    """
    Create couchbase connection

        Parameters:
            username: user id
            password: password
            host: connection string
            bucket (string): Name of the couchbase bucket
            max_retries (int): An integer
            back_off (int): An integer

        Returns:
            collection (couchbase.collections): A couchbase collection object

    """
    try:
        log.info("Creating couchbase connection.")
        # Setting timeout options for cluster object. kv_timeout is for key-value operations
        # like get() and other mutation commands. Simlarly query_timeout is for N1QL queries.
        options = ClusterTimeoutOptions(kv_timeout=timedelta(
            seconds=60), query_timeout=timedelta(seconds=120))

        cluster = Cluster('couchbase://' + host, ClusterOptions(
            PasswordAuthenticator(username, password), timeout_options=options))

        bucket = cluster.bucket('matrixDB')
        collection = bucket.default_collection()
        log.info("Connection created.")
        return collection

    # Transient errors are something that resolve automatically. is_retyable is provided
    # for such errors. So have added a retry mechanism if the error is retryable.
    except couchbase_exceptions.CouchbaseTransientException as err:
        log.error(f"Encountered a transient error: {err}")
        if max_retries > 0:
            log.info(f"Retries left: {max_retries}")
            log.info(f"Backing Off: {back_off} seconds")
            sleep(back_off)
            couchbase_connection(username, password, bucket,
                                 max_retries=max_retries-1, back_off=back_off*2)
        else:
            log.error(
                f"Too many attempts to create connection with couchbase: {err}")
            raise
    # catch generic couchbase exception
    except couchbase_exceptions.CouchbaseException as err:
        log.error(f"Unexpected error {err}")

So I have written the above function to create a couchbase connection. I catch transient errors and retry couchbase creation there. So should I also add is_transient check under CouchbaseException? Because sometimes connection creation fails with LCB_ERR_AUTHENTICATION_FAILURE (206) and that is not caught under CouchbaseTransientException.

from docs-sdk-python.

osfameron avatar osfameron commented on June 27, 2024

I'll speak to our Python SDK specialist soon hopefully and get them to look over it (which will also help me improve and update the documentation!)

from docs-sdk-python.

ankurchavda avatar ankurchavda commented on June 27, 2024

Sure that would be great, thank you!

from docs-sdk-python.

thejcfactor avatar thejcfactor commented on June 27, 2024

Hi @ankurchavda - The is_retryable and is_transient properties are not actually available within the 3.x SDK. The doc strings need to be updated in order to reflect what is actually available (I will look to create a ticket to track these changes accordingly). When it comes to retrying, definitely can retry on the CouchbaseTransientException. The authentication exception is tricky, since the connection process is relatively expensive, so that will depend how you want the application behave in terms of retrying or not (i.e. do you expect incorrect credentials fairly often, if so, probably not worth retrying and let the user know there was a problem connecting).

Something I would note with your current solution is that maybe you want to make a retry function that can behave as a decorator, that way you can reuse the retry mechanism for other operations. Also, the couchbase_connection() call in the exception path does not return anything. Having the connection wrapped via a decorator would also help to solve this problem. I will try to provide a sample of what I mean by tomorrow.

from docs-sdk-python.

ankurchavda avatar ankurchavda commented on June 27, 2024

Hi @thejcfactor - The Authentication exception came in despite the credentials being correct. Also, it was resolved on a retry, but I have to specifically catch via AuthenticationException and retry there. Also on this retry, another exception was thrown LCB_ERROR_UNKNOWN_HOST (1049) (attached the image below). And the issue is that none of them are caught under CouchbaseTransientException which is causing the program to fail. Is there is a more robust way to handle these kind of errors that can be resolved via retries?

Also, it would great if you can share the sample of what you are trying to suggest.

image

from docs-sdk-python.

thejcfactor avatar thejcfactor commented on June 27, 2024

Hmmm, there seems to be more going on in terms of the overall cluster setup if you are seeing intermittent connection issues. I would suggest trying out SDK doctor to help diagnose the connection.

Below is some sample code that demonstrates what I am talking about. You can apply the allow_retries decorator to any method that you would like to have retries. Also, you can also change the decorator signature to have more functionality (for instance pass in a dict that has exception types mapped to a function on how you want to handle the exception type...lots of possibilities :)).

import functools
import time
from datetime import timedelta
from typing import Optional, Tuple, Callable

from couchbase.cluster import Cluster, ClusterOptions, ClusterTimeoutOptions
from couchbase.auth import PasswordAuthenticator
from couchbase.exceptions import AuthenticationException, CouchbaseTransientException


def allow_retries(retry_limit=3,                # type: int
                  backoff=1.0,                  # type: float
                  exponential_backoff=False,    # type: bool
                  allowed_exceptions=None       # type: Optional[Tuple]
                  ) -> Callable:
    def handle_retries(func):
        @functools.wraps(func)
        def func_wrapper(*args, **kwargs):
            for i in range(retry_limit):
                try:
                    return func(*args, **kwargs)
                except Exception as ex:
                    if allowed_exceptions is None or not isinstance(ex, allowed_exceptions):
                        raise

                    if (retry_limit-1)-i == 0:
                        raise

                    delay = backoff
                    if exponential_backoff is True:
                        delay *= i
                    print(f"Retries left: {(retry_limit-1) - i}")
                    print(f"Backing Off: {delay} seconds")
                    time.sleep(delay)
        return func_wrapper
    return handle_retries


@allow_retries(retry_limit=5,
               backoff=0.5,
               allowed_exceptions=(CouchbaseTransientException, AuthenticationException))
def couchbase_connect(host,         # type: str
                      username,     # type: str
                      pw,           # type: str
                      bucket_name  # type: str
                      ) -> Tuple:
    options = ClusterTimeoutOptions(kv_timeout=timedelta(seconds=60),
                                    query_timeout=timedelta(seconds=120))

    cluster = Cluster(host, ClusterOptions(
        PasswordAuthenticator(username, pw), timeout_options=options))

    bucket = cluster.bucket(bucket_name)
    collection = bucket.default_collection()
    return cluster, bucket, collection


def run_sample_code():
    try:
        _, _, collection = couchbase_connect("couchbase://localhost",
                                             "Administrator", "password", "default")
        _ = collection.upsert("testKey", {"info": "is a test doc"})
        res = collection.get("testKey")
        print(f"Got: {res.content_as[dict]}")

    except Exception as ex:
        import traceback
        traceback.print_exc()


if __name__ == '__main__':
    run_sample_code()

from docs-sdk-python.

ankurchavda avatar ankurchavda commented on June 27, 2024

Hi @thejcfactor, Sadly can't use SDK Doctor for the time being. But the code that you shared is very helpful, have added this decorator in my code for retrying after exceptions. Will let you know how the retry robustness changes after using this but the code definitely has become cleaner. Thanks.

from docs-sdk-python.

ankurchavda avatar ankurchavda commented on June 27, 2024

Hi @thejcfactor - I used the base exception CouchbaseException for couchbase_connect function -

@allow_retries(backoff=0.5, exponential_backoff=True, allowed_exceptions=(CouchbaseException))

The allow_retries did not seem to catch AuthenticationException, TimeoutException.

I was expecting that since CouchbaseException is the base class, if allowed_exceptions is None or not isinstance(err, allowed_exceptions): would consider AuthenticationException an instance of CouchbaseException type. My understanding might be incorrect and sorry for digressing and asking a python related question. But will I have to catch and retry the exact exceptions using this decorator?

from docs-sdk-python.

thejcfactor avatar thejcfactor commented on June 27, 2024

Hi @ankurchavda - I apologize for the delayed response (something is up w/ my notifications). I imagine you have figured out how to alter the helper method already. Just to close the loop, you would either need to pass in the specific exceptions if you are wanting to allow when using isinstance(), otherwise you could use issubclass(). I would lean on the side of being more explicit and passing in a tuple of all the errors you want to allow. IMHO, that would allow the code to be more clear.

from docs-sdk-python.

ankurchavda avatar ankurchavda commented on June 27, 2024

Hi @thejcfactor, no problem at all. Yes, I did figure it out eventually. Just as you suggested, I passed the errors explicitly that I wanted to handle. The program is working fine now. Thanks for all the help :D.

Btw any updates on when the official documentation is going to be updated?

from docs-sdk-python.

thejcfactor avatar thejcfactor commented on June 27, 2024

Hi @ankurchavda - Good to hear!

I am planning to make updates to the docs within the coming weeks (prior to the US holidays).

from docs-sdk-python.

ankurchavda avatar ankurchavda commented on June 27, 2024

Hi @thejcfactor ,

Hope you are having a nice time :D

Apologies for asking a different question under this issue, I don't see an option of opening an issue in the Python SDK repo.

I wanted to understand a little more about the query_timeout option.

I set the option as 120 seconds in the ClusterTimeoutOptions(query_timeout=timedelta(seconds=120)) as a default option. However, I change it in the QueryOptions(timeout=timedelta(seconds=240)) if my query is going to take longer than 120. What I see happening is, it still takes 120 seconds and times out after it. I checked the API code and this is what is mentioned -

Timeouts may also be adjusted on a per-query basis by setting the
:attr:timeout property in the options to the n1ql_query method.
The effective timeout is either the per-query timeout or the global timeout,
whichever is lower.

Why the whichever is lower? Shouldn't the preference ideally be given to the lowest level of setting i.e. timeout settings in that query options? In this way, I might never be able to specify different timeouts for different queries if it exceeds the global timeout.

Also, I set QueryOptions(timeout=timedelta(seconds=120)) and ClusterTimeoutOptions(query_timeout=timedelta(seconds=240)). The query executed taking 240 as the timeout value. Meanwhile the API says it will take lower of the two. Then shouldn't it have taken 120 seconds as the timeout value and timed out? I see that the preference is just being given to the global settings in all scenarios.

I am using couchbase==3.1.2

Also, do let me know if there is a better place to raise this issue, will be happy to do it :D

from docs-sdk-python.

osfameron avatar osfameron commented on June 27, 2024

Initial error is closed and merged, should appear on our docs site at next build.

@ankurchavda sorry just noted your last question - is that still an open issue?
I'm going to close this ticket as the original problem is resolved.

If you still have an issue, you could open a new ticket, or as this looks like perhaps more of a discussion point, then https://forums.couchbase.com/ might be appropriate.

from docs-sdk-python.

Related Issues (5)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.