Comments (7)

TheCloudlessSky commented on July 18, 2024

Hi @lomaxx,

  1. What's your environment like? Is it only on your development machine? AWS? Where is Redis located? Do you have other Redis timeout problems?
  2. Do you have many concurrent requests for the same objects that would cause locking? How many objects are being locked when you're getting this exception? What is the kind of operation that is causing this exception? Is it repeatable?
  3. Acquiring locks calls redis.StringSet() when the lock is not already taken. So, it's not doing anything complicated that would directly cause lock timeouts.
  4. Acquiring a lock is retried based on the AcquireLockRetryStrategy property of the RedisCacheProviderOptions. The default retry strategy (ExponentialBackoffWithJitterAcquireLockRetryStrategy) will retry until the RedisCacheConfiguration.AcquireLockTimeout is reached. Therefore, by default, if we can't acquire a lock, a TimeoutException is thrown (what you're seeing). You can override the retry strategy (e.g. to disable retries), and then handle the RedisCacheProviderOptions.LockFailed event so that it doesn't throw the TimeoutException (it could throw a more informative LockNotAcquiredException).
  5. The LockValueFactory property on RedisCacheProviderOptions allows you to control the value that is stored for each lock. Normally, this value is unimportant (that's why by default we just use a Guid - GuidLockValueFactory). But, in production to help with debugging, it's incredibly useful to override this. For example, you could override it to return a combination of the machine name/ip/process id/thread id. So, if your lock does fail, you could use the LockFailed event to get the current value of the LockKey from Redis. This allows you to figure out who has the current lock (and go down that rabbit hole of why it has the lock held).
  6. This library does not control when Lock() and Unlock() are called. This is entirely dependent on your mappings, the queries used, and how NHibernate uses the ICache. Maybe something is locking but failing to unlock properly? As a precaution to prevent locks from lasting forever, locks are held in Redis only until RedisCacheConfiguration.LockTimeout.
  7. To minimize lock contention (which causes the timeouts), figure out the access patterns on the cache and see why locks are being held for so long. Ideally, NHibernate should lock, get/put, and unlock relatively quickly (but it really depends on the SQL you're executing).
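
To make points 3 to 5 concrete, here is a minimal sketch in Python (not the library's C# code) of the acquire loop: a set-if-absent write, retried with capped exponential backoff and jitter until a timeout, using a lock value that names the holder. `FakeRedis`, `acquire_lock`, `make_lock_value`, `LockNotAcquiredError` and the delay parameters are hypothetical names chosen for illustration. The in-memory store only stands in for Redis and, unlike the real locks, never expires entries.

```python
import os
import random
import socket
import threading
import time


class FakeRedis:
    """In-memory stand-in for Redis (assumption: just enough for this sketch)."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def set_if_absent(self, key, value):
        # Mirrors "SET key value NX": succeeds only when the key is not set.
        with self._mutex:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def get(self, key):
        with self._mutex:
            return self._data.get(key)

    def delete(self, key):
        with self._mutex:
            self._data.pop(key, None)


class LockNotAcquiredError(Exception):
    """Hypothetical error raised when the lock is still held at the deadline."""


def make_lock_value():
    # Identify the holder so a failed acquire can report who owns the lock.
    return f"{socket.gethostname()}/pid={os.getpid()}/thread={threading.get_ident()}"


def acquire_lock(redis, key, timeout_s, base_delay_s=0.01, max_delay_s=0.2):
    value = make_lock_value()
    deadline = time.monotonic() + timeout_s
    attempt = 0
    while True:
        if redis.set_if_absent(key, value):
            return value
        if time.monotonic() >= deadline:
            holder = redis.get(key)  # read the current lock value for diagnostics
            raise LockNotAcquiredError(
                f"lock {key!r} still held by {holder!r} after {timeout_s}s"
            )
        # Exponential backoff, capped at max_delay_s, with full jitter.
        cap = min(max_delay_s, base_delay_s * 2 ** min(attempt, 16))
        time.sleep(random.uniform(0, cap))
        attempt += 1
```

Passing `timeout_s=0` tries once and fails immediately, which corresponds to disabling retries; the error message then carries the holder's machine, process and thread.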

Hope this helps!

from nhibernate.caches.redis.

lomaxx commented on July 18, 2024

Hi @TheCloudlessSky,

Really appreciate you taking the time to respond.

  1. We're currently using an ElastiCache Redis instance on AWS. It's located in the same region as our production instance.
  2. We do have a lot of concurrent requests for the same object, particularly in the QueryCache.

Based on reviewing the monitor logs, we actually think the QueryCache is causing the problem (more info here: http://puredanger.github.io/tech.puredanger.com/2009/07/10/hibernate-query-cache/). We've disabled the QueryCache and we're currently only using the 2nd level cache. We'll have to wait until tomorrow, when load increases, to see the effects. I'll let you know how we go.

Thanks again for responding - if disabling the QueryCache doesn't help, we'll look at the LockValueFactory and see if we can work out if there's anything else happening in the cache that could be causing the locking.

TheCloudlessSky commented on July 18, 2024

Thanks for the info.

I'd recommend that you leave the QueryCache off (only individual entities should be cached). From another issue:

If you're using the NHibernate query cache, be aware that there are broad cases where an update will cause NHibernate to call ICache.Clear(). This, again, depends on how you're using NHibernate, but I've found that in situations like this, the NHibernate query cache has no real benefit, so we leave it off completely. We then move the caching "up" (e.g. cache rendered HTML or JSON rather than actual DB entities).
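
As a rough illustration of moving the caching "up", the sketch below caches the rendered JSON for a request key with a time-to-live instead of caching entities. It is written in Python purely to show the idea; `RenderedResponseCache`, `get_or_render` and the `load` callback are hypothetical names, and a real application would more likely keep the rendered output in a shared store than in a process-local dictionary.

```python
import json
import time


class RenderedResponseCache:
    """Caches rendered JSON strings per key for a fixed time-to-live."""

    def __init__(self, ttl_s, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries = {}  # key -> (expires_at, rendered_json)

    def get_or_render(self, key, load):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no database work at all
        # Miss or expired: run the (possibly expensive) query once and render it.
        rendered = json.dumps(load(), sort_keys=True)
        self._entries[key] = (now + self._ttl_s, rendered)
        return rendered
```

With this shape there is no per-entity locking on reads; the trade-off is that a response can be stale for up to `ttl_s` seconds.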

I've had ~10 years of experience with NHibernate and I've pretty much come to the conclusion that it's pretty terrible for performance in general. You're much better off doing raw SQL with correctly tuned queries/indexes to get the best performance (caching in NHibernate introduces a huge overhead). What database are you using? Most modern databases have the ability to cache queries so that they're super fast.

Anyways, if that's not an option for you (or you think QueryCache really helps your performance problems), the problem is most likely the concurrent operations that are attempting to lock on the cache key. You'll probably want to tune how the retries work (e.g. disabling them and failing immediately). You can also look into different second-level caching strategies such as ReadWrite and NonStrictReadWrite. This will help control whether or not a Lock() needs to happen depending on your exact scenario.

lomaxx commented on July 18, 2024

Interesting that you mention

Most modern databases have the ability to cache queries so that they're super fast.

as this is pretty much the conclusion we've come to. We're using MSSQL 2014, so that should assist. I'll also change the caching strategy for some of our entities to NonStrictReadWrite, as they rarely change, so there's no real need for them to be ReadWrite.

Finally, do you know under what circumstances ICache.Clear() would be called? I noticed in the Redis monitor that there were a number of consecutive deletes, which happened a couple of times within a minute. That seemed odd.

Thanks again for your detailed response, it's definitely given us some insights!

TheCloudlessSky commented on July 18, 2024

I'm aware of a few scenarios where ICache.Clear() can be called:

...it's not entirely obvious where it can be called, but if you do a "find all references" and work backwards you'll see at least the following places:

  • Executing certain HQL statements (ones that may delete/update) can clear all cache regions (or at least try to clear the affected cache regions):
    • See: AbstractStatementExecutor.CoordinateSharedCacheCleanup() (used by MultiTableDeleteExecutor, MultiTableUpdateExecutor and BasicExecutor)
  • Executing a native SQL query (e.g. session.CreateSQLQuery()):
    • See: NativeSQLQueryPlan.CoordinateSharedCacheCleanup()
  • Manually evicting:
    • SessionFactoryImpl.Evict(Type persistentClass)
    • SessionFactoryImpl.EvictEntity(string entityName)
    • SessionFactoryImpl.EvictCollection(string roleName, object id)
    • SessionFactoryImpl.EvictQueries()
    • SessionFactoryImpl.EvictQueries(string cacheRegion)

lomaxx commented on July 18, 2024

TheCloudlessSky commented on July 18, 2024

@lomaxx That's the conclusion I've come to as well. Great to hear you at least got things settled. I'm going to close this, but feel free to re-open if you're noticing this problem for other reasons.
