Comments (7)

BeryJu avatar BeryJu commented on May 20, 2024

The initial sync is definitely slower than subsequent syncs. However, you can speed it up by scaling up the number of workers you are running, as the sync gets parallelised across them.

from authentik.

septatrix avatar septatrix commented on May 20, 2024

This was not the first sync, though for some reason Authentik still displays "Not synced yet." on the source overview. Could that somehow be the culprit?

Regarding multiple workers, there is #6929 (comment), where you mentioned #6815, which is still open, so my assumption is that it does not yet work as expected?

BeryJu avatar BeryJu commented on May 20, 2024

There is a workaround for that, which I still need to add to that issue. You can run worker containers with -b so that they do not run the scheduled tasks (1 worker with no args set and N workers with -b set), and with that you won't run into those issues.

Additionally, with f728bbb#diff-17304f637c355091282495601690ebf8e379affb23f8d3afe43f0ff230d1318bR194 there is now a lock on LDAP syncs, so that while one sync is running, other syncs can't start.
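
To illustrate the idea of such a lock, here is a minimal sketch, not the code from the commit above. It assumes a single process and uses `threading.Lock` as a stand-in; a lock shared between worker containers would need an external store. The names `run_sync` and `_sync_lock` are hypothetical.

```python
import threading

# Hypothetical in-process stand-in for a shared lock. With several worker
# containers the lock would have to live somewhere all of them can see.
_sync_lock = threading.Lock()


def run_sync(source_name, sync_fn):
    """Run sync_fn unless another sync currently holds the lock."""
    # Non-blocking acquire: return immediately if a sync is already running.
    if not _sync_lock.acquire(blocking=False):
        return f"{source_name}: skipped, sync already running"
    try:
        sync_fn()
        return f"{source_name}: synced"
    finally:
        # Always release, even if the sync raised an exception.
        _sync_lock.release()
```

A second call made while the first is still running returns the "skipped" message instead of starting a parallel sync.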

septatrix avatar septatrix commented on May 20, 2024

Still, it feels like almost a second per user is a really long time. I wonder what could lead to this slowdown.

> There is a workaround for that, which I still need to add to that issue. You can run worker containers with -b so that they do not run the scheduled tasks (1 worker with no args set and N workers with -b set), and with that you won't run into those issues.

The LDAP server would have to support pagination for that to work, right? Sadly, our LDAP server does not support pagination.

BeryJu avatar BeryJu commented on May 20, 2024

Ah yes, the scaling for LDAP sync does require pagination (which also explains why you're only seeing 1 task instead of 1 task for each 100 or so objects).

Part of the reason for the slow speed is that when authentik can't distribute pages over different workers, all users/groups are iterated through serially: property mapping values are computed, then authentik tries to create/update the user, and lastly there are some checks for vendor-specific quirks.
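
As a rough sketch of why pagination matters here: with pages, the objects can be split into chunks and each chunk handled as its own task. The code below is illustrative only and is not authentik's implementation. The page size of 100 follows the "100 or so objects" figure above, threads stand in for separate workers, and `paginate`, `sync_page` and `sync_all` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 100  # assumed, following "1 task for each 100 or so objects"


def paginate(objects, page_size=PAGE_SIZE):
    """Split a list of directory objects into fixed-size pages."""
    return [objects[i:i + page_size] for i in range(0, len(objects), page_size)]


def sync_page(page):
    """Hypothetical per-page task: process each object, return how many were handled."""
    # A real task would compute property mappings and create/update each user.
    return len(page)


def sync_all(objects, workers=4):
    """Hand each page to a pool; threads stand in for separate workers."""
    pages = paginate(objects)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sync_page, pages))
```

Without pagination there is effectively a single page containing every object, so only one task exists and extra workers have nothing to pick up.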

septatrix avatar septatrix commented on May 20, 2024

> Ah yes, the scaling for LDAP sync does require pagination (which also explains why you're only seeing 1 task instead of 1 task for each 100 or so objects)

Would support for paging alone already make a difference? Can a single worker already handle several tasks concurrently?

> Part of the reason for the speed is that when authentik can't distribute pages over different workers, all users/groups are iterated through serially, property mapping values are computed, then authentik tries to create/update the user and lastly there are some checks for vendor-specific quirks

I already took a look at the code but could not find any reason why it would be that slow. Retrieving the users is rather quick; even without pagination, our server returns all users within a few seconds. And the property mappings as well as the FreeIPA/AD check seem to be only simple attribute checks on a dictionary, so nothing which Python wouldn't be able to handle thousands of in a second.
My guess would have been that there are some hidden database calls which are duplicated for every user even though they might be reusable, or that the expression evaluation is god-awful slow and has to be parsed/compiled etc. anew for each user. When I have some spare time on my hands, I will try the same setup but with basically all mappings removed, and when I have even more spare time, I might try to set up pg_stat_statements to check the former. Regardless, I would find it hard to believe that there is no avoidable performance issue in the current code.

BeryJu avatar BeryJu commented on May 20, 2024

Yeah, that is true. Looking through the code, fetching the property mappings is not cached and is done for each object. Similarly, compiling the Python in the expression is also not cached.
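
A minimal sketch of what caching the compile step could look like, assuming a mapping can be treated as a single Python expression evaluated against an `ldap` dictionary. This is a simplification for illustration, not authentik's evaluator, and `compile_expression` and `evaluate_mapping` are hypothetical names.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def compile_expression(source):
    """Compile an expression once; repeated calls with the same source hit the cache."""
    return compile(source, "<property-mapping>", "eval")


def evaluate_mapping(source, ldap_attrs):
    """Evaluate the cached, compiled expression against one object's attributes."""
    code = compile_expression(source)
    # Builtins are emptied only to keep the example small; this is not a sandbox.
    return eval(code, {"__builtins__": {}}, {"ldap": ldap_attrs})
```

With this shape, syncing N objects through the same mapping compiles the expression once rather than N times. The same idea would apply to fetching the mappings themselves once per sync instead of once per object.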
