
Comments (8)

Samreay avatar Samreay commented on May 31, 2024

I think keeping it simple would be best, to try and get something out. A simple maximum number of flows that an agent can launch and have active at once seems like the easiest path forward.
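
To make the idea concrete, here is a minimal sketch of what such a cap could look like. The `LimitedAgent` class and its `max_active_flows` setting are hypothetical names made up for illustration; they are not part of Prefect's API.

```python
from dataclasses import dataclass, field


@dataclass
class LimitedAgent:
    """Hypothetical agent that caps how many flow runs are active at once."""

    # Hypothetical setting, not an existing Prefect option.
    max_active_flows: int
    active: set = field(default_factory=set)

    def can_launch(self) -> bool:
        # There is room as long as fewer runs are active than the cap allows.
        return len(self.active) < self.max_active_flows

    def launch(self, flow_run_id: str) -> bool:
        """Start tracking a flow run; return False if the cap is reached."""
        if not self.can_launch():
            return False
        self.active.add(flow_run_id)
        return True

    def finish(self, flow_run_id: str) -> None:
        """Free the slot held by a completed flow run."""
        self.active.discard(flow_run_id)
```

An agent built this way would simply skip (and later retry) any flow run for which `launch` returns `False`.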

from server.

Samreay avatar Samreay commented on May 31, 2024

This sounds like a great lightweight way of adding a really useful feature. My agent runs as a service in the same swarm as the Prefect server itself, so being able to limit a flow to having N active flows at once would make it so much easier to keep the swarm functioning and healthy.


jacksund avatar jacksund commented on May 31, 2024

Thanks for the follow-up!

To address load-balancing across agents, it may be useful to make comparisons between a DaskWorker and a PrefectAgent. I like to think of Prefect+Dask as having two levels of executors: PrefectAgent is a "flow.run() executor" and DaskWorker is a "task.run() executor". Much like how DaskWorker has a ThreadPool that dictates the number of tasks, a PrefectAgent can have a ThreadPool or FlowRunnerPool for the number of flows.

Going off that, threadpool_size (which would be the same as the flow_limit) could be detected automatically, rather than being an input supplied via AgentConfig. Currently, Prefect Agents assume a limitless ThreadPool and limitless resources, which is why we run into trouble.
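
As a rough sketch of that analogy, a pool of flow runners could wrap a standard `ThreadPoolExecutor`, with the pool size acting as the flow limit. The `FlowRunnerPool` class, the `flow_limit` parameter, and the one-slot-per-core default are all assumptions for illustration, not anything Prefect provides.

```python
import os
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Optional


def default_flow_limit() -> int:
    # Assumption for illustration only: one slot per CPU core.
    return os.cpu_count() or 1


class FlowRunnerPool:
    """Hypothetical pool that runs at most `flow_limit` flows at a time."""

    def __init__(self, flow_limit: Optional[int] = None):
        # Fall back to the auto-detected value when no limit is given.
        self.flow_limit = flow_limit or default_flow_limit()
        self._executor = ThreadPoolExecutor(max_workers=self.flow_limit)

    def submit(self, run_flow: Callable, *args) -> Future:
        # Extra submissions queue up until a worker thread is free.
        return self._executor.submit(run_flow, *args)

    def shutdown(self) -> None:
        self._executor.shutdown(wait=True)
```

With this shape, submitting more flows than `flow_limit` does not fail; the extra flows wait in the executor's queue until a slot frees up.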


Samreay avatar Samreay commented on May 31, 2024

Even if we separate out Dask and the Prefect agent, there would still be the issue of the flow process itself. For a Docker agent, even with a remote Dask executor, you should still get one container per flow, and those are computationally expensive and memory intensive.


jacksund avatar jacksund commented on May 31, 2024

@Samreay So for DockerAgents, we'd have something like an InstancePool instead of a ThreadPool. The size of InstancePool is based off of local resources. Is that right?


zanieb avatar zanieb commented on May 31, 2024

@jacksund The 'Pool' approach is a nice idea, but it would require us to invent a pool type for every agent we introduce, which may not make sense. It also means we have to implement a way for a 'Pool' to persistently track used resources across restarts. For example, if you're using a DockerAgent, is the ContainerPool the total count of containers on the system? How do we know which containers belong to that agent? What's a reasonable number of containers to run? Prefect currently isn't (and likely won't be) a resource manager; there are many tools that solve that problem better than we can. I do not think we should be attempting to infer values for this concurrency limit from system resources. I don't see an easy mapping; if you have a machine with 8 cores you may still want to run more than 8 concurrent flows.


jacksund avatar jacksund commented on May 31, 2024

@madkinsz Yeah, these are some really good points. I guess I was only thinking in the context of LocalAgent, where local resources make a good measure of flow_run_limit. To your point about 8 cores != 8 concurrent flow limit, I agree. But testing Prefect may give us empirical data that shows something like "Agents on 8 core machines with 16 GB RAM are typically only stable up to 32 concurrent flows" - or something like that, to set a default value and allow the user to override it if desired. If I were to really push this Pool idea, I could argue to add a default UnlimitedPool attached to the base Agent class -- so subclass agents aren't required to define a pool type.

But @Samreay is right. What I'm suggesting might be too much. The original comment of a simple AgentConfig kwarg is definitely the easiest. My suggestion is probably only relevant upon refactoring and major revisions (i.e. Prefect v1.0 or later).
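
For reference, the default-pool idea could be sketched roughly as below. Every name here (`UnlimitedPool`, `FixedPool`, `has_capacity`, `try_start_flow`, and this simplified `Agent`) is hypothetical and only illustrates the shape of the design; none of it is existing Prefect code.

```python
class UnlimitedPool:
    """Hypothetical pool that never refuses a flow run."""

    def has_capacity(self, active_count: int) -> bool:
        return True


class FixedPool:
    """Hypothetical pool with a fixed number of slots."""

    def __init__(self, size: int):
        self.size = size

    def has_capacity(self, active_count: int) -> bool:
        return active_count < self.size


class Agent:
    # Default pool: subclasses are not required to define their own.
    pool = UnlimitedPool()

    def __init__(self):
        self.active_count = 0

    def try_start_flow(self) -> bool:
        """Count a new flow run if the pool has room; otherwise refuse."""
        if not self.pool.has_capacity(self.active_count):
            return False
        self.active_count += 1
        return True


class LocalAgent(Agent):
    # A subclass opts in to a limit by overriding the class attribute.
    pool = FixedPool(size=2)
```

The base `Agent` keeps today's unlimited behavior, while a subclass such as `LocalAgent` only has to swap the `pool` attribute to get a limit.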


zanieb avatar zanieb commented on May 31, 2024

See work in PrefectHQ/prefect#4532

