Giter VIP home page Giter VIP logo

Comments (7)

ShuochengWang avatar ShuochengWang commented on July 30, 2024 1

Related Work

  • async-executor
    • global queue + local queues + steal. All tasks will be firstly pushed to global queue. When a thread tries to get a task, it will find tasks from its local queue. If local queue is empty, the thread will try to steal half of the tasks from the global queue. If the global queue is still empty, then the thread will try to steal tasks from other thread's local queue (select a random thread and try to steal half of the tasks from the local queue. If failed, try next thread.)
  • crossbeam-deque
    • Concurrent work-stealing deques. most commonly used in work-stealing schedulers.
    • Worker + Injector + Stealer. Each thread corresponds to a worker. And each worker has a corresponding stealer, this stealer can steal tasks from its worker. Injector is a queue shared by threads. Hence, worker is the local queue, injector is the global queue. get tasks from local worker's queue => get tasks from global injector's queue => get tasks from stealer (other worker's queue)
  • tokio
    • local queue + Inject + lifo_slot + Steal. Similar to crossbeam-deque, inject is global queue. 1. Tasks will be pushed to local queue. if local queue is full, push tasks to global queue. 2. Get tasks from the local queue first most of the time. 3. each thread has a lifo_slot, which with higher priority than local_queue. 3. If local queue is empty, steal tasks from other local queues first. if steal failed, steal from global queue.

from ngo.

ShuochengWang avatar ShuochengWang commented on July 30, 2024

The results of the discussion with @tatetian :

  1. idle vcpu (the thead of the async runtime) should sleep and wakeup later. When the vcpu's run_queue is empty, the vcpu need sleep some time, eg, sleep 50ms. We can use eventfd (better) or futex in occlum to implement this feature. However, when the run_queue is empty, there may be some pending tasks on this vcpu, other vcpu need to invoke wake(), insert the task to the run_queue, and wakeup the corresbonding vcpu.
  2. performance problem of io_uring callback. In current ngo implementation, we set a sched_callback (io_uring callback) before init async runtime, then each vcpu will invoke sched_callback (eg, poll io_uring completion queue) in each event loop. Actually, It is not necessary for all vcpu to poll io_uring queue. We can use only one vcpu, or several vcpu to poll the io_uring queue. We can deprecated sched_callback, and treat polling io_uring queue as a async task.
  3. [RFC] Config the number vCPUs #16 . Implement this RFC
  4. sleep function for user task. User task maybe want to sleep some time, eg, timer. Since async-rt is a no_std runtime, there is no time function in no_std env. We can implement sleep function in ngo libos.

from ngo.

ShuochengWang avatar ShuochengWang commented on July 30, 2024

Our Design Goals

A Rust async runtime with no_std, for LibOS, In particular, for SGX LibOS.

Take ngo as an example, ngo need rust async runtime with following features:

  • Idle thread sleep
  • Load balancing
  • Support priority

Idle thread sleep (Already supported)

If one thread is idle, this thread will sleep. When a new task is ready, the sleeping thread will be waked up.

Load balancing

  1. Assign tasks to each thread in a balanced manner
  2. Threads with fewer workload should execute some workload on other threads
  3. Try to avoid starvation
  4. Can not violate the cpu affinity.

Support priority

There are different kinds of tasks in LibOS, e.g. user tasks and LibOS tasks. User tasks generally have higher priority, while LibOS tasks maybe running in the background and with lower priority.

  1. Distinguish different kinds of tasks.
  2. Support priority in the scheduler
  3. Shutdown background tasks correctly. (background tasks may be infinite loop pattern, we need shutdown it correctly before exits)

from ngo.

ShuochengWang avatar ShuochengWang commented on July 30, 2024

Design Overview

Firstly, thanks to tokio's blog https://tokio.rs/blog/2019-10-scheduler, those ideas are very usefull.

Basic Idea

  • local queues: each thread has a local queue, in particular, a bounded mpmc / spmc queue.
  • global queue: a unbounded mpmc queue. Since the local queue is bounded (fixed size), the global queue is used to store tasks when the local queue is full. The implementation of global queue in tokio is a intrusive linked list guarded by mutex.
  • steal: if one thread's local queue is empty, it can steal tasks from other threads' local queue and global queue.

Task

Task priority

Task priority affects how to select tasks in the thread.

  • default (mid) : If not specify priority, use default. User tasks usually are default priority.
  • high
  • low
Task type

Task type affects how to assign task to threads.

  • default : if not specify type, use default. User tasks usually are default type.
  • blocking : a task that will block for a time. If one thread executes a blocking task, other tasks on this thread may starve. Hence, when accept a blocking task to one thread, we will move all tasks on this thread to global queue to avoid starvation.
  • background : e.g. LibOS tasks. These tasks may be keep running until runtime shutdown.

CPU affinity

In our async runtime, the thead is called vcpu. Since user can bind threads to specific cpu set in OS, we allow user to bind tasks to specific vcpu set.

If one task is bound to a vcpu set, then the task can only run in those vcpus. vcpu outside the set can not steal the task.

Run queue

Each thread has three local queues. All threads share one global queue.

  • high priority queue
  • mid priority queue
  • low priority queue
  • global queue

from ngo.

ShuochengWang avatar ShuochengWang commented on July 30, 2024

Assign tasks to threads

When accept one task, we need to decide which thread to run the task.

In the assign algorithm, we consider following factors:

  • If the task is new or not: If the task is not new (a wake up task), that is, the task has been scheduled before, we will try to schedule the task to the previous thread.
  • cpu affinity: we need meet the cpu affinity.
  • task type: If the type is blocking, we need try to find a thread that has relatively few tasks, and we need to move those tasks to other threads. And we need consider the cpu affinity.
  • local queue lengh: according to the task's priority, we find a thread with short corresponding local queue.
  • wait time: when each task is assigned, we record a assign_tick. And we record a global_tick. wait_time = global_tick - queue.front().assign_tick. we find a thread with short wait time.

from ngo.

ShuochengWang avatar ShuochengWang commented on July 30, 2024

Select one task to run

  1. high probability: select from high_priority queue.
  2. medium probability: select form mid_pri queue.
  3. low probability: select from low_pri queue.
  4. very low probability: select from global queue.
  • If first try failed (the queue is empty), try other local queues.
  • If still failed, try to steal.

Steal

  • select one random thread to steal half of tasks.
  • do not steal tasks with invalid cpu affinity.
  • If steal failed, try next thread. (try k times)
  • If steal still failed, try steal global queue.

from ngo.

ShuochengWang avatar ShuochengWang commented on July 30, 2024

Optimization: Temporarily increase priority

  • If a mid_pri default_type task is wake up, insert it to high_pri quue temporarily to reduce delay.

Optimization: Throttle stealing (tokio's opt)

  • Limit the max number of concurrent threads performing steal to reduce contention.

from ngo.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.