
Comments (3)

pcwalden commented on September 21, 2024

Thank you for the extensive set of answers. It will take me a while to process these as I am relatively new to asyncio.

I have one KP400, four HS100's, six HS200's and two HS210's. All plugs or strips and no bulbs.


rytilahti commented on September 21, 2024

Hi @pcwalden, and thanks for your interesting questions! The performance really piqued my interest, so I had to do some benchmarking... :-)

I thought the asyncio interface for Kasa would speed it up, but so far I do not see much improvement and in some cases it is slower.

How many devices are we talking about? And how fast are the updates already? How did you benchmark the performance?

Anyway, asyncio does something called cooperative concurrency, where a task can yield and let other tasks run while it is waiting for I/O (or for something else to finish). This is mostly useful when an operation takes some time to complete, so if the devices are already fast to respond, it doesn't really matter that much.
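
To illustrate, here's a minimal sketch using nothing but the standard library, where asyncio.sleep() stands in for waiting on a slow device; because the two waits overlap, the whole thing takes roughly one second instead of two:

import asyncio
import time


async def fake_request(delay):
    # asyncio.sleep() stands in for waiting on a device's response; while this
    # coroutine is suspended, the event loop is free to run other tasks.
    await asyncio.sleep(delay)


async def main():
    start = time.time()
    t1 = asyncio.create_task(fake_request(1))
    t2 = asyncio.create_task(fake_request(1))
    await t1
    await t2
    print(f"took {time.time() - start:.1f}s")  # ~1.0s, since the waits overlap


asyncio.run(main())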

The question is whether the asyncio.gather is really concurrent or am I mistaken?

Awaiting on asyncio.gather() waits for all coroutines to finish before returning, so even if some tasks finish early it still blocks until the slowest one is done. Doing updates with it can still be faster than sequential calls, especially if some devices are less responsive than others (sequentially, each access has to wait for the previous response). The total execution time doesn't change that much if you wait for all requests to finish anyway; I'll add a test script and some results from the devices I have at the end of this comment.
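
For example, a minimal sketch of doing the updates with gather() (the addresses are just placeholders):

import asyncio
from kasa import SmartPlug


async def update_all(devices):
    # gather() starts all updates at once but only returns after every
    # coroutine has finished, so the total time is roughly that of the
    # slowest device rather than the sum of them all.
    await asyncio.gather(*(dev.update() for dev in devices))


devices = [SmartPlug("192.168.1.10"), SmartPlug("192.168.1.11")]  # placeholder addresses
asyncio.run(update_all(devices))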

You usually use asyncio in an asynchronous context, which allows you to act on updates as soon as the results come in. Check out asyncio.as_completed() or add_done_callback() to handle results as they arrive. As you mentioned using ajax, you could simply signal your web app to update as each result comes in.
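
Here's a rough sketch of the as_completed() approach; the print is just a stand-in for whatever notifies your web app:

import asyncio
from kasa import SmartPlug


async def update_one(dev):
    await dev.update()
    return dev


async def report_as_ready(devices):
    # Handle each device as soon as its own update finishes instead of
    # waiting for the slowest one.
    tasks = [asyncio.ensure_future(update_one(dev)) for dev in devices]
    for fut in asyncio.as_completed(tasks):
        dev = await fut
        print(dev.alias, dev.is_on)  # push the fresh state to the web app here


devices = [SmartPlug("192.168.1.10")]  # placeholder address
asyncio.run(report_as_ready(devices))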

The question is how to reduce the multiple round trips to the device?

You can assume that the action has completed successfully if no exception is raised. Alternatively, you can check the return value of methods like turn_on, which contains the payload returned by the device.
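
Something along these lines; a sketch assuming SmartDeviceException is the exception you want to catch for communication errors:

from kasa import SmartPlug, SmartDeviceException


async def toggle_on(plug):
    try:
        result = await plug.turn_on()
        # No exception means the command went through; the returned payload is
        # the device's own response in case you want to inspect it.
        print("turn_on payload:", result)
    except SmartDeviceException as ex:
        print("turning on failed:", ex)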

Now, for the tests I did. Thanks for raising the question, it made me wonder why I also sometimes have trouble with my light bulb. For these tests I used 10 rounds of updates, both concurrently and sequentially. The test devices were two HS110s, a KL130 and a KP303.

Note that it's not uncommon for devices (or their chipsets) to go to some sort of sleep mode if they are not actively communicating, so if you really want to get more conclusive results, you should adjust the sleeps to be longer to simulate a more realistic use case.

These tests were done only with python-kasa, but the query itself is (mostly) the same as the one used by pyhs100. The only real difference is that python-kasa also queries emeter statistics for devices that support them.

=== Testing using gather on all devices ===
              took                                                                      
             count      mean       std       min       25%       50%       75%       max
type                                                                                    
concurrently  10.0  0.196667  0.068511  0.044919  0.154346  0.229221  0.237399  0.269855
sequential    10.0  0.272287  0.084113  0.136539  0.249556  0.250960  0.260461  0.454781

So based on this, executing the tasks "concurrently" yields better results on average; with a sample this small the exact numbers (and the median) don't mean that much, but gathering still seems to perform better.

I also did another test round to demonstrate what happens if you don't gather but simply handle the results as soon as they come in:

=== Testing per-device performance ===
                           took                                                                      
                          count      mean       std       min       25%       50%       75%       max
id                                                                                                   
139882552852288-KP303(UK)  10.0  0.022219  0.007838  0.011652  0.018958  0.020973  0.023152  0.041773
139882552994688-HS110(EU)  10.0  0.023627  0.008106  0.014368  0.018894  0.021746  0.025552  0.043705
139882553034304-KL130(EU)  10.0  0.190850  0.105943  0.042239  0.083322  0.246528  0.251389  0.336146
139882553158912-HS110(EU)  10.0  0.020193  0.009985  0.012975  0.016302  0.017394  0.018336  0.047909

So in my case, it looks like the KL130 is consistently an order of magnitude slower than the others.

Here's the test script I used; to try it out, simply fill in the devs list and let it run for a while:

import asyncio
import time
import pandas as pd
from kasa import SmartPlug, SmartBulb, SmartStrip


async def update(dev, lock=None):
    # When a per-device lock is given, serialize updates to that device
    # and space them out by two seconds to mimic a more realistic polling interval.
    if lock is not None:
        await lock.acquire()
        await asyncio.sleep(2)
    try:
        start_time = time.time()
        #print("%s >> Updating" % id(dev))
        await dev.update()
        #print("%s >> done in %s" % (id(dev), time.time() - start_time))
        return {"id": f"{id(dev)}-{dev.model}", "took": (time.time() - start_time)}
    finally:
        if lock is not None:
            lock.release()


async def update_concurrently(devs):
    # Start all updates at once and wait for the slowest one to finish.
    start_time = time.time()
    update_futures = [asyncio.ensure_future(update(dev)) for dev in devs]
    await asyncio.gather(*update_futures)
    return {"type": "concurrently", "took": (time.time() - start_time)}


async def update_sequentially(devs):
    # Update one device at a time, waiting for each response before the next.
    start_time = time.time()

    for dev in devs:
        await update(dev)

    return {"type": "sequential", "took": (time.time() - start_time)}


async def main():
    devs = [
        # SmartPlug("127.0.0.1"),
    ]

    data = []
    rounds = 10
    test_gathered = True
    
    if test_gathered:
        print("=== Testing using gather on all devices ===")
        for i in range(rounds):
            data.append(await update_concurrently(devs))
            await asyncio.sleep(2)


        await asyncio.sleep(5)

        for i in range(rounds):
            data.append(await update_sequentially(devs))
            await asyncio.sleep(2)


        df = pd.DataFrame(data)
        print(df.groupby("type").describe())
    
    
    print("=== Testing per-device performance ===")
    
    futs = []
    data = []
    # One asyncio.Lock per device so consecutive updates to the same device
    # do not overlap (update() also sleeps while holding the lock).
    locks = {dev: asyncio.Lock() for dev in devs}
    for i in range(rounds):
        for dev in devs:
            futs.append(asyncio.ensure_future(update(dev, locks[dev])))

    for fut in asyncio.as_completed(futs):
        res = await fut
        data.append(res)
        
    df = pd.DataFrame(data)
    print(df.groupby("id").describe())

if __name__ == "__main__":
    asyncio.run(main())


rytilahti commented on September 21, 2024

I think we can close this issue now. There have been various improvements to the I/O handling in this library, the most recent being keeping the connection open to avoid TCP setup & teardown (#213). Please give the 0.4.0 release a try; it will probably be much more performant if you are doing multiple requests to the device :-)

