Comments (19)

 avatar commented on June 22, 2024

EFLOW Performance Screen Capture - 1
EFLOW Performance Screen Capture - 2

from iotedge-eflow.

TerryWarwick avatar TerryWarwick commented on June 22, 2024

@dmaxwel,

Thank you for the very thorough report. We have a fix for the excessive CPU utilization of WSSDAgent and will make it available in our next servicing update. If all goes well this update will be available the week of July 26th.

Terry Warwick
Microsoft

 avatar commented on June 22, 2024

OK. I noticed, by the way, that everything seemed calm when only the SimulatedTemperatureSensor module was running. The trouble began when I added the Azure Stream Analytics module to do a simple reset of the temperature module: after 15 or 20 minutes the entire PowerShell window became unusable and telemetry stopped.

In the next week to 10 days until this update comes out, I will experiment with other types of modules to see if that makes a difference.

samuelbertrand avatar samuelbertrand commented on June 22, 2024

I had this exact issue on the public preview (before the GA). Increasing the virtual machine memory from 1 GB (the default) to 2 GB (or more) fixed it for me. The issue seems to be related to the virtual machine making excessive use of the Linux swap space on the virtual disk because too little memory is available. I was not able to reproduce the issue on the GA, though.

nealpeters86 avatar nealpeters86 commented on June 22, 2024

@TerryWarwick we are encountering the same issue. Is there a workaround available?

 avatar commented on June 22, 2024

@samuelbertrand your tip seems to have solved the problem of everything coming to a halt. I should have thought of that but didn't. Increasing the virtual machine memory also solved one additional problem: the Vmmem CPU utilization is very low now, and disk usage is nearly 0% after an hour of running with the original two modules (SimulatedTemperatureSensor and my Azure Stream Analytics temperature reset module).

Here are the steps I took to get to this point, taken from the "PowerShell functions for IoT Edge for Linux on Windows" web page:
https://docs.microsoft.com/en-us/azure/iot-edge/reference-iot-edge-for-linux-on-windows-functions

Stop-EflowVM
Set-EflowVM -memoryInMB 4096
Start-EflowVM

That redeployed the Linux virtual machine with 4GB of RAM allocated.

I don't know how to find out how much memory the virtual machine has while running, so I am not sure what it was before doing this. @samuelbertrand mentioned 1 GB (the default) above. Now it is set to 4 GB.
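
One possible way to check the memory from inside the Linux VM (for example after connecting with Connect-EflowVm) is to read /proc/meminfo. The sketch below is only an illustration: the helper name meminfo_mb is hypothetical and not part of EFLOW, and it assumes the standard /proc/meminfo layout where values are reported in kB.

```shell
#!/bin/sh
# Print one /proc/meminfo field converted from kB to MiB.
#   $1 = field name, e.g. MemTotal or MemAvailable
#   $2 = file to read (optional, defaults to /proc/meminfo)
# meminfo_mb is a hypothetical helper name used only for this example.
meminfo_mb() {
  awk -v key="$1:" '$1 == key { printf "%d\n", $2 / 1024 }' "${2:-/proc/meminfo}"
}

# Example usage inside the VM:
#   meminfo_mb MemTotal
#   meminfo_mb MemAvailable
```

The reported total is typically somewhat lower than the value passed to Set-EflowVM, since part of the memory is reserved before Linux accounts for it.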

After doing that, I also did the following:
sudo systemctl stop iotedge
sudo docker image list
sudo docker rmi "idofdockerimage"

Repeated the removal of docker images until all docker images were deleted.
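
The repeated manual removal could also be scripted. This is a minimal sketch, assuming it runs inside the EFLOW VM after the iotedge service has been stopped; the function name remove_all_images and the DOCKER override variable are hypothetical conveniences, not something from the thread. Note that docker rmi may refuse to delete images that are still referenced by existing containers.

```shell
#!/bin/sh
# Command used to talk to Docker. The default matches the thread's usage;
# the DOCKER variable is an assumption added so the loop can be dry-run.
DOCKER="${DOCKER:-sudo docker}"

# Remove every image currently known to the Docker engine.
remove_all_images() {
  # "image ls -q" prints only the image IDs, one per line.
  ids=$($DOCKER image ls -q)
  if [ -z "$ids" ]; then
    echo "no images to remove"
    return 0
  fi
  for id in $ids; do
    $DOCKER rmi "$id"
  done
}

# Example usage:
#   remove_all_images
```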

I then went into the Azure IoT Edge deployments and changed the priority of the desired deployment containing the Azure Stream Analytics module, so that when the iotedge service restarts it downloads the correct containers and the edgeAgent and edgeHub runtimes.

Then

sudo systemctl start iotedge

Then, in the Azure portal, I went to the IoT Hub, selected IoT Edge devices, clicked on this device, and then clicked "Troubleshoot". This has become a very valuable addition to the portal, and I appreciate it. I was able to see the logs of each module.

After an hour all is calm.

However, the WSSD Agent CPU utilization is still at 26%, so the upcoming release that reduces it will be appreciated.

ms-vincent avatar ms-vincent commented on June 22, 2024

@dmaxwe22 What version of EFLOW are you using?

 avatar commented on June 22, 2024

@ms-vincent can you tell me how to find the EFLOW version? I downloaded it around the 15th of July. Why do you ask?

 avatar commented on June 22, 2024

PS C:\WINDOWS\system32> Get-EflowVM | Format-List

VmConfiguration : @{ID=1b07306e9a1fc5c; name=DESKTOP-G5K33IS-EFLOW; properties=; tags=}
EdgeRuntimeVersion : @{IotEdgeVersion=1.1.3; MobyEngineVersion=19.03.15+azure; MobyCliVersion=19.03.15+azure}
EdgeRuntimeStatus : @{SystemCtlStatus=System.Object[]; ModuleList=System.Object[]}
SystemStatistics : @{TotalMemMb=3747; UsedMemMb=847; AvailableMemMb=2749; TotalStorageMb=9900; UsedStorageMb=1441; AvailableStorageMb=8041; CpuCount=4; KernelVersion=5.10.37.1-1.cm1 #1 SMP Fri Jun 4 11:14:43 UTC 2021}

fcabrera23 avatar fcabrera23 commented on June 22, 2024

Hi @dmaxwe22

We have released our latest EFLOW update that fixes the high CPU usage by WSSD Agent. Thank you for your detailed information about the high memory usage, and it's nice to know that you were able to solve it by assigning more memory to the EFLOW VM. For more information on our latest update, check 1.1.2107.0 Release Notes.

Thanks,
Francisco

nenright avatar nenright commented on June 22, 2024

Are there any other ways to validate that the update has been deployed? I'm seeing high CPU even after installing 1.1.2107.0. There is no deployment set and the only module running is edgeAgent. I just wanted to make sure I had the bits deployed correctly before opening a new issue:

Screen Shot 2021-09-07 at 11 31 49 AM
Screen Shot 2021-09-07 at 11 30 30 AM
Screen Shot 2021-09-07 at 11 28 02 AM

Not sure if it's related or not, but the edgeAgent logs look like this:

Screen Shot 2021-09-07 at 11 34 34 AM

fcabrera23 avatar fcabrera23 commented on June 22, 2024

Hi @nenright,

Can you run Get-EflowVM | Format-List in PowerShell so that we can understand your VM configuration?

Thanks,
Francisco

nenright avatar nenright commented on June 22, 2024

Does the version upgrade require a machine reboot? After rebooting the host, things appear to be back to normal; however, I don't know if it's something that will creep back over time.

Get-EflowVM | Format-List


VmConfiguration    : @{ID=13dfb745fe82122; name=PSFD-NENRIGHT-EFLOW; properties=; tags=}
EdgeRuntimeVersion : @{IotEdgeVersion=1.1.4; MobyEngineVersion=19.03.15+azure; MobyCliVersion=19.03.15+azure}
EdgeRuntimeStatus  : @{SystemCtlStatus=System.Object[]; ModuleList=System.Object[]}
SystemStatistics   : @{TotalMemMb=787; UsedMemMb=278; AvailableMemMb=450; TotalStorageMb=9900; UsedStorageMb=652;
                     AvailableStorageMb=8831; CpuCount=1; KernelVersion=5.10.42.1-3.cm1 #1 SMP Mon Jun 28 13:00:04 UTC
                     2021}

fcabrera23 avatar fcabrera23 commented on June 22, 2024

@nenright,

Thanks for your information. I'm seeing that you're using a 1GB VM - Can you assign 2GB to the VM using the Set-EflowVm command?

Also, could you please share information about your Windows version? You can run winver directly in Windows to get the full version information.

Thanks,
Francisco

nenright avatar nenright commented on June 22, 2024

I'm no longer experiencing the behavior on my test device after the reboot; it is currently working as expected even with 1 GB. The test device is on 21H1, OS Build 19043.1165.

We did see this in the field on a device configured with 2 cores and 4 GB. After a reboot of that device, we think it resolved itself as well. I may have another device in the field in the high CPU/memory state, but I'm not 100% sure (I currently don't have a good way to monitor the VM host remotely). If this is of interest to you, I can try to coordinate with the client to get some downtime to remote in and get some details.

If a reboot is needed after edge is updated with wus going forward, we'll just need to know that to coordinate.

nenright avatar nenright commented on June 22, 2024

@fcabrera23, I believe my test machine is back in the high CPU/memory state. I haven't done anything with it since my previous post. It's provisioned but no deployments are set. edgeAgent is the only running module. The memory and CPU load appear to get worse over time.

Windows Update indicates that KB5005565 was installed and there is a pending restart. I assume that if I restart, memory and CPU usage will return to normal. Let me know if there is anything I can collect for you while it is in this state.

Screen Shot 2021-09-20 at 8 23 41 AM

fcabrera23 avatar fcabrera23 commented on June 22, 2024

Hi @nenright,

Can you run Get-EflowVm | Format-List and share the output? We are seeing growth in the Docker log files that can lead to this situation.

Thanks,
Francisco

nenright avatar nenright commented on June 22, 2024

So, some more info: it looks like the antimalware scan was what was causing things to spike. After 20-ish minutes things calmed down, although every ~1 second the Hyper-V compute and wssdagent processes cycle through a ~3 second high-CPU period, so they go up and down. Maybe it is the metrics scrape (it's trying to scrape the metrics for edgeHub, but it isn't deployed)?

Here is the Get-EflowVm output.

PS C:\Users\nenright> Get-EflowVm | Format-List

VmConfiguration    : @{ID=13dfb745fe82122; name=PSFD-NENRIGHT-EFLOW; properties=; tags=}
EdgeRuntimeVersion : @{IotEdgeVersion=1.1.4; MobyEngineVersion=19.03.15+azure; MobyCliVersion=19.03.15+azure}
EdgeRuntimeStatus  : @{SystemCtlStatus=System.Object[]; ModuleList=System.Object[]}
SystemStatistics   : @{TotalMemMb=787; UsedMemMb=269; AvailableMemMb=453; TotalStorageMb=9900; UsedStorageMb=903; AvailableStorageMb=8579; CpuCount=1; KernelVersion=5.10.42.1-3.cm1 #1 SMP Mon Jun 28 13:00:04 UTC 2021}

fcabrera23 avatar fcabrera23 commented on June 22, 2024

Hi @nenright,

After more investigation, we didn't see any modules generating excessive entries in the journal logs, so we took another look at how journald is designed. It turns out that its memory usage is bounded by the maximum possible size of the active log file (the one currently being written to).

This value corresponds to the "SystemMaxFileSize" setting. The lower the value, the less working memory journald will take. By default, "SystemMaxFileSize" is 1/8 of the maximum total log size (1 GB in the EFLOW VM by default), so hypothetically journald will map at most 250 MB (125 MB for the active system journal + 125 MB for the active user journal) into working memory. For resource-constrained devices, this behavior may not be ideal.
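
The arithmetic in the paragraph above can be written out as a small sketch. It assumes the defaults described there (each active journal file capped at 1/8 of the total log size limit, with two active journals, system and user); the function name is hypothetical and the result is only a rough upper-bound estimate, not a measurement.

```shell
#!/bin/sh
# Rough upper bound (in MB) on journal data that journald may map into memory.
#   $1 = total journal size limit in MB (about 1 GB in the EFLOW VM per the thread)
# Assumes the default per-file cap of 1/8 of that limit and two active
# journal files (system + user). Hypothetical helper for illustration only.
journald_mapped_estimate_mb() {
  per_file_mb=$(( $1 / 8 ))
  echo $(( per_file_mb * 2 ))
}

# Example usage:
#   journald_mapped_estimate_mb 1000
```

With a 1000 MB limit this gives 125 MB per active file and 250 MB in total, matching the figures quoted above.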

Customers who do not want this behavior can edit "/etc/systemd/journald.conf" and add a "SystemMaxFileSize" setting to indirectly cap journald's memory usage.
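
As an illustration of that edit, the sketch below sets SystemMaxFileSize in a journald.conf-style file. The function name is hypothetical, the value 32M is only an example and not a recommendation from the thread, and the script assumes GNU sed (for -i) and that [Journal] is the only section in the file.

```shell
#!/bin/sh
# Set SystemMaxFileSize in a journald.conf-style file.
#   $1 = size value, e.g. 32M (illustrative value, not a recommendation)
#   $2 = config file (optional, defaults to /etc/systemd/journald.conf)
# Hypothetical helper; assumes GNU sed and a single [Journal] section.
set_system_max_file_size() {
  size="$1"
  conf="${2:-/etc/systemd/journald.conf}"
  if grep -q '^[#[:space:]]*SystemMaxFileSize=' "$conf"; then
    # Replace the existing (possibly commented-out) line in place.
    sed -i "s|^[#[:space:]]*SystemMaxFileSize=.*|SystemMaxFileSize=$size|" "$conf"
  else
    # No such line yet: append it at the end of the file.
    printf 'SystemMaxFileSize=%s\n' "$size" >> "$conf"
  fi
}

# Example usage (editing the real file requires root, e.g. a root shell via sudo):
#   set_system_max_file_size 32M
```

The change only takes effect once journald is restarted.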

For testing purposes, you can modify the file and then run the following, without rebooting the VM:

  1. sudo systemctl daemon-reload
  2. sudo systemctl restart systemd-journald
  3. sudo journalctl --rotate
