Comments (28)

vtrajan1962 commented on September 1, 2024 4

In our case, serverless container app jobs were neither writing logs nor doing any work, yet the execution summary showed them in a running state. They ran for many hours with no progress.
However, when I changed the name of the queue the event is tied to and started the job manually, everything worked fine. This is a major hurdle at this point.

Hope this gets patched over the weekend, as all our environments face this issue.

from azure-container-apps.

robrennie commented on September 1, 2024 2

@ruvintri yes we do. The container is in fact running just fine - it interacts with Azure storage for example to store its results and we see that as expected. Just nothing in the Console logs since 6/12/2024. Very weird.

It seems almost like when you trigger them manually, there's some sort of context set that is not set when triggered by a queue.

shirashka commented on September 1, 2024 2

Having the same issue. Deployed a simple container app job this morning and only the manually-triggered jobs show stdout or stderr logs.

My job doesn't (yet) contain logic. It has a few simple Python print() statements for testing purposes. It's triggering as expected based on the KEDA scaling rule for a Service Bus queue. It runs in the expected amount of time, but the output can't be found anywhere for the auto-triggered jobs.
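
One way to narrow this down is to rule out output buffering inside the container. The following is a minimal sketch, not the job described above: `emit_markers` and the `probe-job` name are hypothetical, and it only assumes that the platform collects the container's stdout and stderr. It writes a few numbered, timestamped lines to both streams and flushes after each one, so if nothing shows up for a queue-triggered execution, the application side is unlikely to be the cause.

```python
import sys
import time


def emit_markers(job_name, count=3, out=sys.stdout, err=sys.stderr):
    """Write numbered marker lines to both streams, flushing after each one.

    `job_name` is only a label that makes the lines easy to search for.
    Returns the marker lines so a caller can compare them with the logs.
    """
    lines = []
    for i in range(count):
        stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
        line = f"{job_name} marker {i + 1}/{count} at {stamp}"
        # flush=True pushes each line out immediately instead of leaving it
        # in a buffer that could be lost if the process is killed.
        print(f"[stdout] {line}", file=out, flush=True)
        print(f"[stderr] {line}", file=err, flush=True)
        lines.append(line)
    return lines


if __name__ == "__main__":
    emit_markers("probe-job")  # hypothetical job label
```

Setting `PYTHONUNBUFFERED=1` in the container has a similar effect for plain `print()` calls.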

GuiUzeda commented on September 1, 2024 2

Same here. When I start the job manually it works perfectly and finishes successfully. All the log output works as intended. When the job is started by the Service Bus queue trigger, it runs until timeout and fails with no console log. Only the system logs are shown, and they report no unexpected errors except the timeout.

I am actually questioning if I can trust this kind of service to deploy an application.

robrennie commented on September 1, 2024 2

@GuiUzeda - questioning the same thing. Wondering if I should just go old school with a VM for these jobs. Ugh.

GuiUzeda commented on September 1, 2024 1

Quoting robrennie: "@GuiUzeda - questioning the same thing. Wondering if I should just go old school with a VM for these jobs. Ugh."

It is a little bit out there. There are a lot of unanswered issues here with the triage tag. I have noticed that your original report is 3 days old. I am deploying an app in development for tests, so it is no big issue for me, but if that were to happen in production we would be in big trouble.

chinadragon0515 commented on September 1, 2024 1

We have fixed the issue and a hotfix is rolling out. We expect it to reach all impacted environments within 1 day.

chinadragon0515 commented on September 1, 2024 1

I have patched all impacted environments. Let me know if anyone still hits issues.

robrennie commented on September 1, 2024 1

@jedjohan - I noticed it yesterday too. No system logs, but console logs are back. I went to look for system logs because a job randomly crashed in the evening. I notice things seem to work much better during the day (PST) - I hope it's just because they're doing maintenance at night. I'm using the West US data center.

@jason-berk-k1x - Yea, this service is way not ready for prime time. It reminds me of when Azure first started 20 years ago, and they couldn't keep VMs up and running. Hopefully they'll figure this out too - glad I'm not in production, yet.

Summarizing, I've noticed the following issues:

  1. There is no way to create a support ticket, and the documentation is wrong or misleading. This repo is the only place I could find to log errors. Some serious tickets remain in "need triage" forever. Guess since this isn't "AI", Microsoft doesn't want to put any resources on it.
  2. "Max. Executions" doesn't work and, relatedly, Container App Jobs can't scale beyond a dozen (or so) concurrent replicas. Documented in #1216. Try starting 20 jobs that each peg the CPU. Easy as pie to reproduce.
  3. Randomly missing logs (sometimes console, sometimes system).
  4. Unexplained execution failures at night (I'm in PST, West US DC), with no system logs to diagnose them.
  5. Backoff errors in system logs do not explain the parameter that triggered them, so there is no way to find out what caused the backoff.
  6. Replicas do not appear to use dedicated resources. Ten 1-vCPU replicas starting at the same time and running the same loop should each finish just as fast (in wall-clock time) as a single 1-vCPU replica. This is not the case: the 10-replica case runs much slower. This is really bad because it means that Microsoft is likely way overcharging for vCPU time. This may be related to the concurrent replica problem mentioned above.
  7. When starting many (e.g., 20) replicas (i.e., Container App Jobs), errors occur when pulling the container image for the replicas from the container registry.
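
The comparison in item 6 can be made measurable with a small probe. This is a sketch under assumptions: `timed_spin` is a hypothetical helper, not the loop used above, and the iteration count is arbitrary. It runs a fixed CPU-bound loop and reports both wall-clock time and process CPU time. Running the same image once as a single replica and once as ten replicas gives numbers to compare.

```python
import time


def timed_spin(iterations=2_000_000):
    """Run a fixed CPU-bound loop and report wall-clock and CPU seconds.

    The iteration count is arbitrary; keep it identical across replicas
    so the timings are comparable.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    total = 0
    for i in range(iterations):
        total += i * i  # pure integer work, no I/O
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return {
        "iterations": iterations,
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        "checksum": total,  # lets you confirm every replica did the same work
    }


if __name__ == "__main__":
    # Printed with flush=True so the line is not held in a buffer.
    print(timed_spin(), flush=True)
```

If wall-clock time grows while CPU time stays flat, the process was probably waiting to be scheduled. If both grow together, contention on the underlying hardware is a more likely explanation. Neither reading is conclusive on its own.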

Again, like 20 years ago, I'm left wondering... will Microsoft ever get its act together on this?

oligarchy commented on September 1, 2024

we're seeing something similar. seems to have started around tuesday. manual runs go correctly and produce logs, anything that is triggered from the queue reports as a failure and produces no logs for console.writelines or our custom logging client.

the queue activated runs appear to stop writing system logs somewhere around the time they should be trying to pull the container. the delivery count is not incremented on our queue message on the failed runs, so it's unclear at what point this is failing, but it seems to be pretty early on.

we tried recreating the app from the ground up, remade the queue, stripped out functionality and redeployed, but no luck. if i run the code directly on my local it functions as expected. we also downloaded the container image from our acr and ran it on a local docker and it also performed as expected.

robrennie commented on September 1, 2024

@oligarchy - June 11 2024 was the last time we saw Console messages.

I've also seen failures when starting "too many" (e.g. 40) replicas simultaneously. You'll see 401 errors in the container system log coming from the container registry. I guess it fails because the container registry can only serve a certain number of replicas starting at the same time.

We're compiling Rust code into an Alpine OS to create our container app btw.

ruvintri commented on September 1, 2024

@robrennie Do you see 'Created container .. ' and 'Started container ..' in your queue triggered job system logs?

We are seeing a similar issue in multiple environments including production and realizing that the containers are not actually starting, hence the empty console logs.

The only difference we can see in our logs is that the 'msi-transition' image that is pulled when starting the container has rolled back from tag 1.39.26-m to 1.0.8-m

This rollback occurred in the last 24 hours and has caused all event-triggered jobs to fail to start.

vinisoto commented on September 1, 2024

Hi, we have root-caused the missing logs to a platform regression: #1211

robrennie commented on September 1, 2024

@vinisoto - #1211 did not fix this. I just ran a container app job, system log says everything was fine, no Console Log. In fact, ContainerAppConsoleLogs_CL still isn't even created.

'where' operator: Failed to resolve table or column expression named 'ContainerAppConsoleLogs_CL'
Request id: 14fab04f-1582-4890-8c99-93f0ef01319a
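
For anyone checking whether the tables exist at all, a query that tolerates a missing table can help. The sketch below only builds the query text; `build_table_check_query` is a hypothetical helper and nothing here calls Azure. It relies on the KQL `union isfuzzy=true` option, which skips sources that fail to resolve as long as at least one of them exists, and on the standard `TimeGenerated` and `Type` columns. The output is meant to be pasted into the Log Analytics query editor.

```python
def build_table_check_query(
    tables=("ContainerAppConsoleLogs_CL", "ContainerAppSystemLogs_CL"),
    hours=24,
):
    """Return KQL text that counts recent rows per table.

    With isfuzzy=true, a table that does not exist yet is skipped
    instead of failing the whole query, provided at least one of the
    listed tables resolves.
    """
    sources = ", ".join(tables)
    return (
        f"union isfuzzy=true {sources}\n"
        f"| where TimeGenerated > ago({int(hours)}h)\n"
        "| summarize Rows = count() by Type"
    )


if __name__ == "__main__":
    print(build_table_check_query(hours=6))
```

If only `ContainerAppSystemLogs_CL` comes back in the result, the console table has not been created in that workspace yet.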

jasonrberk commented on September 1, 2024

The number of open issues, closed issues that never actually got fixed, and the overall horrible communication around container apps make me question how this made it out of preview.

The lack of communication and accountability is telling.

jasonrberk commented on September 1, 2024

As part of the retrospective, can anyone share where these changes are publicly discussed, so we can be better prepared and have a chance at correlating platform changes to regressions? My entire non-prod set of subscriptions has been dead in the water since last Thursday.

GuiUzeda commented on September 1, 2024

Quoting chinadragon0515: "I have patched all impacted environments. Let me know if anyone still hits issues."

Thanks for your answer! Is any redeploy needed? I am still having issues with the container not starting:

[image: screenshot from the original comment, not preserved]

jason-berk-k1x commented on September 1, 2024

I get no system or console logs:

[Three screenshots from the original comment, taken 2024-06-25 between 9:57 AM and 10:01 AM, not preserved]

shirashka commented on September 1, 2024

It's been resolved for my jobs. Thank you!

robrennie commented on September 1, 2024

Looking good for me too now. I can see Console Logs as expected. Thank you.

robrennie commented on September 1, 2024

@jason-berk-k1x I do notice a significant delay between the replicas completing and the Console logs appearing - fooled me a couple times. I think it's due to the verbosity of my app though - and will fix that anyways. Also, I tend to remove the filter on the default Log Analytics query and rerun it and that's usually when everything (or something) appears.

jason-berk-k1x commented on September 1, 2024

@robrennie did you have to recreate the job or anything?

robrennie commented on September 1, 2024

@jason-berk-k1x I did not, just ran it again and it worked.

jedjohan commented on September 1, 2024

Since yesterday 14:00 UTC my Systemlogs are no more :( Hundreds of executions for 4 ACJs in one specific environment with no data for ContainerAppSystemLogs_CL

jason-berk-k1x commented on September 1, 2024

@jedjohan isn't it funny you can see console logs....meaning the container is running, but get no system logs! I've been fighting this issue for literally MONTHS!!!!

jedjohan commented on September 1, 2024

Quoting jason-berk-k1x: "@jedjohan isn't it funny you can see console logs....meaning the container is running, but get no system logs! I've been fighting this issue for literally MONTHS!!!!"

Mine are back again wohoo. Just went missing for about 24h.
