Comments (9)

jedcunningham commented on September 28, 2024 2

It's this one: apache-airflow-providers-cncf-kubernetes==8.1.1. Sorry, I tend to refer to it as the "k8s provider", but it's actually named "CNCF Kubernetes".

For metrics info, see https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/metrics.html.
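
As a rough illustration of what those metrics look like on the wire: when StatsD export is enabled (statsd_on = True in the [metrics] section of the Airflow config), Airflow sends plain-text UDP datagrams such as airflow.executor.open_slots:32|g. The sketch below is not part of Airflow. It is a hypothetical debugging listener that assumes the default statsd_prefix of "airflow" and the default port 8125, and the helper names parse_statsd_line and listen are made up for this example.

```python
import socket

# Metric names mentioned in this thread (without the statsd prefix).
WATCHED = ("executor.open_slots", "executor.running_tasks")


def parse_statsd_line(line, prefix="airflow"):
    """Parse 'prefix.name:value|type' into (name, value).

    Returns None for malformed lines and for metrics not in WATCHED.
    The prefix default is an assumption (Airflow's default statsd_prefix).
    """
    try:
        name_part, rest = line.strip().split(":", 1)
        value = float(rest.split("|", 1)[0])
    except ValueError:
        return None
    if name_part.startswith(prefix + "."):
        name_part = name_part[len(prefix) + 1:]
    return (name_part, value) if name_part in WATCHED else None


def listen(host="0.0.0.0", port=8125):
    """Print watched executor gauges as they arrive over UDP (runs forever)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        data, _ = sock.recvfrom(65535)
        # One datagram may carry several newline-separated metrics.
        for line in data.decode("utf-8", "replace").splitlines():
            parsed = parse_statsd_line(line)
            if parsed:
                print(*parsed)
```

Calling listen() on the host that statsd_host points at prints the two executor gauges as the scheduler emits them. For anything beyond a quick look, a real StatsD or Prometheus exporter as described in the linked docs is the better choice.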

I suspect you might be hitting #36998, and 8.3.0 from #39551 might fix it? Can you try upgrading to that version?

from airflow.

hbc-acai commented on September 28, 2024 1

I upgraded from 2.7.3 directly to 2.9.1 and this issue showed up. We did not catch it during testing because it only happened after running the scheduler for some time. Today I tested it more in our dev cluster by running more tasks, and the same issue showed up.

We use the KubernetesExecutor, and mostly PythonOperator and PythonVirtualenvOperator.

Meta database: PostgreSQL 15.6
DAGs and logs Persistent Volumes: Azure Blob Storage
Single Scheduler

hbc-acai commented on September 28, 2024 1

Thanks @jedcunningham. I upgraded to apache-airflow-providers-cncf-kubernetes==8.3.0 and I have not seen the same error in the last few hours.

boring-cyborg commented on September 28, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.

bangnh1 commented on September 28, 2024

I'm having the same issue. Airflow tasks get stuck after a few days of running. I worked around it by restarting the scheduler pods.

shahar1 commented on September 28, 2024

Could you please provide more details for reproducing the issue? (e.g., Airflow settings, what operators are being used in the tasks, versions of operators, etc.)
More details are needed so someone can reproduce and fix the issue themselves.

jedcunningham commented on September 28, 2024

Assuming you haven't installed a different version of the k8s provider, you're running on 8.1.1.

@hbc-acai @bangnh1 do either of you see pods stacking up in k8s and not being deleted? If you are exporting metrics to statsd/prometheus, what do you see for executor.open_slots and executor.running_tasks during that time period?

hbc-acai commented on September 28, 2024

@jedcunningham Yes, I do see more than 10 pods with READY=0/1 and STATUS=Completed when I run kubectl get pods -n airflow.

Sorry, how can I export metrics to statsd/prometheus?

Also, what is the correct way to check the k8s provider version? It is strange that I do not see a k8s provider in the package list for my scheduler:

airflow@airflow-scheduler-79b6db7fd8-x77mk:/opt/airflow$ pip freeze
adal==1.2.7
adlfs==2024.4.1
aiobotocore==2.12.3
aiofiles==23.2.1
aiohttp==3.9.5
aioitertools==0.11.0
aioodbc==0.4.1
aiosignal==1.3.1
alembic==1.13.1
amqp==5.2.0
annotated-types==0.6.0
anyio==4.3.0
apache-airflow==2.9.1
apache-airflow-providers-amazon==8.20.0
apache-airflow-providers-celery==3.6.2
apache-airflow-providers-cncf-kubernetes==8.1.1
apache-airflow-providers-common-io==1.3.1
apache-airflow-providers-common-sql==1.12.0
apache-airflow-providers-docker==3.10.0
apache-airflow-providers-elasticsearch==5.3.4
apache-airflow-providers-fab==1.0.4
apache-airflow-providers-ftp==3.8.0
apache-airflow-providers-google==10.17.0
apache-airflow-providers-grpc==3.4.1
apache-airflow-providers-hashicorp==3.6.4
apache-airflow-providers-http==4.10.1
apache-airflow-providers-imap==3.5.0
apache-airflow-providers-microsoft-azure==10.0.0
apache-airflow-providers-mysql==5.5.4
apache-airflow-providers-odbc==4.5.0
apache-airflow-providers-openlineage==1.7.0
apache-airflow-providers-postgres==5.10.2
apache-airflow-providers-redis==3.6.1
apache-airflow-providers-sendgrid==3.4.0
apache-airflow-providers-sftp==4.9.1
apache-airflow-providers-slack==8.6.2
apache-airflow-providers-smtp==1.6.1
apache-airflow-providers-snowflake==5.4.0
apache-airflow-providers-sqlite==3.7.1
apache-airflow-providers-ssh==3.10.1
apispec==6.6.1
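
Note that the listing above does include apache-airflow-providers-cncf-kubernetes==8.1.1, under the "CNCF Kubernetes" name. Instead of scanning pip freeze output, the version can also be looked up directly. The following is a minimal sketch using only the Python standard library. The helper names installed_version and version_tuple are hypothetical, and the tuple comparison assumes a plain numeric version string such as 8.1.1.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def version_tuple(v):
    """Turn '8.3.0' into (8, 3, 0); assumes plain numeric release segments."""
    return tuple(int(part) for part in v.split(".")[:3])


if __name__ == "__main__":
    name = "apache-airflow-providers-cncf-kubernetes"
    found = installed_version(name)
    if found is None:
        print(f"{name} is not installed")
    else:
        print(f"{name}=={found}")
        # 8.3.0 is the version suggested earlier in this thread.
        if version_tuple(found) < (8, 3, 0):
            print("older than 8.3.0")
```

Run it with the same Python interpreter the scheduler uses (for example, inside the scheduler pod) so that it inspects the right environment.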

shahar1 commented on September 28, 2024

As no further action seems required, I'll close this issue.
If you encounter this behavior again, please create another issue.
