Comments (9)
It's this one: apache-airflow-providers-cncf-kubernetes==8.1.1. Sorry, I tend to just refer to it as the "k8s provider", but it's actually named "CNCF Kubernetes".
For metrics info, see https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/metrics.html.
I suspect you might be hitting #36998, and 8.3.0 from #39551 might fix it? Can you try upgrading to that version?
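On the metrics side, here is a minimal sketch of the settings that turn on StatsD export, written as environment variables for the scheduler. The `AIRFLOW__{SECTION}__{KEY}` naming and the `[metrics]` keys are standard Airflow configuration. The host, port, and prefix values are only assumptions for illustration, and the helper names `airflow_env_var` and `statsd_env` are hypothetical.

```python
from typing import Dict


def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow reads for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Keys from the [metrics] section of airflow.cfg.
# The values are placeholders: point host/port at your own StatsD endpoint.
STATSD_SETTINGS = {
    "statsd_on": "True",
    "statsd_host": "localhost",  # assumption: replace with your StatsD service
    "statsd_port": "8125",       # assumption: the usual StatsD UDP port
    "statsd_prefix": "airflow",  # assumption: prefix for every metric name
}


def statsd_env() -> Dict[str, str]:
    """Return the environment variables that enable StatsD metrics export."""
    return {
        airflow_env_var("metrics", key): value
        for key, value in STATSD_SETTINGS.items()
    }


if __name__ == "__main__":
    # Print in KEY=VALUE form, ready to paste into a pod spec or .env file.
    for name, value in statsd_env().items():
        print(f"{name}={value}")
```

Once the scheduler runs with these set, gauges such as executor.open_slots and executor.running_tasks should show up under the chosen prefix.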
from airflow.
I upgraded from 2.7.3 directly to 2.9.1 and this issue showed up. We did not catch it during testing because it only happened after running the scheduler for some time. Today I tested it more in our dev cluster by running more tasks, and the same issue showed up.
We use KubernetesExecutor, and mostly PythonOperator and PythonVirtualenvOperator.
Meta database: PostgreSQL 15.6
DAGs and logs Persistent Volumes: Azure Blob Storage
Single Scheduler
from airflow.
Thanks @jedcunningham. I upgraded to apache-airflow-providers-cncf-kubernetes==8.3.0 and I have not seen the same error in the last few hours.
from airflow.
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; no need to wait for approval.
from airflow.
I'm having the same issue. Airflow tasks get stuck after a few days of running. I handled it by restarting the scheduler pods.
from airflow.
Could you please provide more details for reproducing the issue? (e.g., Airflow settings, what operators are being used in the tasks, versions of operators, etc.)
More details are needed so someone can reproduce and fix the issue themselves.
from airflow.
Assuming you haven't installed a different version of the k8s provider, you're running on 8.1.1.
@hbc-acai @bangnh1 do either of you see pods stacking up in k8s and not being deleted? If you are exporting metrics to statsd/prometheus, what do you see for executor.open_slots and executor.running_tasks during that time period?
from airflow.
@jedcunningham Yes, I do see more than 10 pods showing READY=0/1 and Status=Completed when I run kubectl get pods -n airflow.
Sorry, how can I export metrics to statsd/prometheus?
Also, what is the correct way to check the k8s provider version? It is strange that I do not see a k8s provider in the package list for my scheduler:
airflow@airflow-scheduler-79b6db7fd8-x77mk:/opt/airflow$ pip freeze
adal==1.2.7
adlfs==2024.4.1
aiobotocore==2.12.3
aiofiles==23.2.1
aiohttp==3.9.5
aioitertools==0.11.0
aioodbc==0.4.1
aiosignal==1.3.1
alembic==1.13.1
amqp==5.2.0
annotated-types==0.6.0
anyio==4.3.0
apache-airflow==2.9.1
apache-airflow-providers-amazon==8.20.0
apache-airflow-providers-celery==3.6.2
apache-airflow-providers-cncf-kubernetes==8.1.1
apache-airflow-providers-common-io==1.3.1
apache-airflow-providers-common-sql==1.12.0
apache-airflow-providers-docker==3.10.0
apache-airflow-providers-elasticsearch==5.3.4
apache-airflow-providers-fab==1.0.4
apache-airflow-providers-ftp==3.8.0
apache-airflow-providers-google==10.17.0
apache-airflow-providers-grpc==3.4.1
apache-airflow-providers-hashicorp==3.6.4
apache-airflow-providers-http==4.10.1
apache-airflow-providers-imap==3.5.0
apache-airflow-providers-microsoft-azure==10.0.0
apache-airflow-providers-mysql==5.5.4
apache-airflow-providers-odbc==4.5.0
apache-airflow-providers-openlineage==1.7.0
apache-airflow-providers-postgres==5.10.2
apache-airflow-providers-redis==3.6.1
apache-airflow-providers-sendgrid==3.4.0
apache-airflow-providers-sftp==4.9.1
apache-airflow-providers-slack==8.6.2
apache-airflow-providers-smtp==1.6.1
apache-airflow-providers-snowflake==5.4.0
apache-airflow-providers-sqlite==3.7.1
apache-airflow-providers-ssh==3.10.1
apispec==6.6.1
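A narrower check than reading the whole `pip freeze` listing, as a minimal sketch: ask Python's standard `importlib.metadata` for the one distribution and compare it with 8.3.0, the release suggested in this thread. The helper names `installed_version` and `version_tuple` are hypothetical, and the comparison only looks at the leading numeric part of the version, so pre-release suffixes are ignored.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

PROVIDER = "apache-airflow-providers-cncf-kubernetes"
SUGGESTED = "8.3.0"  # provider release suggested in this thread


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(text: str) -> Tuple[int, ...]:
    """Turn '8.1.1' into (8, 1, 1); anything after the numeric prefix is dropped."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


if __name__ == "__main__":
    found = installed_version(PROVIDER)
    if found is None:
        print(f"{PROVIDER} is not installed in this environment")
    elif version_tuple(found) < version_tuple(SUGGESTED):
        print(f"{PROVIDER}=={found} is older than {SUGGESTED}")
    else:
        print(f"{PROVIDER}=={found} is at or above {SUGGESTED}")
```

Run it inside the scheduler pod, since that is the environment whose provider version matters here.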
from airflow.
As no further action seems required, I'll close this issue.
If you encounter this behavior again, please create another issue.
from airflow.