airflow-azure's People

Contributors

lenadroid

airflow-azure's Issues

Worker node cannot write to the db

Thanks for the great guide!

I'm currently following the guide with an airflow 2.0 image.

I'm running into an issue where my worker nodes crash with the error

[2021-06-22 22:54:17,516] {cli_action_loggers.py:105} WARNING - Failed to log action with (sqlite3.OperationalError) no such table: log [SQL: INSERT INTO log (dttm, dag_id, task_id, event, execution_date, owner, extra) VALUES (?, ?, ?, ?, ?, ?, ?)] [parameters: ('2021-06-22 22:54:17.511978', 'airflow_tutorial_v01', 'print_hello', 'cli_task_run', '2021-06-22 22:47:52.924823', 'airflow', '{"host_name": "airflowtutorialv01printhello.34fc0661f3f845f29464738aa150b18f", "full_command": "[\'/home/airflow/.local/bin/airflow\', \'tasks\', \'r ... (28 characters truncated) ... \', \'print_hello\', \'2021-06-22T22:47:52.924823+00:00\', \'--local\', \'--pool\', \'default_pool\', \'--subdir\', \'/opt/airflow/dags/hello.py\']"}')]

The scheduler can connect to the Postgres DB, as verified by $ airflow db check:

[2021-06-22 22:56:15,550] {db.py:776} INFO - Connection successful.
and the requisite tables are present in the Postgres DB and have been written to.

I believe my metadata connection secret is OK:

echo -n "postgresql+psycopg2://airflow%40{hostname}:{pwd}@{hostname}.postgres.database.azure.com:5432/airflow" | base64
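As a sanity check, the encoded value can be decoded back to confirm it round-trips without a stray trailing newline (hostname and password below are placeholders, not the real values):

```shell
# Placeholder hostname/password; verify the secret value round-trips cleanly.
conn="postgresql+psycopg2://airflow%40myhost:mypwd@myhost.postgres.database.azure.com:5432/airflow"
# echo -n avoids encoding a trailing newline, which would corrupt the connection string
encoded=$(echo -n "$conn" | base64 | tr -d '\n')
decoded=$(echo "$encoded" | base64 -d)
[ "$decoded" = "$conn" ] && echo "round-trip OK"
```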

Any advice on how to proceed would be wonderful; I'm a bit lost here.

Thanks!

File share volumes fail to mount

Hi

I'm wondering if you can help. I've been following the guide, but the volumes for the file shares I set up fail to mount when the scheduler and web pods are created.

This is the output from describing the scheduler pod:

Name:           scheduler-fb5579585-g56sx
Namespace:      airflow3
Priority:       0
Node:           aks-nodepool1-10959598-vmss000000/10.203.30.4
Start Time:     Fri, 12 Feb 2021 12:17:23 +0000
Labels:         component=scheduler
                pod-template-hash=fb5579585
                release=RELEASE-NAME
                tier=airflow
Annotations:    checksum/airflow-config: 6132d4c762bec566a83667e8a23486fcbc29157811f277b66e6568047f627c14
                checksum/metadata-secret: a3512f27fea8455cdddc51ef650052d74657bcaa16194d24b555417e312d43da
                checksum/pgbouncer-config-secret: da52bd1edfe820f0ddfacdebb20a4cc6407d296ee45bcb500a6407e2261a5ba2
                checksum/result-backend-secret: 4bd4a60ef60435fe29fc8135a43a436c0854074a228246c67a6e7488b138200f
                cluster-autoscaler.kubernetes.io/safe-to-evict: true
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/scheduler-fb5579585
Init Containers:
  run-airflow-migrations:
    Container ID:
    Image:         apache/airflow:1.10.10.1-alpha2-python3.6
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Args:
      bash
      -c
      airflow upgradedb || airflow db upgrade
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      AIRFLOW__CORE__FERNET_KEY:        <set to the key 'fernet-key' in secret 'fernet-key'>        Optional: false
      AIRFLOW__CORE__SQL_ALCHEMY_CONN:  <set to the key 'connection' in secret 'airflow-metadata'>  Optional: false
      AIRFLOW_CONN_AIRFLOW_DB:          <set to the key 'connection' in secret 'airflow-metadata'>  Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from scheduler-serviceaccount-token-f2vgh (ro)
Containers:
  scheduler:
    Container ID:
    Image:         apache/airflow:1.10.10.1-alpha2-python3.6
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Args:
      scheduler
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Liveness:       exec [python -Wignore -c import os
os.environ['AIRFLOW__CORE__LOGGING_LEVEL'] = 'ERROR'
os.environ['AIRFLOW__LOGGING__LOGGING_LEVEL'] = 'ERROR'

from airflow.jobs.scheduler_job import SchedulerJob
from airflow.utils.net import get_hostname
import sys

job = SchedulerJob.most_recent_job()
sys.exit(0 if job.is_alive() and job.hostname == get_hostname() else 1)
] delay=0s timeout=1s period=30s #success=1 #failure=10
    Environment:
      AIRFLOW__CORE__FERNET_KEY:        <set to the key 'fernet-key' in secret 'fernet-key'>        Optional: false
      AIRFLOW__CORE__SQL_ALCHEMY_CONN:  <set to the key 'connection' in secret 'airflow-metadata'>  Optional: false
      AIRFLOW_CONN_AIRFLOW_DB:          <set to the key 'connection' in secret 'airflow-metadata'>  Optional: false
    Mounts:
      /opt/airflow/airflow.cfg from config (ro,path="airflow.cfg")
      /opt/airflow/dags from dags-pv (rw)
      /opt/airflow/logs from logs-pv (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from scheduler-serviceaccount-token-f2vgh (ro)
  scheduler-gc:
    Container ID:
    Image:         apache/airflow:1.10.10.1-alpha2-python3.6
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Args:
      bash
      /clean-logs
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /opt/airflow/logs from logs-pv (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from scheduler-serviceaccount-token-f2vgh (ro)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      airflow-config
    Optional:  false
  dags-pv:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  dags-pvc
    ReadOnly:   false
  logs-pv:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  logs-pvc
    ReadOnly:   false
  scheduler-serviceaccount-token-f2vgh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  scheduler-serviceaccount-token-f2vgh
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason       Age                   From                                        Message
  ----     ------       ----                  ----                                        -------
  Warning  FailedMount  45m (x15 over 5h41m)  kubelet, aks-nodepool1-10959598-vmss000000  MountVolume.MountDevice failed for volume "logs-pv" : rpc error: code = Internal desc = volume(fs-logs) mount "//testairflowpoc.file.core.windows.net/logshare" on "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/logs-pv/globalmount" failed with mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t cifs -o dir_mode=0777,file_mode=0777,uid=0,gid=0,mfsymlinks,cache=strict,nosharesock,actimeo=30,vers=3.0,<masked> //testairflowpoc.file.core.windows.net/logshare /var/lib/kubelet/plugins/kubernetes.io/csi/pv/logs-pv/globalmount
Output: mount error(13): Permission denied
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
  Warning  FailedMount  31m (x68 over 5h37m)   kubelet, aks-nodepool1-10959598-vmss000000  Unable to attach or mount volumes: unmounted volumes=[logs-pv dags-pv], unattached 
volumes=[scheduler-serviceaccount-token-f2vgh logs-pv dags-pv config]: timed out waiting for the condition
  Warning  FailedMount  11m (x147 over 5h41m)  kubelet, aks-nodepool1-10959598-vmss000000  MountVolume.MountDevice failed for volume "dags-pv" : rpc error: code = Internal desc = volume(fs-dags) mount "//testairflowpoc.file.core.windows.net/dagshare" on "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/dags-pv/globalmount" failed with mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t cifs -o dir_mode=0777,file_mode=0777,uid=0,gid=0,mfsymlinks,cache=strict,nosharesock,vers=3.0,actimeo=30,<masked> //testairflowpoc.file.core.windows.net/dagshare /var/lib/kubelet/plugins/kubernetes.io/csi/pv/dags-pv/globalmount
Output: mount error(13): Permission denied
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
  Warning  FailedMount  6m35s (x24 over 5h33m)  kubelet, aks-nodepool1-10959598-vmss000000  Unable to attach or mount volumes: unmounted volumes=[logs-pv dags-pv], unattached volumes=[logs-pv dags-pv config scheduler-serviceaccount-token-f2vgh]: timed out waiting for the condition
  Warning  FailedMount  63s (x153 over 5h41m)   kubelet, aks-nodepool1-10959598-vmss000000  MountVolume.MountDevice failed for volume "logs-pv" : rpc error: code = Internal desc = volume(fs-logs) mount "//testairflowpoc.file.core.windows.net/logshare" on "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/logs-pv/globalmount" failed with mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t cifs -o dir_mode=0777,file_mode=0777,uid=0,gid=0,mfsymlinks,cache=strict,nosharesock,vers=3.0,actimeo=30,<masked> //testairflowpoc.file.core.windows.net/logshare /var/lib/kubelet/plugins/kubernetes.io/csi/pv/logs-pv/globalmount
Output: mount error(13): Permission denied
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)

The volumes don't appear to mount and I'm not sure why.
At the moment there are no DAG definitions or any other files in the shares; I'm not sure whether that makes a difference.
From what I can see, the PVs and PVCs have been created, as well as the Secrets with the storage account name and key, as described in the guide.

I don't know if I need to do anything on the file shares or storage account to grant access, or if I am missing something.
AKS v1.18.8
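For reference, mount error(13): Permission denied from mount.cifs against Azure Files typically points at the credentials in the referenced Kubernetes secret (a wrong, regenerated, or base64-mangled storage account key) rather than anything on the AKS side. A minimal sketch of the static PV this kind of setup expects is below; the secret name (azure-secret) is an assumption, while the namespace, volumeHandle, and share name are taken from the describe output above:

```yaml
# Sketch of a static Azure Files PV for the dags share; "azure-secret" is an
# assumed name. mount error(13) usually means the key held in this secret does
# not match the storage account's current key.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: dags-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: file.csi.azure.com
    volumeHandle: fs-dags
    volumeAttributes:
      shareName: dagshare
    nodeStageSecretRef:
      name: azure-secret
      namespace: airflow3
```

If the account key was regenerated after the secret was created, recreating the secret from the current key (az storage account keys list) and deleting the pod so the volume is remounted is usually enough.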

Any help would be great
