Metastore Variables are not recognized by scheduler #43151

Open
1 of 2 tasks
BretaGlac opened this issue Oct 18, 2024 · 0 comments
Labels
area:core area:MetaDB Meta Database related issues. area:Scheduler including HA (high availability) scheduler kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet

Apache Airflow version

2.10.2

If "Other Airflow 2 version" selected, which one?

No response

What happened?

We use a custom plugin with an on_dag_run_running listener, which runs inside the scheduler. The listener should fetch a variable defined through either the Airflow UI or the Airflow CLI, but the lookup fails:
KeyError: 'Variable monitoring_api_key does not exist'
I debugged this a bit and checked how airflow.models.variable.Variable resolves a variable.
I called ensure_secrets_loaded and iterated through the secrets backends the same way, only without the try/except block:
    for secrets_backend in ensure_secrets_loaded():
        var_val = secrets_backend.get_variable(key=key)
        print(var_val)

Every backend worked until the loop reached the Metastore backend, which threw this error:
[2024-10-14T14:30:15.578+0200] {variable.py:357} ERROR - Unable to retrieve variable from secrets backend (MetastoreBackend). Checking subsequent secrets backend.
Traceback (most recent call last):
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/airflow/models/variable.py", line 353, in get_variable_from_secrets
    var_val = secrets_backend.get_variable(key=key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/airflow/utils/session.py", line 96, in wrapper
    with create_session() as session:
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/contextlib.py", line 144, in __exit__
    next(self.gen)
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/airflow/utils/session.py", line 57, in create_session
    session.commit()
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 1454, in commit
    self._transaction.commit(_to_root=self.future)
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 832, in commit
    self._prepare_impl()
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 800, in _prepare_impl
    self.session.dispatch.before_commit(self.session)
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/sqlalchemy/event/attr.py", line 346, in __call__
    fn(*args, **kw)
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/airflow/utils/sqlalchemy.py", line 424, in _validate_commit
    raise RuntimeError("UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!")
RuntimeError: UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!
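
Reading the traceback bottom-up, the sequence seems to be: the Metastore lookup opens a session through create_session, that helper commits when its block exits, and a before_commit hook (_validate_commit) rejects the commit. The log line above suggests that Variable then swallows this error and moves on to the next backend, which would explain the final KeyError. Below is a minimal pure-Python model of that sequence, a sketch and not Airflow code. `FakeSession`, `SHARED_SESSION`, `lookup_inside_guard`, and the body of every function are hypothetical stand-ins; the assumption that the guard is active while the listener runs is inferred from the traceback, not confirmed.

```python
from contextlib import contextmanager


class FakeSession:
    """Hypothetical stand-in for a SQLAlchemy session with a before-commit hook."""

    def __init__(self):
        self.before_commit_hooks = []

    def commit(self):
        # Run every registered hook before committing, like a before_commit event.
        for hook in self.before_commit_hooks:
            hook(self)


# Models a session object shared by the scheduler loop and the lookup.
SHARED_SESSION = FakeSession()


@contextmanager
def prohibit_commit(session):
    """Model of a commit guard: any commit inside the block raises."""

    def _validate_commit(_session):
        raise RuntimeError("UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!")

    session.before_commit_hooks.append(_validate_commit)
    try:
        yield
    finally:
        session.before_commit_hooks.remove(_validate_commit)


@contextmanager
def create_session():
    """Model of a session helper that commits when the block exits cleanly."""
    yield SHARED_SESSION
    SHARED_SESSION.commit()


def get_variable(key):
    """Model of a metastore lookup: a plain read that still ends in a commit."""
    with create_session():
        return {"monitoring_api_key": "secret"}.get(key)


def lookup_inside_guard(key):
    """Run the lookup while the guard is active, as a listener in the scheduler might."""
    try:
        with prohibit_commit(SHARED_SESSION):
            return get_variable(key)
    except RuntimeError as exc:
        return f"failed: {exc}"
```

In this model the same lookup succeeds outside the guard and fails inside it, even though it only reads data, because the commit on exit is what trips the hook.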

If the variable comes from the environment or from HashiCorp Vault, everything works. The problem occurs only when the variable exists solely in the Metastore.
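
For reference, Airflow resolves environment-backed variables from names of the form AIRFLOW_VAR_<KEY>, with the key upper-cased. The sketch below only illustrates that naming convention as a possible stopgap; the helper names `env_name_for_variable` and `get_from_environment` are hypothetical, and this does not address the Metastore failure itself.

```python
import os


def env_name_for_variable(key: str) -> str:
    """Build the environment variable name Airflow checks for a given variable key."""
    return "AIRFLOW_VAR_" + key.upper()


def get_from_environment(key: str, environ=os.environ):
    """Hypothetical helper: return the value from the environment, or None if unset."""
    return environ.get(env_name_for_variable(key))
```

With this convention, exporting AIRFLOW_VAR_MONITORING_API_KEY in the scheduler's environment would let the lookup resolve before the Metastore backend is ever consulted.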

What you think should happen instead?

The scheduler should be able to pick up variables from the Metastore as well.

How to reproduce

Set a variable with the Airflow UI or the Airflow CLI. Do not set it in the environment or in a secrets backend.
Then try to use the variable within the scheduler, e.g. in an on_dag_run_running event listener:
```python
import logging
from datetime import datetime

from airflow.listeners import hookimpl
from airflow.models import Variable
from airflow.plugins_manager import AirflowPlugin


class Plg(AirflowPlugin):
    # `name` and `listeners` are not in the original snippet; they are
    # assumed here because a plugin needs them to register its listener.
    name = "plg"

    class Listener:
        @hookimpl
        def on_dag_run_running(self, dag_run, msg: str):
            """
            This method is called when dag run state changes to RUNNING.
            """
            start_date = datetime.utcnow()
            state = dag_run.get_state()
            # Fails in the scheduler when the variable exists only in the Metastore.
            var_val = Variable.get("your_variable")
            logging.info(f"LSNR Dag running, status:{start_date} state:{state}, variable {var_val}")

    listeners = [Listener()]
```

Operating System

Red Hat Enterprise Linux 8.10 (Ootpa)

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==8.28.0
apache-airflow-providers-celery==3.8.1
apache-airflow-providers-common-compat==1.2.0
apache-airflow-providers-common-io==1.4.0
apache-airflow-providers-common-sql==1.16.0
apache-airflow-providers-dbt-cloud==3.10.0
apache-airflow-providers-fab==1.3.0
apache-airflow-providers-ftp==3.11.0
apache-airflow-providers-google==10.22.0
apache-airflow-providers-hashicorp==3.8.0
apache-airflow-providers-http==4.13.0
apache-airflow-providers-imap==3.7.0
apache-airflow-providers-jdbc==4.5.0
apache-airflow-providers-microsoft-mssql==3.9.0
apache-airflow-providers-mysql==5.7.0
apache-airflow-providers-odbc==4.7.0
apache-airflow-providers-oracle==3.11.0
apache-airflow-providers-postgres==5.12.0
apache-airflow-providers-salesforce==5.8.0
apache-airflow-providers-sftp==4.11.0
apache-airflow-providers-slack==8.9.0
apache-airflow-providers-smtp==1.8.0
apache-airflow-providers-snowflake==5.7.0
apache-airflow-providers-sqlite==3.9.0
apache-airflow-providers-ssh==3.13.1

Deployment

Virtualenv installation

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@BretaGlac BretaGlac added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Oct 18, 2024
@dosubot dosubot bot added area:MetaDB Meta Database related issues. area:Scheduler including HA (high availability) scheduler labels Oct 18, 2024