Metastore Variables are not recognized by scheduler #43151
Labels: area:core, area:MetaDB, area:Scheduler, kind:bug, needs-triage
Apache Airflow version
2.10.2
What happened?
We are using a custom plugin with an on_dag_run_running listener, which runs inside the scheduler. The listener should fetch a variable defined through either the Airflow UI or the Airflow CLI, but the lookup fails:
KeyError: 'Variable monitoring_api_key does not exist'
I debugged this a bit and checked how airflow.models.variable.Variable resolves the variable.
To do that, I called ensure_secrets_loaded and iterated through the secrets backends myself, only without the try/except block:
for secrets_backend in ensure_secrets_loaded():
    var_val = secrets_backend.get_variable(key=key)
    print(var_val)
Everything was fine until the loop reached MetastoreBackend, which threw this error:
[2024-10-14T14:30:15.578+0200] {variable.py:357} ERROR - Unable to retrieve variable from secrets backend (MetastoreBackend). Checking subsequent secrets backend.
Traceback (most recent call last):
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/airflow/models/variable.py", line 353, in get_variable_from_secrets
    var_val = secrets_backend.get_variable(key=key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/airflow/utils/session.py", line 96, in wrapper
    with create_session() as session:
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/contextlib.py", line 144, in __exit__
    next(self.gen)
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/airflow/utils/session.py", line 57, in create_session
    session.commit()
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 1454, in commit
    self._transaction.commit(_to_root=self.future)
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 832, in commit
    self._prepare_impl()
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 800, in _prepare_impl
    self.session.dispatch.before_commit(self.session)
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/sqlalchemy/event/attr.py", line 346, in __call__
    fn(*args, **kw)
  File "/data01/airflow/.pyenv/versions/3.12.6/lib/python3.12/site-packages/airflow/utils/sqlalchemy.py", line 424, in _validate_commit
    raise RuntimeError("UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!")
RuntimeError: UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!
If the variable comes from the environment or from HashiCorp Vault, everything is fine. The issue occurs only when the variable exists solely in the Metastore.
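To make the failure mode easier to follow, here is a minimal, self-contained sketch of what the traceback suggests is happening. It does not use Airflow at all: ToySession, EnvBackend, ToyMetastoreBackend and get_variable are hypothetical stand-ins, and the assumption (taken from the traceback, not from reading the full Airflow source) is that the metastore lookup commits its session on exit while the scheduler has commits prohibited on that same session.

```python
class ToySession:
    """Hypothetical stand-in for a DB session with a before-commit guard."""

    def __init__(self):
        self.commit_prohibited = False

    def commit(self):
        # Mirrors the guard seen in the traceback: any commit while the
        # guard is active raises instead of committing.
        if self.commit_prohibited:
            raise RuntimeError("UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!")


class EnvBackend:
    """Looks a variable up in an environment mapping; never touches the session."""

    def __init__(self, environ):
        self.environ = environ

    def get_variable(self, key):
        return self.environ.get("AIRFLOW_VAR_" + key.upper())


class ToyMetastoreBackend:
    """Reads from a dict but commits the shared session afterwards (assumed behaviour)."""

    def __init__(self, session, rows):
        self.session = session
        self.rows = rows

    def get_variable(self, key):
        value = self.rows.get(key)
        self.session.commit()  # the session wrapper commits on exit, even for a read
        return value


def get_variable(key, backends):
    """Try each backend in order; a failing backend is logged and skipped."""
    for backend in backends:
        try:
            value = backend.get_variable(key)
        except Exception as exc:
            print(f"Unable to retrieve variable from {type(backend).__name__}: {exc}")
            continue
        if value is not None:
            return value
    raise KeyError(f"Variable {key} does not exist")
```

In this model, a variable present in the environment is returned before the metastore is ever consulted, which matches the observation that environment and Vault variables work. When the value lives only in the toy metastore and the guard is active, the RuntimeError is swallowed by the fallback loop and surfaces as the KeyError reported above.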
What you think should happen instead?
The scheduler should be able to pick up variables from the Metastore as well.
How to reproduce
Set a variable with the Airflow UI or Airflow CLI. Do not define it in the environment or in an external secrets backend.
Try to read the variable within the scheduler, e.g. from an on_dag_run_running event listener:
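```python
import logging
from datetime import datetime

from airflow.listeners import hookimpl
from airflow.models import Variable
from airflow.plugins_manager import AirflowPlugin


class Plg(AirflowPlugin):
    class Listener:
        @hookimpl
        def on_dag_run_running(self, dag_run, msg: str):
            """
            This method is called when dag run state changes to RUNNING.
            """
            start_date = datetime.utcnow()
            state = dag_run.get_state()
            # Raises KeyError in the scheduler when the variable exists only in the Metastore.
            var_val = Variable.get("your_variable")
            logging.info(f"LSNR Dag running, status:{start_date} state:{state}, variable {var_val}")

    # Assumed registration: the plugin name and listeners list were not
    # shown in the original snippet and are added here for completeness.
    name = "plg"
    listeners = [Listener()]
```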
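Since variables defined in the environment are reported to work, one possible stopgap is to read the value from the environment directly inside the listener. The sketch below is only an illustration under that assumption: read_airflow_env_variable is a hypothetical helper, not an Airflow API, and it relies on Airflow's AIRFLOW_VAR_<NAME> naming convention for environment-backed variables. It bypasses every other secrets backend and does no JSON deserialization.

```python
import os


def read_airflow_env_variable(key, default=None, environ=None):
    """Return the value of AIRFLOW_VAR_<KEY> or `default` if it is unset.

    `environ` defaults to os.environ; passing a dict makes the helper
    easy to exercise without touching the real process environment.
    """
    env = os.environ if environ is None else environ
    return env.get(f"AIRFLOW_VAR_{key.upper()}", default)
```

In the listener above, Variable.get("your_variable") could then be swapped for read_airflow_env_variable("your_variable"), with AIRFLOW_VAR_YOUR_VARIABLE exported in the scheduler's environment. This sidesteps the Metastore lookup rather than fixing it.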
Operating System
Red Hat Enterprise Linux 8.10 (Ootpa)
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==8.28.0
apache-airflow-providers-celery==3.8.1
apache-airflow-providers-common-compat==1.2.0
apache-airflow-providers-common-io==1.4.0
apache-airflow-providers-common-sql==1.16.0
apache-airflow-providers-dbt-cloud==3.10.0
apache-airflow-providers-fab==1.3.0
apache-airflow-providers-ftp==3.11.0
apache-airflow-providers-google==10.22.0
apache-airflow-providers-hashicorp==3.8.0
apache-airflow-providers-http==4.13.0
apache-airflow-providers-imap==3.7.0
apache-airflow-providers-jdbc==4.5.0
apache-airflow-providers-microsoft-mssql==3.9.0
apache-airflow-providers-mysql==5.7.0
apache-airflow-providers-odbc==4.7.0
apache-airflow-providers-oracle==3.11.0
apache-airflow-providers-postgres==5.12.0
apache-airflow-providers-salesforce==5.8.0
apache-airflow-providers-sftp==4.11.0
apache-airflow-providers-slack==8.9.0
apache-airflow-providers-smtp==1.8.0
apache-airflow-providers-snowflake==5.7.0
apache-airflow-providers-sqlite==3.9.0
apache-airflow-providers-ssh==3.13.1
Deployment
Virtualenv installation