Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Doc] Missing and outdated Airflow metrics >=2.7.0 #18584

Open
Casara360 opened this issue Sep 13, 2024 · 1 comment
Open

[Doc] Missing and outdated Airflow metrics >=2.7.0 #18584

Casara360 opened this issue Sep 13, 2024 · 1 comment

Comments

@Casara360
Copy link

Casara360 commented Sep 13, 2024

The DD_DOGSTATSD_MAPPER_PROFILES config used by the Airflow integration is missing the queued_duration metric, and mishandles task_instance_created past 2.6.3.

Queued duration

The dag.<dag_id>.<task_id>.queued_duration looks to have been introduced in Airflow 2.7.0, and the DogStatsD mapper profile provided on the Integrations page is missing its entry.
As a result, it's exploding the number of metrics (+3k in my case).
I suggest adding the following entry

{
    "match": "airflow\\.dag\\.(.*)\\.([^.]*)\\.queued_duration",
    "match_type": "regex",
    "name": "airflow.dag.task.queued_duration",
    "tags": {"dag_id": "$1", "task_id": "$2"},
},

Task instance created

From version 2.6.3 to 2.7.0, the counter the task instance created metric changed format:

# < 2.7.0
task_instance_created-<operator_name>
# >= 2.7.0
task_instance_created_<operator_name>

This results in 1 metric per operator, which makes it hardly usable.
The current mapping is

  {
      "match": "airflow.task_instance_created-*",
      "name": "airflow.task.instance_created",
      "tags": {"task_class": "$1"},
  }

which could perhaps be

{
    "match": "airflow.task_instance_created*",
    "name": "airflow.task.instance_created",
    "tags": {"task_class": "$1"},
}

to handle both versions ?


Would it be worth adding a disclaimer, warning the user that the mapping was created for version X ?

@Casara360
Copy link
Author

Same for scheduled_duration, need to add

{
    "match": "airflow\\.dag\\.(.*)\\.([^.]*)\\.scheduled_duration",
    "match_type": "regex",
    "name": "airflow.dag.task.scheduled_duration",
    "tags": {"dag_id": "$1", "task_id": "$2"},
},

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant