Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Cannot use filtered and unfiltered metric based on same measure together in one metric #1156

Open
2 tasks done
Prokos opened this issue Apr 29, 2024 · 1 comment
Open
2 tasks done
Labels
bug Something isn't working triage Tasks that need to be triaged

Comments

@Prokos
Copy link

Prokos commented Apr 29, 2024

Is this a new bug in metricflow?

  • I believe this is a new bug in metricflow
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I have a model that contains users with an ID as well as a dimension marking them as converted or not. Based on this model I want to define 3 metrics: Users, Converted Users, and Conversion Rate.

Attempting to do this leads to an aliasing error (conversion_rate depends on multiple different constrained versions of measure users), which when attempted to resolve leads to another aliasing error (Measure alias x conflicts with a measure alias used elsewhere in the model!).

Expected Behavior

I expect this to work out of the box, or if really necessary after manually setting an alias.

Steps To Reproduce

  1. Set up the following project files:

users.sql

SELECT
	date,
	user_id,

	COUNTIF(event = 'conversion') > 0 AS has_conversion,
FROM
	(
		SELECT
			*
		FROM
			UNNEST([
				STRUCT(1 AS event_id, '2024-01-01' AS date, 1 AS user_id, 'pageview' AS event),
				STRUCT(2 AS event_id, '2024-01-01' AS date, 1 AS user_id, 'conversion' AS event),
				STRUCT(3 AS event_id, '2024-01-01' AS date, 2 AS user_id, 'pageview' AS event)
			])
	)
GROUP BY
	date,
	user_id

semantic.yml

  - name: users
    model: ref('users')
    defaults:
      agg_time_dimension: date

    entities:
      - name: user
        type: primary
        expr: user_id

    dimensions:
      - name: date
        type: time
        type_params:
          time_granularity: day
      - name: has_conversion
        type: categorical

    measures:
      - name: users
        expr: user_id
        agg: count_distinct

metrics.yml

    - name: users
      label: Users
      type: simple
      type_params:
          measure:
              name: users

    - name: converted_users
      label: Converted Users
      type: simple
      type_params:
          measure:
              name: users
              filter: |
                  {{ Dimension('users__has_conversion') }} = TRUE

    - name: conversion_rate
      label: Conversion Rate
      type: ratio
      type_params:
          numerator:
              name: converted_users
          denominator:
              name: users
  1. Run dbt compile
  2. Get the following error:
    PydanticMetric conversion_rate depends on multiple different constrained versions of measure users. In such cases, aliases must be provided, but the following input measures have constraints specified without an alias: [PydanticMetricInputMeasure(name='users', filter=PydanticWhereFilterIntersection(where_filters=[PydanticWhereFilter(where_sql_template="{{ Dimension('users__has_conversion') }} = TRUE\n")]), alias=None, join_to_timespine=False, fill_nulls_with=None)].
  3. Attempt to add an alias to the converted_users metric and run dbt compile again.
  4. Get the following error:
    Measure alias users_alias conflicts with a measure alias used elsewhere in the model! This can cause ambiguity for certain types of query. Please choose another alias, or, if the measures are constrained in the same way, consider centralizing that definition in a new semantic model. Measure specification: name='users' filter=PydanticWhereFilterIntersection(where_filters=[PydanticWhereFilter(where_sql_template="{{ Dimension('users__has_conversion') }} = TRUE\n")]) alias='users_alias' join_to_timespine=False fill_nulls_with=None. Existing metrics with that measure alias used: ['converted_users']

Additional notes:

  • When removing the filter from converted_users the issue is "fixed". When replacing the filter with a simple 1 = 1 check the issue persists.
  • Adding aliases to all metrics (instead of only converted_users) does not solve the problem.
  • Adding another measure to users.sql that is a duplicate of users but with a different name (like, users_duplicate), and then using this measure instead for converted_users circumvents the problem.

Relevant log output

No response

Environment

- OS: 
Distributor ID:	Debian
Description:	Debian GNU/Linux 11 (bullseye)
Release:	11
Codename:	bullseye

- Python: 3.10.7
- dbt: 1.7.10
- metricflow: 0.205.0

Which database are you using?

bigquery

Additional Context

No response

@Prokos Prokos added bug Something isn't working triage Tasks that need to be triaged labels Apr 29, 2024
@tlento
Copy link
Contributor

tlento commented May 14, 2024

Ok, I've figured this out. It's possible we can simply remove this validation check. It's also possible doing so will put people in a position where they have ambiguous names for metric inputs.

The problem you're running into here is that derived metrics that depend on a metric with a filtered measure input cause problems. The solution, at least at first level, is to move the filter off of the measure input and onto the metric. So this:

 - name: converted_users
    label: Converted Users
    type: simple
    type_params:
        measure:
            name: users
            filter: |
                  {{ Dimension('users__has_conversion') }} = TRUE

Becomes this:

 - name: converted_users
    label: Converted Users
    type: simple
    filter: |
          {{ Dimension('users__has_conversion') }} = TRUE
    type_params:
        measure:
            name: users

You may need further aliasing on the converted_users metric, but that should unstick you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage Tasks that need to be triaged
Projects
None yet
Development

No branches or pull requests

2 participants