-
Notifications
You must be signed in to change notification settings - Fork 4
/
CHANGELOG
204 lines (161 loc) · 9.2 KB
/
CHANGELOG
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
Snowplow Normalize 0.3.5 (2024-03-18)
---------------------------------------
## Summary
This release adds support for [schema grants](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/package-features/table-grants/#granting-usage-on-schemas)
## Features
- Add support for schema grants
## Under the hood
- Enforce full refresh flag to refresh manifest tables
## Upgrading
To upgrade simply bump the snowplow-normalize version in your `packages.yml` file. Note the minimum version of snowplow-utils required is now 0.16.2
Snowplow Normalize 0.3.4 (2024-01-26)
---------------------------------------
## Summary
This version bumps the package dependency to add support for the latest snowplow utils package. Please note that from this version onwards this package is under the SPAL license.
## Under the hood
- Bump support for latest utils
- Migrate to SPAL license
## Upgrading
To upgrade simply bump the snowplow-normalize version in your `packages.yml` file.
Snowplow Normalize 0.3.3 (2023-09-29)
---------------------------------------
## Summary
- Include the new base macro functionality from utils in the package
- Allow users to specify the timestamp used to process events (from the default of `collector_tstamp`)
## Under the hood
- Simplify the model architecture
## Upgrading
Bump the snowplow-normalize version in your `packages.yml` file.
Snowplow Normalize 0.3.2 (2023-09-12)
---------------------------------------
## Summary
Bumps the max supported `snowplow-utils` version to allow usage with our other packages.
## Upgrading
Bump the snowplow-normalize version in your `packages.yml` file.
Snowplow Normalize 0.3.1 (2023-06-14)
---------------------------------------
## Summary
This version bumps the requirement of the `jsonschema` package to validate schemas with the `MultipleOf` property.
## Fixes
- Bump `jsonschema` minimum version (Close #33)
## Upgrading
To upgrade the package, bump the version number in the `packages.yml` file in your project.
Snowplow Normalize 0.3.0 (2023-03-28)
---------------------------------------
## Summary
This version migrates our models away from the `snowplow_incremental_materialization` and instead move to using the built-in `incremental` with an optimization applied on top.
## 🚨 Breaking Changes 🚨
### Changes to materialization
To take advantage of the optimization we apply to the `incremental` materialization, users will need to add the following to their `dbt_project.yml` :
```yaml
# dbt_project.yml
...
dispatch:
- macro_namespace: dbt
search_order: ['snowplow_utils', 'dbt']
```
For custom models please refer to the [snowplow utils migration guide](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/migration-guides/utils/#upgrading-to-0140) and the latest docs on [creating custom incremental models](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-custom-models/#incremental-models).
## Features
- Migrate from `get_cluster_by` and `get_partition_by` to `get_value_by_target_type`
- Migrate all models to use new materialization
## Docs
- Update readme
## Upgrading
Bump the snowplow-normalize version in your `packages.yml` file, and ensuring you have followed the above steps. You can read more in our [upgrade guide](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/migration-guides/normalize/#upgrading-to-0140)
Snowplow Normalize 0.2.3 (2023-03-16)
---------------------------------------
## Summary
This release allows users to disable the days late data filter to enable normalizing of events that don't populate the `dvce_sent_tstamp` field.
## Features
- Allow disabling of days late filter by setting `snowplow__days_late_allowed` to `-1` (#28)
## Upgrading
To upgrade the package, bump the version number in the packages.yml file in your project.
Snowplow Normalize 0.2.2 (2023-03-13)
---------------------------------------
## Summary
This release fixes an issue with column aliasing in BigQuery when the schema order differs from the table in the warehouse. It also adds the ability to alias your `user_id` column and add flat `atomic.events` columns into your `users` table.
## Features
- Fix column alias ordering issue in BigQuery
- Add ability to alias `user_id` column
- Add ability to add flat columns to the events table
## Under the hood
- Alter github pages publishing action
## Upgrading
To upgrade the package, bump the version number in the packages.yml file in your project. To use the new features please see our docs for the new values to add to your config file.
Snowplow Normalize 0.2.1 (2023-01-23)
---------------------------------------
## Summary
This release fixes the expected path for a private registry to require it to end with `/api` in line with other resolvers. It also upgrades the schema requirement for the resolver to 1.0.3 so you can use newer resolver files without issues.
## Features
- Fix private registry uri requirement
- Bump iglu resolver schema version
## Under the hood
- Fix github pages publishing action
## Upgrading
To upgrade the package, bump the version number in the packages.yml file in your project.
Snowplow Normalize 0.2.0 (2022-12-19)
---------------------------------------
## Summary
This release drops support for dbt versions <1.3 to use the latest dbt-utils package, adds functionality for custom user_ids and multiple events per table, as well as ensuring all appropriate versions of BigQuery contexts are used. Due to these changes this version contains a number of breaking changes, so please read the Upgrade section carefully.
## 🚨 Breaking Changes 🚨
- Config file structure has changed to enable multiple event types per table
- Macro inputs have changed to enable multiple event types per table and custom user_id field
- The filtered events table has a new column to enable multiple event types per table
- Support for versions of dbt < 1.3 has been dropped
## Features
- Allow for multiple event types (including self-describing) per normalized table
- Allow for custom user_id field within users table
- BigQuery optimized to use all same major version number sdes and contexts (in line with other Snowplow packages)
- Enhanced testing and warnings under the hood
- Drop support for dbt versions below 1.3 (Close #17)
## Upgrading
To upgrade the package, bump the version number in the packages.yml file in your project.
### Upgrading your config file
To upgrade your config file:
- Change the `event_name` field to `event_names` and make the value a list
- Change the `self_describing_event_schema` field to `self_describing_event_schemas` and make the value a list
- If you wish to make use of the new features, see the example config or the docs for more information
### Upgrading your models
Once you have upgraded your config file, the easiest way to ensure your models match the new macros is to re-run the Python script. If you would prefer not to do thi, you can:
- For each normalized model:
- Convert the `event_name` and `sde_cols` fields to lists, and pluralize the names in both the set and the macro call
- Add a new field, `sde_aliases` which is an empty list, add this between `sde_types` and `context_cols` in the macro call
- For your filtered events table:
- Change the `unique_key` in the config section to `unique_id`
- Add a line between the `event_table_name` and `from` lines for each select statement; `, event_id||'-'||'<that_event_table_name>' as unique_id`, with the event table name for that select block.
- For your users table:
- Add 3 new values to the start of the macro call, `'user_id','',''`, before the `user_cols` argument.
### Upgrade your filtered events table
If you use the master filtered events table, you will need to add a new column for the latest version to work. If you have not processed much data yet it may be easier to simply re-run the package from scratch using `dbt run --full-refresh --vars 'snowplow__allow_refresh: true'`, alternatively run the following in your warehouse, replacing the schema/dataset/warehouse and table name for your table:
```sql
ALTER TABLE {schema}.{table} ADD COLUMN unique_id STRING;
UPDATE {schema}.{table} SET unique_id = event_id||'-'||event_table_name WHERE 1 = 1;
```
Snowplow Normalize 0.1.0 (2022-12-05)
---------------------------------------
## Summary
This is the first full release of the Snowplow Normalize package, it stabilizes features and file formats for future developments and delivers a working product for use within your analysis.
## Features
- Python script to generate models to normalize your custom events and entities
- Support for Databricks, BigQuery, and Snowflake
- Full test suite to avoid unexpected breaking changes in the future
- Example config and resolver files to get started fast
- Name of package agreed (Close #5)
## Installation
To install the package, add the following to the `packages.yml` in your project:
### Github
```yml
packages:
- git: "https://github.com/snowplow/dbt-snowplow-normalize.git"
revision: 0.1.0
```
### dbt hub
Please note that it may be a few days before the package is available on dbt hub after the initial release.
```yml
packages:
- package: snowplow/snowplow_normalize
version: [">=0.1.0", "<0.2.0"]
```
Snowplow Normalize 0.0.1 (2022-10-21)
---------------------------------------
Initial commit of package