Release/v0.12.0 #79

Merged
merged 34 commits into from
May 1, 2024
Commits
b50f183
feature/performance-updates
fivetran-catfritz Mar 18, 2024
53b3d89
update cluster
fivetran-catfritz Mar 18, 2024
00889bf
update docs
fivetran-catfritz Mar 18, 2024
32c976f
add partition by macro
fivetran-catfritz Mar 18, 2024
03efb17
adjust whitespace
fivetran-catfritz Mar 18, 2024
bf7e2da
update changelog, decisionlog, readme
fivetran-catfritz Mar 18, 2024
c855908
updates & regen docs
fivetran-catfritz Mar 20, 2024
63eb43d
update yml
fivetran-catfritz Mar 20, 2024
074c400
Apply suggestions from code review
fivetran-catfritz Mar 20, 2024
ccbf965
update comment
fivetran-catfritz Mar 20, 2024
d7bc25a
update yml & regen docs
fivetran-catfritz Mar 20, 2024
367ca2d
regen docs
fivetran-catfritz Mar 20, 2024
16c18d1
update docs and tests
fivetran-catfritz Mar 20, 2024
59d79f0
update docs and tests
fivetran-catfritz Mar 20, 2024
bfa7001
update docs and tests
fivetran-catfritz Mar 20, 2024
56d3082
update yml
fivetran-catfritz Mar 20, 2024
ded7fe2
updates
fivetran-catfritz Apr 5, 2024
f161cbf
use fivetran_utils lookback & regen docs
fivetran-catfritz Apr 18, 2024
e156944
update to use fivetran_utils macros
fivetran-catfritz Apr 18, 2024
ea1b81e
regen docs
fivetran-catfritz Apr 18, 2024
56225fa
add sql warehouse testing
fivetran-catfritz Apr 19, 2024
cbc4d10
update sql warehouse
fivetran-catfritz Apr 19, 2024
578c5b2
update sql warehouse
fivetran-catfritz Apr 19, 2024
b2837eb
Updated orders case statement for fixed amount discounts (#78)
fivetran-avinash Apr 22, 2024
d931dcb
update is_databricks_sql_warehouse
fivetran-catfritz Apr 23, 2024
115d267
change to local macros
fivetran-catfritz Apr 29, 2024
05ad2da
Merge branch 'release/v0.12.0' into feature/performance-updates
fivetran-catfritz Apr 29, 2024
3954ea5
Merge pull request #76 from fivetran/feature/performance-updates
fivetran-catfritz Apr 29, 2024
91425d4
regen docs
fivetran-catfritz Apr 29, 2024
251d3c5
Update CHANGELOG.md
fivetran-catfritz Apr 30, 2024
25c52ec
Update CHANGELOG.md
fivetran-catfritz Apr 30, 2024
585b4c0
Apply suggestions from code review
fivetran-catfritz May 1, 2024
d9fb249
Update CHANGELOG.md
fivetran-catfritz May 1, 2024
485e6a1
Update packages.yml
fivetran-catfritz May 1, 2024
5 changes: 4 additions & 1 deletion .buildkite/hooks/pre-command
@@ -21,4 +21,7 @@ export CI_SNOWFLAKE_DBT_USER=$(gcloud secrets versions access latest --secret="C
export CI_SNOWFLAKE_DBT_WAREHOUSE=$(gcloud secrets versions access latest --secret="CI_SNOWFLAKE_DBT_WAREHOUSE" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_HOST=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HOST" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_HTTP_PATH=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HTTP_PATH" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_CATALOG=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_CATALOG" --project="dbt-package-testing-363917")
export CI_DATABRICKS_SQL_DBT_HTTP_PATH=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_SQL_DBT_HTTP_PATH" --project="dbt-package-testing-363917")
export CI_DATABRICKS_SQL_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_SQL_DBT_TOKEN" --project="dbt-package-testing-363917")
18 changes: 17 additions & 1 deletion .buildkite/pipeline.yml
@@ -69,5 +69,21 @@ steps:
- "CI_DATABRICKS_DBT_HOST"
- "CI_DATABRICKS_DBT_HTTP_PATH"
- "CI_DATABRICKS_DBT_TOKEN"
- "CI_DATABRICKS_DBT_CATALOG"
commands: |
bash .buildkite/scripts/run_models.sh databricks
bash .buildkite/scripts/run_models.sh databricks

- label: ":databricks: :database: Run Tests - Databricks SQL Warehouse"
key: "run_dbt_databricks_sql"
plugins:
- docker#v3.13.0:
image: "python:3.8"
shell: [ "/bin/bash", "-e", "-c" ]
environment:
- "BASH_ENV=/tmp/.bashrc"
- "CI_DATABRICKS_DBT_HOST"
- "CI_DATABRICKS_SQL_DBT_HTTP_PATH"
- "CI_DATABRICKS_SQL_DBT_TOKEN"
- "CI_DATABRICKS_DBT_CATALOG"
commands: |
bash .buildkite/scripts/run_models.sh databricks-sql
11 changes: 11 additions & 0 deletions .buildkite/scripts/run_models.sh
@@ -16,9 +16,20 @@ db=$1
echo `pwd`
cd integration_tests
dbt deps

if [ "$db" = "databricks-sql" ]; then
dbt seed --vars '{shopify_schema: shopify_integrations_tests_sqlw}' --target "$db" --full-refresh
dbt run --vars '{shopify_schema: shopify_integrations_tests_sqlw}' --target "$db" --full-refresh
dbt test --vars '{shopify_schema: shopify_integrations_tests_sqlw}' --target "$db"
dbt run --vars '{shopify_schema: shopify_integrations_tests_sqlw, shopify_timezone: "America/New_York", shopify_using_fulfillment_event: true, shopify_using_all_metafields: true}' --target "$db" --full-refresh
dbt test --vars '{shopify_schema: shopify_integrations_tests_sqlw}' --target "$db"
dbt run-operation fivetran_utils.drop_schemas_automation --target "$db"

else
dbt seed --target "$db" --full-refresh
dbt run --target "$db" --full-refresh
dbt test --target "$db"
dbt run --vars '{shopify_timezone: "America/New_York", shopify_using_fulfillment_event: true, shopify_using_all_metafields: true}' --target "$db" --full-refresh
dbt test --target "$db"
dbt run-operation fivetran_utils.drop_schemas_automation --target "$db"
fi
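
The two branches above differ only in the extra `shopify_schema` var passed for the `databricks-sql` target. A minimal sketch of how that difference could be isolated is shown below. The helpers `vars_for_target` and `print_plan` are hypothetical and are not part of this PR, and the sketch only prints command lines rather than invoking dbt.

```shell
# Hypothetical helper (not in the PR): build the --vars payload for a target.
# $1 is the target name; $2 is an optional string of additional vars.
vars_for_target() {
  db="$1"
  extra="${2:-}"
  base=""
  if [ "$db" = "databricks-sql" ]; then
    base="shopify_schema: shopify_integrations_tests_sqlw"
  fi
  # Join base and extra with a comma only when both are non-empty.
  if [ -n "$base" ] && [ -n "$extra" ]; then
    echo "{$base, $extra}"
  else
    echo "{$base$extra}"
  fi
}

# Dry run: print the first three dbt commands instead of executing them.
print_plan() {
  target="$1"
  echo "dbt seed --vars '$(vars_for_target "$target")' --target $target --full-refresh"
  echo "dbt run --vars '$(vars_for_target "$target")' --target $target --full-refresh"
  echo "dbt test --vars '$(vars_for_target "$target")' --target $target"
}

print_plan databricks-sql
```

For every other target the payload collapses to an empty mapping (`{}`), so a single code path could serve both branches. Whether that is preferable to the explicit `if`/`else` in the PR is a readability trade-off.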
41 changes: 10 additions & 31 deletions .github/PULL_REQUEST_TEMPLATE/maintainer_pull_request_template.md
@@ -4,48 +4,27 @@
**This PR will result in the following new package version:**
<!--- Please add details around your decision for breaking vs non-breaking version upgrade. If this is a breaking change, were backwards-compatible options explored? -->

**Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:**
**Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:**
<!--- Copy/paste the CHANGELOG for this version below. -->

## PR Checklist
### Basic Validation
Please acknowledge that you have successfully performed the following commands locally:
- [ ] dbt compile
- [ ] dbt run --full-refresh
- [ ] dbt run
- [ ] dbt test
- [ ] dbt run --vars (if applicable)
- [ ] dbt run --full-refresh && dbt test
- [ ] dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:
- [ ] The appropriate issue has been linked and tagged
- [ ] You are assigned to the corresponding issue and this PR
- [ ] The appropriate issue has been linked, tagged, and properly assigned
- [ ] All necessary documentation and version upgrades have been applied
- [ ] docs were regenerated (unless this PR does not include any code or yml updates)
- [ ] BuildKite integration tests are passing
- [ ] Detailed validation steps have been provided below

### Detailed Validation
Please acknowledge that the following validation checks have been performed prior to marking this PR as "ready for review":
- [ ] You have validated these changes and assure this PR will address the respective Issue/Feature.
- [ ] You are reasonably confident these changes will not impact any other components of this package or any dependent packages.
- [ ] You have provided details below around the validation steps performed to gain confidence in these changes.
Please share any and all of your validation steps:
<!--- Provide the steps you took to validate your changes below. -->

### Standard Updates
Please acknowledge that your PR contains the following standard updates:
- Package versioning has been appropriately indexed in the following locations:
- [ ] indexed within dbt_project.yml
- [ ] indexed within integration_tests/dbt_project.yml
- [ ] CHANGELOG has individual entries for each respective change in this PR
<!--- If there is a parallel upstream change, remember to reference the corresponding CHANGELOG as an individual entry. -->
- [ ] README updates have been applied (if applicable)
<!--- Remember to check the following README locations for common updates. -->
<!--- Suggested install range (needed for breaking changes) -->
<!--- Dependency matrix is appropriately updated (if applicable) -->
<!--- New variable documentation (if applicable) -->
- [ ] DECISIONLOG updates have been applied (if applicable)
- [ ] Appropriate yml documentation has been added (if applicable)

### dbt Docs
Please acknowledge that after the above were all completed the below were applied to your branch:
- [ ] docs were regenerated (unless this PR does not include any code or yml updates)

### If you had to summarize this PR in an emoji, which would it be?
<!--- For a complete list of markdown compatible emojis check out this git repo (https://gist.github.com/rxaviers/7360908) -->
:dancer:

3 changes: 2 additions & 1 deletion .gitignore
@@ -4,4 +4,5 @@ dbt_modules/
logs/
env/
dbt_packages/
package-lock.yml
.DS_Store
package-lock.yml
32 changes: 32 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,35 @@
# dbt_shopify v0.12.0

[PR #76](https://github.com/fivetran/dbt_shopify/pull/76) includes the following updates:

## 🚨 Breaking Changes 🚨
> ⚠️ Since the following changes are breaking, a `--full-refresh` after upgrading will be required.

- Performance improvements:
- Added an incremental strategy for the following models. These models were picked for incremental materialization based on the size of their upstream sources.
- `shopify__customer_cohorts` (For Databricks SQL Warehouse destinations, this model is materialized as a table without support for incremental runs at this time.)
- `shopify__customer_email_cohorts` (For Databricks SQL Warehouse destinations, this model is materialized as a table without support for incremental runs at this time.)
- `shopify__discounts`
- `shopify__order_lines`
- `shopify__orders`
- `shopify__transactions`
- Updated the materialization of `shopify__orders__order_line_aggregates` to a table. This model draws on several large upstream sources and is referenced in several downstream models, so materializing it as a table improves performance. It was not selected for incremental materialization because its structure is not conducive to an incremental strategy.
- To reduce storage, updated the default materialization of the upstream staging models from tables to views. (See the [dbt_shopify_source CHANGELOG](https://github.com/fivetran/dbt_shopify_source/blob/main/CHANGELOG.md) for more details.)

## Features
- Added a default 7-day look-back to incremental models to accommodate late arriving records. The number of days can be changed by setting the var `lookback_window` in your dbt_project.yml. See the [Lookback Window section of the README](https://github.com/fivetran/dbt_shopify/blob/main/README.md#lookback-window) for more details.
- Added macro `shopify_lookback` to streamline the lookback calculation.
- Updated the partitioning logic in window functions to use only the necessary columns, depending on whether the unioning feature is used. This benefits mainly Redshift destinations, which can see errors when the staging models are materialized as views.

## 🪲 Bug Fixes 🪛
- Corrected the `fixed_amount_discount_amount` logic to appropriately bring in fixed amount discounts in `shopify__orders`. [PR #78](https://github.com/fivetran/dbt_shopify/pull/78)
- Removed the `index=1` filter in `stg_shopify__order_discount_code` in the `dbt_shopify_source` package to ensure all discount codes are brought in for every order. For customers with multiple discount codes in an order, this could update the `count_discount_codes_applied` field in the `shopify__orders` and `shopify__daily_shop` models. [PR #78](https://github.com/fivetran/dbt_shopify/pull/78)

## Under the Hood
- Updated the maintainer PR template to the current format.
- Added integration testing pipeline for Databricks SQL Warehouse.
- Added macro `shopify_is_databricks_sql_warehouse` for detecting if a Databricks target is an All Purpose Cluster or a SQL Warehouse.
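
As an illustration of the kind of check such a macro might perform, here is a minimal sketch. It assumes the distinction can be drawn from the shape of the connection's `http_path`, with SQL Warehouse paths containing a `/warehouses/` segment. The function name `is_sql_warehouse` and both sample paths are made up for this example, and the package's actual Jinja macro may use different logic.

```shell
# Hypothetical check (the PR's macro is written in Jinja and may differ):
# classify a Databricks http_path as SQL Warehouse or All Purpose Cluster.
is_sql_warehouse() {
  case "$1" in
    */warehouses/*) return 0 ;;  # assumed shape: /sql/1.0/warehouses/<id>
    *) return 1 ;;               # anything else is treated as a cluster
  esac
}

# Sample paths are invented for illustration only.
for path in "/sql/1.0/warehouses/abc123" "sql/protocolv1/o/111/0101-xyz"; do
  if is_sql_warehouse "$path"; then
    echo "$path -> SQL Warehouse"
  else
    echo "$path -> All Purpose Cluster"
  fi
done
```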

# dbt_shopify v0.11.0
[PR #74](https://github.com/fivetran/dbt_shopify/pull/74) includes the following updates:

19 changes: 18 additions & 1 deletion DECISIONLOG.md
@@ -35,4 +35,21 @@ We have chosen to make the severity of these tests `warn`, as non-accepted value

### Currency

All monetary values reported in the Shopify end models are in the default currency of your Shop.
All monetary values reported in the Shopify end models are in the default currency of your Shop.

### Incremental Strategy
The models with an incremental strategy were chosen based on the size of their upstream models. We chose to be selective rather than make all models incremental because stacking incremental models adds complexity to changes and maintenance. However, we would still like to hear feedback on these choices.

The strategies for each model are:

| Model | BigQuery/Databricks strategy | Snowflake/Postgres/Redshift strategy |
| --- | --- | --- |
| `shopify__customer_cohorts` | insert_overwrite | delete+insert |
| `shopify__customer_email_cohorts` | insert_overwrite | delete+insert |
| `shopify__discounts` | merge | delete+insert |
| `shopify__order_lines` | merge | delete+insert |
| `shopify__orders` | merge | delete+insert |
| `shopify__transactions` | merge | delete+insert |

For BigQuery and Databricks, insert_overwrite was chosen for the cohort models since the date_day grain provides a suitable column to partition on.
Merge was chosen for the remaining models since this can handle updates to the records.
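
The table above can be read as a two-key lookup. The sketch below encodes it that way, using a hypothetical function `incremental_strategy` that is not part of the package. It does not model the Databricks SQL Warehouse exception noted in the CHANGELOG, where the cohort models are materialized as tables.

```shell
# Hypothetical lookup mirroring the DECISIONLOG table:
# $1 is the model name, $2 is the dbt target type.
incremental_strategy() {
  model="$1"
  target_type="$2"
  case "$model" in
    shopify__customer_cohorts|shopify__customer_email_cohorts) is_cohort=yes ;;
    shopify__discounts|shopify__order_lines|shopify__orders|shopify__transactions) is_cohort=no ;;
    *) echo "not incremental"; return 0 ;;
  esac
  case "$target_type" in
    bigquery|databricks)
      # Cohort models partition on date_day; the rest need record updates.
      if [ "$is_cohort" = yes ]; then echo "insert_overwrite"; else echo "merge"; fi ;;
    snowflake|postgres|redshift) echo "delete+insert" ;;
    *) echo "unknown target"; return 1 ;;
  esac
}

incremental_strategy shopify__customer_cohorts bigquery
incremental_strategy shopify__orders redshift
```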
25 changes: 21 additions & 4 deletions README.md
@@ -46,15 +46,20 @@ The following table provides a detailed list of all models materialized within t
To use this dbt package, you must have the following:

- At least one Fivetran Shopify connector syncing data into your destination.
- A **BigQuery**, **Snowflake**, **Redshift**, **Databricks**, or **PostgreSQL** destination.
- One of the following destinations:
- [BigQuery](https://fivetran.com/docs/destinations/bigquery)
- [Snowflake](https://fivetran.com/docs/destinations/snowflake)
- [Redshift](https://fivetran.com/docs/destinations/redshift)
- [PostgreSQL](https://fivetran.com/docs/destinations/postgresql)
- [Databricks](https://fivetran.com/docs/destinations/databricks) with [Databricks Runtime](https://docs.databricks.com/en/compute/index.html#databricks-runtime)

## Step 2: Install the package (skip if also using the `shopify_holistic_reporting` package)
If you are **not** using the [Shopify Holistic reporting package](https://github.com/fivetran/dbt_shopify_holistic_reporting), include the following shopify package version in your `packages.yml` file:
> TIP: Check [dbt Hub](https://hub.getdbt.com/) for the latest installation instructions or [read the dbt docs](https://docs.getdbt.com/docs/package-management) for more information on installing packages.
```yml
packages:
- package: fivetran/shopify
version: [">=0.11.0", "<0.12.0"] # we recommend using ranges to capture non-breaking changes automatically
version: [">=0.12.0", "<0.13.0"] # we recommend using ranges to capture non-breaking changes automatically
```

Do **NOT** include the `shopify_source` package in this file. The transformation package itself has a dependency on it and will install the source package as well.
@@ -118,7 +123,7 @@ vars:
> **Note**: This will only **numerically** convert timestamps to your target timezone. They will however have a "UTC" appended to them. This is a current limitation of the dbt-date `convert_timezone` [macro](https://github.com/calogica/dbt-date#convert_timezone-column-target_tznone-source_tznone) we leverage.

## (Optional) Step 6: Additional configurations
<details><summary>Expand for configurations</summary>
<details open><summary>Expand/Collapse details</summary>

### Passing Through Additional Fields
This package includes all source columns defined in the macros folder. You can add more columns using our pass-through column variables. These variables allow the pass-through fields to be aliased (`alias`) and cast (`transform_sql`) if desired, but neither is required. Datatype casting is configured via a SQL snippet within the `transform_sql` key. You may add the desired SQL while omitting the `as field_name` at the end, and your custom pass-through fields will be cast accordingly. Use the below format for declaring the respective pass-through variables:
@@ -187,6 +192,18 @@ If an individual source table has a different name than the package expects, add
vars:
shopify_<default_source_table_name>_identifier: your_table_name
```

#### Lookback Window
Records from the source can sometimes arrive late. Since several of the models in this package are incremental, by default we look back 7 days to ensure late arrivals are captured while avoiding the need for frequent full refreshes. Although the lookback reduces how often a full refresh is needed, we still recommend running `dbt run --full-refresh` periodically to maintain the data quality of the models. For more information on our incremental decisions, see the [Incremental Strategy section](https://github.com/fivetran/dbt_shopify/blob/main/DECISIONLOG.md#incremental-strategy) of the DECISIONLOG.

To change the default lookback window, add the following variable to your `dbt_project.yml` file:

```yml
vars:
shopify:
lookback_window: number_of_days # default is 7
```
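
To make the effect of the window concrete, here is a minimal sketch of the date arithmetic. It assumes the cutoff is the latest already-loaded date minus the window, and that GNU `date` is available. The helper `lookback_cutoff` is hypothetical and is not the package's `shopify_lookback` macro, which may compute the cutoff differently.

```shell
# Hypothetical helper: latest loaded date minus the lookback window.
# $1 is a date (YYYY-MM-DD); $2 is the window in days (defaults to 7).
# Requires GNU date for the -d option.
lookback_cutoff() {
  max_date="$1"
  window_days="${2:-7}"
  max_epoch=$(date -u -d "$max_date" +%s)
  date -u -d "@$((max_epoch - window_days * 86400))" +%F
}

lookback_cutoff 2024-05-01      # prints 2024-04-24 (default 7-day window)
lookback_cutoff 2024-05-01 14   # prints 2024-04-17 (wider window)
```

Under this assumption, an incremental run would reprocess records dated on or after the printed cutoff, so a larger `lookback_window` catches later arrivals at the cost of reprocessing more rows.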

</details>


@@ -205,7 +222,7 @@ This dbt package is dependent on the following dbt packages. Please be aware tha
```yml
packages:
- package: fivetran/shopify_source
version: [">=0.11.0", "<0.12.0"]
version: [">=0.12.0", "<0.13.0"]

- package: fivetran/fivetran_utils
version: [">=0.4.0", "<0.5.0"]
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'shopify'
version: '0.11.0'
version: '0.12.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/run_results.json

Large diffs are not rendered by default.

22 changes: 15 additions & 7 deletions integration_tests/ci/sample.profiles.yml
@@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
schema: shopify_integration_tests_6
schema: shopify_integration_tests_7
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
schema: shopify_integration_tests_6
schema: shopify_integration_tests_7
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
@@ -33,7 +33,7 @@ integration_tests:
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
schema: shopify_integration_tests_6
schema: shopify_integration_tests_7
threads: 8
postgres:
type: postgres
@@ -42,13 +42,21 @@ integration_tests:
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
schema: shopify_integration_tests_6
schema: shopify_integration_tests_7
threads: 8
databricks:
catalog: null
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: shopify_integration_tests_6
threads: 2
schema: shopify_integration_tests_7
threads: 8
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
databricks-sql:
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_SQL_DBT_HTTP_PATH') }}"
schema: shopify_integrations_tests_sqlw
threads: 8
token: "{{ env_var('CI_DATABRICKS_SQL_DBT_TOKEN') }}"
type: databricks