Webhook #29

Closed
wants to merge 79 commits into from
702e076
Format using Blue, drop unused imports
kaapstorm Nov 21, 2023
065ce5f
Fix unimported exception
kaapstorm Nov 21, 2023
79a30af
isort, drop unused imports
kaapstorm Nov 21, 2023
c23f53a
Define undefined exception, drop unused comment
kaapstorm Nov 21, 2023
bfb8c7f
Move refresh_hq_datasource() out of views.py
kaapstorm Nov 21, 2023
04cc01b
Create webhook endpoint
kaapstorm Nov 22, 2023
10f9436
Happy path up front to improve readability
kaapstorm Nov 24, 2023
a449215
Move `convert_to_array()` to utils
kaapstorm Nov 27, 2023
ced4d8f
Nit: Typo
kaapstorm Nov 27, 2023
e3c6d65
Pull out function
kaapstorm Nov 27, 2023
bcc6d71
Define `update_dataset()`
kaapstorm Nov 27, 2023
4ab6617
Move superset imports locally
kaapstorm Dec 5, 2023
7e1c379
Call `add_view()`
kaapstorm Dec 5, 2023
e9e2207
Add details to README
kaapstorm Dec 5, 2023
591857f
Error handling in DataSetChangeAPI view
kaapstorm Dec 5, 2023
db84723
Define `update_dataset()`
kaapstorm Nov 27, 2023
b6f9ec4
Add `blinker` as a dependency
kaapstorm Dec 5, 2023
6507f64
Add oauth models and authorization api
Charl1996 Nov 28, 2023
26bdaa0
Add scope check and better error handling
Charl1996 Nov 28, 2023
612616f
Update error message
Charl1996 Nov 28, 2023
3281a2b
Pull hq requests out into class
Charl1996 Dec 1, 2023
08b91f6
Add subscriber task and move models to separate file
Charl1996 Dec 4, 2023
405c266
Add require_oauth decorator
Charl1996 Dec 4, 2023
70b280f
Fix accessor of response
Charl1996 Dec 4, 2023
fdaedb7
Minor changes
Charl1996 Dec 4, 2023
934ca9c
Add scope back to Token and add grant types on client
Charl1996 Dec 5, 2023
b0480b3
Undo my hack
Charl1996 Dec 5, 2023
19df6d0
Invoke celery task to subscribe to datasource
Charl1996 Dec 5, 2023
b746cc6
Fix circular import
Charl1996 Dec 6, 2023
7fdfff1
Add BASE_URL to config
Charl1996 Dec 6, 2023
e69828f
Some fixes
Charl1996 Dec 6, 2023
0bbbee6
Append test_download_datasource
Charl1996 Dec 6, 2023
7f5ec64
Add require_oauth decorator to endpoint
Charl1996 Dec 7, 2023
4279f4c
Remove breakpoint
Charl1996 Dec 7, 2023
3338ffe
Uncomment presumably useful line
Charl1996 Dec 7, 2023
b6dd2a4
models.py: Support Python 3.8
kaapstorm Dec 8, 2023
a838483
`setup_hq_db()` assumes test DB is Postgres
kaapstorm Dec 8, 2023
2250eb7
Add comments to clarify methods
Charl1996 Dec 8, 2023
5e919ca
Make use of timedelta days instead of seconds
Charl1996 Dec 8, 2023
c6f6bac
Use more secure method for generating a client_secret
Charl1996 Dec 8, 2023
43cd6b6
Increase length of client_secret
Charl1996 Dec 8, 2023
8524beb
Merge pull request #30 from dimagi/cs/SC-3069-subscribe-to-ds-changes
kaapstorm Dec 8, 2023
950077e
Revert "Add `blinker` as a dependency"
kaapstorm Dec 8, 2023
f0a820a
Move DataSetChangeAPI to api module
kaapstorm Dec 13, 2023
faa8269
--wip-- Troubleshooting: Upgrade Superset to master
kaapstorm Dec 13, 2023
1a3679d
--wip-- Config, and test nitpicks
kaapstorm Jan 25, 2024
76ce919
Merge branch 'master' into nh/webhook
kaapstorm Jan 29, 2024
8a9e263
Fix tests
kaapstorm Jan 29, 2024
fa34441
Merge branch 'master' into nh/webhook
kaapstorm Feb 9, 2024
e4cdb41
Document rookie mistake
kaapstorm Feb 16, 2024
571c3bb
Better error message
kaapstorm Feb 16, 2024
9765272
"CommCare HQ" has a space
kaapstorm Feb 16, 2024
afb1ee6
Give an error message if unable to connect to HQ
kaapstorm Feb 19, 2024
bc06e03
Set HQ Data URI in `superset_config.py`
kaapstorm Feb 19, 2024
f287af4
isort
kaapstorm Feb 19, 2024
1154a80
Use SQLAlchemy bind for HQ Data db
kaapstorm Feb 19, 2024
73e32d4
Add migration
kaapstorm Feb 19, 2024
ef72e21
Add how to create a migration to docs
kaapstorm Feb 19, 2024
54f5d31
Call `subscribe_to_hq_datasource()` synchronously
kaapstorm Feb 20, 2024
00c244c
`abort()` requires a code
kaapstorm Feb 20, 2024
b4502e0
Dependency inversion: Create services layer
kaapstorm Feb 20, 2024
76ac389
Dependency inversion: Move util to model method
kaapstorm Feb 21, 2024
801b962
Move URL functions into a module
kaapstorm Feb 20, 2024
4aea365
Use symmetric encryption for client secrets
kaapstorm Feb 20, 2024
f8b13eb
Don't hardcode URLs
kaapstorm Feb 21, 2024
b59dd0f
Move OAuth2 server code to its own module
kaapstorm Feb 24, 2024
d81eba0
A few nits
kaapstorm Feb 24, 2024
c2a43e8
Simplify check_client_secret()
kaapstorm Feb 24, 2024
1a52c35
Add `AUTHLIB_INSECURE_TRANSPORT` to README.md
kaapstorm Feb 24, 2024
1d6c6e4
Avoid Celery pickle error
kaapstorm Feb 24, 2024
1f7dc4b
Drop `get_explore_database()`
kaapstorm Feb 24, 2024
133b7ee
Fix DataSetChange
kaapstorm Feb 24, 2024
b2aa3cb
Add API endpoints to domain-excluded views
kaapstorm Feb 24, 2024
e66e002
Simplify update_dataset()
kaapstorm Feb 24, 2024
83538b7
Cast data to column types
kaapstorm Feb 24, 2024
a11b348
Small refactor `update_dataset()` fixes insert
kaapstorm Feb 24, 2024
b375cb5
Don't bother with non-pk index column
kaapstorm Feb 24, 2024
e7d921e
Fix decorator
kaapstorm Feb 24, 2024
8a91695
Token extends OAuth2TokenMixin
kaapstorm Feb 25, 2024
34 changes: 0 additions & 34 deletions hq_superset/tasks.py
@@ -1,12 +1,9 @@
import logging
import os

from superset.extensions import celery_app

from .utils import AsyncImportHelper, refresh_hq_datasource

logger = logging.getLogger(__name__)


@celery_app.task(name='refresh_hq_datasource_task')
def refresh_hq_datasource_task(domain, datasource_id, display_name, export_path, datasource_defn, user_id):
@@ -16,34 +13,3 @@ def refresh_hq_datasource_task(domain, datasource_id, display_name, export_path,
        AsyncImportHelper(domain, datasource_id).mark_as_complete()
        raise
    os.remove(export_path)

From the commit message:

> so that it can get the OAuth token to authenticate with HQ.

Speaking out of inexperience in this area, but can't we simply pass the token to the Celery task?


(Just closing the loop here. It needs the request session. We could serialize it and pass it to the task, but something about that feels awkward to me. I'm happy to be educated though. :) )
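
For reference, a minimal sketch of the alternative discussed in this thread: copy only the JSON-serializable token fields out of the session and hand them to the task as plain arguments. `SESSION_OAUTH_RESPONSE_KEY` is the key name from `utils.py`; the shape of the stored value (`access_token`, `token_type`) and both helper names are assumptions for illustration, not what `HQRequest` actually does.

```python
import json

SESSION_OAUTH_RESPONSE_KEY = "oauth_response"  # same key name as in utils.py


def token_args_from_session(session):
    """Return a JSON-serializable dict holding only what a task would need.

    Assumes (hypothetically) that the session stores the OAuth response as a
    dict with 'access_token' and 'token_type' keys.
    """
    oauth_response = session[SESSION_OAUTH_RESPONSE_KEY]
    token_args = {
        "access_token": oauth_response["access_token"],
        "token_type": oauth_response.get("token_type", "Bearer"),
    }
    # Round-trip through JSON to fail early if anything is not serializable,
    # which is what a JSON-based task serializer would require.
    return json.loads(json.dumps(token_args))


def auth_headers(token_args):
    """Build the Authorization header inside the task from the plain dict."""
    return {
        "Authorization": f"{token_args['token_type']} {token_args['access_token']}"
    }


# Example with a plain dict standing in for the Flask session.
fake_session = {
    SESSION_OAUTH_RESPONSE_KEY: {
        "access_token": "abc123",
        "token_type": "Bearer",
        "expires_in": 3600,
    }
}
token_args = token_args_from_session(fake_session)
headers = auth_headers(token_args)
```

The trade-off is that the access token would then sit in the broker queue and could expire before the task runs. Calling the function synchronously, as this PR ends up doing, avoids both.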


@celery_app.task(name='subscribe_to_hq_datasource_task')
def subscribe_to_hq_datasource_task(domain, datasource_id):
    from superset.config import BASE_URL
    from hq_superset.hq_requests import HQRequest, HqUrl
    from hq_superset.models import HQClient

    if HQClient.get_by_domain(domain) is None:
        hq_request = HQRequest(
            url=HqUrl.subscribe_to_datasource_url(domain, datasource_id)
        )

        client_id, client_secret = HQClient.create_domain_client(domain)

        response = hq_request.post({
            'webhook_url': f'{BASE_URL}/hq_webhook/change/',
            'token_url': f'{BASE_URL}/oauth/token',
            'client_id': client_id,
            'client_secret': client_secret,
        })
        if response.status_code == 201:
            return
        if response.status_code < 500:
            logger.error(
                f"Failed to subscribe to data source {datasource_id} due to the following issue: {response.data}"
            )
        if response.status_code >= 500:
            logger.exception(
                f"Failed to subscribe to data source {datasource_id} due to a remote server error"
            )
38 changes: 33 additions & 5 deletions hq_superset/utils.py
@@ -1,4 +1,5 @@
import ast
import logging
import os
from contextlib import contextmanager
from datetime import date, datetime
@@ -19,6 +20,8 @@
SESSION_OAUTH_RESPONSE_KEY = "oauth_response"
ASYNC_DATASOURCE_IMPORT_LIMIT_IN_BYTES = 5_000_000 # ~5MB

logger = logging.getLogger(__name__)


def get_datasource_export_url(domain, datasource_id):
    return f"a/{domain}/configurable_reports/data_sources/export/{datasource_id}/?format=csv"
@@ -191,9 +194,7 @@ def get_datasource_file(path):

def download_datasource(domain, datasource_id):
    import superset

    from hq_superset.hq_requests import HQRequest, HqUrl
    from hq_superset.tasks import subscribe_to_hq_datasource_task

    hq_request = HQRequest(
        url=HqUrl.datasource_export_url(domain, datasource_id),
@@ -204,15 +205,42 @@ def download_datasource(domain, datasource_id):

    filename = f"{datasource_id}_{datetime.now()}.zip"
    path = os.path.join(superset.config.SHARED_DIR, filename)

    with open(path, "wb") as f:
        f.write(response.content)

    subscribe_to_hq_datasource_task.delay(domain, datasource_id)

    return path, len(response.content)
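
Side note on the filename above: `f"{datetime.now()}"` renders like `2024-02-20 09:30:15.123456`, so the resulting path contains a space and colons. That works on Linux but is awkward in shells, and colons are not valid in Windows filenames. A possible alternative is sketched below; `datasource_filename` is a hypothetical helper, not part of this PR.

```python
from datetime import datetime


def datasource_filename(datasource_id, now=None):
    """Build a zip filename with a compact, filesystem-safe timestamp."""
    now = now or datetime.now()
    # Digits and a literal "T" only: no spaces, colons, or dots.
    timestamp = now.strftime("%Y%m%dT%H%M%S%f")
    return f"{datasource_id}_{timestamp}.zip"


example = datasource_filename(
    "abc123", datetime(2024, 2, 20, 9, 30, 15, 123456)
)
```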


def subscribe_to_hq_datasource(domain, datasource_id):
    from superset.config import BASE_URL
    from hq_superset.hq_requests import HQRequest, HqUrl
    from hq_superset.models import HQClient

    if HQClient.get_by_domain(domain) is None:
        hq_request = HQRequest(
            url=HqUrl.subscribe_to_datasource_url(domain, datasource_id)
        )

        client_id, client_secret = HQClient.create_domain_client(domain)

        response = hq_request.post({
            'webhook_url': f'{BASE_URL}/hq_webhook/change/',
            'token_url': f'{BASE_URL}/oauth/token',
            'client_id': client_id,
            'client_secret': client_secret,
        })
        if response.status_code == 201:
            return
        if response.status_code < 500:
            logger.error(
                f"Failed to subscribe to data source {datasource_id} due to the following issue: {response.data}"
            )
        if response.status_code >= 500:
            logger.exception(
                f"Failed to subscribe to data source {datasource_id} due to a remote server error"
            )
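
One nit on the status handling above: `logger.exception()` is meant to be called from inside an exception handler. Here no exception is being handled, so the log record would carry an empty `NoneType: None` traceback. A minimal sketch of the same branches using `logger.error()` for both cases follows. `log_subscribe_failure` and the stand-in response objects are hypothetical and only assume a `status_code` and a `data` attribute, as the code above does.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("hq_superset.sketch")


def log_subscribe_failure(datasource_id, response):
    """Log a failed subscribe response; return True if it succeeded.

    `response` is any object with `status_code` and `data` attributes,
    mirroring how the function above uses it.
    """
    if response.status_code == 201:
        return True
    if response.status_code < 500:
        logger.error(
            "Failed to subscribe to data source %s due to the following issue: %s",
            datasource_id,
            response.data,
        )
    else:
        # No exception is being handled here, so use error() rather than
        # exception(), which would append an empty "NoneType: None" traceback.
        logger.error(
            "Failed to subscribe to data source %s due to a remote server error "
            "(status %s)",
            datasource_id,
            response.status_code,
        )
    return False


# Stand-in responses for illustration only.
created = SimpleNamespace(status_code=201, data="")
bad_request = SimpleNamespace(status_code=400, data="Invalid webhook_url")
server_error = SimpleNamespace(status_code=502, data="")

results = [
    log_subscribe_failure("abc123", r)
    for r in (created, bad_request, server_error)
]
```

Returning a boolean is an extra design choice in this sketch: it would let the caller in `views.py` decide whether to flash a warning, which the current function does not support.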


def get_datasource_defn(domain, datasource_id):
    from hq_superset.hq_requests import HQRequest, HqUrl

2 changes: 2 additions & 0 deletions hq_superset/views.py
@@ -28,6 +28,7 @@
    get_hq_database,
    get_schema_name_for_domain,
    refresh_hq_datasource,
    subscribe_to_hq_datasource,
)

logger = logging.getLogger(__name__)
@@ -120,6 +121,7 @@ def trigger_datasource_refresh(domain, datasource_id, display_name):
        )
        return redirect("/tablemodelview/list/")

    subscribe_to_hq_datasource(domain, datasource_id)
    path, size = download_datasource(domain, datasource_id)
    datasource_defn = get_datasource_defn(domain, datasource_id)
    if size < ASYNC_DATASOURCE_IMPORT_LIMIT_IN_BYTES: