Webhook #29

Closed
wants to merge 79 commits into from
Changes from 58 commits
702e076
Format using Blue, drop unused imports
kaapstorm Nov 21, 2023
065ce5f
Fix unimported exception
kaapstorm Nov 21, 2023
79a30af
isort, drop unused imports
kaapstorm Nov 21, 2023
c23f53a
Define undefined exception, drop unused comment
kaapstorm Nov 21, 2023
bfb8c7f
Move refresh_hq_datasource() out of views.py
kaapstorm Nov 21, 2023
04cc01b
Create webhook endpoint
kaapstorm Nov 22, 2023
10f9436
Happy path up front to improve readability
kaapstorm Nov 24, 2023
a449215
Move `convert_to_array()` to utils
kaapstorm Nov 27, 2023
ced4d8f
Nit: Typo
kaapstorm Nov 27, 2023
e3c6d65
Pull out function
kaapstorm Nov 27, 2023
bcc6d71
Define `update_dataset()`
kaapstorm Nov 27, 2023
4ab6617
Move superset imports locally
kaapstorm Dec 5, 2023
7e1c379
Call `add_view()`
kaapstorm Dec 5, 2023
e9e2207
Add details to README
kaapstorm Dec 5, 2023
591857f
Error handling in DataSetChangeAPI view
kaapstorm Dec 5, 2023
db84723
Define `update_dataset()`
kaapstorm Nov 27, 2023
b6f9ec4
Add `blinker` as a dependency
kaapstorm Dec 5, 2023
6507f64
Add oauth models and authorization api
Charl1996 Nov 28, 2023
26bdaa0
Add scope check and better error handling
Charl1996 Nov 28, 2023
612616f
Update error message
Charl1996 Nov 28, 2023
3281a2b
Pull hq requests out into class
Charl1996 Dec 1, 2023
08b91f6
Add subscriber task and move models to separate file
Charl1996 Dec 4, 2023
405c266
Add require_oauth decorator
Charl1996 Dec 4, 2023
70b280f
Fix accessor of response
Charl1996 Dec 4, 2023
fdaedb7
Minor changes
Charl1996 Dec 4, 2023
934ca9c
Add scope back to Token and add grant types on client
Charl1996 Dec 5, 2023
b0480b3
Undo my hack
Charl1996 Dec 5, 2023
19df6d0
Invoke celery task to subscribe to datasource
Charl1996 Dec 5, 2023
b746cc6
Fix circular import
Charl1996 Dec 6, 2023
7fdfff1
Add BASE_URL to config
Charl1996 Dec 6, 2023
e69828f
Some fixes
Charl1996 Dec 6, 2023
0bbbee6
Append test_download_datasource
Charl1996 Dec 6, 2023
7f5ec64
Add require_oauth decorator to endpoint
Charl1996 Dec 7, 2023
4279f4c
Remove breakpoint
Charl1996 Dec 7, 2023
3338ffe
Uncomment presumably useful line
Charl1996 Dec 7, 2023
b6dd2a4
models.py: Support Python 3.8
kaapstorm Dec 8, 2023
a838483
`setup_hq_db()` assumes test DB is Postgres
kaapstorm Dec 8, 2023
2250eb7
Add comments to clarify methods
Charl1996 Dec 8, 2023
5e919ca
Make use of timedelta days instead of seconds
Charl1996 Dec 8, 2023
c6f6bac
Use more secure method for generating a client_secret
Charl1996 Dec 8, 2023
43cd6b6
Increase length of client_secret
Charl1996 Dec 8, 2023
8524beb
Merge pull request #30 from dimagi/cs/SC-3069-subscribe-to-ds-changes
kaapstorm Dec 8, 2023
950077e
Revert "Add `blinker` as a dependency"
kaapstorm Dec 8, 2023
f0a820a
Move DataSetChangeAPI to api module
kaapstorm Dec 13, 2023
faa8269
--wip-- Troubleshooting: Upgrade Superset to master
kaapstorm Dec 13, 2023
1a3679d
--wip-- Config, and test nitpicks
kaapstorm Jan 25, 2024
76ce919
Merge branch 'master' into nh/webhook
kaapstorm Jan 29, 2024
8a9e263
Fix tests
kaapstorm Jan 29, 2024
fa34441
Merge branch 'master' into nh/webhook
kaapstorm Feb 9, 2024
e4cdb41
Document rookie mistake
kaapstorm Feb 16, 2024
571c3bb
Better error message
kaapstorm Feb 16, 2024
9765272
"CommCare HQ" has a space
kaapstorm Feb 16, 2024
afb1ee6
Give an error message if unable to connect to HQ
kaapstorm Feb 19, 2024
bc06e03
Set HQ Data URI in `superset_config.py`
kaapstorm Feb 19, 2024
f287af4
isort
kaapstorm Feb 19, 2024
1154a80
Use SQLAlchemy bind for HQ Data db
kaapstorm Feb 19, 2024
73e32d4
Add migration
kaapstorm Feb 19, 2024
ef72e21
Add how to create a migration to docs
kaapstorm Feb 19, 2024
54f5d31
Call `subscribe_to_hq_datasource()` synchronously
kaapstorm Feb 20, 2024
00c244c
`abort()` requires a code
kaapstorm Feb 20, 2024
b4502e0
Dependency inversion: Create services layer
kaapstorm Feb 20, 2024
76ac389
Dependency inversion: Move util to model method
kaapstorm Feb 21, 2024
801b962
Move URL functions into a module
kaapstorm Feb 20, 2024
4aea365
Use symmetric encryption for client secrets
kaapstorm Feb 20, 2024
f8b13eb
Don't hardcode URLs
kaapstorm Feb 21, 2024
b59dd0f
Move OAuth2 server code to its own module
kaapstorm Feb 24, 2024
d81eba0
A few nits
kaapstorm Feb 24, 2024
c2a43e8
Simplify check_client_secret()
kaapstorm Feb 24, 2024
1a52c35
Add `AUTHLIB_INSECURE_TRANSPORT` to README.md
kaapstorm Feb 24, 2024
1d6c6e4
Avoid Celery pickle error
kaapstorm Feb 24, 2024
1f7dc4b
Drop `get_explore_database()`
kaapstorm Feb 24, 2024
133b7ee
Fix DataSetChange
kaapstorm Feb 24, 2024
b2aa3cb
Add API endpoints to domain-excluded views
kaapstorm Feb 24, 2024
e66e002
Simplify update_dataset()
kaapstorm Feb 24, 2024
83538b7
Cast data to column types
kaapstorm Feb 24, 2024
a11b348
Small refactor `update_dataset()` fixes insert
kaapstorm Feb 24, 2024
b375cb5
Don't bother with non-pk index column
kaapstorm Feb 24, 2024
e7d921e
Fix decorator
kaapstorm Feb 24, 2024
8a91695
Token extends OAuth2TokenMixin
kaapstorm Feb 25, 2024
77 changes: 50 additions & 27 deletions README.md
@@ -6,9 +6,14 @@ This is a Python package that integrates Superset and CommCare HQ.
Local Development
-----------------

Follow below instructions.
### Preparing CommCare HQ

### Setup env
The 'User configurable reports UI' feature flag must be enabled for the
domain in CommCare HQ, even if the data sources to be imported were
created by Report Builder, not a UCR.


### Setting up a dev environment

While doing development on top of this integration, it's useful to
install this via the `pip install -e` option so that any changes made get reflected
@@ -51,11 +56,12 @@ directly without another `pip install`.
Read through the initialization instructions at
https://superset.apache.org/docs/installation/installing-superset-from-scratch/#installing-and-initializing-superset.

Create the database. These instructions assume that PostgreSQL is
running on localhost, and that its user is "commcarehq". Adapt
accordingly:
Create a database for Superset, and a database for storing data from
CommCare HQ. Adapt the username and database names to suit your
environment.
```bash
$ createdb -h localhost -p 5432 -U commcarehq superset_meta
$ createdb -h localhost -p 5432 -U postgres superset
$ createdb -h localhost -p 5432 -U postgres superset_hq_data
```

Set the following environment variables:
@@ -64,10 +70,11 @@ $ export FLASK_APP=superset
$ export SUPERSET_CONFIG_PATH=/path/to/superset_config.py
```

Initialize the database. Create an administrator. Create default roles
Initialize the databases. Create an administrator. Create default roles
and permissions:
```bash
$ superset db upgrade
$ superset db upgrade --directory hq_superset/migrations/
$ superset fab create-admin
$ superset load_examples # (Optional)
$ superset init
@@ -78,27 +85,8 @@ You should now be able to run superset using the `superset run` command:
```bash
$ superset run -p 8088 --with-threads --reload --debugger
```
However, OAuth login does not work yet as hq-superset needs a Postgres
database created to store CommCare HQ data.


### Create a Postgres Database Connection for storing HQ data

- Create a Postgres database. e.g.
```bash
$ createdb -h localhost -p 5432 -U commcarehq hq_data
```
- Log into Superset as the admin user created in the Superset
installation and initialization. Note that you will need to update
`AUTH_TYPE = AUTH_DB` to log in as admin user. `AUTH_TYPE` should be
otherwise set to `AUTH_OAUTH`.
- Go to 'Data' -> 'Databases' or http://127.0.0.1:8088/databaseview/list/
- Create a database connection by clicking '+ DATABASE' button at the top.
- The name of the DISPLAY NAME should be 'HQ Data' exactly, as this is
the name by which this codebase refers to the Postgres DB.

OAuth integration should now be working. You can log in as a CommCare
HQ web user.
You can now log in as a CommCare HQ web user.


### Importing UCRs using Redis and Celery
@@ -129,6 +117,41 @@ code you want to test will need to be in a module whose dependencies
don't include Superset.


### Creating a migration

You will need to create an Alembic migration for any new SQLAlchemy
models that you add. The Superset CLI should allow you to do this:

```shell
$ superset db revision --autogenerate -m "Add table for Foo model"
```

However, problems with this approach have occurred in the past. You
might have more success by using Alembic directly. You will need to
modify the configuration a little to do this:

1. Copy the "HQ_DATA" database URI from `superset_config.py`.

2. Paste it as the value of `sqlalchemy.url` in
`hq_superset/migrations/alembic.ini`.

3. Edit `env.py` and comment out the following lines:
```
hq_data_uri = current_app.config['SQLALCHEMY_BINDS'][HQ_DATA]
decoded_uri = urllib.parse.unquote(hq_data_uri)
config.set_main_option('sqlalchemy.url', decoded_uri)
```

Those changes will allow Alembic to connect to the "HQ Data" database
without the need to instantiate Superset's Flask app. You can now
autogenerate your new table with:

🥳


```shell
$ cd hq_superset/migrations/
$ alembic revision --autogenerate -m "Add table for Foo model"
```
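
The `env.py` lines quoted in step 3 read the HQ Data URI from Superset's `SQLALCHEMY_BINDS` setting and URL-decode it before passing it to Alembic. As a rough illustration, the matching part of `superset_config.py` might look something like the sketch below. The host, user and percent-encoded password are placeholders, not values from this PR; the database names follow the `createdb` commands earlier in the README.

```python
import urllib.parse

# Same value as hq_superset/const.py in this PR.
HQ_DATA = "HQ Data"

# Superset's own metadata database (placeholder credentials).
SQLALCHEMY_DATABASE_URI = "postgresql://postgres:s3cr%40t@localhost:5432/superset"

# Extra bind for data imported from CommCare HQ, keyed by HQ_DATA.
SQLALCHEMY_BINDS = {
    HQ_DATA: "postgresql://postgres:s3cr%40t@localhost:5432/superset_hq_data",
}

# What the quoted env.py lines do with the bind:
hq_data_uri = SQLALCHEMY_BINDS[HQ_DATA]
decoded_uri = urllib.parse.unquote(hq_data_uri)  # "%40" becomes "@"
print(decoded_uri)
```

When using Alembic directly, the URI pasted into `alembic.ini` plays the role of `decoded_uri` here.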


Upgrading Superset
------------------

6 changes: 4 additions & 2 deletions hq_superset/__init__.py
@@ -8,10 +8,12 @@ def flask_app_mutator(app):
    # Import the views (which assumes the app is initialized) here
    # return
    from superset.extensions import appbuilder
    from . import api, hq_domain, views

    from . import hq_domain, views
    appbuilder.add_view(views.HQDatasourceView, 'Update HQ Datasource', menu_cond=lambda *_: False)
    appbuilder.add_view(views.SelectDomainView, 'Select a Domain', menu_cond=lambda *_: False)
    appbuilder.add_api(api.OAuth)
    appbuilder.add_api(api.DataSetChangeAPI)
    app.before_request_funcs.setdefault(None, []).append(
        hq_domain.before_request_hook
    )
@@ -40,4 +42,4 @@ def override_jinja2_template_loader(app):
'images'
))
blueprint = Blueprint('Static', __name__, static_url_path='/static/images', static_folder=images_path)
app.register_blueprint(blueprint)
app.register_blueprint(blueprint)
126 changes: 126 additions & 0 deletions hq_superset/api.py
@@ -0,0 +1,126 @@
import json
from datetime import datetime, timedelta
from http import HTTPStatus

from authlib.integrations.flask_oauth2 import (
    AuthorizationServer,
    ResourceProtector,
)
from authlib.oauth2.rfc6749 import grants
from authlib.oauth2.rfc6750 import BearerTokenValidator
from flask import jsonify, request
from flask_appbuilder.api import BaseApi, expose
from flask_appbuilder.baseviews import expose_api
from sqlalchemy.orm.exc import NoResultFound
from superset import db
from superset.extensions import appbuilder, csrf
from superset.superset_typing import FlaskResponse
from superset.views.base import handle_api_exception, json_error_response

from .models import DataSetChange, HQClient, Token
from .utils import update_dataset

require_oauth = ResourceProtector()
app = appbuilder.app


def query_client(client_id):
    return HQClient.get_by_client_id(client_id)


def save_token(token, request):
    client = request.client
    client.revoke_tokens()

    expires_at = datetime.utcnow() + timedelta(days=1)
    tok = Token(
        client_id=client.client_id,
        expires_at=expires_at,
        access_token=token['access_token'],
        token_type=token['token_type'],
        scope=client.domain,
    )
    db.session.add(tok)
    db.session.commit()


class HQBearerTokenValidator(BearerTokenValidator):
    def authenticate_token(self, token_string):
        return db.session.query(Token).filter_by(access_token=token_string).first()


require_oauth.register_token_validator(HQBearerTokenValidator())

authorization = AuthorizationServer(
    app=app,
    query_client=query_client,
    save_token=save_token,
)
authorization.register_grant(grants.ClientCredentialsGrant)


class OAuth(BaseApi):

    def __init__(self):
        super().__init__()
        self.route_base = "/oauth"

    @expose("/token", methods=('POST',))
    def issue_access_token(self):
        try:
            response = authorization.create_token_response()
        except NoResultFound:
            return jsonify({"error": "Invalid client"}), 401

        if response.status_code >= 400:
            return response

        data = json.loads(response.data.decode("utf-8"))
        return jsonify(data)


class DataSetChangeAPI(BaseApi):
    """
    Accepts changes to datasets from CommCare HQ data forwarding
    """

    MAX_REQUEST_LENGTH = 10_485_760  # reject >10MB JSON requests

    def __init__(self):
        self.route_base = '/hq_webhook'
        self.default_view = 'post_dataset_change'
        super().__init__()

    # http://localhost:8088/hq_webhook/change/
    @expose_api(url='/change/', methods=('POST',))
    @handle_api_exception
    @csrf.exempt
    @require_oauth
    def post_dataset_change(self) -> FlaskResponse:
        if request.content_length > self.MAX_REQUEST_LENGTH:
            return json_error_response(
                HTTPStatus.REQUEST_ENTITY_TOO_LARGE.description,
                status=HTTPStatus.REQUEST_ENTITY_TOO_LARGE.value,
            )

        try:
            request_json = json.loads(request.get_data(as_text=True))
            change = DataSetChange(**request_json)
            update_dataset(change)


Nitty: Thoughts on moving `update_dataset` into `DataSetChange` and changing the name to `DataSet`, so that the object represents a dataset, much like Django's models represent a row?

            return self.json_response(
                'Request accepted; updating dataset',
                status=HTTPStatus.ACCEPTED.value,
            )
        except json.JSONDecodeError:
            return json_error_response(
                'Invalid JSON syntax',
                status=HTTPStatus.BAD_REQUEST.value,
            )
        except (TypeError, ValueError) as err:
            return json_error_response(
                str(err),
                status=HTTPStatus.BAD_REQUEST.value,
            )
        # `@handle_api_exception` will return other exceptions as JSON
        # with status code 500, e.g.
        # {"error": "CommCare HQ database missing"}
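
A possible follow-up on the size check in `post_dataset_change()`: Flask's `request.content_length` is `None` when the client sends no `Content-Length` header, and comparing `None` with an integer raises `TypeError` in Python 3. Because the check sits outside the `try` block, that error would presumably be handled by `@handle_api_exception` rather than returned as a 4xx response. Below is a minimal sketch of a more defensive check. `exceeds_limit()` is a hypothetical helper, not part of this PR, and rejecting requests without a declared length is an assumption about the desired behaviour.

```python
from typing import Optional

MAX_REQUEST_LENGTH = 10_485_760  # same value as DataSetChangeAPI.MAX_REQUEST_LENGTH


def exceeds_limit(
    content_length: Optional[int],
    limit: int = MAX_REQUEST_LENGTH,
) -> bool:
    """Return True if the declared size is missing or over the limit."""
    if content_length is None:
        # No Content-Length header: refuse instead of comparing None to an int.
        return True
    return content_length > limit


print(exceeds_limit(None))
print(exceeds_limit(1_024))
print(exceeds_limit(MAX_REQUEST_LENGTH + 1))
```

In the view, the guard would then read `if exceeds_limit(request.content_length, self.MAX_REQUEST_LENGTH):`, with the existing 413 response unchanged.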
2 changes: 2 additions & 0 deletions hq_superset/const.py
@@ -0,0 +1,2 @@
# The name of the database for storing data related to CommCare HQ
HQ_DATA = "HQ Data"
1 change: 1 addition & 0 deletions hq_superset/hq_domain.py
@@ -36,6 +36,7 @@ def is_user_admin():
    return security_manager.is_admin()



def ensure_domain_selected():
    # Check if a hq_domain cookie is set
    # Ensure necessary roles, permissions and DB schemas are created for the domain
48 changes: 48 additions & 0 deletions hq_superset/hq_requests.py
@@ -0,0 +1,48 @@
import superset
from hq_superset.oauth import get_valid_cchq_oauth_token


class HqUrl:
    @classmethod
    def datasource_export_url(cls, domain, datasource_id):
        return f"a/{domain}/configurable_reports/data_sources/export/{datasource_id}/?format=csv"

    @classmethod
    def datasource_list_url(cls, domain):
        return f"a/{domain}/api/v0.5/ucr_data_source/"

    @classmethod
    def datasource_details_url(cls, domain, datasource_id):
        return f"a/{domain}/api/v0.5/ucr_data_source/{datasource_id}/"

    @classmethod
    def subscribe_to_datasource_url(cls, domain, datasource_id):
        return f"a/{domain}/configurable_reports/data_sources/subscribe/{datasource_id}/"


class HQRequest:

    def __init__(self, url):
        self.url = url

    @property
    def oauth_token(self):
        return get_valid_cchq_oauth_token()

    @property
    def commcare_provider(self):
        return superset.appbuilder.sm.oauth_remotes["commcare"]

    @property
    def api_base_url(self):
        return self.commcare_provider.api_base_url

    @property
    def absolute_url(self):
        return f"{self.api_base_url}{self.url}"

    def get(self):
        return self.commcare_provider.get(self.url, token=self.oauth_token)

    def post(self, data):
        return self.commcare_provider.post(self.url, data=data, token=self.oauth_token)
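
A note on `absolute_url`: the `HqUrl` methods return paths without a leading slash, and `absolute_url` concatenates them directly onto `api_base_url`, so the result is only well-formed when the configured base URL ends with `/`. The sketch below illustrates this with plain strings. `https://hq.example.com` is a placeholder and `join_url()` is a hypothetical helper, neither of which appears in this PR.

```python
def datasource_list_url(domain: str) -> str:
    # Mirrors HqUrl.datasource_list_url(): a relative path, no leading slash.
    return f"a/{domain}/api/v0.5/ucr_data_source/"


def join_url(base_url: str, path: str) -> str:
    """Hypothetical helper: join base and path with exactly one slash."""
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}"


path = datasource_list_url("demo")

# Plain concatenation, as in HQRequest.absolute_url, with a base URL
# that lacks the trailing slash:
plain_concat = f"https://hq.example.com{path}"
print(plain_concat)

# The helper gives the same result with or without the trailing slash:
print(join_url("https://hq.example.com", path))
print(join_url("https://hq.example.com/", path))
```

Whether a helper like this is worthwhile depends on how `api_base_url` is configured for the "commcare" provider; documenting the trailing-slash requirement may be enough.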
1 change: 1 addition & 0 deletions hq_superset/migrations/README
@@ -0,0 +1 @@
Generic single-database configuration.