
[Feature] Ability to add labels to each job (not table/model) based on profiles.yml property #1366

Open
3 tasks done
moseleyi opened this issue Oct 9, 2024 · 3 comments
Labels
enhancement New feature or request

Comments


moseleyi commented Oct 9, 2024

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt-bigquery functionality, rather than a Big Idea better suited to a discussion

Describe the feature

I would like to use a single service account connection to BigQuery. The problem is that the logs would then not show which person actually ran dbt. dbt already adds dbt_invocation_id as a label to all queries, and I would like to be able to configure a label in profiles.yml that is also added to all queries.

def raw_execute(
        self,
        sql,
        use_legacy_sql=False,
        limit: Optional[int] = None,
        dry_run: bool = False,
    ):
        conn = self.get_thread_connection()
        client = conn.handle

        fire_event(SQLQuery(conn_name=conn.name, sql=sql, node_info=get_node_info()))

        labels = self.get_labels_from_query_comment()

        labels["dbt_invocation_id"] = get_invocation_id()

        job_params = {
            "use_legacy_sql": use_legacy_sql,
            "labels": labels,
            "dry_run": dry_run,
        }

I found this code, which is where the labels are added. Imagine we add a labels property in profiles.yml:

project:
  method: service_account
  threads: 4
  ...
  labels:
    dbt_user: "somebody"

If this were added to the job labels, I could differentiate between people in Logs Explorer in GCP. I wouldn't have to use ADC or other short-lived credentials, or create a separate service account for each user.
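To make the proposal concrete, here is a minimal sketch of how the merge could work. It is not the actual dbt-bigquery implementation: `build_job_labels` and its `profile_labels` argument are hypothetical names, and I am assuming the profile labels arrive as a plain dict parsed from profiles.yml. Profile labels are applied first, so the labels dbt already sets take precedence if a key clashes.

```python
from typing import Dict, Optional


def build_job_labels(
    query_comment_labels: Dict[str, str],
    invocation_id: str,
    profile_labels: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Merge hypothetical profile-level labels into the labels for one job."""
    # Start from the profile labels so that anything dbt sets later wins on a clash.
    labels: Dict[str, str] = dict(profile_labels or {})
    # Labels derived from the query comment (what raw_execute already collects).
    labels.update(query_comment_labels)
    # dbt always stamps the invocation id, so it is written last.
    labels["dbt_invocation_id"] = invocation_id
    return labels


# Example values are made up for illustration.
job_labels = build_job_labels(
    query_comment_labels={"dbt_node": "my_model"},
    invocation_id="1234-abcd",
    profile_labels={"dbt_user": "somebody"},
)
print(job_labels)
```

With something like this, the resulting dict could be passed as the "labels" entry of job_params, in place of the labels built in raw_execute today.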

Describe alternatives you've considered

Creating a fork of the BigQuery connector and adding it myself.

Who will this benefit?

Anybody using BigQuery with a service account connection who would still like user-level details in the logs, or who wants to add any other labels to all queries.

Are you interested in contributing this feature?

Yes

Anything else?

No response

@moseleyi moseleyi added enhancement New feature or request triage labels Oct 9, 2024
@moseleyi moseleyi changed the title [Feature] Ability to add project-level labels to each job [Feature] Ability to add labels to each job (not table/model) Oct 9, 2024
@moseleyi moseleyi changed the title [Feature] Ability to add labels to each job (not table/model) [Feature] Ability to add labels to each job (not table/model) based on profiles.yml property Oct 9, 2024
@amychen1776

Hello @moseleyi, thank you for opening this issue! I'm curious why you would want to use only one service account to authenticate to BQ. This does not align with our best practices, especially for security.

@moseleyi

Authentication would still use ADC, but the permissions are granted via a service account. GCP allows user accounts to be used for authentication and service accounts for permissions; this is called service account impersonation. It's a bridge between having multiple user accounts and having one service account. The first can become clunky because you have to set permissions for each user; the second, a single shared service account, is not compliant with financial regulators.

I want to add labels so that the GCP logs show the name of the user running the queries. Unfortunately, with ADC plus impersonation, it is the service account email that shows up in the logs.

https://cloud.google.com/docs/authentication/use-service-account-impersonation
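One practical detail if the label value is a user identity: as far as I know, BigQuery label values may only contain lowercase letters, digits, underscores and dashes (international characters aside) and are limited to 63 characters, so an email address cannot be used verbatim. A rough sketch of a normalizer follows; `to_label_value` is a hypothetical helper, not something dbt-bigquery provides, and the replacement character is an arbitrary choice.

```python
import re

# Assumed BigQuery limit on the length of a label value.
MAX_LABEL_LENGTH = 63


def to_label_value(identity: str) -> str:
    """Turn a user identity such as an email into a label-safe value."""
    lowered = identity.strip().lower()
    # Replace every character outside [a-z0-9_-] with a dash.
    cleaned = re.sub(r"[^a-z0-9_-]", "-", lowered)
    # Truncate to the assumed maximum label length.
    return cleaned[:MAX_LABEL_LENGTH]


print(to_label_value("Jane.Doe@example.com"))  # jane-doe-example-com
```

The value produced this way could then be what each user puts under labels in their own profiles.yml.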


amychen1776 commented Oct 30, 2024

Thank you for that explanation! That was very helpful. I will take this into consideration, but I will not be able to prioritize it for now.
