Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] First iteration of gitStream configuration #10305

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
81 changes: 81 additions & 0 deletions .cm/gitstream.cm
Original file line number Diff line number Diff line change
@@ -0,0 +1,81 @@
# -*- mode: yaml -*-
manifest:
version: 1.0

automations:
# Enforces a certain format on PR titles
enforce_pr_title:
Copy link
Member

@oliviertassinari oliviertassinari Sep 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to build GitHub Actions directly? It could look like in https://github.com/mnajdova/github-action-required-labels/blob/main/index.js or https://github.com/mui/mui-public/blob/master/.github/workflows/ensure-triage-label.yml#L21.

I mean, what's the need for an abstraction? Does GitHub Actions has a similar level of abstraction APIs but with one less layer of indirection?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could definitely build our own actions this way. gitStream does provide a few very unique features, but those could be replicated by a proprietary solution for sure

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main benefit form gitStream is that it allows us to collect metrics from PRs.

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main benefit form gitStream is that it allows us to collect metrics from PRs.

So it sound unrelated to the PR titles convention problem?

What I like about https://github.com/mnajdova/github-action-required-labels/blob/main/index.js#L39 is how the error messages are clear. The goal is so that the community doesn't struggle when it fails. If we can have a clear experience, great. If it's too much effort with gitstream, I think it shouldn't be implemented with it

Copy link

@BenLloydPearson BenLloydPearson Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this feedback about using gitStream vs. GitHub Actions, we would love to hear anything else that you come across. I think you'll find gitStream to be significantly more flexible and easy to maintain vs. build separate GitHub Actions for each of the automations you need, but I want to know if this is not your experience. Related to this, we have a blog post that breaks down many of the use cases we share in common with GitHub Actions, so that might help you better understand how we stack up against them. We are also planning to create a more detailed feature breakdown to provide a better comparison.

This example might help with your evaluation for a required labels automation: Enforce Required Labels

It's also fairly easy to configure error messages using custom expressions. For example, here is an automation that detects the use of deprecated APIs and posts related information as a comment. You could adapt this to alert users about specific labels, PR title naming conventions, or anything else.

if:
- {{ pr.title | match(regex=titlePolicy.titleRegex) | nope }}
Comment on lines +8 to +9
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I think we should have a clear error message, I doubt a regex able to provide a fine grain level.
  • I think it should fail if there is a leading dot.
  • I think it should fail if there isn't a space after the closing ].
  • I think it can work with allowing anything inside the [ ] out of simplicity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this essentially fail all release and dependencies bumping PRs? 🤔
Maybe we could exclude those here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dep, yes. For the release we can start doing [release] vX.X.X but it's too much. Can we just exclude those from cases?

Copy link
Member Author

@michelengelen michelengelen Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed the regex for that.
It now allows following patterns:

### matching
v8.211.2
v1.2.3-alpha-release
Bump Package to v1.2.34
[pickers] Expressive PR summary

### not matching
.[pickers] Expressive PR summary
[pickers]Expressive PR summary
[] Expressive PR summary
(pickers) Expressive PR summary
[pickers]: Expressive PR summary

v1.2
release v1.2.3
v12.23.343434.3333

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could change this to be "commitMessageAction": "[dependencies] Bump",:

"commitMessageAction": "Bump",

We could also add [release] for releases, the core does this from time to time: https://github.com/mui/material-ui/pulls?q=is%3Apr+label%3Arelease+is%3Aclosed.

No real preferences.

Copy link

@BenLloydPearson BenLloydPearson Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want custom error messages for PR titles, you may be able to accomplish that with a custom expression like this:

requirements:
  - name: version
  regex: {{foo.regex}}
  error: You are missing a version number
  - name: picker
  regex: {{bar.regex}}
  error: You are missing a picker label.

Then, you can use a for loop to iterate through each one of these to create a list of errors that can be posted as a comment. Lmk if you need help creating this.

run:
- action: request-changes@v1
args:
comment: |
All PRs must be titled according to our semantic naming policy: `[Area]: <short summary>`
michelengelen marked this conversation as resolved.
Show resolved Hide resolved

Area must be one of the following:
* DataGrid
* Pickers
* Charts
* TreeView
* Chore
Copy link
Contributor

@romgrk romgrk Sep 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also use "docs" and "core" in lowercase. I think we've been using "core" as "chore" also.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Pickers] is written [pickers] because its represents all the pickers component, not one component in particular.
This is to be coherent with the labels.

So I would be in favor of using [pickers], [charts].
For the grid, we can use either [datagrid], [data-grid] or [DataGrid] because there is only one component so it's slightly different (same for the TreeView).

Imho, always using non-capitalized names is more coherent, but I don't have a strong preference.

On the pickers, we are also using [fields] to talk about the keyboard editing components, but I'm fine simplifying and always using [pickers].

We also have some specific prefix like [test] or [github], but using [core] everytime is also fine for me.

We also have [l10n], and for that one I think it's a good idea to use the set of component, it would clarify what is the impacted component.

We also have [codemod] that we should probably keep since it's a standalone package.

And finally, some PR's don't have any prefix, like the release version (named vX.X.X) and the renovate PRs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be in favor of going all lowercase for consistency. Also feels a bit less noisy to me. smashedlowercase seems ok, we don't have complex/long names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even just [grid] would be fine, less noisy, and more place for the title.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[datagrid] would be better. There is a component grid in the core so it might create confusion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No hard preference, but to me, for a specific React component, like [DataGrid] or [TreeView] using the capitalized names make more sense, as it would be inline with the same convention React uses for component names, but when it refers to a more abstract set of components like [pickers], using non-capitalized seem more logical.

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried to document the convention used in the past in https://www.notion.so/mui-org/GitHub-PRs-7112d03a6c4346168090b29a970c0154?pvs=4#258c9e963dc445348562a52a62052941.

  • [pickers]
  • [date picker]
  • [tree view]
  • [TreeItem]

Are valid under this model, the rest is not. We might want to change, going from:

Screenshot 2023-09-15 at 18 01 52

to:

Screenshot 2023-09-15 at 18 02 00

# Add a label that indicates how many minutes it will take to review the PR.
estimated_time_to_review:
if:
- true
run:
- action: add-label@v1
args:
label: "{{ calc.etr }} min review"
LukasTy marked this conversation as resolved.
Show resolved Hide resolved
color: {{ colors.error if (calc.etr >= 20) else (colors.warning if (calc.etr >= 5) else colors.success) }}
# Post a comment that lists the best experts for the files that were modified.
explain_code_experts:
if:
- { { pr.labels | match(term='Auto assign') | some } }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if the reviewer would be assigned only from within the organization?
Most of our community contributions are for localization. Imagine a case, where some community member introduces a new locale (translations), and later, someone amends it.
Would this tool would probably assign the author of the file as the reviewer? 🤔

It's not a blocker as this is an opt-in option, but just an idea if we ever want to enable this by default. 🤔

P.S. Correct me if I'm wrong, but would this action assign a reviewer? In one place it says that it does, in another, there is no mention of it... 🤷

Copy link
Member

@mnajdova mnajdova Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to configure load balancing of requests of reviews? I am afraid that we may end up with requesting review from only 2-3 team members (the ones that are working on the repository longest). Or can we have a max number of request of review someone can have at certain point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great question and I definitely do not have an answer to that as of now. I will dig the docs to find out and contact the gitStream devs 👍🏼

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand gitStream, it doesn't support the auto-assign logic for issues? We might need to developer something custom for issues at which point we could reapply the same logic with PRs?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are lots of great questions in this thread, I'll try to answer all of them.

Do you know if the reviewer would be assigned only from within the organization?

I will double-check on this, but I'm not sure if this is possible yet. If not, we may be able to prioritize adding this capability.

Would this tool would probably assign the author of the file as the reviewer?

Generally speaking, yes, if you use the codeExperts Filter function. This returns up to 2 potential reviewers based on past activity, excluding the PR author. However, you can configure gitStream to treat specific resources differently. For example, you could configure a different set of rules for translation resources that assign a specific team or individual.

P.S. Correct me if I'm wrong, but would this action assign a reviewer? In one place it says that it does, in another, there is no mention of it...

This is correct, the current config we have here wouldn't assign a reviewer, but that's just a matter of using the add-reviewers automation action.

Is there a way to configure load balancing of requests of reviews?

There are a few ways to handle this:

  1. you can assign a team to review the PR and rely on GitHub's built-in configuration to determine who gets assigned (there are options for round robin and load balance).
  2. You can use intersection to compare the list returned from Code Experts to a pre-defined list of reviewers. You could also set a maximum threshold for Code Experts to avoid assigning reviews to people who mostly contribute new code.
  3. You can use the nunjucks random function to assign a reviewer from a pre-defined list.

As I understand gitStream, it doesn't support the auto-assign logic for issues?

Correct, gitStream lacks any integrations with GitHub Issues. Its something we expect to add in the future, but have been waiting for an excuse to prioritize it. If you can share more details about this use case we can figure out what it would take to put this on the roadmap.

run:
- action: explain-code-experts@v1
args:
gt: 10
# Assign a random reviewer when the 'Share knowledge' label gets added
share_knowledge:
if:
- { { pr.labels | match(term='Share Knowledge') | some } }
run:
- action: add-reviewers@v1
args:
reviewers: { { repo | codeExperts(gt=30, lt=60) | random } }
- action: add-comment@v1
args:
comment: |
gitStream has assigned a reviewer to increase knowledge sharing on this PR.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very curious to see how gitStream can assign a relevant reviewer depending on the modified files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

me too

the way it works is that it looks at the git log and determines this based on the number of added and removed lines in the related files

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One limit of this approach is that refactoring can totally wipe old contributions (if the content is moved from file to file), but I guess that it must work well enough to be used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can select the pool of people to chose at random for that would be ideal as people might switch teams but still be invited to review PRs from components they no longer maintain.

Copy link
Member Author

@michelengelen michelengelen Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats an interesting approach ... I probably need to dig the docs about this as i have not seen a mention of restricting the pool to choose from. Maybe I'll need to contact the devs about that

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think who we ask for reviews is a key element to implementing https://www.notion.so/mui-org/Product-owners-836c72feffcc4cb2b735b51527e6991a. So I think that PR reviewer's assignment should have the very first criterion as the GitHub label used.

Following https://www.notion.so/mui-org/GitHub-community-issues-PRs-Tier-1-12a84fdf50e44595afc55343dac00fca?pvs=4#c6b06804e0ac40c3aa2b5b5c16b202bf, then it's about finding the list of maintainers, and lastly, about figuring out among this pool of maintainers would be the most relevant reviewers, balancing people who already have the context, with the need to increase codebase awareness for the other maintainers.

In this case, we are talking about { { pr.labels | match(term='Share Knowledge') | some } } so it's different, but I'm not sure to understand the use case for this Share Knowledge label. The product owners model integrates the notion of knowledge sharing by having one owner and maintainers, the maintainers after the first 2 as like associates for that scope.

Maybe it's more for purely technical topics, non product related? (which is maybe 5% of the PRs)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like step one is identifying the correct reviewers for PRs, so I recommend checking out the options we highlight on review assignment. In particular, you can create watchlists to notify teams or individuals based on modified resources, assign someone from the author's team, or flag specific locations for additional required reviews.

From there, you can create additional automations to apply labels based on the people who are assigned to the PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maintainers list per GitHub labels is available through a Notion database API call.

# Apply a label that indicates when a PR deletes files
# This uses the `has` custom expression found at the bottom of this file
label_deleted_files:
Copy link
Member

@oliviertassinari oliviertassinari Sep 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is deleting files important enough to have label?

Suggested change
label_deleted_files:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does not happen too often that files are deleted ... this should just raise awareness that this should be taken into account while reviewing.

I am 5/10 that this is really proving to be helpful myself, so i am not against removing it

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm asking because I can't connect deleting a file to a cases in the past where I wish I didn't miss it.

I think that the more labels a PR has, the more likely one of its label will be missed, as we don't have the mental capacity to read a large number of labels. So to consider the tradeoff. It could backfire.

if:
- { { has.deleted_files } }
run:
- action: add-label@v1
args:
label: '⚠️ deleted-files'
color: colors.warning

# This is used in the `label_deleted_files` automation
has:
deleted_files: { { source.diff.files | map(attr='new_file') | match(term='/dev/null') | some } }

# The next function calculates the estimated time to review and makes it available in the automation above.
calc:
etr: {{ branch | estimatedReviewTime }}

# define some colors to be used in the labels
colors:
error: '#f44336' # mui palette error.main
warning: '#ffa726' # mui palette warning.main
success: '#66bb6a' # mui palette success.main
info: '#29b6f6' # mui palette info.main
primary: '#e3f2fd' # mui palette primary.main

# define the regex the PR title is tested against to enforce a title format
titlePolicy:
titleRegex: /\[(DataGrid|Pickers|Charts|TreeView)\]:\s*\w+.*/gm

49 changes: 49 additions & 0 deletions .github/workflows/gitstream.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
# Code generated by gitStream GitHub app - DO NOT EDIT

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fyi, I could be wrong, but I believe this file should end in .yml, not .yaml.


name: gitStream workflow automation
run-name: |
/:\ gitStream: PR #${{ fromJSON(fromJSON(github.event.inputs.client_payload)).pullRequestNumber }} from ${{ github.event.inputs.full_repository }}

michelengelen marked this conversation as resolved.
Show resolved Hide resolved
on:
workflow_dispatch:
inputs:
client_payload:
description: The Client payload
required: true
full_repository:
description: the repository name include the owner in `owner/repo_name` format
required: true
head_ref:
description: the head sha
required: true
base_ref:
description: the base ref
required: true
installation_id:
description: the installation id
required: false
resolver_url:
description: the resolver url to pass results to
required: true
resolver_token:
description: Optional resolver token for resolver service
required: false
default: ''

jobs:
gitStream:
timeout-minutes: 5
runs-on: ubuntu-latest
name: gitStream workflow automation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't we need explicit write permissions here to allow for the action to write comments and/or change labels? 🤔
https://github.com/mui/mui-x/blob/master/.github/workflows/no-response.yml#L19

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question ... I think the gitStream application that needs to be hooked to the org does have that permission (once approved and hooked)

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it need to be installed on the GitHub organization? I had understood that gitstream can run directly in GitHub actions, from where it gains access to the required permission.

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I wonder, is gitStream open-source? As I understand https://linearb.io/pricing "gitStream Admin Users" we have to pay $50k/year (~100 unique contributors per month) which feels disconnected from the value. I mean, whatever we improve in X should very likely benefit the other repos, so I'm looking at the org in general.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Installed

Copy link

@BenLloydPearson BenLloydPearson Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oliviertassinari These are great questions, and we need to clarify the language on the pricing page about this. I need to confirm the details, but the admin user limit only applies to GitLab integrations, which require a different installation process from GitHub. Even then, it's only a limit on the number of users who can manage the gitStream app installation, everything else can be configured by anyone with write access to the repo. I'm checking with our marketing team to confirm these details and we'll make this clearer.

To that end, gitStream is completely free for GitHub, and doesn't require any subscription fees for your organization users or admins. It's not open source, but we believe that supporting the open source community will be critical to gitStream's long-term success, so we want to support communities like this one.

LinearB pricing is independent from gitStream, so you only need to purchase the LinearB platform if you're interested in engineering productivity metrics, resource allocation, or project delivery tracking. You don't need to purchase these to use gitStream.

Update: the line about gitStream admin users on the pricing page was based on outdated assumptions about how gitStream is designed. We removed this line from the pricing page.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenLloydPearson thanks for the clarification.

steps:
- name: Evaluate Rules
uses: linear-b/gitstream-github-action@v1
id: rules-engine
with:
full_repository: ${{ github.event.inputs.full_repository }}
head_ref: ${{ github.event.inputs.head_ref }}
base_ref: ${{ github.event.inputs.base_ref }}
client_payload: ${{ github.event.inputs.client_payload }}
installation_id: ${{ github.event.inputs.installation_id }}
resolver_url: ${{ github.event.inputs.resolver_url }}
resolver_token: ${{ github.event.inputs.resolver_token }}