Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] First iteration of gitStream configuration #10305

Closed
wants to merge 6 commits into from

Conversation

michelengelen
Copy link
Member

This adds the first iteration of the gitStream workflow.

This very much serves as groundwork for a first test and following discussion around this improvement (and if it is proving to be a valid help as that)

@michelengelen michelengelen added core Infrastructure work going on behind the scenes scope: code-infra Specific to the core-infra product labels Sep 11, 2023
@michelengelen michelengelen self-assigned this Sep 11, 2023
@michelengelen michelengelen changed the title []: added first iteration of gitStream configuration [Chore]: added first iteration of gitStream configuration Sep 11, 2023
@mui-bot
Copy link

mui-bot commented Sep 11, 2023

Netlify deploy preview

Netlify deploy preview: https://deploy-preview-10305--material-ui-x.netlify.app/

Updated pages

No updates.

These are the results for the performance tests:

Test case Unit Min Max Median Mean σ
Filter 100k rows ms -210 80.1 -27.2 -70.28 95.38
Sort 100k rows ms 633.2 1,396.6 1,386.7 1,173.42 285.598
Select 100k rows ms 686.3 839.1 697.5 724.74 57.596
Deselect 100k rows ms 138.3 259.2 202.9 195.98 47.603

Generated by 🚫 dangerJS against 2b65d77

.cm/gitstream.cm Outdated
Comment on lines 17 to 21
* DataGrid
* Pickers
* Charts
* TreeView
* Chore
Copy link
Contributor

@romgrk romgrk Sep 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also use "docs" and "core" in lowercase. I think we've been using "core" as "chore" also.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Pickers] is written [pickers] because its represents all the pickers component, not one component in particular.
This is to be coherent with the labels.

So I would be in favor of using [pickers], [charts].
For the grid, we can use either [datagrid], [data-grid] or [DataGrid] because there is only one component so it's slightly different (same for the TreeView).

Imho, always using non-capitalized names is more coherent, but I don't have a strong preference.

On the pickers, we are also using [fields] to talk about the keyboard editing components, but I'm fine simplifying and always using [pickers].

We also have some specific prefix like [test] or [github], but using [core] everytime is also fine for me.

We also have [l10n], and for that one I think it's a good idea to use the set of component, it would clarify what is the impacted component.

We also have [codemod] that we should probably keep since it's a standalone package.

And finally, some PR's don't have any prefix, like the release version (named vX.X.X) and the renovate PRs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be in favor of going all lowercase for consistency. Also feels a bit less noisy to me. smashedlowercase seems ok, we don't have complex/long names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even just [grid] would be fine, less noisy, and more place for the title.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[datagrid] would be better. There is a component grid in the core so it might create confusion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No hard preference, but to me, for a specific React component, like [DataGrid] or [TreeView] using the capitalized names make more sense, as it would be inline with the same convention React uses for component names, but when it refers to a more abstract set of components like [pickers], using non-capitalized seem more logical.

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried to document the convention used in the past in https://www.notion.so/mui-org/GitHub-PRs-7112d03a6c4346168090b29a970c0154?pvs=4#258c9e963dc445348562a52a62052941.

  • [pickers]
  • [date picker]
  • [tree view]
  • [TreeItem]

Are valid under this model, the rest is not. We might want to change, going from:

Screenshot 2023-09-15 at 18 01 52

to:

Screenshot 2023-09-15 at 18 02 00

.cm/gitstream.cm Outdated
- action: add-comment@v1
args:
comment: |
gitStream has assigned a reviewer to increase knowledge sharing on this PR.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very curious to see how gitStream can assign a relevant reviewer depending on the modified files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

me too

the way it works is that it looks at the git log and determines this based on the number of added and removed lines in the related files

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One limit of this approach is that refactoring can totally wipe old contributions (if the content is moved from file to file), but I guess that it must work well enough to be used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can select the pool of people to chose at random for that would be ideal as people might switch teams but still be invited to review PRs from components they no longer maintain.

Copy link
Member Author

@michelengelen michelengelen Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats an interesting approach ... I probably need to dig the docs about this as i have not seen a mention of restricting the pool to choose from. Maybe I'll need to contact the devs about that

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think who we ask for reviews is a key element to implementing https://www.notion.so/mui-org/Product-owners-836c72feffcc4cb2b735b51527e6991a. So I think that PR reviewer's assignment should have the very first criterion as the GitHub label used.

Following https://www.notion.so/mui-org/GitHub-community-issues-PRs-Tier-1-12a84fdf50e44595afc55343dac00fca?pvs=4#c6b06804e0ac40c3aa2b5b5c16b202bf, then it's about finding the list of maintainers, and lastly, about figuring out among this pool of maintainers would be the most relevant reviewers, balancing people who already have the context, with the need to increase codebase awareness for the other maintainers.

In this case, we are talking about { { pr.labels | match(term='Share Knowledge') | some } } so it's different, but I'm not sure to understand the use case for this Share Knowledge label. The product owners model integrates the notion of knowledge sharing by having one owner and maintainers, the maintainers after the first 2 as like associates for that scope.

Maybe it's more for purely technical topics, non product related? (which is maybe 5% of the PRs)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like step one is identifying the correct reviewers for PRs, so I recommend checking out the options we highlight on review assignment. In particular, you can create watchlists to notify teams or individuals based on modified resources, assign someone from the author's team, or flag specific locations for additional required reviews.

From there, you can create additional automations to apply labels based on the people who are assigned to the PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maintainers list per GitHub labels is available through a Notion database API call.

@DanailH DanailH changed the title [Chore]: added first iteration of gitStream configuration [core]: added first iteration of gitStream configuration Sep 11, 2023
@DanailH DanailH changed the title [core]: added first iteration of gitStream configuration [core]: First iteration of gitStream configuration Sep 11, 2023
@oliviertassinari oliviertassinari changed the title [core]: First iteration of gitStream configuration [core] First iteration of gitStream configuration Sep 11, 2023
Copy link
Member

@oliviertassinari oliviertassinari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see we are iterating on the automation 👍

Maybe we should implement one workflow at a time, not sure about batching 3 here, it could be faster to review, merge.

.cm/gitstream.cm Outdated
gitStream has assigned a reviewer to increase knowledge sharing on this PR.
# Apply a label that indicates when a PR deletes files
# This uses the `has` custom expression found at the bottom of this file
label_deleted_files:
Copy link
Member

@oliviertassinari oliviertassinari Sep 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is deleting files important enough to have label?

Suggested change
label_deleted_files:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does not happen too often that files are deleted ... this should just raise awareness that this should be taken into account while reviewing.

I am 5/10 that this is really proving to be helpful myself, so i am not against removing it

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm asking because I can't connect deleting a file to a cases in the past where I wish I didn't miss it.

I think that the more labels a PR has, the more likely one of its label will be missed, as we don't have the mental capacity to read a large number of labels. So to consider the tradeoff. It could backfire.

Comment on lines +8 to +9
if:
- {{ pr.title | match(regex=titlePolicy.titleRegex) | nope }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I think we should have a clear error message, I doubt a regex able to provide a fine grain level.
  • I think it should fail if there is a leading dot.
  • I think it should fail if there isn't a space after the closing ].
  • I think it can work with allowing anything inside the [ ] out of simplicity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this essentially fail all release and dependencies bumping PRs? 🤔
Maybe we could exclude those here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dep, yes. For the release we can start doing [release] vX.X.X but it's too much. Can we just exclude those from cases?

Copy link
Member Author

@michelengelen michelengelen Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed the regex for that.
It now allows following patterns:

### matching
v8.211.2
v1.2.3-alpha-release
Bump Package to v1.2.34
[pickers] Expressive PR summary

### not matching
.[pickers] Expressive PR summary
[pickers]Expressive PR summary
[] Expressive PR summary
(pickers) Expressive PR summary
[pickers]: Expressive PR summary

v1.2
release v1.2.3
v12.23.343434.3333

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could change this to be "commitMessageAction": "[dependencies] Bump",:

"commitMessageAction": "Bump",

We could also add [release] for releases, the core does this from time to time: https://github.com/mui/material-ui/pulls?q=is%3Apr+label%3Arelease+is%3Aclosed.

No real preferences.

Copy link

@BenLloydPearson BenLloydPearson Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want custom error messages for PR titles, you may be able to accomplish that with a custom expression like this:

requirements:
  - name: version
  regex: {{foo.regex}}
  error: You are missing a version number
  - name: picker
  regex: {{bar.regex}}
  error: You are missing a picker label.

Then, you can use a for loop to iterate through each one of these to create a list of errors that can be posted as a comment. Lmk if you need help creating this.


automations:
# Enforces a certain format on PR titles
enforce_pr_title:
Copy link
Member

@oliviertassinari oliviertassinari Sep 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to build GitHub Actions directly? It could look like in https://github.com/mnajdova/github-action-required-labels/blob/main/index.js or https://github.com/mui/mui-public/blob/master/.github/workflows/ensure-triage-label.yml#L21.

I mean, what's the need for an abstraction? Does GitHub Actions has a similar level of abstraction APIs but with one less layer of indirection?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could definitely build our own actions this way. gitStream does provide a few very unique features, but those could be replicated by a proprietary solution for sure

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main benefit form gitStream is that it allows us to collect metrics from PRs.

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main benefit form gitStream is that it allows us to collect metrics from PRs.

So it sound unrelated to the PR titles convention problem?

What I like about https://github.com/mnajdova/github-action-required-labels/blob/main/index.js#L39 is how the error messages are clear. The goal is so that the community doesn't struggle when it fails. If we can have a clear experience, great. If it's too much effort with gitstream, I think it shouldn't be implemented with it

Copy link

@BenLloydPearson BenLloydPearson Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this feedback about using gitStream vs. GitHub Actions, we would love to hear anything else that you come across. I think you'll find gitStream to be significantly more flexible and easy to maintain vs. build separate GitHub Actions for each of the automations you need, but I want to know if this is not your experience. Related to this, we have a blog post that breaks down many of the use cases we share in common with GitHub Actions, so that might help you better understand how we stack up against them. We are also planning to create a more detailed feature breakdown to provide a better comparison.

This example might help with your evaluation for a required labels automation: Enforce Required Labels

It's also fairly easy to configure error messages using custom expressions. For example, here is an automation that detects the use of deprecated APIs and posts related information as a comment. You could adapt this to alert users about specific labels, PR title naming conventions, or anything else.

.cm/gitstream.cm Outdated Show resolved Hide resolved
michelengelen and others added 3 commits September 12, 2023 09:00
Co-authored-by: Olivier Tassinari <[email protected]>
Signed-off-by: Michel Engelen <[email protected]>
Co-authored-by: Olivier Tassinari <[email protected]>
Signed-off-by: Michel Engelen <[email protected]>
- title workflow now checks for start of string and allows any word in the brackets
- comment for title workflow changed to reflect the changes in the regex
@michelengelen
Copy link
Member Author

Great to see we are iterating on the automation 👍

Maybe we should implement one workflow at a time, not sure about batching 3 here, it could be faster to review, merge.

I did remove the workflows for deleted files and random assignee to clean this up a bit.
The regex is now only allowing the requested format.

I do agree that this could also be done by custom actions ... I am open to try that as well to have a "counter-POC" as ground for further discussions around this

Copy link
Member

@LukasTy LukasTy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement, let's proceed with experimenting with it. 👍

# Post a comment that lists the best experts for the files that were modified.
explain_code_experts:
if:
- { { pr.labels | match(term='Auto assign') | some } }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if the reviewer would be assigned only from within the organization?
Most of our community contributions are for localization. Imagine a case, where some community member introduces a new locale (translations), and later, someone amends it.
Would this tool would probably assign the author of the file as the reviewer? 🤔

It's not a blocker as this is an opt-in option, but just an idea if we ever want to enable this by default. 🤔

P.S. Correct me if I'm wrong, but would this action assign a reviewer? In one place it says that it does, in another, there is no mention of it... 🤷

Copy link
Member

@mnajdova mnajdova Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to configure load balancing of requests of reviews? I am afraid that we may end up with requesting review from only 2-3 team members (the ones that are working on the repository longest). Or can we have a max number of request of review someone can have at certain point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great question and I definitely do not have an answer to that as of now. I will dig the docs to find out and contact the gitStream devs 👍🏼

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand gitStream, it doesn't support the auto-assign logic for issues? We might need to developer something custom for issues at which point we could reapply the same logic with PRs?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are lots of great questions in this thread, I'll try to answer all of them.

Do you know if the reviewer would be assigned only from within the organization?

I will double-check on this, but I'm not sure if this is possible yet. If not, we may be able to prioritize adding this capability.

Would this tool would probably assign the author of the file as the reviewer?

Generally speaking, yes, if you use the codeExperts Filter function. This returns up to 2 potential reviewers based on past activity, excluding the PR author. However, you can configure gitStream to treat specific resources differently. For example, you could configure a different set of rules for translation resources that assign a specific team or individual.

P.S. Correct me if I'm wrong, but would this action assign a reviewer? In one place it says that it does, in another, there is no mention of it...

This is correct, the current config we have here wouldn't assign a reviewer, but that's just a matter of using the add-reviewers automation action.

Is there a way to configure load balancing of requests of reviews?

There are a few ways to handle this:

  1. you can assign a team to review the PR and rely on GitHub's built-in configuration to determine who gets assigned (there are options for round robin and load balance).
  2. You can use intersection to compare the list returned from Code Experts to a pre-defined list of reviewers. You could also set a maximum threshold for Code Experts to avoid assigning reviews to people who mostly contribute new code.
  3. You can use the nunjucks random function to assign a reviewer from a pre-defined list.

As I understand gitStream, it doesn't support the auto-assign logic for issues?

Correct, gitStream lacks any integrations with GitHub Issues. Its something we expect to add in the future, but have been waiting for an excuse to prioritize it. If you can share more details about this use case we can figure out what it would take to put this on the roadmap.

Comment on lines +8 to +9
if:
- {{ pr.title | match(regex=titlePolicy.titleRegex) | nope }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this essentially fail all release and dependencies bumping PRs? 🤔
Maybe we could exclude those here?

.cm/gitstream.cm Show resolved Hide resolved
gitStream:
timeout-minutes: 5
runs-on: ubuntu-latest
name: gitStream workflow automation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't we need explicit write permissions here to allow for the action to write comments and/or change labels? 🤔
https://github.com/mui/mui-x/blob/master/.github/workflows/no-response.yml#L19

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question ... I think the gitStream application that needs to be hooked to the org does have that permission (once approved and hooked)

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it need to be installed on the GitHub organization? I had understood that gitstream can run directly in GitHub actions, from where it gains access to the required permission.

Copy link
Member

@oliviertassinari oliviertassinari Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I wonder, is gitStream open-source? As I understand https://linearb.io/pricing "gitStream Admin Users" we have to pay $50k/year (~100 unique contributors per month) which feels disconnected from the value. I mean, whatever we improve in X should very likely benefit the other repos, so I'm looking at the org in general.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Installed

Copy link

@BenLloydPearson BenLloydPearson Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oliviertassinari These are great questions, and we need to clarify the language on the pricing page about this. I need to confirm the details, but the admin user limit only applies to GitLab integrations, which require a different installation process from GitHub. Even then, it's only a limit on the number of users who can manage the gitStream app installation, everything else can be configured by anyone with write access to the repo. I'm checking with our marketing team to confirm these details and we'll make this clearer.

To that end, gitStream is completely free for GitHub, and doesn't require any subscription fees for your organization users or admins. It's not open source, but we believe that supporting the open source community will be critical to gitStream's long-term success, so we want to support communities like this one.

LinearB pricing is independent from gitStream, so you only need to purchase the LinearB platform if you're interested in engineering productivity metrics, resource allocation, or project delivery tracking. You don't need to purchase these to use gitStream.

Update: the line about gitStream admin users on the pricing page was based on outdated assumptions about how gitStream is designed. We removed this line from the pricing page.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenLloydPearson thanks for the clarification.

@DanailH
Copy link
Member

DanailH commented Sep 12, 2023

Nice work! To be honest I'm also in favor of just enabling gitStream for the analytical PR data to understand more about what is happening even if we don't use the workflows. I recall that I tried to enable the free version on our repo but I didn't have permission to link it (cc @oliviertassinari)

@oliviertassinari
Copy link
Member

oliviertassinari commented Sep 15, 2023

gitStream for the analytical PR data

@DanailH What is this?

Looking at linearb.io, I didn't know there were so many options:

Screenshot 2023-09-15 at 19 00 40

@DanailH
Copy link
Member

DanailH commented Sep 16, 2023

@DanailH What is this?

@oliviertassinari I was referencing these 2 pages

More specifically under the Improve Reporting tab in the first link and the DORA metrics.

@oliviertassinari
Copy link
Member

oliviertassinari commented Sep 16, 2023

@DanailH ah ok, so as a replacement for what we used to have in the free plan https://www.swarmia.com/? What's the rational for one vs. the other? For example Swarmia looks less expensive. Maybe some have an open source plan?

I feel that analytics and issue/pr automations should have two distinct solutions. At least, I fail to see the value in the two to be intertwined. If there is indeed little value, then I think each problem should be solved in isolation. For example, maybe getstream for some PR automation, maybe Uplevel for some analytics.

@michelengelen
Copy link
Member Author

@DanailH ah ok, so as a replacement for what we used to have in the free plan https://www.swarmia.com/? What's the rational for one vs. the other? For example Swarmia looks less expensive. Maybe some have an open source plan?

I feel that analytics and issue/pr automations should have two distinct solutions. At least, I fail to see the value in the two to be intertwined. If there is indeed little value, then I think each problem should be solved in isolation. For example, maybe getstream for some PR automation, maybe Uplevel for some analytics.

@oliviertassinari i did write a comment on the notion page about this PR: I think we could probably benefit from using GitHub actions instead and then, as mentioned above by you, collect metrics from an isolated solution. Maybe Apache devlake is an option as well, since it is free and open-source (we just need a hosting for that)

@BenLloydPearson
Copy link

Hi everyone 👋

I run developer relations at LinearB and work on the gitStream team, thanks for giving our tool a look! I see some wonderful ideas and questions in this thread and after checking with @michelengelen I wanted to offer my suppport.

I'll respond to some of the questions I see here directly, but you can also feel free to @ mention me directly for anything else.

@@ -0,0 +1,51 @@
# Code generated by gitStream GitHub app - DO NOT EDIT

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fyi, I could be wrong, but I believe this file should end in .yml, not .yaml.

@michelengelen michelengelen changed the base branch from master to next November 6, 2023 14:23
@michelengelen michelengelen deleted the poc/gitstream branch June 24, 2024 07:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Infrastructure work going on behind the scenes scope: code-infra Specific to the core-infra product
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants