Skip to content

[CI] Add design document for post submit testing #512

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

boomanaiden154
Copy link
Contributor

This patch adds the design document outlining how we plan on implementing post submit testing for the premerge configuration along with the motivation and alternatives considered.

This patch adds the design document outlining how we plan on implementing
post submit testing for the premerge configuration along with the
motivation and alternatives considered.
@boomanaiden154 boomanaiden154 requested a review from cmtice July 25, 2025 20:28
llvm-zorg repository under the `buildbot/` folder. These configurations are run
on Buildbot workers that are hosted by the community.

It is important that we test the premerge configuration postcommit as well. We
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-work this paragraph. Motivations should start by stating what the problem is you're trying to solve (sending false positive notifications to authors about builds/tests already failing at head). Once the problem is explained (including WHY it's important/impactful to solve this problem), then you explain how you plan to solve it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the phrase "for certain types of automation" is so vague as to be virtually useless. I would either add more details or omit it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworked the paragraph to talk about what we want to do in the end at the top and then go into the how.

Removed the vague wording.

postcommit testing is performed through the Buildbot infrastructure. The main
Buildbot instance for LLVM has a web instance hosted at
[lab.llvm.org](https:/lab.llvm.org). When a new commit lands in `main` the
Buildbot instance (sometimes referred to as the Buildbot master) will trigger
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also mention that the current postcommit buildbots don't run separate instances for each commit -- they bundle multiple commits together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This depends upon the configuration and bundling commits is explicitly discouraged in https://llvm.org/docs/HowToAddABuilder.html. I've added some text to describe this.

available (able to tolerate an entire cluster going down), similar to the
premerge testing. The builders will be configured to use a script that will
launch builds on each commit to `main` in a similar configuration to the one run
premerge. The test configuration is intended to be close to the premerge
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"a similar configuration to the one run premerge" => "the same configuration used for premerge testing (for PR commits); for direct git commit commits, it will use that same configuration that would have been used with PR testing, if the commit had gone through a PR."

Replace the sentence "The test configuration is intended to be close to the premerge configuration but will be different in some key ways." with:
"The post submit testing is intended to be very close to the premerge testing, but will be different in some key ways."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the distinction between directly pushed commits/commits from PRs is useful here. I've changed up the wording to say the commit will be tested as if it is going through the premerge pipeline, with some minor differences.

Ack on the second point, changed up the wording.

### Testing Configuration

The testing configuration will be as close to the premerge configuration as
possible. We will be running all tests inside the same container with the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First sentence. "The actual testing configuration will be the same as that used by the premerge testing."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using testing configuration to include the set of projects that we're testing (which will be different) and the environment.

The environment should be as close as possible, but I don't think there's any easy way to guarantee that it's the same.

same scripts (the `monolithic-linux.sh` and `monolithic-windows.sh` scripts).
However, there will be one main difference between the premerge and postcommit
configuration. In the postcommit configuration we propose testing all projects
on every commit rather than only testing the projects that themselves changed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 107 should become:

"testing. In the postcommit testing, we propose testing all projects"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the text slightly to make it a bit more clear.

I believe you were trying to get at replacing configuration with testing? I think configuration is more accurate here and more consistent with the rest of the paragraph (especially after adding the definition at the beginning).

I can change it up if it is inhibiting clarity though.

that passes all tests that lands afterwards, and then another MLIR change, a
notification will also be sent out on the second MLIR change because the
clang-tidy change turned the build back to green. We also explicitly do not
test certain projects even though their dependencies change, and while we do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THe MLIR example needs clarification: 1). You need to make it clear that you're talking about postsubmit testing only. 2). You need to explain more about turning red/green -- that reach postsubmit run will mark all tests/builds as either red or green, and THAT is why you need to run all the tests every time: to avoid passing in one part from incorrectly marking failures in a different part as passing when they're not.

At least that's what I think your example is trying to explain, but it's too terse to make that clear to someone who's not already familiar with this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sentence "We also explicitly do not test certain projects..." I think that sentence is referring to premerge testing rather than postsubmit? But you need` to make that clear. Also this becomes as run-on sentence. I'd take the last phrase and make it a separate sentence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewrote the example to better explain the exact sequence of events and the logic controlling them. I can add some background on buildbot notifications in the beginning if that's helpful, but I hope the new example should make things clear enough.

Copy link
Contributor

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments from just skimming it, I don't know too much about the implementation myself.

Really like that you're documenting all this, I see those incident reports too, great stuff.


## Background/Motivation

LLVM has two types of testing upstream: premerge and postcommit. The premerge
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default assumption for this document is going to be that "LLVM" refers to upstream llvm, so you can just say "testing" I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit redundant, but I'd prefer to keep it in there. There will probably be internal (to Google) audiences reading this who might not be as familiar with everything upstream and we have a lot of downstream CI, so keeping it explicit might make it more clear for someone and only be redundant for someone else.

`main` is also important for certain kinds of automation, like the planned
premerge testing advisor that will let contributors know if tests failing in
their PR are already failing at `main` and that it should be safe to merge
despite the given failures.
Copy link
Contributor

@DavidSpickett DavidSpickett Jul 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are some very large paragraphs. The content is fine but a line break between each "point" made would make it easier to read through.

new cluster to efficiently utilize resources. To do this, we plan on having the
AnnotatedBuilder script launch builds as kubernetes pods. This allows for
kubernetes to assign the builds to nodes and also allows autoscaling through
the same mechanism that Github autoscales. This allows for us to quickly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I think you mean the same mechanism that LLVM testing on GitHub uses, not GitHub the company specifically uses it.

problematic only when combined. This means we need to test the premerge
configuration postcommit as well so that we can determine the state of `main`
(in terms of whether the build passed/failed and what tests failed, if any) at
any given point in time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommendation: Add a sentence at the end of this paragraph about the premerge advisor (which you reference in the Data Storage section as if it had been mentioned previously). Something like "We can then use this data to implement a premerge advisor, to help prevent sending false positive test failure notifications to users."

@cmtice
Copy link
Contributor

cmtice commented Jul 31, 2025

I think this is looking pretty good now; I've just left a few more minor comments inline. :-)

@boomanaiden154 boomanaiden154 requested a review from cmtice July 31, 2025 22:44
Copy link
Contributor

@cmtice cmtice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but please do not merge until others have had a chance to review and give feedback.

@cmtice
Copy link
Contributor

cmtice commented Aug 21, 2025

Does anybody else have any feedback for this design? If not, I will go ahead an approve it...

had dependencies that changed. We propose this for two main reasons. Firstly,
Buildbot does not have support for heterogenous build configurations. This means
that testing a different set of projects within a single builder depending upon
the contents of the commit could easily cause problems. More notifications could
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of problems?
From buildbot point of view, you would track the entire project (so no path-based-filtering), but the script could very well disable some tests based on the commit diff. Buildbot should be oblivious to this.

tests. The problem introduced in commit A still exists, so some tests fail.
No new tests fail. Since the buildbot was previously green due to the
interspersed clang-tidy commit, a notification is still sent out to the
author of commit C.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an excellent example, thanks for that! Is this what you meant with "could easily cause problems." above?

supports a [REST API](https://docs.buildbot.net/latest/developer/rest.html) that
would allow for easily querying the state of a commit in `main`.

For the proposed premerge advisor that tells the user what tests/build failures
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this refers to a "proposed premerge advisor" but that's the first time this is mentioned in the doc, and the casual reader having no context (me) will be lost. Is there a link to some proposal that should be added here?

This section feel like a bit out-of-place for a document describing the "post-merge testing" flow IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this feature looks really nice otherwise)

@joker-eph
Copy link
Contributor

joker-eph commented Aug 21, 2025

Looks nice, thanks for the write-up! (and all the work that went behind setting up the infra)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants