
Conversation

xrstf
Contributor

@xrstf xrstf commented May 28, 2025

Summary

There is IMHO no point in forcing everyone, especially every single CI job, to download and compile each and every tool being used to build kcp. Getting and building golangci-lint in particular takes forever. Even worse, newer golangci-lint versions might use Go versions that kcp does not yet support, forcing us to hold off on updating the linter until we also update Go. And non-Go tools (like yq) were impossible to install sanely (we already cheated with verify_boilerplate.py, which skips the whole hack/tools versioning approach).

I replaced go-install.sh with a script that we're already using to great effect in the api-syncagent and kcp-operator repos. Tools for which no prebuilt binaries are available will still be installed using go install.
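
Roughly, the pattern looks like this - a minimal sketch, not the actual script from those repos; the tool versions, URL and _tools directory are purely illustrative:

#!/usr/bin/env bash
set -euo pipefail

TOOLS_DIR="$(pwd)/_tools"
mkdir -p "$TOOLS_DIR"

# prebuilt binary available: download a release instead of compiling
VERSION=1.64.8 # illustrative version
if [ ! -x "$TOOLS_DIR/golangci-lint" ]; then
  curl -sSfL "https://github.com/golangci/golangci-lint/releases/download/v${VERSION}/golangci-lint-${VERSION}-linux-amd64.tar.gz" \
    | tar -xzf - --strip-components=1 -C "$TOOLS_DIR" "golangci-lint-${VERSION}-linux-amd64/golangci-lint"
fi

# no prebuilt binary available: fall back to go install
GOBIN="$TOOLS_DIR" go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.16.1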

I removed the compilation of apigen entirely and replaced it with go run. I know of at least one other person besides me who runs into the issue of working on the kcp API and forgetting to make clean before doing codegen again. There is simply no need to build a binary for this thing.
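
As a sketch of the change (the package path and arguments are hypothetical; the actual invocation lives in the hack/ scripts):

# before: a separately compiled binary that goes stale unless `make clean` is run
# _tools/apigen ...
# after: always rebuilt from the current checkout
go run ./sdk/cmd/apigen "$@"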

What Type of PR Is This?

/kind cleanup

Release Notes

NONE

@kcp-ci-bot kcp-ci-bot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. dco-signoff: yes Indicates the PR's author has signed the DCO. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 28, 2025
@xrstf xrstf force-pushed the download-binaries branch from 93c8674 to c276c3d Compare May 28, 2025 12:28
@xrstf xrstf force-pushed the download-binaries branch from c276c3d to 4bf32c9 Compare May 28, 2025 12:44
@xrstf xrstf requested a review from embik May 28, 2025 12:58
Member

@xmudrii xmudrii left a comment


/lgtm
/approve

@kcp-ci-bot kcp-ci-bot added the lgtm Indicates that a PR is ready to be merged. label May 28, 2025
@kcp-ci-bot
Contributor

LGTM label has been added.

Git tree hash: 924a86dbef495511ad268533e1ea3c2109d35538

@kcp-ci-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: xmudrii
Once this PR has been reviewed and has the lgtm label, please ask for approval from xrstf. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@stevekuznetsov
Contributor

Did we never move this repo over to .bingo? Far superior approach until you're on Go 1.24 and can use the tool directive in go.mod, I think. The module cache and Makefile caching make the process simple and quick.

@xrstf
Contributor Author

xrstf commented May 28, 2025

Even Go's tool directive can AFAIK only be used for Go tools and requires re-compiling - two of the things I want to solve with this PR :) See for example https://golangci-lint.run/welcome/install/#install-from-sources

In general I am worried about mixing a project's dependencies with those of the tools it uses.

@stevekuznetsov
Contributor

@xrstf sure, both the tool directive and .bingo semantically separate the two sets of dependencies, that's part of the raison d'être.

Aren't all the tools in this PR Go-based?

@xrstf
Contributor Author

xrstf commented May 28, 2025

verify_boilerplate isn't 😁 Also some of our scripts already use jq, which is also not Go.

@xrstf
Contributor Author

xrstf commented May 28, 2025

Can you link to this .bingo solution/software? I am not familiar with it yet. :) Or is it just the spirit of setting $GOBIN?

@stevekuznetsov
Contributor

This is the bingo project. Effectively the spiritual precursor to the tool directive. There's a Go implementation of jq here.

@stevekuznetsov
Contributor

It's been a long time since I touched verify_boilerplate.py so sorry if I'm mis-remembering, but I think you may be able to get the same output with addlicense (here), which is a tool written fully in Go.

@stevekuznetsov
Contributor

It also looks like they have a Go-based runner for the verify-boilerplate stuff: https://github.com/kubernetes-sigs/release-utils/blob/main/mage/boilerplate.go

@ntnn
Member

ntnn commented Jun 21, 2025

I've toyed around with bingo and I quite like it. It solves all the usual problems and even generates files for Makefiles and shell scripts to source:

include .bingo/Variables.mk

.PHONY: lint
lint:
	$(GOLANGCI_LINT) run ./...
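
For shell scripts the counterpart would be sourcing the generated environment file - a minimal sketch, assuming bingo's variables.env output and a GOLANGCI_LINT variable:

# variables.env exports variables pointing at the pinned binaries in GOBIN
source .bingo/variables.env
"${GOLANGCI_LINT}" run ./...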

It's also very unintrusive in how it installs the dependencies - it's really just a binary dropped into GOBIN.
It feels a lot cleaner than many other solutions.

And with the tool directive, bingo can be used as go tool bingo ..., so the tool itself doesn't need to be installed; instead a new developer can get all tools with go tool bingo get.

This approach also allows automatic dependency updates, e.g. with renovate, with very little customManager magic.

The only drawback is that it doesn't solve the problem for non-Go tools - but how many non-Go tools are there really that this would apply to?

On jq - next to the Go implementation of jq that Steve linked, there is also yq, which I had been using for a while because it also handles other formats (SGML-likes, CSV, TOML, ...):
https://github.com/mikefarah/yq

Edit: Even better - the generated Makefiles are bingo-less and just use go build - so a Makefile can be:

include .bingo/Variables.mk

.PHONY: lint
lint: $(GOLANGCI_LINT)
	$(GOLANGCI_LINT) run ./...

e.g.:

>make imports
(re)installing /home/ntnn/golang/bin/goimports-v0.34.1-0.20250610205101-c26dd3ba555e
/home/ntnn/golang/bin/goimports-v0.34.1-0.20250610205101-c26dd3ba555e -w -l .

And that will cause the tool to be built in exactly the version pinned by bingo. Technically, only whoever adds or updates the pinned tools will need to have bingo installed.
So the whole shebang with go tool bingo wouldn't even be needed.

Not sure how the prow runners would cache this though - but there's probably a way to have the prow runners have a persistent/cached GOBIN.

@ntnn
Member

ntnn commented Jun 22, 2025

Indeed, bingo dependencies can be handled by renovate:
https://github.com/ntnn/renovate-config/blob/acd433ce245302491d3e5a8cb3c790ab19c18850/default.json#L20-L30
ntnn/mermaid-kube-live#6

Although what's missing is updating the tool-specific go.sum file; that would still require a human to go in, update it and add the commit, but at least a notification about an update comes in automatically.

Edit: That also wasn't too hard to automate with a workflow.
The reusable workflow: https://github.com/ntnn/actions/blob/main/.github/workflows/bingo.yaml
The invocation: https://github.com/ntnn/mermaid-kube-live/blob/main/.github/workflows/bingo.yaml

@xmudrii
Member

xmudrii commented Jun 22, 2025

I'm -1 for bingo. In theory, it seems like a solid project. However, it doesn't seem to be maintained much, or at all. The last commit is from 6 months ago, there are only 3 commits in 2024, dependencies are not being updated, there are a lot of open issues, and PRs are not really getting any attention. From a supply chain security standpoint, it's a disaster.

For anything more that we might need from bingo, there's a solid chance we'll be left on our own, having to fork it, etc. The approach that @xrstf proposed is plain simple and has only a single dependency that we all have (curl). And it's very universal - it works with any tool, which I'm not sure is the case with bingo.

@ntnn
Member

ntnn commented Jun 22, 2025

From the supply chain security standpoint, it's a disaster.

That is true - but bingo itself isn't injecting itself into the execution; the recipes run go build and the shell scripts only contain the path to the pinned binary.
The only supply chain attacks here would be present without bingo as well.
Technically the curl approach would yield a bigger security hole as it would be pulling prebuilt binaries from github releases, which can and have been compromised before.

The bingo approach also only has a single dependency - namely go, except for whoever adds or updates the tools.

bingo is just a wrapper around some go mod and go get calls and templating the makefile and shell script.

The last commit is from 6 months ago, there are only 3 commits in 2024 [...]

A tool doesn't need to be constantly changed to be useful. GNU coreutils have a handful of releases every year (usually once per quarter) with most tools not being touched at all, and it's still software running everywhere.

As for dependencies - the tool mostly uses the stdlib and golang.org/x packages, with some exceptions for semver parsing, shell execution and cobra/pflag.
Not keeping these on the newest releases at all times is not a massive problem imho. Sure it would be better, but it wouldn't be a dealbreaker.

As for issues - many of those are suggestions. If you filter for bugs, there are exactly three, and one of them is that the minimum go version in the mod file isn't updated to whatever go version was used to template the mod file.
Valid of course, but not the worst offender.

@ntnn
Member

ntnn commented Jun 22, 2025

And that the curl approach only requires curl isn't quite true - unzip and tar are also required to unpack the archives (and tbh on my Linux machine I have 7z and not unzip installed because 7z has fewer issues with the varying zip formats).

@xmudrii
Member

xmudrii commented Jun 22, 2025

That is true - but bingo itself isn't injecting itself in the execution, the recipes run go build and the shell script only contain the path to the pinned binary.
The only supply chain attacks here would be present without bingo as well.

This is technically incorrect -- anything that you run in CI/CD has a risk of supply chain attacks (and there's a very solid chance that we'll be using bingo in our CI/CD jobs).

Technically the curl approach would yield a bigger security hole as it would be pulling prebuilt binaries from github releases, which can and have been compromised before.

If we're paranoid, we could add checksum checks to the bash script; that should be very trivial.
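
Something along these lines, as a sketch (the checksum value and file name are placeholders):

# verify the downloaded archive against a pinned checksum before unpacking
EXPECTED_SHA256="<pinned sha256 for this tool/os/arch>"
echo "${EXPECTED_SHA256}  golangci-lint.tar.gz" | sha256sum -c -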

The bingo approach also only has a single dependency - namely go, except for whoever adds or updates the tools.

bingo is just a wrapper around some go mod and go get calls and templating the makefile and shell script.

That's again technically incorrect. First, the number of dependencies doesn't matter (e.g. the project itself could be compromised); second, there are actually a couple of dependencies used by that project: https://github.com/bwplotka/bingo/blob/eebab197b1b510268e8658e9526b907dd009ff96/go.mod

On top of that, given that releases are infrequent, the project is built using a non-supported version of Go, so it could be affected by the Go vulnerabilities as well (at some point).

A tool doesn't need to be constantly changed to be useful. GNU coreutils have a handful releases every year (usually once per quarter) with most tools not being touched at all, and it's still software running everywhere.

As for dependencies - the tool is mostly using the stdlib/xtra with some exceptions for semver parsing, shell execution and cobra/pflag.
Not keeping these on the newest releases at all times is not massive problem imho. Sure it would be better but it wouldn't be a dealbreaker.

We can't compare any Go tool with GNU coreutils on technical and security levels. First of all, coreutils are actively maintained. If there was a vulnerability to be discovered, it would be fixed with a patch released pretty much immediately. coreutils are written mainly in C and C is not as prone to stdlib vulnerabilities as is Go. Updating Go and releasing often is important to keep a Go tool safe and sound. Also, the coreutils are mostly dynamically linked, so if there was a vulnerability in a dependency, updating that dependency on your system would keep you safe (this is not the case for most of Go tools which are statically linked by default).

There's a very big difference between not needing to add new features constantly and a project being effectively unmaintained. For Go tools, it's clear what the minimal requirements are to consider a project maintained: keep dependencies up to date and build releases semi-often with a supported Go toolchain version. None of that is happening in the case of bingo - the last release is from 2023. On top of that, issues and PRs are being ignored.

I have to make it very clear: I have a lot of respect for the author of that tool. But dependency management is something very critical to me and I don't feel confident in using something that is not maintained for that, especially for a project such as kcp. As I said above, anything that needs to be fixed, be it a vulnerability, a bug, or even a new feature that we need, there's an extremely high chance that we would be on our own (i.e. fork and maintain the tool ourselves). The shell scripts that @xrstf proposed don't need any of that, they pose very little risk (much less than bingo), and they're pretty much maintenance-free.

And that the curl approach only requires curl isn't quite true - unzip and tar are also required to unpack the archives (and tbh on my linux machine I have 7z and not unzip installed because 7z has less issues with the varying zip formats).

In my book, tar, zip, curl, wget and some more are considered the bare minimum for any development -- most users will have them installed.

@ntnn
Member

ntnn commented Jun 22, 2025

This is technically incorrect -- anything that you run in CI/CD has a risk of supply chain attacks (and there's a very solid chance that we'll be using bingo in our CI/CD jobs).

That is what I wanted to express, yes - anything that isn't 100% in our control will yield some risk. Be it through libraries, tools or infrastructure we use.

That's again technically incorrect. First, number of dependencies doesn't matter (e.g. the project itself could be compromised), second, there's actually a couple of dependencies used by that project: bwplotka/bingo@eebab19/go.mod

I said as much a bit further down - but that is quintessentially what bingo is; a templater for the go.mod, Makefile and shell script; and the dependencies don't matter once the files are generated.

On top of that, given that releases are infrequent, the project is built using a non-supported version of Go, so it could be affected by the Go vulnerabilities as well (at some point).

That only matters if prebuilt binaries are used - we still decide which Go version we use to build and run the tools.
Which is in part why I adore the simplicity of this design.

We can't compare any Go tool with GNU coreutils on technical and security levels. First of all, coreutils are actively maintained. If there was a vulnerability to be discovered, it would be fixed with a patch released pretty much immediately. coreutils are written mainly in C and C is not as prone to stdlib vulnerabilities as is Go. Updating Go and releasing often is important to keep a Go tool safe and sound. Also, the coreutils are mostly dynamically linked, so if there was a vulnerability in a dependency, updating that dependency on your system would keep you safe (this is not the case for most of Go tools which are statically linked by default).

There's a very big difference between not needing to add new features constantly and project being effectively unmaintained. For Go tools, it's clear what are the minimal requirements to consider it maintained: keep dependencies up to date and built semi-often with a supported Go toolchain version. None of that is happening in case of bingo, the last release is from 2023. On top of that, issues and PRs are being ignored.

Imho calling a project unmaintained when it isn't implementing more outside of its intended feature scope is a bit harsh - but I agree on the dependencies.

I have to make it very clear, I have a lot of respect for the author of that tool. But dependency management is something very critical to me and I don't feel confident in using something that is not maintained for that, especially for the project such as kcp. As I said above, anything that needs to be fixed, is it a vulnerability, bug, or even a new feature that we need, there's an extremely high chance that we would be on our own (i.e. fork and maintain the tool ourselves). The shell scripts that @xrstf proposed don't need any of that, they pose very little risk (much less than bingo), and they're pretty much maintenance-free.

bingo is not involved outside of generating the files. At worst these can be edited manually if it comes down to it.

The coreutils comparison is playing devil's advocate - though the base comparison still stands, with coreutils being dynamically linked just like bingo can be built/run with a current Go version (ignoring the dependencies that could bear updating).

But putting bingo aside - I'm more in favor of the way bingo pins the tools with separate mod files rather than bingo itself.

With binary releases we rely on the tool authors a) publishing releases and b) providing binaries (and c) these binaries not being manipulated during the builds) - and even if a feature or fix that is needed for kcp is merged, a new release might be weeks out unless it is crucial.
The approach with separate mod files and the Makefile and shell script being updated with the correct versions solves these problems:
a) and b) are moot as this allows pinning arbitrary commits, and c) is moot as Go validates hash sums of the modules.

I don't think bingo is inherently better than the shell script, please don't misunderstand me - I just think the approach of bingo is worth exploring because it feels so go native and avoids the problems relying on releases as well as shared dependencies from the go tool approach.

With the tool directive the same solution could be realized with a shell script or a Go tool that only uses the stdlib - the only thing that is required is something that generates the .mod files, the Makefile and the script, and runs go mod tidy:

module _

tool github.com/golangci/golangci-lint/v2/cmd/golangci-lint

require github.com/golangci/golangci-lint/v2 v2.1.6

and then:

go mod tidy -modfile=golangci-lint.mod

As said - I just really like the approach.

@xmudrii
Member

xmudrii commented Jun 22, 2025

That is what I wanted to express, yes - anything that isn't 100% in our control will yield some risk. Be it through libraries, tools or infrastructure we use.

And hence we can reduce this risk by putting things in our control and ownership. This is not always worth it, but if it requires effectively no effort to maintain (as in this example), we should just do it.

With binary releases we rely on the tool authors a) publishing releases and b) providing binaries (and c) these binaries not being manipulated during the builds) - and even if a feature or fix that is needed for kcp is merged a new release might be weeks out unless it is crucial.
The approach with separate mod files and the Makefile and shell script being updated with the correct versions solves these problems.
a) and b) are moot as this allows to pin arbitrary commits and c) is moot as Go validates hash sums of the modules.

I don't think bingo is inherently better than the shell script, please don't misunderstand me - I just think the approach of bingo is worth exploring because it feels so go native and avoids the problems relying on releases as well as shared dependencies from the go tool approach.

I think these are great examples of premature optimization. We're now talking about things that we don't have a problem with at all, but that might hypothetically happen. However, that always costs something, and in this case it's about using yet another tool. Not only are there the concerns that I outlined above, but everyone working with kcp (dependencies) will have to fiddle with bingo (and I'm sure for most, this is a largely unknown tool).

Also, we'll still have to keep this, because my understanding (and please correct me if I'm wrong) is that bingo is only for Go tools, while we have non-Go tools as well.

That said, I'm strongly in favor of keeping it simple and rolling with this approach for now. It's already fully implemented, it's super easy to use, and any benefits that we might gain from any other solution are marginal.

@ntnn
Member

ntnn commented Jun 22, 2025

I think these are great examples of the premature optimization. We're now talking about things that we don't have a problem with at all, but that might hypothetically happen. However, that always costs something, and in this case it's about using yet another tool. Not only that there are concerns that I outlined above, but everyone working with kcp (dependencies) will have to fiddle with bingo (and I'm sure for most, this is a very unknown tool).

I don't know if I would call that premature optimization :D
It's a problem I've faced before, though given how (mostly) mature the tools we use are, I don't think it'll come up here.
I'd lean more towards the original goal being premature optimization, as it was to remove the need for every developer and CI job to download and compile the tools - after all, that's why we use module and compile caches.

Also, we'll still have to keep this, because my understanding (and please correct me if I'm wrong), is that bingo is only for Go tools, while we have non-Go tools as well.

To which the argument again would be that the question is how many non-Go tools we actually have, because I only see jq, and that is easily replaceable with Go tools like yq (it was already replaced by yq in api-syncagent: https://github.com/kcp-dev/api-syncagent/blob/479ebf5c5c4dc1e6a7ecd0247db4839ac92fcb61/Makefile#L92-L98)
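
A sketch of what that looks like in practice (the file and expression are placeholders; in practice pin an exact version instead of @latest):

# yq is a plain Go module, so it can be installed - or even run directly - via the Go toolchain
go run github.com/mikefarah/yq/v4@latest '.metadata.name' config.yaml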

That said, I'm strongly in favor of keeping it simple and rolling with this approach for now. It's already fully implemented, it's super easy to use, and any benefits that we might gain from any other solution are marginal.

I agree, I think we have discussed the pros and cons enough - also, api-syncagent and kcp-operator already have the same script, as @xrstf pointed out; this PR was just a cleanup to update the kcp repository.

We'd just need an owner to approve the PR @mjudeikis @embik

@kcp-ci-bot kcp-ci-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 23, 2025
@kcp-ci-bot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mjudeikis
Contributor

I don't mind either way - bingo, this, or something else. Any improvements to the system are welcome.
1st - it's not touching core
2nd - it's transparent to developers (I hope) and I don't need to learn to operate a new script, it's just the same commands
3rd - we're already using it.

lgtm from my side.

@mjudeikis
Contributor

@xrstf merge? what is the consensus here?

@xrstf
Contributor Author

xrstf commented Aug 11, 2025

Merge after rebase and after adding some sort of checksum validation.

I was blocked by the workspace auth feature PR and will now go back to this PR here and kube-bind and stuff.

@xrstf
Contributor Author

xrstf commented Aug 25, 2025

This PR is still on my backburner; right now I am struggling to come up with a nice way to incorporate checksums into this. Even if I had a file with the checksums for golangci-lint in 8 different [arch, os] combos, updating these whenever someone bumps golangci-lint is nontrivial, mostly because the download script has no idea about the possible URL patterns it is called with.

@xrstf xrstf marked this pull request as draft August 25, 2025 08:35
@kcp-ci-bot kcp-ci-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 25, 2025