Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Submodules step 2/2: draw the rest of the owl (#5609)
# What Step 1 was in PR #5597 This second commit essentially: 1. splits the gcloud-only dependencies into a submodule which imports the main `github.com/uber/cadence` module, and not the reverse 2. also splits the "main" server build (`cmd/server`) into a submodule which imports the gcloud plugin and the main module (so that binary is unchanged) 3. removes `cloud.google.com/*` dependencies from the main `go.mod` (requires both 1 and 2) 4. adjusts things to work like they did before Ultimately reducing our "main" (truly required) dependencies, while not making major changes to using or developing Cadence. # Why Part on principle, and part due to internal version conflicts. In principle, "plugins" are "optional" and we should not be forcing _all_ optional dependencies on _all_ users of _any_ of Cadence. Splitting dependencies into choose-your-own-adventure submodules is simply good library design for the ecosystem, and it's something we should be doing more of. In practice, this is essentially being forced because the gcloud archiver plugin pulls in some `cloud.google.com/*` stuff, which we upgraded, which forces breaking `google.golang.org/genproto` upgrades, which conflicts with other things in our internal monorepo. Removing this requirement (we do not use the gcloud archiver internally) allows us to continue using these incompatible libraries elsewhere, unblocking us while we wait for them to upgrade. This may also be true for [waves in the direction of The Internet] all of You, so rejoice! Our pain is your gain. We now no longer force you to use these libraries. We will likely be doing this with more submodules in the future, this is just the first proof-of-concept and establishes a pattern to follow for the others. The client library will very likely also be doing this (or possibly in other repos) to separate out some of its more problematic dependencies. # How I intend to write a more detailed doc soon, but as a tl;dr this "create a new submodule to separate optional dependencies" can be recreated for other future submodules by following this sequence: ## 1: make sure all imports "into" the to-be-a-submodule folder are reversed, and instead go "out" from the submodule to the "main" module. E.g. PR #5597 converts a hardcoded switch statement into a registration system, and then this commit uses that to reverse the direction of the code's imports. This is generally best done _up front_, because very nearly all of it can be done without any dependency changes, which leaves you with only a small window where your IDE does not quite work (while tidying the modules the first time). This will also let you make a separate commit with most or all of your code changes in isolation, which is easier to review and verify as non-changing. Technically you can put this off until later, but you may have to modify your code while you have no working refactoring / autocomplete / navigation tools, and that's just unnecessary pain. ## 2: create a `go.mod` in the to-be-a-submodule folder That must look something like: ``` module github.com/uber/cadence/some/folder // don't depend on a released library, depend on the current code: replace github.com/uber/cadence => ../../ // and the thrift replace, any other replaces that may be necessary in other `go.mod`s, etc ``` ## 3: `go mod tidy` in the repo root / the main `go.mod` And confirm it removes the dependencies that only the submodule uses. In particular, this **MUST NOT** add the new submodule to the dependencies, or you've got a main -> submodule import somewhere. Grep for it if so, it should be very easy to find, and reverse that dependency somehow. ## 4: `go mod tidy` in the new submodule. If this pulls in incompatible versions, I've had good luck "seeding" the new `go.mod` file with the contents of the main `go.mod` file. It seems to keep versions more stable than replaces + `go mod tidy` alone. (these incompatible versions are _probably problematic somehow_ and may cause issues in the future, but we can at least use this trick or `replace` as necessary to ignore them for now) ## 5: add your new module to the root `go.work` so gopls and some other CLI tools continue to work. Plus any makefile / CI / etc changes necessary. Hopefully you can just mimic existing multi-module stuff. ## And that's essentially it! In summary: - adjust your code - make a new `go.mod` - tidy everything - adjust tooling as needed Bundle as much of ^ this as you care into a single commit, and it should all work _as long as anyone using the submodule also pulls the same version of all other modules_ (because that's what you developed against, with `replace ... => ../`). Only same-version will be checked in CI, so only same-version should be expected to work, though other combinations _may_ work in practice. # How others can use these new submodules Because we _require_ `replace` directives, a "simple" `go get github.com/uber/cadence[/or/submodules]` does not work and essentially never has and never will. That's fine, they just have to add our replaces in a custom `go.mod`: ``` module their/project // replace thrift (and any other replacements at version X, check our go.mod) replace github.com/apache/thrift => github.com/apache/thrift v0.0.0-20161221203622-b2a4d4ae21c7 ``` And use a main.go to include the submodules: ```go // optionally make it not buildable by default, makes some tools happier //go:build just_for_gomod_purposes package main import ( _ "github.com/uber/cadence/some/submodule" _ "github.com/uber/cadence/other/submodules/etc" _ "github.com/uber/cadence/cmd/server" // the main module, if you want it ) func main() {} // does not need to be used / will not be run ``` And then `go get github.com/uber/cadence/{some/submodule,other/submodules/etc,cmd/server,}@version` to get all involved modules at the same version (as one command is most likely to work) which will merge all dependencies, update your `go.mod`, and you can now `go run github.com/uber/cadence/cmd/server` and it should work. Or start using the various modules to make a new custom `main.go` or whatever is needed. If necessary, you can also `replace` all `github.com/uber/cadence`-hosted modules with the same version / SHA, which should definitely force it to work. Updating works similarly - `get` them all at the same version, make sure your replaces are up to date (as `go get` will not adjust those), and it should Just Work™ like it did with a single module. --- This was originally planned to be landed over multiple commits to allow submodules to refer to the previously-merged commit(s) to set up dependencies, but relative `replace` directives simplify this a LOT, and bring a MUCH better development experience. The main pros/cons of using a multi-commit + "refer to other-module@HEAD^ rather than relative paths" approach is: - pro: people can `go get the/submodule@version` and they _could_ be guaranteed to get a working set of dependencies - ... at _some_ SHAs anyway. in principle it could be e.g. at every released version or `@latest`, but it cannot be guaranteed at _all_ SHAs. - our thrift replace means this currently does not work anyway, and has not for a long time - con: submodules can only refer to the _previous_ commit of other modules, as it must be a pushed SHA - this applies during development too, essentially requiring you to push a fork and `replace cadence => fork version` to do normal development - this "requires" multiple commits on master/somewhere permanent to truly release something stable. or they can `go get` the same version of all modules. - con: more complex dev and CI as things always refer to the _previous_ state at best, not the current - this can be worked around by manually `go get`ting / replacing in CI as needed to ensure the same versions, but that can become quite complex. Filesystem-relative requires allow ^ all this in a single commit, immediately, without pushing, with the primary tradeoff being "_all_ submodule-users must manually ensure all repo-hosted modules are at the same version" rather than just _some_ (most of the time). That's pretty easy to automate, and `go get` doesn't work out-of-the-box anyway due to thrift, so we're not really losing anything by requiring that.
- Loading branch information