Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: implement jj gerrit send for sending changes to Gerrit #2845

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

thoughtpolice
Copy link
Collaborator

@thoughtpolice thoughtpolice commented Jan 18, 2024

Warning

This PR is very rough, and needs significant polish to handle edge cases and other small nits, but functions for basic uses just fine, I think. Please only use it on copies of your existing repositories, or throwaway repositories that you can jj undo on. This warning will be removed once I feel it's more stable.

Please remember: Do not taunt Happy Fun Ball.

This implements the most basic workflow for submitting changes to Gerrit, through a verb called 'send'. Send was chosen to avoid conflicting terminology with terms like "submit" (which merges a change in Gerrit) or "review" which is ambiguous.

Given a list of revsets (specified by multiple -r options), this will parse the footers of every commit, collect them, insert a Change-Id (if one doesn't already exist), and then push them into the given remote.

Because the argument is a revset, you may submit entire trees of changes at once, including multiple trees of independent changes, e.g.

jj gerrit send -r foo:: -r baz::

The remote to push into is, by default, main@origin, and will translate to refs/for/main at that remote. I.e. it is expected that you are using Gerrit as the source of truth, of course. You can specify another remote using the argument --for however, e.g. --for main@gerrit which will change the remote. This syntax is intended to resemble existing syntax for remote references in jj for the sake of UX consistency, rather than blindly exposing the Git refs/ syntax to users, which isn't needed, practically speaking.

There are many other improvements that can be applied on top of this, including a ton of consistency and "does this make sense?" checks. However, it is flexible and a good starting point, and you can in fact both submit and cycle reviews with this interface.

@thoughtpolice thoughtpolice self-assigned this Jan 18, 2024
@thoughtpolice thoughtpolice changed the title cli: implement jj gerrit mail for submitting changes to Gerrit cli: implement jj gerrit mail for sending changes to Gerrit Jan 18, 2024
Copy link
Owner

@martinvonz martinvonz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still haven't reviewed the last commit, but it's getting late, so here's what I have for now.

cli/src/cli_util.rs Outdated Show resolved Hide resolved
lib/src/footer.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
lib/src/footer.rs Outdated Show resolved Hide resolved
// to ensure we parse the footer in an unambiguous manner; this avoids cases
// where a colon in the body of the message is mistaken for a footer line
let mut footer = IndexMap::new();
let lines: Vec<&str> = body.trim().lines().rev().collect();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the first line could be skipped by .next() and could do .rev().map().take_while() or something.

cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
/// the form `branch@remote`, where `branch` is the target branch name that
/// the change will be submitted to, and `remote` is the remote to push to.
/// This remote MUST be a Git push URL for your Gerrit instance.
#[arg(long = "for", short = 'f', default_value = "main@origin")]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be better to use trunk() as the default. We can't determine the remote if the trunk revision had multiple remote branches, though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So while trunk() probably does make some sense since it tracks what the upstream would be implicitly (without hardcoding pre-existing branch/remote names), I think it's rather confusing because revsets are already used to specify the trees you want to push, and can be specified multiple times, and trunk() looks quite like one at a glance. For example,

jj gerrit mail -r 'open() ~ empty()' -f 'trunk()'

reads quite strangely to me, especially because trunk() would be rather special in this context where it actually specifies a pair of (branch, remote) to send the change to; it's not clear at a glance what that means. I think that the branch@remote syntax is more familiar in this case since it's what shows up in the default log output, it's used for all other @git refs, etc. At the same time, it's also reasonably easy to translate to Gerrit's git push origin HEAD:refs/for/main so it's not too unfamiliar. It's half-way between us and them.

Related, but what I would actually like to do is have this default be over-rideable in the repository configuration, so that people can also do things like set the URL to the Gerrit instance. e.g.

jj config set --repo gerrit.default-remote canon@upstream

or something of this manner to override main@origin as the default --for. We could also have a nested TOML list in the same manner so that remotes can have individually specified default branch targets, e.g. set gerrit.<remote>.default-for "master" where <remote> can be any Git remote.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jj gerrit mail -r 'open() ~ empty()' -f 'trunk()'

Yeah, it's a bit odd. I don't think --for should support a revset in general, but I mean the default could be derived from the trunk(). If the user configured per-repository trunk(), it would be something like trunk() = "foo@origin", so the purpose is quite similar to gerrit.default-remote. Maybe we should have a parsable configuration for that, and both trunk() and the gerrit --for should default to it?

BTW, is this origin in --for foo@origin the gerrit endpoint? The project I worked before had a gerrit-specific SSH remote, and we usually don't pull from it. In that setup, the user-visible remote branch is something like master@origin, but the remote to push to is gerrit.

cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
// mark the old commit as rewritten to itself, but only because it
// was a --dry-run, so we can give better error messages later
old_to_new.insert(original_commit.clone(), (original_commit.clone(), true));
continue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just rewrite the commits, and discard the transaction? iirc, git push --dry-run -c does a similar thing.

} else {
write!(ui.stderr(), "Skipped Change-Id (it already exists) for ")?;
}
tx.write_commit_summary(ui.stderr_formatter().as_mut(), new)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, this loop can be moved to the previous "rewriting" loop, and we wouldn't have to maintain the mapping between old and new commits.

I think the commit summary can be printed by tx.base_workspace_helper() to suppress scary markers.

cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
@necauqua
Copy link
Collaborator

From an outside perspective of a non-gerrit user, the word 'mail' in something seemingly unrelated to email (it is, right?) feels.. really weird

Just my input, I've no real arguments for or against

@martinvonz
Copy link
Owner

martinvonz commented Jan 23, 2024

From an outside perspective of a non-gerrit user, the word 'mail' in something seemingly unrelated to email (it is, right?) feels.. really weird

What do you think we should call it? @thoughtpolice's first suggestion was review. I think the main problem with that is that it sounds more like "perform review" than "send for review". send-for-review seems too long. Or is it okay given tab completion? Any other options? get-review? (Maybe just review is still the best?)

@thoughtpolice
Copy link
Collaborator Author

I think send as Yuya mentioned in his review wasn't bad, either. I admit, I find mail to be very "fun", so I quite like it. (Maybe that's the inner Haskell programmer coming out...) I'd prefer we avoid hyphenated-words but, that's mostly aesthetic, I admit.

I also plan on adding some other commands to fetch changes from Gerrit too (though sending them is the most important since fetching them has some workarounds.) Not sure how that factors in to the vocabulary, but something to consider...

@joyously
Copy link

joyously commented Jan 23, 2024

I suggest send also, since that's what Breezy uses. See Breezy doc.

Mail or create a merge-directive for submitting changes.

Oh, and Git has send-email and send-pack,

git-send-email - Send a collection of patches as emails
git-send-pack - Push objects over Git protocol to another repository

and Darcs has send.

Prepare a bundle of patches to be applied to some target repository.

@benbrittain
Copy link
Collaborator

I too am very supportive of send over mail. There might be a better word but I appreciate not overloading the gerrit terminology.

@thoughtpolice thoughtpolice changed the title cli: implement jj gerrit mail for sending changes to Gerrit cli: implement jj gerrit send for sending changes to Gerrit Feb 9, 2024
@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch 2 times, most recently from bcf8a9c to 699346f Compare February 14, 2024 22:53
@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch 5 times, most recently from 8a40482 to 08e210b Compare February 23, 2024 03:16
@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch 2 times, most recently from c6b5c67 to da821aa Compare April 22, 2024 05:52
@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch 3 times, most recently from c16fc43 to 3afc3e4 Compare May 7, 2024 17:00
@matts1
Copy link
Collaborator

matts1 commented May 14, 2024

This PR appears to be maintained, constantly rebasing on top of main, but it's been two months since the last comment.

Could we either get this merged, or come up with a list of blockers that we need resolved before it can be merged? I see the tests aren't passing on CI - is that the only blocker?

From what I could tell after looking at the PR, there are some minor improvements that could be made, but as @thoughtpolice said, it works, and I think it'd be nice to submit now and iterate on this later.

@martinvonz
Copy link
Owner

As it happens, there was a thread about this on Discord today. @benbrittain also wanted to this merged. IIUC, the main reason it's not merged is that @thoughtpolice has run into some libgit2 (?) bug once in a while which manifests like this: https://stackoverflow.com/questions/16586642/git-unpack-error-on-push-to-gerrit. It still seems useful even if that happens sometimes.

It looks like there are some unresolved comments too, and maybe we'll want another review of the code since it's been a while. I basically don't remember anything from my previous rounds of review. Do you have time to review it, @matts1?

@matts1
Copy link
Collaborator

matts1 commented May 14, 2024

Sure, I can review - I'll take a look once CI is passing.

Copy link
Collaborator

@mlcui-corp mlcui-corp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random drive-by comments.

lib/src/footer.rs Show resolved Hide resolved
///
/// In this case, there are four footer lines: two `Co-authored-by` lines, one
/// `Reviewed-by` line, and one `Change-Id` line.
pub fn get_footer_lines(body: &str) -> Vec<FooterEntry> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth handling this case from git-interpret-trailers' man page?

The may be split over multiple lines with each subsequent line starting with at least one whitespace, like the "folding" in RFC 822. Example:

key: This is a very long value, with spaces and
  newlines in it.

Gerrit code uses RevCommit#getFooterLines which seems to handle this case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. Yes, I'll look into fixing this.

cli/src/commands/gerrit/send.rs Show resolved Hide resolved
cli/src/commands/gerrit/send.rs Show resolved Hide resolved
}

// case 2
if let Ok(remote) = config.get_string("gerrit.default_remote") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should case 3 and case 4 come before case 2? If the user sets gerrit.default_remote to foo, and there's only one remote bar, should we just upload to bar? I don't think there's a right or wrong answer here, but whatever we decide, I think we should add a comment to the code to show that the behaviour is intentional.

Following on from that, should case 4 be combined with case 2? We could just make it let remote = config.get_string("gerrit.default_remote").unwrap_or("gerrit").

cli/src/commands/gerrit/send.rs Show resolved Hide resolved
if footer.iter().filter(|entry| entry.0 == "Change-Id").count() > 1 {
writeln!(
ui.warning_default(),
"warning: multiple Change-Id footers in commit {}",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You called this an error in the comment above, but this isn't an error - it looks like it'll specifically just skip uploading this one commit, but upload all others.

I'd personally prefer we made this into an actual error. I think that if, for example, we have a parent commit which has duplicate change IDs, and a child commit which is valid, uploading the child without the parent seems like a bad idea.


// check the change-id format is correct in any case
if cid.len() != 41 || !cid.starts_with('I') {
writeln!(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - I'd prefer an error

// them, and I realized my old signoff.sh script created invalid
// ones, so this is a helpful barrier.

continue; // fallthrough
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of these continues make it very difficult to understand the code path. I'd prefer we use if/else instead.

If it's too nested that way or something like that, putting it into functions may make it easier

Err(user_error(err.to_string()))
}
// XXX (aseipp): more cases to handle here?
_ => Err(user_error(err.to_string())),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a user error?

"warning: internal git error while pushing to gerrit: {}",
err
)?;
Err(user_error(err.to_string()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be a user error

///
/// This command modifies each commit in the revset to include a `Change-Id`
/// footer in its commit message if one does not already exist. Note that
/// this ID is NOT compatible with jj IDs, and is Gerrit-specific.
Copy link
Collaborator

@matts1 matts1 Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't remember if this was discussed earlier, but have we considered making this a transient hash of the JJ change ID that isn't actually stored in the committee description?

I've implemented something roughly similar to what you've done, and one problem I had was that when I run JJ split you end up with, by default, two commits with the same gerrit change ID.

That being said, I consider this a very good improvement over the current behaviour, so I'd recommend we submit as is even if we were to change it to that in the future.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like using jj change hex as Change-Id (see #4477 (comment)) but I'm not sure if we can get away with not storing it explicitly -- splits could get very confusing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even git interactions could get weird (git commit --amend --no-edit, for example).

Copy link
Collaborator

@matts1 matts1 Sep 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that a hybrid approach where we don't store it explicitly automatically, but if you have manually stored it explicitly then we use that Change-ID would work relatively well. It'd also work well with splits by providing reasonable defaults, but could be overridden if required.

Re splits, it should work just fine as long as the user knows exactly what's gonna happen with the split (specifically, jj split to associate the change-ID with the second one, and jj commit to associate the Change-ID with the first one). Maybe we need something like jj change-id swap foo bar or jj change-id set -r foo <change-id>.

lilyball added a commit to lilyball/jj that referenced this pull request Oct 19, 2024
This merges PR martinvonz#3191 (OpenSSH) and PR martinvonz#2845 (Gerrit) in order to build
from the combination.
lilyball added a commit to lilyball/jj that referenced this pull request Oct 19, 2024
This merges PR martinvonz#3191 (OpenSSH) and PR martinvonz#2845 (Gerrit) in order to build
from the combination.
Natural extension of the existing `[T]` instance.

Signed-off-by: Austin Seipp <[email protected]>
Change-Id: Ib7f6fd829096b2cac8e3d7b9471a92ddb76a621b
To be used for parsing `Change-Id`s from commits, in service of Gerrit
support.

Signed-off-by: Austin Seipp <[email protected]>
Change-Id: I434d76b1229b36b815622ad7409ced3a405cbe22
This implements the most basic workflow for submitting changes to Gerrit,
through a verb called 'send'. This verb is intended to be distinct from the word
'submit', which for Gerrit means 'merge a change into the repository.'

Given a list of revsets (specified by multiple `-r` options), this will parse
the footers of every commit, collect them, insert a `Change-Id` (if one doesn't
already exist), and then push them into the given remote.

Because the argument is a revset, you may submit entire trees of changes at
once, including multiple trees of independent changes, e.g.

    jj gerrit send -r foo:: -r baz::

There are many other improvements that can be applied on top of this, including
a ton of consistency and "does this make sense?" checks. However, it is flexible
and a good starting point, and you can in fact both submit and cycle reviews
with this interface.

Signed-off-by: Austin Seipp <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.