Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools/read-version: fix the tool so that it can handle version parsing errors #4234

Merged
merged 1 commit into from
Jul 19, 2023

Conversation

ani-sinha
Copy link
Contributor

git describe may not return version/tags in the format that the read-version tool expects. Make the tool robust so that it can gracefully handle version strings that are not in the regular format.

@ani-sinha
Copy link
Contributor Author

The test is failing because version_long is None since:

version_long": "23.2-50-ge2a7178a",

does not conform to X.Y.Z-xxx-gHASH format.

tools/read-version Outdated Show resolved Hide resolved
@ani-sinha ani-sinha force-pushed the fix-find-version branch 2 times, most recently from 53313a3 to 9efc1e4 Compare July 10, 2023 09:42
@TheRealFalcon
Copy link
Member

@ani-sinha , can you give me a few examples of the version numbers that you're trying to use and how/where they are failing? I just want to be aware of the use cases and testing appropriately.

@ani-sinha
Copy link
Contributor Author

ani-sinha commented Jul 12, 2023

@ani-sinha , can you give me a few examples of the version numbers that you're trying to use and how/where they are failing? I just want to be aware of the use cases and testing appropriately.

For example, in our RHEL repo, we are seeing this:

$ python3 tools/read-version
version_long: cloud-init-23.1.1-4.el8-0-g609debd4
version: cloud-init-23.1.1-4.el8
Traceback (most recent call last):
File "/workspace/rhel-cloud-init/tools/read-version", line 119, in
distance, commit = info.split("-")
^^^^^^^^^^^^^^^^
ValueError: too many values to unpack (expected 2)

Its because the tags start with "cloud-init-" prefix.

$ git describe HEAD
cloud-init-23.1.1-4.el8
$ git describe HEAD~4
cloud-init-23.1.1-2.el8-2-g285d8d80

etc.

@ani-sinha
Copy link
Contributor Author

Traceback (most recent call last):
File "/workspace/rhel-cloud-init/tools/read-version", line 119, in
distance, commit = info.split("-")
^^^^^^^^^^^^^^^^
ValueError: too many values to unpack (expected 2)

tags can be arbitrary. The tool should not fail. It should still find something reasonable to deal with or leave versions empty.

@TheRealFalcon TheRealFalcon self-assigned this Jul 18, 2023
@TheRealFalcon
Copy link
Member

@ani-sinha , thanks for the info. What about something like this?

diff --git a/tools/read-version b/tools/read-version
index 1e4842efb..5d6112dc4 100755
--- a/tools/read-version
+++ b/tools/read-version
@@ -51,6 +51,37 @@ def is_gitdir(path):
     return False
 
 
+def get_extra_details(version, version_long):
+    release = None
+    extra = None
+    commit = None
+    distance = None
+
+    # Should match upstream version number. E.g., 23.1 or 23.1.2
+    short_regex = r"(\d+\.\d+\.?\d*)"
+    # Should match version including upstream version, distance, and commit
+    # E.g., 23.1.2-10-g12ab34cd
+    long_regex = r"(\d+\.\d+\.?\d*){1}.*-(\d+)+-g([a-f0-9]{8}){1}.*"
+
+    short_match = re.search(short_regex, version)
+    long_match = re.search(long_regex, version_long)
+    if long_match:
+        release, distance, commit = long_match.groups()
+        extra = f"-{distance}-g{commit}"
+    elif short_match:
+        release = short_match.groups()[0]
+
+    return {
+        "release": release,
+        "version": version,
+        "version_long": version_long,
+        "extra": extra,
+        "commit": commit,
+        "distance": distance,
+        "is_release_branch_ci": is_release_branch_ci,
+    }
+
+
 use_long = "--long" in sys.argv or os.environ.get("CI_RV_LONG")
 use_tags = "--tags" in sys.argv or os.environ.get("CI_RV_TAGS")
 output_json = "--json" in sys.argv
@@ -106,40 +137,10 @@ else:
     version = src_version
     version_long = ""
 
-# version is X.Y.Z[+xxx.gHASH]
-# version_long is "" or X.Y[.Z]-xxx-gHASH
-# validate the version formats
-if not re.match('^\d+\.\d+\.\d+', version):
-    version = src_version
-
-if version_long and \
-   not re.match('^\d+\.\d+(\.\d+)*-\d+-g*', version_long):
-    version_long = ""
-
-release = version.partition("-")[0]
-extra = None
-commit = None
-distance = None
-
-if version_long:
-    info = version_long.partition("-")[2]
-    extra = f"-{info}"
-    distance, commit = info.split("-")
-    # remove the 'g' from gHASH
-    commit = commit[1:]
-
-data = {
-    "release": release,
-    "version": version,
-    "version_long": version_long,
-    "extra": extra,
-    "commit": commit,
-    "distance": distance,
-    "is_release_branch_ci": is_release_branch_ci,
-}
 
 if output_json:
-    sys.stdout.write(json.dumps(data, indent=1) + "\n")
+    details = get_extra_details(version, version_long)
+    sys.stdout.write(json.dumps(details, indent=1) + "\n")
 else:
     sys.stdout.write(version + "\n")
 

Changes:

  1. The long details are only collected if --json is passed, and it's moved into its own function.
  2. We use regex capture to capture the details we care about, but if we cannot find them, we won't traceback and will continue to use version and version_long as expected.

One added benefit is the --long details work for your provided version number as well.

If it works for you, feel free to git apply it.

@ani-sinha
Copy link
Contributor Author

ani-sinha commented Jul 19, 2023

get_extra_details

Sadly this won't work. I am getting the following when I run my unit tests:

        ###################
        # Invalid version #
        ###################
        'cloud+init-23.1.1-4.el8-15-g6bc5de66' is not valid according to PEP 440.

$ tools/read-version 
cloud-init-23.1.1-4.el8-15-g6bc5de66
$ tools/read-version --json
{
 "release": "23.1.1",
 "version": "cloud-init-23.1.1-4.el8-15-g6bc5de66",
 "version_long": "cloud-init-23.1.1-4.el8-15-g6bc5de66",
 "extra": "-15-g6bc5de66",
 "commit": "6bc5de66",
 "distance": "15",
 "is_release_branch_ci": false
}

@ani-sinha
Copy link
Contributor Author

@ani-sinha , thanks for the info. What about something like this?

diff --git a/tools/read-version b/tools/read-version
index 1e4842efb..5d6112dc4 100755
--- a/tools/read-version
+++ b/tools/read-version
@@ -51,6 +51,37 @@ def is_gitdir(path):
     return False
 
 
+def get_extra_details(version, version_long):
+    release = None
+    extra = None
+    commit = None
+    distance = None
+
+    # Should match upstream version number. E.g., 23.1 or 23.1.2
+    short_regex = r"(\d+\.\d+\.?\d*)"
+    # Should match version including upstream version, distance, and commit
+    # E.g., 23.1.2-10-g12ab34cd
+    long_regex = r"(\d+\.\d+\.?\d*){1}.*-(\d+)+-g([a-f0-9]{8}){1}.*"
+
+    short_match = re.search(short_regex, version)
+    long_match = re.search(long_regex, version_long)
+    if long_match:
+        release, distance, commit = long_match.groups()
+        extra = f"-{distance}-g{commit}"
+    elif short_match:
+        release = short_match.groups()[0]
+
+    return {
+        "release": release,
+        "version": version,
+        "version_long": version_long,
+        "extra": extra,
+        "commit": commit,
+        "distance": distance,
+        "is_release_branch_ci": is_release_branch_ci,
+    }
+
+
 use_long = "--long" in sys.argv or os.environ.get("CI_RV_LONG")
 use_tags = "--tags" in sys.argv or os.environ.get("CI_RV_TAGS")
 output_json = "--json" in sys.argv
@@ -106,40 +137,10 @@ else:
     version = src_version
     version_long = ""
 
-# version is X.Y.Z[+xxx.gHASH]
-# version_long is "" or X.Y[.Z]-xxx-gHASH
-# validate the version formats
-if not re.match('^\d+\.\d+\.\d+', version):
-    version = src_version
-
-if version_long and \
-   not re.match('^\d+\.\d+(\.\d+)*-\d+-g*', version_long):
-    version_long = ""
-
-release = version.partition("-")[0]
-extra = None
-commit = None
-distance = None
-
-if version_long:
-    info = version_long.partition("-")[2]
-    extra = f"-{info}"
-    distance, commit = info.split("-")
-    # remove the 'g' from gHASH
-    commit = commit[1:]
-
-data = {
-    "release": release,
-    "version": version,
-    "version_long": version_long,
-    "extra": extra,
-    "commit": commit,
-    "distance": distance,
-    "is_release_branch_ci": is_release_branch_ci,
-}
 
 if output_json:
-    sys.stdout.write(json.dumps(data, indent=1) + "\n")
+    details = get_extra_details(version, version_long)
+    sys.stdout.write(json.dumps(details, indent=1) + "\n")
 else:
     sys.stdout.write(version + "\n")
 

Changes:

  1. The long details are only collected if --json is passed, and it's moved into its own function.
  2. We use regex capture to capture the details we care about, but if we cannot find them, we won't traceback and will continue to use version and version_long as expected.

One added benefit is the --long details work for your provided version number as well.

If it works for you, feel free to git apply it.

@TheRealFalcon I made some small changes on top and this seems to work

diff --git a/tools/read-version b/tools/read-version
index ebc46462..1990fa5b 100755
--- a/tools/read-version
+++ b/tools/read-version
@@ -58,10 +58,10 @@ def get_extra_details(version, version_long):
     distance = None
 
     # Should match upstream version number. E.g., 23.1 or 23.1.2
-    short_regex = r"(\d+\.\d+\.?\d*)"
+    short_regex = r"^(\d+\.\d+\.?\d*)"
     # Should match version including upstream version, distance, and commit
     # E.g., 23.1.2-10-g12ab34cd
-    long_regex = r"(\d+\.\d+\.?\d*){1}.*-(\d+)+-g([a-f0-9]{8}){1}.*"
+    long_regex = r"^(\d+\.\d+\.?\d*){1}.*-(\d+)+-g([a-f0-9]{8}){1}.*"
 
     short_match = re.search(short_regex, version)
     long_match = re.search(long_regex, version_long)
@@ -70,6 +70,8 @@ def get_extra_details(version, version_long):
         extra = f"-{distance}-g{commit}"
     elif short_match:
         release = short_match.groups()[0]
+    else:
+        version = src_version
 
     return {
         "release": release,
@@ -137,10 +139,11 @@ else:
     version_long = ""
 
 
+details = get_extra_details(version, version_long)
+
 if output_json:
-    details = get_extra_details(version, version_long)
     sys.stdout.write(json.dumps(details, indent=1) + "\n")
 else:
-    sys.stdout.write(version + "\n")
+    sys.stdout.write(details["version"] + "\n")
 
 sys.exit(0)

@TheRealFalcon
Copy link
Member

I see, thanks. I forgot about the PEP 440 aspect.

One more attempt if you don't mind. This is against my previous patch, not the patch you responded with, but it takes into account what you provided in yours:

diff --git a/tools/read-version b/tools/read-version
index 5d6112dc4..e2efd9e58 100755
--- a/tools/read-version
+++ b/tools/read-version
@@ -51,7 +51,7 @@ def is_gitdir(path):
     return False
 
 
-def get_extra_details(version, version_long):
+def get_version_details(version, version_long):
     release = None
     extra = None
     commit = None
@@ -138,10 +138,18 @@ else:
     version_long = ""
 
 
+details = get_version_details(version, version_long)
+
 if output_json:
-    details = get_extra_details(version, version_long)
     sys.stdout.write(json.dumps(details, indent=1) + "\n")
 else:
-    sys.stdout.write(version + "\n")
+    output = ""
+    if details["release"]:
+        output += details["release"]
+    if details["extra"]:
+        output += details["extra"]
+    if not output:
+        output = src_version
+    sys.stdout.write(output + "\n")
 
 sys.exit(0)

I'd like to not use src_version unless we absolutely have to, because more useful information can be obtained from git. It makes sense as a fallback, but given the version numbers you showed, we can still obtain all of that information from your versions.

…g errors

git describe may not return version/tags in the format that the read-version
tool expects. Make the tool robust so that it can gracefully handle
version strings that are not in the regular format.
We use regex to capture the details we care about, but if we cannot find them,
we won't traceback and will continue to use version and version_long as
expected.

Signed-off-by: Ani Sinha <[email protected]>
@ani-sinha
Copy link
Contributor Author

I see, thanks. I forgot about the PEP 440 aspect.

One more attempt if you don't mind. This is against my previous patch, not the patch you responded with, but it takes into account what you provided in yours:

diff --git a/tools/read-version b/tools/read-version
index 5d6112dc4..e2efd9e58 100755
--- a/tools/read-version
+++ b/tools/read-version
@@ -51,7 +51,7 @@ def is_gitdir(path):
     return False
 
 
-def get_extra_details(version, version_long):
+def get_version_details(version, version_long):
     release = None
     extra = None
     commit = None
@@ -138,10 +138,18 @@ else:
     version_long = ""
 
 
+details = get_version_details(version, version_long)
+
 if output_json:
-    details = get_extra_details(version, version_long)
     sys.stdout.write(json.dumps(details, indent=1) + "\n")
 else:
-    sys.stdout.write(version + "\n")
+    output = ""
+    if details["release"]:
+        output += details["release"]
+    if details["extra"]:
+        output += details["extra"]
+    if not output:
+        output = src_version
+    sys.stdout.write(output + "\n")
 
 sys.exit(0)

I'd like to not use src_version unless we absolutely have to, because more useful information can be obtained from git. It makes sense as a fallback, but given the version numbers you showed, we can still obtain all of that information from your versions.

@TheRealFalcon yes this is perfect! Works for us.

Copy link
Member

@TheRealFalcon TheRealFalcon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR and for working through this with me!

@TheRealFalcon TheRealFalcon merged commit 6543c88 into canonical:main Jul 19, 2023
25 checks passed
TheRealFalcon pushed a commit to TheRealFalcon/cloud-init that referenced this pull request Jul 26, 2023
…g errors (canonical#4234)

git describe may not return version/tags in the format that the read-version
tool expects. Make the tool robust so that it can gracefully handle
version strings that are not in the regular format.
We use regex to capture the details we care about, but if we cannot find them,
we won't traceback and will continue to use version and version_long as
expected.

Signed-off-by: Ani Sinha <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants