Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Global image for external #5553

Open
wants to merge 26 commits into
base: master
Choose a base branch
from

Conversation

judovana
Copy link
Contributor

@judovana judovana commented Sep 4, 2024

This PR is enabling any possible image to be fed to external tests.
User can set it via EXTERNAL_AQA_IMAGE variable. Eg EXTERNAL_AQA_IMAGE=my.local.repo/on/hdd/fedora:40. Any legal subset -eg just fedora is allowed.
The properties model was quite enhanced to work properly in multi distroenviroenment. This may be also good base for future localhost runs. I had tested all features on default ubuntu based temurin and on fedora:39. To tune it for centos/rhel will be quite a fun. But full tuning should not be part of this PR.

The build_all.sh is not affected by this change.

this is currently draft, becasue for some reasonnot yet found by me, the mounted jdk in fedora ... is not here. The default works. I wil continue to work on this, but any feedback appreciated.

Close #5555

to make it readable, find-able and more generic
instead of hardcoded ubuntu/ubi images a generic approach was set
generic_packages are those which ahve same name on all supported iamges
REGEX_packages cen set  special packages per distribution
eg ubuntu_packages will be installed on ubuntu only
eg (fedora|ubi|rhel|centos)_packages will be iisntalled on RH systems

This already supports versioned ones, so eg
centos:10_packages will indeed restrict the packages to os centos 10

This introduced hackish way how to handle hardcoded criu-openj9-images
in this multi-image environment
@judovana
Copy link
Contributor Author

judovana commented Sep 4, 2024

@sophia-guo / @smlambert as we talked recently

@judovana judovana marked this pull request as draft September 4, 2024 18:07
@sophia-guo
Copy link
Contributor

@judovana thanks for the work on this PR. Those enhancements are nice. They are addressing two different issues would it be nice to separate the PR into two ( one for global image, one for common property)? Also as we talked recently to keep our workflow organized and ensure each change is properly tracked and reviewed, could you please open two issues and reference them in the respective PRs?

@judovana
Copy link
Contributor Author

judovana commented Sep 5, 2024

The two parts can not go separtely in parallel. Maybe the properties part can go in first, but the geenric images depends on it. I would rather keep it togehther. They do not have sense one by one. (the btohering with first, if the second will not go in or if it will need major changes is not worthy)

Sure, I will create an issue. Thanx a lot for reminder.

@judovana
Copy link
Contributor Author

judovana commented Sep 5, 2024

#5555

It seems that default contianer was not searching directly in, as the
linking on rest of the system led to it. Now all images I tried properly
tests mounted java if it is there.

if not, default search is still on and works fine for both default with
mount, without mount or for other images with java on path without mount
@judovana judovana marked this pull request as ready for review September 5, 2024 15:08
@judovana
Copy link
Contributor Author

judovana commented Sep 5, 2024

Few external tests passes on few jdks on few images.... I will start to roll up some test matrix on this but tbh not sure how it is complete... Many external tests needs forking for newer jdks, also tuning up the depndencies will be fun....

What is required to double check to make this pass through?

@karianna karianna marked this pull request as draft September 6, 2024 07:13
# os is ubi, and test is criu
# temporarily all ubi based testing use internal base image
image_name="$DOCKER_REGISTRY_URL/$base_docker_registry_dir"
tag="latest"
EXTERNAL_AQA_IMAGE="${image_name}:${tag}"
Copy link
Contributor

@sophia-guo sophia-guo Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If EXTERNAL_AQA_IMAGE needs to be set? If yes, should move to line 114 for both temurin and openj9 ? Looks like this is not needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unhappy workaround for a corner case I was unable to manage. The only correct sollution to this is described in middle of initial description of issue #5555

The initial change should affect at least the second. To affect the build_all.sh on top of that, should be refactoring, which would remove all the for test in ${supported_tests} do for vm in ${supported_jvms} do for os in ${supported_os} do for package in ${supported_packages} do for build in ${supported_builds} do build_image.sh ${test} ${version} ${vm} ${os} ${package} ${build} to actually do all the looping through ${supported_jvms}, ${supported_os}, ${supported_packages},  ${supported_builds} first, to prepare set of (fully) qualified container IDs and then loop through them, via shortened for test in ${supported_tests} do build_image.sh ${test} ${image} I would probably like to addres it in different issue and different purpose as usage of build-all.sh is unclear to me.

This triple-if is changing to much to be served by the methods in provider.sh (without adding similar triple if there).
If the EXTERNAL_AQA_IMAGE is set in this combination, the exit is called. If not (and that is the only expected case as you need to set three parameters out of any defaults to actually overwrite them by this last-of-three ifs the setup of EXTERNAL_AQA_IMAGE is ensuring the values detected later (which are no longer consistent with the set ones because of those ifs) are correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To construct the EXTERNAL_AQA_IMAGE artificially and globaly, is quite a good idea, and I was playing with it a lot. The reason why I have not included it, are non deterministic OSes. Ubuntu was hardcoded, to be there for jdk-xy-temurin/openj9, because there is no record of ubuntu in name. From my limited knowledge, this is the only case where it is actually missing. Eg redhat java images still have ubi in name, and so on.

If there will be more usage of EXTERNAL_AQA_IMAGE then current baseurl, path, name, os variables, then I guess such a crossroad table would be necessary, but for now I decided to not implement it.

@sophia-guo
Copy link
Contributor

sophia-guo commented Sep 6, 2024

What is required to double check to make this pass through?

Based on the changes, for temurin default image I would suggest to run a build with tomcat tests. The test job doesn't need to pass. It should be good as long as the generated Dockerfile looks good. Running locally it should be good with both default and EXTERNAL_AQA_IMAGE=my.local.repo/on/hdd/fedora:40. For openj9 external tests you may need @LongyuZhang help to run a few grinder to ensure it works for openj9.

@judovana
Copy link
Contributor Author

judovana commented Sep 6, 2024

What is required to double check to make this pass through?

Based on the changes, for temurin default image I would suggest to run a build with tomcat tests. The test job doesn't need to pass. It should be good as long as the generated Dockerfile looks good. Running locally it should be good with both default and EXTERNAL_AQA_IMAGE=my.local.repo/on/hdd/fedora:40. For openj9 external tests you may need @LongyuZhang help to run a few grinder to ensure it works for openj9.

The default image(s) were not changed, and generated docekrfile remains identical!-) tat was moreover primary goal during this PR. The only chnage which is affecting the defaults is the search in /opt first, but it did not have anmy siode effect as far as I was able to check. I will try tomcat.

@judovana judovana marked this pull request as ready for review September 9, 2024 09:46
@judovana
Copy link
Contributor Author

jacoco jdk21: fails coorectly (needs different jacaoco sources: https://ci.adoptium.net/view/Test_grinder/job/Grinder/10846/console

15:36:05  [ERROR] Source option 7 is no longer supported. Use 8 or later.
15:36:05  [ERROR] Target option 7 is no longer supported. Use 8 or later.

(todo, fix??)
jacoco jdk17: https://ci.adoptium.net/view/Test_grinder/job/Grinder/10842/tapResults/
jacoco jdk11: https://ci.adoptium.net/view/Test_grinder/job/Grinder/10845/tapResults/
jacoco jdk8: https://ci.adoptium.net/view/Test_grinder/job/Grinder/10844/tapResults/

@judovana
Copy link
Contributor Author

jacoco jdk21: fails coorectly (needs different jacaoco sources: https://ci.adoptium.net/view/Test_grinder/job/Grinder/10846/console

15:36:05  [ERROR] Source option 7 is no longer supported. Use 8 or later.
15:36:05  [ERROR] Target option 7 is no longer supported. Use 8 or later.

(todo, fix??)

#5575

@judovana
Copy link
Contributor Author

tomcats on jdk11 (both)
https://ci.adoptium.net/view/Test_grinder/job/Grinder/10849/console
https://ci.adoptium.net/view/Test_grinder/job/Grinder/10850/console

It was failing for a while because of some apt error, i started debug.. .and then it suudenly started to pass... (thats why two) Let me know if yuou need more grinders.

Copy link
Contributor

@LongyuZhang LongyuZhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests for CRIU failed, looks like the container os and package install was not aligned after the change. It tries to use apt-get install inside ubi image (hyc_grinder_43087).

16:44:37       [exec] STEP 1/23: FROM sys-rt-docker-local/ubi8-with-criu/linux_x86-64-ubi8-criu:latest
16:44:37       [exec] STEP 2/23: ENV RESULT_COMMENT="IN CONTAINER(not-as-root/podman)"
16:44:37       [exec] --> fb4b96608433
16:44:37       [exec] STEP 3/23: ARG CRIU-UBI-PORTABLE-CHECKPOINT_TAG=
16:44:37       [exec] --> b2cd85d4a861
16:44:37       [exec] STEP 4/23: RUN apt-get update 	&& apt-get install -qq -y --no-install-recommends software-properties-common 	&& apt-get install -qq -y --no-install-recommends gnupg 	&& add-apt-repository ppa:ubuntu-toolchain-r/test 	&& apt-get update 	&& apt-get install -y --no-install-recommends git wget unzip tar curl   	&& rm -rf /var/lib/apt/lists/*
16:44:39       [exec] /bin/sh: apt-get: command not found
16:44:39       [exec] Error: building at STEP "RUN apt-get update 	&& apt-get install -qq -y --no-install-recommends software-properties-common 	&& apt-get install -qq -y --no-install-recommends gnupg 	&& add-apt-repository ppa:ubuntu-toolchain-r/test 	&& apt-get update 	&& apt-get install -y --no-install-recommends git wget unzip tar curl   	&& rm -rf /var/lib/apt/lists/*": while running runtime: exit status 127
16:44:39  
16:44:39  BUILD FAILED
``

@judovana
Copy link
Contributor Author

Tests for CRIU failed, looks like the container os and package install was not aligned after the change. It tries to use apt-get install inside ubi image (hyc_grinder_43087).

16:44:37       [exec] STEP 1/23: FROM sys-rt-docker-local/ubi8-with-criu/linux_x86-64-ubi8-criu:latest
16:44:37       [exec] STEP 2/23: ENV RESULT_COMMENT="IN CONTAINER(not-as-root/podman)"
16:44:37       [exec] --> fb4b96608433
16:44:37       [exec] STEP 3/23: ARG CRIU-UBI-PORTABLE-CHECKPOINT_TAG=
16:44:37       [exec] --> b2cd85d4a861
16:44:37       [exec] STEP 4/23: RUN apt-get update 	&& apt-get install -qq -y --no-install-recommends software-properties-common 	&& apt-get install -qq -y --no-install-recommends gnupg 	&& add-apt-repository ppa:ubuntu-toolchain-r/test 	&& apt-get update 	&& apt-get install -y --no-install-recommends git wget unzip tar curl   	&& rm -rf /var/lib/apt/lists/*
16:44:39       [exec] /bin/sh: apt-get: command not found
16:44:39       [exec] Error: building at STEP "RUN apt-get update 	&& apt-get install -qq -y --no-install-recommends software-properties-common 	&& apt-get install -qq -y --no-install-recommends gnupg 	&& add-apt-repository ppa:ubuntu-toolchain-r/test 	&& apt-get update 	&& apt-get install -y --no-install-recommends git wget unzip tar curl   	&& rm -rf /var/lib/apt/lists/*": while running runtime: exit status 127
16:44:39  
16:44:39  BUILD FAILED
``

Thanx! Will elaborate! If I do not resolve this until your timezone, can you please provide mre details how to reproduce it locally?

@judovana
Copy link
Contributor Author

judovana commented Sep 12, 2024

@LongyuZhang I'm trying something like:

sh build_image.sh  jacoco 17 openj9 ubi jdk full linux_x86-64 default 0

and

sh build_image.sh  criu-functional 17 openj9 ubi jdk full linux_x86-64 default 0 

And you seems to be right, it is ARG IMAGE=docker.io/library/eclipse-temurin:openj9-full and then of course the apt is called. Do you also see wrong ARG_IMAGE pelase?

@judovana
Copy link
Contributor Author

judovana commented Sep 28, 2024

The pacakges were moved to global scope. As you could have noticed, all - except your usecase - is passing.
Those two changes can not be separated - In the moment I'm want to support generic containers - thus at least fedora, rhel, centos and ubuntu, I need to specify packages for each of them, becasue the names and sets of packages are different. Hand in hand with it, I do not want to duplicate declarations of most common packages.

Hardcoded ubis are most terrible to adjust to the generic mages, because they were if-else-hardcoded to original code without much thinking. @LongyuZhang please provide reproducers to your case. I would really love to fix it. It seems it is the only failing one.

@judovana
Copy link
Contributor Author

When I'm in maste the test you are pointing out dies:

export BUILD_LIST=external/criu-ubi-portable-checkpoint ; export EXTRA_DOCKER_ARGS="-v $TEST_JDK_HOME:/opt/java/openjdk" ;  export TEST_JDK_HOME=/usr/lib/jvm/java-11-openjdk-11.0.24.0.8-2.portable.jdk.el.x86_64/
make compile
...
    [exec] DOCKERIMAGE_TAG nightly has been recognized.
     [exec] No credential available for container registry, will proceed without login...
     [exec] sudo podman pull ${env.DOCKER_REGISTRY_URL}/ubi8-with-criu/linux_x86-64-ubi8-criu:latest
     [exec] Error: parsing reference "${env.DOCKER_REGISTRY_URL}/ubi8-with-criu/linux_x86-64-ubi8-criu:latest": invalid reference format

When it is in this pr then it dies absolutely the same.

…eneric

all `git wget perl make` install fie on all tried OSes
@judovana
Copy link
Contributor Author

judovana commented Sep 28, 2024

From that I'm guessing that the issue will be in function getMatchingPackagesKeys() { and/or function getImageOs() { , because I'm reading OS here and it seems the os is linux_x86-64-ubi8-criu. The packages declaration is (fedora|ubi|rhel|centos)_packages="git wget perl make" and if anything would go ok, the .*(fedora|ubi|rhel|centos).* matches linux_x86-64-ubi8-criu so something went wrong in this particular usecase, where hardocded UBIs are present. Msot likely the EXTERNAL_AQA_IMAGE, which is workaround for that if-ubi-if-critu unhappy blocks, was not set in time. But without reproducer or full verbose log, I can just guess

Actually the tes which you are unhappy with, have all packages installable on all tried OSes. I can mititgate now the issue by redeclaring it, but it is jsut moving the problem to criu-portable-checkpoint or to improbable future when criu will be tested on system with different package names

You may try again. please?

@judovana
Copy link
Contributor Author

the

 sh build_image.sh  criu-functional 17 openj9 ubi jdk full linux_x86-64 linux_x86-64-ubi8-criu  0 

correctly, although I dont know if it is at least close to reperoducer

@LongyuZhang
Copy link
Contributor

Re: #5553 (comment)
With the new update, the test passed (hyc_grinder 43691).

Since the CRIU tests have passed, I will leave this PR for @sophia-guo @smlambert to check on their side then. Thanks.

@judovana
Copy link
Contributor Author

judovana commented Oct 1, 2024

Thanx! I'm happy reading from crystal ball was successful!-) I 'm still very curious how to reproduce it locally, because I'm afraid hiding it behind "generic_packages" the issu is hidden. As the criu tests are heavily bounded to single container, the real issue maynever popup again, but still:(

@LongyuZhang if you please can share, how can i run that particular tast, I t would be highly appreciated. TYVM!!

@LongyuZhang
Copy link
Contributor

The reason we set this test differently is because our internal machines don't have access to packages needed by CRIU, so the infra team pre-built CRIU into the container. If you have access to packages to build CRIU in ubi, you can follow the CRIU instruction to build the ubi base image for this test. Thanks.

@judovana
Copy link
Contributor Author

judovana commented Oct 7, 2024

@smlambert @sophia-guo hello! Anything more is needed?

Copy link
Contributor

@smlambert smlambert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have started reviewing this PR, but have stopped for now and will put a hold on it while I look at the current functionality versus what we used to have. I believe we had this functionality of using other images/OSes 'previously' but must have 'regressed'. (related: #3363).

external/README.md Outdated Show resolved Hide resolved
@@ -0,0 +1,6 @@
# those packages are included in all tests and all distributions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the point of this file? Seem it will never actually be used, as most external application tests vary so much, it does not have value. Drop it. It can be added in a later PR when and if it becomes needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. But before I do, please check: 5f88915#diff-3e2f7b162bce7417355d4fe0f18e6366b69e6abe5e73356f863274d3a4f9c04bR2 Those generic_packages="git wget unzip tar curl" are used in 99% of tests (see the rest of 5f88915 and search for -ubuntu_packages or -ubi_packages). I think such base really should be added. (note, this is not disagreeing with removal, just asking for brainstroming. I will remove it solemnly on this comment).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smlambert Even if practically unused, I would liek to keep this file. Is is extremly helpful when new distribution is tried. It simply allows you to declare its specific dependencies on one place, and then you can easily pinpoint it. And if this file remains unused for future, then they can be tuned up for individual suites alter.

external/build_all.sh Outdated Show resolved Hide resolved
external/build_all.sh Outdated Show resolved Hide resolved
external/external.sh Outdated Show resolved Hide resolved
@judovana
Copy link
Contributor Author

judovana commented Oct 7, 2024

I have started reviewing this PR, but have stopped for now and will put a hold on it while I look at the current functionality versus what we used to have. I believe we had this functionality of using other images/OSes 'previously' but must have 'regressed'. (related: #3363).

Right you are, there was such a support, but it needed IFs for every distribution. I had made it as general as possible. There is only one real if left, which distinguish apt and dnf systems. I would like to get rid also of it no longer run (another property in test.properties I guess). But this PR is already too big, so not (yet)added.
In addition, that support expected to pass each part of container URI as it separated aprts, which - considering number of those parts - was very uneasy to read, use or enhance.

@judovana
Copy link
Contributor Author

decalred -> declared fixed. Any luxk with build_all.sh ?

@judovana
Copy link
Contributor Author

fixed waht seemed to be fixable...
@smlambert Any news on "the hold" ? It keeps to be missing feature...
@sophia-guo You are ok with it as it is, or any more hints?

Thanx a lot both!

external/build_all.sh Outdated Show resolved Hide resolved
external/common_functions.sh Outdated Show resolved Hide resolved
@@ -212,6 +224,9 @@ print_ant_install() {
local file=$1
local ant_version=$2
local os=$3
if isExternalImageEnabled ; then
os=$(getImageOs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on line 512 this os variable is in Uppercase - which one should it be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They have nothing in common. The OS was introduced 4 years ago, os was introduced 2 years ago. They both are local. I handle the same additional logic to both of them, as both share the value of current os. I can refactor to be both lowercase (As are other local variables). But I do not strong preffernece, nor do I want to add more complexity to this PR. Still Sharp eye you have. TY!

Both probably share that theirs authors are originally form Asia origin :)

# ================= WARNING ================= WARNING ================= WARNING ================= #"
else
echo \
" # Info Info Info Info Info Info Info Info Info Info Info Info Info Info Info Info Info Info Info #
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this whitespacing produce a consistent output?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had aligned itto that "====" format Shelly suggested. They provide as consistent output as possible. Do you have any more exact concern with them? Like being to long or simiarly?

@@ -3,5 +3,5 @@ test_command="mvn clean verify"
test_results="testResults"
tag_version="v0.8.7"
environment_variable="MODE=java"
ubuntu_packages="git wget tar"
maven_version="3.8.5"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move to 3.9.9?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No way in this PR:( But otherwise I would love to.

external/provider.sh Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

provide a way to allow to run external tests on any container
5 participants