
Add CI for rolling-version on humble/iron #30

Merged
merged 18 commits into from
Mar 15, 2024

Conversation

christophfroehlich
Contributor

@christophfroehlich christophfroehlich commented Mar 3, 2024

Accompanying

Any better ideas for the names of the workflows?

I don't think we want to fight the releases of ignition/gz on Humble. Therefore, I excluded ign/gz_ros2_control.

@saikishor
Member

I'm not sure calling them unstable is the right word. How about rolling compatibility on iron/humble? If needed, we can also add an unsupported keyword to the naming.

What do you think?

@christophfroehlich
Contributor Author

christophfroehlich commented Mar 3, 2024

I was looking for a term not used in the ROS ecosystem (like testing); unstable is something typical in the Debian world. But I'm open to other names. For sure it will be unstable :D

@christophfroehlich
Contributor Author

@saikishor any idea why building controller_manager fails here? https://github.com/ros-controls/ros2_control_ci/actions/runs/8133285062/job/22224687946?pr=30

@saikishor
Member

This happened to me earlier on my machine. I believe the reason is an installed binary version of ros2_control_test_assets that is causing the issue. I solved it on my local machine by uninstalling the installed version.

@christophfroehlich
Contributor Author

Ah, you are right. It is installed with gazebo_ros2_control (see question in initial post)

  executing command [apt-get install -y -qq ros-humble-gazebo-ros2-control]
  Setting up ros-humble-ros2-control-test-assets (2.39.1-1jammy.20240217.030243) ...
  Setting up ros-humble-control-msgs (4.4.0-1jammy.20240217.061208) ...
  Setting up ros-humble-realtime-tools (2.5.0-1jammy.20240217.075900) ...
  Setting up ros-humble-controller-manager-msgs (2.39.1-1jammy.20240217.051450) ...
  Setting up ros-humble-hardware-interface (2.39.1-1jammy.20240217.081024) ...
  Setting up ros-humble-controller-interface (2.39.1-1jammy.20240217.081539) ...
  Setting up ros-humble-controller-manager (2.39.1-1jammy.20240217.082030) ...
  Setting up ros-humble-gazebo-ros2-control (0.4.6-1jammy.20240217.083418) ...

But do we have an issue with overriding the binary version?

@christophfroehlich
Contributor Author

And at least one test fails on iron :(

   [ RUN      ] TestLoadController.spawner_test_type_in_param
   ...
     terminate called after throwing an instance of 'std::runtime_error'
      what():  Can not get command interface configuration until the controller is configured.

@saikishor
Member

Ah, you are right. It is installed with gazebo_ros2_control (see question in initial post)

Ideally, if we use the upstream ros2_control, it should work directly. We need to test it. In the worst case, we can skip this as you suggested, since we don't have to add compatibility with it.

But do we have an issue with overriding the binary version?

I believe yes, we might have an issue with overriding the binary versions. I think it is better to do it without any installed versions.

@saikishor
Member

And at least one test fails on iron :(

   [ RUN      ] TestLoadController.spawner_test_type_in_param
   ...
     terminate called after throwing an instance of 'std::runtime_error'
      what():  Can not get command interface configuration until the controller is configured.

That's weird; it needs to work. I can take a look later.

@saikishor
Member
left a comment

@christophfroehlich Can't we use reusable builds in this case on Iron and Humble?

@christophfroehlich
Contributor Author

christophfroehlich commented Mar 4, 2024

@christophfroehlich Can't we use reusable builds in this case on Iron and Humble?

Sorry, what do you mean? Creating workflows in all our repos to build it there? Sure, but we didn't want it to run on PRs or pushes, so why not make it centralized here?

@saikishor
Member

Sorry, what do you mean? Creating workflows in all our repos to build it there? Sure, but we didn't want it to run on PRs or pushes, so why not make it centralized here?

Sure!

BTW, I've tested building the upstream gazebo_ros2_control version in the workspace; it seems to work directly without any modifications.

@christophfroehlich
Contributor Author

OK, but now the Humble workflow also fails because of the binary installation of the test assets. I'll have a look in the next few days at how I could fix this.

@saikishor
Member

Sure. Thank you

@christophfroehlich
Contributor Author

@saikishor The Humble tests are failing because of this incompatibility: ros-controls/ros2_control#913 (comment)
Can one define conditionals for Python imports as well?

@saikishor
Member

@saikishor The Humble tests are failing because of this incompatibility: ros-controls/ros2_control#913 (comment)
Can one define conditionals for Python imports as well?

I'm sorry about that. I didn't run the tests locally, and I didn't think about the Python part. You are right. If I find some time, I'll get onto it.
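For reference, the usual way to define a conditional Python import is a try/except ImportError fallback; here is a minimal sketch of that pattern. The modules used (importlib.metadata and its PyPI backport) are purely illustrative, not ros2_control code:

```python
# Minimal sketch of a conditional Python import: try the newer module
# first and fall back to an alternative if it is not available. The
# importlib.metadata / importlib_metadata pair is just an illustration.
try:
    from importlib import metadata  # stdlib since Python 3.8
except ImportError:
    import importlib_metadata as metadata  # older interpreters: PyPI backport


def package_version(name: str) -> str:
    """Return the installed version of a distribution, whichever import won."""
    return metadata.version(name)
```

The same shape works for guarding a distro-specific API difference: both branches must end up binding the same name, so the rest of the code stays agnostic about which import succeeded.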

@christophfroehlich
Contributor Author

christophfroehlich commented Mar 13, 2024

The Humble jobs fail because of gz_ros2_control, but this is not included anywhere. I think it is pulled in from the cache, because the normal stack build and this job share the cache? I'm not sure about the github.run_id, but the rest of the key is the same: target_ws-humble--testing-reusable_industrial_ci_with_cache--8256893443

CACHE_PREFIX: ${{ inputs.ros_distro }}-${{ inputs.upstream_workspace }}-${{ inputs.ros_repo }}-${{ github.job }}

key: target_ws-${{ env.CACHE_PREFIX }}-${{ hashFiles('**/CMakeLists.txt', '**/package.xml') }}-${{ github.run_id }}

I don't want to test gz/ign_ros2_control on humble here. (see the errors we have with the normal stack build).
@fmauch any suggestion how to fix this?

@saikishor
Member

It is weird that it is caching the gz_ros2_control package and running its tests; strangely, it is hard to reproduce locally on a clean Humble Docker image. It would be nice to figure it out and fix it before merging this.

@fmauch
Contributor

fmauch commented Mar 14, 2024

The Humble jobs fail because of gz_ros2_control, but this is not included anywhere. I think it is pulled in from the cache, because the normal stack build and this job share the cache? I'm not sure about the github.run_id, but the rest of the key is the same: target_ws-humble--testing-reusable_industrial_ci_with_cache--8256893443

CACHE_PREFIX: ${{ inputs.ros_distro }}-${{ inputs.upstream_workspace }}-${{ inputs.ros_repo }}-${{ github.job }}

key: target_ws-${{ env.CACHE_PREFIX }}-${{ hashFiles('**/CMakeLists.txt', '**/package.xml') }}-${{ github.run_id }}

I don't want to test gz/ign_ros2_control on humble here. (see the errors we have with the normal stack build). @fmauch any suggestion how to fix this?

Yes, that's definitely a problem. The target_ws is restored by the key

target_ws-${{ env.CACHE_PREFIX }}-${{ hashFiles('**/CMakeLists.txt', '**/package.xml') }}

with ${{ env.CACHE_PREFIX }} being

${{ inputs.ros_distro }}-${{ inputs.upstream_workspace }}-${{ inputs.ros_repo }}-${{ github.job }}

resulting in

target_ws-humble--main-reusable_industrial_ci_with_cache--8261042993

Given the --, the github.job seems to be empty, so as soon as another cache was built without touching a CMakeLists.txt or package.xml, that one will be reused.

I don't quite understand why it is empty, though; I would expect it to be stack-build-on-humble, since that's also the name shown in the GitHub UI.


Edit: Re-reading the documentation makes that clear:

The job_id of the current job.
Note: This context property is set by the Actions runner, and is only available within the execution steps of a job. Otherwise, the value of this property will be null.

Since we're not inside a step when defining the prefix, it is empty.
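A minimal illustration of that behavior, with made-up job and step names (this is a sketch, not our actual workflow):

```yaml
# Illustration only: github.job is set by the Actions runner and is
# populated only inside a job's steps; at workflow level it is empty.
env:
  CACHE_PREFIX: prefix-${{ github.job }}   # expands to "prefix-" here

jobs:
  demo:
    runs-on: ubuntu-latest
    steps:
      - run: echo "github.job is '${{ github.job }}'"   # here it is "demo"
```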


Edit 2: Thinking about this, it might be a larger problem. It means that an action run from a PR might re-use the cache from the master branch (as long as neither a CMakeLists.txt nor a package.xml is touched). I think github.ref should get in there, too.

@christophfroehlich
Contributor Author

No, I think the github.job is reusable_industrial_ci_with_cache: It takes the job from the reusable workflow, not that from the calling workflow.

@christophfroehlich
Contributor Author

Edit 2: Thinking about this, it might be a larger problem. It means that an action run from a PR might re-use the cache from the master branch (as long as neither a CMakeLists.txt nor a package.xml is touched). I think github.ref should get in there, too.

But using github.ref would mean that PRs don't use the cache from other branches. And if we have a restore-key without github.ref, then we would have the same issue as in this PR.

@fmauch
Contributor

fmauch commented Mar 14, 2024

No, I think the github.job is reusable_industrial_ci_with_cache: It takes the job from the reusable workflow, not that from the calling workflow.

Not sure, but either way, if the job id isn't available... github.workflow could be an alternative.

@christophfroehlich
Contributor Author

Maybe we could use a composite action instead of a reusable workflow? Then the job id would be unique, as we would like.

@fmauch
Contributor

fmauch commented Mar 14, 2024

but using github.ref would mean that PRs don't use the cache from other branches.

True, but do we want that? I guess, we want that for PR runs, right? We could add github.base_ref for those.

And if we have a restore-key without github.ref, then we would have the same issue as in this PR.

The restore key obviously has to go hand in hand with it.

@fmauch
Contributor

fmauch commented Mar 14, 2024

Maybe we could use a composite action instead of a reusable workflow? Then the job id would be unique, as we would like.

That does indeed not sound too bad...

@christophfroehlich
Contributor Author

christophfroehlich commented Mar 14, 2024

but using github.ref would mean that PRs don't use the cache from other branches.

True, but do we want that? I guess, we want that for PR runs, right? We could add github.base_ref for those.

PRs targeting master should use the cache from the master branch, but not the other way round. GitLab does not use caches from PRs; how does that work with GitHub?

base_ref is what we want for PRs, but how would that work with scheduled builds or push events?

The base_ref or target branch of the pull request in a workflow run. This property is only available when the event that triggers a workflow run is either pull_request or pull_request_target.

Maybe we could use a composite action instead of a reusable workflow? Then the job id would be unique, as we would like.

That does indeed not sound too bad...

if the jobs are unique for different branches, then this would solve the case from above?

@fmauch
Contributor

fmauch commented Mar 14, 2024

if the jobs are unique for different branches, then this would solve the case from above?

Not necessarily. Caches from PRs and scheduled builds would still get mixed up. That might be fine in most cases, but it could also lead to strange errors like the one mentioned above.

I guess, the cleanest would be:

  • Use a proper job identifier (e.g. github.workflow)
  • Use three possible restore keys:
    • ...${{ github.base_ref }}-${{ github.ref }}
    • ...${{ github.base_ref }}
    • ...${{ github.ref }}

This way, on a PR it would search for a cache like ...master-my_feature and, if it doesn't find that during the first build, use ...master; a scheduled build would use ...master, since ...master- and ... don't exist.

That could work, right?

Changing things to actual actions rather than reusable workflow could be a separate discussion, I think.
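As a sketch, the proposed scheme could look roughly like this. The path and the use of actions/cache are assumptions for illustration, not the final implementation:

```yaml
# Hedged sketch of the proposal: a branch-aware primary key plus
# fallback restore-keys, tried in order. Names and paths are assumed.
env:
  CACHE_PREFIX: ${{ inputs.ros_distro }}-${{ inputs.upstream_workspace }}-${{ inputs.ros_repo }}-${{ github.workflow }}

steps:
  - uses: actions/cache@v3
    with:
      path: .work/target_ws   # assumed workspace path
      key: target_ws-${{ env.CACHE_PREFIX }}-${{ github.base_ref }}-${{ github.ref }}-${{ hashFiles('**/CMakeLists.txt', '**/package.xml') }}-${{ github.run_id }}
      restore-keys: |
        target_ws-${{ env.CACHE_PREFIX }}-${{ github.base_ref }}-${{ github.ref }}-
        target_ws-${{ env.CACHE_PREFIX }}-${{ github.base_ref }}-
        target_ws-${{ env.CACHE_PREFIX }}-${{ github.ref }}-
```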

@christophfroehlich
Contributor Author

christophfroehlich commented Mar 14, 2024

if the jobs are unique for different branches, then this would solve the case from above?

Not necessarily. Caches from PRs and scheduled builds would still get mixed up. That might be fine in most cases, but it could also lead to strange errors like the one mentioned above.

I guess, the cleanest would be:

* Use a proper job identifier (e.g. `github.workflow`)

* Use three possible restore keys:
  
  * `...${{ github.base_ref }}-${{ github.ref }}`
  * `...${{ github.base_ref }}`
  * `...${{ github.ref }}`

This way, on a PR it would search for a cache like ...master-my_feature and, if it doesn't find that during the first build, use ...master; a scheduled build would use ...master, since ...master- and ... don't exist.

That could work, right?

Not sure; what will github.workflow be from within the reusable workflow?

The name of the workflow. If the workflow file doesn't specify a name, the value of this property is the full path of the workflow file in the repository.
from https://docs.github.com/en/actions/learn-github-actions/contexts

Changing things to actual actions rather than reusable workflow could be a separate discussion, I think.

Maybe it would just solve the problem above, because github.job would then name the calling job.

@fmauch
Contributor

fmauch commented Mar 14, 2024

not sure, what will be github.workflow from within the reusable workflow?

Indeed, that's an interesting question, but it should at least be available, in contrast to the job's name.

Maybe it just solves the problem above, because the github.job then names the calling job.

It will still not be available outside of the steps, right? Or will it, because it is part of a step of the calling job?

I guess it is time to prototype something to verify; I probably won't find time for that before the weekend.

@christophfroehlich
Contributor Author

I have now added gz_ros2_control for the Humble build, and it succeeds as well. I'd merge this PR if that's OK with you, @fmauch @saikishor, or do you have other naming suggestions?
I'll open an issue for the cache topic.

@saikishor
Member

Nice, great job guys. CI is passing now!

@saikishor
Member
left a comment

I'm fine with the naming. Thanks for the changes and good work guys @christophfroehlich @fmauch 👏

@fmauch
Contributor
left a comment

Thank you, that looks good.

@christophfroehlich christophfroehlich merged commit eb41b3f into master Mar 15, 2024
2 checks passed
@christophfroehlich christophfroehlich deleted the add_rolling_for_humble branch March 15, 2024 13:29