[URGENT] Reducing our usage of GitHub Runners #14376
Comments
This PR disables all CI Jobs for macOS and Windows, to reduce GitHub Cost. Details here: apache/nuttx#14376
As commented by @xiaoxiang781216:
I suggest that we monitor the GitHub Cost after disabling macOS and Windows Jobs. It's possible that macOS and Windows Jobs are contributing a huge part of the cost. We could re-enable and simplify them after monitoring. |
One of the methods proposed by, if I remember correctly @btashton, is to replace many simple configurations for some boards (mostly for peripherals testing) with one large |
@raiden00pl Yep I agree. Or we could test a complex target like |
Here's another comment about macOS and Windows by @yamt: #14377 (comment) |
sorry, let me ask a dumb question. |
@yamt It's probably a special plan negotiated by ASF and GitHub? It's not mentioned in the ASF Policy for GitHub Actions: https://infra.apache.org/github-actions-policy.html I find this "contract" a little strange. Why are all ASF Projects subjected to the same quotas? And why can't we increase the quota if we happen to have additional funding? Update: More info here: https://cwiki.apache.org/confluence/display/INFRA/GitHub+self-hosted+runners
Update 2: This sounds really complicated. I'd rather use my own Mac Mini to execute the NuttX CI Tests, once a day? |
do you know if the macos/windows premium applies as usual?
yea, i guess projects have very different sizes/demands. |
Is there any merit in "farming out" CI tests to those with boards? I think there was a discussion about NuttX owning a suite of boards but I'm not sure where that got to, and it would depend on just 1 or 2 people managing it. As an aside, is there a guide to self-running CI? As I work on a custom board it would be good for me to do this occasionally but I have no idea where to start! |
@TimJTi Here's how I do daily testing on Milk-V Duo S SBC: https://lupyuen.github.io/articles/sg2000a |
And I just RTFM...the "official" guide is here so I'll review both and hopefully get it working - and submit any tweaks/corrections/enhancements I find are needed to the NuttX "How To" documentation |
These work, but it does not describe the entire CI, just how to run pytest checks for |
Yes let's cut what we can (but keep at least minimal functional configure, build, and syntax testing) and see what the cost reduction is. We need to show Apache we are working on the problem. So far optimizations did not cut the usage and we are in danger of losing all CI :-( On the other hand, it does not seem fair to share the same CI quota as small projects. NuttX is a fully featured RTOS working on ~1000 different devices. In order to keep project code quality we need the CI. Maybe it's time to rethink / redesign from scratch the CI test architecture and implementation? |
Another problem is that people very often send unfinished, undescribed PRs that are then updated without a comment or request, which triggers the whole big CI process several times :-( Some changes are sometimes required and we cannot avoid that, this is part of the process. But maybe we can make something more "adaptive" so only minimal CI is launched by default, preferably only in the area that was changed, then with all approvals we can make one manually triggered final big check before merge? Long story short: We can switch CI test runs to manual trigger for now to see how it reduces costs. I would see two buttons to start Basic and Advanced (maybe also Full = current setup) CI. |
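The "adaptive" idea above could be sketched roughly like this: look at the paths a PR changes, work out which architectures they touch, and start the big CI only when more than one is affected. This is a minimal sketch and not the actual NuttX workflow. The function names `affected_archs` and `is_simple_pr` are hypothetical, and the rule that any path outside `arch/<name>/` or `boards/<name>/` counts as affecting everything is an assumption made for illustration.

```python
from typing import Iterable, Set

# Marker meaning "this change could affect every architecture" (assumption).
ALL = "*"


def affected_archs(changed_paths: Iterable[str]) -> Set[str]:
    """Return the architecture names touched by the changed paths."""
    archs: Set[str] = set()
    for path in changed_paths:
        parts = path.split("/")
        # Assumed layout: arch/<arch>/... and boards/<arch>/... name the
        # architecture in their second path component.
        if len(parts) >= 3 and parts[0] in ("arch", "boards"):
            archs.add(parts[1])
        else:
            # Anything else (sched/, drivers/, arch/Kconfig, ...) is treated
            # as affecting all architectures.
            archs.add(ALL)
    return archs


def is_simple_pr(changed_paths: Iterable[str]) -> bool:
    """A PR is 'simple' when it touches exactly one architecture."""
    archs = affected_archs(changed_paths)
    return len(archs) == 1 and ALL not in archs
```

With a rule like this, a PR that only edits files under `arch/arm/` would get the Basic run, while a PR that also edits a shared directory would need the Full run before merge.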
@cederom Maybe our PRs should have a Mandatory Field: Which NuttX Config to build, e.g. |
People often can't fill in even one single sentence to describe Summary, Impact, Testing :D This may be detected automatically.. or we can just see which architecture is the cheapest one and use it for all basic tests..? |
Often contributors use CI to test all configurations instead of testing changes locally. On one hand I understand this because compiling all configurations on a local machine takes a lot of time, on the other hand I'm not sure if CI is for this purpose (especially when we have limits on its use).
It won't work. Users are lazy, and in order to choose what needs to be compiled correctly, you need a comprehensive knowledge of the entire NuttX, which is not that easy. |
So it looks like for now, where dramatic steps need to be taken, we need to mark all PR as drafts and start CI by hand when we are sure all is ready for merge? o_O |
When we submit or update a Complex PR that affects All Architectures (Arm, RISC-V, Xtensa, etc): CI Workflow shall run only half the jobs. Previously CI Workflow ran `arm-01` to `arm-14`, now we will run only `arm-01` to `arm-07`. When the Complex PR is Merged: CI Workflow will still run all jobs `arm-01` to `arm-14`. Simple PRs with One Single Arch / Board will build the same way as before: `arm-01` to `arm-14`. This is explained here: apache#14376. Note that this version of `arch.yml` has diverged from `nuttx-apps`, since we are unable to merge apache#14377
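The job-halving rule described above can be written down as a small decision function. This is only an illustrative sketch of the rule as stated in this thread, not the contents of `arch.yml`; the names `arm_jobs` and `select_arm_jobs` and the two boolean flags are assumptions.

```python
from typing import List


def arm_jobs(first: int, last: int) -> List[str]:
    """Build job names such as arm-01 ... arm-14 (inclusive range)."""
    return [f"arm-{n:02d}" for n in range(first, last + 1)]


def select_arm_jobs(is_complex: bool, is_merge: bool) -> List[str]:
    """Pick the Arm build jobs for a CI run.

    is_complex: the PR affects all architectures (hypothetical flag).
    is_merge:   the run was triggered by merging the PR (hypothetical flag).
    """
    if is_complex and not is_merge:
        # Complex PR that is created or updated: run only half the jobs.
        return arm_jobs(1, 7)
    # Merged Complex PRs and Simple PRs: build the same way as before.
    return arm_jobs(1, 14)
```

For example, `select_arm_jobs(True, False)` gives the seven jobs `arm-01` to `arm-07`, while the other combinations give all fourteen.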
5 Days to Freedom: Yesterday we consumed 14 Full-Time GitHub Runners. That's 56% of the ASF Quota for Full-Time Runners. Looking good!
|
@lupyuen It looks I made a mistake with some commit messages, that caused our branch to get referenced to a few issues in the apache repo. My apologies. I believe I have removed the commit message references, but if there is anything else I need to do to fix this, please let me know and I will get on it ASAP. |
@stbenn No worries thanks :-) |
4 Days to Festivity: Yesterday we consumed 13 Full-Time GitHub Runners (half of the ASF Quota for GitHub Runners)... Past 7 Days: We used an average of 9 Full-Time GitHub Runners... So we're on track to make ASF very happy on 30 Oct! Let's monitor today... |
Thank you @lupyuen for your amazing work!! Have a good calm weekend :-) :-) |
3 Days to Tranquility: Yesterday was a quiet Saturday (no more Release Builds yay!). We consumed only 4 Full-Time GitHub Runners... Let's hope today will be a peaceful Sunday... |
Something strange about Network Timeouts in our Docker Workflows: First Run fails while downloading something from GitHub:
Second Run fails again, while downloading NimBLE from GitHub:
Third Run succeeds. Why do we keep seeing these errors: GitHub Actions with Docker, can't connect to GitHub itself? Is something misconfigured in our Docker Image? But the exact same Docker Image runs fine on my own Build Farm. It doesn't show any errors. Is GitHub Actions starting our Docker Container with the wrong MTU (Network Packet Size)? 🤔 Meanwhile I'm running a script to Restart Failed Jobs on our NuttX Mirror Repos: restart-failed-job.sh |
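Since the third run succeeded without any change, one generic way to cope with such transient download failures is to retry with an increasing delay. The sketch below is not what `restart-failed-job.sh` does (that script is not shown here) and it does not explain the root cause. The function name `retry`, the default of three attempts, and the one-second base delay are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds, doubling the delay after each failure.

    Only OSError (which includes ConnectionError and TimeoutError) is retried;
    the last failure is re-raised so the job still fails visibly.
    """
    for attempt in range(attempts):
        try:
            return action()
        except OSError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists only so the delay can be replaced in tests; in normal use the default `time.sleep` applies.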
2 Days to Transcendence: Yesterday we consumed 10 Full-Time GitHub Runners. We peaked briefly at 21 while compiling a few NuttX Apps. Let's keep on monitoring thanks! |
Monitoring our CI Servers 24 x 7: This runs on my 4K TV (Xiaomi 65-inch) all day, all night. When I'm out on Overnight Hikes, I check my phone at every water break. I have GitHub Scripts that will run on Termux Android (remember to
|
Lup's Operations Center =) |
1 Day to Utopia: Yesterday was a busy Monday, we consumed 14 Full-Time GitHub Runners. That's 56% of the ASF Quota for Full-Time Runners... We peaked briefly at 26 Full-Time Runners. Let's hang in there thanks! :-) |
2 days but we should be fine thanks to our Super Hero @lupyuen !! AVE =) |
Thank you so much @cederom! :-) |
Kudos Lup!
Best regards
Alin
|
0 Days to Final Audit: ASF Infra Team will be checking on us one last time today! Yesterday was a super busy Tuesday, we consumed 15 Full-Time GitHub Runners (peaked briefly at 31) Past 7 Days: We consumed 12 Full-Time Runners, which is half the ASF Quota of 25 Full-Time Runners yay! FYI: Our "Monthly Bill" for GitHub Actions used to be $18K... Right now our Monthly Bill is $14K. And still dropping! Let's wait for the good news from ASF, thank you everyone! 🙏 |
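The percentages quoted in these daily updates follow from the ASF Quota of 25 Full-Time Runners. A small sketch of the arithmetic is below. It assumes that one "Full-Time Runner" means one runner kept busy for 24 hours a day, which is a reading of the term and not something stated in this thread; the function names are hypothetical.

```python
ASF_QUOTA_FULL_TIME_RUNNERS = 25  # quota quoted in the thread
MINUTES_PER_DAY = 24 * 60


def full_time_runners(runner_minutes: float, days: float = 1.0) -> float:
    """Convert total runner-minutes over `days` into Full-Time Runners.

    Assumption: one Full-Time Runner is one runner busy around the clock.
    """
    return runner_minutes / (MINUTES_PER_DAY * days)


def quota_share(runners: float) -> float:
    """Fraction of the ASF quota used by `runners` Full-Time Runners."""
    return runners / ASF_QUOTA_FULL_TIME_RUNNERS
```

With these definitions, 14 Full-Time Runners is 14 / 25 = 56% of the quota, and the 7-day average of 12 is 48%, which is the "half the ASF Quota" mentioned above.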
🙏 🙏 🙏 |
GitHub Actions had some laggy issues just now: https://www.githubstatus.com/incidents/9yk1fbk0qjjc So please ignore the over-inflated data in our report (because everything got lagged). Thanks! |
It's Oct 31 and our CI Servers are still running. We made it yay! 🎉 We got plenty to do:
Thank you everyone for making this happen! 🙏 |
BIG THANK YOU @lupyuen FOR YOUR HELP, TIME, AND PATIENCE!! |
Due to the [recent cost-cutting](apache/nuttx#14376), we are no longer running PR Merge Jobs in the `nuttx` and `nuttx-apps` repos. For this to happen, I am now running a script on my computer that will cancel any PR Merge Jobs that appear: [kill-push-master.sh](https://github.com/lupyuen/nuttx-release/blob/main/kill-push-master.sh) This PR disables PR Merge Jobs permanently, so that we no longer need to run the script. This prevents our CI Charges from over-running, in case the script fails to operate properly.
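To illustrate what "cancel any PR Merge Jobs that appear" involves, here is a sketch of the selection step only. It is not the logic of `kill-push-master.sh`. It assumes that each workflow run is available as a dictionary with `id`, `event`, `head_branch` and `status` fields (modelled on how GitHub describes workflow runs), and that a PR Merge Job is a run triggered by a push to `master`. The function name `runs_to_cancel` is hypothetical.

```python
from typing import Any, Dict, List


def runs_to_cancel(runs: List[Dict[str, Any]], branch: str = "master") -> List[int]:
    """Return the IDs of unfinished runs that were triggered by a push to `branch`."""
    return [
        int(run["id"])
        for run in runs
        if run.get("event") == "push"
        and run.get("head_branch") == branch
        and run.get("status") != "completed"
    ]
```

Runs triggered by pull requests, and runs that have already completed, are left alone; the actual cancel request for each selected ID is outside this sketch.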
Hi All: We have an ultimatum to reduce (drastically) our usage of GitHub Actions. Or our Continuous Integration will halt totally in Two Weeks. Here's what I'll implement within 24 hours for `nuttx` and `nuttx-apps` repos:
1. When we submit or update a Complex PR that affects All Architectures (Arm, RISC-V, Xtensa, etc): CI Workflow shall run only half the jobs. Previously CI Workflow will run `arm-01` to `arm-14`, now we will run only `arm-01` to `arm-07`. (This will reduce GitHub Cost by 32%)
   When the Complex PR is Merged: CI Workflow will still run all jobs `arm-01` to `arm-14`
   (Simple PRs with One Single Arch / Board will build the same way as before: `arm-01` to `arm-14`)
2. For NuttX Admins: Our Merge Jobs are now at github.com/NuttX/nuttx. We shall have only Two Scheduled Merge Jobs per day
   I shall quickly Cancel any Merge Jobs that appear in `nuttx` and `nuttx-apps` repos. Then at 00:00 UTC and 12:00 UTC: I shall start the Latest Merge Job at `nuttxpr`. (This will reduce GitHub Cost by 17%)
3. macOS and Windows Jobs (msys2 / msvc): They shall be totally disabled until we find a way to manage their costs. (GitHub charges 10x premium for macOS runners, 2x premium for Windows runners!)
   Let's monitor the GitHub Cost after disabling macOS and Windows Jobs. It's possible that macOS and Windows Jobs are contributing a huge part of the cost. We could re-enable and simplify them after monitoring.
   (This must be done for BOTH `nuttx` and `nuttx-apps` repos. Sadly the ASF Report for GitHub Runners doesn't break down the usage by repo, so we'll never know how much macOS and Windows Jobs are contributing to the cost. That's why we need CI: Disable all jobs for macOS and Windows #14377)
   (Wish I could run NuttX CI Jobs on my M2 Mac Mini. But the CI Script only supports Intel Macs sigh. Buy a Refurbished Intel Mac Mini?)
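To see why macOS and Windows Jobs could be "a huge part of the cost" even when they use few minutes, the premiums quoted above can be applied to raw runner minutes. This is a rough sketch: the 10x and 2x multipliers come from this post, the 1x baseline for Linux is an assumption, and the function names and the sample numbers in the usage note are made up for illustration.

```python
from typing import Dict

# Multipliers as quoted in the post; the Linux baseline of 1 is an assumption.
PREMIUM = {"linux": 1, "windows": 2, "macos": 10}


def billed_minutes(minutes_by_os: Dict[str, float]) -> float:
    """Total minutes after applying the per-OS premium."""
    return sum(PREMIUM[name] * minutes for name, minutes in minutes_by_os.items())


def cost_share(minutes_by_os: Dict[str, float]) -> Dict[str, float]:
    """Fraction of the billed total attributable to each OS."""
    total = billed_minutes(minutes_by_os)
    if total == 0:
        return {name: 0.0 for name in minutes_by_os}
    return {
        name: PREMIUM[name] * minutes / total
        for name, minutes in minutes_by_os.items()
    }
```

As a made-up example, 1000 Linux minutes plus 100 Windows minutes plus 100 macOS minutes bill as 2200 minutes, and the macOS jobs account for about 45% of that despite being under 10% of the raw minutes.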
We have done an Analysis of CI Jobs over the past 24 hours:
https://docs.google.com/spreadsheets/d/1ujGKmUyy-cGY-l1pDBfle_Y6LKMsNp7o3rbfT1UkiZE/edit?gid=0#gid=0
Many CI Jobs are Incomplete: We waste GitHub Runners on jobs that eventually get superseded and cancelled
When we Halve the CI Jobs: We reduce the wastage of GitHub Runners
Scheduled Merge Jobs will also reduce wastage of GitHub Runners, since most Merge Jobs don't complete (only 1 completed yesterday)
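The "wastage" in the analysis above can be expressed as the share of runner minutes spent on jobs that were cancelled before finishing. The sketch below assumes each job is recorded as a pair of minutes used and a conclusion string such as `"success"` or `"cancelled"`; that record shape and the function name `wasted_share` are assumptions, and the spreadsheet linked above may be organised differently.

```python
from typing import Iterable, Tuple


def wasted_share(jobs: Iterable[Tuple[float, str]]) -> float:
    """Fraction of runner minutes spent on jobs that ended up cancelled.

    jobs: (minutes_used, conclusion) pairs; the shape is an assumption.
    """
    jobs = list(jobs)
    total = sum(minutes for minutes, _ in jobs)
    wasted = sum(minutes for minutes, conclusion in jobs if conclusion == "cancelled")
    return wasted / total if total else 0.0
```

Halving the jobs per Complex PR lowers the minutes at risk each time a run gets superseded, which is the effect the second point describes.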
See the ASF Policy for GitHub Actions