Threading (?) causing glibc/low level crashes #155

Open
fricpa opened this issue Jul 18, 2024 · 4 comments

Comments

fricpa commented Jul 18, 2024

Platform

Knowing the platform greatly narrows down the potential causes of the problem.

  • Platform linux-arm32/64, Raspberry Pi 3/4, amd64
  • OS version buster (arm32), bookworm (arm64 = aarch64), Ubuntu 24.04
  • hid4java version 0.8.0
  • OpenJDK 11.0.23 (arm32, amd64) and 17.0.11 (aarch64), respectively, on those platforms

To Reproduce

Steps to reproduce the behavior:

Write a trivial program:

HidServices hidServices =
            HidManager.getHidServices(new HidServicesSpecification());
while (true) hidServices.getAttachedHidDevices();

Let it run for a while on the specified platforms.

Expected behavior

Runs without issues forever.

Screenshots and logs

I have observed three crash modes so far (note that I have a little script that runs the app and logs some output, but the basic program is as above).

All of them often appear within a few minutes of running that loop. Sometimes, however, they don't appear for a long time, or only after I plug in some devices and read/write some data to them.

2024-07-18T10:53:34,225 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
2024-07-18T10:53:34,226 INFO  [org.example.Main.main()] org.example.Main - =======================
2024-07-18T10:53:34,227 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
2024-07-18T10:53:34,228 INFO  [org.example.Main.main()] org.example.Main - =======================
2024-07-18T10:53:34,229 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
double free or corruption (!prev)
./run.sh: line 7:  1484 Aborted                 MAVEN_OPTS="-ea" mvn package exec:java "-Dexec.mainClass=org.example.Main"
FATAL ERROR EXIT CODE 134 AT ./run.sh:7
2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - =======================
2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
corrupted size vs. prev_size
./run.sh: line 7:  2845 Aborted                 MAVEN_OPTS="-ea" mvn package exec:java "-Dexec.mainClass=org.example.Main"
FATAL ERROR EXIT CODE 134 AT ./run.sh:7

2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - =======================
2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    #  SIGSEGV (0xb) at pc=0x0000007fa7bada9c, pid=2754, tid=2802
    #
    # JRE version: OpenJDK Runtime Environment (17.0.11+9) (build 17.0.11+9-Debian-1deb12u1)
    # Java VM: OpenJDK 64-Bit Server VM (17.0.11+9-Debian-1deb12u1, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-aarch64)
    # Problematic frame:
    # C  [libc.so.6+0x8da9c]
    [timeout occurred during error reporting in step "printing problematic frame"] after 30 s.
    # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
    #
    # An error report file with more information is saved as:
    # /home/pi/hid4java-apd-test/hs_err_pid2754.log
    # [ timer expired, abort... ]
    ./run.sh: line 7:  2754 Aborted                 MAVEN_OPTS="-ea" mvn package exec:java "-Dexec.mainClass=org.example.Main"
    FATAL ERROR EXIT CODE 134 AT ./run.sh:7

Or, on Ubuntu 24.04:

2024-07-18T11:07:20,940 INFO  [org.example.Main.main()] org.example.Main - =======================
2024-07-18T11:07:20,940 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000074a8142ab7ec, pid=12909, tid=12965
#
# JRE version: OpenJDK Runtime Environment (11.0.23+9) (build 11.0.23+9-post-Ubuntu-1ubuntu1)
# Java VM: OpenJDK 64-Bit Server VM (11.0.23+9-post-Ubuntu-1ubuntu1, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [libc.so.6+0xab7ec]
[timeout occurred during error reporting in step "printing problematic frame"] after 30 s.
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/ubuntu/hid4java-apd-test/core.12909)
#
# An error report file with more information is saved as:
# /home/ubuntu/hid4java-apd-test/hs_err_pid12909.log
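
A side note on the `EXIT CODE 134` lines in the Raspberry Pi logs above: shells report a process killed by signal N as exit status 128 + N, so 134 corresponds to signal 6, which is SIGABRT on Linux. That is consistent with glibc calling abort() after it detects heap corruption. A minimal sketch of the arithmetic (the class and method names are made up for illustration):

```java
final class ExitCodes {
    /** Returns the signal number encoded in a shell exit status, or -1 if there is none. */
    static int signalFromExitStatus(int status) {
        // Shell convention: a process killed by signal N is reported as 128 + N.
        return status > 128 ? status - 128 : -1;
    }

    public static void main(String[] args) {
        // 134 - 128 = 6, which is SIGABRT on Linux.
        System.out.println(signalFromExitStatus(134));
    }
}
```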

Additional information
I have not observed any of these failure modes on amd64 Windows 10; the loop seems to run forever there, as it should.

However, on Linux it's definitely broken on every platform I tested.

It seems that many such issues can be caused by calling native code from multiple Java threads:

https://stackoverflow.com/questions/22491797/java-double-free-or-corruption
https://stackoverflow.com/questions/49628615/understanding-corrupted-size-vs-prev-size-glibc-error

I don't quite understand why hid4java needs any threads in the first place. At least for my use case, all I need is synchronous enumeration and synchronous read and write (with timeout), all of which are synchronous calls in hidapi.

FWIW, I have attached the hs_err log files:
hs_err_pid2754.log
hs_err_pid12909.log

fricpa commented Jul 18, 2024

For now, I have created a private fork of this repo and removed all thread-based functionality (scan thread, reader thread). With that change, the same infinite loop never crashes the program.

fricpa commented Jul 18, 2024

FWIW, with hid4java:0.8.0 I have now also seen a fatal error/crash on Windows 10 at least once. The hs_err log is attached:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ffd41751b40, pid=22844, tid=27032
#
# JRE version: OpenJDK Runtime Environment Temurin-11.0.23+9 (11.0.23+9) (build 11.0.23+9)
# Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.23+9 (11.0.23+9, mixed mode, tiered, compressed oops, g1 gc, windows-amd64)
# Problematic frame:
# C  0x00007ffd41751b40
#
# No core dump will be written. Minidumps are not enabled by default on client versions of Windows
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

hs_err_pid22844.log

fricpa commented Jul 18, 2024

With the threadless version I cannot reproduce this anymore, so the current implementation must violate some thread-safety requirements.

I would be fairly cautious about interacting with libhidapi.so, and even with JNA, in anything but a single-threaded or at least serialized fashion.
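
To illustrate what "serialized" could look like on the caller's side, here is a minimal sketch that funnels every native call through one dedicated thread. `SerializedNativeAccess` and the thread name `hid-native` are hypothetical and not part of hid4java. The sketch only serializes the caller's own calls, so it assumes the library is not running background threads of its own that touch hidapi at the same time.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical helper: runs every submitted call on one dedicated thread. */
final class SerializedNativeAccess implements AutoCloseable {
    // A single worker thread executes tasks one at a time, in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "hid-native");
        t.setDaemon(true);
        return t;
    });

    /** Runs the call on the worker thread and blocks until it has finished. */
    <T> T call(Callable<T> nativeCall) throws Exception {
        try {
            return worker.submit(nativeCall).get();
        } catch (ExecutionException e) {
            // Unwrap so callers see the original exception where possible.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        }
    }

    @Override
    public void close() {
        worker.shutdown();
    }

    public static void main(String[] args) throws Exception {
        try (SerializedNativeAccess hid = new SerializedNativeAccess()) {
            // Placeholder for a real call such as hidServices.getAttachedHidDevices().
            String threadName = hid.call(() -> Thread.currentThread().getName());
            System.out.println("native call ran on: " + threadName);
        }
    }
}
```

Any thread can use `call(...)`, and the wrapped work still runs strictly one at a time. The wrapper must not be invoked from inside a task that is already running on the worker thread, because that task would then wait on itself.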

fricpa commented Jul 31, 2024

Here's a reference for hidapi not being thread-safe:
https://github.com/libusb/hidapi/wiki

FAQ: hidapi is not thread-safe in general. How to use hidapi in multithreaded application?
libusb/hidapi#45
