Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grpc StatusRuntimeException: UNAVAILABLE #78

Open
jeongum opened this issue Jun 20, 2023 · 21 comments
Open

grpc StatusRuntimeException: UNAVAILABLE #78

jeongum opened this issue Jun 20, 2023 · 21 comments

Comments

@jeongum
Copy link

jeongum commented Jun 20, 2023

안녕하세요, 데드 코드 분석을 위해 scavenger 도입 중에 있습니다.

agent를 프로젝트를 붙이는 과정에, 프로젝트를 실행시키면 아래와 같은 에러가 납니다.

2023-06-20 00:57:03.320 [scavenger scheduler] ERROR c.n.s.javaagent.scheduling.Scheduler - [scavenger] poll config failed
sc.io.grpc.StatusRuntimeException: UNAVAILABLE: End of stream or IOException
	at sc.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271)
	at sc.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252)
	at sc.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165)
	at sc.com.navercorp.scavenger.model.GrpcAgentServiceGrpc$GrpcAgentServiceBlockingStub.pollConfig(GrpcAgentServiceGrpc.java:263)
	at com.navercorp.scavenger.javaagent.publishing.GrpcClient.pollConfig(GrpcClient.java:35)
	at com.navercorp.scavenger.javaagent.publishing.Publisher.pollDynamicConfig(Publisher.java:99)
	at com.navercorp.scavenger.javaagent.scheduling.Scheduler.pollDynamicConfigIfNeeded(Scheduler.java:126)
	at com.navercorp.scavenger.javaagent.scheduling.Scheduler.run(Scheduler.java:114)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)

혹시 에러의 원인과 해결방법을 알 수 있을까요?

감사합니다.

@sohyun-ku
Copy link
Contributor

@jeongum

안녕하세요.
Collector, API 를 실행한 방법과 configuration 파일 공유부탁드리겠습니다.

@jeongum
Copy link
Author

jeongum commented Jun 20, 2023

@sohyun-ku
Collector, API 모두 도커 이미지로 빌드하여 Deployment 된 상태입니다.
URL은 사내 테스트용 도메인이라 알려드리지 못하는 점 양해부탁드립니다.

scavenger.conf

serverUrl = http://{{collector-url}}
apiKey = 0c54c04a-9816-4e63-bf69-432ac0e1eb5b
appName = wedo
packages = com.sbe.wedo.controller
methodVisibility = public
excludeConstructors = true
excludeGetterSetter = true
asyncCodeBaseScanMode = true

Collector Dockerfile EntryPoint

ENTRYPOINT ["java","-Dspring.datasource.url=${JDBC_URL}","-Dspring.datasource.username=${JDBC_USERNAME}", "-Dspring.datasource.password=${JDBC_PASSWORD}", "-Dspring.datasource.driver-class-name=${JDBC_DRIVER}", "-jar", "-Dameria.port=${PORT}", "/scavenger-collector-1.1.0.jar"]

API Dockerfile EntryPoint

ENTRYPOINT ["java","-Dscavenger.collector-server-url=${COLLECTOR_SERVER_URL}", "-Dspring.datasource.url=${JDBC_URL}", "-Dspring.datasource.username=${JDBC_USERNAME}", "-Dspring.datasource.password=${JDBC_PASSWORD}", "-Dspring.datasource.driver-class-name=${JDBC_DRIVER}", "-Dserver.port=${PORT}", "-jar", "/scavenger-api-1.1.0.jar"]

@sohyun-ku
Copy link
Contributor

@jeongum

Collector에서 사용하는 포트 정보도 알려주실 수 있으실까요?

추가로 아래는 설정이 잘못된 것으로 보입니다. (참고 - https://github.com/naver/scavenger/blob/develop/doc/installation.md#configuration)

Collector Dockerfile EntryPoint

ENTRYPOINT ["java","-Dspring.datasource.url=${JDBC_URL}","-Dspring.datasource.username=${JDBC_USERNAME}", "-Dspring.datasource.password=${JDBC_PASSWORD}", "-Dspring.datasource.driver-class-name=${JDBC_DRIVER}", "-jar", "-Dameria.port=${PORT}", "/scavenger

-Dameria.port -> -Darmeria.port

@jeongum
Copy link
Author

jeongum commented Jun 20, 2023

@sohyun-ku
사용하는 포트는 8080입니다.

설정은 픽스하도록 하겠습니다. 감사합니다.

@taeyeon-Kim
Copy link
Contributor

@jeongum 안녕하세요. (_ _)
이미 8080포트 사용하고 계셨다면 딱히 포트 이슈는 아닌걸로 보이네요.
혹시 공유주신 UNAVAILABLE 발생 시간대에 collector나 agent 로그를 확인할 수 있을까요??

@jeongum
Copy link
Author

jeongum commented Jun 20, 2023

@taeyeon-Kim 안녕하세요.

Collector에서 별다른 에러로그는 발견하지 못했고,

04:41:47.701 [armeria-common-worker-epoll-2-1] INFO  c.l.a.i.common.JavaVersionSpecific - Using the APIs optimized for: Java 12+
04:41:47.838 [armeria-common-blocking-tasks-1-1] INFO  o.a.c.c.C.[Tomcat].[localhost].[/] - Initializing Spring DispatcherServlet 'dispatcherServlet'
04:41:47.838 [armeria-common-blocking-tasks-1-1] INFO  o.s.web.servlet.DispatcherServlet - Initializing Servlet 'dispatcherServlet'
04:41:47.850 [armeria-common-blocking-tasks-1-1] INFO  o.s.web.servlet.DispatcherServlet - Completed initialization in 12 ms
04:41:47.943 [armeria-common-blocking-tasks-1-1] INFO  c.n.s.controller.AgentController - init config requested from grpc client 10.202.61.51 with licenseKey: 0c54c04a-9816-4e63-bf69-432ac0e1eb5b
04:42:39.612 [armeria-common-blocking-tasks-1-1] INFO  c.n.s.controller.AgentController - init config requested from grpc client 10.202.69.15 with licenseKey: 0c54c04a-9816-4e63-bf69-432ac0e1eb5b

위와 같은 로그가 남아있습니다.

agent 로그입니다.

INFO: [scavenger] scavenger agent version 1.0.3 is starting...
Jun 20, 2023 4:41:33 AM com.navercorp.scavenger.javaagent.util.ConfigUtils locateConfigFile
INFO: [scavenger] config file found explicitly: /usr/local/scavenger/scavenger.conf


███████╗ ██████╗ █████╗ ██╗   ██╗███████╗███╗   ██╗ ██████╗ ███████╗██████╗
██╔════╝██╔════╝██╔══██╗██║   ██║██╔════╝████╗  ██║██╔════╝ ██╔════╝██╔══██╗
███████╗██║     ███████║██║   ██║█████╗  ██╔██╗ ██║██║  ███╗█████╗  ██████╔╝
╚════██║██║     ██╔══██║╚██╗ ██╔╝██╔══╝  ██║╚██╗██║██║   ██║██╔══╝  ██╔══██╗
███████║╚██████╗██║  ██║ ╚████╔╝ ███████╗██║ ╚████║╚██████╔╝███████╗██║  ██║
╚══════╝ ╚═════╝╚═╝  ╚═╝  ╚═══╝  ╚══════╝╚═╝  ╚═══╝ ╚═════╝ ╚══════╝╚═╝  ╚═╝

                 agent version :: 1.0.3
               config location :: /usr/local/scavenger/scavenger.conf
                       api key :: 0c54c04a-9816-4e63-bf69-432ac0e1eb5b
                    server url :: http://{{collector-domain}}
                      app name :: wedo
                   app version :: unspecified
                   environment :: <default>
                       package :: com.sbe.wedo.controller
             method visibility :: public
          exclude constructors :: true
      exclude setters, getters :: true
                      hostname :: wedo-deployment-fzzwx
     async code base scan mode :: true
     legacy compatibility mode :: false

[scavenger scheduler] ERROR c.n.s.javaagent.scheduling.Scheduler - [scavenger] poll config failed
sc.io.grpc.StatusRuntimeException: UNAVAILABLE: End of stream or IOException
	at sc.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271)
	at sc.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252)
	at sc.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165)
	at sc.com.navercorp.scavenger.model.GrpcAgentServiceGrpc$GrpcAgentServiceBlockingStub.pollConfig(GrpcAgentServiceGrpc.java:263)
	at com.navercorp.scavenger.javaagent.publishing.GrpcClient.pollConfig(GrpcClient.java:35)
	at com.navercorp.scavenger.javaagent.publishing.Publisher.pollDynamicConfig(Publisher.java:99)
	at com.navercorp.scavenger.javaagent.scheduling.Scheduler.pollDynamicConfigIfNeeded(Scheduler.java:126)
	at com.navercorp.scavenger.javaagent.scheduling.Scheduler.run(Scheduler.java:114)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)

(+ 로그의 server url은 collector url로 나옵니다!)

감사합니다..!

@taeyeon-Kim
Copy link
Contributor

@jeongum 정음님 혹시 에이전트를 1.1.0버전으로 사용해보시겠어요??

@jeongum
Copy link
Author

jeongum commented Jun 20, 2023

@taeyeon-Kim
버전 1.1.0으로 해도 동일하게 작동합니다 😥


███████╗ ██████╗ █████╗ ██╗   ██╗███████╗███╗   ██╗ ██████╗ ███████╗██████╗
██╔════╝██╔════╝██╔══██╗██║   ██║██╔════╝████╗  ██║██╔════╝ ██╔════╝██╔══██╗
███████╗██║     ███████║██║   ██║█████╗  ██╔██╗ ██║██║  ███╗█████╗  ██████╔╝
╚════██║██║     ██╔══██║╚██╗ ██╔╝██╔══╝  ██║╚██╗██║██║   ██║██╔══╝  ██╔══██╗
███████║╚██████╗██║  ██║ ╚████╔╝ ███████╗██║ ╚████║╚██████╔╝███████╗██║  ██║
╚══════╝ ╚═════╝╚═╝  ╚═╝  ╚═══╝  ╚══════╝╚═╝  ╚═══╝ ╚═════╝ ╚══════╝╚═╝  ╚═╝

                 agent version :: 1.1.0
               config location :/wedo/scavenger/scavenger.conf
                       api key :: 0c54c04a-9816-4e63-bf69-432ac0e1eb5b
                    server url :: http://{{collector-url}}
                      app name :: wedo
                   app version :: unspecified
                   environment :: <default>
                       package :: com.sbe.wedo.controller
             method visibility :: public
          exclude constructors :: true
      exclude setters, getters :: true
                      hostname :: l-PC.local
     async code base scan mode :: true
     legacy compatibility mode :: false

[scavenger scheduler] ERROR c.n.s.javaagent.scheduling.Scheduler - [scavenger] poll config failed
sc.io.grpc.StatusRuntimeException: UNAVAILABLE: End of stream or IOException
	at sc.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271)
	at sc.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252)
	at sc.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165)
	at sc.com.navercorp.scavenger.model.GrpcAgentServiceGrpc$GrpcAgentServiceBlockingStub.pollConfig(GrpcAgentServiceGrpc.java:263)
	at com.navercorp.scavenger.javaagent.publishing.GrpcClient.pollConfig(GrpcClient.java:44)
	at com.navercorp.scavenger.javaagent.publishing.Publisher.pollDynamicConfig(Publisher.java:99)
	at com.navercorp.scavenger.javaagent.scheduling.Scheduler.pollDynamicConfigIfNeeded(Scheduler.java:126)
	at com.navercorp.scavenger.javaagent.scheduling.Scheduler.run(Scheduler.java:114)

Collector

armeria-common-blocking-tasks-1-2] INFO  c.n.s.controller.AgentController - init config requested from grpc client 172.31.180.47 with licenseKey: 0c54c04a-9816-4e63-bf69-432ac0e1eb5b

@jeongum
Copy link
Author

jeongum commented Jun 20, 2023

@taeyeon-Kim 안녕하세요..!
사내 개발 환경이라 환경을 조금 바꾸어 보았더니 아래와 같이 새로운 오류가 보입니다..!

2023-06-20 14:59:00.831 [scavenger scheduler] INFO  c.n.s.j.publishing.GrpcClient - [scavenger] creating new grpc client. host is {{collector-url}}
2023-06-20 14:59:03.039 [scavenger scheduler] ERROR c.n.s.javaagent.scheduling.Scheduler - [scavenger] poll config failed
sc.io.grpc.StatusRuntimeException: INTERNAL: error in frame handler
	at sc.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271)
	at sc.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252)
	at sc.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165)
	at sc.com.navercorp.scavenger.model.GrpcAgentServiceGrpc$GrpcAgentServiceBlockingStub.pollConfig(GrpcAgentServiceGrpc.java:263)
	at com.navercorp.scavenger.javaagent.publishing.GrpcClient.pollConfig(GrpcClient.java:35)
	at com.navercorp.scavenger.javaagent.publishing.Publisher.pollDynamicConfig(Publisher.java:99)
	at com.navercorp.scavenger.javaagent.scheduling.Scheduler.pollDynamicConfigIfNeeded(Scheduler.java:126)
	at com.navercorp.scavenger.javaagent.scheduling.Scheduler.run(Scheduler.java:114)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: FRAME_SIZE_ERROR: 4740180
	at sc.io.grpc.okhttp.internal.framed.Http2.ioException(Http2.java:597)
	at sc.io.grpc.okhttp.internal.framed.Http2.access$200(Http2.java:48)
	at sc.io.grpc.okhttp.internal.framed.Http2$Reader.nextFrame(Http2.java:138)
	at sc.io.grpc.okhttp.OkHttpClientTran

Docs에 해당 에러에 대한 정보를 찾을 수 없어서요.

해결 방법을 알 수 있을까요?

감사합니다.

@taeyeon-Kim
Copy link
Contributor

taeyeon-Kim commented Jun 20, 2023

@jeongum 안녕하세요. (_ _)
FRAME_SIZE_ERROR: 4740180를 보니.... 아마도.. grpc에 대한 통신 이슈가 있는걸로 보입니다.ㅠ...
사내 네트워크 단에서 발생하는 이슈일 수도 있으니... 관련된 부분을 확인 부탁드립니다.

@jeongum
Copy link
Author

jeongum commented Jun 20, 2023

@taeyeon-Kim 넵 감사합니다!

@tngusdl3719
Copy link
Contributor

@jeongum 저도 도커로 구성했을때 동일한 에러가 발생했었어요
서버에서 jar실행하는 형태로 구성하니 정상적으로 통신되더라구요
jar로 실행해서 확인한번 해보셔도 좋을 것 같아요

@taeyeon-Kim
Copy link
Contributor

@tngusdl3719 아마 docker 컨테이너간 네트워크 이슈로 예상이되네요...

@jeongum 혹시 이슈해결됐을까요??

@jeongum
Copy link
Author

jeongum commented Jul 23, 2023

@taeyeon-Kim 안녕하세요,
Kubernetes 환경에서 gRPC 를 ingress-nginx로 서비스 하는데에 이슈가 있는 것으로 파악하였습니다.
현재는 Collector를 NodePort 서비스로 띄워 직접 붙는 방식을 택하여 사용하고 있습니다.
(nginx가 gRPC를 지원한지 꽤 된 것으로 알고있지만, 왜인지 잘 안붙네요 ㅠ..)

@OhSeungho
Copy link

저도 동일한 이슈 발생중입니다.
EKS v1.26 사용중이며, 비슷한 형식의 Dockerfile로 컨테이너 빌드하였고, tcpdump으로 확인해보니 동일한 request지만 컨테이너 Collector가 해당 문제를 발생시킵니다. EC2 vm 사용시 이슈 없음을 확인하였습니다.

@namsic6460
Copy link

저도 동일한 이슈 발생하였습니다
Route53 + Caddy / Route53 + ALB 두가지 환경에서 모두 발생하여 ALB 에 고정 IP를 연결하여 해결하였습니다

@OhSeungho
Copy link

이슈 해결되어 구성 내용을 남겨두겠습니다. api/collector을 각 Deploment로 구성하고 아래와 같은 설정으로 빌드하였습니다. 클러스터 내부 protocol을 TCP로 설정하였습니다.

# Dockerfile
FROM amazoncorretto:17
COPY scavenger-${KIND}-1.1.1.jar /
ENTRYPOINT ["sh", "-c", "java ${JAVA_OPTS} -jar /scavenger-${KIND}-1.1.1.jar"]

클러스터 core-dns(collector.scavenger.svc.cluster.local)로 문제없이 통신되는 부분 확인하였습니다.

@LenKIM
Copy link

LenKIM commented Oct 27, 2023

docker 를 사용하지 않고, Local 에서 실행했을 때도 같은 이슈가 발생합니다. 해당 Repo의 demo 모듈을 타켓으로 실행했습니다.

2023-10-27 15:04:00.838  INFO 9801 --- [enger scheduler] c.n.s.javaagent.publishing.Publisher     : [scavenger] polling dynamic config
2023-10-27 15:04:00.840  INFO 9801 --- [enger scheduler] c.n.s.javaagent.publishing.Publisher     : [scavenger] trying to resolve collector url by accessing http://localhost:8080/javaagent/v5/initConfig
2023-10-27 15:04:00.968  INFO 9801 --- [enger scheduler] c.n.s.javaagent.publishing.Publisher     : [scavenger] resolved collector url is localhost:8080
2023-10-27 15:04:00.968  INFO 9801 --- [enger scheduler] c.n.s.javaagent.publishing.GrpcClient    : [scavenger] creating new grpc client. host is localhost:8080
2023-10-27 15:04:01.046 ERROR 9801 --- [enger scheduler] c.n.s.javaagent.scheduling.Scheduler     : [scavenger] poll config failed

sc.io.grpc.StatusRuntimeException: UNAUTHENTICATED
        at sc.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[scavenger-agent-java-1.1.1.jar:1.1.1]
        at sc.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[scavenger-agent-java-1.1.1.jar:1.1.1]
        at sc.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[scavenger-agent-java-1.1.1.jar:1.1.1]
        at sc.com.navercorp.scavenger.model.GrpcAgentServiceGrpc$GrpcAgentServiceBlockingStub.pollConfig(GrpcAgentServiceGrpc.java:263) ~[scavenger-agent-java-1.1.1.jar:1.1.1]

gRPC 통신이 계속 실패하는데, 조치해볼만한 부분이 있을까요?

@namsic6460
Copy link

docker 를 사용하지 않고, Local 에서 실행했을 때도 같은 이슈가 발생합니다. 해당 Repo의 demo 모듈을 타켓으로 실행했습니다.

2023-10-27 15:04:00.838  INFO 9801 --- [enger scheduler] c.n.s.javaagent.publishing.Publisher     : [scavenger] polling dynamic config
2023-10-27 15:04:00.840  INFO 9801 --- [enger scheduler] c.n.s.javaagent.publishing.Publisher     : [scavenger] trying to resolve collector url by accessing http://localhost:8080/javaagent/v5/initConfig
2023-10-27 15:04:00.968  INFO 9801 --- [enger scheduler] c.n.s.javaagent.publishing.Publisher     : [scavenger] resolved collector url is localhost:8080
2023-10-27 15:04:00.968  INFO 9801 --- [enger scheduler] c.n.s.javaagent.publishing.GrpcClient    : [scavenger] creating new grpc client. host is localhost:8080
2023-10-27 15:04:01.046 ERROR 9801 --- [enger scheduler] c.n.s.javaagent.scheduling.Scheduler     : [scavenger] poll config failed

sc.io.grpc.StatusRuntimeException: UNAUTHENTICATED
        at sc.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[scavenger-agent-java-1.1.1.jar:1.1.1]
        at sc.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[scavenger-agent-java-1.1.1.jar:1.1.1]
        at sc.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[scavenger-agent-java-1.1.1.jar:1.1.1]
        at sc.com.navercorp.scavenger.model.GrpcAgentServiceGrpc$GrpcAgentServiceBlockingStub.pollConfig(GrpcAgentServiceGrpc.java:263) ~[scavenger-agent-java-1.1.1.jar:1.1.1]

gRPC 통신이 계속 실패하는데, 조치해볼만한 부분이 있을까요?

위 에러는 토큰이 맞지 않아 인증에 실패했을 때 발생하는 에러로 보입니다. 구성 파일에 잘못된 부분은 없나요?

@taeyeon-Kim
Copy link
Contributor

@LenKIM 안녕하세요 :)
#23 이슈랑도 관련있어보이네요.
@namsic6460 님이 말씀주신것처럼 UNAUTHENTICATED 는 API key가 올바르지 않을때 발생합니다.

@LenKIM
Copy link

LenKIM commented Oct 30, 2023

@taeyeon-Kim @namsic6460 해결했습니다.
h2 포트로 기존에 물려있던 것이 있었고, 그거 때문에 profile 을 h2로 하더라도 문제가 계속해서 발생했습니다.
감사합니다!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants