Provide ARM release to allow support for FIL backend on Jetson and other ARM platforms #362

Open
blthayer opened this issue Jul 12, 2023 · 20 comments


@blthayer

blthayer commented Jul 12, 2023

Hello,

The FIL backend installation instructions indicate that:

The FIL backend is a part of Triton and can be installed via the methods described in the main Triton documentation.

However, when I download the latest Jetson release (tritonserver2.35.0-jetpack5.1.2.tgz) for Triton, it does not appear to have the FIL backend:

$ tar --list tritonserver/backends -f tritonserver2.35.0-jetpack5.1.2.tgz
...
tritonserver/backends/identity/
...
tritonserver/backends/tensorflow/
...
tritonserver/backends/onnxruntime/
...
tritonserver/backends/python/
...
tritonserver/backends/pytorch/
...
tritonserver/backends/tensorrt/
...

(I also checked the tritonserver2.26.0-jetpack5.0.2.tgz release, and had to use --list ./backends instead of --list tritonserver/backends and reached the same conclusion: FIL is not included.)
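The same check can be repeated on any release archive with a small helper. This is only a sketch: the function name `has_backend` is hypothetical, and it relies on nothing beyond `tar -tzf` and `grep`. The pattern is written to match both the `tritonserver/backends/...` and the `./backends/...` layouts mentioned above.

```shell
#!/usr/bin/env bash
# has_backend ARCHIVE NAME
# Succeeds if the gzip-compressed tar ARCHIVE lists a backends/NAME entry.
# (Hypothetical helper, not part of Triton.)
has_backend() {
    local archive="$1" name="$2"
    tar -tzf "$archive" | grep -Eq "(^|/)backends/${name}(/|\$)"
}

# Example, using the archive name from the release discussed above:
# has_backend tritonserver2.35.0-jetpack5.1.2.tgz fil || echo "FIL backend not found"
```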

Does a pre-built FIL backend for Jetson exist? Note that I do not wish to use NGC Docker containers as those come with the kitchen sink of built-in tools which is not appropriate for my IoT application.

PS - I'm not sure if this issue belongs in this repository, or the server repo so I'm cross-posting this issue and will close one once I identify the appropriate location for the issue/ticket.

@wphicks
Collaborator

wphicks commented Jul 17, 2023

The FIL backend has not yet been released for ARM due to a very rare and hard-to-reproduce bug that occasionally causes incorrect results to be returned. Even though the bug is extremely rare, we do not want to ship anything that could silently return incorrect output under any circumstances, so we have held off on release so far.

That being said, we do very much want to include the FIL backend with an upcoming ARM release, and we will likely prioritize this very soon.

@blthayer
Author

Thanks for your response, @wphicks! I look forward to the ARM release. Are there any issues I can track/subscribe to related to this topic? Or will this ticket serve as the "ARM release placeholder/proxy" ticket?

@wphicks
Collaborator

wphicks commented Jul 17, 2023

Or will this ticket serve as the "ARM release placeholder/proxy" ticket?

Let's make it that! If it's all right, I'm going to go ahead and adjust the title accordingly. Others, please do upvote if having the FIL backend on ARM is important to you. This is already pretty high priority, but additional interest always helps.

@wphicks changed the title from "Forest Inference Library (FIL) Backend on Jetson?" to "Provide ARM release to allow support for FIL backend on Jetson and other ARM platforms" on Jul 17, 2023
@wphicks
Collaborator

wphicks commented Jul 21, 2023

Status update: We have put this issue at high priority and will likely begin work on it middle of next week. We can't commit to any specific timeline for resolving it, since the bug is so difficult to reproduce, but we intend to get it figured out as soon as possible. Thanks very much to everyone who let us know that this is an important issue for you!

@blthayer
Author

Thank you, @wphicks! Looking forward to updates as this progresses 😄

@wphicks
Collaborator

wphicks commented Aug 16, 2023

I wanted to quickly update this thread, since we have not done so in a while. Despite the radio silence, we have been actively working on this; it just takes some time to investigate such a rare and flaky bug. I still cannot promise any specific timeline, but this is our top priority issue for the FIL backend at the moment. I'll continue to update this thread with progress or lack thereof as we go.

@blthayer
Author

Thanks for the update, @wphicks!

@blthayer
Author

@wphicks - do you have any updates on this issue at this time? Thanks!

@wphicks
Collaborator

wphicks commented Oct 10, 2023

@blthayer Indeed I do! We're wrapping up the final round of testing on ARM. With millions of runs under our belt and no sign of the previous issue, we're feeling pretty good about moving forward. I'll be speaking to one of the other members of the Triton team at the end of this week, and assuming there is consensus, we will try to get the FIL backend into the next ARM release.

I know your interest is specifically in Jetson, and I cannot promise that it will specifically be included in the very next Jetson release, but the major blocker to that will be gone. I'll update this thread with more specific Jetson timeline and information as soon as I've got that info.

@blthayer
Author

@wphicks - thanks for the update! Looking forward to trying out the Jetson release when it's available. Thanks!

@guptap11

@wphicks just to confirm, will this be fixed in the 23.10 Triton server release?

@wphicks
Collaborator

wphicks commented Oct 26, 2023

Unfortunately, there's still some discussion ongoing internally about exactly when it will be released. I'll update here as soon as I have a firm timeline. Apologies for the delay!

@wphicks
Collaborator

wphicks commented Oct 27, 2023

I've received the go-ahead to move forward with the ARM release. I'm going to work hard to get it onto at least some ARM platforms for 23.11, but I cannot guarantee that Jetson will be one of them. Code freeze for the 23.11 release of the FIL backend is 10/31, so I'll do as much as I can before then.

@wphicks
Collaborator

wphicks commented Nov 1, 2023

Update: I was able to get all of the necessary changes into the FIL backend itself, and I've prepped the build workflow changes necessary to add it to our ARM releases. Those changes still need to finish testing and approval from the broader Triton team, but we are on track for 23.11 so far. Jetson remains the least likely platform to have support in 23.11, but it's not out of the question.

@wphicks
Collaborator

wphicks commented Nov 4, 2023

My apologies folks; this came in just too close to the release window, so it looks like it is likely to land in 23.12. In the meantime, you can certainly build for ARM on your own. Everything is in place to support ARM platforms; we just need to actually get everything in order for the release.

@guptap11

guptap11 commented Jan 8, 2024

@wphicks just wanted to confirm: has this been released in the 23.12 version of TIS?

@wphicks
Collaborator

wphicks commented Jan 16, 2024

@guptap11 My apologies; this fell through the cracks while I was on vacation. I just checked and it does not look like the FIL backend made it into the 23.12 Jetson release. I'm following up internally to ensure that it gets out into a stable release as soon as possible, but in the meantime I'm very sorry that this has been so long delayed (for some necessary and some more avoidable reasons).

I've provisioned a Jetson board for myself and am actively working on creating a build that we can at least make available as a preview. I'll update here as soon as possible to get folks off the side of the road.

The good news is that the ARM issue has been fully investigated and the build validated, so there are no fundamental technical blockers, only procedural ones. You will have an update from me by the end of the week (1/19) on exactly where we stand.

@wphicks
Collaborator

wphicks commented Jan 20, 2024

Update: I've set up an environment to manually test and validate the build on Jetson, but I had some issues getting access to the correct board model so I have not actually been able to validate the build. I will update this thread again on 1/24 if not sooner.

@hcho3
Collaborator

hcho3 commented Feb 13, 2024

Update: I built and tested the FIL backend using my Jetson Orin Nano board.

I submitted a request to the Triton team to include the FIL backend in the upcoming release of Triton. In the meantime, you can build the FIL backend from source as follows:

  1. Check out the r24.02 branch.
  2. Apply the following patch to ops/Dockerfile:
diff --git a/ops/Dockerfile b/ops/Dockerfile
index 079b869..18c5cb8 100644
--- a/ops/Dockerfile
+++ b/ops/Dockerfile
@@ -69,17 +69,6 @@ FROM ${BASE_IMAGE} as base

 ENV PATH="/root/miniconda3/bin:${PATH}"

-# In CI, CPU base image may not have curl, but it also does not need to update
-# the cuda keys
-RUN  if command -v curl; \
- then [ $(uname -m) = 'x86_64' ] \
- && curl -o /tmp/cuda-keyring.deb \
-      https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb \
- || curl -o /tmp/cuda-keyring.deb \
-      https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/cuda-keyring_1.0-1_all.deb; \
- dpkg -i /tmp/cuda-keyring.deb \
- && rm /tmp/cuda-keyring.deb; fi
-
 RUN apt-get update \
     && apt-get install --no-install-recommends -y \
       build-essential \
@@ -149,6 +138,7 @@ RUN source /conda/dev/bin/activate \
  && cmake \
       --log-level=VERBOSE \
       -GNinja \
+      -DCMAKE_CUDA_ARCHITECTURES=87 \
       -DCMAKE_BUILD_TYPE="${BUILD_TYPE}" \
       -DBUILD_TESTS="${BUILD_TESTS}" \
       -DTRITON_CORE_REPO_TAG="${TRITON_CORE_REPO_TAG}" \
  3. Build the Docker container with Triton server and FIL backend: ./build.sh server.
  4. Launch the Triton server (assuming the models are located in ./model_repo):
docker run -p 8000:8000 -p 8001:8001 --runtime=nvidia --gpus all \
   -v $PWD/model_repo:/models triton_fil \
   tritonserver --model-repository=/models

The Triton server will now serve models at port 8000 and 8001.
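The patch above hard-codes -DCMAKE_CUDA_ARCHITECTURES=87, which corresponds to the Orin family (compute capability 8.7) that this build was tested on. For other Jetson modules the value would presumably need to change. The sketch below maps a module family to the architecture value; the function name `cuda_arch_for` is hypothetical, and only the Orin value has been exercised in this thread, so treat the Xavier entry as an untested assumption for this build.

```shell
#!/usr/bin/env bash
# cuda_arch_for FAMILY
# Prints the CMAKE_CUDA_ARCHITECTURES value for a Jetson module family.
# (Hypothetical helper; only "orin" matches the build described above.)
cuda_arch_for() {
    case "$1" in
        orin)   echo 87 ;;  # AGX Orin, Orin NX, Orin Nano: compute capability 8.7
        xavier) echo 72 ;;  # AGX Xavier, Xavier NX: compute capability 7.2
        *)      echo "unknown Jetson family: $1" >&2; return 1 ;;
    esac
}

# Print the flag to substitute into the cmake call in ops/Dockerfile.
echo "-DCMAKE_CUDA_ARCHITECTURES=$(cuda_arch_for orin)"
```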

@hcho3
Collaborator

hcho3 commented Feb 14, 2024

If you prefer not to build FIL from the source, you can download the (experimental) libs from the following link and place them under /opt/tritonserver/backends/fil.
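As a rough sketch of that last step, the helper below copies a set of downloaded shared libraries into the backend directory and checks that the backend library itself is present. The function name `install_fil_libs` and the assumption that the download has already been unpacked into a local directory of `.so` files are mine, not something stated in this thread. The file name `libtriton_fil.so` follows Triton's `libtriton_<backend>.so` naming convention for backend libraries.

```shell
#!/usr/bin/env bash
# install_fil_libs SRC_DIR [BACKEND_DIR]
# Copies the shared libraries found in SRC_DIR into the FIL backend directory.
# (Hypothetical helper; assumes SRC_DIR holds the already-extracted libs.)
install_fil_libs() {
    local src="$1" dest="${2:-/opt/tritonserver/backends/fil}"
    mkdir -p "$dest" || return 1
    cp "$src"/*.so* "$dest"/ || return 1
    # Triton loads a backend named "fil" from libtriton_fil.so.
    [ -f "$dest/libtriton_fil.so" ] || {
        echo "libtriton_fil.so missing from $dest" >&2
        return 1
    }
}

# Example: install_fil_libs ./fil_libs
```

Writing to /opt/tritonserver usually requires root, so the call may need to be run with elevated privileges.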
