Enable calls to GenAI-Perf for profile subcommand #52
Conversation
Great work David! Only some minor tweaks and clarifications. Just a thought: should we be setting a default value for the required arguments? CC: @rmccorm4 for thoughts.
I think we're 100% going for a passthrough approach here. This makes the Triton CLI extensible and keeps all logic unique to the tools in their own repositories. If we start moving logic over, then responsibilities begin to overlap and we could end up duplicating code. If there is a required arg for profile that we think should have a default, Model Analyzer is the right place to fix that.
This work is currently on hold. I will comment on the ticket once that status changes.
I don't think it's a strong requirement at this time; it's probably more of a "nice to have". I added it to the CLI because it was easy when starting from scratch, and I believe PyTriton supports 3.8+. I don't mind removing the support for now if it's not a simple fix or is just unwanted.
If you're okay with it, I pushed a commit dropping it. I understand the desire for broader support, though, so if we wanted to support 3.8-3.9, it would just require updating GenAI-Perf to remove the 3.10+ features and then moving testing to 3.8. I'll start a conversation about it in the morning.
Looking really good! Only minor comments
Co-authored-by: Ryan McCormick <[email protected]>
As per the offline discussion, I have removed the 3.8 test for now. @rmccorm4 @fpetrini15 This is ready for another round of review.
Co-authored-by: Ryan McCormick <[email protected]>
Co-authored-by: Ryan McCormick <[email protected]>
LGTM other than this fix for test_non_llm: https://github.com/triton-inference-server/triton_cli/pull/52/files#r1596116006
Nice work David! 🚀
🥳
This pull request makes it so that Triton CLI calls GenAI-Perf for its profile subcommand. All arguments get passed through to GenAI-Perf.
As part of these changes, the previous profiler functionality has been removed from Triton CLI to avoid maintaining this behavior in both places.
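The passthrough approach described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual Triton CLI implementation: it assumes GenAI-Perf is installed as a `genai-perf` executable, and uses `argparse.parse_known_args` so that any option the CLI itself does not recognize is forwarded untouched.

```python
import argparse


def build_profile_command(argv):
    """Split the CLI's own options from the args forwarded to GenAI-Perf.

    Any argument not recognized by the `profile` subparser is collected
    by parse_known_args and passed straight through to GenAI-Perf.
    """
    parser = argparse.ArgumentParser(prog="triton")
    subparsers = parser.add_subparsers(dest="command")
    # No profile-specific options: everything after `profile` passes through.
    subparsers.add_parser("profile")
    _args, passthrough = parser.parse_known_args(argv)
    return ["genai-perf"] + passthrough


# Example: the unrecognized flags are forwarded verbatim.
cmd = build_profile_command(["profile", "-m", "gpt2", "--streaming"])
# cmd == ["genai-perf", "-m", "gpt2", "--streaming"]
```

In a real CLI, the returned command list would then be executed with something like `subprocess.run(cmd)`. The advantage of this design is the one noted in the review discussion: new GenAI-Perf options work immediately without any change to the wrapper.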
Unit tests have successfully passed with these changes in place in an environment with PA and GenAI-Perf.
GenAI-Perf demo below. Apologies for the extra errors beforehand from some arg types. Note: this was a previous iteration; --task-llm is no longer used. The output token counts are off because the mock Python model doesn't actually use the max_token inputs; it just returns the input as output.

https://github.com/triton-inference-server/triton_cli/assets/58150256/97056e68-afb8-49e9-9e07-8f6601952c3a