BackendMemory::Create must release all errors #95

Merged 1 commit on Feb 7, 2024
Conversation

@lhrios (Contributor) commented Jan 15, 2024

This PR changes BackendMemory::Create so that it releases all errors, including when an allocation request eventually succeeds. BackendMemory::Create lets callers try several allocation types in order; however, it did not release the error objects associated with attempts that failed before the one that succeeded, so those errors leaked.

Valgrind reports the following leak while debugging the Triton Server:

==76== 182,619 (70,920 direct, 111,699 indirect) bytes in 1,773 blocks are definitely lost in loss record 1,321 of 1,321
==76==    at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==76==    by 0x5319270: TRITONSERVER_ErrorNew (in /opt/tritonserver/lib/libtritonserver.so)
==76==    by 0x5184A53: TRITONBACKEND_MemoryManagerAllocate (in /opt/tritonserver/lib/libtritonserver.so)
==76==    by 0x15F16833: triton::backend::BackendMemory::Create(TRITONBACKEND_MemoryManager*, triton::backend::BackendMemory::AllocationType, long, unsigned long, triton::backend::BackendMemory**) (in /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so)
==76==    by 0x15F17B5A: triton::backend::BackendMemory::Create(TRITONBACKEND_MemoryManager*, std::vector<triton::backend::BackendMemory::AllocationType, std::allocator<triton::backend::BackendMemory::AllocationType> > const&, long, unsigned long, triton::backend::BackendMemory**) (in /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so)
==76==    by 0x15EC4945: triton::backend::onnxruntime::ModelInstanceState::SetStringInputTensor(TRITONBACKEND_Request**, unsigned int, std::vector<TRITONBACKEND_Response*, std::allocator<TRITONBACKEND_Response*> >*, char const*, std::vector<char const*, std::allocator<char const*> >*, bool*) (in /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so)
==76==    by 0x15EC748A: triton::backend::onnxruntime::ModelInstanceState::SetInputTensors(unsigned long, TRITONBACKEND_Request**, unsigned int, std::vector<TRITONBACKEND_Response*, std::allocator<TRITONBACKEND_Response*> >*, triton::backend::BackendInputCollector*, std::vector<char const*, std::allocator<char const*> >*, bool*) (in /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so)
==76==    by 0x15ECDF6E: triton::backend::onnxruntime::ModelInstanceState::ProcessRequests(TRITONBACKEND_Request**, unsigned int) (in /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so)
==76==    by 0x15ED1B83: TRITONBACKEND_ModelInstanceExecute (in /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so)
==76==    by 0x51A3F13: triton::core::TritonModelInstance::Execute(std::vector<TRITONBACKEND_Request*, std::allocator<TRITONBACKEND_Request*> >&) (in /opt/tritonserver/lib/libtritonserver.so)
==76==    by 0x51A427A: triton::core::TritonModelInstance::Schedule(std::vector<std::unique_ptr<triton::core::InferenceRequest, std::default_delete<triton::core::InferenceRequest> >, std::allocator<std::unique_ptr<triton::core::InferenceRequest, std::default_delete<triton::core::InferenceRequest> > > >&&) (in /opt/tritonserver/lib/libtritonserver.so)
==76==    by 0x52B505C: triton::core::Payload::Execute(bool*) (in /opt/tritonserver/lib/libtritonserver.so)
==76==    by 0x51A86A3: triton::core::TritonModelInstance::TritonBackendThread::BackendThread() (in /opt/tritonserver/lib/libtritonserver.so)
==76==    by 0x6940252: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==76==    by 0x6B24AC2: start_thread (pthread_create.c:442)
==76==    by 0x6BB5A03: clone (clone.S:100)
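For illustration, here is a minimal sketch of the release pattern this change applies. It is not the actual patch: the helper name TryAllocationTypes and its signature are hypothetical, while TRITONBACKEND_MemoryManagerAllocate and TRITONSERVER_ErrorDelete are the real Triton APIs involved in the leak shown above.

```cpp
// Hypothetical sketch: try several memory types in order, releasing the
// TRITONSERVER_Error from each failed attempt so it does not leak.
#include <cstdint>
#include <vector>

#include "triton/core/tritonbackend.h"

TRITONSERVER_Error*
TryAllocationTypes(
    TRITONBACKEND_MemoryManager* manager,
    const std::vector<TRITONSERVER_MemoryType>& types,
    const int64_t memory_type_id, const uint64_t byte_size, void** buffer)
{
  TRITONSERVER_Error* err = nullptr;
  for (const auto memory_type : types) {
    // Release the error left over from the previous failed attempt before
    // retrying; without this the error object leaks (the Valgrind record
    // above shows such a leak originating in TRITONSERVER_ErrorNew).
    if (err != nullptr) {
      TRITONSERVER_ErrorDelete(err);
      err = nullptr;
    }
    err = TRITONBACKEND_MemoryManagerAllocate(
        manager, buffer, memory_type, memory_type_id, byte_size);
    if (err == nullptr) {
      return nullptr;  // success; no dangling error objects remain
    }
  }
  // Every attempt failed: return the last error; the caller owns it and
  // must release it.
  return err;
}
```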

@Tabrizian (Member) commented:

Thanks for your contribution @lhrios! Could you please sign the CLA as instructed here: https://github.com/triton-inference-server/server/blob/main/CONTRIBUTING.md

@casassg commented Jan 17, 2024

@Tabrizian we signed the CLA from Block (including Luis here as a contributor) for this PR: triton-inference-server/onnxruntime_backend#195. Happy to re-send it over.

@Tabrizian (Member) commented:

Thanks @casassg. I think in that case we just need to run it through the CI and merge it.

@lhrios (Contributor, Author) commented Jan 18, 2024

> run it through the CI

Hey @Tabrizian. By "running it through the CI", do you mean following these steps, or has that been automated?

@Tabrizian (Member) commented:

At this point there are no actions required on your end. I'll run it through the CI to make sure there are no issues with that.

@lhrios (Contributor, Author) commented Jan 29, 2024

> At this point there are no actions required on your end. I'll run it through the CI to make sure there are no issues with that.

Hey @Tabrizian. Would you know if this will be released as part of the next Triton version (the one after 23.12)?

@Tabrizian merged commit a06e9a1 into triton-inference-server:main on Feb 7, 2024 (1 check passed).
@Tabrizian (Member) commented:

@lhrios This should be part of the 24.02 release.
