[Bug] Running Phi-3-vision-128k-instruct on 8 GPUs raises “Expected all tensors to be on the same device, but found at least two devices” #2633

Open
3 tasks done
dreamerlin opened this issue Oct 22, 2024 · 4 comments

dreamerlin commented Oct 22, 2024

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

[image attached in the original issue]

Reproduction

import lmdeploy
from lmdeploy import ChatTemplateConfig, PytorchEngineConfig
backend_config = PytorchEngineConfig(tp=8, session_len=session_len)
pipe = lmdeploy.pipeline(args.checkpoint, backend_config=backend_config, chat_template_config=ChatTemplateConfig(model_name='phi-3'))

Environment

sys.platform: linux
Python: 3.9.19 (main, May  6 2024, 19:43:03) [GCC 11.2.0]
CUDA available: False
MUSA available: False
numpy_random_seed: 2147483648
GCC: gcc (GCC) 9.4.0
PyTorch: 2.0.1
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
TorchVision: 0.15.2
LMDeploy: 0.6.1+2323e69
transformers: 4.45.2
gradio: 4.44.1
fastapi: 0.103.2
pydantic: 2.9.2
triton: 3.0.0

Error traceback

Traceback (most recent call last):
  File "/mnt/petrelfs/wangweiyun/miniconda3/envs/lmdploy/lib/python3.9/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/mnt/petrelfs/wangweiyun/miniconda3/envs/lmdploy/lib/python3.9/site-packages/lmdeploy/vl/engine.py", line 27, in _raise_exception_on_finish
    raise e
  File "/mnt/petrelfs/wangweiyun/miniconda3/envs/lmdploy/lib/python3.9/site-packages/lmdeploy/vl/engine.py", line 23, in _raise_exception_on_finish
    task.result()
  File "/mnt/petrelfs/wangweiyun/miniconda3/envs/lmdploy/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/mnt/petrelfs/wangweiyun/miniconda3/envs/lmdploy/lib/python3.9/site-packages/lmdeploy/vl/engine.py", line 169, in forward
    outputs = self.model.forward(*func_inputs)
  File "/mnt/petrelfs/wangweiyun/miniconda3/envs/lmdploy/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/petrelfs/wangweiyun/miniconda3/envs/lmdploy/lib/python3.9/site-packages/lmdeploy/vl/model/phi3_vision.py", line 193, in forward
    image_features = _process_image_embedding(
  File "/mnt/petrelfs/wangweiyun/miniconda3/envs/lmdploy/lib/python3.9/site-packages/lmdeploy/vl/model/phi3_vision.py", line 64, in _process_image_embedding
    glb_img = torch.cat([glb_img, temp_glb_GN],
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:5 and cuda:6! (when checking argument for argument tensors in method wrapper_CUDA_cat)
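
The failing call in the traceback is `torch.cat([glb_img, temp_glb_GN], ...)`, which requires every input to live on one device. Below is a minimal sketch of a generic workaround, assuming the only problem is that one operand sits on a different GPU: move each operand to the first tensor's device before concatenating. The helper name `cat_on_same_device` is hypothetical and is not part of lmdeploy or PyTorch, and the demonstration runs on CPU so it can be tried without GPUs.

```python
import torch


def cat_on_same_device(tensors, dim=0):
    """Concatenate tensors after moving them to the first tensor's device.

    Hypothetical helper for illustration; not part of lmdeploy or PyTorch.
    """
    target = tensors[0].device
    # .to() returns the tensor unchanged if it is already on the target device.
    return torch.cat([t.to(target) for t in tensors], dim=dim)


# CPU-only demonstration; on a multi-GPU host the inputs could sit on
# different devices such as cuda:5 and cuda:6.
a = torch.zeros(1, 2, 3)
b = torch.ones(1, 2, 1)
out = cat_on_same_device([a, b], dim=2)
print(out.shape, out.device)
```

Copying across devices on every forward pass hides the underlying placement problem rather than fixing it, so this is probably better treated as a diagnostic than as a final fix.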

dreamerlin commented

This was run on 8 GPUs.

dreamerlin commented Oct 22, 2024

By the way, is there a problem with this line of code? https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/vl/model/phi3_vision.py#L61
Shouldn't it be

temp_glb_GN = self.glb_GN.repeat(1, H // 2, 1, 1)
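
To check the shapes behind this suggestion, here is a small sketch. The tensor shapes are assumptions made for illustration (a `glb_GN` separator of shape `(1, 1, 1, D)` and a global image feature map of shape `(1, H // 2, H // 2, D)`), as is the concatenation axis `dim=2`; none of them are read from lmdeploy. Under those assumptions, `repeat(1, H // 2, 1, 1)` yields one separator per row, which `torch.cat` can then append as an extra column.

```python
import torch

H, D = 8, 16  # illustrative sizes, not the model's real dimensions

glb_GN = torch.zeros(1, 1, 1, D)             # assumed separator embedding
glb_img = torch.randn(1, H // 2, H // 2, D)  # assumed global image features

# Tensor.repeat tiles each dimension: (1, 1, 1, D) -> (1, H // 2, 1, D)
temp_glb_GN = glb_GN.repeat(1, H // 2, 1, 1)

# Assumed axis: appending along dim=2 adds one separator column to every row.
glb_with_sep = torch.cat([glb_img, temp_glb_GN], dim=2)
print(temp_glb_GN.shape, glb_with_sep.shape)
```

If the real tensors have different ranks or layouts, the repeat counts and the concatenation axis would need to change accordingly.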

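dreamerlin commented

After modifying the code myself (changing only the device-related code), I ran an 8k test with 2 images on a text needle task, and the output was wrong.
[image]

Are you sure the phi code logic is free of errors?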

RunningLeon commented Oct 23, 2024

@dreamerlin Hi, it seems that the implementation in lmdeploy is based on an old version of the phi3 model; see this commit: https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/commit/866d1691437a49af79d5f3ad4a34c1750e08d163 . We may update it later.
BTW, could you provide sample code with image files to reproduce? Thanks.
