
[Bug] AWQ quantization of InternVL2 20B produces meaningless garbled output #2650

Open
diandianliu opened this issue Oct 24, 2024 · 7 comments

@diandianliu
diandianliu commented Oct 24, 2024

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

After fine-tuning, the model (InternVL2-20B) quantized with AWQ outputs meaningless garbled text.

Reproduction

Quantization (the development environment has no internet access, so ptb_text_only was downloaded beforehand with a script)
Download script:
from datasets import load_dataset
traindata = load_dataset('ptb_text_only', 'penn_treebank', split='train')
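Since the quantization machine is offline, one common pattern (my suggestion, not something stated in the thread; paths are illustrative) is to populate the Hugging Face cache on a connected machine, copy it over, and then force offline mode so `load_dataset` resolves purely from the local cache:

```shell
# On a machine with internet access: download once so the cache is populated
python -c "from datasets import load_dataset; load_dataset('ptb_text_only', 'penn_treebank', split='train')"

# Copy the cache to the offline box (host and paths are illustrative)
rsync -a ~/.cache/huggingface/ offline-host:/root/mydataset/

# On the offline box: point the libraries at the copied cache and forbid network lookups
export HF_HOME=/root/mydataset
export HF_DATASETS_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
```

With `HF_DATASETS_OFFLINE=1` set, `load_dataset('ptb_text_only', ...)` will fail fast with a clear error if the cache copy is incomplete, instead of hanging on a network call.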

lmdeploy lite auto_awq \
  /root/models/internvl2-26B \
  --calib-dataset 'ptb' \
  --calib-samples 128 \
  --calib-seqlen 2048 \
  --w-bits 4 \
  --w-group-size 128 \
  --work-dir /root/models/internvl2-26B_awq_4bit

Run
lmdeploy serve api_server \
  /root/models/internvl2-26B_awq_4bit \
  --server-name 0.0.0.0 \
  --server-port 23333 \
  --tp 2

Environment

Python 3.9.19
NVIDIA V100
PyTorch 2.2.2+cu121
TorchVision 0.17.2+cu121
LMDeploy 0.5.1+unknown
transformers 4.43.3

-- the machine is on an isolated network, so I cannot paste the full environment output

Error traceback

A warning appeared during quantization; I am not sure whether it matters:
Using the latest cached version of the module from /root/mydataset/modules/datasets_modules/datasets/ptb_text_only/8d1b97746fb9765d140e569ec5ddd35e20af4d37761f5e1bf357ea0b081f2c1f (last modified on Sat Feb 10 16:50:50 2024) since it couldn't be found locally at ptb_text_only
Token indices sequence length is longer than the specified maximum sequence length for this model (1085165> 4096). Running this sequence through the model will result in indexing errors
@sjzhou4

sjzhou4 commented Oct 24, 2024

My understanding is that quantizing a VL model differs from AWQ-quantizing a plain LM. AWQ calibration runs a dataset through the model, but a VL model's input is image features plus the query (including embeddings), so calibrating with text queries alone should give quite poor results.

@diandianliu
Author

> My understanding is that quantizing a VL model differs from AWQ-quantizing a plain LM. AWQ calibration runs a dataset through the model, but a VL model's input is image features plus the query (including embeddings), so calibrating with text queries alone should give quite poor results.

The answers are not even sentences. The tutorial I followed only uses lmdeploy lite auto_awq for quantization; I don't know which step went wrong.


@AllentDan
Collaborator

I don't follow: the issue says it is an internvl model, but the commands all referenced internlm. Two variables worth isolating:

  1. The model was fine-tuned
  2. The dataset was downloaded manually

@diandianliu
Author

> I don't follow: the issue says it is an internvl model, but the commands all referenced internlm. Two variables worth isolating:
>   1. The model was fine-tuned
>   2. The dataset was downloaded manually

Sorry, the paths were written wrong; they are corrected now. I am about to try the original (un-finetuned) model. Is the procedure itself correct, i.e. only running lmdeploy lite auto_awq with no other steps?

@diandianliu
Author

> I don't follow: the issue says it is an internvl model, but the commands all referenced internlm. Two variables worth isolating:
>   1. The model was fine-tuned
>   2. The dataset was downloaded manually

Does this message during quantization have any impact?
Token indices sequence length is longer than the specified maximum sequence length for this model (1085165> 4096). Running this sequence through the model will result in indexing errors

@AllentDan
Collaborator

Could the tokenizer of your fine-tuned model be the problem? 1085165 > 4096 is a very large gap. Other models also print this warning, yet they quantize without issues.
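For context, this warning is usually benign during calibration: the calibration corpus is tokenized as one long stream, which trips the tokenizer's model-max-length check, and the calibrator then slices that stream into fixed windows of --calib-seqlen tokens, so no single sample ever exceeds the context limit. A minimal sketch of that slicing idea (my illustration of the general pattern, not lmdeploy's actual code):

```python
# Illustrative sketch: a calibration corpus that tokenizes to far more tokens
# than the model's max length is still usable, because it is cut into fixed
# windows before being fed through the model.
def make_calib_samples(token_ids, calib_samples=128, calib_seqlen=2048):
    """Slice one long token stream into up to `calib_samples` windows of `calib_seqlen`."""
    windows = []
    for i in range(calib_samples):
        start = i * calib_seqlen
        chunk = token_ids[start:start + calib_seqlen]
        if len(chunk) < calib_seqlen:  # not enough tokens left for a full window
            break
        windows.append(chunk)
    return windows

# A fake "corpus" far longer than a 4096-token context window:
stream = list(range(1_000_000))
samples = make_calib_samples(stream)
print(len(samples), len(samples[0]))  # 128 2048
```

So a huge total token count by itself does not explain garbled output; a mismatch between the fine-tuned model's tokenizer and its weights would.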


3 participants