Trying to fine-tune the 7B model using the sample from the documentation, but I'm getting this error. Has anyone seen it before?
The log is here:
bin /home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
CUDA SETUP: CUDA runtime path found: /home/ec2-user/anaconda3/envs/pytorch_p310/lib/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
Downloading (…)model.bin.index.json: 16.9kB [00:00, 57.6MB/s]
Downloading (…)l-00001-of-00002.bin: 100%|██| 9.95G/9.95G [01:00<00:00, 166MB/s]
Downloading (…)l-00002-of-00002.bin: 100%|██| 4.48G/4.48G [00:36<00:00, 123MB/s]
Downloading (…)okenizer_config.json: 100%|█████| 220/220 [00:00<00:00, 1.93MB/s]
Downloading (…)/main/tokenizer.json: 2.73MB [00:00, 17.2MB/s]
Downloading (…)cial_tokens_map.json: 100%|█████| 281/281 [00:00<00:00, 2.00MB/s]
Parameters:
-------config-------
dataset='sql-create-context-train.json'
data_type='alpaca'
lora_out_dir='./falcon-7b-alpaca/'
lora_apply_dir=None
weights='tiiuae/falcon-7b'
target_modules=['query_key_value']
------training------
mbatch_size=1
batch_size=2
gradient_accumulation_steps=2
epochs=3
lr=0.0003
cutoff_len=256
lora_r=8
lora_alpha=16
lora_dropout=0.05
val_set_size=0.2
gradient_checkpointing=False
gradient_checkpointing_ratio=1
warmup_steps=5
save_steps=50
save_total_limit=3
logging_steps=5
checkpoint=False
skip=False
world_size=1
ddp=False
device_map='auto'
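(For reference, the LoRA-related values above roughly correspond to the following peft LoraConfig. This is only an illustrative sketch of the equivalent public peft API; falcontune builds its adapter config internally and the wiring may differ.)

```python
# Rough equivalent of the LoRA settings above, expressed with the public peft API
# (illustrative sketch only; not how falcontune constructs it internally).
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    r=8,                                  # lora_r
    lora_alpha=16,                        # lora_alpha
    lora_dropout=0.05,                    # lora_dropout
    target_modules=["query_key_value"],   # the fused QKV projection targeted above
    bias="none",
    task_type=TaskType.CAUSAL_LM,
)
```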
trainable params: 2359296 || all params: 6924080000 || trainable%: 0.03407378308742822
Downloading and preparing dataset json/default to /home/ec2-user/.cache/huggingface/datasets/json/default-ad054cd2ceaeb665/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4...
Downloading data files: 100%|██████████████████| 1/1 [00:00<00:00, 10230.01it/s]
Extracting data files: 100%|█████████████████████| 1/1 [00:00<00:00, 221.66it/s]
Dataset json downloaded and prepared to /home/ec2-user/.cache/huggingface/datasets/json/default-ad054cd2ceaeb665/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4. Subsequent calls will reuse this data.
100%|████████████████████████████████████████████| 1/1 [00:00<00:00, 175.08it/s]
Run eval every 7857 steps
PyTorch: setting up devices
The default value for the training argument --report_to will change in v5 (from all installed integrations to none). In v5, you will need to use --report_to all to get the same behavior as now. You should start updating your code and make this info disappear :-).
Using the WANDB_DISABLED environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using cuda_amp half precision backend
wandb: Tracking run with wandb version 0.15.4
wandb: W&B syncing is set to offline in this directory.
wandb: Run wandb online or set WANDB_MODE=online to enable cloud syncing.
The following columns in the training set don't have a corresponding argument in PeftModelForCausalLM.forward and have been ignored: instruction, output, token_type_ids, input. If instruction, output, token_type_ids, input are not expected by PeftModelForCausalLM.forward, you can safely ignore this message.
***** Running training *****
Num examples = 62861
Num Epochs = 3
Instantaneous batch size per device = 1
Total train batch size (w. parallel, distributed & accumulation) = 2
Gradient Accumulation steps = 2
Total optimization steps = 94290
0%| | 0/94290 [00:00<?, ?it/s]You're using a PreTrainedTokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
wandb: Waiting for W&B process to finish... (failed 1).
wandb: You can sync this run to the cloud by running:
wandb: wandb sync /home/ec2-user/SageMaker/wandb/offline-run-20230621_200524-jdrmalza
wandb: Find logs at: ./wandb/offline-run-20230621_200524-jdrmalza/logs
Traceback (most recent call last):
File "/home/ec2-user/anaconda3/envs/pytorch_p310/bin/falcontune", line 33, in
sys.exit(load_entry_point('falcontune==0.1.0', 'console_scripts', 'falcontune')())
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/run.py", line 88, in main
args.func(args)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/finetune.py", line 162, in finetune
trainer.train()
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 1521, in train
return inner_training_loop(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 1763, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 2499, in training_step
loss = self.compute_loss(model, inputs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 2531, in compute_loss
outputs = model(**inputs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/peft/peft_model.py", line 678, in forward
return self.base_model(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 1070, in forward
transformer_outputs = self.transformer(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 965, in forward
outputs = block(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 698, in forward
attn_outputs = self.self_attention(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 291, in forward
fused_qkv = self.query_key_value(hidden_states) # [batch_size, seq_length, 3 x hidden_size]
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/peft/tuners/lora.py", line 565, in forward
result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
RuntimeError: expected scalar type Half but found Char
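From what I can tell, the error means F.linear received half-precision (float16) activations while the weight tensor it was handed is int8 (torch's "Char" dtype), i.e. a quantized base weight is reaching a plain linear path instead of a dequantizing one. Below is a minimal sketch that reproduces the same kind of dtype mismatch; the tensors and shapes are purely illustrative, not falcontune's.

```python
# Minimal reproduction of this kind of dtype mismatch (illustrative only).
import torch
import torch.nn.functional as F

x = torch.randn(2, 8, dtype=torch.float16)               # fp16 activations
w = torch.randint(-128, 127, (8, 8), dtype=torch.int8)   # int8 ("Char") weight

try:
    F.linear(x, w)  # a plain linear cannot mix fp16 inputs with an int8 weight
except RuntimeError as e:
    print(e)  # dtype-mismatch error; exact wording varies by PyTorch version
```

If that is what is happening here, the usual remedies in peft-based setups are to keep the quantized layers wrapped in bitsandbytes' Linear8bitLt (so the dequantizing matmul is used) and to pass the model through peft's prepare_model_for_int8_training before attaching the LoRA adapters; I'm not sure whether falcontune exposes that path, so this is only a guess at the cause.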