Trying to fine-tune the 7B model using the sample from the documentation, but I'm getting this error. Has anyone seen it before?
The log is here:
bin /home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
CUDA SETUP: CUDA runtime path found: /home/ec2-user/anaconda3/envs/pytorch_p310/lib/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
Downloading (…)model.bin.index.json: 16.9kB [00:00, 57.6MB/s]
Downloading (…)l-00001-of-00002.bin: 100%|██| 9.95G/9.95G [01:00<00:00, 166MB/s]
Downloading (…)l-00002-of-00002.bin: 100%|██| 4.48G/4.48G [00:36<00:00, 123MB/s]
Downloading (…)okenizer_config.json: 100%|█████| 220/220 [00:00<00:00, 1.93MB/s]
Downloading (…)/main/tokenizer.json: 2.73MB [00:00, 17.2MB/s]
Downloading (…)cial_tokens_map.json: 100%|█████| 281/281 [00:00<00:00, 2.00MB/s]
Parameters:
-------config-------
dataset='sql-create-context-train.json'
data_type='alpaca'
lora_out_dir='./falcon-7b-alpaca/'
lora_apply_dir=None
weights='tiiuae/falcon-7b'
target_modules=['query_key_value']
------training------
mbatch_size=1
batch_size=2
gradient_accumulation_steps=2
epochs=3
lr=0.0003
cutoff_len=256
lora_r=8
lora_alpha=16
lora_dropout=0.05
val_set_size=0.2
gradient_checkpointing=False
gradient_checkpointing_ratio=1
warmup_steps=5
save_steps=50
save_total_limit=3
logging_steps=5
checkpoint=False
skip=False
world_size=1
ddp=False
device_map='auto'
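(For reference, the LoRA-related values above roughly correspond to the following peft LoraConfig. This is only an illustrative sketch of the equivalent public peft API; falcontune builds its adapter config internally and the wiring may differ.)

```python
# Rough equivalent of the LoRA settings above, expressed with the public peft API
# (illustrative sketch only; not how falcontune constructs it internally).
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    r=8,                                  # lora_r
    lora_alpha=16,                        # lora_alpha
    lora_dropout=0.05,                    # lora_dropout
    target_modules=["query_key_value"],   # the fused QKV projection targeted above
    bias="none",
    task_type=TaskType.CAUSAL_LM,
)
```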
trainable params: 2359296 || all params: 6924080000 || trainable%: 0.03407378308742822
Downloading and preparing dataset json/default to /home/ec2-user/.cache/huggingface/datasets/json/default-ad054cd2ceaeb665/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4...
Downloading data files: 100%|██████████████████| 1/1 [00:00<00:00, 10230.01it/s]
Extracting data files: 100%|█████████████████████| 1/1 [00:00<00:00, 221.66it/s]
Dataset json downloaded and prepared to /home/ec2-user/.cache/huggingface/datasets/json/default-ad054cd2ceaeb665/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4. Subsequent calls will reuse this data.
100%|████████████████████████████████████████████| 1/1 [00:00<00:00, 175.08it/s]
Run eval every 7857 steps
PyTorch: setting up devices
The default value for the training argument --report_to will change in v5 (from all installed integrations to none). In v5, you will need to use --report_to all to get the same behavior as now. You should start updating your code and make this info disappear :-).
Using the WANDB_DISABLED environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using cuda_amp half precision backend
wandb: Tracking run with wandb version 0.15.4
wandb: W&B syncing is set to offline in this directory.
wandb: Run wandb online or set WANDB_MODE=online to enable cloud syncing.
The following columns in the training set don't have a corresponding argument in PeftModelForCausalLM.forward and have been ignored: instruction, output, token_type_ids, input. If instruction, output, token_type_ids, input are not expected by PeftModelForCausalLM.forward, you can safely ignore this message.
***** Running training *****
Num examples = 62861
Num Epochs = 3
Instantaneous batch size per device = 1
Total train batch size (w. parallel, distributed & accumulation) = 2
Gradient Accumulation steps = 2
Total optimization steps = 94290
0%| | 0/94290 [00:00<?, ?it/s]You're using a PreTrainedTokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
wandb: Waiting for W&B process to finish... (failed 1).
wandb: You can sync this run to the cloud by running:
wandb: wandb sync /home/ec2-user/SageMaker/wandb/offline-run-20230621_200524-jdrmalza
wandb: Find logs at: ./wandb/offline-run-20230621_200524-jdrmalza/logs
Traceback (most recent call last):
File "/home/ec2-user/anaconda3/envs/pytorch_p310/bin/falcontune", line 33, in
sys.exit(load_entry_point('falcontune==0.1.0', 'console_scripts', 'falcontune')())
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/run.py", line 88, in main
args.func(args)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/finetune.py", line 162, in finetune
trainer.train()
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 1521, in train
return inner_training_loop(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 1763, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 2499, in training_step
loss = self.compute_loss(model, inputs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 2531, in compute_loss
outputs = model(**inputs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/peft/peft_model.py", line 678, in forward
return self.base_model(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 1070, in forward
transformer_outputs = self.transformer(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 965, in forward
outputs = block(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 698, in forward
attn_outputs = self.self_attention(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 291, in forward
fused_qkv = self.query_key_value(hidden_states) # [batch_size, seq_length, 3 x hidden_size]
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/peft/tuners/lora.py", line 565, in forward
result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
RuntimeError: expected scalar type Half but found Char
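From what I can tell, the error means F.linear received half-precision (float16) activations while the weight tensor it was handed is int8 (torch's "Char" dtype), i.e. a quantized base weight is reaching a plain linear path instead of a dequantizing one. Below is a minimal sketch that reproduces the same kind of dtype mismatch; the tensors and shapes are purely illustrative, not falcontune's.

```python
# Minimal reproduction of this kind of dtype mismatch (illustrative only).
import torch
import torch.nn.functional as F

x = torch.randn(2, 8, dtype=torch.float16)               # fp16 activations
w = torch.randint(-128, 127, (8, 8), dtype=torch.int8)   # int8 ("Char") weight

try:
    F.linear(x, w)  # a plain linear cannot mix fp16 inputs with an int8 weight
except RuntimeError as e:
    print(e)  # dtype-mismatch error; exact wording varies by PyTorch version
```

If that is what is happening here, the usual remedies in peft-based setups are to keep the quantized layers wrapped in bitsandbytes' Linear8bitLt (so the dequantizing matmul is used) and to pass the model through peft's prepare_model_for_int8_training before attaching the LoRA adapters; I'm not sure whether falcontune exposes that path, so this is only a guess at the cause.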