```
Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████| 2/2 [00:42<00:00, 21.04s/it]
Device map for lora: auto
falcon-7b-alpaca loaded
/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/bitsandbytes/autograd/_functions.py:318: UserWarning: MatMul8bitLt: inputs will be cast from torch.float32 to float16 during quantization
  warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
```
```
Traceback (most recent call last):
  File "/data/home/yaokj5/anaconda3/envs/falcon/bin/falcontune", line 33, in <module>
    sys.exit(load_entry_point('falcontune==0.1.0', 'console_scripts', 'falcontune')())
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/run.py", line 88, in main
    args.func(args)
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/generate.py", line 71, in generate
    generated_ids = model.generate(
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/generate.py", line 27, in autocast_generate
    return self.model.non_autocast_generate(*args, **kwargs)
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/peft/peft_model.py", line 731, in generate
    outputs = self.base_model.generate(**kwargs)
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/transformers/generation/utils.py", line 1565, in generate
    return self.sample(
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/transformers/generation/utils.py", line 2612, in sample
    outputs = self(
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 1072, in forward
    transformer_outputs = self.transformer(
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 967, in forward
    outputs = block(
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/home/yaokj5/anaconda3/envs/falcon/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 722, in forward
    mlp_output += attention_output
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!
```
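The crash happens in falcontune's Falcon block at the residual addition `mlp_output += attention_output` (model.py line 722): with `Device map for lora: auto`, accelerate shards the transformer blocks across both GPUs, and the two operands end up on `cuda:0` and `cuda:1`. A common workaround (an assumption here, not a confirmed falcontune fix) is to pin the whole model to a single device, either by setting `CUDA_VISIBLE_DEVICES=0` before launching falcontune or by forcing a single-device map when loading:

```python
import torch
from transformers import AutoModelForCausalLM

# Variation of the load above: device_map={"": 0} places every module on
# cuda:0, so the residual addition never mixes cuda:0 and cuda:1 tensors.
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",         # illustrative checkpoint name
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map={"": 0},
)
```

This trades multi-GPU capacity for correctness: the full 8-bit model must fit on one card.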
File "/opt/conda/lib/python3.9/site-packages/bitsandbytes/nn/modules.py", line 336, in _save_to_state_dict
self.weight.data = undo_layout(self.state.CxB, self.state.tile_indices)
File "/opt/conda/lib/python3.9/site-packages/bitsandbytes/autograd/_functions.py", line 96, in undo_layout
outputs = torch.empty_like(tensor) # note: not using .index_copy because it was slower on cuda
torch.cuda.OutOfMemoryError: CUDA out of memory.
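Here the OOM is in serialization: bitsandbytes' `_save_to_state_dict` calls `undo_layout`, which (per the `torch.empty_like` line above) allocates a full temporary copy of each 8-bit weight to restore its row-major layout, and that extra allocation exhausts the remaining GPU memory. If only the fine-tuned LoRA weights need to be persisted, a hedged workaround is to save just the PEFT adapter, which never touches the quantized base weights:

```python
# A sketch, assuming `model` is the peft.PeftModel wrapping the 8-bit base.
# save_pretrained on a PeftModel writes only the small adapter tensors, so
# bitsandbytes' undo_layout is never invoked on the base weights.
model.save_pretrained("falcon-7b-alpaca-lora")  # output path is illustrative
```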