I am new to AI and am trying to run the llama2 model locally using pyllama. I tried different options, but nothing seems to work. I downloaded llama from https://github.com/facebookresearch/llama.
Here is what I tried (see below for installed packages):
$ torchrun --nproc_per_node 1 example.py --ckpt_dir ../codellama/CodeLlama-7b/ --tokenizer_path ../codellama/CodeLlama-7b/tokenizer.model
Traceback (most recent call last):
File "/home/xxxxx/pyllama/example.py", line 80, in <module>
fire.Fire(main)
File "/home/xxxxx/miniconda3/envs/llama2/lib/python3.11/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
..
File "/home/xxxxx/miniconda3/envs/llama2/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py", line 1268, in _new_process_group_helper
raise RuntimeError("Distributed package doesn't have NCCL built in")
RuntimeError: Distributed package doesn't have NCCL built in
[2024-01-01 20:58:30,998] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 1814953) of binary: /home/xxxxx/miniconda3/envs/llama2/bin/python
Traceback (most recent call last):
..
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
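From what I can tell, the error means the script initializes torch.distributed with the NCCL backend, which only exists in CUDA builds of PyTorch. Here is a minimal sketch (my assumption from reading the traceback, not code from the repo) showing the difference on a CPU-only install; the gloo backend initializes where nccl raises the same RuntimeError:

```python
import os
import torch.distributed as dist

# Single-process rendezvous so init_process_group can run standalone.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

# This reproduces the failure on a CPU-only build:
# dist.init_process_group("nccl", rank=0, world_size=1)
# -> RuntimeError: Distributed package doesn't have NCCL built in

# The CPU (gloo) backend initializes fine:
dist.init_process_group("gloo", rank=0, world_size=1)
print("initialized:", dist.is_initialized())
dist.destroy_process_group()
```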
The command below seems to start, but I get no response whatsoever:
KV_CACHE_IN_GPU=0 python inference.py --ckpt_dir ../codellama/CodeLlama-7b/ --tokenizer_path ../codellama/CodeLlama-7b/tokenizer.model
.. <after waiting several seconds, I typed the following prompt and pressed Enter> ..
Prompt:['I believe in ']
<no response whatsoever>
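One way to check whether the process is actually computing or is deadlocked would be to dump its stack with py-spy (assuming it is installed via pip install py-spy; pgrep here just looks up the PID of the running script):

$ py-spy dump --pid $(pgrep -f inference.py)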
I tried both the CUDA and non-CUDA PyTorch packages from https://pytorch.org/get-started/locally/. For example:
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
Either way, I get the same NCCL error from torchrun and no output from inference.py.
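A quick way to see which distributed backends a given install actually has (plain PyTorch introspection APIs, nothing specific to llama or pyllama):

```python
import torch
import torch.distributed as dist

print("torch version: ", torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # False on a CPU-only box
print("NCCL available:", dist.is_nccl_available())   # False -> torchrun's nccl init fails
print("gloo available:", dist.is_gloo_available())   # True -> the CPU backend is usable
```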
I am on an HP workstation running Ubuntu 23.04 (Lunar Lobster). CPU details:
CPU(s): 4
On-line CPU(s) list: 0-3
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU W3565 @ 3.20GHz
CPU family: 6
Model: 26
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Stepping: 5