Request for more Details on Training the Reward Model #2
Comments
Please show me the error message.
Ubuntu 20.04.1, flash-attn 2.3.3. Partial output below. Thanks for your help!

[2024-11-07 20:47:17,327][FK][WARNING] - Error locating target 'models.llama.LlamaModelForSequenceClassification.from_pretrained', see chained exception above.
full_key: model
Traceback (most recent call last):
  File "/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/hydra/_internal/utils.py", line 639, in _locate
    obj = getattr(obj, part)
AttributeError: module 'models' has no attribute 'llama'

During handling of the above exception, another exception occurred:

···
ImportError: Error loading 'models.llama.LlamaModelForSequenceClassification.from_pretrained':
ImportError('/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/flash_attn_2_cuda.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZNK3c105Error4whatEv')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/data/daiqun/dpo-trajectory-reasoning-main/trainer_base_ds_mul.py", line 328, in main
    model = hydra.utils.call(cfg.model, cfg.model_name_or_path, state_dict=pretrain_state_dict)
  File "/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 325, in instantiate_node
    return instantiate_node(
  File "/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 325, in instantiate_node
    _target_ = _resolve_target(node.get(_Keys.TARGET), full_key)
  File "/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 139, in _resolve_target
    raise InstantiationException(msg) from e
hydra.errors.InstantiationException: Error locating target 'models.llama.LlamaModelForSequenceClassification.from_pretrained', see chained exception above.
full_key: model

···
Traceback (most recent call last):
  File "/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/hydra/_internal/utils.py", line 639, in _locate
    obj = getattr(obj, part)
AttributeError: module 'models' has no attribute 'llama'

During handling of the above exception, another exception occurred:

···
Traceback (most recent call last):
  File "/home/data/daiqun/dpo-trajectory-reasoning-main/trainer_base_ds_mul.py", line 426, in <module>
    raise ImportError(
ImportError: Error loading 'models.llama.LlamaModelForSequenceClassification.from_pretrained':
ImportError('/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/flash_attn_2_cuda.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZNK3c105Error4whatEv')

The above exception was the direct cause of the following exception:

···
hydra.errors.InstantiationException: Error locating target 'models.llama.LlamaModelForSequenceClassification.from_pretrained', see chained exception above.
full_key: model
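The undefined symbol _ZNK3c105Error4whatEv raised from flash_attn_2_cuda is the root cause of the chain above: Hydra first fails a getattr lookup on models.llama, then falls back to importing the module, and that import fails because the flash-attn CUDA extension was built against a different PyTorch C++ ABI than the one installed. A minimal, repo-independent check, shown below as an illustration rather than part of the project, is to import the extension directly:

```python
# Hypothetical diagnostic: reproduces the undefined-symbol error outside Hydra
# if the installed flash-attn wheel does not match the installed torch build.
import torch

print("torch:", torch.__version__, "cuda:", torch.version.cuda)

try:
    import flash_attn_2_cuda  # the compiled extension named in the traceback
    print("flash-attn CUDA extension loads fine")
except ImportError as e:
    # An undefined-symbol message here confirms an ABI mismatch between
    # flash-attn 2.3.3 and the current torch installation.
    print("flash-attn import failed:", e)
```

If this import fails with the same message, the problem is purely environmental and unrelated to the training scripts themselves.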
Maybe you can first reinstall flash-attention following https://github.com/Dao-AILab/flash-attention. If it still does not work, change the attn_implementation here from …
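For reference, the reinstall option amounts to rebuilding flash-attn against the installed torch (for example, pip uninstall flash-attn followed by pip install flash-attn --no-build-isolation). If that still fails, disabling FlashAttention-2 at load time is a common workaround. The snippet below is a generic Hugging Face sketch, not the repo's actual call site; the model class and checkpoint path are placeholders, and it assumes the repo's model wrapper forwards the standard attn_implementation keyword:

```python
# Sketch of the fallback: load the sequence-classification model without
# FlashAttention-2, so flash_attn_2_cuda is never imported.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-2-7b-hf",      # placeholder checkpoint path
    attn_implementation="eager",     # or "sdpa"; both avoid the flash-attn binary
)
```

Running with eager or sdpa attention trades some speed for removing the dependency on the flash-attn binary entirely, which is usually enough to unblock reward-model training while the environment is being fixed.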
Hi,
I am currently attempting to reproduce the experiments described in the section titled "Process Rewards Annotating (Taking LogiQA-v2 as an Example)" of your README.md. However, upon reaching Step 5, I noticed that there is no script for training the reward model (RM). Could you please provide more information about the RM training process? Specifically, it would be very helpful if you could share the files associated with RM training or provide a script similar to those already shared.
I have attempted to run the following script:
but it failed to execute. Perhaps there is an issue with my Python environment setup (even though I installed it according to requirements.txt), or I may have misunderstood the code structure and run the wrong script.
Thank you very much for your assistance. Please excuse my limited coding experience.