
Request for more Details on Training the Reward Model #2

Open
brightest66 opened this issue Nov 7, 2024 · 3 comments


@brightest66

Hi,

I am currently trying to reproduce the experiments in the section "Process Rewards Annotating (Taking LogiQA-v2 as an Example)" of your README.md. However, at Step 5 I found that there is no script for training the reward model (RM). Could you provide more details about the RM training process? In particular, it would help a lot if you could share the files used for RM training, or a script similar to the ones already provided.

I have attempted to run the following command:

```shell
deepspeed trainer_base_ds_mul.py -cp conf/exp/reward/logiqav2 -cn llama2_7b_70bdistil_prm_v2_0
```

but it failed to execute. Perhaps there is an issue with my Python environment (even though I installed it according to requirements.txt), or I may have misunderstood the code structure and run the wrong script.

Thank you very much for your assistance. Please excuse my limited coding experience.

@SparkJiao (Owner)

Please show me the error message.

@brightest66 (Author)

Ubuntu 20.04.1
NVIDIA RTX 4090 × 2
CUDA 12.4

flash-attn: 2.3.3
vllm: 0.2.5
transformers: 4.36.1
deepspeed: 0.12.2

Partial output below. Thanks for your help!

```
[2024-11-07 20:47:17,327][FK][WARNING] - Error locating target 'models.llama.LlamaModelForSequenceClassification.from_pretrained', see chained exception above.
full_key: model

Traceback (most recent call last):
  File "/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/hydra/_internal/utils.py", line 639, in _locate
    obj = getattr(obj, part)
AttributeError: module 'models' has no attribute 'llama'

During handling of the above exception, another exception occurred:

···
Traceback (most recent call last):
  File "/home/data/daiqun/dpo-trajectory-reasoning-main/trainer_base_ds_mul.py", line 426, in <module>
    raise ImportError(
ImportError: Error loading 'models.llama.LlamaModelForSequenceClassification.from_pretrained':
ImportError('/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/flash_attn_2_cuda.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZNK3c105Error4whatEv')

The above exception was the direct cause of the following exception:

···
Traceback (most recent call last):
  File "/home/data/daiqun/dpo-trajectory-reasoning-main/trainer_base_ds_mul.py", line 328, in main
    model = hydra.utils.call(cfg.model, cfg.model_name_or_path, state_dict=pretrain_state_dict)
  File "/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 325, in instantiate_node
    return instantiate_node(
  File "/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 325, in instantiate_node
    _target_ = _resolve_target(node.get(_Keys.TARGET), full_key)
  File "/home/work/anaconda3/envs/reasoning_dpo/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 139, in _resolve_target
    raise InstantiationException(msg) from e
hydra.errors.InstantiationException: Error locating target 'models.llama.LlamaModelForSequenceClassification.from_pretrained', see chained exception above.
full_key: model
```
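Note that the root cause in the chain above is the flash-attn CUDA extension failing to import; Hydra's `_locate` then misreports it as `module 'models' has no attribute 'llama'`. A minimal diagnostic sketch (hypothetical, not code from this repo) that surfaces the underlying error directly, without going through Hydra:

```python
# Hypothetical diagnostic: importing the compiled extension named in the
# traceback reproduces the root ImportError in isolation.
def check_flash_attn():
    """Return 'ok' if the flash-attn CUDA extension loads, else the error text."""
    try:
        import flash_attn_2_cuda  # the .so cited in the undefined-symbol error
        return "ok"
    except ImportError as exc:
        return f"broken: {exc}"

print(check_flash_attn())
```

An `undefined symbol: _ZNK3c105Error4whatEv` message (the mangled name of `c10::Error::what()`) typically means the flash-attn wheel was compiled against a different PyTorch build than the one installed, i.e. an ABI mismatch rather than a bug in the training code.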

@SparkJiao (Owner)

Maybe you can first reinstall flash-attention following the instructions at https://github.com/Dao-AILab/flash-attention.

If it still does not work, change the `attn_implementation` here from `flash_attention_2` to `sdpa` or `eager`.
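The fallback suggested above can also be chosen at runtime instead of hard-coded in the config. A sketch (an illustration, not this repo's code; the commented `from_pretrained` call assumes the standard transformers-style entry point):

```python
# Hypothetical sketch: probe for a healthy flash-attn install and fall back to
# PyTorch's built-in scaled-dot-product attention ("sdpa") otherwise.
def pick_attn_implementation():
    try:
        import flash_attn  # noqa: F401 -- succeeds only if the wheel loads cleanly
        return "flash_attention_2"
    except ImportError:
        return "sdpa"

attn_impl = pick_attn_implementation()
# The chosen value would then be passed through, e.g.:
# model = LlamaModelForSequenceClassification.from_pretrained(
#     model_name_or_path, attn_implementation=attn_impl)
print(attn_impl)
```

`sdpa` is slower than flash-attention but numerically equivalent for inference and training, so it is a reasonable workaround while the flash-attn build is broken.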
