[fine_tuning] toolbox: fine_tuning_ray_fine_tuning_job: new toolbox command #572

Merged: 1 commit merged into openshift-psap:main from the ray branch on Oct 23, 2024

@kpouget kpouget commented Oct 18, 2024

./run_toolbox.py fine_tuning ray_fine_tuning_job \
   --name=lora \
   --namespace=fine-tuning-testing \
   --pvc_name=fine-tuning-storage \
   --model_name=granite-7b-base \
   --dataset_name=twitter_complaints_small.json \
   --gpu=1 --cpu=16 --memory=64 --request_equals_limits=True --worker_replicas=7 \
   --ft-scripts-dir=projects/fine_tuning/toolbox/fine_tuning_ray_fine_tuning_job/files/ray-finetune-llm-deepspeed

(currently only works on the DGX, because the dataset was put manually in the PVC)
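
For scripting the invocation above, a minimal Python sketch that assembles the same command line could look like the following. The helper name `build_toolbox_cmd` and the `PARAMS` dictionary are hypothetical and not part of the repository; the flags and values are copied verbatim from the command above, including the hyphenated `--ft-scripts-dir` spelling. The sketch only prints the command and does not run it.

```python
import shlex

# Parameters copied verbatim from the command in the PR description.
# Flag spelling is kept as written there, including the hyphenated
# --ft-scripts-dir next to the underscore-style flags.
PARAMS = {
    "--name": "lora",
    "--namespace": "fine-tuning-testing",
    "--pvc_name": "fine-tuning-storage",
    "--model_name": "granite-7b-base",
    "--dataset_name": "twitter_complaints_small.json",
    "--gpu": 1,
    "--cpu": 16,
    "--memory": 64,
    "--request_equals_limits": True,
    "--worker_replicas": 7,
    "--ft-scripts-dir": (
        "projects/fine_tuning/toolbox/fine_tuning_ray_fine_tuning_job"
        "/files/ray-finetune-llm-deepspeed"
    ),
}


def build_toolbox_cmd(params):
    """Hypothetical helper (not part of the repository): build the argv list."""
    argv = ["./run_toolbox.py", "fine_tuning", "ray_fine_tuning_job"]
    # Each flag is rendered as --flag=value, matching the style used above.
    argv += [f"{flag}={value}" for flag, value in params.items()]
    return argv


if __name__ == "__main__":
    # Print a shell-quoted, copy-pastable command line; nothing is executed.
    print(shlex.join(build_toolbox_cmd(PARAMS)))
```

Keeping the parameters in one dictionary makes it easy to vary a single value (for example `--worker_replicas`) between runs without retyping the whole command.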

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 18, 2024

openshift-ci bot commented Oct 18, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from kpouget. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kpouget kpouget changed the title WIP: [fine_tuning] toolbox: fine_tuning_ray_fine_tuning_job: new toolbox c… WIP: [fine_tuning] toolbox: fine_tuning_ray_fine_tuning_job: new toolbox command Oct 18, 2024
@kpouget kpouget changed the title WIP: [fine_tuning] toolbox: fine_tuning_ray_fine_tuning_job: new toolbox command [fine_tuning] toolbox: fine_tuning_ray_fine_tuning_job: new toolbox command Oct 23, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 23, 2024

kpouget commented Oct 23, 2024

Test passed locally, merging.
Still a POC: detecting whether the job passed or failed is not yet fully reliable, and the dataset used by this fine-tuning job still needs to be manually generated and copied to the PVC. But the fine-tuning job runs :)

@kpouget kpouget merged commit 311382c into openshift-psap:main Oct 23, 2024
6 of 7 checks passed
@kpouget kpouget deleted the ray branch October 23, 2024 15:30