Fix data_parallel_shard_degree description (#659)
carmocca authored Oct 29, 2024
1 parent 7310abe commit dbb0520
Showing 1 changed file with 1 addition and 3 deletions.
torchtitan/config_manager.py: 1 addition & 3 deletions
```diff
@@ -249,9 +249,7 @@ def __init__(self):
             parallelism method used is FSDP (Fully Sharded Data Parallelism).
             -1 means leftover ranks will be used (After DP_REPLICATE/SP/PP). Note that
-            only one of `data_parallel_replicate_degree` and `data_parallel_shard_degree`
-            can be negative.
-            1 means disabled.""",
+            only `data_parallel_shard_degree` can be negative. 1 means disabled.""",
         )
         self.parser.add_argument(
             "--training.enable_cpu_offload",
```
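For context, the corrected help text describes how the shard degree resolves against the world size: -1 absorbs whatever ranks remain after the other parallelism degrees are applied, and only the shard degree (not the replicate degree) may be negative. Below is a minimal sketch of that resolution rule, not torchtitan's actual implementation; the names `resolve_shard_degree` and `non_dp_degree` (standing in for the product of the SP/PP degrees) are hypothetical.

```python
def resolve_shard_degree(
    world_size: int,
    replicate_degree: int,
    non_dp_degree: int,
    shard_degree: int,
) -> int:
    """Resolve --training.data_parallel_shard_degree against the world size.

    Sketch only: non_dp_degree is a stand-in for the product of the
    SP/PP degrees. -1 claims the leftover ranks; 1 disables FSDP sharding.
    """
    leftover = world_size // (replicate_degree * non_dp_degree)
    if shard_degree == -1:
        # Only the shard degree may be negative: it absorbs whatever
        # ranks remain after DP_REPLICATE/SP/PP are accounted for.
        return leftover
    if shard_degree < 1:
        raise ValueError("shard degree must be -1 or a positive integer")
    if replicate_degree * shard_degree * non_dp_degree != world_size:
        raise ValueError("parallelism degrees must multiply to world_size")
    return shard_degree  # 1 means FSDP sharding is disabled


# Example: 8 ranks, replicate=2, no SP/PP, shard=-1 resolves to 4.
assert resolve_shard_degree(8, 2, 1, -1) == 4
```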
