(4/n) Data Refactor - Finetuning Scripts #950
Conversation
When running with 1 epoch and an epoch size of 50,000 on Alpaca (the default global batch size of 64 and a micro-batch size of 1), this is how the learning rate currently evolves over the course of training:

[learning-rate plot]

This looks good and is exactly what I would expect. When I reduced the epoch size to 1000 and adjusted the warmup steps from 100 to 10, it seems to do something weird:

[learning-rate plot]

I think we need to adjust the code so that the scheduler steps here

```python
steps_per_epoch = len(train_dataloader) // train.gradient_accumulation_iters(devices)
lr_max_steps = train.epochs * steps_per_epoch
```

are perhaps computed by the …
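For reference, here is a rough back-of-the-envelope sketch of how these quantities come out under the settings above. It assumes a single device, that `len(train_dataloader)` is `epoch_size // (micro_batch_size * devices)`, and that gradient accumulation iters are `global_batch_size // (micro_batch_size * devices)`; the helper function below is purely illustrative, not the exact code in the repo:

```python
# Illustrative arithmetic only -- mirrors the two quoted lines above.
def lr_schedule_steps(epoch_size, epochs, global_batch_size, micro_batch_size, devices, warmup_steps):
    # Assumed: accumulation iters derived from global vs. micro batch size.
    grad_accum_iters = global_batch_size // (micro_batch_size * devices)
    # Assumed: dataloader length equals samples per device per epoch.
    batches_per_epoch = epoch_size // (micro_batch_size * devices)
    steps_per_epoch = batches_per_epoch // grad_accum_iters
    lr_max_steps = epochs * steps_per_epoch
    return steps_per_epoch, lr_max_steps, warmup_steps

# Epoch size 50,000: ~781 optimizer steps, so warmup=100 is a small fraction of training.
print(lr_schedule_steps(50_000, 1, 64, 1, 1, 100))  # -> (781, 781, 100)
# Epoch size 1,000: only ~15 optimizer steps, so warmup=10 dominates the whole run,
# which would explain the odd-looking schedule.
print(lr_schedule_steps(1_000, 1, 64, 1, 1, 10))    # -> (15, 15, 10)
```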
My comments would apply to all the other files too.
Co-authored-by: rasbt <[email protected]> Co-authored-by: Carlos Mocholí <[email protected]>
Fixes #954
Fixes #951