PFL benchmarks

Set up environment

Prerequisite 1: you need to have Poetry installed.

curl -sSL https://install.python-poetry.org | python3 -

Prerequisite 2: you need a compatible Python version available to Poetry. The environment only needs to be active the first time you install the Poetry environment.

conda create -n py310 python=3.10
conda activate py310

Install environment:

git clone git@github.com:apple/pfl-research.git
cd pfl-research/benchmarks/
# If the new Python 3.10 environment is active, Poetry will pick it up via `which python`.
poetry env use `which python`
# Install to run tf, pytorch and tests
poetry install -E pytorch -E tf
# Activate environment
poetry shell
# Add root directory to `PYTHONPATH` such that the utility modules can be imported
export PYTHONPATH=`pwd`:$PYTHONPATH

This default setup should enable you to run any of the official benchmarks.
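
As an optional sanity check (an addition to the official steps, not part of them), you can verify from within the Poetry shell that the frameworks and the benchmark utility modules resolve; `dataset` is the utility package used in the quickstart below:

# Optional sanity check, run from benchmarks/ inside `poetry shell`.
# These imports rely on the -E pytorch -E tf extras and on PYTHONPATH
# containing the benchmarks root directory.
import pfl         # the simulation framework itself
import torch       # installed via the pytorch extra
import tensorflow  # installed via the tf extra
import dataset     # benchmark utility package, resolved via PYTHONPATH

print("environment looks good")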

Quickstart

  1. Complete the setup above.
  2. Download the CIFAR10 data (an optional verification sketch follows this list):
python -m dataset.cifar10.download_preprocess --output_dir data/cifar10
  3. Train a small CNN on IID CIFAR10 data:
python image_classification/pytorch/train.py --args_config image_classification/configs/baseline.yaml
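
To confirm the download step succeeded before training, a small hypothetical check like the following lists what the preprocessing script wrote; the exact file names depend on the script and are not specified here:

# Hypothetical check that the download step produced output under
# data/cifar10; file names depend on the preprocessing script.
from pathlib import Path

out_dir = Path("data/cifar10")
assert out_dir.exists(), "run the download_preprocess step first"
for path in sorted(out_dir.iterdir()):
    print(path.name, path.stat().st_size, "bytes")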

Official benchmarks

There are multiple official benchmarks for pfl that simulate various scenarios, split into categories by task; see the subdirectories of this benchmarks directory (e.g. image_classification) for the available benchmarks.

Run distributed simulations

Each benchmark can run in distributed mode with multiple cores, GPUs and machines. See the distributed simulation guide for how it works. In summary, to quickly get started running distributed simulations:

  1. Install Horovod. A helper script is available in the repository.
  2. Invoke your Python script with the horovodrun command. E.g., to run the same CIFAR10 training as in the quickstart above, but with 2 processes on the same machine, the command looks like this:
horovodrun --gloo -np 2 -H localhost:2 python image_classification/pytorch/train.py --args_config image_classification/configs/baseline.yaml

If you have 2 GPUs on the machine, each process will be allocated 1 GPU. If you only have 1 GPU, both processes will share it. Sharing a GPU can still result in a speedup, depending on the use case, because of the inevitable overhead of FL.
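
For intuition, below is a minimal sketch of the standard Horovod pattern for pinning each process to a GPU. pfl handles device placement for you, so this is purely illustrative, not code you need to add to the benchmarks.

import horovod.torch as hvd
import torch

hvd.init()
if torch.cuda.is_available():
    # With 2 processes and 2 GPUs, each process gets its own device;
    # with a single GPU, both processes map to device 0 and share it.
    torch.cuda.set_device(hvd.local_rank() % torch.cuda.device_count())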

Other Poetry commands

Alternatively, you can install a subset of dependencies:

# Install to run tf examples
poetry install -E tf --no-dev
# Install to run pytorch examples
poetry install -E pytorch --no-dev
# Install to run tf, pytorch and tests
poetry install -E pytorch -E tf