pikaGPT: A tiny implementation of a GPT, accelerated for Apple Silicon
Built on picoGPT (a GPT in ~60 lines of NumPy), using MLX (an array framework for Apple Silicon).
picoGPT: jaymody/picoGPT
Apple MLX: ml-explore/mlx
python pika.py "Alan Turing theorized that computers would one day become"
Returns something like:
generating: 100%|█████████████████████████| 40/40 [00:00<00:00, 51.76it/s]
the most powerful machines on the planet.
The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.
Note: Models will be downloaded to /models if needed.
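The "if needed" check could look something like the sketch below. It is not taken from utils.py: the helper name missing_model_files is hypothetical, and the sketch assumes the standard file names of OpenAI's GPT-2 release, stored under <models_dir>/<model_size>.

```python
from pathlib import Path

# File names from OpenAI's GPT-2 release. Whether utils.py checks exactly
# these files is an assumption made for illustration.
GPT2_FILES = [
    "checkpoint",
    "encoder.json",
    "hparams.json",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "vocab.bpe",
]


def missing_model_files(models_dir: str, model_size: str) -> list[str]:
    """Return the expected files that are not yet present on disk (hypothetical helper)."""
    model_dir = Path(models_dir) / model_size
    return [name for name in GPT2_FILES if not (model_dir / name).is_file()]


missing = missing_model_files("models", "124M")
if missing:
    print(f"{len(missing)} file(s) would need to be downloaded")
else:
    print("all model files present, nothing to download")
```

A download step would then only need to fetch the names returned by the helper, so repeated runs skip files that are already on disk.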
To change the model size, the number of tokens to generate, or the models directory:
python pika.py \
"Alan Turing theorized that computers would one day become" \
--n_tokens_to_generate 40 \
--model_size "124M" \
--models_dir "models"
To check against the original NumPy implementation (non-MLX), add --numpy:
python pika.py \
"Alan Turing theorized that computers would one day become" \
--numpy
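The flags used in the commands above could be parsed along these lines. This is only a sketch: the real pika.py may use a different CLI library, and the parse_args helper, the parser layout, and the default values (taken from the example commands) are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical parser mirroring the flags used in the example commands."""
    parser = argparse.ArgumentParser(description="Generate text with pikaGPT")
    parser.add_argument("prompt", help="text for the model to continue")
    # Defaults are assumed from the example commands, not read from pika.py.
    parser.add_argument("--n_tokens_to_generate", type=int, default=40)
    parser.add_argument("--model_size", default="124M")
    parser.add_argument("--models_dir", default="models")
    # --numpy selects the original NumPy code path instead of MLX.
    parser.add_argument("--numpy", action="store_true")
    return parser.parse_args(argv)


args = parse_args(["Alan Turing theorized that", "--model_size", "1558M", "--numpy"])
print(args.model_size, args.n_tokens_to_generate, args.numpy)
```

Passing an explicit argv list, as in the last lines, makes the parser easy to exercise without touching the real command line.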
If you are on Python >= 3.12, first run pip install setuptools to get distutils. See docs.
pip install -r requirements.txt
Tested and benchmarked on Python 3.12.4 and macOS Sonoma 14.5 (M1 Pro, 32GB).
The main script is pika.py, which imports encoder.py (from OpenAI) and downloads model files with utils.py.
pikaGPT is based on picoGPT, which is "an unnecessarily tiny and minimal implementation of GPT-2 in plain NumPy. The entire forward pass code is 40 lines of code."
For more, see picoGPT: jaymody/picoGPT
If/where to add mx.compile?
MLX seems to provide a >4x speedup over NumPy; compare the iterations per second (it/s) in the runs below:
(.venv) pikaGPT# python pika.py "Alan Turing theorized that computers would one day become"
generating: 100%|██████████████████████████| 40/40 [00:00<00:00, 53.03it/s]
the most powerful machines on the planet.
The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.
(.venv) pikaGPT# python pika.py "Alan Turing theorized that computers would one day become" --numpy
generating: 100%|██████████████████████████| 40/40 [00:04<00:00, 9.54it/s]
the most powerful machines on the planet.
The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.
(.venv) pikaGPT# python pika.py "Alan Turing theorized that computers would one day become" --model_size "1558M"
generating: 100%|██████████████████████████| 40/40 [00:06<00:00, 6.32it/s]
so powerful that they would be able to think like humans.
In the 1950s, he proposed a way to build a computer that could think like a human. He called it the "T
(.venv) pikaGPT# python pika.py "Alan Turing theorized that computers would one day become" --model_size "1558M" --numpy
generating: 100%|██████████████████████████| 40/40 [00:43<00:00, 1.10s/it]
so powerful that they would be able to think like humans.
In the 1950s, he proposed a way to build a computer that could think like a human. He called it the "T
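To make the comparison concrete, the progress-bar rates above can be turned into speedup factors. The sketch below uses only the four rates printed in these runs; the helper name rate_to_its_per_second is made up for illustration. Note that tqdm reports the slowest run in seconds per iteration (s/it), so that rate has to be inverted first.

```python
def rate_to_its_per_second(rate: str) -> float:
    """Convert a tqdm rate such as '53.03it/s' or '1.10s/it' to iterations per second."""
    if rate.endswith("it/s"):
        return float(rate[: -len("it/s")])
    if rate.endswith("s/it"):
        # tqdm switches to seconds per iteration when a loop runs slower than 1 it/s.
        return 1.0 / float(rate[: -len("s/it")])
    raise ValueError(f"unrecognized rate: {rate!r}")


# Rates copied from the runs above, as (MLX, NumPy) per model size.
runs = {
    "124M": ("53.03it/s", "9.54it/s"),
    "1558M": ("6.32it/s", "1.10s/it"),
}

for size, (mlx_rate, numpy_rate) in runs.items():
    speedup = rate_to_its_per_second(mlx_rate) / rate_to_its_per_second(numpy_rate)
    print(f"{size}: MLX is about {speedup:.1f}x faster than NumPy")
```

For these single runs this works out to about 5.6x for the 124M model and about 7.0x for the 1558M model. The figures come from one run each on one machine, so treat them as indicative rather than as a rigorous benchmark.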
pip install -r requirements_dev.txt
Run some tests with make test