
Releases: modularml/max

MAX 24.5

26 Sep 21:28

Release 24.5

We are excited to announce the release of MAX 24.5! This release includes support for installing MAX as a conda package with magic, a powerful new package and virtual environment manager. We’re also introducing two new Python APIs for MAX Graph and MAX Driver, which will ultimately provide the same low-level programming interface as the Mojo Graph API. MAX Engine has improved performance for Llama3, with 24.5 generating tokens an average of 15% to 48% faster. Lastly, this release adds support for Python 3.12 and drops support for Python 3.8 and Ubuntu 20.04.
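As a minimal sketch of what the new MAX Graph Python API looks like, the example below builds a tiny graph that adds two tensors and runs it with MAX Engine. It loosely follows the Python examples published alongside this release; the exact module paths, the TensorType and Graph constructors, and the execute call signature are assumptions here and may differ from the shipped API, so treat the official docs as the source of truth.

    import numpy as np

    from max import engine
    from max.dtype import DType
    from max.graph import Graph, TensorType, ops

    # Build a small graph that adds two float32 tensors of shape (1,).
    input_type = TensorType(dtype=DType.float32, shape=(1,))
    with Graph("simple_add", input_types=(input_type, input_type)) as graph:
        lhs, rhs = graph.inputs
        graph.output(ops.add(lhs, rhs))

    # Compile the graph with MAX Engine and execute it.
    session = engine.InferenceSession()
    model = session.load(graph)
    # Note: the exact execute() signature (positional vs. named inputs) is an
    # assumption; adjust to match your installed MAX version.
    result = model.execute(np.array([1.0], dtype=np.float32),
                           np.array([2.0], dtype=np.float32))
    print(result)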

For additional details, check out the changelog and the release announcement.

MAX 24.4

07 Jun 21:31

Release 24.4

Today, we are thrilled to announce the release of MAX 24.4, which introduces a powerful new quantization API for MAX Graphs and extends MAX’s reach to macOS. Together, these unlock a new industry-standard paradigm where developers can leverage a single toolchain to build Generative AI pipelines locally and seamlessly deploy them to the cloud, all with industry-leading performance. Leveraging the quantization API reduces the latency and memory cost of Generative AI pipelines by up to 8x on desktop architectures like macOS, and by up to 7x on cloud CPU architectures like Intel and Graviton, without requiring developers to rewrite models or update any application code.

Check out the changelog and the full release blog for additional details.