DOC: ascend support (xorbitsai#1978)
qinxuye authored Jul 30, 2024
1 parent aafd36e commit 31523d6
Showing 6 changed files with 284 additions and 43 deletions.
14 changes: 14 additions & 0 deletions doc/source/getting_started/installation.rst
@@ -99,3 +99,17 @@ SGLang has a high-performance inference runtime with RadixAttention. It signific
Initial setup::

pip install 'xinference[sglang]'


MLX Backend
~~~~~~~~~~~
MLX-lm is designed for Apple silicon users to run LLMs efficiently.

Initial setup::

pip install 'xinference[mlx]'
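
For illustration only (not part of this commit), a minimal sketch of serving a model with
the MLX engine after the setup above; the model name, size, and flag values are assumptions,
so check ``xinference launch --help`` on your installed version::

    # start a local Xinference server (defaults to http://127.0.0.1:9997)
    xinference-local --host 127.0.0.1 --port 9997

    # in another shell, launch an LLM with the MLX engine
    # (model name and size are placeholders for any MLX-supported model)
    xinference launch --model-name qwen2-instruct \
                      --size-in-billions 7 \
                      --model-format mlx \
                      --model-engine mlx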

Other Platforms
~~~~~~~~~~~~~~~

* :ref:`Ascend NPU <installation_npu>`
47 changes: 47 additions & 0 deletions doc/source/getting_started/installation_npu.rst
@@ -0,0 +1,47 @@
.. _installation_npu:


=================================
Installation Guide for Ascend NPU
=================================
Xinference can run on Ascend NPU; follow the instructions below to install it.


Installing PyTorch and Ascend extension for PyTorch
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Install the CPU version of PyTorch and the corresponding Ascend extension.

Take PyTorch v2.1.0 as an example.

.. code-block:: bash

   pip3 install torch==2.1.0 torchvision==0.16.0 --index-url https://download.pytorch.org/whl/cpu

Then install `Ascend extension for PyTorch <https://github.com/Ascend/pytorch>`_.

.. code-block:: bash

   pip3 install 'numpy<2.0'
   pip3 install decorator
   pip3 install torch-npu==2.1.0.post3

Run the command below to check that it correctly prints the Ascend NPU count.

.. code-block:: bash

   python -c "import torch; import torch_npu; print(torch.npu.device_count())"

Installing Xinference
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   pip3 install xinference

Now you can use Xinference according to the :ref:`doc <using_xinference>`.
The ``Transformers`` backend is the only engine supported for Ascend NPU in the open-source version.
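
As a hedged illustration (the model name, size, and flag values are assumptions rather than
part of this commit; see ``xinference launch --help`` for the exact options of your version),
launching a model with the ``Transformers`` engine could look like:

.. code-block:: bash

   # start the local server
   xinference-local --host 127.0.0.1 --port 9997

   # launch an LLM with the Transformers engine, the only engine available on Ascend NPU
   xinference launch --model-name qwen2-instruct \
                     --size-in-billions 7 \
                     --model-format pytorch \
                     --model-engine transformers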

Enterprise Support
~~~~~~~~~~~~~~~~~~
If you encounter any performance or other issues on Ascend NPU, please reach out to us
via this `link <https://xorbits.io/community>`_.
101 changes: 75 additions & 26 deletions doc/source/locale/zh_CN/LC_MESSAGES/getting_started/installation.po
@@ -7,7 +7,7 @@ msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-05-31 11:46+0800\n"
"POT-Creation-Date: 2024-07-30 17:00+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
@@ -116,7 +116,9 @@ msgid "Currently, supported models include:"
msgstr "目前,支持的模型包括:"

#: ../../source/getting_started/installation.rst:42
msgid "``llama-2``, ``llama-3``, ``llama-2-chat``, ``llama-3-instruct``"
msgid ""
"``llama-2``, ``llama-3``, ``llama-2-chat``, ``llama-3-instruct``, "
"``llama-3.1``, ``llama-3.1-instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:43
@@ -130,72 +132,95 @@ msgid ""
msgstr ""

#: ../../source/getting_started/installation.rst:45
msgid "``mistral-v0.1``, ``mistral-instruct-v0.1``, ``mistral-instruct-v0.2``"
msgid ""
"``mistral-v0.1``, ``mistral-instruct-v0.1``, ``mistral-instruct-v0.2``, "
"``mistral-instruct-v0.3``, ``mistral-nemo-instruct``, ``mistral-large-"
"instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:46
msgid "``Yi``, ``Yi-1.5``, ``Yi-chat``, ``Yi-1.5-chat``, ``Yi-1.5-chat-16k``"
msgid "``codestral-v0.1``"
msgstr ""

#: ../../source/getting_started/installation.rst:47
msgid "``code-llama``, ``code-llama-python``, ``code-llama-instruct``"
msgid "``Yi``, ``Yi-1.5``, ``Yi-chat``, ``Yi-1.5-chat``, ``Yi-1.5-chat-16k``"
msgstr ""

#: ../../source/getting_started/installation.rst:48
msgid "``code-llama``, ``code-llama-python``, ``code-llama-instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:49
msgid ""
"``deepseek``, ``deepseek-coder``, ``deepseek-chat``, ``deepseek-coder-"
"instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:49
#: ../../source/getting_started/installation.rst:50
msgid "``codeqwen1.5``, ``codeqwen1.5-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:50
#: ../../source/getting_started/installation.rst:51
msgid "``vicuna-v1.3``, ``vicuna-v1.5``"
msgstr ""

#: ../../source/getting_started/installation.rst:51
#: ../../source/getting_started/installation.rst:52
msgid "``internlm2-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:52
#: ../../source/getting_started/installation.rst:53
msgid "``internlm2.5-chat``, ``internlm2.5-chat-1m``"
msgstr ""

#: ../../source/getting_started/installation.rst:54
msgid "``qwen-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:53
#: ../../source/getting_started/installation.rst:55
msgid "``mixtral-instruct-v0.1``, ``mixtral-8x22B-instruct-v0.1``"
msgstr ""

#: ../../source/getting_started/installation.rst:54
#: ../../source/getting_started/installation.rst:56
msgid "``chatglm3``, ``chatglm3-32k``, ``chatglm3-128k``"
msgstr ""

#: ../../source/getting_started/installation.rst:55
#: ../../source/getting_started/installation.rst:57
msgid "``glm4-chat``, ``glm4-chat-1m``"
msgstr ""

#: ../../source/getting_started/installation.rst:58
msgid "``codegeex4``"
msgstr ""

#: ../../source/getting_started/installation.rst:59
msgid "``qwen1.5-chat``, ``qwen1.5-moe-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:56
#: ../../source/getting_started/installation.rst:60
msgid "``qwen2-instruct``, ``qwen2-moe-instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:61
msgid "``gemma-it``"
msgstr ""

#: ../../source/getting_started/installation.rst:57
#: ../../source/getting_started/installation.rst:62
msgid "``orion-chat``, ``orion-chat-rag``"
msgstr ""

#: ../../source/getting_started/installation.rst:58
#: ../../source/getting_started/installation.rst:63
msgid "``c4ai-command-r-v01``"
msgstr ""

#: ../../source/getting_started/installation.rst:61
#: ../../source/getting_started/installation.rst:66
msgid "To install Xinference and vLLM::"
msgstr "安装 xinference 和 vLLM:"

#: ../../source/getting_started/installation.rst:68
#: ../../source/getting_started/installation.rst:73
msgid "Llama.cpp Backend"
msgstr "Llama.cpp 引擎"

#: ../../source/getting_started/installation.rst:69
#: ../../source/getting_started/installation.rst:74
msgid ""
"Xinference supports models in ``gguf`` and ``ggml`` format via ``llama-"
"cpp-python``. It's advised to install the llama.cpp-related dependencies "
@@ -204,32 +229,33 @@ msgstr ""
"Xinference 通过 ``llama-cpp-python`` 支持 ``gguf`` 和 ``ggml`` 格式的模型"
"。建议根据当前使用的硬件手动安装依赖,从而获得最佳的加速效果。"

#: ../../source/getting_started/installation.rst:71
#: ../../source/getting_started/installation.rst:94
#: ../../source/getting_started/installation.rst:76
#: ../../source/getting_started/installation.rst:99
#: ../../source/getting_started/installation.rst:108
msgid "Initial setup::"
msgstr "初始步骤:"

#: ../../source/getting_started/installation.rst:75
#: ../../source/getting_started/installation.rst:80
msgid "Hardware-Specific installations:"
msgstr "不同硬件的安装方式:"

#: ../../source/getting_started/installation.rst:77
#: ../../source/getting_started/installation.rst:82
msgid "Apple Silicon::"
msgstr "Apple M系列"

#: ../../source/getting_started/installation.rst:81
#: ../../source/getting_started/installation.rst:86
msgid "Nvidia cards::"
msgstr "英伟达显卡:"

#: ../../source/getting_started/installation.rst:85
#: ../../source/getting_started/installation.rst:90
msgid "AMD cards::"
msgstr "AMD 显卡:"

#: ../../source/getting_started/installation.rst:91
#: ../../source/getting_started/installation.rst:96
msgid "SGLang Backend"
msgstr "SGLang 引擎"

#: ../../source/getting_started/installation.rst:92
#: ../../source/getting_started/installation.rst:97
msgid ""
"SGLang has a high-performance inference runtime with RadixAttention. It "
"significantly accelerates the execution of complex LLM programs by "
Expand All @@ -240,6 +266,23 @@ msgstr ""
"自动重用KV缓存,显著加速了复杂 LLM 程序的执行。它还支持其他常见推理技术,"
"如连续批处理和张量并行处理。"

#: ../../source/getting_started/installation.rst:105
#, fuzzy
msgid "MLX Backend"
msgstr "vLLM 引擎"

#: ../../source/getting_started/installation.rst:106
msgid "MLX-lm is designed for Apple silicon users to run LLM efficiently."
msgstr "MLX-lm 用来在苹果 silicon 芯片上提供高效的 LLM 推理。"

#: ../../source/getting_started/installation.rst:113
msgid "Other Platforms"
msgstr "其他平台"

#: ../../source/getting_started/installation.rst:115
msgid ":ref:`Ascend NPU <installation_npu>`"
msgstr ""

#~ msgid "``Yi``, ``Yi-chat``"
#~ msgstr ""

@@ -252,3 +295,9 @@ msgstr ""
#~ msgid "``codeqwen1.5-chat``"
#~ msgstr ""

#~ msgid "``llama-2``, ``llama-3``, ``llama-2-chat``, ``llama-3-instruct``"
#~ msgstr ""

#~ msgid "``mistral-v0.1``, ``mistral-instruct-v0.1``, ``mistral-instruct-v0.2``"
#~ msgstr ""

79 changes: 79 additions & 0 deletions doc/source/locale/zh_CN/LC_MESSAGES/getting_started/installation_npu.po
@@ -0,0 +1,79 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2023, Xorbits Inc.
# This file is distributed under the same license as the Xinference package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2024.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-07-30 17:00+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <[email protected]>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.14.0\n"

#: ../../source/getting_started/installation_npu.rst:6
msgid "Installation Guide for Ascend NPU"
msgstr "在昇腾 NPU 上安装"

#: ../../source/getting_started/installation_npu.rst:7
msgid "Xinference can run on Ascend NPU, follow below instructions to install."
msgstr "Xinference 能在昇腾 NPU 上运行,使用如下命令安装。"

#: ../../source/getting_started/installation_npu.rst:11
msgid "Installing PyTorch and Ascend extension for PyTorch"
msgstr "安装 PyTorch 和昇腾扩展"

#: ../../source/getting_started/installation_npu.rst:12
msgid "Install PyTorch CPU version and corresponding Ascend extension."
msgstr "安装 PyTorch CPU 版本和相应的昇腾扩展。"

#: ../../source/getting_started/installation_npu.rst:14
msgid "Take PyTorch v2.1.0 as example."
msgstr "以 PyTorch v2.1.0 为例。"

#: ../../source/getting_started/installation_npu.rst:20
msgid ""
"Then install `Ascend extension for PyTorch "
"<https://github.com/Ascend/pytorch>`_."
msgstr ""
"接着安装 `昇腾 PyTorch 扩展 "
"<https://gitee.com/ascend/pytorch>`_."

#: ../../source/getting_started/installation_npu.rst:28
msgid "Running below command to see if it correctly prints the Ascend NPU count."
msgstr "运行如下命令查看,如果正常运行,会打印昇腾 NPU 的个数。"

#: ../../source/getting_started/installation_npu.rst:35
msgid "Installing Xinference"
msgstr "安装 Xinference"

#: ../../source/getting_started/installation_npu.rst:41
msgid ""
"Now you can use xinference according to :ref:`doc <using_xinference>`. "
"``Transformers`` backend is the only available engine supported for "
"Ascend NPU for open source version."
msgstr ""
"现在你可以参考 :ref:`文档 <using_xinference>` 来使用 Xinference。"
"``Transformers`` 是开源唯一支持的昇腾 NPU 的引擎。"

#: ../../source/getting_started/installation_npu.rst:45
msgid "Enterprise Support"
msgstr "企业支持"

#: ../../source/getting_started/installation_npu.rst:46
msgid ""
"If you encounter any performance or other issues for Ascend NPU, please "
"reach out to us via `link <https://xorbits.io/community>`_."
msgstr ""
"如果你在昇腾 NPU 遇到任何性能和其他问题,欢迎垂询 Xinference 企业版,"
"在 `这里 <https://xorbits.cn/community>`_ 可以找到我们,亦可以 "
"`填写表单 <https://w8v6grm432.feishu.cn/share/base/form/shrcn9u1EBXQxmGMqILEjguuGoh>`_ 申请企业版试用。"
