DOC: ascend support (xorbitsai#1978)
qinxuye authored Jul 30, 2024
1 parent aafd36e commit 31523d6
Showing 6 changed files with 284 additions and 43 deletions.
14 changes: 14 additions & 0 deletions doc/source/getting_started/installation.rst
@@ -99,3 +99,17 @@ SGLang has a high-performance inference runtime with RadixAttention. It signific
Initial setup::

pip install 'xinference[sglang]'


MLX Backend
~~~~~~~~~~~
MLX-lm is designed for Apple silicon users to run LLMs efficiently.

Initial setup::

pip install 'xinference[mlx]'
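
For illustration only (not part of this commit), a minimal sketch of serving a model with
the MLX engine after the setup above; the model name, size, and flag values are assumptions,
so check ``xinference launch --help`` on your installed version::

    # start a local Xinference server (defaults to http://127.0.0.1:9997)
    xinference-local --host 127.0.0.1 --port 9997

    # in another shell, launch an LLM with the MLX engine
    # (model name and size are placeholders for any MLX-supported model)
    xinference launch --model-name qwen2-instruct \
                      --size-in-billions 7 \
                      --model-format mlx \
                      --model-engine mlx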

Other Platforms
~~~~~~~~~~~~~~~

* :ref:`Ascend NPU <installation_npu>`
47 changes: 47 additions & 0 deletions doc/source/getting_started/installation_npu.rst
@@ -0,0 +1,47 @@
.. _installation_npu:


=================================
Installation Guide for Ascend NPU
=================================
Xinference can run on Ascend NPU; follow the instructions below to install it.


Installing PyTorch and Ascend extension for PyTorch
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Install the CPU version of PyTorch and the corresponding Ascend extension.

Take PyTorch v2.1.0 as an example.

.. code-block:: bash

   pip3 install torch==2.1.0 torchvision==0.16.0 --index-url https://download.pytorch.org/whl/cpu

Then install `Ascend extension for PyTorch <https://github.com/Ascend/pytorch>`_.

.. code-block:: bash

   pip3 install 'numpy<2.0'
   pip3 install decorator
   pip3 install torch-npu==2.1.0.post3

Run the command below to check that it correctly prints the Ascend NPU count.

.. code-block:: bash

   python -c "import torch; import torch_npu; print(torch.npu.device_count())"

Installing Xinference
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   pip3 install xinference

Now you can use Xinference according to the :ref:`doc <using_xinference>`.
The ``Transformers`` backend is the only engine supported for Ascend NPU in the open-source version.
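
As a hedged illustration (the model name, size, and flag values are assumptions rather than
part of this commit; see ``xinference launch --help`` for the exact options of your version),
launching a model with the ``Transformers`` engine could look like:

.. code-block:: bash

   # start the local server
   xinference-local --host 127.0.0.1 --port 9997

   # launch an LLM with the Transformers engine, the only engine available on Ascend NPU
   xinference launch --model-name qwen2-instruct \
                     --size-in-billions 7 \
                     --model-format pytorch \
                     --model-engine transformers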

Enterprise Support
~~~~~~~~~~~~~~~~~~
If you encounter any performance or other issues on Ascend NPU, please reach out to us
via this `link <https://xorbits.io/community>`_.
101 changes: 75 additions & 26 deletions doc/source/locale/zh_CN/LC_MESSAGES/getting_started/installation.po
@@ -7,7 +7,7 @@ msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-05-31 11:46+0800\n"
"POT-Creation-Date: 2024-07-30 17:00+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
@@ -116,7 +116,9 @@ msgid "Currently, supported models include:"
msgstr "目前,支持的模型包括:"

#: ../../source/getting_started/installation.rst:42
msgid "``llama-2``, ``llama-3``, ``llama-2-chat``, ``llama-3-instruct``"
msgid ""
"``llama-2``, ``llama-3``, ``llama-2-chat``, ``llama-3-instruct``, "
"``llama-3.1``, ``llama-3.1-instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:43
@@ -130,72 +132,95 @@ msgid ""
msgstr ""

#: ../../source/getting_started/installation.rst:45
msgid "``mistral-v0.1``, ``mistral-instruct-v0.1``, ``mistral-instruct-v0.2``"
msgid ""
"``mistral-v0.1``, ``mistral-instruct-v0.1``, ``mistral-instruct-v0.2``, "
"``mistral-instruct-v0.3``, ``mistral-nemo-instruct``, ``mistral-large-"
"instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:46
msgid "``Yi``, ``Yi-1.5``, ``Yi-chat``, ``Yi-1.5-chat``, ``Yi-1.5-chat-16k``"
msgid "``codestral-v0.1``"
msgstr ""

#: ../../source/getting_started/installation.rst:47
msgid "``code-llama``, ``code-llama-python``, ``code-llama-instruct``"
msgid "``Yi``, ``Yi-1.5``, ``Yi-chat``, ``Yi-1.5-chat``, ``Yi-1.5-chat-16k``"
msgstr ""

#: ../../source/getting_started/installation.rst:48
msgid "``code-llama``, ``code-llama-python``, ``code-llama-instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:49
msgid ""
"``deepseek``, ``deepseek-coder``, ``deepseek-chat``, ``deepseek-coder-"
"instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:49
#: ../../source/getting_started/installation.rst:50
msgid "``codeqwen1.5``, ``codeqwen1.5-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:50
#: ../../source/getting_started/installation.rst:51
msgid "``vicuna-v1.3``, ``vicuna-v1.5``"
msgstr ""

#: ../../source/getting_started/installation.rst:51
#: ../../source/getting_started/installation.rst:52
msgid "``internlm2-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:52
#: ../../source/getting_started/installation.rst:53
msgid "``internlm2.5-chat``, ``internlm2.5-chat-1m``"
msgstr ""

#: ../../source/getting_started/installation.rst:54
msgid "``qwen-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:53
#: ../../source/getting_started/installation.rst:55
msgid "``mixtral-instruct-v0.1``, ``mixtral-8x22B-instruct-v0.1``"
msgstr ""

#: ../../source/getting_started/installation.rst:54
#: ../../source/getting_started/installation.rst:56
msgid "``chatglm3``, ``chatglm3-32k``, ``chatglm3-128k``"
msgstr ""

#: ../../source/getting_started/installation.rst:55
#: ../../source/getting_started/installation.rst:57
msgid "``glm4-chat``, ``glm4-chat-1m``"
msgstr ""

#: ../../source/getting_started/installation.rst:58
msgid "``codegeex4``"
msgstr ""

#: ../../source/getting_started/installation.rst:59
msgid "``qwen1.5-chat``, ``qwen1.5-moe-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:56
#: ../../source/getting_started/installation.rst:60
msgid "``qwen2-instruct``, ``qwen2-moe-instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:61
msgid "``gemma-it``"
msgstr ""

#: ../../source/getting_started/installation.rst:57
#: ../../source/getting_started/installation.rst:62
msgid "``orion-chat``, ``orion-chat-rag``"
msgstr ""

#: ../../source/getting_started/installation.rst:58
#: ../../source/getting_started/installation.rst:63
msgid "``c4ai-command-r-v01``"
msgstr ""

#: ../../source/getting_started/installation.rst:61
#: ../../source/getting_started/installation.rst:66
msgid "To install Xinference and vLLM::"
msgstr "安装 xinference 和 vLLM:"

#: ../../source/getting_started/installation.rst:68
#: ../../source/getting_started/installation.rst:73
msgid "Llama.cpp Backend"
msgstr "Llama.cpp 引擎"

#: ../../source/getting_started/installation.rst:69
#: ../../source/getting_started/installation.rst:74
msgid ""
"Xinference supports models in ``gguf`` and ``ggml`` format via ``llama-"
"cpp-python``. It's advised to install the llama.cpp-related dependencies "
@@ -204,32 +229,33 @@ msgstr ""
"Xinference 通过 ``llama-cpp-python`` 支持 ``gguf`` 和 ``ggml`` 格式的模型"
"。建议根据当前使用的硬件手动安装依赖,从而获得最佳的加速效果。"

#: ../../source/getting_started/installation.rst:71
#: ../../source/getting_started/installation.rst:94
#: ../../source/getting_started/installation.rst:76
#: ../../source/getting_started/installation.rst:99
#: ../../source/getting_started/installation.rst:108
msgid "Initial setup::"
msgstr "初始步骤:"

#: ../../source/getting_started/installation.rst:75
#: ../../source/getting_started/installation.rst:80
msgid "Hardware-Specific installations:"
msgstr "不同硬件的安装方式:"

#: ../../source/getting_started/installation.rst:77
#: ../../source/getting_started/installation.rst:82
msgid "Apple Silicon::"
msgstr "Apple M系列"

#: ../../source/getting_started/installation.rst:81
#: ../../source/getting_started/installation.rst:86
msgid "Nvidia cards::"
msgstr "英伟达显卡:"

#: ../../source/getting_started/installation.rst:85
#: ../../source/getting_started/installation.rst:90
msgid "AMD cards::"
msgstr "AMD 显卡:"

#: ../../source/getting_started/installation.rst:91
#: ../../source/getting_started/installation.rst:96
msgid "SGLang Backend"
msgstr "SGLang 引擎"

#: ../../source/getting_started/installation.rst:92
#: ../../source/getting_started/installation.rst:97
msgid ""
"SGLang has a high-performance inference runtime with RadixAttention. It "
"significantly accelerates the execution of complex LLM programs by "
Expand All @@ -240,6 +266,23 @@ msgstr ""
"自动重用KV缓存,显著加速了复杂 LLM 程序的执行。它还支持其他常见推理技术,"
"如连续批处理和张量并行处理。"

#: ../../source/getting_started/installation.rst:105
#, fuzzy
msgid "MLX Backend"
msgstr "vLLM 引擎"

#: ../../source/getting_started/installation.rst:106
msgid "MLX-lm is designed for Apple silicon users to run LLM efficiently."
msgstr "MLX-lm 用来在苹果 silicon 芯片上提供高效的 LLM 推理。"

#: ../../source/getting_started/installation.rst:113
msgid "Other Platforms"
msgstr "其他平台"

#: ../../source/getting_started/installation.rst:115
msgid ":ref:`Ascend NPU <installation_npu>`"
msgstr ""

#~ msgid "``Yi``, ``Yi-chat``"
#~ msgstr ""

@@ -252,3 +295,9 @@ msgstr ""
#~ msgid "``codeqwen1.5-chat``"
#~ msgstr ""

#~ msgid "``llama-2``, ``llama-3``, ``llama-2-chat``, ``llama-3-instruct``"
#~ msgstr ""

#~ msgid "``mistral-v0.1``, ``mistral-instruct-v0.1``, ``mistral-instruct-v0.2``"
#~ msgstr ""

79 changes: 79 additions & 0 deletions doc/source/locale/zh_CN/LC_MESSAGES/getting_started/installation_npu.po
@@ -0,0 +1,79 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2023, Xorbits Inc.
# This file is distributed under the same license as the Xinference package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2024.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-07-30 17:00+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <[email protected]>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.14.0\n"

#: ../../source/getting_started/installation_npu.rst:6
msgid "Installation Guide for Ascend NPU"
msgstr "在昇腾 NPU 上安装"

#: ../../source/getting_started/installation_npu.rst:7
msgid "Xinference can run on Ascend NPU, follow below instructions to install."
msgstr "Xinference 能在昇腾 NPU 上运行,使用如下命令安装。"

#: ../../source/getting_started/installation_npu.rst:11
msgid "Installing PyTorch and Ascend extension for PyTorch"
msgstr "安装 PyTorch 和昇腾扩展"

#: ../../source/getting_started/installation_npu.rst:12
msgid "Install PyTorch CPU version and corresponding Ascend extension."
msgstr "安装 PyTorch CPU 版本和相应的昇腾扩展。"

#: ../../source/getting_started/installation_npu.rst:14
msgid "Take PyTorch v2.1.0 as example."
msgstr "以 PyTorch v2.1.0 为例。"

#: ../../source/getting_started/installation_npu.rst:20
msgid ""
"Then install `Ascend extension for PyTorch "
"<https://github.com/Ascend/pytorch>`_."
msgstr ""
"接着安装 `昇腾 PyTorch 扩展 "
"<https://gitee.com/ascend/pytorch>`_."

#: ../../source/getting_started/installation_npu.rst:28
msgid "Running below command to see if it correctly prints the Ascend NPU count."
msgstr "运行如下命令查看,如果正常运行,会打印昇腾 NPU 的个数。"

#: ../../source/getting_started/installation_npu.rst:35
msgid "Installing Xinference"
msgstr "安装 Xinference"

#: ../../source/getting_started/installation_npu.rst:41
msgid ""
"Now you can use xinference according to :ref:`doc <using_xinference>`. "
"``Transformers`` backend is the only available engine supported for "
"Ascend NPU for open source version."
msgstr ""
"现在你可以参考 :ref:`文档 <using_xinference>` 来使用 Xinference。"
"``Transformers`` 是开源唯一支持的昇腾 NPU 的引擎。"

#: ../../source/getting_started/installation_npu.rst:45
msgid "Enterprise Support"
msgstr "企业支持"

#: ../../source/getting_started/installation_npu.rst:46
msgid ""
"If you encounter any performance or other issues for Ascend NPU, please "
"reach out to us via `link <https://xorbits.io/community>`_."
msgstr ""
"如果你在昇腾 NPU 遇到任何性能和其他问题,欢迎垂询 Xinference 企业版,"
"在 `这里 <https://xorbits.cn/community>`_ 可以找到我们,亦可以 "
"`填写表单 <https://w8v6grm432.feishu.cn/share/base/form/shrcn9u1EBXQxmGMqILEjguuGoh>`_ 申请企业版试用。"
