add support for mthreads devices
Signed-off-by: mccxadmin <[email protected]>
mccxadmin committed Oct 16, 2024
1 parent 00daa3e commit 5941e01
Showing 20 changed files with 511 additions and 15 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -65,6 +65,8 @@ will see 3G device memory inside container
[![cambricon MLU](https://img.shields.io/badge/Cambricon-Mlu-blue)](docs/cambricon-mlu-support.md)
[![hygon DCU](https://img.shields.io/badge/Hygon-DCU-blue)](docs/hygon-dcu-support.md)
[![iluvatar GPU](https://img.shields.io/badge/Iluvatar-GPU-blue)](docs/iluvatar-gpu-support.md)
[![mthreads GPU](https://img.shields.io/badge/Mthreads-GPU-blue)](docs/mthreads-support.md)
[![ascend NPU](https://img.shields.io/badge/Ascend-GPU-blue)](https://github.com/Project-HAMi/ascend-device-plugin/blob/main/README.md)

## Architect

2 changes: 2 additions & 0 deletions README_cn.md
@@ -22,6 +22,8 @@
[![寒武纪 MLU](https://img.shields.io/badge/寒武纪-Mlu-blue)](docs/cambricon-mlu-support_cn.md)
[![海光 DCU](https://img.shields.io/badge/海光-DCU-blue)](docs/hygon-dcu-support.md)
[![天数智芯 GPU](https://img.shields.io/badge/天数智芯-GPU-blue)](docs/iluvatar-gpu-support_cn.md)
[![摩尔线程 GPU](https://img.shields.io/badge/摩尔线程-GPU-blue)](docs/mthreads-support_cn.md)
[![华为昇腾 NPU](https://img.shields.io/badge/华为昇腾-NPU-blue)](https://github.com/Project-HAMi/ascend-device-plugin/blob/main/README_cn.md)


## Introduction
8 changes: 8 additions & 0 deletions charts/hami/templates/scheduler/configmap.yaml
@@ -32,6 +32,14 @@ data:
      },
      {{- end }}
      {{- end }}
      {{- if .Values.devices.mthreads.enabled }}
      {{- range .Values.devices.mthreads.resources }}
      {
        "name": "{{ . }}",
        "ignoredByScheduler": true
      },
      {{- end }}
      {{- end }}
      {
        "name": "{{ .Values.resourceName }}",
        "ignoredByScheduler": true
6 changes: 6 additions & 0 deletions charts/hami/templates/scheduler/configmapnew.yaml
@@ -55,4 +55,10 @@ data:
        ignoredByScheduler: true
    {{- end }}
    {{- end }}
    {{- if .Values.devices.mthreads.enabled }}
    {{- range .Values.devices.mthreads.resources }}
      - name: {{ . }}
        ignoredByScheduler: true
    {{- end }}
    {{- end }}
{{- end }}
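With the same defaults, the new-style config above renders the equivalent YAML entry; a sketch:

```
- name: mthreads.com/vgpu
  ignoredByScheduler: true
```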
4 changes: 4 additions & 0 deletions charts/hami/values.yaml
@@ -136,6 +136,10 @@ devicePlugin:
  tolerations: []

devices:
  mthreads:
    enabled: false
    resources:
      - mthreads.com/vgpu
  ascend:
    enabled: false
    image: ""
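The same toggle can be supplied through a values file instead of `--set` flags; a minimal sketch (the file name is hypothetical; the keys are taken from values.yaml above):

```
# my-values.yaml (hypothetical): enable mthreads GPU sharing
devices:
  mthreads:
    enabled: true
    resources:
      - mthreads.com/vgpu
```

Then install with `helm install hami hami-charts/hami -f my-values.yaml -n kube-system`.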
67 changes: 67 additions & 0 deletions docs/mthreads-support.md
@@ -0,0 +1,67 @@
## Introduction

**We now support mthreads.com/vgpu by implementing most of the same device-sharing features as for NVIDIA GPUs**, including:

***GPU sharing***: each task can allocate a portion of a GPU instead of a whole card, so a GPU can be shared among multiple tasks.

***Device memory control***: a GPU can be allocated a specific amount of device memory on supported types (e.g. MTT S4000), and the component ensures usage does not exceed that bound.

***Device core control***: a GPU can be allocated a limited number of compute cores on supported types (e.g. MTT S4000), and the component ensures usage does not exceed that bound.

## Important Notes

1. Device sharing across multiple cards is not supported.

2. Only one mthreads device can be shared per pod (even if the pod contains multiple containers).

3. Exclusive allocation of an mthreads GPU is supported by specifying only `mthreads.com/vgpu`.

4. These features have been tested on MTT S4000 only.

## Prerequisites

* [MT CloudNative Toolkits > 1.9.0](https://docs.mthreads.com/cloud-native/cloud-native-doc-online/)
* driver version >= 1.2.0

## Enabling GPU-sharing Support

* Deploy the MT-CloudNative Toolkit on mthreads nodes (please consult your device provider to acquire its package and documentation)

> **NOTICE:** *You can optionally remove mt-mutating-webhook and mt-gpu-scheduler after installation.*
* Set 'devices.mthreads.enabled = true' when installing HAMi:

```
helm install hami hami-charts/hami --set scheduler.kubeScheduler.imageTag={your kubernetes version} --set devices.mthreads.enabled=true -n kube-system
```

## Running Mthreads jobs

Mthreads GPUs can now be requested by a container
using the `mthreads.com/vgpu`, `mthreads.com/sgpu-memory`, and `mthreads.com/sgpu-core` resource types:

```
apiVersion: v1
kind: Pod
metadata:
  name: gpushare-pod-default
spec:
  restartPolicy: OnFailure
  containers:
    - image: core.harbor.zlidc.mthreads.com:30003/mt-ai/lm-qy2:v17-mpc
      imagePullPolicy: IfNotPresent
      name: gpushare-pod-1
      command: ["sleep"]
      args: ["100000"]
      resources:
        limits:
          mthreads.com/vgpu: 1
          mthreads.com/sgpu-memory: 32
          mthreads.com/sgpu-core: 8
```

> **NOTICE 1:** *Each unit of sgpu-memory represents 512M of device memory, so the example above requests 32 × 512M = 16G.*
> **NOTICE 2:** *You can find more examples in the [examples/mthreads folder](../examples/mthreads/).*
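A typical submit-and-verify flow for the example above (a sketch; the manifest matches examples/mthreads/default_use.yaml added in this commit):

```
kubectl apply -f examples/mthreads/default_use.yaml
kubectl get pod gpushare-pod-default -o wide   # confirm the pod is scheduled onto an mthreads node
```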

68 changes: 68 additions & 0 deletions docs/mthreads-support_cn.md
@@ -0,0 +1,68 @@
## Introduction

This component supports sharing Moore Threads (mthreads) GPU devices and provides the following vGPU-style sharing features:

***GPU sharing***: each task can occupy just a portion of a card, so multiple tasks can share a single card.

***Device memory limits***: you can allocate a GPU by device memory size (e.g. 3000M), and the component ensures a task's memory usage does not exceed the allocated amount.

***Compute core limits***: you can allocate a GPU by number of compute core groups (e.g. 8), and the component ensures a task's compute usage does not exceed the allocated amount.

## Important Notes

1. Multi-card slicing is not supported yet; multi-card tasks can only be allocated whole cards.

2. A pod can only use slices from a single GPU, even if the pod contains multiple containers.

3. Exclusive mode is supported: specifying only `mthreads.com/vgpu` requests an exclusive GPU.

4. This feature currently supports MTT S4000 devices only.

## Prerequisites

* [MT CloudNative Toolkits > 1.9.0](https://docs.mthreads.com/cloud-native/cloud-native-doc-online/)
* driver version >= 1.2.0

## Enabling GPU Sharing

* Deploy the vendor-provided 'MT-CloudNative Toolkit' on mthreads nodes; please contact your device provider for the package and documentation

> **NOTE:** *(Optional) after deployment, you may uninstall the mt-mutating-webhook and mt-scheduler components, since their functionality is provided by the HAMi scheduler.*
* Set 'devices.mthreads.enabled = true' when installing HAMi

```
helm install hami hami-charts/hami --set scheduler.kubeScheduler.imageTag={your kubernetes version} --set devices.mthreads.enabled=true -n kube-system
```

## Running GPU Tasks

By specifying the three resources `mthreads.com/vgpu`, `mthreads.com/sgpu-memory`, and `mthreads.com/sgpu-core`, a container requests a number of GPU slices together with the corresponding device memory and compute core groups:

```
apiVersion: v1
kind: Pod
metadata:
  name: gpushare-pod-default
spec:
  restartPolicy: OnFailure
  containers:
    - image: core.harbor.zlidc.mthreads.com:30003/mt-ai/lm-qy2:v17-mpc
      imagePullPolicy: IfNotPresent
      name: gpushare-pod-1
      command: ["sleep"]
      args: ["100000"]
      resources:
        limits:
          mthreads.com/vgpu: 1
          mthreads.com/sgpu-memory: 32
          mthreads.com/sgpu-core: 8
```

> **NOTE 1:** *Each unit of sgpu-memory represents 512M of device memory, so the example above requests 32 × 512M = 16G.*
> **NOTE 2:** *See more [examples](../examples/mthreads/).*



17 changes: 17 additions & 0 deletions examples/mthreads/default_use.yaml
@@ -0,0 +1,17 @@
apiVersion: v1
kind: Pod
metadata:
  name: gpushare-pod-default
spec:
  restartPolicy: OnFailure
  containers:
    - image: core.harbor.zlidc.mthreads.com:30003/mt-ai/lm-qy2:v17-mpc
      imagePullPolicy: IfNotPresent
      name: gpushare-pod-1
      command: ["sleep"]
      args: ["100000"]
      resources:
        limits:
          mthreads.com/vgpu: 1
          mthreads.com/sgpu-memory: 32
          mthreads.com/sgpu-core: 8
15 changes: 15 additions & 0 deletions examples/mthreads/multi_cards.yaml
@@ -0,0 +1,15 @@
apiVersion: v1
kind: Pod
metadata:
  name: gpushare-pod-multi-cards
spec:
  restartPolicy: OnFailure
  containers:
    - image: core.harbor.zlidc.mthreads.com:30003/mt-ai/lm-qy2:v17-mpc
      imagePullPolicy: IfNotPresent
      name: gpushare-pod-1
      command: ["sleep"]
      args: ["100000"]
      resources:
        limits:
          mthreads.com/vgpu: 2
15 changes: 15 additions & 0 deletions examples/mthreads/use_exclusive.yaml
@@ -0,0 +1,15 @@
apiVersion: v1
kind: Pod
metadata:
  name: gpushare-pod-exclusive
spec:
  restartPolicy: OnFailure
  containers:
    - image: core.harbor.zlidc.mthreads.com:30003/mt-ai/lm-qy2:v17-mpc
      imagePullPolicy: IfNotPresent
      name: gpushare-pod-1
      command: ["sleep"]
      args: ["100000"]
      resources:
        limits:
          mthreads.com/vgpu: 1
10 changes: 7 additions & 3 deletions pkg/device/ascend/device.go
@@ -108,7 +108,7 @@ func (dev *Devices) CommonWord() string
	return dev.config.CommonWord
}

func (dev *Devices) MutateAdmission(ctr *corev1.Container) (bool, error) {
func (dev *Devices) MutateAdmission(ctr *corev1.Container, p *corev1.Pod) (bool, error) {
	count, ok := ctr.Resources.Limits[corev1.ResourceName(dev.config.ResourceName)]
	if !ok {
		return false, nil
@@ -197,7 +197,7 @@ func (dev *Devices) CheckType(annos map[string]string, d util.DeviceUsage, n uti
func (dev *Devices) CheckUUID(annos map[string]string, d util.DeviceUsage) bool {
	userUUID, ok := annos[dev.useUUIDAnno]
	if ok {
		klog.V(5).Infof("check uuid for Iluvatar user uuid [%s], device id is %s", userUUID, d.ID)
		klog.V(5).Infof("check uuid for ascend user uuid [%s], device id is %s", userUUID, d.ID)
		// use , symbol to connect multiple uuid
		userUUIDs := strings.Split(userUUID, ",")
		for _, uuid := range userUUIDs {
@@ -210,7 +210,7 @@ func (dev *Devices) CheckUUID(annos map[string]string, d util.DeviceUsage) bool

	noUserUUID, ok := annos[dev.noUseUUIDAnno]
	if ok {
		klog.V(5).Infof("check uuid for Iluvatar not user uuid [%s], device id is %s", noUserUUID, d.ID)
		klog.V(5).Infof("check uuid for ascend not user uuid [%s], device id is %s", noUserUUID, d.ID)
		// use , symbol to connect multiple uuid
		noUserUUIDs := strings.Split(noUserUUID, ",")
		for _, uuid := range noUserUUIDs {
@@ -268,3 +268,7 @@ func (dev *Devices) GenerateResourceRequests(ctr *corev1.Container) util.Contain
	}
	return util.ContainerDeviceRequest{}
}

func (dev *Devices) CustomFilterRule(allocated *util.PodDevices, toAllocate util.ContainerDevices, device *util.DeviceUsage) bool {
	return true
}
6 changes: 5 additions & 1 deletion pkg/device/cambricon/device.go
@@ -200,7 +200,7 @@ func (dev *CambriconDevices) AssertNuma(annos map[string]string) bool
	return false
}

func (dev *CambriconDevices) MutateAdmission(ctr *corev1.Container) (bool, error) {
func (dev *CambriconDevices) MutateAdmission(ctr *corev1.Container, p *corev1.Pod) (bool, error) {
	_, ok := ctr.Resources.Limits[corev1.ResourceName(MLUResourceCount)]
	return ok, nil
}
@@ -308,3 +308,7 @@ func (dev *CambriconDevices) PatchAnnotations(annoinput *map[string]string, pd u
	}
	return *annoinput
}

func (dev *CambriconDevices) CustomFilterRule(allocated *util.PodDevices, toAllocate util.ContainerDevices, device *util.DeviceUsage) bool {
	return true
}
7 changes: 6 additions & 1 deletion pkg/device/devices.go
@@ -27,6 +27,7 @@ import (
"github.com/Project-HAMi/HAMi/pkg/device/cambricon"
"github.com/Project-HAMi/HAMi/pkg/device/hygon"
"github.com/Project-HAMi/HAMi/pkg/device/iluvatar"
"github.com/Project-HAMi/HAMi/pkg/device/mthreads"
"github.com/Project-HAMi/HAMi/pkg/device/nvidia"
"github.com/Project-HAMi/HAMi/pkg/util"
"github.com/Project-HAMi/HAMi/pkg/util/client"
@@ -39,7 +40,7 @@

type Devices interface {
	CommonWord() string
	MutateAdmission(ctr *corev1.Container) (bool, error)
	MutateAdmission(ctr *corev1.Container, pod *corev1.Pod) (bool, error)
	CheckHealth(devType string, n *corev1.Node) (bool, bool)
	NodeCleanUp(nn string) error
	GetNodeDevices(n corev1.Node) ([]*api.DeviceInfo, error)
@@ -50,6 +51,7 @@ type Devices interface {
	ReleaseNodeLock(n *corev1.Node, p *corev1.Pod) error
	GenerateResourceRequests(ctr *corev1.Container) util.ContainerDeviceRequest
	PatchAnnotations(annoinput *map[string]string, pd util.PodDevices) map[string]string
	CustomFilterRule(allocated *util.PodDevices, toAllocate util.ContainerDevices, device *util.DeviceUsage) bool
	// This should not be associated with a specific device object
	//ParseConfig(fs *flag.FlagSet)
}
Expand All @@ -74,12 +76,14 @@ func InitDevices() {
	devices[nvidia.NvidiaGPUDevice] = nvidia.InitNvidiaDevice()
	devices[hygon.HygonDCUDevice] = hygon.InitDCUDevice()
	devices[iluvatar.IluvatarGPUDevice] = iluvatar.InitIluvatarDevice()
	devices[mthreads.MthreadsGPUDevice] = mthreads.InitMthreadsDevice()
	//devices[d.AscendDevice] = d.InitDevice()
	//devices[ascend.Ascend310PName] = ascend.InitAscend310P()
	DevicesToHandle = append(DevicesToHandle, nvidia.NvidiaGPUCommonWord)
	DevicesToHandle = append(DevicesToHandle, cambricon.CambriconMLUCommonWord)
	DevicesToHandle = append(DevicesToHandle, hygon.HygonDCUCommonWord)
	DevicesToHandle = append(DevicesToHandle, iluvatar.IluvatarGPUCommonWord)
	DevicesToHandle = append(DevicesToHandle, mthreads.MthreadsGPUCommonWord)
	//DevicesToHandle = append(DevicesToHandle, d.AscendDevice)
	//DevicesToHandle = append(DevicesToHandle, ascend.Ascend310PName)
	for _, dev := range ascend.InitDevices() {
@@ -138,6 +142,7 @@ func GlobalFlagSet() *flag.FlagSet {
	hygon.ParseConfig(fs)
	iluvatar.ParseConfig(fs)
	nvidia.ParseConfig(fs)
	mthreads.ParseConfig(fs)
	fs.BoolVar(&DebugMode, "debug", false, "debug mode")
	klog.InitFlags(fs)
	return fs
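The new `mthreads.ParseConfig` registered above is not shown in this excerpt; by analogy with the hygon and iluvatar `ParseConfig` functions below, it plausibly registers flags like the following sketch (variable and flag names are assumptions; the resource names come from the mthreads docs in this commit):

```
package mthreads

import "flag"

// Sketch only; the real pkg/device/mthreads sources are not part of this excerpt.
var (
	MthreadsResourceCount  string
	MthreadsResourceMemory string
	MthreadsResourceCores  string
)

func ParseConfig(fs *flag.FlagSet) {
	fs.StringVar(&MthreadsResourceCount, "mthreads-name", "mthreads.com/vgpu", "mthreads resource count")
	fs.StringVar(&MthreadsResourceMemory, "mthreads-memory", "mthreads.com/sgpu-memory", "mthreads memory resource")
	fs.StringVar(&MthreadsResourceCores, "mthreads-cores", "mthreads.com/sgpu-core", "mthreads core resource")
}
```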
6 changes: 5 additions & 1 deletion pkg/device/hygon/device.go
@@ -67,7 +67,7 @@ func ParseConfig(fs *flag.FlagSet) {
	fs.StringVar(&HygonResourceCores, "dcu-cores", "hygon.com/dcucores", "dcu core resource")
}

func (dev *DCUDevices) MutateAdmission(ctr *corev1.Container) (bool, error) {
func (dev *DCUDevices) MutateAdmission(ctr *corev1.Container, p *corev1.Pod) (bool, error) {
	_, ok := ctr.Resources.Limits[corev1.ResourceName(HygonResourceCount)]
	return ok, nil
}
@@ -237,3 +237,7 @@ func (dev *DCUDevices) PatchAnnotations(annoinput *map[string]string, pd util.Po
	}
	return *annoinput
}

func (dev *DCUDevices) CustomFilterRule(allocated *util.PodDevices, toAllocate util.ContainerDevices, device *util.DeviceUsage) bool {
	return true
}
6 changes: 5 additions & 1 deletion pkg/device/iluvatar/device.go
@@ -64,7 +64,7 @@ func ParseConfig(fs *flag.FlagSet) {
	fs.StringVar(&IluvatarResourceCores, "iluvatar-cores", "iluvatar.ai/vcuda-core", "iluvatar core resource")
}

func (dev *IluvatarDevices) MutateAdmission(ctr *corev1.Container) (bool, error) {
func (dev *IluvatarDevices) MutateAdmission(ctr *corev1.Container, p *corev1.Pod) (bool, error) {
	count, ok := ctr.Resources.Limits[corev1.ResourceName(IluvatarResourceCount)]
	if ok {
		if count.Value() > 1 {
@@ -217,3 +217,7 @@ func (dev *IluvatarDevices) GenerateResourceRequests(ctr *corev1.Container) util
	}
	return util.ContainerDeviceRequest{}
}

func (dev *IluvatarDevices) CustomFilterRule(allocated *util.PodDevices, toAllocate util.ContainerDevices, device *util.DeviceUsage) bool {
	return true
}