-
Notifications
You must be signed in to change notification settings - Fork 319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a API to check SVE Length support on ARM CPU. #255
Conversation
Signed-off-by: maajidkhann <[email protected]>
@malfet @digantdesai Please review this PR. This will make it easy for people to use API's from CPUINFO into PyTorch to get SVE vector length info from the hardware. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix formatting (see lint failure in CI)
Also, all links to the images seems to be broken
And last but not least, please explain why boolean has_sve128
are better than single get_max_sve_len
API?
src/arm/linux/aarch64-isa.c
Outdated
#define PR_SVE_GET_VL 51 | ||
#define PR_SVE_VL_LEN_MASK 0xffff |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you want to add something like #ifndef PR_SVE_GET_VL
guard here? And also, please mention which system headers they should be defined in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
tools/isa-info.c
Outdated
printf("\tSVE 128-bit: %s\n", cpuinfo_support_arm_sve128() ? "Yes" : "No"); | ||
printf("\tSVE 256-bit: %s\n", cpuinfo_support_arm_sve256() ? "Yes" : "No"); | ||
printf("\tSVE 512-bit: %s\n", cpuinfo_support_arm_sve512() ? "Yes" : "No"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://developer.arm.com/documentation/101458/2404/Compiler-options/-msve-vector-bits- seems to indicate, that SVE length can be up to 2048? If this is the case, wouldn't it be better to add cpuinfo_get_arm_sve_lengh()
method rather than enum? Also, if it's always a power of two, wouldn't it be easier if it returned a bitshifts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Done.
My intention in the first place was to introduce functions like how AVX has done for AVX2 and AVX512 which returns a bool and for SVE as only VL=128, 256 and 512 are relevant in ARM ecosystem, I added three different functions.
But it makes sense to just add a single function that returns maximum VL supported on that hardware and now the user can work out the logic required with the value returned.
include/cpuinfo.h
Outdated
@@ -1670,6 +1670,7 @@ struct cpuinfo_arm_isa { | |||
bool sve; | |||
bool sve2; | |||
bool i8mm; | |||
int svelen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why signed type? Is -1
used to indicate unsupported
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As that was the initial thought, -1 = unsupported. Now I have changed it to unsigned int as we don't deal with negative values and svelen value always remains >=0 in our context.
src/arm/linux/aarch64-isa.c
Outdated
#define PR_SVE_VL_LEN_MASK 0xffff | ||
int ret = prctl(PR_SVE_GET_VL); | ||
if (ret < 0) { | ||
perror("prctl(PR_SVE_GET_VL) failed"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perror("prctl(PR_SVE_GET_VL) failed"); | |
cpuinfo_log_error("prctl(PR_SVE_GET_VL) failed"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
->Addresses review comments and adds a function that directly returns the max VL on ARM supported CPU's. ->Fixes Lint Issues. Signed-off-by: maajidkhann <[email protected]>
Addresses all the comments. Lint issues also have been now fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all the updates, looks good, but please consider addressing few minor remarks
include/cpuinfo.h
Outdated
static inline int cpuinfo_get_max_arm_sve_length(void) { | ||
#if CPUINFO_ARCH_ARM64 | ||
if ((cpuinfo_isa.sve || cpuinfo_isa.sve2) && (cpuinfo_isa.svelen > 0)) | ||
return cpuinfo_isa.svelen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understood your previous diff correctly, svelen is returned in bytes, but API developers usually think about it in bitlength, therefore it must be multiplied by 8, shouldn't it?
return cpuinfo_isa.svelen; | |
return cpuinfo_isa.svelen * 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes its better to return bit length directly compared to bytes to avoid confusion to the end users.
Changes done.
include/cpuinfo.h
Outdated
@@ -1670,6 +1670,7 @@ struct cpuinfo_arm_isa { | |||
bool sve; | |||
bool sve2; | |||
bool i8mm; | |||
unsigned int svelen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsigned int svelen; | |
uint32_t svelen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
include/cpuinfo.h
Outdated
static inline int cpuinfo_get_max_arm_sve_length(void) { | ||
#if CPUINFO_ARCH_ARM64 | ||
if ((cpuinfo_isa.sve || cpuinfo_isa.sve2) && (cpuinfo_isa.svelen > 0)) | ||
return cpuinfo_isa.svelen; | ||
else | ||
return -1; // Returns -1 on Non SVE supported ARM CPU's. | ||
#else | ||
return -1; // Returns -1 on Non ARM CPU's. | ||
#endif | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not keep it simple? One does not need a check, on whether or not svelen is supported, as it should be initialized to 0, shouldn't it?
static inline int cpuinfo_get_max_arm_sve_length(void) { | |
#if CPUINFO_ARCH_ARM64 | |
if ((cpuinfo_isa.sve || cpuinfo_isa.sve2) && (cpuinfo_isa.svelen > 0)) | |
return cpuinfo_isa.svelen; | |
else | |
return -1; // Returns -1 on Non SVE supported ARM CPU's. | |
#else | |
return -1; // Returns -1 on Non ARM CPU's. | |
#endif | |
} | |
static inline uint32_t cpuinfo_get_max_arm_sve_length(void) { | |
#if CPUINFO_ARCH_ARM64 | |
return cpuinfo_isa.svelen; | |
#else | |
return 0; | |
#endif | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, changes done.
src/arm/linux/aarch64-isa.c
Outdated
// Mask out the SVE vector length bits | ||
isa->svelen = ret & PR_SVE_VL_LEN_MASK; | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#endif | |
#else | |
isa->svelen = 0; | |
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is not required anymore as we have removed the following check - #if defined(linux) (Not required as the code is compiled for Linux anyways).
src/arm/linux/aarch64-isa.c
Outdated
@@ -151,4 +151,25 @@ void cpuinfo_arm64_linux_decode_isa_from_proc_cpuinfo( | |||
if (features & CPUINFO_ARM_LINUX_FEATURE_ASIMDFHM) { | |||
isa->fhm = true; | |||
} | |||
|
|||
#if defined(__linux__) | |||
#include <sys/prctl.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding #include
directive in the body of the function feels very fragile, can you please move it to the top level of the file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/arm/linux/aarch64-isa.c
Outdated
@@ -151,4 +151,25 @@ void cpuinfo_arm64_linux_decode_isa_from_proc_cpuinfo( | |||
if (features & CPUINFO_ARM_LINUX_FEATURE_ASIMDFHM) { | |||
isa->fhm = true; | |||
} | |||
|
|||
#if defined(__linux__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it needs #ifdef __linux__
, as this file should only be compiled for Linux anyway, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Removed this macro.
Signed-off-by: maajidkhann <[email protected]>
Regression PR : pytorch/cpuinfo#255 Signed-off-by: Nikhil Gupta <[email protected]> Change-Id: I56cec061072be11ec33ccb661114360b979fc7aa
Regression PR : pytorch/cpuinfo#255 Signed-off-by: Nikhil Gupta <[email protected]> Change-Id: I56cec061072be11ec33ccb661114360b979fc7aa
…re (#135857) Regression PR : pytorch/cpuinfo#255 Change-Id: I56cec061072be11ec33ccb661114360b979fc7aa Pull Request resolved: #135857 Approved by: https://github.com/digantdesai, https://github.com/malfet
…re (pytorch#135857) Regression PR : pytorch/cpuinfo#255 Change-Id: I56cec061072be11ec33ccb661114360b979fc7aa Pull Request resolved: pytorch#135857 Approved by: https://github.com/digantdesai, https://github.com/malfet
This pull request introduces a new feature to the cpuinfo library that adds an API to return the maximum supported Scalable Vector Extension (SVE) vector length on the given ARM CPU. This enhancement will allow users to query and determine the maximum SVE vector lengths on a given ARM CPU, providing better insights and flexibility for optimizing applications that utilize SVE.
Key Features:
New API Function:
Introduces a single API function - cpuinfo_get_max_arm_sve_length() that returns the maximum SVE Vector Length supported on the given ARM CPU.
The function is designed to be easy to integrate with existing code in other projects like PyTorch (pytorch/pytorch#119571) and provides a straightforward interface for querying SVE VL.
Here's the sample output on SVE supported instance:
Query:
Output on SVE256 supported Hardware - Graviton3: