I built flashinfer from source; the version is 0.1.6.
pip install -e .
Obtaining file:///home/hutl/api/flashinfer/python
Preparing metadata (setup.py) ... done
Installing collected packages: flashinfer
DEPRECATION: Legacy editable install of flashinfer==0.1.6 from file:///home/hutl/api/flashinfer/python (setup.py develop) is deprecated. pip 25.0 will enforce this behaviour change. A possible replacement is to add a pyproject.toml or enable --use-pep517, and use setuptools >= 64. If the resulting installation is not behaving as expected, try using --config-settings editable_mode=compat. Please consult the setuptools documentation for more information. Discussion can be found at https://github.com/pypa/pip/issues/11457
Running setup.py develop for flashinfer
Successfully installed flashinfer-0.1.6
When I ran the sglang demo, it gave me this error:
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/home/hutl/api/sglang/python/sglang/launch_server.py", line 6, in <module>
from sglang.srt.server import launch_server
File "/home/hutl/api/sglang/python/sglang/srt/server.py", line 49, in <module>
from sglang.srt.managers.data_parallel_controller import (
File "/home/hutl/api/sglang/python/sglang/srt/managers/data_parallel_controller.py", line 29, in <module>
from sglang.srt.managers.scheduler import run_scheduler_process
File "/home/hutl/api/sglang/python/sglang/srt/managers/scheduler.py", line 61, in <module>
from sglang.srt.managers.tp_worker import TpModelWorker
File "/home/hutl/api/sglang/python/sglang/srt/managers/tp_worker.py", line 27, in <module>
from sglang.srt.model_executor.model_runner import ModelRunner
File "/home/hutl/api/sglang/python/sglang/srt/model_executor/model_runner.py", line 44, in <module>
from sglang.srt.layers.attention.flashinfer_backend import FlashInferAttnBackend
File "/home/hutl/api/sglang/python/sglang/srt/layers/attention/flashinfer_backend.py", line 33, in <module>
from flashinfer.decode import _grouped_size_compiled_for_decode_kernels
ImportError: cannot import name '_grouped_size_compiled_for_decode_kernels' from 'flashinfer.decode' (/home/hutl/api/flashinfer/python/flashinfer/decode.py)
I tried reconfiguring the environment but it didn't work.
Are there any other possible solutions?
That function was removed recently in mainline (because with the new JIT feature, all group sizes can be compiled with JIT). I can add it back for backward compatibility, but it's better not to rely on this function in sglang.
We can use some heuristic to control whether to use tensor cores or not. @hnyls2002 @merrymercy WDYT?
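For anyone blocked on this before a fix lands in either repo, a minimal compatibility shim could look like the sketch below. The exact signature and the set of pre-compiled group sizes are assumptions based on how sglang calls the removed helper, not the original flashinfer source:

```python
# Hypothetical backward-compatibility shim; the signature and the set of
# pre-compiled group sizes are assumptions, not flashinfer's actual code.
def _grouped_size_compiled_for_decode_kernels(
    num_qo_heads: int, num_kv_heads: int
) -> bool:
    # Older flashinfer builds shipped decode kernels only for a fixed set of
    # query/KV head ratios (group sizes). With the new JIT path any group
    # size can be compiled on demand, which is why the check was removed.
    return (num_qo_heads // num_kv_heads) in (1, 2, 4, 8)
```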
@yzh119 Can you provide a utility function inside flashinfer to decide whether to use tensor cores?
Or can you make this decision automatically inside flashinfer when I pass use_tensor_core="auto"?
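A rough sketch of what such an "auto" mode could look like; the helper names, the group-size threshold, and the resolver are made up for illustration and are not part of flashinfer's public API:

```python
# Illustrative only: these helpers and the threshold are hypothetical.
def _heuristic_use_tensor_cores(num_qo_heads: int, num_kv_heads: int) -> bool:
    # For grouped-query attention, a large query/KV head ratio means each KV
    # head is shared by many query heads, so tensor-core (prefill-style)
    # kernels tend to win; small ratios are usually fine on the CUDA-core
    # decode path. The cutoff of 4 is an assumed tuning point, not measured.
    group_size = num_qo_heads // num_kv_heads
    return group_size >= 4


def resolve_use_tensor_cores(use_tensor_cores, num_qo_heads, num_kv_heads):
    # Map "auto" to the heuristic, otherwise respect the explicit flag.
    if use_tensor_cores == "auto":
        return _heuristic_use_tensor_cores(num_qo_heads, num_kv_heads)
    return bool(use_tensor_cores)
```

With something like this inside flashinfer, sglang could simply pass "auto" and drop its dependency on the removed internal helper.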