Prerequisites
Supported Devices
- Atlas 800I A2 inference series (Atlas 800I A2)
- Atlas 800I A3 inference series (Atlas 800I A3)
Setup environment using container
Notice: The following commands are based on Atlas 800I A3 machines. If you are using Atlas 800I A2, some changes are needed.
- The image tag needs to be `main-cann8.5.0-a3` for Atlas 800I A3 and `main-cann8.5.0-910b` for Atlas 800I A2.
- The device mapping in the `docker run` command needs to be changed to `davinci[0-7]` for Atlas 800I A2.
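The two differences between A3 and A2 machines can be captured in a small helper. This is a minimal sketch, not the guide's own command: the function names `image_tag_for` and `a2_device_flags` are hypothetical, and the `/dev/davinciN` device paths are an assumption about how `davinci[0-7]` expands on the host.

```shell
# Hypothetical helper: map a machine type to the image tag named in the notice.
image_tag_for() {
  case "$1" in
    a3) echo "main-cann8.5.0-a3" ;;
    a2) echo "main-cann8.5.0-910b" ;;
    *)  echo "unknown machine type: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build --device flags for davinci0 through davinci7
# (Atlas 800I A2). The /dev/davinciN path is an assumption.
a2_device_flags() {
  flags=""
  for i in 0 1 2 3 4 5 6 7; do
    flags="$flags --device=/dev/davinci$i"
  done
  echo "${flags# }"
}
```

The output of `a2_device_flags` could then be spliced into a `docker run` invocation, with `image_tag_for a2` supplying the tag; the image repository and the remaining `docker run` options are not shown here.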
Usage
The SGLang server is installed in the container by default. You can use `pip show sglang` to check the version.
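If only the version string is needed, for example in a script, the `Version:` field can be pulled out of the `pip show` output. A minimal sketch; `extract_version` is a hypothetical helper name.

```shell
# Hypothetical helper: read `pip show` output on stdin and print the Version field.
extract_version() {
  awk -F': ' '/^Version:/ { print $2 }'
}
```

Inside the container this would be used as `pip show sglang | extract_version`.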
Start SGLang server
SGLang will automatically download the model from Hugging Face.
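Because the first start includes the model download, the server may take a while before it accepts requests. The sketch below is one way to wait for it, assuming the server listens on SGLang's default port 30000 and answers on its `/health` endpoint; `wait_until` is a hypothetical helper, not part of SGLang.

```shell
# Hypothetical helper: retry a probe command once per second until it
# succeeds, or give up after timeout_s seconds.
wait_until() {
  timeout_s=$1
  shift
  waited=0
  until "$@" >/dev/null 2>&1; do
    [ "$waited" -ge "$timeout_s" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
}
```

For example, `wait_until 600 curl -sf http://127.0.0.1:30000/health` would block for up to ten minutes; adjust the host and port if the server was started with different values.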
Send a test request
You can run inference using the server.
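As an illustration of what such a request might look like, the sketch below posts a prompt to SGLang's native `/generate` endpoint. The prompt, the base URL `http://127.0.0.1:30000` (SGLang's default port), and the helper names are assumptions, not the guide's original command.

```shell
# Hypothetical helper: build a JSON body for the /generate endpoint.
# The prompt is inserted verbatim, so it must not contain quotes or backslashes.
build_generate_payload() {
  prompt=$1
  max_new_tokens=${2:-32}
  printf '{"text": "%s", "sampling_params": {"max_new_tokens": %d, "temperature": 0}}' \
    "$prompt" "$max_new_tokens"
}

# Hypothetical helper: send one test prompt to a running server.
send_test_request() {
  base_url=${1:-http://127.0.0.1:30000}
  build_generate_payload "The capital of France is" 16 |
    curl -s -X POST "$base_url/generate" \
      -H 'Content-Type: application/json' -d @-
}
```

Calling `send_test_request` with no arguments targets the assumed default address; pass a different base URL as the first argument if needed.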
Stop server and exit container
The SGLang server is running as a background process. You can send a `SIGINT` signal to stop it.
Check for remaining server processes with `ps -ef | grep sglang`, then exit the container by pressing Ctrl+D.
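The SIGINT step can be sketched as a function that sends the signal and then waits for the process to disappear. This is a minimal sketch under the assumption that the server's main process ID is known; `stop_with_sigint` is a hypothetical helper name.

```shell
# Hypothetical helper: send SIGINT to a PID, then poll once per second
# until the process is gone or timeout_s seconds have passed.
stop_with_sigint() {
  pid=$1
  timeout_s=${2:-30}
  # Fails immediately if the process does not exist.
  kill -INT "$pid" 2>/dev/null || return 1
  waited=0
  while kill -0 "$pid" 2>/dev/null; do
    [ "$waited" -ge "$timeout_s" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
}
```

The PID can be read from the `ps -ef | grep sglang` listing and passed as the first argument, for example `stop_with_sigint 12345`, where 12345 stands in for the actual PID.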