[Bug]: ZhipuAI/GLM-4-32B-0414 fails to start in eager and graph mode

Your current environment

image: v0.9.2rc1

Eager mode:

VLLM_USE_MODELSCOPE=True vllm serve ZhipuAI/GLM-4-32B-0414 --tensor_parallel_size 2 --trust_remote_code --enforce-eager &

Graph mode:

VLLM_USE_MODELSCOPE=True vllm serve ZhipuAI/GLM-4-32B-0414 --tensor_parallel_size 2 --trust_remote_code &
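The error output further down recommends setting `ASCEND_LAUNCH_BLOCKING=1` to get an accurate stack trace, since operators otherwise run asynchronously. Below is a minimal sketch of how the same launches could be repeated with that variable set. The helper name `build_debug_launch` is hypothetical; only the flags and environment variables already shown in this report are used, and the sketch only builds the command rather than starting the server.

```python
import os
import shlex
from typing import Dict, List, Tuple


def build_debug_launch(enforce_eager: bool) -> Tuple[List[str], Dict[str, str]]:
    """Build (argv, env) for re-running the reported command in synchronous mode.

    Hypothetical helper: it mirrors the command from this report and adds
    ASCEND_LAUNCH_BLOCKING=1, as the error message itself suggests.
    """
    argv = [
        "vllm", "serve", "ZhipuAI/GLM-4-32B-0414",
        "--tensor_parallel_size", "2",
        "--trust_remote_code",
    ]
    if enforce_eager:
        argv.append("--enforce-eager")

    env = dict(os.environ)
    env["VLLM_USE_MODELSCOPE"] = "True"
    # Forces ops to run synchronously; slower, so unset it after debugging.
    env["ASCEND_LAUNCH_BLOCKING"] = "1"
    return argv, env


if __name__ == "__main__":
    for eager in (True, False):
        argv, _env = build_debug_launch(enforce_eager=eager)
        # Print the equivalent shell command instead of launching the server.
        print("ASCEND_LAUNCH_BLOCKING=1 VLLM_USE_MODELSCOPE=True " + shlex.join(argv))
```

To actually start the server from Python, the returned pair could be passed to `subprocess.run(argv, env=env)`.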

🐛 Describe the bug

Eager mode bug:

(VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] WorkerProc hit an exception. (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] Traceback (most recent call last): (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 541, in worker_busy_loop (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] output = func(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker_v1.py", line 155, in determine_available_memory (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] self.model_runner.profile_run() (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1863, in profile_run (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] hidden_states = self._dummy_run(self.max_num_tokens, (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return func(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1838, in _dummy_run (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] 
hidden_states = model( (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/models/glm4.py", line 285, in forward (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] hidden_states = self.model(input_ids, positions, intermediate_tensors, (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 206, in __call__ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self.forward(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/models/llama.py", line 392, in forward (VllmWorker rank=0 pid=42874) ERROR 08-04 
03:13:46 [multiproc_executor.py:546] hidden_states, residual = layer(positions, hidden_states, residual) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/models/glm4.py", line 207, in forward (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] hidden_states = self.post_mlp_layernorm(hidden_states) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 
[multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/custom_op.py", line 44, in forward (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self._forward_method(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/ops/layernorm.py", line 82, in forward_oot (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] x, residual = torch_npu.npu_rms_norm(x, self.weight, self.variance_epsilon) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/_ops.py", line 1116, in __call__ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self._op(*args, **(kwargs or {})) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is RopeOperation. 
(VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] Since the operator is called asynchronously, the stacktrace may be inaccurate. If you want to get the accurate stacktrace, pleace set the environment variable ASCEND_LAUNCH_BLOCKING=1. (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] Note: ASCEND_LAUNCH_BLOCKING=1 will force ops to run in synchronous mode, resulting in performance degradation. Please unset ASCEND_LAUNCH_BLOCKING in time after debugging. (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] [ERROR] 2025-08-04-03:13:46 (PID:42874, Device:0, RankID:-1) ERR00100 PTA call acl api failed. (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] Traceback (most recent call last): (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 541, in worker_busy_loop (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] output = func(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker_v1.py", line 155, in determine_available_memory (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] self.model_runner.profile_run() (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1863, in profile_run (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] hidden_states = self._dummy_run(self.max_num_tokens, (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
(VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return func(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1838, in _dummy_run (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] hidden_states = model( (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/models/glm4.py", line 285, in forward (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] hidden_states = self.model(input_ids, positions, intermediate_tensors, (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 
[multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 206, in __call__ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self.forward(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/models/llama.py", line 392, in forward (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] hidden_states, residual = layer(positions, hidden_states, residual) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/models/glm4.py", line 207, in forward (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] 
hidden_states = self.post_mlp_layernorm(hidden_states) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/custom_op.py", line 44, in forward (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self._forward_method(*args, **kwargs) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/ops/layernorm.py", line 82, in forward_oot (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] x, residual = torch_npu.npu_rms_norm(x, self.weight, self.variance_epsilon) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] File 
"/usr/local/python3.11.13/lib/python3.11/site-packages/torch/_ops.py", line 1116, in __call__ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] return self._op(*args, **(kwargs or {})) (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is RopeOperation. (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] Since the operator is called asynchronously, the stacktrace may be inaccurate. If you want to get the accurate stacktrace, pleace set the environment variable ASCEND_LAUNCH_BLOCKING=1. (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] Note: ASCEND_LAUNCH_BLOCKING=1 will force ops to run in synchronous mode, resulting in performance degradation. Please unset ASCEND_LAUNCH_BLOCKING in time after debugging. (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] [ERROR] 2025-08-04-03:13:46 (PID:42874, Device:0, RankID:-1) ERR00100 PTA call acl api failed. (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] (VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] ERROR 08-04 03:13:46 [core.py:632] EngineCore failed to start. 
ERROR 08-04 03:13:46 [core.py:632] Traceback (most recent call last): ERROR 08-04 03:13:46 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 623, in run_engine_core ERROR 08-04 03:13:46 [core.py:632] engine_core = EngineCoreProc(*args, **kwargs) ERROR 08-04 03:13:46 [core.py:632] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 08-04 03:13:46 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 441, in __init__ ERROR 08-04 03:13:46 [core.py:632] super().__init__(vllm_config, executor_class, log_stats, ERROR 08-04 03:13:46 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 86, in __init__ ERROR 08-04 03:13:46 [core.py:632] self._initialize_kv_caches(vllm_config) ERROR 08-04 03:13:46 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 158, in _initialize_kv_caches ERROR 08-04 03:13:46 [core.py:632] self.model_executor.determine_available_memory()) ERROR 08-04 03:13:46 [core.py:632] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 08-04 03:13:46 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/executor/abstract.py", line 76, in determine_available_memory ERROR 08-04 03:13:46 [core.py:632] output = self.collective_rpc("determine_available_memory") ERROR 08-04 03:13:46 [core.py:632] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 08-04 03:13:46 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 237, in collective_rpc ERROR 08-04 03:13:46 [core.py:632] result = get_response(w, dequeue_timeout) ERROR 08-04 03:13:46 [core.py:632] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 08-04 03:13:46 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 224, in get_response ERROR 08-04 03:13:46 [core.py:632] raise RuntimeError( ERROR 08-04 03:13:46 [core.py:632] RuntimeError: Worker failed with error 'The Inner error is reported as above. The process exits for this inner error, and the current working operator name is RopeOperation. 
ERROR 08-04 03:13:46 [core.py:632] Since the operator is called asynchronously, the stacktrace may be inaccurate. If you want to get the accurate stacktrace, pleace set the environment variable ASCEND_LAUNCH_BLOCKING=1. ERROR 08-04 03:13:46 [core.py:632] Note: ASCEND_LAUNCH_BLOCKING=1 will force ops to run in synchronous mode, resulting in performance degradation. Please unset ASCEND_LAUNCH_BLOCKING in time after debugging. ERROR 08-04 03:13:46 [core.py:632] [ERROR] 2025-08-04-03:13:46 (PID:42874, Device:0, RankID:-1) ERR00100 PTA call acl api failed. ERROR 08-04 03:13:46 [core.py:632] ', please check the stack trace above for the root cause [rank1]:[E804 03:13:46.347801128 compiler_depend.ts:429] RopeOperation setup failed! Exception raised from OperationSetup at build/third_party/op-plugin/op_plugin/CMakeFiles/op_plugin_atb.dir/compiler_depend.ts:148 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0xb8 (0xffff8248c908 in /usr/local/python3.11.13/lib/python3.11/site-packages/torch/lib/libc10.so) frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x6c (0xffff8243b404 in /usr/local/python3.11.13/lib/python3.11/site-packages/torch/lib/libc10.so) frame #2: atb::OperationSetup(atb::VariantPack, atb::Operation*, atb::Context*) + 0xc8 (0xffff67843b3c in /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libop_plugin_atb.so) frame #3: <unknown function> + 0x83be4 (0xffff67843be4 in /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libop_plugin_atb.so) frame #4: <unknown function> + 0x192b0f0 (0xffff75d8b0f0 in /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so) frame #5: <unknown function> + 0x811794 (0xffff74c71794 in /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so) frame #6: <unknown function> + 0x8139c4 (0xffff74c739c4 in 
/usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so) frame #7: <unknown function> + 0x810334 (0xffff74c70334 in /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so) frame #8: <unknown function> + 0x4c9e4c (0xffff824c9e4c in /usr/local/python3.11.13/lib/python3.11/site-packages/torch/lib/libc10.so) frame #9: <unknown function> + 0x7d5b8 (0xffff8d04d5b8 in /lib/aarch64-linux-gnu/libc.so.6) frame #10: <unknown function> + 0xe5edc (0xffff8d0b5edc in /lib/aarch64-linux-gnu/libc.so.6) ERROR 08-04 03:13:57 [multiproc_executor.py:140] Worker proc VllmWorker-0 died unexpectedly, shutting down executor. Process EngineCore_0: Traceback (most recent call last): File "/usr/local/python3.11.13/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap self.run() File "/usr/local/python3.11.13/lib/python3.11/multiprocessing/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 636, in run_engine_core raise e File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 623, in run_engine_core engine_core = EngineCoreProc(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 441, in __init__ super().__init__(vllm_config, executor_class, log_stats, File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 86, in __init__ self._initialize_kv_caches(vllm_config) File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 158, in _initialize_kv_caches self.model_executor.determine_available_memory()) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/vllm-workspace/vllm/vllm/v1/executor/abstract.py", line 76, in determine_available_memory output = self.collective_rpc("determine_available_memory") ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 237, in collective_rpc result = get_response(w, dequeue_timeout) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 224, in get_response raise RuntimeError( RuntimeError: Worker failed with error 'The Inner error is reported as above. The process exits for this inner error, and the current working operator name is RopeOperation. Since the operator is called asynchronously, the stacktrace may be inaccurate. If you want to get the accurate stacktrace, pleace set the environment variable ASCEND_LAUNCH_BLOCKING=1. Note: ASCEND_LAUNCH_BLOCKING=1 will force ops to run in synchronous mode, resulting in performance degradation. Please unset ASCEND_LAUNCH_BLOCKING in time after debugging. [ERROR] 2025-08-04-03:13:46 (PID:42874, Device:0, RankID:-1) ERR00100 PTA call acl api failed. ', please check the stack trace above for the root cause Traceback (most recent call last): File "/usr/local/python3.11.13/bin/vllm", line 8, in <module> sys.exit(main()) ^^^^^^ File "/vllm-workspace/vllm/vllm/entrypoints/cli/main.py", line 54, in main args.dispatch_function(args) File "/vllm-workspace/vllm/vllm/entrypoints/cli/serve.py", line 52, in cmd uvloop.run(run_server(args)) File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvloop/__init__.py", line 105, in run return runner.run(wrapper()) ^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/python3.11.13/lib/python3.11/asyncio/runners.py", line 118, in run return self._loop.run_until_complete(task) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvloop/__init__.py", line 61, in wrapper return await main ^^^^^^^^^^ File "/vllm-workspace/vllm/vllm/entrypoints/openai/api_server.py", line 1791, in run_server await run_server_worker(listen_address, sock, args, **uvicorn_kwargs) File "/vllm-workspace/vllm/vllm/entrypoints/openai/api_server.py", line 1811, in run_server_worker async with build_async_engine_client(args, client_config) as 
engine_client: File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 210, in __aenter__ return await anext(self.gen) ^^^^^^^^^^^^^^^^^^^^^ File "/vllm-workspace/vllm/vllm/entrypoints/openai/api_server.py", line 158, in build_async_engine_client async with build_async_engine_client_from_engine_args( File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 210, in __aenter__ return await anext(self.gen) ^^^^^^^^^^^^^^^^^^^^^ File "/vllm-workspace/vllm/vllm/entrypoints/openai/api_server.py", line 194, in build_async_engine_client_from_engine_args async_llm = AsyncLLM.from_vllm_config( ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 163, in from_vllm_config return cls( ^^^^ File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 117, in __init__ self.engine_core = EngineCoreClient.make_async_mp_client( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 98, in make_async_mp_client return AsyncMPClient(*client_args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 677, in __init__ super().__init__( File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 408, in __init__ with launch_core_engines(vllm_config, executor_class, File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 144, in __exit__ next(self.gen) File "/vllm-workspace/vllm/vllm/v1/engine/utils.py", line 697, in launch_core_engines wait_for_engine_startup( File "/vllm-workspace/vllm/vllm/v1/engine/utils.py", line 750, in wait_for_engine_startup raise RuntimeError("Engine core initialization failed. " RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {} [ERROR] 2025-08-04-03:13:59 (PID:42358, Device:-1, RankID:-1) ERR99999 UNKNOWN applicaiton exception
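The log above was pasted without line breaks, which makes the traceback hard to follow. Here is a rough sketch for splitting such a flattened log back into records and pulling out the operator name that the runtime reports. The function names `split_records` and `failing_operators` are made up for illustration, and the prefix pattern is an assumption based only on the two prefix shapes visible in this report.

```python
import re
from typing import List

# Assumed prefix shapes, taken from the log in this report:
#   "(VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 [multiproc_executor.py:546] "
#   "ERROR 08-04 03:13:46 [core.py:632] "
PREFIX = re.compile(
    r"\(VllmWorker rank=\d+ pid=\d+\) "
    r"(?:ERROR \d{2}-\d{2} \d{2}:\d{2}:\d{2} \[[\w.]+:\d+\] ?)?"
    r"|ERROR \d{2}-\d{2} \d{2}:\d{2}:\d{2} \[[\w.]+:\d+\] ?"
)

# Phrase used by the runtime error message in the log.
OPERATOR = re.compile(r"current working operator name is (\w+)")


def split_records(flat_log: str) -> List[str]:
    """Split a flattened log into one entry per original log line."""
    return [part.strip() for part in PREFIX.split(flat_log) if part.strip()]


def failing_operators(flat_log: str) -> List[str]:
    """Return the distinct operator names mentioned in the error messages."""
    return sorted(set(OPERATOR.findall(flat_log)))


if __name__ == "__main__":
    sample = (
        "(VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 "
        "[multiproc_executor.py:546] WorkerProc hit an exception. "
        "(VllmWorker rank=0 pid=42874) ERROR 08-04 03:13:46 "
        "[multiproc_executor.py:546] RuntimeError: The Inner error is reported "
        "as above. The process exits for this inner error, and the current "
        "working operator name is RopeOperation."
    )
    for record in split_records(sample):
        print(record)
    print(failing_operators(sample))
```

Note that the Python traceback points at `npu_rms_norm` while the runtime names `RopeOperation`; the log itself warns that the stack trace may be inaccurate because the operator is called asynchronously.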

Graph mode bug:

frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xe4 (0xffff850e3e44 in /usr/local/python3.11.13/lib/python3.11/site-packages/torch/lib/libc10.so) frame #2: atb::OperationSetup(atb::VariantPack, atb::Operation*, atb::Context*) + 0x254 (0xffff6294ac24 in /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libop_plugin_atb.so) frame #3: <unknown function> + 0x8b7bc (0xffff6294b7bc in /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libop_plugin_atb.so) frame #4: <unknown function> + 0x22887d4 (0xffff778c87d4 in /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so) frame #5: <unknown function> + 0x8fb170 (0xffff75f3b170 in /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so) frame #6: <unknown function> + 0x8fd504 (0xffff75f3d504 in /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so) frame #7: <unknown function> + 0x8f9e2c (0xffff75f39e2c in /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so) frame #8: <unknown function> + 0xd31fc (0xffff84f531fc in /lib/aarch64-linux-gnu/libstdc++.so.6) frame #9: <unknown function> + 0x7d5b8 (0xffff9121d5b8 in /lib/aarch64-linux-gnu/libc.so.6) frame #10: <unknown function> + 0xe5edc (0xffff91285edc in /lib/aarch64-linux-gnu/libc.so.6) (VllmWorker rank=0 pid=190808) Traceback (most recent call last): (VllmWorker rank=0 pid=190808) File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/fx/graph_module.py", line 393, in __call__ (VllmWorker rank=0 pid=190808) return super(self.cls, obj).__call__(*args, **kwargs) # type: ignore[misc] (VllmWorker rank=0 pid=190808) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in 
_wrapped_call_impl (VllmWorker rank=0 pid=190808) return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=190808) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl (VllmWorker rank=0 pid=190808) return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=190808) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) File "<eval_with_key>.3", line 22, in forward (VllmWorker rank=0 pid=190808) mul = silu * getitem_5; silu = getitem_5 = None (VllmWorker rank=0 pid=190808) ~~~~~^~~~~~~~~~~ (VllmWorker rank=0 pid=190808) RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is RopeOperation. (VllmWorker rank=0 pid=190808) Since the operator is called asynchronously, the stacktrace may be inaccurate. If you want to get the accurate stacktrace, please set the environment variable ASCEND_LAUNCH_BLOCKING=1. (VllmWorker rank=0 pid=190808) Note: ASCEND_LAUNCH_BLOCKING=1 will force ops to run in synchronous mode, resulting in performance degradation. Please unset ASCEND_LAUNCH_BLOCKING in time after debugging. (VllmWorker rank=0 pid=190808) [ERROR] 2025-08-07-04:00:10 (PID:190808, Device:0, RankID:-1) ERR00100 PTA call acl api failed. 
(VllmWorker rank=0 pid=190808) (VllmWorker rank=0 pid=190808) (VllmWorker rank=0 pid=190808) Call using an FX-traced Module, line 22 of the traced Module's generated forward function: (VllmWorker rank=0 pid=190808) getitem_5 = linear_1[(Ellipsis, slice(11520, None, None))]; linear_1 = None (VllmWorker rank=0 pid=190808) mul = silu * getitem_5; silu = getitem_5 = None (VllmWorker rank=0 pid=190808) (VllmWorker rank=0 pid=190808) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE (VllmWorker rank=0 pid=190808) linear_2 = torch._C._nn.linear(mul, l_self_modules_layers_modules_0_modules_mlp_modules_down_proj_parameters_weight_, None); mul = l_self_modules_layers_modules_0_modules_mlp_modules_down_proj_parameters_weight_ = None (VllmWorker rank=0 pid=190808) (VllmWorker rank=0 pid=190808) all_reduce_1 = torch.ops._c10d_functional.all_reduce(linear_2, 'sum', '3') (VllmWorker rank=0 pid=190808) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] WorkerProc hit an exception. 
(VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] Traceback (most recent call last): (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 541, in worker_busy_loop (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] output = func(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker_v1.py", line 157, in determine_available_memory (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] self.model_runner.profile_run() (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1965, in profile_run (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] hidden_states = self._dummy_run(self.max_num_tokens, (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return func(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1940, in _dummy_run (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] hidden_states = model( (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 
[multiproc_executor.py:546] ^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/models/glm4.py", line 285, in forward (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] hidden_states = self.model(input_ids, positions, intermediate_tensors, (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 272, in __call__ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] output = self.compiled_callable(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 655, in _fn (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 
[multiproc_executor.py:546] return fn(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/models/llama.py", line 368, in forward (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] def forward( (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 838, in _fn (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return fn(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/fx/graph_module.py", line 830, in call_wrapped (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return self._wrapped_call(self, *args, **kwargs) (VllmWorker 
rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/fx/graph_module.py", line 406, in __call__ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] raise e (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/fx/graph_module.py", line 393, in __call__ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return super(self.cls, obj).__call__(*args, **kwargs) # type: ignore[misc] (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "<eval_with_key>.124", line 505, in forward (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] submod_2 = self.submod_2(getitem_3, s0, 
l_self_modules_layers_modules_0_modules_self_attn_modules_o_proj_parameters_weight_, l_self_modules_layers_modules_0_modules_post_self_attn_layernorm_parameters_weight_, getitem_4, l_self_modules_layers_modules_0_modules_post_attention_layernorm_parameters_weight_, l_self_modules_layers_modules_0_modules_mlp_modules_gate_up_proj_parameters_weight_, l_self_modules_layers_modules_0_modules_mlp_modules_down_proj_parameters_weight_, l_self_modules_layers_modules_0_modules_post_mlp_layernorm_parameters_weight_, l_self_modules_layers_modules_1_modules_input_layernorm_parameters_weight_, l_self_modules_layers_modules_1_modules_self_attn_modules_qkv_proj_parameters_weight_, l_positions_, s1, l_self_modules_layers_modules_0_modules_self_attn_modules_rotary_emb_buffers_cos_sin_cache_); getitem_3 = l_self_modules_layers_modules_0_modules_self_attn_modules_o_proj_parameters_weight_ = l_self_modules_layers_modules_0_modules_post_self_attn_layernorm_parameters_weight_ = getitem_4 = l_self_modules_layers_modules_0_modules_post_attention_layernorm_parameters_weight_ = l_self_modules_layers_modules_0_modules_mlp_modules_gate_up_proj_parameters_weight_ = l_self_modules_layers_modules_0_modules_mlp_modules_down_proj_parameters_weight_ = l_self_modules_layers_modules_0_modules_post_mlp_layernorm_parameters_weight_ = l_self_modules_layers_modules_1_modules_input_layernorm_parameters_weight_ = l_self_modules_layers_modules_1_modules_self_attn_modules_qkv_proj_parameters_weight_ = None (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/compilation/piecewise_backend.py", line 123, in __call__ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return self.compiled_graph_for_general_shape(*args) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/fx/graph_module.py", line 830, in call_wrapped (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return self._wrapped_call(self, *args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/fx/graph_module.py", line 404, in __call__ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] raise e.with_traceback(None) # 
noqa: B904 (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is RopeOperation. (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] Since the operator is called asynchronously, the stacktrace may be inaccurate. If you want to get the accurate stacktrace, please set the environment variable ASCEND_LAUNCH_BLOCKING=1. (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] Note: ASCEND_LAUNCH_BLOCKING=1 will force ops to run in synchronous mode, resulting in performance degradation. Please unset ASCEND_LAUNCH_BLOCKING in time after debugging. (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] [ERROR] 2025-08-07-04:00:10 (PID:190808, Device:0, RankID:-1) ERR00100 PTA call acl api failed. 
(VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] Traceback (most recent call last): (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 541, in worker_busy_loop (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] output = func(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker_v1.py", line 157, in determine_available_memory (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] self.model_runner.profile_run() (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1965, in profile_run (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] hidden_states = self._dummy_run(self.max_num_tokens, (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return func(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1940, in _dummy_run (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] hidden_states 
= model( (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/models/glm4.py", line 285, in forward (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] hidden_states = self.model(input_ids, positions, intermediate_tensors, (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 272, in __call__ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] output = self.compiled_callable(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 655, in 
_fn (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return fn(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm/vllm/model_executor/models/llama.py", line 368, in forward (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] def forward( (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 838, in _fn (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return fn(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/fx/graph_module.py", line 830, in call_wrapped (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return 
self._wrapped_call(self, *args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/fx/graph_module.py", line 406, in __call__ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] raise e (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/fx/graph_module.py", line 393, in __call__ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return super(self.cls, obj).__call__(*args, **kwargs) # type: ignore[misc] (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return self._call_impl(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return forward_call(*args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "<eval_with_key>.124", line 505, in forward (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] submod_2 = 
self.submod_2(getitem_3, s0, l_self_modules_layers_modules_0_modules_self_attn_modules_o_proj_parameters_weight_, l_self_modules_layers_modules_0_modules_post_self_attn_layernorm_parameters_weight_, getitem_4, l_self_modules_layers_modules_0_modules_post_attention_layernorm_parameters_weight_, l_self_modules_layers_modules_0_modules_mlp_modules_gate_up_proj_parameters_weight_, l_self_modules_layers_modules_0_modules_mlp_modules_down_proj_parameters_weight_, l_self_modules_layers_modules_0_modules_post_mlp_layernorm_parameters_weight_, l_self_modules_layers_modules_1_modules_input_layernorm_parameters_weight_, l_self_modules_layers_modules_1_modules_self_attn_modules_qkv_proj_parameters_weight_, l_positions_, s1, l_self_modules_layers_modules_0_modules_self_attn_modules_rotary_emb_buffers_cos_sin_cache_); getitem_3 = l_self_modules_layers_modules_0_modules_self_attn_modules_o_proj_parameters_weight_ = l_self_modules_layers_modules_0_modules_post_self_attn_layernorm_parameters_weight_ = getitem_4 = l_self_modules_layers_modules_0_modules_post_attention_layernorm_parameters_weight_ = l_self_modules_layers_modules_0_modules_mlp_modules_gate_up_proj_parameters_weight_ = l_self_modules_layers_modules_0_modules_mlp_modules_down_proj_parameters_weight_ = l_self_modules_layers_modules_0_modules_post_mlp_layernorm_parameters_weight_ = l_self_modules_layers_modules_1_modules_input_layernorm_parameters_weight_ = l_self_modules_layers_modules_1_modules_self_attn_modules_qkv_proj_parameters_weight_ = None (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/vllm-workspace/vllm-ascend/vllm_ascend/compilation/piecewise_backend.py", line 123, in __call__ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return self.compiled_graph_for_general_shape(*args) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/fx/graph_module.py", line 830, in call_wrapped (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] return self._wrapped_call(self, *args, **kwargs) (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/fx/graph_module.py", line 404, in __call__ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] raise e.with_traceback(None) # 
noqa: B904 (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is RopeOperation. (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] Since the operator is called asynchronously, the stacktrace may be inaccurate. If you want to get the accurate stacktrace, please set the environment variable ASCEND_LAUNCH_BLOCKING=1. (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] Note: ASCEND_LAUNCH_BLOCKING=1 will force ops to run in synchronous mode, resulting in performance degradation. Please unset ASCEND_LAUNCH_BLOCKING in time after debugging. (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] [ERROR] 2025-08-07-04:00:10 (PID:190808, Device:0, RankID:-1) ERR00100 PTA call acl api failed. (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] (VllmWorker rank=0 pid=190808) ERROR 08-07 04:00:10 [multiproc_executor.py:546] ERROR 08-07 04:00:10 [core.py:632] EngineCore failed to start. 
ERROR 08-07 04:00:10 [core.py:632] Traceback (most recent call last):
ERROR 08-07 04:00:10 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 623, in run_engine_core
ERROR 08-07 04:00:10 [core.py:632] engine_core = EngineCoreProc(*args, **kwargs)
ERROR 08-07 04:00:10 [core.py:632] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 08-07 04:00:10 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 441, in __init__
ERROR 08-07 04:00:10 [core.py:632] super().__init__(vllm_config, executor_class, log_stats,
ERROR 08-07 04:00:10 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 86, in __init__
ERROR 08-07 04:00:10 [core.py:632] self._initialize_kv_caches(vllm_config)
ERROR 08-07 04:00:10 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 158, in _initialize_kv_caches
ERROR 08-07 04:00:10 [core.py:632] self.model_executor.determine_available_memory())
ERROR 08-07 04:00:10 [core.py:632] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 08-07 04:00:10 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/executor/abstract.py", line 76, in determine_available_memory
ERROR 08-07 04:00:10 [core.py:632] output = self.collective_rpc("determine_available_memory")
ERROR 08-07 04:00:10 [core.py:632] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 08-07 04:00:10 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 237, in collective_rpc
ERROR 08-07 04:00:10 [core.py:632] result = get_response(w, dequeue_timeout)
ERROR 08-07 04:00:10 [core.py:632] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 08-07 04:00:10 [core.py:632] File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 224, in get_response
ERROR 08-07 04:00:10 [core.py:632] raise RuntimeError(
ERROR 08-07 04:00:10 [core.py:632] RuntimeError: Worker failed with error 'The Inner error is reported as above. The process exits for this inner error, and the current working operator name is RopeOperation.
