...
2025-11-10 14:17:55,786 - INFO - 192.168.32.85:32898 - "POST /v1/chat/completions HTTP/1.1" 200
2025-11-10 14:17:56,448 - ERROR - LLM inference failed!
Traceback (most recent call last):
File "/mnt/Mediaz/Projekt/LLMao/OpenArc/src/server/worker_registry.py", line 86, in infer_llm
async for item in llm_instance.generate_type(packet.gen_config):
File "/mnt/Mediaz/Projekt/LLMao/OpenArc/src/engine/ov_genai/llm.py", line 96, in generate_text
prompt_token_ids = self.prepare_inputs(gen_config.messages, gen_config.tools)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/Mediaz/Projekt/LLMao/OpenArc/src/engine/ov_genai/llm.py", line 47, in prepare_inputs
prompt_token_ids = self.encoder_tokenizer.apply_chat_template(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1652, in apply_chat_template
rendered_chat, generation_indices = render_jinja_template(
^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/transformers/utils/chat_template_utils.py", line 498, in render_jinja_template
rendered_chat = compiled_template.render(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/jinja2/environment.py", line 1295, in render
self.environment.handle_exception()
File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/jinja2/environment.py", line 942, in handle_exception
raise rewrite_traceback_stack(source=source)
File "<template>", line 62, in top-level template code
File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/jinja2/sandbox.py", line 401, in call
return __context.call(__obj, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/transformers/utils/chat_template_utils.py", line 423, in raise_exception
raise jinja2.exceptions.TemplateError(message)
jinja2.exceptions.TemplateError: Tool call IDs should be alphanumeric strings with length 9!
2025-11-10 14:17:56,449 - ERROR - [Ministral-8B LLM Worker] Inference failed, triggering model unload...
2025-11-10 14:17:56,747 - INFO - [Ministral-8B] unloaded successfully
I feel that unloading the model whenever something goes wrong in the Jinja template is a bit aggressive 🤔 — also, I'm not sure if there's a general issue with tool calling: I tried both Letta and AutoGen, and both trigger the error above when asking the model to respond after a tool use.
I just threw this out quickly, so if you need any further context or information, or want me to make code changes for debugging, let me know 👍
Screenshots from the Letta side:

With llama.cpp, tool calls execute just fine and the ID is valid:

With OpenArc, the ID is invalid (it contains the `call_` prefix):
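For what it's worth, a possible server-side workaround sketch for the ID error: the template insists that tool call IDs are exactly 9 alphanumeric characters, while clients like Letta and AutoGen send OpenAI-style IDs such as `call_abc123`. One option would be to map each incoming ID to a stable 9-character alphanumeric alias before rendering the chat template. The function names below (`short_id`, `normalize_tool_call_ids`) are purely illustrative, not part of OpenArc:

```python
# Sketch: rewrite OpenAI-style tool call IDs ("call_...") into stable
# 9-char alphanumeric aliases so the chat template's length/charset
# check passes. Hypothetical helpers, not an OpenArc API.
import hashlib

def short_id(raw_id: str) -> str:
    """Derive a deterministic 9-char alphanumeric ID from any string
    (sha256 hex digest truncated to 9 chars: only [0-9a-f])."""
    return hashlib.sha256(raw_id.encode()).hexdigest()[:9]

def normalize_tool_call_ids(messages: list[dict]) -> list[dict]:
    """Rewrite IDs in both assistant tool_calls and tool-result messages,
    so the pairs still reference each other after rewriting."""
    out = []
    for msg in messages:
        msg = dict(msg)  # shallow copy; don't mutate the caller's data
        if msg.get("tool_calls"):
            msg["tool_calls"] = [
                {**tc, "id": short_id(tc["id"])} for tc in msg["tool_calls"]
            ]
        if msg.get("role") == "tool" and "tool_call_id" in msg:
            msg["tool_call_id"] = short_id(msg["tool_call_id"])
        out.append(msg)
    return out
```

Because the mapping is deterministic, the assistant's `tool_calls[].id` and the matching tool message's `tool_call_id` stay consistent without any extra bookkeeping.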

Also, it seems OpenArc can't handle multiple user messages before an assistant response; Jinja fails in that case as well, while llama.cpp handles it gracefully.
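If it helps, the multiple-user-messages failure is consistent with templates that require strictly alternating user/assistant roles (Mistral-family templates do this, as far as I know). A possible mitigation sketch, merging runs of same-role messages before templating — `merge_consecutive_messages` is an illustrative name I made up, not anything in OpenArc:

```python
# Sketch: collapse back-to-back messages with the same role into one
# message (text content joined with blank lines), so role-alternation
# checks in the chat template don't raise. Hypothetical helper.
def merge_consecutive_messages(messages: list[dict]) -> list[dict]:
    merged: list[dict] = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            # Same role as the previous message: append its text instead
            # of emitting a second message with that role.
            merged[-1] = {
                **merged[-1],
                "content": merged[-1]["content"] + "\n\n" + msg["content"],
            }
        else:
            merged.append(dict(msg))
    return merged
```

This assumes string contents; multimodal content lists would need their own merge logic.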