Closed as not planned
Description
apply_chat_template fails when I use it with function calling.
Example
I downloaded tokenizer.json and tokenizer_config.json to test function calling and inspect the full prompt.
The complete code:
from transformers import AutoTokenizer

model_path = "/data1/ztshao/pretrained_models/deepseek-ai/DeepSeek-V3"


def get_current_temperature(location: str) -> float:
    """
    Get the current temperature at a location.

    Args:
        location: Get the location of the temperature in the format of "city, country"
    Returns:
        Displays the current temperature at the specified location as a floating point number (in the specified unit).
    """
    return 22.


tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France"}}
messages = [
    {"role": "system", "content": "You are a robot that responds to weather queries."},
    {"role": "user", "content": "What is the temperature in Paris now?"},
    {"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]},
]

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer.apply_chat_template(messages,
                                       tools=[get_current_temperature],
                                       add_generation_prompt=False,
                                       tokenize=False,
                                       tools_in_user_message=False)
Template in tokenizer
I checked the Jinja2 template and saw that:
- It has a branch for the case where message['content'] is None.
- It concatenates tool['function']['arguments'], which is a dict, to a str.
Template:
{%- if message['role'] == 'assistant' and message['content'] is none %}
{%- set ns.is_tool = false -%}
{%- for tool in message['tool_calls'] %}
{%- if not ns.is_first %}
{{ '<|Assistant|><|tool calls begin|><|tool call begin|>' + tool['type'] + '<|tool sep|>' + tool['function']['name'] + '\n```json\n' + tool['function']['arguments'] + '\n```<|tool call end|>' }}
{%- set ns.is_first = true -%}
{%- else %}
{{ '\n<|tool call begin|>' + tool['type'] + '<|tool sep|>' + tool['function']['name'] + '\n```json\n' + tool['function']['arguments'] + '\n```<|tool call end|>' }}
{{ '<|tool calls end|><|end of sentence|>' }}
{%- endif %}
{%- endfor %}
{%- endif %}
Error
After testing, I found that the problem lies in the last message.
The error:
Traceback (most recent call last):
File "/data1/ztshao/projects/flooding/test/test.py", line 34, in <module>
inputs = tokenizer.apply_chat_template(messages,
File "/data1/ztshao/.conda/envs/flooding/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1695, in apply_chat_template
rendered_chat = compiled_template.render(
File "/data1/ztshao/.conda/envs/flooding/lib/python3.10/site-packages/jinja2/environment.py", line 1295, in render
self.environment.handle_exception()
File "/data1/ztshao/.conda/envs/flooding/lib/python3.10/site-packages/jinja2/environment.py", line 942, in handle_exception
raise rewrite_traceback_stack(source=source)
File "<template>", line 10, in top-level template code
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'content'
So I tried adding the key-value pair "content": None to the last message:
{"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}], "content":None},
However, there is a new error:
Traceback (most recent call last):
File "/data1/ztshao/projects/flooding/test/test.py", line 34, in <module>
inputs = tokenizer.apply_chat_template(messages,
File "/data1/ztshao/.conda/envs/flooding/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1695, in apply_chat_template
rendered_chat = compiled_template.render(
File "/data1/ztshao/.conda/envs/flooding/lib/python3.10/site-packages/jinja2/environment.py", line 1295, in render
self.environment.handle_exception()
File "/data1/ztshao/.conda/envs/flooding/lib/python3.10/site-packages/jinja2/environment.py", line 942, in handle_exception
raise rewrite_traceback_stack(source=source)
File "<template>", line 6, in top-level template code
TypeError: can only concatenate str (not "dict") to str
The problem is that tool['function']['arguments'] is a dict, which cannot be concatenated to a str, yet the Jinja2 template does exactly that.
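A possible workaround, sketched below, is to reshape the messages before rendering: give the assistant tool-call message an explicit "content": None and serialize the arguments dict to a JSON string. The helper name normalize_tool_call_messages is hypothetical, and the sketch assumes the template wants arguments as a JSON string, which the template text suggests but which I have not confirmed against the model's intended format.

```python
import copy
import json


def normalize_tool_call_messages(messages):
    """Return a copy of messages reshaped for the template shown above.

    Hypothetical helper: it assumes the template needs an explicit
    'content' key and a str for tool['function']['arguments'].
    """
    normalized = copy.deepcopy(messages)  # leave the caller's list untouched
    for message in normalized:
        if message.get("role") != "assistant" or "tool_calls" not in message:
            continue
        # The template tests message['content'] against none, so the key must exist.
        message.setdefault("content", None)
        for call in message["tool_calls"]:
            arguments = call["function"].get("arguments")
            # The template concatenates arguments to a str, so serialize dicts.
            if isinstance(arguments, dict):
                call["function"]["arguments"] = json.dumps(arguments)
    return normalized


tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France"}}
messages = [
    {"role": "user", "content": "What is the temperature in Paris now?"},
    {"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]},
]
fixed = normalize_tool_call_messages(messages)
```

Passing fixed instead of messages to apply_chat_template should avoid both errors if the assumptions hold; I have not verified the rendered prompt against the real tokenizer.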
Is there a problem with tokenizer_config.json? Or is there anything I missed? Thanks a lot!