-
Notifications
You must be signed in to change notification settings - Fork 0
reconstruct tool calls #30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
{"key": "8ff05b20", "response": "{\"model\":\"claude-sonnet-4-5-20250929\",\"id\":\"msg_01UkSZXczvtptdBLJemBkwnv\",\"type\":\"message\",\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"# Image Description\\n\\nThis adorable image shows a **Cavalier King Charles Spaniel puppy** with the classic Blenheim coloring (chestnut brown and white markings). \\n\\n## Key features visible:\\n- **Expressive brown eyes** looking directly at the camera\\n- **Soft, fluffy ears** with rich brown fur\\n- **White blaze** down the center of the face\\n- **White chest and paws**\\n- The puppy is lying on **green grass**\\n- **Purple flowers** (appear to be asters or similar) in the background\\n- Warm, soft lighting creating a charming portrait effect\\n\\nThe puppy has that irresistibly sweet, gentle expression that Cavalier King Charles Spaniels are famous for. This looks like a professional or carefully composed photograph, possibly for a breeder, pet portrait, or greeting card.\"}],\"stop_reason\":\"end_turn\",\"stop_sequence\":null,\"usage\":{\"input_tokens\":105,\"cache_creation_input_tokens\":0,\"cache_read_input_tokens\":0,\"cache_creation\":{\"ephemeral_5m_input_tokens\":0,\"ephemeral_1h_input_tokens\":0},\"output_tokens\":195,\"service_tier\":\"standard\"}}"} | ||
{"key": "130a52f1", "response": "{\"model\":\"claude-sonnet-4-5-20250929\",\"id\":\"msg_01NsVrovfY7JrTr5dPygRJhb\",\"type\":\"message\",\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\" D A C T E D\\n\\nI don't actually know your name - you haven't told me what it is yet! If you'd like me to spell your name, please let me know what it is first.\"}],\"stop_reason\":\"end_turn\",\"stop_sequence\":null,\"usage\":{\"input_tokens\":16,\"cache_creation_input_tokens\":0,\"cache_read_input_tokens\":0,\"cache_creation\":{\"ephemeral_5m_input_tokens\":0,\"ephemeral_1h_input_tokens\":0},\"output_tokens\":47,\"service_tier\":\"standard\"}}"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although these 2 items are unrelated to the pr, they were automatically added to the cache because their chat
history is linked to upstream calls. This dependency has been removed in this pr.
if not msgs: return [] | ||
if not isinstance(msgs, list): msgs = [msgs] | ||
res,role = [],'user' | ||
msgs = L(msgs).map(lambda m: _build_tool_hist(m) if "<details class='tool-usage-details'>" in m else [m]).concat() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line ensures that we automatically reconstruct the tool call history for every tool call summary msg in msgs.
yield f"\n<details class='tool-usage-details'>\n\n - `{fn.name}({_trunc_str(fn.arguments, replace='<TRUNCATED>')})`\n" | ||
elif isinstance(o, dict) and 'tool_call_id' in o: | ||
yield f" - `{_trunc_str(_clean_str(o.get('content')))}`\n\n</details>\n\n" | ||
yield f" - `{o['tool_call_id']}`\n\n - `{_trunc_str(_clean_str(o.get('content')),replace='<TRUNCATED>')}`\n\n</details>\n\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We needed to include the tool_call_id
in the summary so that we could fully reproduce the original tool call message. If we used a random id, it would bust the LLM cache and cachy's cache 😅 .
def mk_tc(func, args, tcid=None, idx=1): | ||
if not tcid: tcid = random_tool_id() | ||
return {'index': idx, 'function': {'arguments': args, 'name': func}, 'id': tcid, 'type': 'function'} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes make it easy to create tool messages from the tool call summary message (i.e. when the called function and args are strings).
The downside is that there's a little more effort involved in creating a tool call message when the function and args are symbols.
For example. Here's the syntax on main
mk_tc(simple_add, a=5, b=7)
vs the syntax for this pr.
mk_tc(simple_add.__name__, json.dumps(dict(a=5, b=7)))
return hist | ||
|
||
# %% ../nbs/00_core.ipynb | ||
def mk_msgs(msgs, # List of messages (each: str, bytes, list, or dict w 'role' and 'content' fields) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redefining mk_msgs
for the sake of a 1 line change isn't ideal. Is there a better way to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the implementation feels a bit over-clever to me. It needn't be so integrated and automatic. It's quite special-case behaviour. Instead, I'd expect to have a function like "extract_tcs()" which you pass a message to, and it turns it into a list of messages with tool calls expanded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We went back and forth on this. We opted for the automatic implementation because we didn't see a strong reason not to expand these messages.
For extract_tcs()
would that be applied before passing hist to Chat? Maybe we could add a param extract_tcs
to Chat
which would automatically expand messages using the extract_tcs
fn?
def _details_extract(x): | ||
"Extract fn, args, tool_call_id, result from <details>" | ||
m = re.search(r'<details.*?>(.*?)</details>', x, re.DOTALL) | ||
tc, tcid, res = re.findall(r'-\s*`([^`]+)`', m.group(1)) | ||
fn, args = re.search(r'(\w+)\((.*?)\)', tc).groups() | ||
return fn, args, res, tcid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pr modifies the tool call summary message by including a tool call id in the details section.
As a result, _details_extract
will throw an error if it runs on a tool call summary message generated with the current version of lisette.
That version generates a message like this
I'll use the `addy` function to add 5 and 3 for you.
<details class='tool-usage-details'>
`addy({"a": 5, "b": 3})`
- `8`
</details>
The result is 8.
Whereas this pr expects this structure
I'll use the `addy` function to sum 5 and 7 for you.
<details class='tool-usage-details'>
- `simple_add({"a": 5, "b": 7})`
- `12`
- `toolu_01RPbSeouj8mc2N4rfjw2BaH`
</details>
The sum of 5 and 7 is **12**.
We could make it backwards compatible by using a random tool call id? Maybe this is overkill?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using regexen here doesn't feel robust to me. I'd have thought that using json would be better - i.e. a proper structured format with a well-tested standard implementation. wdyt?
This is very exciting @comhar ! :D I don't think we should keep this design for long, but I'll release it for now so we've got something to play with. |
fixes #29
This pr reconstructs tool call messages from a tool call summary message. Concretely it converts a message like this:
to this:
This change should make lisette more resilient to tool call hallucinations.