[BB-9988] Add option to use conversation_id for custom llm service #16
Conversation
Generally everything looks good from a code perspective. I just have a few suggestions.
ai_eval/llm_services.py
```python
url = self.completions_url
# When reusing an existing thread, only send the latest user input and rely on
# the provider to apply prior context associated with the conversation_id.
if use_threads and thread_id:
```
I'm not sure what purpose the use_threads flag is fulfilling. It would be derived from supports_threads, but the LLM service already knows that value, so why are we calling a function on the LLM service to get this value and then passing it back to the LLM service?
In most cases a None value for thread_id will suffice. The only effect use_threads has is that it creates a new thread.
I think it might be better to segregate creation of a new thread, since that will support richer scenarios such as having multiple conversation threads, i.e. if in the future this block wants to support creating multiple threads for each user.
So the calling function flow can then be:

```python
if llm.supports_threads:
    thread_id = llm.new_thread()
llm.get_response(...)
```
use_threads is used as the per-call directive, whereas supports_threads() is the capability/policy gate. So should there ever be a need to force stateless behaviour for a single call, we can just pass use_threads=False.
I understand that that is what it does; however, if you trace the usage of use_threads and thread_id, there is scope for simplification:

| use_threads | thread_id | action |
|---|---|---|
| False | Falsy | non-threaded conversation |
| False | Truthy | non-threaded conversation |
| True | Falsy | start new thread |
| True | Truthy | threaded conversation |

So if use_threads is false, sending a thread_id has no impact. I feel it could be reorganised into two paths:
- Create a new thread in a separate function.
- If a thread_id is provided, the conversation is threaded; if not, it's non-threaded. There's no need to gate it with use_threads, since you can just omit thread_id and get the same result.

I feel it would be cleaner that way, since right now the function is doing too much.
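As a rough illustration of the two-path shape being suggested here (the class and method names below are assumptions for the sketch, not the PR's actual implementation):

```python
import uuid


class CustomLLMService:
    """Illustrative stub only; the real service would call the provider's API."""

    def create_thread(self):
        # Thread creation lives in its own method, so a caller that wants
        # multiple threads per user can simply call it more than once.
        return uuid.uuid4().hex

    def get_response(self, user_input, thread_id=None):
        # A thread_id means a threaded conversation; omitting it means a
        # stateless call. No separate use_threads flag is needed.
        payload = {"message": user_input}
        if thread_id:
            payload["conversation_id"] = thread_id
        # ... the payload would be sent to the provider here; it is
        # returned directly for illustration.
        return payload


service = CustomLLMService()
thread_id = service.create_thread()
threaded = service.get_response("hello", thread_id=thread_id)
stateless = service.get_response("hello")
```

With this shape, the caller expresses "non-threaded" by omitting thread_id rather than by passing an extra flag.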
@xitij2000 You're right, that would be much cleaner. Thanks for pointing that out. I've updated the code with your suggestions.
ai_eval/shortanswer.py
```python
if getattr(self, "thread_tag", "") != current_tag:
    self.thread_id = ""
```
Is storing conversation history a chargeable thing? If so we should probably clean it up when no longer in use.
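If provider-side history is billed, cleanup on reset could look roughly like this. This is a hedged sketch: delete_thread is an assumed provider call (some providers do expose a thread-deletion endpoint), not an API that exists in this PR.

```python
class ThreadedChatState:
    """Minimal stand-in for the XBlock fields involved here."""

    def __init__(self, llm):
        self.llm = llm
        self.thread_id = ""
        self.thread_tag = ""

    def reset(self):
        # Release the provider-side conversation before dropping our
        # reference to it, so billed history does not accumulate.
        if self.thread_id and hasattr(self.llm, "delete_thread"):
            self.llm.delete_thread(self.thread_id)
        self.thread_id = ""
        self.thread_tag = ""


class FakeLLM:
    """Test double recording which threads were deleted."""

    def __init__(self):
        self.deleted = []

    def delete_thread(self, thread_id):
        self.deleted.append(thread_id)


llm = FakeLLM()
state = ThreadedChatState(llm)
state.thread_id = "thread-123"
state.reset()
```

The hasattr guard keeps the reset path safe for services that never gain a deletion API.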
I think the code is looking good! I just had a few suggestions.
As for splitting out the create_thread functionality, it seems like a good approach to me; however, you've worked on this a lot more and have a deeper understanding of the code. So if you feel that my suggestion doesn't make sense or is infeasible, you can ignore it.
Also, please rename the config variable.
Looks good!
* feat: Add option to use conversation_id for custom service
Description
This PR adds optional provider-side threads (conversation IDs) for custom LLM services, gated by the USE_PROVIDER_THREADS site flag. The XBlock caches a provider thread_id and tracks a thread_tag fingerprint (provider/model/prompt/attachments) to auto-invalidate the cached thread when settings change; Reset clears both. The shared LLM wrapper passes and persists thread IDs when supported, while the default path remains stateless.
Important note: The XBlock/LMS remains the canonical chat history (source of truth) so that we can preserve vendor flexibility, continuity, and deterministic behavior; provider threads are treated only as a cache and we safely fall back when unsupported.
This update affects only chat interactions; the Coding XBlock and its code‑execution flow are not impacted as they are one-time feedback only (stateless).
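The thread_tag fingerprint described above could be computed along these lines. This is a sketch with assumed field names and hashing choices, not the PR's exact code:

```python
import hashlib
import json


def compute_thread_tag(provider, model, prompt, attachments):
    # Hash the settings that make a cached provider thread valid; any
    # change in them yields a different tag, invalidating the thread.
    payload = json.dumps(
        {
            "provider": provider,
            "model": model,
            "prompt": prompt,
            "attachments": sorted(attachments),
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Cached state from a previous interaction.
state = {
    "thread_id": "abc123",
    "thread_tag": compute_thread_tag("openai", "gpt-4o", "prompt-v1", []),
}

# Settings changed (different model), so the cached thread is dropped
# and the next call will start a fresh provider thread.
current_tag = compute_thread_tag("openai", "gpt-4o-mini", "prompt-v1", [])
if state["thread_tag"] != current_tag:
    state["thread_id"] = ""
```

Serializing with sort_keys keeps the fingerprint stable regardless of field ordering, so only real settings changes invalidate the thread.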
Supporting information
OpenCraft Internal Jira task: BB-9988
Screenshot
Testing instructions
- Test with the USE_PROVIDER_THREADS flag set to True.
- Check /admin/courseware/studentmodule/ to see that the user states have a conversation_id set.

Other information
This depends on #14 as that contains a bunch of important code refactoring. Once #14 is merged, this will need to be rebased before merging.