Skip to content

Conversation

pkulkark
Copy link
Member

@pkulkark pkulkark commented Sep 3, 2025

Description

This PR adds optional provider‑side threads (conversation IDs) for custom LLM services, gated by the USE_PROVIDER_THREADS site flag. The XBlock caches a provider thread_id and tracks a thread_tag
fingerprint (provider/model/prompt/attachments) to auto‑invalidate when settings change; Reset clears both. The shared LLM wrapper passes and persists thread IDs when supported, while the default path remains stateless.

Important note: The XBlock/LMS remains the canonical chat history (source of truth) so that we can preserve vendor flexibility, continuity, and deterministic behavior; provider threads are treated only as a cache and we safely fall back when unsupported.

This update affects only chat interactions; the Coding XBlock and its code‑execution flow are not impacted as they are one-time feedback only (stateless).

Supporting information

OpenCraft Internal Jira task: BB-9988

Screenshot

Screenshot 2025-09-04 at 2 00 48 PM

Testing instructions

  1. Deploy this branch of the XBlock.
  2. Without making any config changes, verify that the default path (with the standard providers and models) is working as expected.
  3. Configure a custom LLM service through the site config, make sure to include the new USE_PROVIDER_THREADS flag set to True.
  4. Test the AI XBlocks; Both chat and coding xblocks should work without trouble.
  5. To verify that the conversation_id is set correctly and used, go to django admin and check /admin/courseware/studentmodule/ to see that the user states have a conversation_id set.

Other information

This depends on #14 as that contains a bunch of important code refactoring. Once #14 is merged, this will need to be rebased before merging.

@pkulkark pkulkark force-pushed the pooja/bb-9988-refactor branch 3 times, most recently from 650fbcd to a5f2403 Compare September 4, 2025 16:16
@pkulkark pkulkark force-pushed the pooja/bb-9988-refactor branch 4 times, most recently from fadf254 to 7b16d2e Compare September 4, 2025 17:28
@pkulkark pkulkark force-pushed the pooja/bb-9988-refactor branch from 7b16d2e to 5e01d7e Compare September 4, 2025 17:38
@pkulkark pkulkark marked this pull request as ready for review September 4, 2025 17:39
Copy link
Member

@xitij2000 xitij2000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally everything looks good from a code perspective. Just have a few suggestions

url = self.completions_url
# When reusing an existing thread, only send the latest user input and rely on
# the provider to apply prior context associated with the conversation_id.
if use_threads and thread_id:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what purpose the use_threads flag is fulfilling.

It would be derived from supports_threads but the LLM Service already knows the value? So why are we calling a function on the LLM service to get this value and then passing it back to the LLM service?

For most cases a None value for thread_id will suffice. The only use use_threads has, is that it creates a new thread.

I think it might be better to segregate creation of a new thread since it will support more rich scenarios such as having multiple conversation threads.

i.e. if in the future this block wants to support creating multiple threads for each user.

So the calling function flow can then be:
if llm.supports_threads:
thread_id = llm.new_thread()
llm. get_response(...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use_threads is used as the per-call directive whereas supports_threads() is the capability/policy gate. So should there ever be a need to force stateless behaviour for a single call, we can just pass use_threads=False.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that that is what it does, however, if you trace the usage of use_threads and thread_id there is scope for simplification:

use_threads thread_id action
False Falsy non-threaded conversation
False Truthy non-threaded conversation
True Falsy start new thread
True Truthy threaded conversation

So if use_threads if false, then sending a thread_id has no impact. So I feel it could be reorganised into two paths:

  1. Create a new thread in a separate function.
  2. If a thread_id is provided, the conversation is threaded, if not, it's non threaded, no need to gate it with use_threads, since you can just choose to omit thread_id and get the same result.

I feel it would be cleaner that way since right now the function is doing too much.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xitij2000 You're right, that would be much cleaner. Thanks for pointing that out. I've updated the code with your suggestions.

Comment on lines 211 to 213
if getattr(self, "thread_tag", "") != current_tag:
self.thread_id = ""

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is storing conversation history a chargeable thing? If so we should probably clean it up when no longer in use.

@pkulkark pkulkark force-pushed the pooja/bb-9988-refactor branch from 5d66c0b to 8eef29c Compare September 5, 2025 17:19
Copy link
Member

@xitij2000 xitij2000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code is looking good! I just had a few suggestions.

I think for the splitting out of the create_thread functionality, it seems like a good approach for me, however you've worked on this a lot more and have a deeper undersanding of the code. So if you feel that my suggestion doesn't make sense or is infeasible, then you can ignore it.

url = self.completions_url
# When reusing an existing thread, only send the latest user input and rely on
# the provider to apply prior context associated with the conversation_id.
if use_threads and thread_id:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that that is what it does, however, if you trace the usage of use_threads and thread_id there is scope for simplification:

use_threads thread_id action
False Falsy non-threaded conversation
False Truthy non-threaded conversation
True Falsy start new thread
True Truthy threaded conversation

So if use_threads if false, then sending a thread_id has no impact. So I feel it could be reorganised into two paths:

  1. Create a new thread in a separate function.
  2. If a thread_id is provided, the conversation is threaded, if not, it's non threaded, no need to gate it with use_threads, since you can just choose to omit thread_id and get the same result.

I feel it would be cleaner that way since right now the function is doing too much.

Copy link
Member

@xitij2000 xitij2000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@pkulkark pkulkark merged commit b5bd1c3 into pooja/bb-9936-changes Sep 9, 2025
5 checks passed
pkulkark added a commit that referenced this pull request Sep 23, 2025
* feat: Add option to use conversation_id for custom service
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants