-
Notifications
You must be signed in to change notification settings - Fork 156
Add a way to validate that MCP tool descriptions #1403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@jmartisk WDYT? |
This is done by utilizing an LLM to detect whether the tool description is malicious and could lead to a Tool Poisoning Attack (TPA)
|
Sure, but I don't want to wait a month in order to get this in :)
Can you elaborate a little more? |
Kinda understandable, but it will, in the long term, turn against us, if we keep adding stuff here that should be in the upstream project :( especially since I reckon this may be treated as a CVE in the future, it should be fixed across the board.
I mean you currently only detect maliciousness from the description, but maybe using the tool's name and argument list may help detect maliciousness too? Like, if an argument requests data that should be completely irrelevant to the tool, or the name says something completely different from the description. But it's just a thought, maybe this isn't really a problem. |
Doing it here would also be a bit troublesome for the caching of the tool list (which I hopefully will implement very soon, I'll try drafting it today) because to avoid validating the same tool multiple times, you'll have to somehow keep track of what tool you already validated. In the upstream, we can simply perform the validation once when we retrieve the initial list of tools, and when we receive a notification about a new tool. Quarkus won't have a hook into the processing of tool list change notifications (unless we implement some SPI) so we would have to keep some sort of map to keep track of what we already validated (and that kinda opens up a pathway to a DDoS attack by flooding the client with tool change notifications and making this map grow indefinitely). I will prioritize implementing the caching very soon if that helps, and then you could build on top of it |
... or just merge this now, if it's high priority, and then I will redo it when we have the proper solution in place (that should be possible in a backward-compatible way). I guess that would be the best course of action. |
It's not high priority, so I'll wait, no problem. |
I'll prioritize the tool list caching and change notifications implementation so we can get this going |
Thanks a lot! |
* The named model to use in order to judge whether the descriptions of the tools provided by the MCP server | ||
* are malicious. If they are, a warning will be printed and the tool will never be used. | ||
*/ | ||
Optional<String> toolValidationModelName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this assumes to use a model within the default provider or how does actually end up working?
i.e. can I be using openai to validate but call via ollama sometihng for validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you need the default model, you would just set this to default
i.e. can I be using openai to validate but call via ollama sometihng for validation?
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on the idea!
FYI langchain4j/langchain4j#2817 is for the tool list caching (draft for now, I need a new release of quarkus-mcp-server to get the tests passing) |
This is done by utilizing an LLM to detect whether the tool description is malicious and could lead to
a Tool Poisoning Attack (TPA)
TODO: