Add A Claude Code command for Hypothesis blog post
#4571
3f28719 to d49b2d1
(partial review, more later)
minor copyedits below, but I'm looking forward to publishing this! (and #4556)
## Failure modes

We observed a few failure modes while developing `/hypothesis`. For example, AI models like to write strategies with unnecessary restrictions, like limiting the maximum length of a list even when the property should hold for all lengths of lists. We added explicit instructions in `/hypothesis` not to do this, though that doesn't appear to have fixed the problem entirely.
Seems fun to note here that many of our human users do the same thing - and docs don't stop them either 😅
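For readers skimming this thread, here is a minimal sketch of the failure mode being discussed. It is not taken from the blog post or the paper: the property (sorting is idempotent) and both test names are hypothetical, chosen only to show an unnecessary `max_size` next to the unrestricted strategy.

```python
from hypothesis import given, strategies as st


# Over-restricted (the failure mode): nothing about the property needs
# max_size=5, so lists longer than five elements are never explored.
@given(st.lists(st.integers(), max_size=5))
def test_sort_idempotent_restricted(xs):
    assert sorted(sorted(xs)) == sorted(xs)


# Preferred: the property should hold for lists of any length, so the
# strategy is left unrestricted and Hypothesis picks the sizes.
@given(st.lists(st.integers()))
def test_sort_idempotent_unrestricted(xs):
    assert sorted(sorted(xs)) == sorted(xs)


if __name__ == "__main__":
    # Calling a @given-decorated test with no arguments runs it.
    test_sort_idempotent_restricted()
    test_sort_idempotent_unrestricted()
```

Both tests pass here; the point is only that the first one quietly narrows the input space without a reason tied to the property.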
Includes a new `/hypothesis` command, rewritten from the base `/hypo` command in the paper to focus on test writing for long-term maintainers and developers, not immediate bug hunting. Based on an initial draft from @mmaaz-git (thanks!).

I'm very unsure where the best place to host `hypothesis.md` is. I don't really want to do a `claude` dir, because (at least at the moment) this is a generic AI command, as long as your framework implements claude-style tools. I've put it in `agents`, even though it's not really an agent, because `agents` clearly communicates "ai". One idea is we host it as a static file on `hypothesis.works`, and figure out a more permanent place when the ecosystem settles down.

cc other paper authors: @Zac-HD @mmaaz-git @carlini
BTW @carlini I've kept you as an author on this blog post because it discusses the paper, and you're an author there. But I spend about half the post talking about non-paper things, so if you don't want to be listed as endorsing that, just let me know and I can remove you. Just didn't want to take paper credit away. Quite happy to keep you as an author as well of course!