@Liam-DeVoe
Member

Includes a new /hypothesis command, rewritten from the base /hypo command in the paper to focus on test writing for long-term maintainers and developers, not immediate bug hunting. Based on an initial draft from @mmaaz-git (thanks!).

I'm very unsure where the best place to host `hypothesis.md` is. I don't really want to use a `claude` directory, because (at least at the moment) this is a generic AI command that works as long as your framework implements Claude-style tools. I've put it in `agents`, even though it's not really an agent, because `agents` clearly communicates "AI". One idea is to host it as a static file on hypothesis.works and figure out a more permanent place when the ecosystem settles down.

cc other paper authors: @Zac-HD @mmaaz-git @carlini

BTW @carlini I've kept you as an author on this blog post because it discusses the paper, and you're an author there. But I spend about half the post talking about non-paper things, so if you don't want to be listed as endorsing that, just let me know and I can remove you. Just didn't want to take paper credit away. Quite happy to keep you as an author as well of course!

Member

@Zac-HD left a comment

(partial review, more later)

Member

@Zac-HD left a comment

minor copyedits below, but I'm looking forward to publishing this! (and #4556)


## Failure modes

We observed a few failure modes while developing `/hypothesis`. For example, AI models like to write strategies with unnecessary restrictions, like limiting the maximum length of a list even when the property should hold for all lengths of lists. We added explicit instructions in `/hypothesis` not to do this, though that doesn't appear to have fixed the problem entirely.
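
To make the over-restriction concrete, here is a minimal sketch of the pattern. It is not taken from `/hypothesis` or from any model output; the function `dedupe` and both test names are hypothetical, chosen only for illustration. The first test bounds the list length and the integer range even though the property needs neither; the second states the property over every list of integers.

```python
from hypothesis import given, strategies as st


def dedupe(xs):
    """Hypothetical function under test: drop duplicates, keep first-seen order."""
    return list(dict.fromkeys(xs))


# Over-restricted: neither max_size=5 nor the 0..10 range is required by the
# property, so they only shrink the space Hypothesis is allowed to explore.
@given(st.lists(st.integers(min_value=0, max_value=10), max_size=5))
def test_dedupe_is_idempotent_restricted(xs):
    assert dedupe(dedupe(xs)) == dedupe(xs)


# Preferred: the property holds for any list of integers, so say exactly that.
@given(st.lists(st.integers()))
def test_dedupe_is_idempotent(xs):
    assert dedupe(dedupe(xs)) == dedupe(xs)
```

Both tests pass for this function, which is the point: the restricted version looks fine in review but would miss any bug that only shows up on longer lists or larger values.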

Seems fun to note here that many of our human users do the same thing - and docs don't stop them either 😅

@Zac-HD enabled auto-merge November 1, 2025 20:46
@Zac-HD merged commit 4cbd566 into HypothesisWorks:master Nov 1, 2025
149 of 151 checks passed
@Liam-DeVoe deleted the claude-code-blog branch November 1, 2025 20:48
2 participants