Conversation

ParamThakkar123 (Contributor):

Fixes #74

The meta-cla bot added the "CLA Signed" label (managed by the Meta Open Source bot) on Sep 12, 2025.
ParamThakkar123 (Contributor, Author):

@PaliC

PaliC (Contributor) left a comment:

Thanks for the PR!

This is in the right direction; however, we have a colleague who's using this code to run experiments (@shaahins, please take a look), so I want to be careful about touching this code.

Generally I have two points of feedback!

For this toy agent, there is a distinction between 1) errors in the LLM output (mostly: we can't pull a kernel from it) and 2) our API provider doing something weird, like rate limiting. When putting feedback back into the LLM we only care about the first error class (so AgentError should only cover the first case), as the second is not actionable by the agent (i.e. Claude can't do anything about us not paying our Anthropic bill lol). It is somewhat useful to have an error class for 2 for when we run experiments, but it is not necessary, so I wouldn't include it in this PR.
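
For illustration only, a minimal sketch of that split (ProviderError and next_prompt are placeholder names, not code from this PR, and per the point above the second class wouldn't ship in this PR anyway):

class AgentError(Exception):
    """Class 1: the LLM output is unusable, e.g. no kernel could be pulled out of it."""

class ProviderError(Exception):
    """Class 2: the API provider misbehaved (rate limit, outage). Not actionable by the agent."""

def next_prompt(previous_prompt: str, error: Exception) -> str:
    # Only class-1 errors are worth folding back into the next LLM prompt;
    # class-2 errors should surface to the experiment harness instead.
    if isinstance(error, AgentError):
        return previous_prompt + f"\n\nYour last response failed: {error}. Please return a complete kernel."
    raise error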

The second point of feedback is that there is a lot of redundancy in your code. Generally, if we can't pull a kernel out of the code (in this iteration of the agent), that's enough to say "ohh something is wrong here".

# Agent error detection
if not result.get("kernel_code") or not isinstance(result.get("kernel_code"), str):
    raise AgentError(f"Agent error: No kernel code produced for {op_name}.")
if "rate limit" in result.get("message", "").lower():
PaliC (Contributor):

I think this check and the check below are redundant / shouldn't get hit
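
In other words, the missing-kernel check alone should be enough; a self-contained sketch of the reduced form (the check_result helper name is made up, not from the PR):

class AgentError(Exception):
    pass

def check_result(result: dict, op_name: str) -> str:
    # The absence of usable kernel code is the only signal needed here;
    # no separate rate-limit probe on the message text.
    kernel_code = result.get("kernel_code")
    if not kernel_code or not isinstance(kernel_code, str):
        raise AgentError(f"Agent error: No kernel code produced for {op_name}.")
    return kernel_code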

except AgentError as e:
    print(f"❌ {e}")
    return "", False
except AgentError as e:
PaliC (Contributor):

why the two exceptions here?

}

try:
    # Agent error detection before compilation
    if not kernel_code or not isinstance(kernel_code, str):
        raise AgentError(
PaliC (Contributor):

Similar to above, I'm not sure when the two conditionals below would get hit if there isn't kernel code.

if torch.cuda.is_available():
    torch.cuda.empty_cache()
    torch.cuda.synchronize()

correct_count = 0
total_count = 0
correctness_results = []
# todo: this is to protect against IMA errors, however, we should make this work / make sense with multiple workers
PaliC (Contributor):

this is a legit todo lol

ParamThakkar123 (Contributor, Author):

Oh, I am sorry. Putting that back 😅

@@ -247,6 +247,10 @@ def test_kernel_correctness(

    return is_correct, feedback_info

except AgentError as e:
    feedback_info["agent_error"] = str(e)
PaliC (Contributor):

By default this should be empty
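
That is, the field starts out empty and is only populated when an AgentError actually fires; a self-contained sketch (check_kernel is a made-up name, not the PR's function):

class AgentError(Exception):
    pass

def check_kernel(kernel_code: str) -> dict:
    feedback_info = {"agent_error": ""}  # empty by default
    try:
        if not kernel_code:
            raise AgentError("Agent error: No kernel code produced.")
    except AgentError as e:
        feedback_info["agent_error"] = str(e)  # only set when an AgentError is hit
    return feedback_info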

        raise AgentError("Agent error: Empty response or rate limit encountered.")
    return content
except anthropic.AnthropicError as e:
    raise AgentError(f"Anthropic API error: {e}")
PaliC (Contributor):

So there is a difference between the API not working (i.e. getting rate limited) and the agent producing a not-useful output (i.e. something without a kernel). I think you'd want to distinguish between the two.
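
A sketch of what keeping them separate could look like (this assumes the anthropic SDK's Messages API and its AnthropicError hierarchy; query_model and the model id are placeholders, not this repo's code):

import anthropic

class AgentError(Exception):
    """Unusable model output (e.g. empty response): the only class worth feeding back to the agent."""

def query_model(client: anthropic.Anthropic, prompt: str) -> str:
    # Provider failures (rate limits, outages, auth) are deliberately NOT converted into
    # AgentError here; anthropic.AnthropicError and its subclasses propagate as-is so the
    # experiment harness can see them.
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder model id
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}],
    )
    content = response.content[0].text if response.content else ""
    if not content:
        # An empty/unusable response is the agent's problem, so this one IS an AgentError.
        raise AgentError("Agent error: Empty response from the model.")
    return content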

PaliC (Contributor) commented on Sep 12, 2025:

Also @ParamThakkar123, run pytest to make sure things work.

ParamThakkar123 (Contributor, Author):

Sure @PaliC. I am using pytest for testing, and I have noted all your feedback and suggestions. I will make sure all the code changes are aligned with your feedback and that all of them work. Thank you so much!

Successfully merging this pull request may close the following issue: Add error logging for llm generated code for things that may go wrong before the code is compiled.