-
Notifications
You must be signed in to change notification settings - Fork 7
Added Error Logging #164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Added Error Logging #164
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
This is in the right direction, however, we have a colleague whose using this code to run experiments (@shaahins please take a look), so I want to be careful about touching this code.
Generally I have two points of feedback!
For this toy agent, there is a distinction between 1) Errors in the llm output (mostly we can't pull a kernel from it) and 2) our api provider doing something weird like rate limiting. When putting feedback back into the llm we only care about the 1st error class (so Agent Error should only cover the first case) as the 2nd is not actionable by the agent. (ie. Claude can't do anything about us not paying our anthropic bill lol). It is somewhat useful to have an error class for 2 for when we run experiments, but it is not necessary, so I wouldn't include it in this PR.
The second point of feedback is there is a lot of redundancy in your code. Generally, if we can't pull a kernel out of the code (in this iteration of the agent), that's enough to say "ohh something is wrong here".
# Agent error detection | ||
if not result.get("kernel_code") or not isinstance(result.get("kernel_code"), str): | ||
raise AgentError(f"Agent error: No kernel code produced for {op_name}.") | ||
if "rate limit" in result.get("message", "").lower(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this check and the check below are redundant / shouldn't get hit
except AgentError as e: | ||
print(f"❌ {e}") | ||
return "", False | ||
except AgentError as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the two exceptions here?
} | ||
|
||
try: | ||
# Agent error detection before compilation | ||
if not kernel_code or not isinstance(kernel_code, str): | ||
raise AgentError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar to above, I'm not sure when the below two conditinals would get hit if there isn't kernel code
if torch.cuda.is_available(): | ||
torch.cuda.empty_cache() | ||
torch.cuda.synchronize() | ||
|
||
correct_count = 0 | ||
total_count = 0 | ||
correctness_results = [] | ||
# todo: this is to protect against IMA errors, however, we should make this work / make sense with multiple workers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a legit todo lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. I am sorry putting that back 😅
@@ -247,6 +247,10 @@ def test_kernel_correctness( | |||
|
|||
return is_correct, feedback_info | |||
|
|||
except AgentError as e: | |||
feedback_info["agent_error"] = str(e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default this should be empty
raise AgentError("Agent error: Empty response or rate limit encountered.") | ||
return content | ||
except anthropic.AnthropicError as e: | ||
raise AgentError(f"Anthropic API error: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there is a difference between the API not working (ie. getting rate limited) and the agent producing a not useful output (ie. something without a kernel). I think you'd want to distinguish between the two.
Also @ParamThakkar123 run |
Sure @PaliC . I am using pytest to for testing. And all I noted all your feedbacks and suggestion. Will make sure all code changes are aligned with your feedbacks and I would all of them work. Thank you so much! |
Fixes #74