Skip to content

Conversation

@JimFuller-RedHat
Copy link
Collaborator

@JimFuller-RedHat JimFuller-RedHat commented Sep 30, 2025

What

Added api/v1/feedback to capture feedback on feature analysis.

Why

Automation for capturing feedback will eventually support feedback training 'loop' as well as enable us to generate quality KPI.

Note: For now this is a simple log capture ... hence defining everything as strings ... in the future we may consider capturing type and have longer term persistence (database, etc).

How to Test

curl -X POST \
-H "Content-Type: application/json" \
-d '{"feature":"suggest-cwe", "cve_id":"CVE-2025-21879", "actual":"CWE-617","accept":"true"}' \
http://127.0.0.1:9000/api/v1/feedback

should return with HTTP status == 200

{"status":"Feedback received and logged successfully."}%             

and observe following entry in .config/aegis_ai/feedback.log

[12:43:06] INFO     [12:43:06] feedback_logger INFO FEEDBACK RECEIVED | feature: 'suggest-cwe' | cve_id:    main.py:296
                    'CVE-2025-21879' | actual: 'CWE-617' | expected: '' | request_time: '' | accept: 'True'            

Related Tickets

https://issues.redhat.com/browse/AEGIS-67

Summary by Sourcery

Add v1 feedback endpoint to capture and log user feedback for feature analysis

New Features:

  • Add POST /api/v1/feedback endpoint to accept and log user feedback

Enhancements:

  • Introduce Feedback pydantic model with field sanitization and setup_feedback_logger for rotating log capture

Documentation:

  • Add Feedback schema and endpoint to OpenAPI docs, document AEGIS_WEB_FEEDBACK_LOG in env-vars, and update CHANGELOG

Tests:

  • Add tests for feedback submission success, input sanitization, and validation errors with a temporary log fixture

@JimFuller-RedHat JimFuller-RedHat self-assigned this Sep 30, 2025
Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • Instead of manually calling sanitize_input on each field in the endpoint, consider adding Pydantic validators (or a root validator) to sanitize fields at model initialization and centralize that logic.
  • The code currently does accept = feedback.accept in truthy, but Pydantic will already parse accept as a bool—using it directly (e.g. accept = feedback.accept) would be more straightforward and less error-prone.
  • The log message concatenation is missing a separator before the accept: part; adding a consistent delimiter (e.g. | accept:) will improve readability of the feedback logs.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Instead of manually calling sanitize_input on each field in the endpoint, consider adding Pydantic validators (or a root validator) to sanitize fields at model initialization and centralize that logic.
- The code currently does `accept = feedback.accept in truthy`, but Pydantic will already parse `accept` as a bool—using it directly (e.g. `accept = feedback.accept`) would be more straightforward and less error-prone.
- The log message concatenation is missing a separator before the `accept:` part; adding a consistent delimiter (e.g. `| accept:`) will improve readability of the feedback logs.

## Individual Comments

### Comment 1
<location> `src/aegis_ai_web/src/__init__.py:32-38` </location>
<code_context>
+    accept: bool = Field(False)
+
+
+def sanitize_input(text) -> str:
+    """
+    basic content sanitize.
+    """
+    if text:
+        return re.sub(r"[^a-zA-Z0-9 .,!?]", "", text)
+    return ""
</code_context>

<issue_to_address>
**suggestion:** Sanitization may be overly restrictive for some feedback fields.

The regex currently strips characters like underscores and dashes, which may be needed for certain fields. Adjust sanitization logic to allow necessary characters or customize it per field to preserve valid input.

```suggestion
def sanitize_input(text) -> str:
    """
    Basic content sanitize.
    Allows letters, numbers, spaces, common punctuation, underscores, and dashes.
    """
    if text:
        # Allow a-z, A-Z, 0-9, space, period, comma, exclamation, question, underscore, dash
        return re.sub(r"[^a-zA-Z0-9 _.,!?-]", "", text)
    return ""
```
</issue_to_address>

### Comment 2
<location> `src/aegis_ai_web/src/main.py:287-294` </location>
<code_context>
+    Receive feedback, validate, sanitize, and log it to feedback log file.
+    """
+
+    try:
+        sanitized_username = sanitize_input(feedback.feature)
+        sanitized_cve_id = sanitize_input(feedback.cve_id)
+        sanitized_suggested_cwe = sanitize_input(feedback.suggested_cwe_id)
+        sanitized_suggested_impact = sanitize_input(feedback.suggested_impact)
+        sanitized_comment = sanitize_input(feedback.comment)
+
+        accept = feedback.accept in truthy
+
+        log_message = (
</code_context>

<issue_to_address>
**suggestion:** Boolean conversion for 'accept' may be inconsistent with Pydantic validation.

Using 'feedback.accept in truthy' may cause issues if 'accept' is not a string. It's better to use 'feedback.accept' directly and rely on Pydantic's type enforcement.
</issue_to_address>

### Comment 3
<location> `src/aegis_ai_web/tests/test_feedback.py:10-19` </location>
<code_context>
+client = TestClient(app)
+
+
+def test_save_feedback_success(cleanup_feedback_logs):
+    """
+    Test a successful feedback submission with valid data.
+    """
+    feedback_data = {
+        "feature": "suggest-cwe",
+        "cve_id": "CVE-2025-23395",
+        "comment": "Excellent service!",
+        "accept": "true",
+    }
+    response = client.post("/api/v1/feedback", json=feedback_data)
+
+    assert response.status_code == 200
+    assert response.json() == {"status": "Feedback received and logged successfully."}
+
+    assert os.path.exists(feedback_log)
+    with open(feedback_log, "r") as f:
+        log_content = f.read()
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for all optional fields in the Feedback model.

Please add a test that submits feedback with all optional fields populated to verify correct handling and logging.
</issue_to_address>

### Comment 4
<location> `src/aegis_ai_web/tests/test_feedback.py:32-41` </location>
<code_context>
+        assert "feature: 'suggest-cwe'" in log_content
+
+
+def test_save_feedback_sanitization(cleanup_feedback_logs):
+    """
+    Test potentially harmful characters are sanitized.
+    """
+    feedback_data = {
+        "feature": "suggest-cwe",
+        "cve_id": "CVE-2025-23395",
+        "comment": "Trying to inject a\nnewline.",
+        "accept": "true",
+    }
+
+    response = client.post("/api/v1/feedback", json=feedback_data)
+
+    assert response.status_code == 200
+
+    assert os.path.exists(feedback_log)
+    with open(feedback_log, "r") as f:
+        log_content = f.read()
+        # The sanitized log should NOT contain the malicious bits
+        assert "<script>" not in log_content
+        assert "\n" not in log_content
+
+        # Check sanitized version
+        assert "feature: 'suggest-cwe'" in log_content
+        assert "Comment: 'Trying to inject a newline.'" in log_content
+
+
</code_context>

<issue_to_address>
**issue (testing):** Sanitization test assertion may not match actual log output format.

Update the test assertion to use "comment: ..." (lowercase) to match the log output, or consider parametrizing the test to verify all sanitized fields consistently.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@JimFuller-RedHat JimFuller-RedHat marked this pull request as draft September 30, 2025 13:33
@JimFuller-RedHat JimFuller-RedHat force-pushed the capture-feedback branch 8 times, most recently from f01dd0e to a23bf49 Compare October 10, 2025 04:48
@JimFuller-RedHat JimFuller-RedHat marked this pull request as ready for review October 10, 2025 04:52
Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • The OpenAPI schema fields (comment, suggested_cwe_id, suggested_impact) don’t match the Pydantic Feedback model (email, actual, expected, request_time), so please align them for consistency.
  • The sanitize_input function strips out characters like '@' and '.', which will mangle emails—either whitelist common email chars or apply separate email validation.
  • You may want to include the email field in the logged message so you can trace which user submitted each piece of feedback.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The OpenAPI schema fields (comment, suggested_cwe_id, suggested_impact) don’t match the Pydantic Feedback model (email, actual, expected, request_time), so please align them for consistency.
- The sanitize_input function strips out characters like '@' and '.', which will mangle emails—either whitelist common email chars or apply separate email validation.
- You may want to include the email field in the logged message so you can trace which user submitted each piece of feedback.

## Individual Comments

### Comment 1
<location> `src/aegis_ai_web/src/__init__.py:64-69` </location>
<code_context>
+    accept: bool = Field(False)
+
+
+def sanitize_input(text) -> str:
+    """
+    basic content sanitize.
+    """
+    if text:
+        return re.sub(r"[^a-zA-Z0-9, -]", "", text)
+    return ""
+
</code_context>

<issue_to_address>
**suggestion:** Sanitization regex may be overly restrictive for some fields.

The regex may remove valid characters for certain fields, such as '@' and '.' in email addresses. Consider implementing field-specific sanitization to avoid unintended data loss.

Suggested implementation:

```python
def sanitize_input(text: str, field_type: str = "generic") -> str:
    """
    Field-specific content sanitize.
    """
    if not text:
        return ""
    if field_type == "expected":
        # Only allow alphanumeric, comma, space, and dash for 'expected' field
        return re.sub(r"[^a-zA-Z0-9, -]", "", text)
    elif field_type == "email":
        # Allow typical email characters
        return re.sub(r"[^a-zA-Z0-9@._+-]", "", text)
    # Add more field types as needed
    else:
        # Default: allow most printable characters except control chars
        return re.sub(r"[^\x20-\x7E]", "", text)

```

```python
    @field_validator("expected")
    @classmethod
    def sanitize_expected(cls, expected: str) -> str:
        return sanitize_input(expected, field_type="expected")

```

If you have other fields that require specific sanitization (e.g., email, username), add corresponding validators and pass the appropriate `field_type` to `sanitize_input`. For example, for an email field:

```python
@field_validator("email")
@classmethod
def sanitize_email(cls, email: str) -> str:
    return sanitize_input(email, field_type="email")
```
</issue_to_address>

### Comment 2
<location> `src/aegis_ai_web/src/__init__.py:31` </location>
<code_context>
+    def sanitize_feature(cls, feature: str) -> str:
+        return sanitize_input(feature)
+
+    cve_id: Optional[CVEID] = Field("", max_length=50)
+
+    email: Optional[str] = Field("", max_length=100)
</code_context>

<issue_to_address>
**suggestion:** Default value for Optional fields is an empty string instead of None.

Using None as the default for Optional fields avoids confusion between missing and empty values, and aligns with Pydantic's handling of nulls.

Suggested implementation:

```python
    cve_id: Optional[CVEID] = Field(None, max_length=50)

    email: Optional[str] = Field(None, max_length=100)

```

```python
    request_time: Optional[str] = Field(None, max_length=50)

```
</issue_to_address>

### Comment 3
<location> `src/aegis_ai_web/src/main.py:287-297` </location>
<code_context>
         )
+
+
[email protected]("/api/v1/feedback")
+async def save_feedback(feedback: Feedback):
+    """
+    Receive feedback, validate, sanitize, and log it to a separate log file.
+    """
+
+    try:
+        log_message = (
+            f"FEEDBACK RECEIVED | "
+            f"feature: '{feedback.feature}' | "
+            f"cve_id: '{feedback.cve_id}' | "
+            f"actual: '{feedback.actual}' | "
+            f"expected: '{feedback.expected}' | "
+            f"request_time: '{feedback.request_time}' | "
+            f"accept: '{feedback.accept}'"
+        )
+        feedback_logger.info(log_message)
+        return {"status": "Feedback received and logged successfully."}
+
+    except Exception as e:
+        logging.error(f"Failed to process feedback: {e}", exc_info=True)
+        raise HTTPException(
+            status_code=500,
+            detail="An internal error occurred while processing feedback.",
</code_context>

<issue_to_address>
**suggestion:** Feedback logging does not persist all fields from the OpenAPI schema.

Consider updating the logging to include all fields defined in the OpenAPI schema, such as comment, suggested_cwe_id, and suggested_impact, to ensure full traceability.

```suggestion
        log_message = (
            f"FEEDBACK RECEIVED | "
            f"feature: '{feedback.feature}' | "
            f"cve_id: '{feedback.cve_id}' | "
            f"actual: '{feedback.actual}' | "
            f"expected: '{feedback.expected}' | "
            f"request_time: '{feedback.request_time}' | "
            f"accept: '{feedback.accept}' | "
            f"comment: '{feedback.comment}' | "
            f"suggested_cwe_id: '{feedback.suggested_cwe_id}' | "
            f"suggested_impact: '{feedback.suggested_impact}'"
        )
        feedback_logger.info(log_message)
        return {"status": "Feedback received and logged successfully."}
```
</issue_to_address>

### Comment 4
<location> `src/aegis_ai_web/tests/test_feedback.py:10` </location>
<code_context>
+client = TestClient(app)
+
+
+def test_save_feedback_success():
+    """
+    Test a successful feedback submission with valid data.
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding a test for feedback submissions with all optional fields omitted.

Please add a test case with only the required 'feature' field to verify correct handling of minimal valid input.
</issue_to_address>

### Comment 5
<location> `src/aegis_ai_web/tests/test_feedback.py:35` </location>
<code_context>
+        pytest.fail(f"feedback log file was not created at: {feedback_log}")
+
+
+def test_save_feedback_sanitization():
+    """
+    Test simple sanitization.
</code_context>

<issue_to_address>
**suggestion (testing):** Sanitization test could include more diverse malicious input.

Expand the test to cover additional cases like SQL injection patterns, HTML tags, and unicode characters to better validate the sanitization function.
</issue_to_address>

### Comment 6
<location> `src/aegis_ai_web/tests/test_feedback.py:59` </location>
<code_context>
+        pytest.fail(f"feedback log file was not created at: {feedback_log}")
+
+
+def test_save_feedback_validation_error_missing_field():
+    """
+    Test request with missing required field.
</code_context>

<issue_to_address>
**suggestion (testing):** Validation error tests could check for invalid types and overly long input.

Consider adding tests for fields that exceed max_length constraints and for invalid types, such as providing an integer for 'feature', to ensure comprehensive validation coverage.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@JimFuller-RedHat JimFuller-RedHat force-pushed the capture-feedback branch 4 times, most recently from 5dc8c95 to 173d79d Compare October 10, 2025 12:56
@JimFuller-RedHat JimFuller-RedHat force-pushed the capture-feedback branch 2 times, most recently from b0a65a1 to f36c582 Compare October 10, 2025 13:00
Copy link
Collaborator

@kdudka kdudka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. A few minor remarks inline.

@JimFuller-RedHat JimFuller-RedHat merged commit 6907f21 into main Oct 10, 2025
7 of 10 checks passed
@JimFuller-RedHat JimFuller-RedHat deleted the capture-feedback branch October 10, 2025 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants