Skip to content

Conversation

@vearutop
Copy link
Member

Reverts #124

It turned out this implementation does not work well with actual types (e.g. you get ["1", "2", "3"] for a enum of []int).

@github-actions
Copy link

Lines Of Code

Language Files Lines Code Comments Blanks Complexity Bytes
Go 7 2388 (-18) 1627 (-13) 275 486 (-5) 628 (-6) 57.6K (-304B)
Go (test) 13 4191 (-31) 3114 (-27) 311 766 (-4) 51 106.1K (-1.2K)

@github-actions
Copy link

Go API Changes

# summary
Inferred base version: v0.3.75
Suggested version: v0.3.76

@github-actions
Copy link

Unit Test Coverage

total: (statements) 79.9%
changed lines: (statements) 100.0%

Coverage of changed lines
File Function Coverage
Total 100.0%
reflect.go 100.0%
reflect.go:1377 reflectEnum 100.0%
Coverage diff with base branch
File Function Base Coverage Current Coverage
Total 79.9% 79.9% (0.0%)
reflect.go searchDeepestSchema 71.4% no function

@github-actions
Copy link

Benchmark Result

Benchmark diff with base branch
name                        old time/op    new time/op    delta
Schema_UnmarshalJSON_raw-4    59.0µs ± 3%    58.5µs ± 0%   ~     (p=0.690 n=5+5)
Schema_UnmarshalJSON-4         462µs ± 1%     462µs ± 0%   ~     (p=0.905 n=5+4)
Schema_MarshalJSON_raw-4      38.7µs ± 0%    38.8µs ± 0%   ~     (p=0.548 n=5+5)
Schema_MarshalJSON-4           184µs ± 1%     184µs ± 1%   ~     (p=0.841 n=5+5)

name                        old alloc/op   new alloc/op   delta
Schema_UnmarshalJSON_raw-4    31.3kB ± 0%    31.3kB ± 0%   ~     (all equal)
Schema_UnmarshalJSON-4         180kB ± 0%     180kB ± 0%   ~     (p=1.000 n=5+5)
Schema_MarshalJSON_raw-4      14.6kB ± 0%    14.6kB ± 0%   ~     (p=1.000 n=5+5)
Schema_MarshalJSON-4          53.9kB ± 0%    53.9kB ± 0%   ~     (p=0.968 n=5+5)

name                        old allocs/op  new allocs/op  delta
Schema_UnmarshalJSON_raw-4       460 ± 0%       460 ± 0%   ~     (all equal)
Schema_UnmarshalJSON-4         1.85k ± 0%     1.85k ± 0%   ~     (all equal)
Schema_MarshalJSON_raw-4         370 ± 0%       370 ± 0%   ~     (all equal)
Schema_MarshalJSON-4             468 ± 0%       468 ± 0%   ~     (all equal)
Benchmark result
name                        time/op
Schema_UnmarshalJSON_raw-4  58.5µs ± 0%
Schema_UnmarshalJSON-4       462µs ± 0%
Schema_MarshalJSON_raw-4    38.8µs ± 0%
Schema_MarshalJSON-4         184µs ± 1%

name                        alloc/op
Schema_UnmarshalJSON_raw-4  31.3kB ± 0%
Schema_UnmarshalJSON-4       180kB ± 0%
Schema_MarshalJSON_raw-4    14.6kB ± 0%
Schema_MarshalJSON-4        53.9kB ± 0%

name                        allocs/op
Schema_UnmarshalJSON_raw-4     460 ± 0%
Schema_UnmarshalJSON-4       1.85k ± 0%
Schema_MarshalJSON_raw-4       370 ± 0%
Schema_MarshalJSON-4           468 ± 0%

@vearutop vearutop merged commit 5525989 into master May 12, 2025
7 checks passed
@vearutop vearutop deleted the revert-124-fix-enums-placement branch May 12, 2025 16:13
Copy link

@llamapreview llamapreview bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto Pull Request Review from LlamaPReview

1. Overview

1.1 Core Changes

  • Primary purpose and scope: This PR reverts a previous change (PR #124) that modified how enum values are placed in the generated JSON schema. The revert aims to fix a critical bug introduced by #124 where enum values for numeric types (e.g., []int) were incorrectly serialized as strings (e.g., ["1", "2", "3"]).
  • Key components modified: The primary changes are in reflect.go (specifically the reflectEnum function) and the corresponding test file reflect_test.go. The logic for determining the target schema for enum placement is reverted, and a test case validating the previous behavior is removed.
  • Cross-component impacts: This change directly affects the structure of the generated JSON schema, particularly for fields involving arrays or maps with associated enums. It impacts any downstream tooling or clients relying on the schema structure, potentially breaking compatibility with tools expecting enums on nested elements (as per the reverted logic) but fixing issues for tools expecting correct primitive type serialization.
  • Business value alignment: The PR prioritizes correcting the serialization of primitive enum types (e.g., integers) over the structural placement of enums within nested types (like array items). This fixes a critical data type mismatch bug but regresses on potentially more spec-compliant placement for container types, representing a trade-off.

2. Critical Findings

2.1 Must Fix (P0🔴)

Issue: Reintroduces JSON Schema Specification Non-Compliance for Container Types

  • Analysis Confidence: High
  • Impact: The reverted code applies enums directly to the schema of container types (like arrays or maps) instead of their items or values. This violates the JSON Schema specification, which requires enums to constrain the values within the container (e.g., items.enum for arrays). This can lead to incorrect validation and client generation issues for tools strictly adhering to the spec for nested structures. For example, an array of strings []string with enum:"a,b" will incorrectly generate {"type": "array", "enum": ["a","b"]} instead of the correct {"type": "array", "items": {"type": "string", "enum": ["a","b"]}}.
  • Resolution: The logic needs to correctly identify container types and apply the enum constraint to the appropriate nested schema (items for arrays, additionalProperties or patternProperties for maps) while ensuring the enum values themselves have the correct primitive type (addressing the original bug of #124).

Issue: Removal of Test Coverage for Nested Enum Placement

  • Analysis Confidence: High
  • Impact: The deletion of TestReflector_Reflect_EnumsPlacement removes validation for how enums are handled in complex scenarios, including nested arrays ([][][]string), maps (map[string]string), and combinations thereof. This significantly increases the risk of future regressions related to enum placement in container types.
  • Resolution: Restore the test case (TestReflector_Reflect_EnumsPlacement). Update its assertions to expect the correct behavior: enums placed within the nested items or additionalProperties schema, AND enum values having the correct primitive type (e.g., [1, 2] for int enums, not ["1", "2"]).

2.2 Should Fix (P1🟡)

Issue: Underlying Type Mismatch Issue Not Addressed

  • Analysis Confidence: High
  • Impact: The original problem leading to the revert (numeric enums becoming strings) indicates an issue in how enum values derived from struct tags (strings) are processed and typed. This revert avoids the symptom but doesn't fix the root cause: the lack of type-aware serialization for enum values based on the field's Go type. This underlying issue might surface again in other contexts.
  • Suggested Solution: Modify the enum.loadFromField logic (or where enum values are finalized) to inspect the actual Go type of the field being processed. Convert the string-based enum values from the tag into the correct target type (e.g., int, float, bool) before assigning them to schema.Enum or target.Enum.

Issue: Lack of Documentation on Current Enum Limitations

  • Analysis Confidence: Medium
  • Impact: With this revert, the library's behavior regarding enum placement for container types is now inconsistent with the JSON Schema specification. Users relying on struct tags for enums on arrays/maps might encounter unexpected schema structures or validation issues without clear guidance.
  • Suggested Solution: Add documentation (e.g., in the README or code comments near reflectEnum) explicitly stating the current limitation: enums defined via struct tags on array/map fields are applied to the container itself, not the items/values, and explain the reasoning (temporary state due to type serialization issues) or link to the relevant issue/PR (#126, #124).

2.3 Consider (P2🟢)

Area: Configurable Enum Placement Strategy

  • Analysis Confidence: Medium
  • Improvement Opportunity: To provide more flexibility and potentially bridge the gap between different interpretations or needs (direct field enum vs. item enum), consider introducing a way to explicitly control enum placement via struct tags. For example, enum:"a,b" could remain the default (current reverted behavior), while enumItems:"a,b" or a modifier like jsonschema:"enum=items" could enforce placement on the items schema for arrays. This would allow users to opt-in to spec-compliant placement where needed.

2.4 Summary of Action Items

  1. P0: Fix enum placement for container types (arrays/maps) to comply with JSON Schema spec (apply to items/additionalProperties).
  2. P0: Restore and update the deleted test case TestReflector_Reflect_EnumsPlacement to cover nested types with correct type expectations.
  3. P1: Implement type-aware processing of enum values derived from struct tags to fix the root cause of the string serialization bug.
  4. P1: Document the current (reverted) behavior and its limitations regarding enum placement on container types.

3. Technical Analysis

3.1 Code Logic Analysis

📁 reflect.go - reflectEnum

  • Submitted PR Code:
    func reflectEnum(schema *Schema, fieldTag reflect.StructTag, fieldVal interface{}) {
    	enum := enum{}
    	enum.loadFromField(fieldTag, fieldVal)

    	if len(enum.items) > 0 {
    		// highlight-start
    		// Reverted: Removed searchDeepestSchema call, applying enum directly to the top-level schema
    		schema.Enum = enum.items
    		// highlight-end
    		if len(enum.names) > 0 {
    			if schema.ExtraProperties == nil {
    				schema.ExtraProperties = make(map[string]interface{}, 1)
    			}

    			schema.ExtraProperties[XEnumNames] = enum.names
    		}
    	}
    }

    // highlight-start
    // Reverted: Removed the helper function that found the deepest nested schema
    // func searchDeepestSchema(in *Schema) *Schema { ... }
    // highlight-end

    // enum can be use for sending enum data that need validate.
    type enum struct {
    	items []interface{}
    	names []string
    }
  • Analysis:
    • Current logic reverts to applying the enum keyword directly to the schema associated with the Go field, regardless of whether it's a primitive, array, or map.
    • This fixes the immediate bug where searchDeepestSchema combined with type-unaware enum values caused incorrect serialization (e.g., []int enums becoming ["1", "2"]).
    • However, for container types (arrays/maps), applying enum to the container itself violates JSON Schema specification, which expects the constraint on the items/values within the container.
    • Edge cases involving complex nested types (arrays of arrays, maps of arrays) are no longer handled correctly regarding enum placement.
  • LlamaPReview Suggested Improvements:
    import (
    	"strconv" // Example: needed for string to number conversion
    	"reflect"
    )

    func reflectEnum(schema *Schema, fieldTag reflect.StructTag, fieldVal interface{}, fieldType reflect.Type) { // Added fieldType parameter
    	enum := enum{}
    	enum.loadFromField(fieldTag, fieldVal)

    	if len(enum.items) == 0 {
    		return
    	}

    	// Determine the actual target schema for the enum based on type
    	targetSchema := schema
    	valueType := fieldType // Type of the value the enum applies to

    	// Check if the field is an array or slice
    	if fieldType.Kind() == reflect.Slice || fieldType.Kind() == reflect.Array {
    		// Ensure Items schema exists
    		if schema.Items == nil {
    			schema.Items = &SchemaOrBool{}
    		}
    		if schema.Items.SchemaOrBool == nil {
    			// Initialize based on element type if possible, otherwise default
    			// This part might need more sophisticated type reflection from Reflector state
    			schema.Items.SchemaOrBool = &SchemaOrBool{TypeObject: &Schema{}}
    		}
    		if schema.Items.SchemaOrBool.TypeObject != nil {
    			targetSchema = schema.Items.SchemaOrBool.TypeObject
    			valueType = fieldType.Elem() // Enum applies to the element type
    		}
    	// Check if the field is a map (simplified: assumes string values for now)
    	// TODO: Handle map keys and more complex map value types if necessary
    	} else if fieldType.Kind() == reflect.Map {
    		if schema.AdditionalProperties == nil {
    			schema.AdditionalProperties = &SchemaOrBool{}
    		}
    		if schema.AdditionalProperties.SchemaOrBool == nil {
    			schema.AdditionalProperties.SchemaOrBool = &SchemaOrBool{TypeObject: &Schema{}}
    		}
    		if schema.AdditionalProperties.SchemaOrBool.TypeObject != nil {
    			targetSchema = schema.AdditionalProperties.SchemaOrBool.TypeObject
    			valueType = fieldType.Elem() // Enum applies to the value type
    		}
    	}

    	// Convert enum values to the correct type based on the valueType
    	typedEnumItems, err := convertEnumValues(enum.items, valueType)
    	if err != nil {
    		// Handle or log error: could not convert enum values to target type
    		// Maybe default to string or skip enum? For now, assign raw items.
    		targetSchema.Enum = enum.items // Fallback or log error
    	} else {
    		targetSchema.Enum = typedEnumItems // Assign type-corrected enum values
    	}


    	// Apply XEnumNames to the schema where the enum is defined
    	if len(enum.names) > 0 {
    		if targetSchema.ExtraProperties == nil {
    			targetSchema.ExtraProperties = make(map[string]interface{}, 1)
    		}
    		targetSchema.ExtraProperties[XEnumNames] = enum.names
    	}
    }

    // Helper function to convert enum values (interface{}) based on target Go type
    func convertEnumValues(items []interface{}, targetType reflect.Type) ([]interface{}, error) {
    	if len(items) == 0 {
    		return items, nil
    	}

    	convertedItems := make([]interface{}, len(items))
    	kind := targetType.Kind()

    	for i, item := range items {
    		itemStr, ok := item.(string) // Assume items from tag are initially strings
    		if !ok {
    			// If item is not string, attempt to use as is or handle error
    			// This might happen if loadFromField changes
    			convertedItems[i] = item // Keep original if not string
    			continue
    		}

    		switch kind {
    		case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
    			val, err := strconv.ParseInt(itemStr, 10, 64)
    			if err != nil {
    				return nil, fmt.Errorf("enum value %q is not a valid integer: %w", itemStr, err)
    			}
    			// Convert to specific int size if necessary, or keep as int64/interface{}
    			convertedItems[i] = val
    		case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
    			val, err := strconv.ParseUint(itemStr, 10, 64)
    			if err != nil {
    				return nil, fmt.Errorf("enum value %q is not a valid unsigned integer: %w", itemStr, err)
    			}
    			convertedItems[i] = val
    		case reflect.Float32, reflect.Float64:
    			val, err := strconv.ParseFloat(itemStr, 64)
    			if err != nil {
    				return nil, fmt.Errorf("enum value %q is not a valid float: %w", itemStr, err)
    			}
    			convertedItems[i] = val
    		case reflect.Bool:
    			val, err := strconv.ParseBool(itemStr)
    			if err != nil {
    				return nil, fmt.Errorf("enum value %q is not a valid boolean: %w", itemStr, err)
    			}
    			convertedItems[i] = val
    		case reflect.String:
    			convertedItems[i] = itemStr // Already a string
    		default:
    			// For other types (structs, interfaces, etc.), enums might not be applicable directly
    			// Or might require custom handling. Defaulting to string representation.
    			convertedItems[i] = itemStr
    		}
    	}
    	return convertedItems, nil
    }

    // Note: reflectEnum would need access to the reflect.Type of the field.
    // This might require passing it down from the caller in Reflector.Reflect.
  • Improvement rationale:
    • Technical benefits: This approach addresses both problems: it places the enum constraint correctly within items or additionalProperties for container types (restoring spec compliance) and attempts to convert the enum values to the correct primitive type based on the Go field's type (fixing the original bug). It uses a new helper convertEnumValues for type conversion.
    • Business value: Generates schemas that are both spec-compliant for structure and correct in terms of data types, leading to more reliable validation and client generation.
    • Risk assessment: Requires careful handling of type reflection and conversion edge cases. The reflectEnum function needs access to the reflect.Type of the field, which might require minor refactoring in the calling code (Reflector.Reflect). Error handling during conversion needs consideration (e.g., what to do if an enum tag value cannot be parsed as the target type).

📁 reflect_test.go - TestReflector_Reflect_EnumsPlacement

  • Submitted PR Code:
    --- a/reflect_test.go
    +++ b/reflect_test.go
    @@ -2648,41 +2648,10 @@
     	require.NoError(t, err)
     	assertjson.EqMarshal(t, `{
     	  "properties":{
    -		"a":{"description":"Hello world!","type":"string","format":"base64"},
    +		"a":{"description":"Hello world!","type":"string","format":"base64"}, // This context seems unrelated to the deleted test below
     		"b":{"description":"I am a RawMessage."},
     		"c":{"description":"I am a RawMessage pointer."}
     	  },
     	  "type":"object"
     	}`, s)
     }
    -
    -func TestReflector_Reflect_EnumsPlacement(t *testing.T) {
    -	r := jsonschema.Reflector{}
    -
    -	type Check struct {
    -		A int                      `json:"a" enum:"1"`
    -		B []string                 `json:"b" enum:"check-string"`
    -		C []withValNamedEnum       `json:"c"`
    -		D [][][]string             `json:"d" enum:"d"`
    -		F map[string]string        `json:"f" enum:"f"`
    -		G map[int]map[string][]int `json:"g" enum:"1"`
    -	}
    -
    -	got, err := r.Reflect(Check{})
    -	require.NoError(t, err)
    -
    -	assertjson.EqMarshal(t, `{
    -	  "definitions": {
    -		"JsonschemaGoTestWithValNamedEnum": {"enum": [""],"type": "string","x-enum-names": ["n:"]}
    -	  },
    -	  "properties":{
    -		"a":{"enum":["1"],"type":"integer"},
    -		"b":{"items":{"enum":["check-string"],"type":"string"},"type":["array","null"]},
    -		"c":{"items":{"$ref":"#/definitions/JsonschemaGoTestWithValNamedEnum"},"type":["array","null"]},
    -		"d":{"items":{"items":{"items":{"enum":["d"],"type":"string"},"type":"array"},"type":"array"},"type":["array","null"]},
    -		"f":{"additionalProperties":{"type":"string"},"enum":["f"],"type":["object","null"]},
    -		"g":{"additionalProperties":{"additionalProperties":{"items":{"type":"integer"},"type":"array"},"type":"object"},"enum":["1"],"type":["object","null"]}
    -	  },
    -	  "type":"object"
    -	}`, got)
    -}
  • Analysis:
    • The removal of TestReflector_Reflect_EnumsPlacement eliminates tests specifically designed to validate enum handling for various types, including primitives (int), simple arrays ([]string), arrays of custom types, deeply nested arrays ([][][]string), and maps (map[string]string, map[int]map[string][]int).
    • This leaves a gap in verifying the correctness of enum placement and typing, especially for the complex cases that the original PR #124 attempted to address.
  • LlamaPReview Suggested Improvements:
    import (
    	"testing"
    	"github.com/stretchr/testify/require"
    	"github.com/swaggest/assertjson"
    	"github.com/swaggest/jsonschema-go"
    )

    // Restore the test case
    func TestReflector_Reflect_EnumsPlacement(t *testing.T) {
    	r := jsonschema.Reflector{}

    	// Keep the original struct definition
    	type Check struct {
    		A int                      `json:"a" enum:"1"`
    		B []string                 `json:"b" enum:"check-string"`
    		// C []withValNamedEnum       `json:"c"` // Assuming withValNamedEnum handles its own enum correctly
    		D [][][]string             `json:"d" enum:"d"`
    		F map[string]string        `json:"f" enum:"f"`
    		G map[int]map[string][]int `json:"g" enum:"1"` // Enum on map keys is complex, focus on values if applicable
    	}

    	got, err := r.Reflect(Check{})
    	require.NoError(t, err)

    	// Update assertions to expect CORRECT placement and types
    	// This assumes the suggested fix in reflectEnum is applied
    	assertjson.EqMarshal(t, `{
    	  "definitions": {
    		// "JsonschemaGoTestWithValNamedEnum": {"enum": [""],"type": "string","x-enum-names": ["n:"]} // Keep if relevant
    	  },
    	  "properties":{
    		"a":{"enum":[1],"type":"integer"},                                                                                                // Expect number 1, not string "1"
    		"b":{"items":{"enum":["check-string"],"type":"string"},"type":["array","null"]},                                                  // Correct: enum in items
    		// "c":{"items":{"$ref":"#/definitions/JsonschemaGoTestWithValNamedEnum"},"type":["array","null"]},                               // Keep if relevant
    		"d":{"items":{"items":{"items":{"enum":["d"],"type":"string"},"type":"array"},"type":"array"},"type":["array","null"]},          // Correct: enum in deepest items
    		"f":{"additionalProperties":{"enum":["f"],"type":"string"},"type":["object","null"]},                                             // Correct: enum in additionalProperties (for map values)
    		"g":{"additionalProperties":{"additionalProperties":{"items":{"enum":[1],"type":"integer"},"type":"array"},"type":"object"},"type":["object","null"]} // Correct: enum in deepest items, and number 1
    	  },
    	  "type":"object"
    	}`, got)
    }

    // Dummy type for compilation if needed for the test struct
    // type withValNamedEnum string
    // func (w withValNamedEnum) Enum() []interface{} { return []interface{}{""} }
    // func (w withValNamedEnum) EnumNames() []string { return []string{"n:"} }
  • Improvement rationale:
    • Technical benefits: Restores crucial test coverage for various enum scenarios. The updated assertions validate both the correct placement (inside items or additionalProperties) and the correct typing of enum values (e.g., numbers for int fields).
    • Business value: Ensures that the library correctly handles a wider range of common enum use cases, preventing regressions and improving reliability for users employing enums with complex data structures.

3.2 Key Quality Aspects

  • Testing strategy and coverage: The removal of TestReflector_Reflect_EnumsPlacement represents a significant regression in test coverage for enum functionality, particularly concerning nested types. Restoring and updating this test is critical. The existing CI checks (lint, other tests) passed, but they didn't cover the specific scenarios removed.
  • Documentation needs: The change in behavior (reverting to potentially non-spec-compliant placement) necessitates updated documentation to manage user expectations and explain the current limitations or trade-offs.

4. Overall Evaluation

  • Technical assessment: This PR successfully fixes a critical bug related to the incorrect serialization of primitive enum types (e.g., int enums becoming strings). However, it achieves this by reverting to a previous state that is non-compliant with the JSON Schema specification regarding enum placement for container types (arrays/maps). The underlying issue of type-aware enum value processing remains unaddressed.
  • Business impact: Fixes a significant data type corruption bug, which is valuable. However, the reintroduction of spec non-compliance for nested enums might negatively impact users relying on strict validation or code generation based on the schema structure for arrays/maps. It represents a trade-off between two different kinds of correctness.
  • Risk evaluation: High risk of causing issues for users who depend on spec-compliant enum placement within container types. Medium risk that the underlying type-mismatch bug will reappear in other contexts if not addressed directly. Low risk in the sense that it fixes a definite bug reported by the author.
  • Notable positive aspects and good practices: Addresses a clear and impactful bug (#124 regression). The revert is straightforward.
  • Implementation quality: The revert itself is simple and correct. However, it leaves the codebase in a state with known specification deviations and reduced test coverage for the affected area.
  • Final recommendation: Request Changes 🟡. While fixing the primitive type serialization bug is important, reverting to a spec-non-compliant state for container enums and removing test coverage is not ideal. The recommended path is to:
    1. Re-introduce the logic for placing enums within items/additionalProperties (similar to the reverted #124).
    2. Crucially, add type-aware conversion logic (as suggested in P1/Code Logic Analysis) to ensure enum values match the target Go type, fixing the original bug of #124.
    3. Restore and update the deleted test case to validate both correct placement and correct typing.

💡 LlamaPReview Community
Have feedback on this AI Code review tool? Join our GitHub Discussions to share your thoughts and help shape the future of LlamaPReview.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant