[PP-7327] Don't compact GraphQL content items #3589

yndajas · 2025-09-10T10:45:35Z

Unfortunately the compacting we introduced via the GraphqlContentItemService results in content items that are invalid against the schema in at least some cases. This removes the compacting and adds a note about the limitations of the live content validation test

Closes #3587 (which went further down the line of removing empty fields)

This was the original motivation for creating this service, but we've since found that it results in a content item that's not valid against the schema for at least some content types For instance, the world index requires a description. Its actual description is nil, but to be valid against the schema we still need to include the property. This service was removing the property altogether because of the top-level compacting In practice the frontend apps are unlikely to be relying on the existence of keys whose values are nil, but we think that producing a schema-valid response is preferable to a minimal one, barring any performance issues This issue was surfaced when trying to expand the compacting to links and all nested objects. This led to more validation errors before closer inspection revealed the issue with the existing processing NOTE: this is a red commit. The live_content_spec won't pass with these changes. Another change to this file is necessary to produce a valid content item for news articles based on their current query - it feels preferable to break that into a separate commit and update the tests after ```rb def edition deep_transform_hash(query_result.dig("data", "edition")) end def deep_transform_hash(hash) hash.map { deep_transform(_1.to_s, _2) }.compact.to_h end def deep_transform(key, value) case [key, value] in [String, Hash] new_value = deep_transform_hash(value) [key, new_value] unless new_value == {} in [String, [Hash, *]] new_value = value.map(&method(:deep_transform_hash)) [key, new_value] unless new_value == [] in [String, nil | "" | [] | {}] nil else [key, value] end end ```

This didn't surface validation issues introduced when adding in some compacting via the GraphqlContentItemService. On the world index, For example: - the schema requires a description property even if it's nil - the service would remove the property from the real content item because the description is nil - the test would include a description via the factory-generated edition, so we weren't testing the behaviour when the description is nil I tried using `GovukSchemas::RandomSchemaGenerator` to generate a more realistic content item, but it's only vaguely realistic. I tried adding a new strategy to the generator to produce a minimal valid content item, but it required significant processing to be in the right shape to pass into a FactoryBot `create` method. I tried PUTting it, but that requires some stubbing We still think this provides some value insofar as telling us if we produce all the expected fields when they are present in the edition, but we could introduce some other kind of processing in the future with which this might not catch issues, hence the note A more thorough approach when introducing new content types is to validate against a replicated database (real data) using `script/live_content/validate` (a bulk version is incoming)

This mimics Content Store behaviour and addresses a schema validation issue. When validating against the schema: - the property being absent is fine (which is what happened when compacting the top-level fields) - presumably it's not required - the property being an empty hash is fine, since neither of its properties are required - the property being nil is not allowed We could leave the property out when not selected by a query and therefore not present in the result, but mimicking Content Store's behaviour seems preferable and doing so keeps the logic simple We could update the `edition_type` to return an empty hash when there's no withdrawn_notice instead of nil, but then when field selections are made on the withdrawn_notice they get set to nil, and you just end up with a more complex invalid object (withdrawn_notice's fields can't be nil) This was surfaced by news articles in the live_content_spec - news articles are the only content type whose query currently selects withdrawn_notice

Removing the compacting has surfaced some more requirements for the input in the live_content_spec. All of these should be required in order to publish an edition of the given type, so hopefully we aren't just filtering out realistic scenarios... ... but this is starting to require quite a bit of test setup overhead

spec/integration/graphql/live_content_spec.rb

yndajas · 2025-09-10T13:26:33Z

spec/integration/graphql/live_content_spec.rb

-    news_article: { details: { body: "" } },
+    ministers_index: {
+      details: { body: "", reshuffle: {} },
+    },
+    news_article: {
+      details: {
+        body: "",
+        change_history: [],
+        emphasised_organisations: [],
+        first_public_at: "2016-04-11T10:52:00.000+01:00",
+        image: { url: "" },
+        political: false,
+      },
+    },
+    role: { details: { body: "", supports_historical_accounts: false } },


issue: the approach I've taken to extending this is to check the validation errors and add the data into the edition we create. However, we get some of the same validation errors with real data (checked using script/live_content/bulk_validate), so we shouldn't assume these are all/necessarily issues with the shape of the test data 🤔

Interestingly, Content Store responses aren't necessarily valid: there's an error when validating https://www.gov.uk/api/content/government/news/foreign-secretary-welcomes-second-ipcc-report-on-climate-change. But if we do want to aim for fully valid content items, we should probably be relying more on the bulk validation script. I'm losing faith in this spec being useful given the amount of setup combined with its limitations

The errors reported by RSpec are truncated by default. It's useful to see all of them when there are any so you don't need to rerun the tests after addressing each one

yndajas · 2025-09-11T14:14:56Z

We probably won't do this since it just leads to other kinds of validation errors. Per the schemas, some fields are required and can be nil, some are optional but can't be nil, some are required and can't be nil, and neither the current nor this approach address all scenarios. We have a plan to do an 'intelligent' schema-aware compact instead

yndajas · 2025-09-20T02:42:09Z

Some of these commits were merged separately. The main purpose of this PR has been fulfilled by the new schema-aware compaction

yndajas force-pushed the dont-compact-graphql-content-items branch 2 times, most recently from 874f1cb to f643a49 Compare September 10, 2025 13:07

yndajas force-pushed the dont-compact-graphql-content-items branch from f643a49 to 5dbfc92 Compare September 10, 2025 13:10

yndajas added 3 commits September 10, 2025 14:12

yndajas force-pushed the dont-compact-graphql-content-items branch from 5dbfc92 to 40f8ce8 Compare September 10, 2025 13:12

yndajas changed the title ~~[PP-7327] Dont compact graphql content items~~ [PP-7327] Dont compact GraphQL content items Sep 10, 2025

yndajas changed the title ~~[PP-7327] Dont compact GraphQL content items~~ [PP-7327] Don't compact GraphQL content items Sep 10, 2025

mike3985 reviewed Sep 10, 2025

View reviewed changes

spec/integration/graphql/live_content_spec.rb Outdated Show resolved Hide resolved

yndajas commented Sep 10, 2025

View reviewed changes

Print all validation errors in live_content_spec

bb5e9df

The errors reported by RSpec are truncated by default. It's useful to see all of them when there are any so you don't need to rerun the tests after addressing each one

yndajas force-pushed the dont-compact-graphql-content-items branch from 40f8ce8 to bb5e9df Compare September 10, 2025 13:31

yndajas marked this pull request as draft September 11, 2025 14:12

yndajas closed this Sep 20, 2025

yndajas deleted the dont-compact-graphql-content-items branch September 20, 2025 02:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[PP-7327] Don't compact GraphQL content items #3589

[PP-7327] Don't compact GraphQL content items #3589

Uh oh!

yndajas commented Sep 10, 2025

Uh oh!

Uh oh!

yndajas Sep 10, 2025 •

edited

Loading

Uh oh!

yndajas commented Sep 11, 2025

Uh oh!

yndajas commented Sep 20, 2025

Uh oh!

Uh oh!

[PP-7327] Don't compact GraphQL content items #3589

[PP-7327] Don't compact GraphQL content items #3589

Uh oh!

Conversation

yndajas commented Sep 10, 2025

Uh oh!

Uh oh!

yndajas Sep 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yndajas commented Sep 11, 2025

Uh oh!

yndajas commented Sep 20, 2025

Uh oh!

Uh oh!

yndajas Sep 10, 2025 •

edited

Loading