-
Notifications
You must be signed in to change notification settings - Fork 16
[PP-7327] Don't compact GraphQL content items #3589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
874f1cb
to
f643a49
Compare
This was the original motivation for creating this service, but we've since found that it results in a content item that's not valid against the schema for at least some content types For instance, the world index requires a description. Its actual description is nil, but to be valid against the schema we still need to include the property. This service was removing the property altogether because of the top-level compacting In practice the frontend apps are unlikely to be relying on the existence of keys whose values are nil, but we think that producing a schema-valid response is preferable to a minimal one, barring any performance issues This issue was surfaced when trying to expand the compacting to links and all nested objects. This led to more validation errors before closer inspection revealed the issue with the existing processing NOTE: this is a red commit. The live_content_spec won't pass with these changes. Another change to this file is necessary to produce a valid content item for news articles based on their current query - it feels preferable to break that into a separate commit and update the tests after ```rb def edition deep_transform_hash(query_result.dig("data", "edition")) end def deep_transform_hash(hash) hash.map { deep_transform(_1.to_s, _2) }.compact.to_h end def deep_transform(key, value) case [key, value] in [String, Hash] new_value = deep_transform_hash(value) [key, new_value] unless new_value == {} in [String, [Hash, *]] new_value = value.map(&method(:deep_transform_hash)) [key, new_value] unless new_value == [] in [String, nil | "" | [] | {}] nil else [key, value] end end ```
f643a49
to
5dbfc92
Compare
This didn't surface validation issues introduced when adding in some compacting via the GraphqlContentItemService. On the world index, For example: - the schema requires a description property even if it's nil - the service would remove the property from the real content item because the description is nil - the test would include a description via the factory-generated edition, so we weren't testing the behaviour when the description is nil I tried using `GovukSchemas::RandomSchemaGenerator` to generate a more realistic content item, but it's only vaguely realistic. I tried adding a new strategy to the generator to produce a minimal valid content item, but it required significant processing to be in the right shape to pass into a FactoryBot `create` method. I tried PUTting it, but that requires some stubbing We still think this provides some value insofar as telling us if we produce all the expected fields when they are present in the edition, but we could introduce some other kind of processing in the future with which this might not catch issues, hence the note A more thorough approach when introducing new content types is to validate against a replicated database (real data) using `script/live_content/validate` (a bulk version is incoming)
This mimics Content Store behaviour and addresses a schema validation issue. When validating against the schema: - the property being absent is fine (which is what happened when compacting the top-level fields) - presumably it's not required - the property being an empty hash is fine, since neither of its properties are required - the property being nil is not allowed We could leave the property out when not selected by a query and therefore not present in the result, but mimicking Content Store's behaviour seems preferable and doing so keeps the logic simple We could update the `edition_type` to return an empty hash when there's no withdrawn_notice instead of nil, but then when field selections are made on the withdrawn_notice they get set to nil, and you just end up with a more complex invalid object (withdrawn_notice's fields can't be nil) This was surfaced by news articles in the live_content_spec - news articles are the only content type whose query currently selects withdrawn_notice
Removing the compacting has surfaced some more requirements for the input in the live_content_spec. All of these should be required in order to publish an edition of the given type, so hopefully we aren't just filtering out realistic scenarios... ... but this is starting to require quite a bit of test setup overhead
5dbfc92
to
40f8ce8
Compare
news_article: { details: { body: "" } }, | ||
ministers_index: { | ||
details: { body: "", reshuffle: {} }, | ||
}, | ||
news_article: { | ||
details: { | ||
body: "", | ||
change_history: [], | ||
emphasised_organisations: [], | ||
first_public_at: "2016-04-11T10:52:00.000+01:00", | ||
image: { url: "" }, | ||
political: false, | ||
}, | ||
}, | ||
role: { details: { body: "", supports_historical_accounts: false } }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: the approach I've taken to extending this is to check the validation errors and add the data into the edition we create. However, we get some of the same validation errors with real data (checked using script/live_content/bulk_validate), so we shouldn't assume these are all/necessarily issues with the shape of the test data 🤔
Interestingly, Content Store responses aren't necessarily valid: there's an error when validating https://www.gov.uk/api/content/government/news/foreign-secretary-welcomes-second-ipcc-report-on-climate-change. But if we do want to aim for fully valid content items, we should probably be relying more on the bulk validation script. I'm losing faith in this spec being useful given the amount of setup combined with its limitations
The errors reported by RSpec are truncated by default. It's useful to see all of them when there are any so you don't need to rerun the tests after addressing each one
40f8ce8
to
bb5e9df
Compare
We probably won't do this since it just leads to other kinds of validation errors. Per the schemas, some fields are required and can be nil, some are optional but can't be nil, some are required and can't be nil, and neither the current nor this approach address all scenarios. We have a plan to do an 'intelligent' schema-aware compact instead |
Some of these commits were merged separately. The main purpose of this PR has been fulfilled by the new schema-aware compaction |
Unfortunately the compacting we introduced via the
GraphqlContentItemService
results in content items that are invalid against the schema in at least some cases. This removes the compacting and adds a note about the limitations of the live content validation testCloses #3587 (which went further down the line of removing empty fields)