[Illustrative] WITH RECURSIVE link expansion #3291
Draft
This illustrates an approach to link expansion using PostgreSQL `WITH RECURSIVE` queries.
This is similar to the approach taken in https://github.com/alphagov/govuk-graphql, but uses plain old ActiveRecord instead of introducing the Sequel ORM.
Potentially, this approach could be used as an alternative to the current approach that uses LinkGraph, NodeCollectionFactory, LinkReference and friends. It could also be an alternative to Dataloader for our main GraphQL implementation.
The general idea is to start with an edition (the `BaseEdition`) and a JSON array describing the link types to follow (the `LinkExpansionRules`).

We can then find linked editions one of four ways - forward link set links, forward edition links, reverse link set links, or reverse edition links. Each of these has its own query, and the four are joined together with a `UNION ALL` (ActiveRecord does this if you pass an array to `.with()`, like we're doing with `.with(..., all_links: [...])`).

The newly found editions are joined with the next level of link types in the lookahead, and the query recurses until we reach the leaves of the JSON structure, or until we can't find any linked editions for a particular branch.
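As a rough in-memory sketch of that traversal (plain Ruby, not the SQL in this PR), the following walks a rules tree level by level and collects linked ids. The `Link` struct, the sample data, and the `type` / `reverse` / `links` keys in the rules are all assumptions made for illustration; the real `LinkExpansionRules` format and tables may differ.

```ruby
require "json"

# Hypothetical stand-in for the link tables. `kind` is :link_set or :edition.
Link = Struct.new(:kind, :source, :target, :type)

LINKS = [
  Link.new(:link_set, "home", "org-a", "organisations"),
  Link.new(:edition, "home", "minister-1", "ministers"),
  Link.new(:link_set, "role-1", "minister-1", "person"),
  Link.new(:edition, "org-b", "org-a", "parent"),
].freeze

# Hypothetical rules: a JSON array of link types, each with optional children.
RULES = JSON.parse(<<~JSON)
  [
    { "type": "ministers", "links": [{ "type": "person", "reverse": true }] },
    { "type": "organisations" }
  ]
JSON

# Look in both link kinds and concatenate the results, loosely mirroring the
# UNION ALL of the per-kind queries. Forward rules follow source -> target,
# reverse rules follow target -> source.
def linked_ids(id, rule)
  %i[link_set edition].flat_map do |kind|
    LINKS.select { |l| l.kind == kind && l.type == rule["type"] }.filter_map do |l|
      if rule["reverse"]
        l.source if l.target == id
      else
        l.target if l.source == id
      end
    end
  end
end

# Each pass pairs newly found ids with the next level of rules. The loop ends
# when a branch runs out of rules or finds no links.
def expand(base_id, rules)
  found = []
  frontier = [[base_id, rules]]
  until frontier.empty?
    frontier = frontier.flat_map do |id, level_rules|
      level_rules.flat_map do |rule|
        linked_ids(id, rule).map do |linked|
          found << { type: rule["type"], from: id, to: linked }
          [linked, rule.fetch("links", [])]
        end
      end
    end
  end
  found
end

puts expand("home", RULES).map { |f| f[:to] }.inspect
# => ["minister-1", "org-a", "role-1"]
```

The point of doing this in SQL instead is that every level is resolved inside one query rather than one lookup per node.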
There are two workarounds in this code which feel a bit ugly to me, but otherwise I actually think it's not too bad (particularly when compared to the existing link expansion code).
Workaround 1: we can't call `jsonb_to_recordset()` directly, sadly, because the query planner assumes it will always return 100 rows, which leads to terrible query plans. To avoid this, we have to wrap it in a subquery and limit the result to `max_links_count`. Ugly, but not show-stoppingly bad.

Workaround 2: the recursive case uses a second `.with()` clause for a few reasons (which I'll skip here, for brevity's sake). Unfortunately ActiveRecord will attempt to pass this to `UNION ALL` without wrapping it in parens, which the postgres parser is not happy about. To force it to be wrapped with parens, I've wrapped it in `Arel.sql(recursive_case.to_sql)`. Ugly, but not show-stoppingly bad.

There's a bit more work we'd need to do to actually use this. It only returns edition ids / content ids, so we'd need to look up the actual editions. We'd also need to handle cases like "there's both a draft and a published edition for this content id" and "there's both an English and a Welsh document for this content id" and "there are both edition and link set links for this link type / edition / content_id combination". I also haven't handled withdrawn editions here.
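For reference, the SQL shapes behind workarounds 1 and 2 can be sketched with plain Ruby string builders. This is only an illustration of the shape, not the code in this PR: the helper names, the `$1` placeholder, and the `type` / `reverse` / `links` column list are assumptions.

```ruby
# Workaround 1 (sketch): wrap jsonb_to_recordset() in a subquery with a LIMIT,
# rather than selecting from it directly. Column names are hypothetical.
def limited_rules_sql(max_links_count)
  limit = Integer(max_links_count) # raises unless given a whole number
  <<~SQL
    SELECT rules.*
    FROM (
      SELECT *
      FROM jsonb_to_recordset($1::jsonb)
        AS r(type text, reverse boolean, links jsonb)
      LIMIT #{limit}
    ) AS rules
  SQL
end

# Workaround 2 (sketch): a UNION ALL operand that begins with its own WITH
# clause needs parentheses around it before PostgreSQL will parse it.
def parenthesise(sql)
  "(#{sql.strip})"
end

# Hypothetical recursive case carrying its own WITH clause.
recursive_case = "WITH next_rules AS (SELECT 1 AS n) SELECT n FROM next_rules"

puts limited_rules_sql(50)
puts ["SELECT 0 AS n", parenthesise(recursive_case)].join("\nUNION ALL\n")
```

Casting the limit with `Integer()` keeps the interpolated value numeric; in the real query a bind parameter may be the better choice.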
The big potential motivation for this would be that this recursive CTE is faster than the Dataloader approach, which might not be true. It is very fast though. Loading the 843 linked edition ids we need to render the ministers index page takes 12ms on my laptop.