-
Notifications
You must be signed in to change notification settings - Fork 358
Description
What is the issue with the Fetch Standard?
I ran into this with web-platform-tests/wpt#54373 and was asked to open an issue here.
The story starts with #747, which noted that many servers expect Accept-Encoding:identity
to be set for HTTP range requests. The spec was changed in #751 so that Accept-Encoding:identity
is appended to the header list if the Range
header is set. This means that if Accept-Encoding
was previously set to any other values (like gzip, br
, etc), they will be kept.
The rationale given in the issue for using append
rather than set
is that no specified behavior otherwise touches the Accept-Encoding
header:
there cannot be a header named
Accept-Encoding
until that point as that's a forbidden header name. I guess the only exception would be a specification explicitly setting it, but that'd be a bug in that specification.
In practice, it seems every client sets Accept-Encoding
beforehand, probably because the HTTP spec says that a missing Accept-Encoding
allows the server to return any encoding it wants. If the client sets Accept-Encoding
to a wide range of values beforehand, then blindly appending identity
according to the spec will do nothing.
This was noted as a potential pitfall in #747, but dismissed:
youennf:
Reading the html version of the spec, we are currently appending 'identity' to Accept-Encoding header value. If this header is already set, we might combine the header value, probably defeating the purpose of this PR. Maybe there should be a note stating that there should be no Accept-Encoding header until that point or the header should be set and not appended.
annevk:
there cannot be a header named
Accept-Encoding
until that point as that's a forbidden header name. I guess the only exception would be a specification explicitly setting it, but that'd be a bug in that specification.
youennf:
I understand this is a layering violation, and that theoretically, set and append in that case should give the exact same value. In WebKit, our media engine will actually set this header beforehand. If we were to blindly implement the specification, we would probably end-up with 'identity, identity' as the header value for media loads. I do not feel strongly here though.
annevk:
If you blindly implemented all specifications that would not happen, but I see how it can happen if you do it for one, but not the other. I think in that case it's fair that you're on your own though. You're always allowed to implement things in whatever way you want, as long as it's black box indistinguishable. And the tests will help you with such setups.
It was perhaps a bit optimistic to assume the tests would ensure a correct implementation of the spec change. When Firefox tried to implement this in 2022, they made the very mistake originally predicted in #747, and naively appended Accept-Encoding:identity
after having already set it to a broader range of values as an ad-hoc behavior. However, when they saw the failing WPT fetch tests, they did not realize their mistake. Instead, they looked back at the spec, assumed that the incorrect behavior was intended, and changed the tests to allow the incorrect behavior in web-platform-tests/wpt#35804!
I think it takes a fair bit of reading-between-the-lines and carefully examining the spec to tell which headers the spec expects to never be set. This is especially the case because we're looking for headers that the spec does not mention--we can't Ctrl+F here! If an implementation sets any header that the spec assumes was previously absent, incorrect behavior could result if they then follow the spec and append
to it.
Other headers were changed to be append
ed instead of set
in #807; I believe they should be changed back to set
along with Accept-Encoding
. I have also not dug too deeply into the fetch API spec; I don't know if it touches on the topic of implementations setting their own headers, but that would probably be a good thing to clarify as well.