This article is part of the series on the rust-http redesign, Teepee.
I say Status-Line, but actually I don’t care about part of it; Status-Line is defined in RFC 2616 (HTTP/1.1) as HTTP-Version SP Status-Code SP Reason-Phrase CRLF, but all I care about in this article is the Status-Code (e.g. 200) and its corresponding Reason-Phrase (e.g. OK).
The current position
At present, rust-http has http::status::Status. Here’s the implementation, distilled:
On the surface this approach looks fine; unfortunately, it turns out to have some problems. For now, the main thing to note is that it treats 404 Not Found as the same as 404 not found but as distinct from 404 File Not Found.
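To make that behaviour concrete, here is a minimal sketch in present-day Rust (not the ~str-era code of rust-http itself). The variant names Ok and UnregisteredStatus come from this article; the NotFound variant, the two-entry table and the constructor name from_code_and_reason are assumptions made for illustration.

```rust
/// A cut-down, enum-based status in the style described above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Status {
    Ok,
    NotFound,
    /// Any code/phrase pair that is not in the known table.
    UnregisteredStatus(u16, String),
}

impl Status {
    /// Hypothetical constructor: the reason phrase is matched against the
    /// known table case-insensitively, which is the behaviour under scrutiny.
    pub fn from_code_and_reason(code: u16, reason: &str) -> Status {
        match (code, reason.to_ascii_lowercase().as_str()) {
            (200, "ok") => Status::Ok,
            (404, "not found") => Status::NotFound,
            _ => Status::UnregisteredStatus(code, reason.to_string()),
        }
    }

    /// The numeric Status-Code, whichever variant this is.
    pub fn code(&self) -> u16 {
        match self {
            Status::Ok => 200,
            Status::NotFound => 404,
            Status::UnregisteredStatus(code, _) => *code,
        }
    }
}
```

With this sketch, from_code_and_reason(404, "Not Found") and from_code_and_reason(404, "not found") both yield NotFound, while from_code_and_reason(404, "File Not Found") yields an UnregisteredStatus that compares unequal to NotFound even though its code is still 404.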
The specification
Later on we’ll look at the practical applications of these things; for now let’s take a look at what the spec says. RFC 2616 (HTTP/1.1), section 6.1.1:
The problem
I have marked the key phrases which the current implementation does not take properly into account. Simply, they boil down to this: the reason phrase is absolutely meaningless to the protocol (sigh). I didn’t notice this, or didn’t pay attention to it, the first time around.
I also treated the field as case insensitive, having observed things written in lowercase sometimes, and having looked at how something else, I forget what, did it. This is incorrect; while in various places the spec defines things as case insensitive, the Reason-Phrase does not fall under such a category. The Reason-Phrase, in all its uselessness, must thus be considered case sensitive.
Well then—can we just drop the reason phrase?
The answer is both yes and no.
Yes: the integrity of the protocol itself is not affected if the reason phrase is altered; the semantics of the protocol lie purely in the Status-Code.
No: at the lowest level, we must provide the Reason-Phrase intact, because it may be meaningful. For example, someone might be writing an HTTP inspection tool for which they wish to report this value, or someone else might be writing a proxy, where changing the value would be a highly suspect move.
No: although the specification declares the number to be all that matters to a machine, there are cases (mostly with unregistered status codes) where people have used the same status code with multiple distinct meanings, and one may need to figure out what that meaning is. For example, the status code 451 has been used by Microsoft as “Redirect” in Exchange ActiveSync (niche, granted), and as “Unavailable for Legal Reasons” now.
So then, how do we reconcile this?
The possible consequences
By the way, this is a genuine problem and must be fixed; as it stands, people will compare statuses and find bugs appearing when servers use different reason phrases and their comparisons suddenly stop working. It might also get people comparing codes, as status.code() == 200, if they know of this deficiency. I do not want either of these to happen.
Some solutions
One solution to the primary symptoms is to change equality checking on a Status (the Eq implementation) to just compare the Status-Code and not the Reason-Phrase. Another method can be provided to check strict equality, inclusive of Reason-Phrase:
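One way this might look in present-day Rust is sketched below. For brevity it models the status as a plain struct rather than the enum, and the method name strict_eq is a hypothetical choice, not something rust-http defines.

```rust
/// Simplified status: a code plus the reason phrase exactly as received.
#[derive(Debug, Clone)]
pub struct Status {
    code: u16,
    reason: String,
}

impl Status {
    pub fn new(code: u16, reason: &str) -> Status {
        Status { code, reason: reason.to_string() }
    }

    pub fn code(&self) -> u16 {
        self.code
    }

    pub fn reason(&self) -> &str {
        &self.reason
    }

    /// Hypothetical strict comparison: the code and the reason phrase must
    /// both match, and the phrase is compared case-sensitively.
    pub fn strict_eq(&self, other: &Status) -> bool {
        self.code == other.code && self.reason == other.reason
    }
}

/// `==` looks at the Status-Code only and ignores the Reason-Phrase.
impl PartialEq for Status {
    fn eq(&self, other: &Status) -> bool {
        self.code == other.code
    }
}

impl Eq for Status {}
```

Here Status::new(404, "Not Found") == Status::new(404, "File Not Found") holds, while strict_eq on the same pair returns false.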
This doesn’t sit especially well with me (in Rust, people expect equality comparison to check everything), but it would serve the purpose.
Another, not incompatible, solution is to make it so that unless the user needs the reason-phrase, a known status is normalised so that 200 All Good will come through as Ok rather than as UnregisteredStatus(200, ~"All Good"). This could be arranged with a response object from a request containing two fields, status: Status containing the normalised status and raw_status: Option<Status> containing, if it differed, the unnormalised status. At the lowest level, of course, the status code will not be normalised.
Either of these solutions will take care of the majority of cases, and each leaves a gap of potentially surprising (hence undesirable) behaviour. I am mildly inclined at present to go with both, but I would like opinions on the matter.
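The normalisation idea could be sketched as follows, again in present-day Rust. The field names status and raw_status are taken from the description above; the Response type, the helper functions and the two-entry table of known statuses are assumptions for illustration only.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Status {
    Ok,
    NotFound,
    UnregisteredStatus(u16, String),
}

/// Known status for a code, if the code is registered (illustrative table).
fn known_status(code: u16) -> Option<Status> {
    match code {
        200 => Some(Status::Ok),
        404 => Some(Status::NotFound),
        _ => None,
    }
}

/// Lowest-level parse: nothing is normalised, and the reason phrase must
/// match the canonical one exactly (case-sensitively) to count as known.
fn raw_status(code: u16, reason: &str) -> Status {
    match (code, reason) {
        (200, "OK") => Status::Ok,
        (404, "Not Found") => Status::NotFound,
        _ => Status::UnregisteredStatus(code, reason.to_string()),
    }
}

/// Hypothetical response object carrying both views of the status.
pub struct Response {
    /// The normalised status.
    pub status: Status,
    /// The status as received, present only if it differed from `status`.
    pub raw_status: Option<Status>,
}

pub fn response_from_status_line(code: u16, reason: &str) -> Response {
    let raw = raw_status(code, reason);
    match known_status(code) {
        // Known code with a non-canonical phrase: normalise, keep the raw form.
        Some(normalised) if normalised != raw => Response {
            status: normalised,
            raw_status: Some(raw),
        },
        // Canonical phrase, or an unknown code: nothing to normalise.
        _ => Response { status: raw, raw_status: None },
    }
}
```

Under these assumptions, 200 All Good arrives with status equal to Ok and raw_status holding UnregisteredStatus(200, "All Good"), whereas 200 OK arrives with raw_status set to None.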
The status code class
This part is still relevant in take two.
One other thing I am going to add is better support for the class of a status. I don’t want people to be writing status.code() >= 400 && status.code() < 500; they should instead be able to write status.class() == ClientError.
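A small sketch of what such a class() method might look like in present-day Rust. The five class names follow the categories in RFC 2616; the StatusClass enum name, the newtype Status and the decision to reject codes outside 100 to 599 at construction time are assumptions.

```rust
/// The five classes of status code, keyed on the first digit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StatusClass {
    Informational, // 1xx
    Success,       // 2xx
    Redirection,   // 3xx
    ClientError,   // 4xx
    ServerError,   // 5xx
}

/// Minimal status wrapper; the code is guaranteed to be in 100..=599.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Status(u16);

impl Status {
    /// Returns None for codes outside the five defined classes.
    pub fn new(code: u16) -> Option<Status> {
        if (100..=599).contains(&code) {
            Some(Status(code))
        } else {
            None
        }
    }

    pub fn code(&self) -> u16 {
        self.0
    }

    pub fn class(&self) -> StatusClass {
        match self.0 / 100 {
            1 => StatusClass::Informational,
            2 => StatusClass::Success,
            3 => StatusClass::Redirection,
            4 => StatusClass::ClientError,
            // Only 5 remains, because `new` restricts the range.
            _ => StatusClass::ServerError,
        }
    }
}
```

A caller can then write status.class() == StatusClass::ClientError instead of spelling out the numeric range.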
The representation technique: enum or struct?
At present Status is an enum. It could also be represented as a simple struct with plenty of constants:
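A rough sketch of that shape in present-day Rust follows. It uses Cow<'static, str> for the reason phrase as a loose stand-in for the SendStr idea mentioned further down, so that both static and owned phrases fit in one type; that choice, the field names and the MY_CUSTOM_STATUS constant are all assumptions for illustration.

```rust
use std::borrow::Cow;

/// Struct-based status: a code and a reason phrase that is either a
/// static string or an owned one.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Status {
    pub code: u16,
    pub reason: Cow<'static, str>,
}

// Built-in statuses become plain constants; note OK rather than Ok.
pub const OK: Status = Status { code: 200, reason: Cow::Borrowed("OK") };
pub const NOT_FOUND: Status = Status { code: 404, reason: Cow::Borrowed("Not Found") };

// A user can declare an unregistered status in exactly the same way
// (hypothetical example; 299 is not a registered code).
pub const MY_CUSTOM_STATUS: Status = Status { code: 299, reason: Cow::Borrowed("Custom") };

impl Status {
    /// Build a status from a phrase only known at run time.
    pub fn with_reason(code: u16, reason: String) -> Status {
        Status { code, reason: Cow::Owned(reason) }
    }
}
```

The derived equality here compares both the code and the phrase, so this sketch leaves the equality question discussed earlier open.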
Is this feasible? Sure. Is it better? I don’t know.
Using a struct and constants would remove the clash on the name Ok which the Result type uses (it would become OK).
An enum uses less memory and is faster to compare on.
Using a struct and constants would allow slightly better ergonomics for others using unregistered statuses (they could create their own statics, rather than needing to have a function that generated it, owing to the ~str contained in the current model, though that could also be changed to use SendStr), leveling the field a little on what is an extremely rare case.
Pattern matching doesn’t work on struct statics (or for that matter, non‐C‐style enums). (Actually, it complains “unsupported constant expr” at the static site, rather than at the match! If you’re careful, you can probably find an ICE nearby—pnkfelix did. Bad.)
At present I’m mildly in favour of the status quo. Either way, if using SendStr, the built‐in statics would still be special if I applied the normalisation technique, unless I were also to retain a mapping of IDs to statuses ([Option<&'static Status>, ..500], it would be, I suppose. Nasty global state).
Bear in mind also that the Method type currently uses the same technique, and the two should, I think, use the same technique in general.
But really, I’m open to being swayed. Do you have an opinion? Same goes for the earlier questions.
Feel free to chip in to the discussion at /r/rust.