Re: Comments on HTTP draft [of 23 Nov 1994]
Henrik Frystyk Nielsen (frystyk@ptsun00.cern.ch)
Wed, 30 Nov 94 15:20:50 +0100
> > b) makes it absolutely clear what is required for a server to be stated
> > as conforming to the definition.
>
> Yes. Unfortunately (or fortunately, depending on how you look at it),
> almost all of the protocol is optional, and thus all the strict conformance
> areas must be surrounded by lots of IFs, HOWEVERs, and UNLESSes.
> But, getting it right is still the goal.
That's why there are so many `strongly recommended' ;-)
> > 2.1 The '#' rule implies that no whitespace is allowed after (or before)
> > the commas in a list. Is this correct? For example, in the example
> > in 7.1 there is a space after each comma (which certainly aids
> > readability).
>
> One major problem with the augmented BNF used by RFC822 (and MIME)
> is that it allows any amount of linear white space between tokens,
> but never specifies that in the rule definitions (it is treated as
> a general assumption for all rules). I can understand why -- it is
> much easier to read the rules without the extra clutter -- but it runs
> against my desire for formality.
>
> What do people think? Define a new rule, e.g.
>
> ESP = 1*( [CRLF] LWSP-char )
>
> and insert it everywhere that zero or more linear-white-space is allowed?
> Or, just stick with the current method but add an explanation in 2.1.
Roy and I discussed this and I think that are argued for the current
model. However, an explanation in 2.1 is a good idea. The only place
where this is a problem is in RequestLine and StatusLine where
linear-white-space isn't allowed. Maybe this is not pointed out clearly
enough!
> > (c) The ctext rule seems to be missing some characters (there's an
> > open quote, followed by an open single quote, but neither is
> > closed). Also, shouldn't LF be excluded too?
>
> Hmmm... should comments be allowed to fold (i.e. extend over multiple lines)?
> If so, then CR should not be excluded. If not, then LF needs to be excluded.
As there is no current implementation I know of using comments it could
actually be left out - especially when we discourage usage of
comments. The current spec is a compromise that I am quite happy with
- that is comments are allowed but not on multiple lines.
> Question for HTTP/1.1: Should we attempt to extend MIME C-T-E to be something
> useful, or just change Content-Encoding to be an ordered list of encodings
> rather than the current single-token?
I don't think this field is sufficiently important in HTTP to actually
do anything about it.
> > It would seem to be appropriate that the HTTP protocol specify that
> > Content-Length, in bytes, be Required--at least for Requests.
It is unfortunately not always possible to get the content-length
before transmission and at the same time preserve a reasonable
performance. An example is when a proxy is handling a FTP request and
returns the body to the client using HTTP.
> > (b) Does the server have to read the headers (and data, if any) on a
> > request? For example, if the customizing filter/script doesn't need
> > the information in order to determine the response, is the server
> > permitted to leave the data unread, or could this embarrass some
> > client(s) or TCP/IP stack(s)?
>
> The Request-Headers must be read, though implementations may want to optimize
> and assume that if the Request-Headers have not completed within the first
> read buffer, then the rest are not important. Since that is only a problem
> for SOME BROKEN CLIENTS, most servers could successfully pretend to be
> conformant while implementing the optimization. Obviously, if there is
> any data in the request, the server had better read it (unless the request
> line has already errored-out).
I fully agree - if the headers are not read then we are close to go
back to HTTP version 0.9!
> > 5. Request: The 0.9 requirement here (Simple Request must have Simple
> > Response) is somewhat onerous on a server. Is it possible to relax
> > or remove this requirement yet, or are there still 0.9-only clients
> > in use? I've noticed that at least some Simple Responses will not
> > go through some proxies transparently.
>
> I believe proxies require Full-Requests. Ari?
The CERN proxy talks HTTP 1.0 (almost) with proxy clients even if it
receives a 0.9 response from the origin server. I don't know how many
0.9 applications are still around, but I think that at least this spec
must have the notation documented. I think it can be taken out in futue
releases.
> > 5.2 Method and 5.2.2 Head: I'm surprised that the HEAD method *must* be
> > supported, as it is ill-defined. 5.2.2 simply says that there must
> > be no Object-Body; it seems that the header may or may not be
> > related to the header that would be sent if the method were GET, and
> > in particular, HEAD may as well just return an empty (null) header.
>
> The definition of all the request methods need fleshing-out, but support
> for HEAD will still be required.
>
> > Further, in many cases the cost of determining, building and sending
> > the header is going to be the major part of many transactions, so
> > should clients or proxies be encouraged to use this Method?
>
> That only reflects the cost for the server. The cost to the rest of the
> network is significantly less, and that is what is important here.
For some types of applications it can be useful, however whenever
possible a conditional GET should be used instead!
> > (b) Can the URI returned via a URI-header be a partial URI as
> > described in 5.4? Or does it have to be a full URI? [I infer the
> > latter.]
>
> Many people have requested the former, but I have yet to see an
> implemented example of it. Now that I have separated the URI-header
> definition from Location, we could define it as being allowed for
> URI but not for Location.
Sounds like a good idea!
> > 5.3 HTTP Version: Given that this definition is more rigorous than
> > earlier documents, and hence must be more constraining, it would
> > seem to be necessary to change the version number (perhaps to 1.1)
> > to reflect the stricter conditions for compliance. If the version
> > number is not changed, then the date of the relevant HTTP 1.0
> > document would have to be specified at every reference.
We have focused a lot on keeping the document in accordance with
existing implementations so I think that 1.0 would be appropriate. Roy
has a mail macro saying: "completely unacceptable this will break
existing implemtations!" ;-)
> > 5.4 URI Note 1: What's 'default escaping'? What characters may be
> > 'considered unsafe'? The "should" (probably meant to be "shall"?)
> > implies that a server must comply with these conditions, but they do
> > not seem to be well defined.
The note is taken from RFC 1630 and is meant to indicate that escaping
should be kept at a minimum. Unsafe characters are described in 1630.
> Yes, but proper implementation (where appropriate) of the conditional
> GET protocol will be strongly recommended and required for all new servers.
Refer to earlier note on HEAD!
> > 6.3.3 404 Method Not Allowed: is this only for the defined methods, or
> > should this also be used for a misspelled or unrecognized method
> > name?
>
> Only for methods implemented by the server, but not allowed for the object
> requested.
Misspelling etc are followed by a 400 code
> > 6.3.4 500 Server Error: 502 (twice) and 504 call this "500 Internal
> > Error". All should say "Server Error"?
> > 17 Bad servers: The first paragraph here sounds like a Compliance
> > statement. As such, it should be in the body of the document, not an
> > appendix? The document certainly needs a Compliance section.
>
> Hmmmm, seems a bit overboard to me. Henrik?
I my opinion, the specification itself should not consider `bad'
applications. Especially when so little is actually required in order
to be a HTTP 1.0 application. These things belong in an appendix.
-- cheers --
Henrik Frystyk
frystyk@W3.org
+ 41 22 767 8265
World-Wide Web Project,
CERN, CH-1211 Geneva 23,
Switzerland