Re: Issues-list item "CACHING-CGI"
David W. Morris (dwm@xpasc.com)
Tue, 29 Jul 1997 23:23:12 -0700 (PDT)
On Tue, 29 Jul 1997, Roy T. Fielding wrote:
> >Here's a revised version, to replace the second paragraph
> >in section 13.9:
> >
> > Some HTTP/1.0 cache operators have found that it is dangerous
> > to cache and reuse without revalidation responses to requests
> > for URLs that include any of the strings "cgi-bin", "htbin", or
> > "?", because applications have traditionally used these URLs in
> > conjunction with operations with significant side effects for
> > GET or HEAD methods. However, if such a response includes an
> > explicit, future, expiration time, then this implies that the
> > response may be cached and reused without revalidation until it
> > expires. If such a response includes a Last-Modified or Etag
> > header, this implies that the response may be reused after
> > revalidation (or without revalidation if explicitly fresh).
> >
> > A cache MUST NOT assign a heuristic expiration time to a
> > response for a URL that includes the strings "htbin", "cgi-bin", or
> > "?" in its rel_path part. If such a response does not
> > carry an explicit expiration time, it must be treated as
> > if it expires immediately.
>
> I'm pretty sure I said this before, but I don't know what list.
> I am completely opposed to this change. It is inaccurate to say that
> caching and reusing such responses is "dangerous". The *only* reason
> *some* caches do not provide heuristic caching of such responses is
> because the presence of query-based parameters makes it unlikely to get
> a second "hit" on the cache, and because the absence of a Last-Modified
> (and now Etag) makes it impossible to do an efficient update. In any case,
> this is an optimization which is dependent on the context and number of
> users of the cache, and not a requirement of the protocol.
>
> The protocol already provides mechanisms for marking a response as
> non-cachable. All other responses to a GET request are cachable.
I can't speak for the motivation of old cache authors, but I can speak as
an HTTP/1.0 application author from before any RFCs existed, when one had
to reverse engineer everything. The empirical behavior I observed was that
GET requests which included a query part were not cached.
I support handling HTTP/1.1 responses strictly according to Roy's
position, but I believe something like Jeff's proposed wording is
necessary when a 1.1 cache is covering a 1.0 server.
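
To make that split concrete, here is a rough sketch of how a 1.1 cache
might pick a freshness policy. It is only one reading of the proposal, not
text from any draft: the function names, the (major, minor) version tuple,
and the three policy labels are all made up for illustration.

```python
from urllib.parse import urlsplit

# Substrings named in the proposed wording for section 13.9.
SUSPECT_SUBSTRINGS = ("cgi-bin", "htbin")


def looks_dynamic(url: str) -> bool:
    """Return True if the URL contains "cgi-bin", "htbin", or "?".

    The fragment is dropped first, since it is never sent to the server.
    """
    without_fragment = url.split("#", 1)[0]
    if "?" in without_fragment:
        return True
    path = urlsplit(without_fragment).path
    return any(s in path for s in SUSPECT_SUBSTRINGS)


def freshness_policy(url: str, server_version: tuple, explicit_expiry: bool) -> str:
    """Pick a (hypothetical) policy label for a cached GET response.

    "explicit"  : the response carries its own expiration time; honor it.
    "expired"   : treat the response as if it expires immediately.
    "heuristic" : the cache may assign a heuristic expiration time.
    """
    if explicit_expiry:
        return "explicit"
    # Assumption: the stricter rule applies only to HTTP/1.0 origin servers,
    # which could not mark a response as non-cachable with 1.1 mechanisms.
    if server_version == (1, 0) and looks_dynamic(url):
        return "expired"
    return "heuristic"
```

With this sketch, a query URL served by a 1.0 server without an explicit
expiration time is never given a heuristic lifetime, while the same URL
served by a 1.1 server is cachable unless the server says otherwise.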
Dave Morris