Caching Tutorial for Web Authors and Webmasters
June 21, 1999
This is an informational document. Although
technical in nature, it attempts to make the concepts involved
understandable and applicable in real-world situations. Because of
this, some aspects of the material are simplified or omitted, for
the sake of clarity. If you are interested in the minutiae of the
subject, please explore the References and Further Information
at the end.
What's a Web Cache? Why do people use them?
A Web cache sits between one or more Web servers (also known as
origin servers) and one or more clients, and watches requests
for HTML pages, images and files (collectively known as
objects) come by, saving copies of the responses for itself. Then,
if there is another request for the same object, it will use the
copy that it has, instead of asking the origin server for it again.
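The behaviour described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names SimpleCache and fake_origin and the example URL are hypothetical, the "origin server" is a plain function rather than a real network request, and the sketch ignores freshness entirely.

```python
from typing import Callable, Dict


class SimpleCache:
    """Toy cache keyed by URL. It never expires or revalidates anything."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0  # how many times the origin was contacted

    def get(self, url: str) -> bytes:
        # Serve the stored copy if this object has been seen before.
        if url in self._store:
            return self._store[url]
        # Otherwise ask the origin server and keep a copy for next time.
        self.origin_requests += 1
        body = self._fetch(url)
        self._store[url] = body
        return body


def fake_origin(url: str) -> bytes:
    # Hypothetical stand-in for a real origin server.
    return ("contents of " + url).encode("utf-8")


cache = SimpleCache(fake_origin)
first = cache.get("http://www.example.com/logo.gif")
second = cache.get("http://www.example.com/logo.gif")
```

Both calls return the same bytes, but only the first one reaches the origin; the second is answered from the stored copy.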
There are two main reasons that Web caches are used:
- To reduce latency - Because the request is
satisfied from the cache (which is closer to the client) instead of
the origin server, it takes less time for the client to get the
object and display it. This makes Web sites seem more
responsive.
- To reduce traffic - Because each object is
only fetched from the server once, it reduces the amount of
bandwidth used by a client. This saves money if the client is
paying by volume of traffic, and keeps their bandwidth requirements
lower and more manageable.
Kinds of Web Caches
Browser Caches
If you examine the preferences dialog of any modern browser
(like Internet Explorer or Netscape), you'll probably notice a
'cache' setting. This lets you set aside a section of your
computer's hard disk to store objects that you've seen, just for
you. The browser cache works according to fairly simple rules. It
will check to make sure that the objects are fresh, usually once a
session (that is, once in the current invocation of the
browser).
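The once-per-session rule can be illustrated with a small Python sketch. It is a simplified model, not how any particular browser is implemented: SessionCache, fetch and validate are hypothetical names, and the freshness check is reduced to a function that returns True when the stored copy is still good.

```python
class SessionCache:
    """Toy browser cache that checks each object at most once per session."""

    def __init__(self, fetch, validate):
        self._fetch = fetch        # hypothetical: downloads the object
        self._validate = validate  # hypothetical: True if the stored copy is still good
        self._store = {}           # survives across sessions, like a disk cache
        self._checked = set()      # URLs already checked in the current session

    def start_new_session(self):
        # Restarting the browser forgets which objects were already checked.
        self._checked.clear()

    def get(self, url):
        if url not in self._store:
            # Never seen before: download it; a new copy needs no check.
            self._store[url] = self._fetch(url)
            self._checked.add(url)
        elif url not in self._checked:
            # First use in this session: check the stored copy once.
            if not self._validate(url, self._store[url]):
                self._store[url] = self._fetch(url)
            self._checked.add(url)
        return self._store[url]


calls = []  # records every contact with the (simulated) origin server


def fetch(url):
    calls.append(("fetch", url))
    return b"page body"


def validate(url, body):
    calls.append(("validate", url))
    return True


browser_cache = SessionCache(fetch, validate)
page = "http://www.example.com/"
browser_cache.get(page)            # first visit: fetched from the origin
browser_cache.get(page)            # same session: served without any check
browser_cache.start_new_session()
browser_cache.get(page)            # new session: checked once
browser_cache.get(page)            # same session again: no further check
```

After these four requests, the calls list holds one fetch and one validate; the other two requests never left the cache.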
This cache is useful when a user hits the 'back' button to go
to a page they've already seen. Also, if you use the same
navigation images throughout your site, they'll be served from the
browser cache almost instantaneously.
Proxy Caches
Web proxy caches work on the same principle, but on a much larger
scale. Proxies serve hundreds or thousands of users in the same
way; large corporations and ISPs often set them up on their
firewalls.
Because proxy caches usually have a large number of users behind
them, they are very good at reducing latency and traffic. That's
because popular objects are requested only once, and served to a
large number of clients.
Most proxy caches are deployed by large companies or ISPs that
want to reduce the amount of Internet bandwidth that they use.
Because the cache is shared by a large number of users, there is a
large number of shared hits (objects that are requested by
a number of clients). Hit rates of 50% or greater are
not uncommon. Proxy caches are a type of shared cache.
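To make the idea of a hit rate concrete, here is a rough Python sketch that replays a made-up request log through a shared cache and reports the fraction of requests it could answer itself. The function name shared_hit_rate, the client names and the paths are all invented for illustration, and the model assumes unlimited storage and objects that never go stale.

```python
from typing import Iterable, Set, Tuple


def shared_hit_rate(requests: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (client, url) requests answered from one shared cache."""
    seen: Set[str] = set()
    hits = 0
    total = 0
    for _client, url in requests:
        total += 1
        if url in seen:
            hits += 1      # some client already caused this object to be stored
        else:
            seen.add(url)  # first request for this object goes to the origin
    return hits / total if total else 0.0


# Hypothetical log: three clients behind the same proxy.
request_log = [
    ("alice", "/index.html"),
    ("alice", "/logo.gif"),
    ("bob", "/index.html"),
    ("bob", "/logo.gif"),
    ("carol", "/index.html"),
    ("carol", "/news.html"),
]
rate = shared_hit_rate(request_log)
```

In this log, three of the six requests repeat an object that another client already asked for, so the rate comes out at 0.5. Each client on its own would have had no hits at all, which is why sharing the cache matters.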
- What's a Web Cache? Why do people use them?
- Kinds of Web Caches
- Browser Caches
- Proxy Caches
- Aren't Web Caches bad for me? Why should I help them?
- How Web Caches Work
- How (and how not) to Control Caches
- HTML Meta Tags vs. HTTP Headers
- Pragma HTTP Headers (and why they don't work)
- Controlling Freshness with the Expires HTTP Header
- Cache-Control HTTP Headers
- Validators and Validation
- Tips for Building a Cache-Aware Site
- Writing Cache-Aware Scripts
- Frequently Asked Questions
- A Note About the HTTP
- Implementation Notes - Web Servers
- Implementation Notes - Server-Side Scripting
- References and Further Information
- About This Document
Aren't Web Caches bad for me? Why should I help them?