Development of the Domain Name System

What is the Problem?

The original solution for naming was to share single file HOSTS.TXT that contained all the hostnames and their corresponding IP addresses. This originally worked fine, since the number of hosts was proportional to the number of timesharing systems. However, as the internet evolved to consist of many networks, each with many hosts, this solution became unscalable. Instead, a distributed database was needed to store the mappings of hostnames to IP addresses. This paper describes the development of the Domain Name System (DNS) to solve this problem.

Summary

Design Requirements

Architecture

Two main components: - name server: repository of mappings of names to data, responsible for answering queries - resolver: interface for client programs to query name servers

The lines are blurred, and particular this architecture allows for organizations to maintain a centralized name server/resolver to be shared by all hosts in the organization, meaning PCs wouldn't need to run their own resolver to resolve names.

Name space

Domains are organized hierarchically, with all names sharing a common "null" root. DNS doesn't make any assumptions about the structure or presentation of names, but does suggest that domains should model the organization they represent.

Data attached to names

DNS allows for arbitrary data to be attached to names, but does organize data into types. Each name has corresponding resource records (RRs) that contain a type and class, as well as unstructured application data. Type represents the abstract resource, and class represents the protocol family/instance, or in some case functionality (e.g. a universal mail registry).

Types and classes were originally expected to be often extended, but it is pretty rare in practice, and more bits than necessary were allocated to these fields.

Interestingly, when responding to queries, a name server is free to return any data in wants in addition to the requested data. It can therefore anticipate future queries to cut down on the number of round trips. For example, root servers include both the host address, and name when passing back the name of a host.

Database distribution

Zones

A complete description of a contiguous portion of the domain name space, along with some "pointers" to other zones. Zones can be either a single node, or the whole tree, but are typically a subtree.

An organization can get control of a zone by persuading its parent to insert RRs in its zone to mark a zone division, e.g. the CS dept. got cs.washington.edu by having washington.edu insert RRs to mark a zone boundary between washington.edu and cs.washington.edu.

An organization managing a zone should provide redundancy by having multiple name servers for the zone, and should maintain a "master file" and make it available within a "master" server. Then, secondary servers are either manually refreshed, or use a zone refresh algorithm which queries serial numbers and updates if it has increase. Zone transfers happen over TCP.

NSs can support any number of zones, which may or may not be contiguous. In fact, they don't even need to be in corresponding zone. When an NS responds to a query without cached data, it marks that response as "authoritative".

Caching

Each RR has an associated TTL in seconds, which is the maximum time a resolver can reuse the cached record (zero TTL means no caching). The administrator of a zone sets the TTL for each RR as part of their zone definition.

Cached answers should be "as good" as authoritative answers, and the design seeks to enable updates before TTL expiration. If both are available, the authoritative answer should be preferred.

As am optimization, negative responses should also be cached, e.g. non existent domain, or domain with no data associated.

Root servers

Resolvers search "downward" from domains they already know. They usually also have "hints" pointing at root servers and the top of their local domain. Thus, if a resolver can find a root server, it can find any domain. On the other hand, if a resolver is partitioned, it can at least resolve names in its local domain.

Root servers must be highly available and distributed geographically.

Key Insights

Notable Design Details/Strengths

Limitations/Weaknesses

Summary of Key Results

Open Questions