Thursday, November 5, 2009

Development of the Domain Name System

[paper]

This paper tells the story of the invention of DNS. Prior to DNS, every host had a plain text file named HOSTS.TXT that contained addressing information for every other host it might want to reach. Clearly this did not scale as the number of hosts grew.

DNS has two components: name servers and resolvers. Name servers hold the addressing information & answer queries, and resolvers find the name servers that hold the answer. Data for each DNS name is a "resource record" (RR) that carries a type, a class, and data. There are two mechanisms for transferring data from source to destination: zones and caching. A zone is a section of the tree controlled, updated, and maintained by a single organization. Resolvers (which can be centrally located or on a host itself) cache the responses they receive for as long as the TTL attached to each record allows. A resolver is configured to point at servers for the root node and the top of the local domain. It then searches "downwards" from there.
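To make the "search downwards, then cache" idea concrete, here is a minimal sketch in Python. It is not how any real resolver is implemented: the `ZONES` table, the server names, the `Resolver` class, and the 60-second TTL are all made up for illustration, and the address comes from a range reserved for documentation.

```python
import time

# Hypothetical toy tree: each "server" either holds the record or refers
# the resolver one step further down the name hierarchy.
ZONES = {
    "root": {"referrals": {"edu": "edu-server"}},
    "edu-server": {"referrals": {"example.edu": "example-server"}},
    "example-server": {"records": {"www.example.edu": ("192.0.2.10", 60)}},
}


class Resolver:
    """Toy resolver: starts at the root, follows referrals, caches by TTL."""

    def __init__(self, zones, clock=time.monotonic):
        self.zones = zones
        self.clock = clock
        self.cache = {}        # name -> (address, expiry time)
        self.queries_sent = 0  # how many servers we had to ask

    def resolve(self, name):
        hit = self.cache.get(name)
        if hit and hit[1] > self.clock():
            return hit[0]  # still fresh, no servers contacted

        server = "root"
        while True:
            self.queries_sent += 1
            zone = self.zones[server]
            if name in zone.get("records", {}):
                address, ttl = zone["records"][name]
                self.cache[name] = (address, self.clock() + ttl)
                return address
            # Otherwise follow the referral whose suffix covers this name.
            next_server = None
            for suffix, target in zone.get("referrals", {}).items():
                if name == suffix or name.endswith("." + suffix):
                    next_server = target
                    break
            if next_server is None:
                return None  # nobody below this point knows the name
            server = next_server
```

In this toy setup the first `resolve("www.example.edu")` asks three servers (root, then edu, then example.edu), and a repeat lookup inside the TTL asks none. Once the TTL has passed, the resolver has to walk the tree again.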

Surprises they encountered when DNS use became widespread:

-- Refinement of semantics. The form of data specification ended up being confusing, e.g. how to order multiple IP addresses for a single host.
-- Performance. Initial performance was much worse than they expected. Root servers would get multiple copies of the same query from the same source because their responses were slower than resend timers. Measuring DNS performance precisely, however, proved problematic because of gateway changes, software releases, etc.
-- Negative caching. DNS can respond negatively to a query in two ways: the name does not exist, or the name exists but the requested data for it does not. They expected negative responses to be rare, but they ended up being very common (20-60% of all queries) because of transition confusion (use of old addresses and shortcuts, etc). They expected this to drop off, but it stayed between 15-50%. This is because a user would type one bad address and then their application would automatically try other potential addresses, in the process making a bunch of negative queries. They suggest that negative responses should be cached like positive ones to address this issue. I'm not sure what kind of TTL should be put on cached negative responses; perhaps there could be a force update command that would clear a negative cache entry when that domain name got registered.
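Here is a rough sketch of what negative caching could look like, written in Python. Everything in it is an assumption on my part rather than something from the paper: the `NegativeCachingResolver` class, the dict standing in for the authoritative server, and the fixed 30-second negative TTL. The point is only that a bounded TTL on negative entries lets a newly registered name show up once the entry expires, even without a force update.

```python
import time


class NegativeCachingResolver:
    """Toy resolver that remembers "no such name" answers for a short time."""

    def __init__(self, authoritative, negative_ttl=30.0, clock=time.monotonic):
        self.authoritative = authoritative  # dict: name -> address (stand-in server)
        self.negative_ttl = negative_ttl    # assumed value, not from the paper
        self.clock = clock
        self.negative_cache = {}            # name -> expiry time
        self.upstream_queries = 0

    def lookup(self, name):
        expiry = self.negative_cache.get(name)
        if expiry is not None:
            if expiry > self.clock():
                return None                 # answered from the negative cache
            del self.negative_cache[name]   # entry expired, ask again

        self.upstream_queries += 1
        address = self.authoritative.get(name)
        if address is None:
            # Remember the failure so repeated bad lookups stay local.
            self.negative_cache[name] = self.clock() + self.negative_ttl
        return address
```

With this sketch, an application that retries the same bad name many times only generates one upstream query per 30 seconds. The cost is that a name registered in the meantime stays invisible until the negative entry expires, which is the tradeoff I was wondering about above.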

Their successes:

-- variable depth hierarchy
-- organizational structuring of names
-- datagram access
-- caching proved crucial!!

Shortcomings:

-- type & class growth -- either demand for new types & classes was misunderstood, or DNS makes new definitions too hard
-- upgrading from HOSTS.TXT -> DNS was hard, but seems like that isn't a long-term issue
-- distribution of control

I like this paper a lot, especially the "surprises" section. The negative caching result is really interesting; I would never have guessed it.

Related paper that might be interesting: Protecting Browsers from DNS Rebinding Attacks. This paper is interesting, but I've never fully understood how the attack works. The basic idea is that attacker.com wants to gain access to an internal resource (e.g. internal.corporate.com). The attacker answers DNS queries for attacker.com with a really short TTL. After the user navigates correctly to attacker.com, the attacker then starts answering DNS queries for attacker.com with the IP address of internal.corporate.com (e.g. 10.10.10.105). A script on the loaded attacker.com page will issue a request for materials at attacker.com; this request will actually go to 10.10.10.105, and the attacker will successfully get the content. (Normally this operation should be denied, because attacker.com and internal.corporate.com are not the same domain.) The part I am confused about here is the administration of DNS. Why can any arbitrary person claim that any random IP address belongs to their domain name?
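To convince myself of the mechanics, here is a toy simulation in Python. It is only a sketch under heavy assumptions: `AttackerDNS`, `same_origin`, `fetch`, the `SERVERS` table, and the public address 203.0.113.7 are all invented for illustration, and real browsers do far more than compare two hostnames. As far as I can tell, the key point is that the owner of a zone decides what its own records say, and nothing in DNS checks that the addresses in those records actually belong to that owner.

```python
class AttackerDNS:
    """Hypothetical name server for attacker.com, run by the attacker."""

    def __init__(self):
        # First answer: the attacker's real server. Later answers: the victim's
        # internal address. TTL 0 means the browser must ask again each time.
        self.answers = ["203.0.113.7", "10.10.10.105"]
        self.calls = 0

    def query(self, name):
        answer = self.answers[min(self.calls, len(self.answers) - 1)]
        self.calls += 1
        return answer, 0  # (address, ttl)


# Stand-in for "what lives at each IP address".
SERVERS = {
    "203.0.113.7": "attacker page with script",
    "10.10.10.105": "internal secret",
}


def same_origin(page_host, request_host):
    # Simplified check: compares hostnames only, never IP addresses.
    return page_host == request_host


def fetch(dns, page_host, request_host):
    """Toy browser request made by a script running on page_host."""
    if not same_origin(page_host, request_host):
        raise PermissionError("cross-origin read blocked")
    ip, _ttl = dns.query(request_host)
    return SERVERS[ip]
```

In this sketch the first `fetch(dns, "attacker.com", "attacker.com")` returns the attacker's page, and the second identical call returns the internal content, because the hostname check passes both times while the address behind the name has changed. A direct request from the attacker.com page to internal.corporate.com is still blocked.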

---------------

Class notes:

-- DNS is eventually consistent
-- Caching: scalability, reliability during outages, faster lookups
-- Caching done in client resolution software + also in name servers

About Me

Berkeley EECS PhD student