Part 3: Implement our resolver
So far we’ve been making our queries to 8.8.8.8, and letting 8.8.8.8 do all the work of figuring out the IP address for example.com.
Now we’re going to switch gears and figure out the IP address for example.com on our own.
Our goal is to write a resolve function that we call like this: resolve("example.com", TYPE_A).
First, we’ll import the functions we wrote in the previous parts:
from part_1 import header_to_bytes, question_to_bytes, encode_dns_name
from part_2 import DNSHeader, DNSQuestion, DNSRecord, DNSPacket
from part_2 import decode_name, parse_header, parse_question, parse_dns_packet
from part_2 import ip_to_string
3.1: don’t ask for recursion
We need to make a small fix to our build_query function from part 1. Previously when we built our query, we were asking a DNS resolver (a cache), so we set flags to RECURSION_DESIRED. Now we’re asking an authoritative nameserver (the source of truth), so we need to set flags=0 instead.
import random

TYPE_A = 1
CLASS_IN = 1

def build_query(domain_name, record_type):
    name = encode_dns_name(domain_name)
    id = random.randint(0, 65535)
    header = DNSHeader(id=id, num_questions=1, flags=0)  # changed this line
    question = DNSQuestion(name=name, type_=record_type, class_=CLASS_IN)
    return header_to_bytes(header) + question_to_bytes(question)
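To make the difference concrete, here is a minimal sketch of how the flags field ends up on the wire. It assumes that RECURSION_DESIRED from Part 1 is 1 << 8 (the RD bit of the 16-bit flags field), and it packs the header directly with struct rather than going through header_to_bytes. The pack_header function is a hypothetical stand-in for illustration, not the Part 1 code.

```python
import struct

RECURSION_DESIRED = 1 << 8  # assumed value: the RD bit of the 16-bit flags field


def pack_header(query_id, flags, num_questions=1):
    # Six unsigned 16-bit big-endian fields: id, flags, then the four
    # section counts (questions, answers, authorities, additionals).
    return struct.pack("!HHHHHH", query_id, flags, num_questions, 0, 0, 0)


recursive = pack_header(0x1234, RECURSION_DESIRED)
iterative = pack_header(0x1234, 0)

print(recursive.hex())  # bytes 3-4 (the flags) are 01 00
print(iterative.hex())  # bytes 3-4 (the flags) are 00 00
```

The only difference between the two headers is that one bit in the flags field.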
3.2: write a send_query function
Next, let’s write a function that asks a DNS server about a domain name. This is almost exactly the same as the code from section 1.5: we just call parse_dns_packet at the end.
import socket

def send_query(ip_address, domain_name, record_type):
    query = build_query(domain_name, record_type)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(query, (ip_address, 53))
    data, _ = sock.recvfrom(1024)
    return parse_dns_packet(data)
There’s nothing special going on here – we just build the query, send it, and parse the response.
Let’s run it just to see that it’s working:
send_query("8.8.8.8", "example.com", TYPE_A).answers[0]
DNSRecord(name=b'example.com', type_=1, class_=1, ttl=18366, data=b']\xb8\xd8"')
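One thing send_query glosses over: UDP packets can get lost, and then recvfrom will wait forever. Here is a sketch of one way to guard against that. The send_udp helper and its port and timeout parameters are hypothetical additions for illustration. It takes already-built query bytes so that it stands on its own.

```python
import socket


def send_udp(ip_address, payload, port=53, timeout=5.0):
    # The with-block closes the socket even if an exception is raised.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # After this, recvfrom raises socket.timeout if no reply arrives in time.
        sock.settimeout(timeout)
        sock.sendto(payload, (ip_address, port))
        data, _ = sock.recvfrom(1024)
    return data
```

A caller could catch socket.timeout and retry, or move on to a different nameserver. This toy resolver does neither.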
We can also query for example.com’s TXT records, just for fun:
TYPE_TXT = 16
send_query("8.8.8.8", "example.com", TYPE_TXT).answers
[]
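Our parser leaves TXT data as raw bytes. For reference, TXT data is a sequence of length-prefixed strings: one length byte, then that many bytes of text. Here is a sketch of decoding it. The parse_txt_data helper is hypothetical and the sample bytes are made up for illustration.

```python
def parse_txt_data(data):
    # TXT data is one or more strings, each prefixed by a single length byte.
    strings = []
    i = 0
    while i < len(data):
        length = data[i]
        strings.append(data[i + 1 : i + 1 + length])
        i += 1 + length
    return strings


sample = b"\x05hello\x0bv=spf1 -all"  # made-up example data, not a real response
print(parse_txt_data(sample))  # [b'hello', b'v=spf1 -all']
```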
3.3: improve our parsing a little bit
We’re going to need to deal with one more record type here: the NS record type. This record type says “hey, I don’t have the answer, but this other server does, ask them instead”.
The data in an NS record is a domain name (the name of that other server), so we need some code to parse it.
import struct

TYPE_A = 1
TYPE_NS = 2

def parse_record(reader):
    name = decode_name(reader)
    data = reader.read(10)
    type_, class_, ttl, data_len = struct.unpack("!HHIH", data)
    # It would be more hygienic here to store the raw data and the
    # parsed result in separate fields in DNSRecord, but we're lazy.
    if type_ == TYPE_NS:  # here's the code we're adding
        data = decode_name(reader)
    elif type_ == TYPE_A:
        data = ip_to_string(reader.read(data_len))
    else:
        data = reader.read(data_len)
    return DNSRecord(name, type_, class_, ttl, data)
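To see what the "!HHIH" format string is doing, here is a small self-contained sketch that builds the bytes of one A record by hand and parses them back. It uses a simplified read_name that does not handle DNS compression (unlike decode_name from Part 2), and the record itself is assembled by hand for illustration.

```python
import struct
from io import BytesIO


def read_name(reader):
    # Simplified: length-prefixed labels ending in a zero byte, no compression.
    parts = []
    while (length := reader.read(1)[0]) != 0:
        parts.append(reader.read(length))
    return b".".join(parts)


# name, then type (2 bytes), class (2), ttl (4), data length (2), then the data
raw = (
    b"\x07example\x03com\x00"
    + struct.pack("!HHIH", 1, 1, 300, 4)
    + bytes([93, 184, 216, 34])
)

reader = BytesIO(raw)
name = read_name(reader)
type_, class_, ttl, data_len = struct.unpack("!HHIH", reader.read(10))
ip = ".".join(str(b) for b in reader.read(data_len))
print(name, type_, class_, ttl, ip)  # b'example.com' 1 1 300 93.184.216.34
```

The fixed part is always 10 bytes (2 + 2 + 4 + 2), which is why parse_record reads exactly 10 bytes after the name.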
Now let’s redefine parse_dns_packet from Part 2 to use our new parse_record function.
from io import BytesIO
from part_2 import parse_header, parse_question, decode_name

def parse_dns_packet(data):
    reader = BytesIO(data)
    header = parse_header(reader)
    questions = [parse_question(reader) for _ in range(header.num_questions)]
    answers = [parse_record(reader) for _ in range(header.num_answers)]
    authorities = [parse_record(reader) for _ in range(header.num_authorities)]
    additionals = [parse_record(reader) for _ in range(header.num_additionals)]
    return DNSPacket(header, questions, answers, authorities, additionals)
3.4: query the root nameserver
Every DNS query starts with a root nameserver, and "198.41.0.4" is the IP address for one of the root nameservers – a.root-servers.net.
Before we write our resolve function, let’s play around a bit to see what things look like.
response = send_query("198.41.0.4", "google.com", TYPE_A)
First, let’s look at the list of answers. This is empty – 198.41.0.4 doesn’t know what the IP address for google.com is.
response.answers
[]
Next, let’s look at the list of “authority” records. These are saying “ask one of the .com nameservers, like a.gtld-servers.net: they have the answer you need”.
response.authorities
[DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'e.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'b.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'j.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'm.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'i.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'f.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'a.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'g.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'h.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'l.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'k.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'c.gtld-servers.net'),
DNSRecord(name=b'com', type_=2, class_=1, ttl=172800, data=b'd.gtld-servers.net')]
Finally, let’s look at the additional records. These are giving us the IP addresses for all of the servers mentioned in the “authority” section – for example, the IP for e.gtld-servers.net is 192.12.94.30.
response.additionals
[DNSRecord(name=b'e.gtld-servers.net', type_=1, class_=1, ttl=172800, data='192.12.94.30'),
DNSRecord(name=b'e.gtld-servers.net', type_=28, class_=1, ttl=172800, data=b' \x01\x05\x02\x1c\xa1\x00\x00\x00\x00\x00\x00\x00\x00\x000'),
DNSRecord(name=b'b.gtld-servers.net', type_=1, class_=1, ttl=172800, data='192.33.14.30'),
DNSRecord(name=b'b.gtld-servers.net', type_=28, class_=1, ttl=172800, data=b' \x01\x05\x03#\x1d\x00\x00\x00\x00\x00\x00\x00\x02\x000'),
DNSRecord(name=b'j.gtld-servers.net', type_=1, class_=1, ttl=172800, data='192.48.79.30'),
DNSRecord(name=b'j.gtld-servers.net', type_=28, class_=1, ttl=172800, data=b' \x01\x05\x02p\x94\x00\x00\x00\x00\x00\x00\x00\x00\x000'),
DNSRecord(name=b'm.gtld-servers.net', type_=1, class_=1, ttl=172800, data='192.55.83.30'),
DNSRecord(name=b'm.gtld-servers.net', type_=28, class_=1, ttl=172800, data=b' \x01\x05\x01\xb1\xf9\x00\x00\x00\x00\x00\x00\x00\x00\x000'),
DNSRecord(name=b'i.gtld-servers.net', type_=1, class_=1, ttl=172800, data='192.43.172.30'),
DNSRecord(name=b'i.gtld-servers.net', type_=28, class_=1, ttl=172800, data=b' \x01\x05\x039\xc1\x00\x00\x00\x00\x00\x00\x00\x00\x000'),
DNSRecord(name=b'f.gtld-servers.net', type_=1, class_=1, ttl=172800, data='192.35.51.30')]
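The type_=28 records above are AAAA records (IPv6 addresses), which our parser leaves as raw bytes. Here is a sketch of turning them into readable addresses with the standard library’s ipaddress module. TYPE_AAAA and ipv6_to_string are names introduced here for illustration.

```python
import ipaddress

TYPE_AAAA = 28


def ipv6_to_string(data):
    # IPv6Address accepts the 16 raw bytes and prints the compressed form.
    return str(ipaddress.IPv6Address(data))


# The 16 data bytes of the AAAA record for e.gtld-servers.net shown above
raw = bytes.fromhex("200105021ca100000000000000000030")
print(ipv6_to_string(raw))  # 2001:502:1ca1::30
```

We only use IPv4 addresses in this resolver, so we don’t need this, but it makes the output easier to read.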
where did we get “198.41.0.4” from?
You might be wondering: where did we get this IP address 198.41.0.4 from? Isn’t that cheating?
Real DNS resolvers actually do hardcode the IP addresses of the root nameservers. You have to start somewhere: if you’re implementing DNS, you can’t use DNS to look up the root nameservers’ IP addresses. Some links:
the code in bind which hardcodes the root nameserver IPs. You can see there that many of them haven’t changed since the year 2000.
3.5: query e.gtld-servers.net
The root nameserver told us to ask e.gtld-servers.net, so let’s do that. (We could have picked k.gtld-servers.net instead – the choice we’re making is arbitrary.)
The additional section tells us its IP is 192.12.94.30. So let’s ask that IP what the IP address for google.com is.
response = send_query("192.12.94.30", "google.com", TYPE_A)
Let’s look at the list of answers again. This is empty – e.gtld-servers.net doesn’t have the IP address for google.com either.
response.answers
[]
Next, let’s look at the list of authorities.
response.authorities
[DNSRecord(name=b'google.com', type_=2, class_=1, ttl=172800, data=b'ns2.google.com'),
DNSRecord(name=b'google.com', type_=2, class_=1, ttl=172800, data=b'ns1.google.com'),
DNSRecord(name=b'google.com', type_=2, class_=1, ttl=172800, data=b'ns3.google.com'),
DNSRecord(name=b'google.com', type_=2, class_=1, ttl=172800, data=b'ns4.google.com')]
This is telling us to ask ns1.google.com, ns2.google.com, ns3.google.com, etc.
Next, the additional records:
response.additionals
[DNSRecord(name=b'ns2.google.com', type_=28, class_=1, ttl=172800, data=b' \x01H`H\x02\x004\x00\x00\x00\x00\x00\x00\x00\n'),
DNSRecord(name=b'ns2.google.com', type_=1, class_=1, ttl=172800, data='216.239.34.10'),
DNSRecord(name=b'ns1.google.com', type_=28, class_=1, ttl=172800, data=b' \x01H`H\x02\x002\x00\x00\x00\x00\x00\x00\x00\n'),
DNSRecord(name=b'ns1.google.com', type_=1, class_=1, ttl=172800, data='216.239.32.10'),
DNSRecord(name=b'ns3.google.com', type_=28, class_=1, ttl=172800, data=b' \x01H`H\x02\x006\x00\x00\x00\x00\x00\x00\x00\n'),
DNSRecord(name=b'ns3.google.com', type_=1, class_=1, ttl=172800, data='216.239.36.10'),
DNSRecord(name=b'ns4.google.com', type_=28, class_=1, ttl=172800, data=b' \x01H`H\x02\x008\x00\x00\x00\x00\x00\x00\x00\n'),
DNSRecord(name=b'ns4.google.com', type_=1, class_=1, ttl=172800, data='216.239.38.10')]
This is telling us that the IPv4 address for ns1.google.com is 216.239.32.10 (plus the IPv6 addresses, and the IPv4 addresses for the other nameservers). These additional records that give us a nameserver’s IP address are sometimes called “glue records”.
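One way to picture glue records is as a lookup table from nameserver name to IP address. Here is a sketch, using a stand-in Record tuple instead of the DNSRecord class from Part 2 and a hypothetical glue_ips helper, with values copied from the response above.

```python
from collections import namedtuple

TYPE_A = 1
TYPE_NS = 2

# Stand-in for DNSRecord, keeping only the fields this sketch needs.
Record = namedtuple("Record", ["name", "type_", "data"])


def glue_ips(authorities, additionals):
    # Map each NS target in the authority section to its IPv4 glue address, if any.
    ips = {r.name: r.data for r in additionals if r.type_ == TYPE_A}
    return {
        ns.data: ips.get(ns.data)
        for ns in authorities
        if ns.type_ == TYPE_NS
    }


authorities = [
    Record(b"google.com", TYPE_NS, b"ns2.google.com"),
    Record(b"google.com", TYPE_NS, b"ns1.google.com"),
]
additionals = [
    Record(b"ns2.google.com", TYPE_A, "216.239.34.10"),
    Record(b"ns1.google.com", TYPE_A, "216.239.32.10"),
]
print(glue_ips(authorities, additionals))
```

A nameserver with no matching A record in the additional section maps to None.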
Finally, let’s ask 216.239.32.10 for the IP address for google.com:
send_query("216.239.32.10", "google.com", TYPE_A).answers
[DNSRecord(name=b'google.com', type_=1, class_=1, ttl=300, data='172.217.13.110')]
It worked! Hooray! You can see the IP address for google.com at the end there (data='...').
The actual IP address will depend on where in the world you run the code, because google.com has different IP addresses in different places in the world. This is a pretty common thing; you can look up “GeoDNS” for more.
3.6: write a (wrong) resolve function
Now, let’s write a function to do all the steps we did above:
def get_answer(packet):
    # return the first A record in the Answer section
    for x in packet.answers:
        if x.type_ == TYPE_A:
            return x.data

def get_nameserver_ip(packet):
    # return the first A record in the Additional section
    for x in packet.additionals:
        if x.type_ == TYPE_A:
            return x.data

def resolve_wrong(domain_name, record_type):
    nameserver = "198.41.0.4"
    while True:
        print(f"Querying {nameserver} for {domain_name}")
        response = send_query(nameserver, domain_name, record_type)
        if ip := get_answer(response):
            return ip
        elif nsIP := get_nameserver_ip(response):
            nameserver = nsIP
        else:
            raise Exception("something went wrong")
resolve_wrong("google.com", TYPE_A)
Querying 198.41.0.4 for google.com
Querying 192.5.6.30 for google.com
Querying 216.239.34.10 for google.com
'172.217.13.110'
resolve_wrong("facebook.com", TYPE_A)
Querying 198.41.0.4 for facebook.com
Querying 192.5.6.30 for facebook.com
Querying 129.134.30.12 for facebook.com
'157.240.241.35'
Everything’s looking good! Our function works!
But when we try twitter.com, things go terribly wrong:
resolve_wrong("twitter.com", TYPE_A)
Querying 198.41.0.4 for twitter.com
Querying 192.5.6.30 for twitter.com
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
Cell In [21], line 1
----> 1 resolve_wrong("twitter.com", TYPE_A)
Cell In [18], line 11, in resolve_wrong(domain_name, record_type)
9 nameserver = nsIP
10 else:
---> 11 raise Exception("something went wrong")
Exception: something went wrong
3.7: what went wrong?
Let’s look at what happens when we query 192.12.94.30 for twitter.com:
response = send_query('192.12.94.30', 'twitter.com', TYPE_A)
response.answers
[]
response.authorities
[DNSRecord(name=b'twitter.com', type_=2, class_=1, ttl=172800, data=b'a.r06.twtrdns.net'),
DNSRecord(name=b'twitter.com', type_=2, class_=1, ttl=172800, data=b'b.r06.twtrdns.net'),
DNSRecord(name=b'twitter.com', type_=2, class_=1, ttl=172800, data=b'c.r06.twtrdns.net'),
DNSRecord(name=b'twitter.com', type_=2, class_=1, ttl=172800, data=b'd.r06.twtrdns.net'),
DNSRecord(name=b'twitter.com', type_=2, class_=1, ttl=172800, data=b'b.u06.twtrdns.net'),
DNSRecord(name=b'twitter.com', type_=2, class_=1, ttl=172800, data=b'a.u06.twtrdns.net'),
DNSRecord(name=b'twitter.com', type_=2, class_=1, ttl=172800, data=b'c.u06.twtrdns.net'),
DNSRecord(name=b'twitter.com', type_=2, class_=1, ttl=172800, data=b'd.u06.twtrdns.net')]
response.additionals
[]
What’s going on here is that the .com nameserver (e.gtld-servers.net) has told us who to ask next (a.r06.twtrdns.net), but it hasn’t given us the IP address.
So we’re stuck – or are we?
Luckily, we’re a DNS resolver! And if we just call our resolve_wrong function on a.r06.twtrdns.net, we can figure out its IP.
resolve_wrong('a.r06.twtrdns.net', TYPE_A)
Querying 198.41.0.4 for a.r06.twtrdns.net
Querying 192.12.94.30 for a.r06.twtrdns.net
Querying 205.251.195.207 for a.r06.twtrdns.net
'205.251.192.179'
This gives us the IP address we need to continue on our way:
send_query('205.251.192.179', 'twitter.com', TYPE_A).answers
[DNSRecord(name=b'twitter.com', type_=1, class_=1, ttl=1800, data='104.244.42.193')]
twitter.com’s IP address is 104.244.42.193! Hooray.
3.8: write our final resolve function
Our resolve_wrong function was almost perfect – it just needs to handle one more case where we’re not given the nameserver IP address and we need to look it up.
def get_nameserver(packet):
    # return the first NS record in the Authority section
    for x in packet.authorities:
        if x.type_ == TYPE_NS:
            return x.data.decode('utf-8')

def resolve(domain_name, record_type):
    nameserver = "198.41.0.4"
    while True:
        print(f"Querying {nameserver} for {domain_name}")
        response = send_query(nameserver, domain_name, record_type)
        if ip := get_answer(response):
            return ip
        elif nsIP := get_nameserver_ip(response):
            nameserver = nsIP
        # New case: we got a nameserver's name but not its IP, so look up the IP
        elif ns_domain := get_nameserver(response):
            nameserver = resolve(ns_domain, TYPE_A)
        else:
            raise Exception("something went wrong")
Let’s try it out:
resolve("twitter.com", TYPE_A)
Querying 198.41.0.4 for twitter.com
Querying 192.5.6.30 for twitter.com
Querying 198.41.0.4 for a.r06.twtrdns.net
Querying 192.5.6.30 for a.r06.twtrdns.net
Querying 205.251.195.207 for a.r06.twtrdns.net
Querying 205.251.192.179 for twitter.com
'104.244.42.1'
It works! Hooray!
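If you want to trace the logic without touching the network, here is a sketch that runs the same three-way decision against canned responses. Everything in it is a stand-in: Record and Packet replace the Part 2 classes, first combines the three helper functions into one, the send function is passed in as a parameter, and FAKE_NETWORK is a hand-written table. Its IP addresses are copied from the trace above, but the record names are partly placeholders.

```python
from collections import namedtuple

TYPE_A = 1
TYPE_NS = 2
ROOT = "198.41.0.4"

Record = namedtuple("Record", ["name", "type_", "data"])
Packet = namedtuple("Packet", ["answers", "authorities", "additionals"])

COM_NS = Record(b"a.gtld-servers.net", TYPE_A, "192.5.6.30")

# Canned responses keyed by (nameserver IP, domain name). Placeholder names
# are marked; only the record types and data matter to the logic below.
FAKE_NETWORK = {
    (ROOT, "twitter.com"): Packet([], [], [COM_NS]),
    ("192.5.6.30", "twitter.com"): Packet(
        [], [Record(b"twitter.com", TYPE_NS, b"a.r06.twtrdns.net")], []
    ),
    (ROOT, "a.r06.twtrdns.net"): Packet([], [], [COM_NS]),
    ("192.5.6.30", "a.r06.twtrdns.net"): Packet(
        [], [], [Record(b"ns.placeholder.example", TYPE_A, "205.251.195.207")]
    ),
    ("205.251.195.207", "a.r06.twtrdns.net"): Packet(
        [Record(b"a.r06.twtrdns.net", TYPE_A, "205.251.192.179")], [], []
    ),
    ("205.251.192.179", "twitter.com"): Packet(
        [Record(b"twitter.com", TYPE_A, "104.244.42.1")], [], []
    ),
}


def first(records, type_):
    # Return the data of the first record with the given type, or None.
    for r in records:
        if r.type_ == type_:
            return r.data
    return None


def resolve(domain_name, send_query, log):
    nameserver = ROOT
    while True:
        log.append((nameserver, domain_name))
        response = send_query(nameserver, domain_name)
        if ip := first(response.answers, TYPE_A):
            return ip
        elif ns_ip := first(response.additionals, TYPE_A):
            nameserver = ns_ip
        elif ns_domain := first(response.authorities, TYPE_NS):
            # No glue: resolve the nameserver's own name first.
            nameserver = resolve(ns_domain.decode("utf-8"), send_query, log)
        else:
            raise Exception("something went wrong")


log = []
result = resolve("twitter.com", lambda ns, name: FAKE_NETWORK[(ns, name)], log)
for nameserver, name in log:
    print(f"Querying {nameserver} for {name}")
print(result)
```

The six logged queries come out in the same order as the real run: two for twitter.com, three for the nameserver’s own name, and one final query for twitter.com.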
We’re all done!
We’ve finished our toy DNS resolver! If you’d like to do more, there are a bunch of exercises on the next page that extend it to be a little more like a real DNS resolver.