More practical Snabb - Unweaving the Web

Some time ago, in a Hacker News thread, a user proposed the following use case for Snabb:

I have a ChromeCast on my home network, but I want sandbox/log its traffic. I would want to write some logic to ignore video data, because that’s big. But I want to see the metadata and which servers it’s talking to. I want to see when it’s auto-updating itself with new binaries and record them.

Is that a good use case for Snabb Switch, or is there is an easier way to accomplish what I want?

I decided to take this request and implement it as a tutorial. Hopefully, the resulting tutorial can be a valuable piece of information highlighting some of Snabb’s strengths:

Fine-grained control of the data-plane.
Wide variety of solid libraries for protocol parsing.
Rapid prototyping.

Limiting the project’s scope

Before getting my hands dirty with this project, I broke it down into smaller pieces and checked how much of it was already supported in Snabb. To fully implement this project I’d need:

1) To be able to discover Chromecast devices.
2) To identify their network flows.
3) To save the data to disk.

Snabb provides libraries for identifying network flows as well as for capturing packets and filtering them by content. That pretty much covers bullets 2) and 3). However, Snabb doesn’t provide any tool or library to fully support bullet 1). Thus, I’m going to limit the scope of this tutorial to that single feature: discovering Chromecast and similar devices in a local network.

Multicast DNS

A quick look at Chromecast’s Wikipedia article reveals that Chromecast devices rely on a protocol called Multicast DNS (mDNS).

Multicast DNS is standardized as RFC6762. The origin of the protocol goes back to Apple’s Rendezvous, later rebranded as Bonjour. Bonjour is in fact the origin of the more generic concept known as Zeroconf. Zeroconf’s goal is to automatically create usable TCP/IP computer networks when computers or network peripherals are interconnected. It is composed of three main elements:

Addressing: Self-Assigned Link-Local Addressing (RFC2462 and RFC3927). Automatically assigned addresses in the 169.254.0.0/16 network space.
Naming: Multicast DNS (RFC6762). Host name resolution.
Browsing: DNS Service Discovery (RFC6763). The ability of discovering devices and services in a local network.

Multicast DNS and DNS-SD are very similar and are often mixed up, although they are not strictly the same thing. The former describes how to do name resolution in a serverless DNS network, while DNS-SD, although a protocol as well, is a specific use of Multicast DNS.

One of the nicest things about Multicast DNS is that it reuses many of the concepts of DNS. This allowed mDNS to spread quickly and gain fast adoption, since existing software only required minimal changes. What’s more, programmers didn’t need to learn new APIs or study a brand-new protocol.

Today Multicast DNS is featured in a myriad of small devices, ranging from Google Chromecast to Amazon’s FireTV or Philips Hue lights, as well as software such as Apple’s Bonjour or Spotify.

This tutorial is going to focus pretty much on mDNS/DNS-SD. Since Multicast DNS reuses many of the ideas of DNS, I am going to review DNS first. Feel free to skip the next section if you are already familiar with DNS.

DNS basics

The most common use case of DNS is resolving host names to IP addresses:

$ dig igalia.com -t A +short
91.117.99.155

In the command above, flag ‘-t A’ means an Address record. There are actually many different types of DNS records. The most common ones are:

A (Address record). Used to map hostnames to an IPv4 address.
AAAA (IPv6 address record). Used to map hostnames to an IPv6 address.
PTR (Pointer record). Used for reverse DNS lookups, that is, from IP addresses to hostnames.
SOA (Start of authority). DNS can be seen as a distributed database organized in a hierarchical layout of subdomains. A DNS zone is a contiguous portion of the domain space for which a server is responsible. The top-level DNS zone is known as the DNS root zone, which is served by 13 logical root name servers (although there are more than 13 instances) and contains the top-level domains: generic top-level domains (.com, .net, etc) and country code top-level domains.

The command below prints out how the domain www.google.com gets resolved (I trimmed down the output for the sake of clarity).

$ dig @8.8.8.8 www.google.com +trace

; <<>> DiG 9.10.3-P4-Ubuntu <<>> @8.8.8.8 www.google.com +trace
; (1 server found)
;; global options: +cmd
.                       181853  IN      NS      k.root-servers.net.
.                       181853  IN      NS      g.root-servers.net.
.                       181853  IN      NS      j.root-servers.net.
.                       181853  IN      RRSIG   NS 8 0 518400 20180117170000 20180104160000 41824 ....
;; Received 525 bytes from 8.8.8.8#53(8.8.8.8) in 48 ms

com.                    172800  IN      NS      j.gtld-servers.net.
com.                    172800  IN      NS      k.gtld-servers.net.
com.                    172800  IN      NS      l.gtld-servers.net.
com.                    86400   IN      RRSIG   DS 8 1 86400 20180118170000 20180105160000 41824 ...
;; Received 1174 bytes from 199.7.83.42#53(l.root-servers.net) in 44 ms

google.com.             172800  IN      NS      ns2.google.com.
google.com.             172800  IN      NS      ns1.google.com.
google.com.             172800  IN      NS      ns3.google.com.
google.com.             172800  IN      NS      ns4.google.com.

;; Received 664 bytes from 192.26.92.30#53(c.gtld-servers.net) in 44 ms

www.google.com.         300     IN      A       216.58.201.132
;; Received 48 bytes from 216.239.32.10#53(ns1.google.com) in 48 ms

The domain name is resolved in parts, from right to left. First the root zone is consulted, which returns a list of root name servers. The root server l.root-servers.net is then asked about the top-level domain .com and returns a list of generic top-level domain name servers. Name server c.gtld-servers.net is picked and returns another list of name servers for google.com. Finally www.google.com gets resolved by ns1.google.com, which returns the A record containing the IPv4 address of the domain name.

Using DNS it is also possible to resolve an IP address to a domain name.

$ dig -x 8.8.4.4 +short
google-public-dns-b.google.com.

In this case, the type record is PTR. The command above is equivalent to:

$ dig 4.4.8.8.in-addr.arpa -t PTR +short
google-public-dns-b.google.com.

When using PTR records for reverse lookups, the target IPv4 address has to be part of the domain in-addr.arpa. This is a special domain registered under the top-level domain arpa and it’s used for reverse IPv4 lookups. Reverse lookup is the most common use of PTR records, but in fact PTR records are just pointers to a canonical name and other uses are possible, as we will see later.
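
As a small illustration of how the name queried in a reverse lookup is built, here is a minimal sketch in Python. The helper name reverse_name is my own and is only there for illustration; the final line cross-checks it against the standard library’s ipaddress module.

```python
import ipaddress


def reverse_name(ipv4: str) -> str:
    """Build the in-addr.arpa name queried for a reverse (PTR) lookup.

    The octets are written in reverse order because DNS names are
    hierarchical from right to left.
    """
    octets = ipv4.split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"


name = reverse_name("8.8.4.4")
print(name)

# The standard library derives the same name.
assert name == ipaddress.ip_address("8.8.4.4").reverse_pointer
```

The printed name is the one passed to dig in the second command above.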

Summarizing:

DNS helps resolve a host name to an IP address. Other types of record resolution are possible.
DNS is a centralized protocol where DNS servers respond to DNS queries.
DNS names are grouped in zones or domains, forming a hierarchical structure. Each zone’s authority, described by its SOA record, is responsible for the name resolution within its area.

DNS Service Discovery

Unlike DNS, Multicast DNS doesn’t require a central server. Instead, devices listen on port 5353 for DNS queries sent to a multicast address. In the case of IPv4, this destination address is 224.0.0.251. In addition, the destination Ethernet address of an mDNS request must be 01:00:5E:00:00:FB.
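
The Ethernet address is not arbitrary: IPv4 multicast groups map to Ethernet multicast addresses by taking the fixed prefix 01:00:5e and appending the low 23 bits of the group address. A minimal sketch of that mapping in Python (the function name multicast_mac is just an illustrative choice):

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet multicast address.

    The result is the fixed prefix 01:00:5e followed by the low 23 bits
    of the IPv4 group address.
    """
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF
    mac = (0x01005E << 24) | low23
    return ":".join(f"{byte:02x}" for byte in mac.to_bytes(6, "big"))


print(multicast_mac("224.0.0.251"))
```

For the mDNS group this yields the Ethernet address mentioned above, in lowercase.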

The Multicast DNS standard defines the domain name local as a pseudo-TLD (top-level domain) under which hosts and services can register. For instance, a laptop computer might answer to the name mylaptop.local (replace mylaptop with your actual laptop’s name).

$ dig @224.0.0.251 -p 5353 mylaptop.local. +short
192.168.0.13

To discover all the services and devices in a local network, DNS-SD sends a PTR Multicast DNS request asking for the domain name _services._dns-sd._udp.local:

$ dig @224.0.0.251 -p 5353 -t PTR _services._dns-sd._udp.local

The expected result should be a set of PTR records announcing their domain name. In my case the dig command doesn’t print out any PTR records, but using tcpdump I can check I’m in fact receiving mDNS responses:

$ sudo tcpdump "port 5353" -t -qns 0 -e -i wlp3s0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlp3s0, link-type EN10MB (Ethernet), capture size 262144 bytes
44:85:00:4f:b8:fc > 01:00:5e:00:00:fb, IPv4, length 99: 192.168.86.30.58722 > 224.0.0.251.5353: UDP, length 57
54:60:09:fc:d6:04 > 01:00:5e:00:00:fb, IPv4, length 82: 192.168.86.57.5353 > 224.0.0.251.5353: UDP, length 40
54:60:09:fc:d6:04 > 01:00:5e:00:00:fb, IPv4, length 299: 192.168.86.57.5353 > 224.0.0.251.5353: UDP, length 257
54:60:09:fc:d6:04 > 01:00:5e:00:00:fb, IPv4, length 119: 192.168.86.57.5353 > 224.0.0.251.5353: UDP, length 77
f4:f5:d8:d3:de:dc > 01:00:5e:00:00:fb, IPv4, length 299: 192.168.86.61.5353 > 224.0.0.251.5353: UDP, length 257
f4:f5:d8:d3:de:dc > 01:00:5e:00:00:fb, IPv4, length 186: 192.168.86.61.5353 > 224.0.0.251.5353: UDP, length 144

Why dig doesn’t print out the PTR records is still a mystery to me. So instead of dig I used Avahi, the free software implementation of mDNS/DNS-SD, to browse the available devices:

$ avahi-browse -a
+ wlp3s0 IPv4 dcad2b6c-7a21-10c310-568b-ad83b4a3ea1e          _googlezone._tcp     local
+ wlp3s0 IPv4 1ebe35f6-26f1-bc92-318c-9e35fdcbe11d          _googlezone._tcp     local
+ wlp3s0 IPv4 Google-Cast-Group-71010755f10ad16b10c231437a5e543d1dc3 _googlecast._tcp     local
+ wlp3s0 IPv4 Chromecast-Audio-fd7d2b9d29c92b24db10be10661010eebb9f _googlecast._tcp     local
+ wlp3s0 IPv4 Google-Home-d81d02e1e48a1f0b7d2cbac88f2df820  _googlecast._tcp     local
+ wlp3s0 IPv4 dcad2b6c7a2110c310-0                            _spotify-connect._tcp local

Each row identifies a service instance name. The structure of a service instance name is the following:

Service Instance Name = <Instance> . <Service> . <Domain>

For example, consider the following record “dcad2b6c7a2110c310-0._spotify-connect._tcp.local”:

Domain: local. The pseudo-TLD used by Multicast DNS.
Service: _spotify-connect._tcp. The service name consists of a pair of DNS labels. The first label identifies what the service does (_spotify-connect is a service that allows a user to continue playing Spotify from a phone to a desktop computer, and vice versa). The second label identifies what protocol the service uses, in this case TCP.
Instance: dcad2b6c7a2110c310-0. A user-friendly name that identifies the service.
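
The breakdown above can be sketched as a small Python helper. This is only an illustration under a few assumptions of mine: the domain is a single known suffix (local by default), the service is always the last two labels before it, and escaped dots inside instance names are not handled. The names ServiceInstanceName and split_instance_name are hypothetical.

```python
from typing import NamedTuple


class ServiceInstanceName(NamedTuple):
    instance: str
    service: str
    domain: str


def split_instance_name(name: str, domain: str = "local") -> ServiceInstanceName:
    """Split <Instance>.<Service>.<Domain> into its three parts."""
    name = name.rstrip(".")
    suffix = "." + domain
    if not name.endswith(suffix):
        raise ValueError(f"{name!r} is not under the domain {domain!r}")
    rest = name[: -len(suffix)]
    # Split from the right: the service is the pair of labels
    # "_application._protocol" that precedes the domain.
    instance, application, protocol = rest.rsplit(".", 2)
    if not application.startswith("_") or protocol not in ("_tcp", "_udp"):
        raise ValueError(f"{name!r} has no valid service labels")
    return ServiceInstanceName(instance, f"{application}.{protocol}", domain)


print(split_instance_name("dcad2b6c7a2110c310-0._spotify-connect._tcp.local"))
```

Splitting from the right keeps any extra dots on the instance side, which is where they can legitimately appear.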

Besides a PTR record, an instance also replies with several additional DNS records that might be useful for the requester. These extra records accompany the PTR record and are embedded in the DNS additional records field. These extra records are of 3 types:

SRV: Gives the target host and port where the service instance can be reached.
TXT: Gives additional information about this instance, in a structured form using key/value pairs.
A: IPv4 address of the reached instance.
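
To make the TXT record less abstract: on the wire, its data is a sequence of strings, each one prefixed by a single length byte, and DNS-SD stores one key=value pair per string. Below is a minimal Python sketch of that decoding. The function names are hypothetical, and the sample pairs are the ones my Spotify device announces.

```python
def encode_txt(pairs: dict) -> bytes:
    """Encode key/value pairs as TXT RDATA (length-prefixed strings)."""
    rdata = b""
    for key, value in pairs.items():
        entry = f"{key}={value}".encode("utf-8")
        if len(entry) > 255:
            raise ValueError("a TXT string cannot exceed 255 bytes")
        rdata += bytes([len(entry)]) + entry
    return rdata


def parse_txt(rdata: bytes) -> dict:
    """Decode TXT RDATA into a dictionary of key/value pairs."""
    entries = {}
    offset = 0
    while offset < len(rdata):
        length = rdata[offset]
        chunk = rdata[offset + 1 : offset + 1 + length]
        if len(chunk) != length:
            raise ValueError("truncated TXT string")
        key, _, value = chunk.decode("utf-8", "replace").partition("=")
        if key:  # strings without a key are ignored
            entries[key] = value
        offset += 1 + length
    return entries


sample = encode_txt({"CPath": "/zc/0", "VERSION": "1.0", "Stack": "SP"})
print(parse_txt(sample))
```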

Snabb’s DNS-SD

Now that we have a fair understanding of Multicast DNS and DNS-SD, we can start coding the app in Snabb. As in the previous posts, I decided not to paste the code directly here. Instead, I’ve pushed the code to a remote branch and will comment on the most relevant parts. To check out this repo do:

$ git clone https://github.com/snabbco/snabb.git
$ cd snabb
$ git remote add dpino https://github.com/dpino/snabb.git
$ git fetch dpino
$ git checkout dns-sd

Highlights:

The app needs to send a DNS-SD packet through a network interface managed by the OS. I used Snabb’s RawSocket app to do that.
A DNSSD app emits one DNS-SD request every second. This is done in DNSSD’s pull method. There’s a helper class called mDNSQuery that is in charge of composing this request.
The DNSSD app receives responses on its push method. If the response is a Multicast DNS packet, it will print out all the contained DNS records to stdout.
A Multicast DNS packet is composed of a header and a body. The header contains control information such as number of queries, answers, additional records, etc. The body contains DNS records. If the mDNS packet is a response packet, these are the DNS records we would need to print out.
To help me handle Multicast DNS packets I created an MDNS helper class. Similarly, I added a DNS helper class that helps me parse the necessary DNS records: PTR, SRV, TXT and A records.
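
To give an idea of what composing the request and reading the header involve, here is a minimal sketch in Python. It is not the Lua code from the dns-sd branch, and the names encode_name, build_query and parse_header are hypothetical. It only builds the DNS payload (a 12-byte header plus one question); the Ethernet, IP and UDP headers are left out.

```python
import struct

DNS_TYPE_PTR = 12
DNS_CLASS_IN = 1


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    encoded = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) < 64:
            raise ValueError(f"invalid label: {label!r}")
        encoded += bytes([len(raw)]) + raw
    return encoded + b"\x00"


def build_query(name: str, qtype: int = DNS_TYPE_PTR) -> bytes:
    """Build a DNS payload with one question and no other records."""
    # Header fields: id, flags, questions, answers, authority, additional.
    # A multicast query is sent with the id set to zero.
    header = struct.pack("!HHHHHH", 0, 0, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, DNS_CLASS_IN)
    return header + question


def parse_header(payload: bytes) -> dict:
    """Read the fixed 12-byte DNS header."""
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", payload[:12])
    return {
        "id": ident,
        "is_response": bool(flags & 0x8000),  # QR bit
        "questions": qd,
        "answers": an,
        "authority": ns,
        "additional": ar,
    }


query = build_query("_services._dns-sd._udp.local")
print(len(query), parse_header(query))
```

The is_response flag is what tells a query apart from a response when deciding whether there are records worth printing.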

Here is Snabb’s dns-sd command in use:

$ sudo ./snabb dnssd --interface wlp3s0
PTR: (name: _services._dns-sd._udp.local; domain-name: _spotify-connect._tcp )
SRV: (target: dcad2b6c7a2110c310-0)
TXT: (CPath=/zc/0;VERSION=1.0;Stack=SP;)
Address: 192.168.86.55
PTR: (name: _googlecast._tcp.local; domain-name: Chromecast-Audio-fd7d2b9d29c92b24db10be10661010eebb9f)
SRV: (target: 1ebe35f6-26f1-bc92-318c-9e35fdcbe11d)
TXT: (id=fd7d2b9d29c92b24db10be10661010eebb9f;cd=224708C2E61AED24676383796588FF7E;
rm=8F2EE2757C6626CC;ve=05;md=Chromecast Audio;ic=/setup/icon.png;fn=Jukebox;
ca=2052;st=0;bs=FA8FCA9E3FC2;nf=1;rs=;)
Address: 192.168.86.57

Finally, I’d like to share some tricks and practices I used when coding the app:

1) I started small by capturing a DNS-SD request emitted from Avahi. Then I sent that very same packet from Snabb and checked that the response was a Multicast DNS packet:

$ avahi-browse -a

$ sudo tcpdump -i wlp3s0 -w mdns.pcap

Then open mdns.pcap with Wireshark, mark only the request packet and save it to disk. Then use the od command to dump the packet as text:

$ od -j 40 -A x -tx1 mdns_request.pcap
000028 01 00 5e 00 00 fb 44 85 00 4f b8 fc 08 00 45 00
000038 00 55 32 7c 00 00 01 11 8f 5a c0 a8 56 1e e0 00
000048 00 fb e3 53 14 e9 00 41 89 9d 25 85 01 20 00 01
000058 00 00 00 00 00 01 09 5f 73 65 72 76 69 63 65 73
000068 07 5f 64 6e 73 2d 73 64 04 5f 75 64 70 05 6c 6f
000078 63 61 6c 00 00 0c 00 01 00 00 29 10 00 00 00 00
000088 00 00 00

This dumped packet can be copied raw into Snabb, as is done in MDNS’s selftest.

NOTE: The text2pcap command can also be a very convenient tool to convert a dumped packet in text format to a pcap file.
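
It is also instructive to decode the dumped request by hand. The Python sketch below turns the od output above back into bytes and reads a few fields from the Ethernet, IPv4, UDP and DNS headers. The helper names are hypothetical, and read_name assumes the name is not compressed, which holds for this query but not for responses in general.

```python
import struct

OD_DUMP = """\
000028 01 00 5e 00 00 fb 44 85 00 4f b8 fc 08 00 45 00
000038 00 55 32 7c 00 00 01 11 8f 5a c0 a8 56 1e e0 00
000048 00 fb e3 53 14 e9 00 41 89 9d 25 85 01 20 00 01
000058 00 00 00 00 00 01 09 5f 73 65 72 76 69 63 65 73
000068 07 5f 64 6e 73 2d 73 64 04 5f 75 64 70 05 6c 6f
000078 63 61 6c 00 00 0c 00 01 00 00 29 10 00 00 00 00
000088 00 00 00
"""


def od_to_bytes(dump: str) -> bytes:
    """Convert `od -A x -tx1` output to bytes, skipping the offset column."""
    data = bytearray()
    for line in dump.splitlines():
        data += bytes.fromhex("".join(line.split()[1:]))
    return bytes(data)


def read_name(buf: bytes, offset: int):
    """Read an uncompressed DNS name; return it and the offset after it."""
    labels = []
    while buf[offset] != 0:
        length = buf[offset]
        labels.append(buf[offset + 1 : offset + 1 + length].decode("ascii"))
        offset += 1 + length
    return ".".join(labels), offset + 1


frame = od_to_bytes(OD_DUMP)
dst_mac = ":".join(f"{b:02x}" for b in frame[0:6])
ethertype = struct.unpack("!H", frame[12:14])[0]
ip_start = 14
ihl = (frame[ip_start] & 0x0F) * 4             # IPv4 header length in bytes
dst_ip = ".".join(str(b) for b in frame[ip_start + 16 : ip_start + 20])
udp_start = ip_start + ihl
src_port, dst_port = struct.unpack("!HH", frame[udp_start : udp_start + 4])
dns_start = udp_start + 8
qname, after_name = read_name(frame, dns_start + 12)
qtype, qclass = struct.unpack("!HH", frame[after_name : after_name + 4])

print(dst_mac, hex(ethertype), dst_ip, dst_port, qname, qtype)
```

The decoded fields match what the previous sections describe: the multicast Ethernet and IPv4 destinations, port 5353 and a PTR question (type 12) for _services._dns-sd._udp.local.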

2) Instead of sending requests on the wire to obtain responses, I saved a bunch of responses to a .pcap file and used the file as an input for the DNS parser. In fact the command supports a --pcap flag that can be used to print out DNS records.

$ sudo ./snabb dnssd --pcap /home/dpino/avahi-browse.pcap
Reading from file: /home/dpino/avahi-browse.pcap
PTR: (name: _services._dns-sd._udp.local; domain-name: _spotify-connect._tcp)
PTR: (name: ; domain-name: dcad2b6c7a2110c310-0)
SRV: (target: dcad2b6c7a2110c310-0)
TXT: (CPath=/zc/0;VERSION=1.0;Stack=SP;)
Address: 192.168.86.55
...

3) When sending a packet to the wire, check that the packet’s header checksums are correct. Wireshark has a mode to verify whether a packet’s header checksums are correct or not, which is very convenient. Snabb comes with protocol libraries to calculate IP, TCP or UDP checksums. Check out how mDNSQuery does it.
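
For reference, the IPv4 header checksum is the one’s complement of the one’s complement sum of the header’s 16-bit words. Here is a minimal Python sketch of that algorithm (not Snabb’s implementation; the function name is my own), checked against the IPv4 header of the request dumped in tip 1:

```python
def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# IPv4 header of the dumped request (20 bytes, checksum field is 0x8f5a).
header = bytes.fromhex("45000055327c000001118f5ac0a8561ee00000fb")

# A header that carries a correct checksum sums to zero.
assert internet_checksum(header) == 0

# Computing over the header with the checksum field zeroed gives its value.
zeroed = header[:10] + b"\x00\x00" + header[12:]
print(hex(internet_checksum(zeroed)))
```

The UDP checksum uses the same sum, but it also covers a pseudo-header with the source and destination addresses, which this sketch does not build.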

Last thoughts

Implementing this tool has helped me understand DNS better, especially the Multicast DNS/DNS-SD part. I never expected it could be so interesting.

Going from an idea to a working prototype with Snabb is really fast. It’s one of the advantages of user-space networking and one of the things I enjoy the most. That said, the resulting code has been bigger than I initially expected. To avoid losing this work, I will try to land the DNS and mDNS libraries into Snabb.

This post puts an end to this series of practical Snabb posts. I hope you found them as interesting as I found writing them enjoyable. Hopefully, in the future these posts can be useful for anyone interested in user-space networking who wants to try out Snabb.

Please drop me a line if you have any feedback. Thanks!
