Musings on Protocol Design

Stanislav Shalunov, 2005-06-09

This rant considers a few points of protocol design for computer communications networks. My opinion only.

Text vs binary

According to Padlipsky, the first protocol to use ASCII strings, terminated by newlines, as commands was FTP. This was so that you could, after logging onto a terminal server, send files directly from a file server to a printer. This is also the reason the FTP protocol is layered on top of TELNET (you would use the telnet program to connect to the printer and the file server and issue FTP commands manually). This was a reasonable design for the time when people actually typed FTP protocol messages; those who are used to modern command-line interfaces would find it a bit too complicated, but usable.

Since then, a large number of protocols that use difficult-to-parse pseudo-human-friendly message formats have been foisted upon us. Enough already. When humans talk to computers, the protocol needs to be human-friendly, because the computer is the human's servant; when computers talk to each other, or when different programs talk to each other within the same computer, the protocol has no reason whatsoever to be human-friendly. Debugging (which is often given as an excuse for unwieldy protocol design) of unnecessarily complex protocols is more, not less, difficult than debugging of protocols that are only complex enough to do what they need to.
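
As a minimal sketch of how little parsing a machine-to-machine format needs, the Python below assumes a hypothetical frame layout invented for illustration (it is not taken from any real protocol): a 1-byte message type, a 2-byte big-endian payload length, then the payload. The names encode_frame and decode_frame are likewise made up.

```python
import struct

# Hypothetical frame layout, assumed for illustration only:
#   1 byte message type, 2 bytes big-endian payload length, then payload.
HEADER = struct.Struct("!BH")


def encode_frame(msg_type: int, payload: bytes) -> bytes:
    """Build one frame; struct raises struct.error if a field is out of range."""
    return HEADER.pack(msg_type, len(payload)) + payload


def decode_frame(buf: bytes) -> tuple[int, bytes, bytes]:
    """Return (msg_type, payload, remaining bytes) or raise ValueError."""
    if len(buf) < HEADER.size:
        raise ValueError("truncated header")
    msg_type, length = HEADER.unpack_from(buf)
    end = HEADER.size + length
    if len(buf) < end:
        raise ValueError("truncated payload")
    return msg_type, buf[HEADER.size:end], buf[end:]
```

There is no quoting, no line folding, and no ambiguity about where a message ends: the decoder reads a fixed-size header and then exactly as many bytes as the header announces.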

Round-trip times

Consider the case of email transmission between strangers on a non-secure packet network. (This is the problem SMTP/TCP/IP solves.) How many round-trip times are necessary to send a short message (less than a maximum packet size) when no packets are lost? Here's what needs to happen:
  1. sender --> receiver: Message.
  2. receiver --> sender: Did you send message X? I got it.
  3. sender --> receiver: Yes, I did send message X.
That's three messages for 1.5 round-trip times. How does SMTP compare? (The following exchange has some silly places where several messages in a row are sent by one side; this is how it works on FreeBSD 5.3. It doesn't affect the number of round-trip times, but is shown here for realism. If these messages were merged and sent in a single packet, the exchange could have taken 20 or even 18 packets. Note that I am not complaining about the number of packets, but about the number of round-trip times.)
  1. sender --> receiver: SYN.
  2. receiver --> sender: SYN+ACK.
  3. sender --> receiver: ACK.
  4. receiver --> sender: ACK.
  5. receiver --> sender: 220.
  6. sender --> receiver: ACK.
  7. sender --> receiver: HELO.
  8. receiver --> sender: 250.
  9. sender --> receiver: ACK.
  10. sender --> receiver: MAIL.
  11. receiver --> sender: 250.
  12. sender --> receiver: ACK.
  13. sender --> receiver: RCPT.
  14. receiver --> sender: 250.
  15. sender --> receiver: ACK.
  16. sender --> receiver: DATA.
  17. receiver --> sender: 354.
  18. sender --> receiver: ACK.
  19. sender --> receiver: Message.
  20. receiver --> sender: 250.
  21. sender --> receiver: ACK.
  22. sender --> receiver: QUIT.
  23. receiver --> sender: 221.
  24. receiver --> sender: FIN.
  25. sender --> receiver: ACK.
  26. sender --> receiver: FIN+ACK.
  27. receiver --> sender: ACK.
That's nine round-trip times, or six times as many as necessary.
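
The count can be checked mechanically. The sketch below encodes each packet of the two exchanges above as one character ("S" for sender --> receiver, "R" for receiver --> sender) and assumes, as the text does, that consecutive packets in the same direction cost no extra waiting. The function name and the encoding are illustrative choices, not part of any tool.

```python
def round_trip_times(directions: str) -> float:
    """Count round-trip times in a packet trace.

    Each run of packets travelling in the same direction costs one
    one-way trip; two one-way trips make one round-trip time.
    """
    one_way_trips = sum(
        1 for i, d in enumerate(directions) if i == 0 or d != directions[i - 1]
    )
    return one_way_trips / 2


# The three-message exchange: Message, "Did you send X?", "Yes, I did."
IDEAL = "SRS"

# The 27-packet SMTP exchange, in the order listed above.
SMTP = (
    "SRS"     # SYN, SYN+ACK, ACK
    "RRS"     # ACK, 220, ACK
    "SRS"     # HELO, 250, ACK
    "SRS"     # MAIL, 250, ACK
    "SRS"     # RCPT, 250, ACK
    "SRS"     # DATA, 354, ACK
    "SRS"     # Message, 250, ACK
    "SRRSSR"  # QUIT, 221, FIN, ACK, FIN+ACK, ACK
)

print(round_trip_times(IDEAL))  # 1.5
print(round_trip_times(SMTP))   # 9.0
```

The 18 one-way trips counted for SMTP also match the 18 packets the exchange would need if every run of same-direction messages were merged into a single packet.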

I use SMTP here only as an example. If one were to analyse ESMTP as it is deployed today for the number of round-trip times, two considerations would need to be taken into account: the PIPELINING extension, which can reduce the number of round-trip times by two, and DNS queries, which, since they are all done serially in this case, can increase the number of round-trip times quite a bit.

Bit economy

Perversely, the same crowd happy with wasting round-trip times can easily come down on anyone whose protocol is ``wasteful.'' Bits are cheap. Every year, they get twice as cheap. Round-trip times are fundamental and bounded from below by distance divided by the speed of light.

Use as many bits as needed. Worry about bits only when something that would fit into a single packet stops doing so.
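
To put a number on the bound, here is a small sketch. A one-way trip takes at least distance divided by the speed of light, so a round trip takes at least twice that. The function names and the 10,000 km distance used in the example are assumptions for illustration; real paths are longer than the straight line and signals in fiber travel slower than light in vacuum, so actual times are worse.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # in vacuum; signals in fiber are slower


def min_round_trip_seconds(distance_m: float) -> float:
    """Lower bound on one round-trip time: there and back at light speed."""
    return 2 * distance_m / SPEED_OF_LIGHT_M_PER_S


def min_exchange_seconds(distance_m: float, round_trips: float) -> float:
    """Lower bound on the duration of an exchange needing `round_trips` RTTs."""
    return round_trips * min_round_trip_seconds(distance_m)


DISTANCE_M = 10_000_000  # assumed example distance: 10,000 km

print(min_exchange_seconds(DISTANCE_M, 1.5))  # the three-message exchange
print(min_exchange_seconds(DISTANCE_M, 9))    # the SMTP exchange
```

At that distance the three-message exchange cannot finish in less than roughly 0.1 seconds and the nine-round-trip exchange in less than roughly 0.6 seconds, no matter how cheap bits become.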

Layers

Layers are a really clever idea. They let you trade off effort for efficiency. When the efficiency that you lose is in bits alone, using layers liberally makes sense.

Other ways to lose efficiency could be in round-trip times, CPU use, protocol complexity, and implementation size (and, thus, the number of bugs). In addition, at some point, layers start interacting in ways that are difficult to predict; such emergent behaviors can be surprising to protocol designers. These are arguments in favor of fewer layers (it is, perhaps, not surprising that the more successful Internet has fewer layers than the stillborn ISO network design). An even more important argument is that layers can be misleading; for example, most applications using TLS or SSH as a layer between transport and application don't get what their designers might think they are getting from the security layer: existential forgery is trivially possible with most deployed applications; such a state of affairs would rightly be considered disastrous by most designers if they considered the system as a whole.

Be liberal---or not

``Be liberal in what you accept, and conservative in what you send'' is the most harmful part of the conventional wisdom. This is not how you make protocols interoperate; this is how you make them interoperate poorly, while burying booby traps for the future.

Be pedantic in what you send and fascist in what you accept.

The good protocols (IPv4, IPv6, BGP, etc.) are actually not at all liberal in what they accept. Being liberal in what you accept gives you HTML/HTTP.
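
What strict acceptance can look like in practice is sketched below, under loudly stated assumptions: the message layout, the set of known types, and the flag mask are all hypothetical and invented for this example. The point is only that every deviation is rejected rather than guessed at.

```python
import struct

# Hypothetical message layout, assumed for illustration only:
#   1 byte type, 1 byte flags, 2 bytes big-endian payload length, payload.
HEADER = struct.Struct("!BBH")
KNOWN_TYPES = {1, 2}       # assumed message types
KNOWN_FLAGS = 0b0000_0011  # assumed: only the two low flag bits are defined


def parse_strict(datagram: bytes) -> tuple[int, int, bytes]:
    """Return (type, flags, payload); reject anything not exactly well-formed."""
    if len(datagram) < HEADER.size:
        raise ValueError("short header")
    msg_type, flags, length = HEADER.unpack_from(datagram)
    if msg_type not in KNOWN_TYPES:
        raise ValueError("unknown message type")
    if flags & ~KNOWN_FLAGS:
        raise ValueError("reserved flag bits set")
    if len(datagram) != HEADER.size + length:
        raise ValueError("length field does not match datagram size")
    return msg_type, flags, datagram[HEADER.size:]
```

A liberal parser would ignore the reserved bits, skip trailing bytes, and try to make sense of unknown types; each of those kindnesses becomes behavior that other implementations start to depend on.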

Protocol versions and feature negotiation

Protocols evolve. How does one provide for that evolution from the beginning? Different implementations of an evolving protocol provide different sets of features. How does one implementation learn what the other supports? Here's how:
  1. IP version;
  2. IP protocol;
  3. Port number (for IP protocols 6 and 17);
  4. Protocol version;
  5. Feature negotiation;
  6. Trial and error;
  7. Wild optimistic guess.
There are too many ways to negotiate that interact in unpredictable ways.

Particularly useless are protocol version numbers (with the possible exception of the most stable and most basic protocols, such as IP). Version numbers provide for a linear ordering of sets of supported features, but the set-inclusion relation is not a linear order. Not only is the protocol version mechanism limited, it is redundant, too; protocol versions don't buy you anything you don't already have with port numbers. Better in that respect is feature negotiation. Even feature negotiation is too often wasteful and unnecessarily complicated.
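
The ordering argument can be made concrete with a minimal sketch. The feature names and the two implementations below are hypothetical, chosen only to show two feature sets where neither contains the other.

```python
# Hypothetical feature names, assumed for illustration only.
IMPL_A = frozenset({"base", "pipelining"})
IMPL_B = frozenset({"base", "compression"})


def comparable(x: frozenset, y: frozenset) -> bool:
    """True if one feature set contains the other, as version numbers imply."""
    return x <= y or y <= x


def negotiate(x: frozenset, y: frozenset) -> frozenset:
    """Features both sides support: a plain set intersection."""
    return x & y


print(comparable(IMPL_A, IMPL_B))         # False
print(sorted(negotiate(IMPL_A, IMPL_B)))  # ['base']
```

Because the two sets are incomparable, no assignment of version numbers can say that one implementation supports everything the other does, while an exchange of feature sets settles the question directly.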

Where it gets harmful is security. There's a push to make everything---every little primitive, every little option---negotiable. The resulting monsters are impossible to analyze and are almost certain to contain security holes. In many cases, the security holes are inserted intentionally for ``debugging purposes.''

There's far too much versioning and feature negotiation. Port numbers do the job quite often. When they don't, capability strings or feature negotiation work better than protocol versions.


Acknowledgments

I would like to thank Simon Leinen for his comments and for discussion.
