A datagram is a basic transfer unit associated with a packet-switched network. Datagrams are typically structured in header and payload sections. Datagrams provide a connectionless communication service across a packet-switched network. The delivery, arrival time, and order of arrival of datagrams need not be guaranteed by the network.
The term datagram was coined in the early 1970s by Halvor Bothner-By, the CCITT rapporteur on packet switching, who combined the words data and telegram.
While the word was new, the concept already had a long history.
In 1962, Paul Baran described, in a RAND Corporation report, a hypothetical military network that had to withstand a nuclear attack. Small standardized "message blocks", bearing source and destination addresses, were stored and forwarded in computer nodes of a highly redundant meshed computer network. Baran wrote: "The network user who has called up a 'virtual connection' to an end station and has transmitted messages ... might also view the system as a black box providing an apparent circuit connection".
In 1967, Donald Davies published a seminal article in which he introduced the now widely used terms packet and packet switching. His core network is similar to Paul Baran's, although it was designed independently. To deal with datagram permutations (due to dynamically updated routing preferences) and datagram losses (unavoidable when fast sources send to slow destinations), he assumed that "all users of the network will provide themselves with some kind of error control" (what would later be called a pure datagram service). His target was, for the first time in packet switching, a "common-carrier communication network". To support remote access to computer services by user terminals, which at that time generally transmitted character by character, he included interface computers at the network periphery that converted character flows into packet flows and vice versa.
In 1970, Lawrence Roberts and Barry D. Wessler published an article about ARPANET, the first multi-node packet-switching network. An accompanying paper described its switching nodes (the IMPs) and its packet formats. The network core performed datagram switching as in Baran's and Davies' models, but provision was added within the network, at its periphery, to deal with datagram losses and permutations. A reliable message transfer service was thus offered to user computers, greatly simplifying their own work and making it less dependent on further research.
In 1973, Louis Pouzin presented his design for Cyclades, the first full-scale network implementing Donald Davies's pure datagram model. The Cyclades team was thus the first to tackle the highly complex problem of providing user applications with a reliable virtual circuit service (the equivalent of an Internet TCP connection) on top of an end-to-end network service known to produce non-negligible datagram losses and permutations. Although Pouzin's concern "in a first stage is not to make breakthrough in packet switching technology, but to build a reliable communications tool for Cyclades", two members of his team, Hubert Zimmermann and Gérard Le Lann, made significant contributions to the design of the Internet's TCP, which Vint Cerf, its main designer, acknowledged.
In 1981, the Defense Advanced Research Projects Agency (DARPA) issued the first specification of the Internet Protocol (IP). It introduced a major evolution of the datagram concept: fragmentation. With fragmentation, some parts of the global network may use large packet sizes (typically local area networks, to minimize processing power), while others may impose smaller packet sizes (typically wide area networks, to minimize response time). Network nodes may split a packet of a datagram into several smaller packets of the same datagram.
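The splitting and rebuilding described above can be illustrated with a minimal sketch. It is not the IPv4 wire format: the `Fragment` class and the `fragment` and `reassemble` functions are hypothetical names, and offsets are counted in bytes for simplicity, whereas IPv4 counts fragment offsets in units of 8 bytes. The sketch only assumes that every fragment carries a shared identifier, its offset in the original payload, and a "more fragments" flag.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    ident: int   # identifier shared by all fragments of one datagram
    offset: int  # byte offset of this fragment's data in the original payload
    more: bool   # True if further fragments follow this one
    data: bytes


def fragment(ident: int, payload: bytes, max_data: int) -> list:
    """Split payload into fragments carrying at most max_data bytes each."""
    if max_data <= 0:
        raise ValueError("max_data must be positive")
    pieces = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        more = start + max_data < len(payload)
        pieces.append(Fragment(ident, start, more, chunk))
    return pieces


def reassemble(fragments: list) -> bytes:
    """Rebuild the payload from fragments that may arrive in any order."""
    ordered = sorted(fragments, key=lambda f: f.offset)
    if not ordered or ordered[-1].more:
        raise ValueError("last fragment missing")
    out = bytearray()
    for frag in ordered:
        if frag.offset != len(out):
            raise ValueError("missing or overlapping fragment")
        out += frag.data
    return bytes(out)
```

Because each fragment records its own offset, the receiver can rebuild the payload even when fragments arrive out of order, which matches the lack of ordering guarantees in a datagram network.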
In 1999, the Internet Engineering Task Force (IETF) formalized the use of the already widely deployed network address translation (NAT), whereby each public address can be shared by several private devices. This delayed the impending exhaustion of Internet addresses, leaving enough time to introduce IPv6, the new generation of Internet packets supporting longer addresses. The initial principle of full end-to-end network transparency to datagrams was thereby relaxed: NAT nodes had to manage per-connection state, making them in part connection-oriented.
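The per-connection state mentioned above can be pictured as a translation table. The following is a simplified sketch under stated assumptions, not a description of any real NAT implementation: the `Nat` class, its method names, and the starting port 40000 are hypothetical, and timeouts, protocols, and port exhaustion are ignored.

```python
class Nat:
    """Toy port-translating NAT that keeps one table entry per private flow."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}   # (private_ip, private_port) -> public_port
        self.back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, src_ip: str, src_port: int):
        """Rewrite the source of an outgoing datagram, creating state if needed."""
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def inbound(self, dst_port: int):
        """Return the private destination for a reply, or None if no state exists."""
        return self.back.get(dst_port)
```

An incoming datagram for which `inbound` returns `None` has no matching entry and cannot be delivered, which is why such a node is no longer fully transparent to datagrams.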
In 2015, the IETF upgraded its weak "informational" recommendation of 1998, that datagram switching nodes perform active queue management (AQM), to a stronger and more detailed "best current practice" recommendation. While the initial datagram queueing model was simple to implement and needed no tuning beyond queue lengths, support for more sophisticated and parametrized mechanisms (RED, ECN, etc.) was found necessary "to improve and preserve Internet performance". Further research on the subject was also called for, with a list of identified items.
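To give an idea of what a parametrized queue management mechanism involves, here is a simplified sketch of the core of RED (random early detection). It covers only the averaged queue length and the linear drop-probability ramp; the function and parameter names are illustrative assumptions, and refinements found in full descriptions of RED (such as spacing drops by the number of packets since the last drop) are left out.

```python
def update_average(avg_queue: float, current_queue: float, weight: float) -> float:
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_queue + weight * current_queue


def red_drop_probability(avg_queue: float, min_th: float,
                         max_th: float, max_p: float) -> float:
    """Drop probability as a function of the averaged queue length.

    Below min_th nothing is dropped; at or above max_th everything is
    dropped; in between the probability rises linearly up to max_p.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)
```

Even this reduced form has four parameters to tune (`weight`, `min_th`, `max_th`, `max_p`), in contrast with a plain queue that only needs a length.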
The term datagram is defined as follows:
“A self-contained, independent entity of data carrying sufficient information to be routed from the source to the destination computer without reliance on earlier exchanges between this source and destination computer and the transporting network.”— RFC 1594
A datagram needs to be self-contained without reliance on earlier exchanges because there is no connection of fixed duration between the two communicating points as there is, for example, in most voice telephone conversations.
Datagram service is often compared to a mail delivery service: the user only provides the destination address, but receives no guarantee of delivery and no confirmation upon successful delivery. Datagram service is therefore considered unreliable. Datagram service routes datagrams without first creating a predetermined path, and is therefore considered connectionless. No consideration is given to the order in which a datagram and other datagrams are sent or received. In fact, many datagrams in the same group can travel along different paths before reaching the same destination.
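Because the service itself guarantees neither delivery nor ordering, any such guarantees have to be added by the communicating endpoints. The sketch below shows one possible receiver-side approach, assuming the sender numbers its datagrams from 0; the function name and the `(seq, data)` tuple format are hypothetical and not part of any particular protocol.

```python
def deliver_in_order(received):
    """Sort arrived datagrams by sequence number and report the gaps.

    received: iterable of (seq, data) tuples in arrival order.
    Returns (payloads in sequence order, list of missing sequence numbers).
    """
    by_seq = {}
    for seq, data in received:
        by_seq.setdefault(seq, data)  # keep the first copy, ignore duplicates
    if not by_seq:
        return [], []
    highest = max(by_seq)
    ordered = [by_seq[s] for s in range(highest + 1) if s in by_seq]
    missing = [s for s in range(highest + 1) if s not in by_seq]
    return ordered, missing
```

A reliable protocol built on top of datagrams would typically go further and ask the sender to retransmit the missing numbers; this sketch only detects them.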
Each datagram has two components, a header and a data payload. The header contains all the information sufficient for routing from the originating equipment to the destination without relying on prior exchanges between the equipment and the network. Headers may include source and destination addresses as well as a type field. The payload is the data to be transported. This process of nesting data payloads in a tagged header is called encapsulation.
|Layer 4||Data segment|
|Layer 3||Data packet|
|Layer 2||Ethernet frame (IEEE 802.3); Wireless LAN frame (IEEE 802.11)|
|Layer 1||Chip (CDMA)|
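The header-plus-payload structure described above can be sketched with Python's `struct` module. The header layout used here (source address, destination address, a one-byte type field, and a two-byte payload length) is an invented illustration, not the format of IP or of any other real protocol, and the function names are hypothetical.

```python
import socket
import struct

# Hypothetical header: 4-byte source, 4-byte destination, 1-byte type,
# 2-byte payload length, all in network byte order (11 bytes in total).
HEADER = struct.Struct("!4s4sBH")


def encapsulate(src: str, dst: str, kind: int, payload: bytes) -> bytes:
    """Prefix the payload with a header carrying addresses and a type field."""
    header = HEADER.pack(socket.inet_aton(src), socket.inet_aton(dst),
                         kind, len(payload))
    return header + payload


def decapsulate(datagram: bytes):
    """Split a datagram back into (src, dst, kind, payload)."""
    src, dst, kind, length = HEADER.unpack_from(datagram)
    payload = datagram[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated datagram")
    return socket.inet_ntoa(src), socket.inet_ntoa(dst), kind, payload
```

Since the output of `encapsulate` is itself a byte string, it can serve as the payload of another call, which is how a unit from one layer is nested inside the unit of the layer below.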
The Internet Protocol (IP) defines standards for several types of datagrams. The internet layer is a datagram service provided by IP. For example, UDP runs on top of the datagram service of the internet layer. IP is an entirely connectionless, best-effort, unreliable message delivery service. TCP is a higher-level protocol running on top of IP that provides a reliable connection-oriented service.
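As a small illustration of UDP's connectionless style, the following sketch sends one datagram over the loopback interface using Python's standard `socket` module. The helper name `udp_roundtrip` and the one-second timeout are assumptions made for the example. No connection is set up beforehand, and the receiver has to tolerate the datagram never arriving.

```python
import socket


def udp_roundtrip(message: bytes, timeout: float = 1.0):
    """Send one UDP datagram to a local socket; return it, or None if lost."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(timeout)
        # No handshake: the datagram is simply addressed and sent.
        sender.sendto(message, receiver.getsockname())
        try:
            data, _addr = receiver.recvfrom(65535)
        except socket.timeout:
            return None  # the service gives no delivery guarantee
        return data
```

On a loopback interface the datagram will normally arrive, but the code still handles the timeout case, since nothing in the service itself promises delivery.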