Changing operational software is a risky business. Omar Bashir offers a case study in matching risk with reward.
The article 'Trouble with TCP' in the December 2006 issue of CVu [ CVu06 ] highlighted issues in implementing (near) real-time point to multipoint communications over TCP. Its author emphasised the constraint of minimal code rewrite that most developers face when upgrading legacy systems to resolve such issues. This constraint prevented the author from applying any of the alternative solutions mentioned at the end of that article. This article describes a similar legacy system that faced serious performance issues when traffic on the system increased from moderate to high levels. A solution that did not involve rewriting the legacy software was attempted to resolve these performance issues. Various aspects of this solution are explained here.
The legacy system
The legacy system was a monitoring application aggregating data from a number of sensors in a data multiplexer (MUX) and displaying this data in real time on a number of workstations on a LAN. The sensors are connected to the MUX via dedicated and secure communication links. In addition to the workstations, the LAN also hosts a database server logging the data from the sensors for historical analysis and an application server on which various near real time trend analysis applications are executed. Figure 1 shows the physical topology of the system.
Operator workstations, the database server and the application server (referred to as clients) established connections to the MUX over TCP upon boot up. Copies of every message received by the MUX from the field sensors were transmitted over each established connection. At a modest message arrival rate of 100 messages a second and with the resulting LAN packet size of 256 bytes, communicating these messages to only 5 clients results in a traffic rate of just over 1 Mbps. Therefore, this system operated satisfactorily at low message rates for a few clients. Merely increasing the number of clients, however, increased the data rate on the LAN appreciably. This, coupled with higher incoming message rates (due to either more field events being monitored or more field sensors), could cause congestion due to either network limitations or limitations of the MUX platform software (e.g., buffer sizes).
As often happens with useful applications, a few months after its induction, the number of operator workstations was increased and so was the number of field sensors. This resulted in a noticeable delay between monitoring events at the sensors and those events being displayed on the operator workstations.
It was decided to consider changing the transport protocol to UDP with minimal (preferably no) change to the existing software. The remainder of this article discusses the factors that were considered in deciding to change the transport protocol for the application, and the resulting solution, which required no change to the existing software.
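The arithmetic behind the 1 Mbps figure can be checked with a short sketch. Python is used here purely for illustration, and the function name `fanout_rate_mbps` is an assumption made for this example, not something taken from the system described.

```python
def fanout_rate_mbps(messages_per_second, packet_bytes, clients):
    """Aggregate LAN rate when every message is sent once per connected client."""
    bits_per_second = messages_per_second * packet_bytes * 8 * clients
    return bits_per_second / 1_000_000


# Figures from the text: 100 messages/s, 256-byte packets, 5 clients.
for clients in (5, 10, 20):
    rate = fanout_rate_mbps(100, 256, clients)
    print(f"{clients:2d} clients: {rate:.3f} Mbps")
```

The rate grows linearly with both the number of clients and the message rate, so doubling both quadruples the load on the MUX and the LAN.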
Considering a UDP-based solution
TCP, because of the reliability guarantees it offers, is generally considered suitable for loss sensitive applications whereas UDP is connectionless and inherently unreliable. However, compared to UDP, TCP's reliability is associated with overheads in its implementation and operation. This difference in the fundamental principles of these transport protocols has given rise to some practices in network programming. For example, TCP is considered suitable for reliable and sequenced but non real-time delivery of application data. UDP, on the other hand, has been the protocol of choice for (near) real-time applications that are insensitive to a degree of loss.
The choice of transport protocol, however, needs to be based on a comprehensive analysis of application requirements. For instance, the performance of TCP in comparison to that of UDP has been debated for high throughput and loss sensitive environments. The performance of UDP-based applications is expected to drop due to the retransmission of packets lost because of the absence of flow control [ Snader00 ]. On the other hand, broadcast and multicast communication is inherently supported by the connectionless nature of UDP, whereas in TCP-based applications it has to be engineered at the application layer as multiple unicast transmissions. As discussed above, the feasibility of the latter approach is questionable at higher throughputs coupled with an increasing number of receivers, as the transmitter iterates over a list of receivers, practically replicating every packet for every receiver.
UDP's susceptibility to loss can cause failure in applications sensitive to packet loss. Even if every packet is not required for the real-time operation, it may be necessary to record all data communicated between different components of the system for historical analysis. A degree of reliability can be engineered over UDP at the application layer but over-engineering leading to re-engineering TCP at the application layer over UDP is usually discouraged (e.g., [ Snader00 ]).
A simple approach to mitigating packet losses over UDP is defined for facsimile transmission in the ITU-T T.38 standard [ ITU-T.38 ]. This approach employs redundancy to overcome packet losses. Each UDP datagram encapsulates, along with the current Internet fax packet, a predetermined number of previously transmitted Internet fax packets. For example, with Internet fax packet n5, packets n4, n3 and n2 are also transmitted, and with Internet fax packet n6, packets n5, n4 and n3 are also transmitted. Thus, even if the datagrams carrying packets n4, n3 and n2 are lost, the datagram carrying packet n5 allows the missing part of the stream to be recreated. Implementations of redundancy-based schemes should keep the overall packet size below the smallest MTU (Maximum Transmission Unit) along the packet's path; otherwise the packet may be fragmented, increasing the probability of packet loss. This is because fragmented packets are only reassembled at the destination and the loss of even a single fragment results in the entire packet being considered lost [ RFC0791 ]. UDP may also be appropriate for tunneling non-Internet protocols that themselves provide end-to-end reliability, e.g. facsimile communication [ ITU-T.30 ].
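The redundancy idea can be illustrated with a minimal sketch. This is not the T.38 wire format: the class names, the use of (sequence number, payload) tuples and the redundancy depth of three (taken from the n5 example) are all assumptions made for illustration, and the sketch does not enforce the MTU limit discussed above.

```python
from collections import deque

REDUNDANCY = 3  # assumed depth: earlier packets repeated in each datagram


class RedundantSender:
    """Attaches the most recent earlier packets to every outgoing datagram."""

    def __init__(self, redundancy=REDUNDANCY):
        self._history = deque(maxlen=redundancy)
        self._next_seq = 0

    def wrap(self, payload):
        # Current packet first, then the previous ones, newest first.
        current = (self._next_seq, payload)
        datagram = [current] + list(reversed(self._history))
        self._history.append(current)
        self._next_seq += 1
        return datagram


class RedundantReceiver:
    """Recovers packets carried redundantly by a later datagram."""

    def __init__(self):
        self._expected = 0  # lowest sequence number not yet delivered

    def unwrap(self, datagram):
        # Keep only packets not delivered before, in sequence order.
        fresh = sorted(
            (entry for entry in datagram if entry[0] >= self._expected),
            key=lambda entry: entry[0],
        )
        if fresh:
            self._expected = fresh[-1][0] + 1
        return [payload for _seq, payload in fresh]


sender, receiver = RedundantSender(), RedundantReceiver()
datagrams = [sender.wrap(f"n{i}") for i in range(6)]
received = []
for index in (0, 1, 5):  # datagrams 2, 3 and 4 are "lost"
    received.extend(receiver.unwrap(datagrams[index]))
print(received)
```

If more consecutive datagrams are lost than the redundancy depth covers, the gap cannot be recovered; this sketch simply skips it.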
TCP and UDP
TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) form the transport layer protocols (layer 4 protocols) for the Internet Protocol suite. Although the primary objective of both these protocols is the same, i.e. to allow distributed and networked application components to communicate with each other via message passing without concerning themselves with the characteristics of the network(s) connecting them, the characteristics of these two protocols are very different making them suitable for applications with specific communication requirements.
The network layer protocol of the Internet Protocol suite is called the Internet Protocol (IP) and it is a best-effort protocol. Therefore, it makes the best possible effort (without any guarantees) to deliver packets containing user messages to the destination. It treats each packet independently, determines the route these packets need to take to reach their destinations and may even break them into smaller packets if required. If a packet is fragmented, it is reassembled only at the destination and not at the intermediate nodes (routers). Packets may, therefore, get lost, duplicated or arrive out of order at the destination.
TCP sits over IP and provides message validation facilities. It ensures that messages sent by the remote application layer processes are received without errors and in the sequence in which they are transmitted. Therefore, TCP handles retransmission of lost and corrupted packets, discards duplicates and rearranges the received packets to reconstruct the message stream. Furthermore, TCP manages flow of the data streams to avoid and alleviate congestion.
In order to perform these operations, a connection needs to be established between source and destination applications. In order to establish a connection, one process listens for a connection request whereas the other attempts the connection. A listener is usually the process that provides services to the connecting process and is referred to as a server. The connector is, therefore, referred to as a client. However, it is logically possible, though not common, for clients to be listeners and for servers to connect to listening clients and provide information when the required information is available.
As multiple networked applications may execute concurrently on the same host, they need to be identified using a unique identifier called the port number. The combination of the port number of an application and the IP address of the host on which it executes uniquely identifies that application on the Internet. This combination is referred to as a socket endpoint. A TCP connection is an association between two endpoints, one identifying the client and the other identifying the server. Therefore, a connected socket endpoint pair uniquely identifies a TCP connection.
Messages are transmitted between the connected applications as a stream of sequenced bytes. Therefore, for most implementations of TCP, an application may provide TCP with discrete units of data at the transmitting end but TCP returns to the application at the receiving end an undelimited stream of bytes. If messages are to be retrieved as discrete units of data then message boundaries need to be explicitly specified and then looked for within the received byte stream. For example, if lines of text are transmitted using TCP and the receiving application needs to receive and process these lines individually, the received byte stream needs to contain line feeds that the receiving application can look for to determine line boundaries within the stream. A common method of delineating binary messages in a TCP byte stream is to insert a message header of a fixed size containing the message length. The receiver initially reads the header (as it is of fixed size), determines the size of the remaining message and then reads it. This process is repeated for the subsequent messages.
IP creates packets containing portions of these streams and transmits these packets to the destination. TCP at the receiving end attempts to recreate the stream using the sequence numbers provided by TCP at the transmitting end. The receiving TCP acknowledges the receipt of the last sequenced byte received, indicating the receipt of all bytes up to that sequence number. Absence of an acknowledgment at the transmitting TCP indicates a packet loss, necessitating the retransmission of all packets from the last successful acknowledgment.
The connection-oriented nature of TCP does not allow multicast or broadcast communication of messages. This is accomplished by UDP. UDP is a connectionless protocol, i.e. transmitters can send messages to a receiver at any time, without first establishing a connection, as long as they know the receiver's socket address. UDP is a very simple protocol that provides the same best effort delivery of messages that IP offers to packets at the network layer. Therefore, each message (or datagram) at UDP is treated independently; it can be lost, duplicated and delivered out of sequence. Because each datagram is delivered as a discrete unit, delineation of messages with UDP is not required. Multiple receiver processes can bind to a multicast or the broadcast IP address and a common port to simultaneously receive broadcast or multicast messages. Usually, applications using UDP need to provide mechanisms to deal with packet losses, duplications and out of sequence arrival. However, care needs to be exercised while developing such applications so as not to re-engineer TCP over UDP.
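The fixed-size length header technique described above might be sketched as follows. The 4-byte big-endian header and the names `frame` and `LengthPrefixDelineator` are assumptions chosen for this example; a real protocol defines its own header layout.

```python
import struct

HEADER = struct.Struct("!I")  # assumed header: 4-byte unsigned length, network order


def frame(message: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find its end."""
    return HEADER.pack(len(message)) + message


class LengthPrefixDelineator:
    """Rebuilds discrete messages from arbitrarily sized chunks of a byte stream."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes):
        """Add bytes read from the stream; return the messages now complete."""
        self._buffer.extend(chunk)
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # body not fully received yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages


# The stream is deliberately cut into 3-byte chunks that ignore message boundaries.
stream = frame(b"alpha") + frame(b"be") + frame(b"")
delineator = LengthPrefixDelineator()
recovered = []
for start in range(0, len(stream), 3):
    recovered.extend(delineator.feed(stream[start:start + 3]))
print(recovered)
```

Buffering partial data matters because a single read from a TCP socket may return less than one message, exactly one, or several at once.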
Solution architecture and implementation
In the system described earlier, the communication protocol was required to be changed from TCP to UDP with preferably no software change in the existing software suite. This was accomplished by implementing proxy servers on both the MUX and the client platforms. The MUX-based proxy server executes on the MUX platform and acts as a client application to the MUX. In this case, the MUX has only one client and that is the local proxy server, which connects to the MUX over TCP via local loopback. Upon receiving data from the MUX application, the proxy server broadcasts the data on the LAN over UDP to be picked up by the client-based proxy servers.
The client-based proxy servers are bound to a broadcast port and receive the data broadcast by the MUX-based proxy server. These client-based proxy servers act as MUX applications for the client applications executing on their respective platforms. Client applications are reconfigured to connect to their local proxy servers over TCP via local loopback. Client applications thus consider their local proxy servers to be the MUX application. The data received by the client-based proxy servers via the broadcast ports is transmitted to the respective local client applications via a TCP connection over the local loopback. Figure 2 shows the high-level architecture of the proposed solution.
Client applications may need to communicate with the sensors in the field via the MUX. Data from the client applications is communicated to their respective local proxy servers over TCP connections via local loopback. The local client-based proxy servers transmit these messages over UDP to the MUX-based proxy server, which passes it to the MUX application over the TCP connection via the local loopback. The MUX application can determine from the message the sensor to which this message is to be transmitted and this message is then transmitted over the relevant dedicated link to the sensor.
Because only one client (i.e., the MUX-based proxy server) connects to the MUX application, only one copy of each incoming message from the sensors is transmitted over the local loopback to the proxy server. As these messages are broadcast over the LAN, only one copy of each message ever exists over the network.
Detailed description of proxy servers
Figure 3 shows the high-level block diagram of the proxy server. For full-duplex communication, the proxy server needs four threads. Two of the threads are used for communicating over two UDP sockets and the remaining two are used for communicating over a single TCP socket. For the MUX-based proxy server, a UDP socket is used to transmit to a broadcast port and a socket bound to a unicast port is used for reception. For client-based proxy servers, a socket bound to a broadcast port is used for reception and messages for the MUX-based server are transmitted by another UDP socket to a port open at the MUX-based proxy server for reception.
The TCP Receiver Thread communicates with the UDP Transmitter Thread using a queue. Similarly, the UDP Receiver Thread communicates with the TCP Transmitter Thread over another queue. These queue objects are instantiations of a wrapper around an STL queue providing thread safety and blocking read operation using mutexes and condition variables provided by the operating system.
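A queue of this kind might look like the minimal sketch below. It is written in Python rather than against the STL, so it illustrates the locking scheme and not the original code; the class name `ThreadSafeQueue`, the helper functions and the use of `None` as an end-of-stream marker are assumptions.

```python
import threading
from collections import deque


class ThreadSafeQueue:
    """Unbounded FIFO with a blocking read, built on a mutex and a condition variable."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()
        self._not_empty = threading.Condition(self._lock)

    def put(self, item):
        with self._not_empty:  # acquires the underlying mutex
            self._items.append(item)
            self._not_empty.notify()  # wake one blocked reader

    def get(self):
        with self._not_empty:
            while not self._items:  # loop guards against spurious wake-ups
                self._not_empty.wait()
            return self._items.popleft()


def receive_into(queue, source):
    """Stand-in for a receiver thread: push every message, then an end marker."""
    for message in source:
        queue.put(message)
    queue.put(None)  # assumed end-of-stream marker


def transmit_from(queue, sink):
    """Stand-in for a transmitter thread: block on the queue until the end marker."""
    while True:
        message = queue.get()
        if message is None:
            break
        sink.append(message)


def run_demo(messages):
    queue, sink = ThreadSafeQueue(), []
    receiver = threading.Thread(target=receive_into, args=(queue, messages))
    transmitter = threading.Thread(target=transmit_from, args=(queue, sink))
    transmitter.start()
    receiver.start()
    receiver.join()
    transmitter.join()
    return sink


print(run_demo([b"m1", b"m2", b"m3"]))
```

In production Python code the standard `queue.Queue` already provides this behaviour; the hand-written version is shown only to make the mutex and condition variable visible.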
A relatively more detailed class diagram of the proxy server is shown in figure 4. Communicator Thread is the abstract base class for the classes that implement thread objects in the proxy server. Each communicator thread uses a Thread Safe Queue object to either write received messages to (in case of objects of the Receiver Thread sub-class) or read received messages from (in case of objects of the Transmitter Thread sub-class).
A proxy server may be required to transform messages it relays between the systems it connects. This transformation is performed in the Transmitter Thread class using an appropriate concrete extension of the Transformer abstract class. Transformation is performed in the Transmitter Thread as it may be more time consuming for large messages or messages requiring significant processing during transformation. If performed in the Receiver Thread, it may cause packet losses. It is possible to apply different transformations to different types of messages or messages containing specific content by implementing transformers using the Strategy design pattern [ Gamma95 ].
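One way to select a transformation per message type, in the spirit of the Strategy pattern, is sketched below. Only the name `Transformer` comes from the text; the concrete classes, the selector and the assumption that the first byte of a message identifies its type are hypothetical.

```python
from abc import ABC, abstractmethod


class Transformer(ABC):
    """Abstract strategy: converts a message before it is transmitted."""

    @abstractmethod
    def transform(self, message: bytes) -> bytes:
        ...


class IdentityTransformer(Transformer):
    """Relays the message unchanged."""

    def transform(self, message: bytes) -> bytes:
        return message


class UppercasePayloadTransformer(Transformer):
    """Hypothetical transformation: keep the type byte, upper-case the payload."""

    def transform(self, message: bytes) -> bytes:
        return message[:1] + message[1:].upper()


class TransformerSelector(Transformer):
    """Chooses a strategy from the message type (assumed to be the first byte)."""

    def __init__(self, default: Transformer):
        self._default = default
        self._by_type = {}

    def register(self, message_type: int, transformer: Transformer) -> None:
        self._by_type[message_type] = transformer

    def transform(self, message: bytes) -> bytes:
        chosen = self._default
        if message:
            chosen = self._by_type.get(message[0], self._default)
        return chosen.transform(message)


selector = TransformerSelector(default=IdentityTransformer())
selector.register(0x01, UppercasePayloadTransformer())
print(selector.transform(b"\x01status ok"))  # type 0x01: payload upper-cased
print(selector.transform(b"\x02status ok"))  # unregistered type: unchanged
```

Because the selector is itself a `Transformer`, a transmitter thread written against the abstract class needs no knowledge of how many strategies exist.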
Transmitter and Receiver Thread objects use objects of concrete sub-classes of the Communicator abstract base class to receive and transmit data over sockets. Objects of UdpCommunicator class are used to communicate over UDP sockets whereas objects of TcpCommunicator class are used to communicate over TCP sockets. Messages received over TCP need to be delineated from the input stream. TcpCommunicator objects use objects of sub-classes of AbstractDelineator to perform delineation of the input stream. These classes implement application specific stream delineation logic.
TcpCommunicator objects are created by Initiator or Acceptor factory classes which are derived from the Connector abstract base class. This is a variation of the Abstract Factory design pattern [ Gamma95 ]. An Acceptor object accepts connection requests from TCP clients and returns a TcpCommunicator object encapsulating the connected socket. Similarly, a TCP client uses an Initiator object to initiate a connection to a TCP server. Initiator also returns a TcpCommunicator object encapsulating the connected socket.
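The relationship between the two factories might be sketched as below. The class names follow the text, but the method names (`connect`, `send`, `receive`), the loopback address and the operating-system-assigned port are assumptions for illustration, and error handling is omitted.

```python
import socket
from abc import ABC, abstractmethod


class TcpCommunicator:
    """Wraps a connected TCP socket."""

    def __init__(self, sock: socket.socket):
        self._sock = sock

    def send(self, data: bytes) -> None:
        self._sock.sendall(data)

    def receive(self, max_bytes: int = 4096) -> bytes:
        return self._sock.recv(max_bytes)

    def close(self) -> None:
        self._sock.close()


class Connector(ABC):
    """Abstract factory: each subclass produces a connected TcpCommunicator."""

    @abstractmethod
    def connect(self) -> TcpCommunicator:
        ...


class Acceptor(Connector):
    """Server side: listens, then wraps the next accepted connection."""

    def __init__(self, host: str = "127.0.0.1", port: int = 0):
        self._listener = socket.create_server((host, port))  # port 0: OS picks one

    @property
    def port(self) -> int:
        return self._listener.getsockname()[1]

    def connect(self) -> TcpCommunicator:
        sock, _peer = self._listener.accept()
        return TcpCommunicator(sock)

    def close(self) -> None:
        self._listener.close()


class Initiator(Connector):
    """Client side: initiates a connection to a listening server."""

    def __init__(self, host: str, port: int):
        self._address = (host, port)

    def connect(self) -> TcpCommunicator:
        return TcpCommunicator(socket.create_connection(self._address))


def loopback_round_trip(payload: bytes) -> bytes:
    """Connect an Initiator to an Acceptor over loopback and pass one message."""
    acceptor = Acceptor()
    client = Initiator("127.0.0.1", acceptor.port).connect()
    server = acceptor.connect()
    try:
        client.send(payload)
        return server.receive()
    finally:
        client.close()
        server.close()
        acceptor.close()


print(loopback_round_trip(b"ping"))
```

Code that only needs a connected communicator can be written against `Connector`, which lets the same proxy logic act as a TCP client on the MUX host and as a TCP server on the client hosts.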
Conclusions
TCP is usually the transport protocol of choice in data communication applications that are loss sensitive. However, TCP's inherent inability to handle point to multipoint communication can severely restrict system scalability resulting in latencies that may be unacceptable even in systems with relatively relaxed delay sensitivities. Increase in message sizes, number of clients or number of messages to be broadcast per unit time can result in increased delays as well as resource utilization at the servers.
UDP's ability to broadcast and multicast packets over a LAN can provide the required scalability. However, UDP, being a best effort protocol, is not considered suitable for loss sensitive applications. Some application level enhancements can provide a degree of resilience against packet losses. However, migrating an application from TCP to UDP may require a significant rewrite. A proxy server based approach is presented here that allows TCP-based point to multipoint applications to be migrated to UDP without rewriting the existing application. Each host on the network executes a proxy server, which communicates over UDP with other proxy servers. Proxy servers communicate with the application components on the local host via TCP over local loopback.
Using proxy servers helped future product growth, as further developments could be phased appropriately. The design described above was generalized and formalized as the Active Transceiver architectural pattern for data communication systems [ Bashir03 ]. The next phase of development resulted in MUX software that communicated over UDP without the MUX-based proxy server, while not changing any of the client applications. In the subsequent phases, each client application was modified to communicate over UDP, in addition to other application specific enhancements.
[ Bashir03 ] Omar Bashir, Mubashir Hayat, Active Transceiver Design Pattern for Data Communication Applications, IEEE 7th International Multi-topic Conference 2003 (INMIC 2003), Islamabad, Pakistan, November 2003.