Odi's astoundingly incomplete notes

Fundamental communication theorem

Every programmer should know this one: any protocol over an unreliable medium (such as a network) must either accept that messages can be lost or accept that messages can arrive more than once. There is nothing in between. You can't rule out both loss and duplicates at the same time (see below for an explanation).

Unfortunately, this doesn't just apply to individual network packets (TCP already handles that case fine). It also applies to larger messages spanning multiple packets: HTTP, SMTP, messaging protocols such as JMS or those of any SOA messaging product (MQSeries, ActiveMQ, etc.), remote database protocols, etc.

Even a simple HTTP GET request exhibits the problem: as long as the client hasn't read the "200 OK" status code, it can't know whether the request has reached the server. So in the absence of that status code it has to retry the request, which may result in a duplicate request on the server.
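
The GET example can be sketched as a small simulation, with no real network involved. Everything here is made up for illustration: the `Server` class, the `send` helper and the scripted list of lost responses are assumptions, not part of any HTTP library.

```python
class Server:
    """Toy server that counts how often it executes a request."""

    def __init__(self):
        self.executions = 0

    def handle(self, request):
        self.executions += 1
        return "200 OK"


def send(server, request, drop_response):
    """Deliver the request, then optionally lose the response on the way back."""
    response = server.handle(request)  # the server always does the work
    return None if drop_response else response


def get_with_retry(server, request, response_losses):
    """Retry until a response arrives.

    response_losses is a scripted (hypothetical) failure pattern: one flag
    per attempt saying whether that attempt's response gets lost.
    Returns the number of attempts the client needed.
    """
    for attempt, lost in enumerate(response_losses, start=1):
        if send(server, request, drop_response=lost) is not None:
            return attempt
    raise TimeoutError("no response received")


server = Server()
# First response is lost, second one arrives.
attempts = get_with_retry(server, "GET /transfer", [True, False])
print(attempts, server.executions)  # 2 2
```

From the client's point of view a single request succeeded on the second try. The server, however, executed it twice.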

This simple fact has a direct and heavy impact on transactional behaviour: you will have to embed additional data in your protocol to handle loss, reordering and duplicates. If you don't, your protocol is not transactionally safe: you will lose data or end up executing the same transaction more than once.

How can you secure your protocol?

Transaction tokens: the client has to acquire a transaction token from the server and can use that token only once.
Message sequence numbers: the client sends a unique sequential number with every message. If it has to repeat a message, it uses the same sequence number again. The server stores the last used sequence number. If it detects a repeated message, it just replays the last response without executing the message again. If it detects an older sequence number, it discards the message. If it detects a sequence number higher than expected, server and client are out of sync and must renegotiate sequence numbers. NB: timestamps are usually insufficient as sequence numbers because of their limited precision, and because a gap between two timestamps doesn't reveal whether a message was lost.
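
Both approaches can be sketched in a few lines. This is a minimal in-memory illustration under some assumptions of mine: a single client, sequence numbers starting at 1, "higher" meaning anything beyond the next expected number, and no persistence. The names `TokenServer`, `SequenceServer` and `OutOfSync` are hypothetical.

```python
import secrets


class TokenServer:
    """Transaction tokens: each token permits exactly one execution."""

    def __init__(self):
        self.unused = set()
        self.executed = []

    def issue_token(self):
        token = secrets.token_hex(16)
        self.unused.add(token)
        return token

    def handle(self, token, payload):
        if token not in self.unused:
            return False  # unknown or already used: do not execute
        self.unused.remove(token)
        self.executed.append(payload)
        return True


class OutOfSync(Exception):
    """Client and server must renegotiate sequence numbers."""


class SequenceServer:
    """Sequence numbers: remember the last number and its response."""

    def __init__(self):
        self.last_seq = 0          # last sequence number that was executed
        self.last_response = None  # response to replay for a repeat
        self.executed = []

    def handle(self, seq, payload):
        if seq == self.last_seq + 1:      # the expected next message
            self.executed.append(payload)
            self.last_seq = seq
            self.last_response = f"done:{payload}"
            return self.last_response
        if seq == self.last_seq:          # repeat: replay, don't re-execute
            return self.last_response
        if seq < self.last_seq:           # older message: discard
            return None
        raise OutOfSync(f"expected {self.last_seq + 1}, got {seq}")


tokens = TokenServer()
t = tokens.issue_token()
first_use = tokens.handle(t, "debit 10")
second_use = tokens.handle(t, "debit 10")   # retry with the same token

seqs = SequenceServer()
r1 = seqs.handle(1, "debit 10")
r1_again = seqs.handle(1, "debit 10")       # retry: same response, no effect
r2 = seqs.handle(2, "debit 5")
stale = seqs.handle(1, "debit 10")          # older number: discarded
try:
    seqs.handle(5, "debit 1")               # gap: out of sync
    out_of_sync = False
except OutOfSync:
    out_of_sync = True

print(tokens.executed, seqs.executed)
```

In both cases the retry is harmless: the duplicate still arrives, but the server recognises it and executes the transaction only once.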

Explanation
"Unreliable medium" means that messages may be lost or invalidated (scrambled) on the way. A protocol may choose to detect such message loss. Loss may be detected on the sender side, the recipient side, or both:

recipient sees an out of order message or a gap in the message sequence
sender gets a negative acknowledgement (NAK) from the recipient, or doesn't see an acknowledgement (ACK) from the recipient within a certain time

Unfortunately that detection is itself unreliable, because the acknowledgement travels over the same medium and can be lost or delayed too. It will therefore report slightly more incidents than actually happened: sometimes the protocol detects a message loss when in fact the message was received fine. What does it do when it believes a message is lost? Of course it has to repeat it. Thus duplicates may occur.
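
The false positive is easy to reproduce in a toy model of a sender that waits for an ACK and retransmits on timeout. This is only a sketch: the function name `stop_and_wait`, the tuple layout and the timing values are assumptions chosen for illustration.

```python
def stop_and_wait(attempts, timeout):
    """Simulate a sender that retransmits until an ACK arrives in time.

    attempts is a scripted list with one (delivered, ack_delay) pair per
    transmission: delivered says whether the message reached the recipient,
    ack_delay is the time until the ACK arrives (None if it never does).
    Returns (transmissions sent, copies the recipient received).
    """
    sent = 0
    received = 0
    for delivered, ack_delay in attempts:
        sent += 1
        if delivered:
            received += 1
        if ack_delay is not None and ack_delay <= timeout:
            return sent, received  # ACK seen in time: stop retransmitting
    raise TimeoutError("gave up: no acknowledgement in time")


# Real loss: the first message never arrives, the retry does.
real_loss = stop_and_wait([(False, None), (True, 0.2)], timeout=1.0)

# False alarm: the first message arrives, but its ACK takes 1.5s with a
# 1.0s timeout, so the sender wrongly assumes loss and sends it again.
late_ack = stop_and_wait([(True, 1.5), (True, 0.2)], timeout=1.0)

print(real_loss, late_ack)  # (2, 1) (2, 2)
```

The sender cannot tell the two runs apart: in both it sent the message twice and then saw an ACK. Only in the second run did the recipient end up with a duplicate.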
posted on 2010-02-23 15:31 UTC in Code