Sunday, May 17, 2009

My RS485 Network Protocol

After hours of research, I was unable to find a suitable real-time, multi-master/peer-to-peer network protocol for RS485. So I made my own. Please evaluate it and comment on it.

The protocol is designed to work over RS485, using an RS485 transceiver connected to a PIC's UART pins. The serial format is 8 data bits, no parity, 1 stop bit. The data segment of a packet can be up to 0xFF bytes long, but since most midrange controllers don't have that much RAM to spare, I'll stick with 26 bytes. The protocol has a token bus type topology (or at least that's what I think it's called), so it's deterministic, collision-free and real-time.

The packet looks like this:

<preamble> <destination> <source> <flags> <length of following data> <data> ... <data> <checksum>


<preamble>: 1 byte. Start of packet. 0xAA.


<destination>, <source>: 1 byte each. Node addresses are 0x01 - 0xFE. 0x00 represents the bus owner, configuration node or hub (unimplemented for now). 0xFF is broadcast, for all nodes.

<flags>: 1 byte, bits 0 - 7:
0 - ACK - acknowledge
1 - NACK - no acknowledge (could not use the packet, for some reason)
2 - ACK_REQ - when sending, request an acknowledgement from target node
3 - ID_REQ - request identification information from the target node
4 - TYPE - is the data segment a special command type or a normal data type?
5 - TOKEN - this packet is a token and contains data of zero length; all other flag bits must be zero
6 - not used/application specific
7 - not used/application specific
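
A minimal sketch of the flag bits as C bit masks. The post numbers the flags 0 to 7 but does not say which end of the byte bit 0 is, so this assumes bit 0 is the least significant bit. The names FLAG_* and is_valid_token are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks for the <flags> byte.
   ASSUMPTION: flag number n maps to bit n, with bit 0 the least significant. */
enum {
    FLAG_ACK     = 1u << 0,  /* acknowledge */
    FLAG_NACK    = 1u << 1,  /* could not use the packet */
    FLAG_ACK_REQ = 1u << 2,  /* sender requests an acknowledgement */
    FLAG_ID_REQ  = 1u << 3,  /* request identification information */
    FLAG_TYPE    = 1u << 4,  /* command segment vs. normal data */
    FLAG_TOKEN   = 1u << 5,  /* packet is a token */
    FLAG_APP6    = 1u << 6,  /* not used / application specific */
    FLAG_APP7    = 1u << 7   /* not used / application specific */
};

/* A token carries zero-length data and has every other flag bit clear. */
static bool is_valid_token(uint8_t flags, uint8_t length)
{
    return flags == FLAG_TOKEN && length == 0;
}
```

Comparing the whole byte against FLAG_TOKEN, rather than testing the single bit, enforces the "all other flag bits must be zero" rule in one step.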


<length>: the length (in bytes) of the data segment that follows


<data>: the data (or command) segment itself


<checksum>: 1 byte. An 8-bit polynomial checksum (a CRC).
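
To make the layout concrete, here is a rough sketch of building a packet in C. Two things are assumptions, because the post does not specify them: the polynomial (the common CRC-8 polynomial 0x07 with an initial value of 0x00 is used here) and the bytes the checksum covers (here, everything from <destination> through the last data byte, but not the preamble). The names crc8 and build_packet are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PREAMBLE     0xAA
#define MAX_DATA_LEN 26
#define HEADER_LEN   5    /* preamble, destination, source, flags, length */

/* Bitwise CRC-8, polynomial x^8 + x^2 + x + 1 (0x07), initial value 0x00.
   ASSUMPTION: the post does not say which polynomial it uses. */
static uint8_t crc8(const uint8_t *buf, size_t len)
{
    uint8_t crc = 0x00;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
        }
    }
    return crc;
}

/* Writes a complete packet into out, which must hold at least
   HEADER_LEN + len + 1 bytes. Returns the packet size in bytes,
   or 0 if len exceeds MAX_DATA_LEN. */
static size_t build_packet(uint8_t *out, uint8_t dst, uint8_t src,
                           uint8_t flags, const uint8_t *data, uint8_t len)
{
    if (len > MAX_DATA_LEN) {
        return 0;
    }
    out[0] = PREAMBLE;
    out[1] = dst;
    out[2] = src;
    out[3] = flags;
    out[4] = len;
    if (len > 0) {
        memcpy(&out[HEADER_LEN], data, len);
    }
    /* ASSUMPTION: the checksum covers destination..data, not the preamble. */
    out[HEADER_LEN + len] = crc8(&out[1], (size_t)(HEADER_LEN - 1) + len);
    return (size_t)HEADER_LEN + len + 1;
}
```

With this particular CRC variant, a receiver can run crc8 over destination through checksum and accept the packet when the result is zero.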

Notes:


The lowest ID on the bus has voice (it starts with the token). If it has messages to send, it can send up to 4 packets. Then it MUST give up its voice and pass the token to the next node. In other words, the 5th packet (or an earlier one, if it has fewer than 4 packets or no data to send) must be a token that it passes to the next node.

If the node has no data to send, it will pass the token to the next node on the bus, which is its own ID + 1.
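
The turn-taking rule can be sketched in a few lines of C. The post does not say what the highest ID does with the token, so the wrap-around back to the lowest ID is an assumption, as are the function names and the idea that every node knows the lowest and highest IDs on the bus.

```c
#include <stdint.h>

#define MAX_PACKETS_PER_TURN 4

/* Next token holder: own ID + 1.
   ASSUMPTION: IDs are consecutive from lowest_id to highest_id, and the
   highest ID hands the token back to the lowest one. */
static uint8_t next_node(uint8_t own_id, uint8_t lowest_id, uint8_t highest_id)
{
    return (own_id >= highest_id) ? lowest_id : (uint8_t)(own_id + 1);
}

/* Number of data packets a node may send this turn, given how many
   packets are waiting in its (hypothetical) transmit queue. */
static uint8_t packets_this_turn(uint8_t queued)
{
    return (queued > MAX_PACKETS_PER_TURN) ? MAX_PACKETS_PER_TURN : queued;
}
```

A node holding the token would send packets_this_turn(queued) data packets and then a token addressed to next_node(...), so the token is always the last packet of its turn.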

IDs are static on the bus and are assigned incrementally (I want to make them dynamic, though).

When a byte is received, it is buffered in memory in the "interrupt on RX" function. After the last byte (checksum) has been received, the packet is checked, validated and used. The packet is discarded if it is not addressed to the node.
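
One possible shape for the "interrupt on RX" buffering is a small state machine that is fed one byte at a time. This is a sketch, not the actual firmware: the type and function names are invented, the interrupt hookup is left out, and checksum verification is left to the main loop.

```c
#include <stdbool.h>
#include <stdint.h>

#define PREAMBLE     0xAA
#define MAX_DATA_LEN 26
#define BROADCAST_ID 0xFF

typedef enum {
    RX_WAIT_PREAMBLE, RX_DST, RX_SRC, RX_FLAGS, RX_LEN, RX_DATA, RX_CHECKSUM
} rx_state_t;

typedef struct {
    rx_state_t state;
    uint8_t dst, src, flags, len;
    uint8_t data[MAX_DATA_LEN];
    uint8_t count;     /* data bytes buffered so far */
    uint8_t checksum;  /* checksum byte as received, not yet verified */
} rx_parser_t;

/* Feed one received byte (e.g. from the UART RX interrupt).
   Returns true once a whole packet has been buffered. */
static bool rx_feed(rx_parser_t *p, uint8_t byte)
{
    switch (p->state) {
    case RX_WAIT_PREAMBLE:
        if (byte == PREAMBLE) p->state = RX_DST;
        break;
    case RX_DST:   p->dst = byte;   p->state = RX_SRC;   break;
    case RX_SRC:   p->src = byte;   p->state = RX_FLAGS; break;
    case RX_FLAGS: p->flags = byte; p->state = RX_LEN;   break;
    case RX_LEN:
        if (byte > MAX_DATA_LEN) {          /* too long for our buffer */
            p->state = RX_WAIT_PREAMBLE;
            break;
        }
        p->len = byte;
        p->count = 0;
        p->state = (byte == 0) ? RX_CHECKSUM : RX_DATA;
        break;
    case RX_DATA:
        p->data[p->count++] = byte;
        if (p->count == p->len) p->state = RX_CHECKSUM;
        break;
    case RX_CHECKSUM:
        p->checksum = byte;
        p->state = RX_WAIT_PREAMBLE;
        return true;
    }
    return false;
}

/* A packet belongs to this node if it is addressed to it or broadcast. */
static bool packet_is_for_me(const rx_parser_t *p, uint8_t own_id)
{
    return p->dst == own_id || p->dst == BROADCAST_ID;
}
```

Because the parser counts data bytes using the length field, a 0xAA inside the data segment is not mistaken for a new preamble. A node that starts listening mid-packet can still lock onto a stray 0xAA, which is one reason the checksum has to be verified before the packet is used.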

The packets shall be examined as CPU time permits. One question to you all, though: since this type of network requires constantly examining packets, does it take a lot of CPU time? I assume it does.

The token bus type network guarantees that a node will be able to transmit every x units of time. However, for small, embedded applications, two other types of networks are also very possible: CSMA/CD and Polling. All the different types of networks are discussed on this page: http://www.ece.cmu.edu/~koopman/protsrvy/protsrvy.html
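
To put a rough number on "every x units of time", the worst-case token rotation can be estimated from the packet sizes. The sketch below assumes every node sends 4 full packets (6 bytes of overhead plus 26 data bytes each) followed by a 6-byte token, and that each byte costs 10 bit times on the wire (8N1). It ignores ACK packets, transceiver turnaround and processing delays, so the real figure will be higher. The function name is made up.

```c
#include <stdint.h>

#define BITS_PER_BYTE    10u  /* 8N1: start bit + 8 data bits + stop bit */
#define FRAME_OVERHEAD    6u  /* preamble, dst, src, flags, length, checksum */
#define MAX_DATA_LEN     26u
#define PACKETS_PER_TURN  4u

/* Time in microseconds for the token to visit every node once when each
   node uses its full turn. Optimistic: wire time only. */
static uint32_t worst_case_rotation_us(uint32_t nodes, uint32_t baud)
{
    if (baud == 0) {
        return 0;
    }
    /* 4 packets * 32 bytes + one 6-byte token = 134 bytes per node */
    uint64_t bytes_per_turn =
        PACKETS_PER_TURN * (FRAME_OVERHEAD + MAX_DATA_LEN) + FRAME_OVERHEAD;
    uint64_t bits = (uint64_t)nodes * bytes_per_turn * BITS_PER_BYTE;
    return (uint32_t)(bits * 1000000u / baud);
}
```

For example, 10 nodes at 115200 baud give 10 * 134 * 10 = 13400 bit times, or about 116 ms per rotation under these assumptions.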

 

Further:

 

After posting on the PICLIST and getting opinions about the protocol, I got the following responses:

  • What happens when the token gets lost? For example, when the node which has the token goes down. Tokens get lost.
    • My possible solution: I don’t expect a node which owns the token to go down. Note that when a token is sent, the receiving node must always acknowledge that it got the token. If it does not send an ACK, the sender still owns the token and must retry a few times. After a few unsuccessful tries (4 or 5), the token owner broadcasts (to 0xFF) that the target node is probably down and sends the token to the node after it (n + 0x02). Now, to recover from losing the node that holds the token, we could add another node to the bus, known as the bus owner, with an address of 0x00. The owner takes part in passing the token around and keeps track of who is on the bus and who is not. If the token gets lost, the bus owner polls each of the nodes to see which node went down. After that, the bus owner either reassigns the IDs so that they are all consecutive or broadcasts which node went down so that the other nodes do not pass the token to it. Basically, the bus owner makes sure that the bus is in working order.
  • Look at the DMX512 protocol.
  • Look at uLan - http://ulan.sourceforge.net/
  • What happens when an ACK is not received?
    • In this situation, the first packet (with an ACK_REQ) is successfully transmitted, but there is an error when the target node sends an ACK. In effect, the ACK does not get sent back to the original sender. The sender would probably then send the packet again. This could be a problem especially when node A passes the token to node B. B then needs to ACK to A that it got the token. If A does not get the ACK, A will then send the token again. But what might happen here is that B will start talking on the network since it has voice (it got the token). A collision occurs. This might be a very improbable situation, but if it does happen, it will disrupt the bus.
    • A possible solution to this is that:
      • A token –> B
      • B ACK –> A
      • A ACK –> B
    • What happens here is that A sends a token to B and waits for an ACK.
      • If B did not get the token at all, it obviously will not send an ACK. In this case, A still has the token. A will wait until a timeout triggers and then retry sending the token to B.
      • If B got the token:
    • B sends an ACK to A.
      • If there is a problem sending the first ACK, B will retry sending the ACK until it gets a response. If it doesn’t get a response, B drops the token and it is assumed that A still has the token due to an unsuccessful transfer.
      • If A got the ACK:
    • A sends an ACK to B.
      • If there is a problem sending the ACK to B, B will drop the token. Now no one has the token. The bus owner notices this because the token has not been circulating for a while. The bus owner then resets the bus by giving the token to a certain node.
      • If B got the ACK, B now has the token.
    • As you can see, this gets very complicated really quickly. Maybe CSMA/CD isn’t so bad after all.
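
The token hand-off discussed above (token, ACK, confirming ACK, with retries and timeouts) can be sketched as two small event-driven state machines, one for the sender A and one for the receiver B. This is an illustration under assumptions: the type names, the event names and the retry limit of 4 are chosen here, and the actual sending of packets and timing of timeouts is left to the caller.

```c
#include <stdint.h>

#define MAX_TRIES 4  /* the post suggests 4 or 5 attempts */

typedef enum { EV_TOKEN, EV_ACK, EV_TIMEOUT } event_t;

/* Sender A: has just sent the token to B and waits for B's ACK. */
typedef enum { A_WAIT_ACK, A_DONE, A_SKIP_TARGET } a_state_t;
typedef struct { a_state_t state; uint8_t tries; } sender_t;

static void sender_start(sender_t *a)
{
    a->state = A_WAIT_ACK;
    a->tries = 1;                      /* the first token has been sent */
}

static void sender_on_event(sender_t *a, event_t ev)
{
    if (a->state != A_WAIT_ACK) return;
    if (ev == EV_ACK) {
        a->state = A_DONE;             /* caller sends the confirming ACK */
    } else if (ev == EV_TIMEOUT) {
        if (a->tries < MAX_TRIES) {
            a->tries++;                /* caller resends the token */
        } else {
            a->state = A_SKIP_TARGET;  /* caller broadcasts "node down" and
                                          tries the node after the target */
        }
    }
}

/* Receiver B: owns the token only after A confirms B's ACK. */
typedef enum { B_IDLE, B_WAIT_CONFIRM, B_OWNS_TOKEN } b_state_t;
typedef struct { b_state_t state; uint8_t tries; } receiver_t;

static void receiver_on_event(receiver_t *b, event_t ev)
{
    switch (b->state) {
    case B_IDLE:
        if (ev == EV_TOKEN) {
            b->state = B_WAIT_CONFIRM; /* caller sends the ACK to A */
            b->tries = 1;
        }
        break;
    case B_WAIT_CONFIRM:
        if (ev == EV_ACK) {
            b->state = B_OWNS_TOKEN;   /* B may now talk on the bus */
        } else if (ev == EV_TIMEOUT) {
            if (b->tries < MAX_TRIES) {
                b->tries++;            /* caller resends the ACK */
            } else {
                b->state = B_IDLE;     /* drop the token */
            }
        }
        break;
    case B_OWNS_TOKEN:
        break;
    }
}
```

The sketch also shows the remaining hole: if A reaches A_DONE but its confirming ACK never arrives, B falls back to B_IDLE and nobody holds the token, which is the case the bus owner would have to repair.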

1 comment:

  1. How can you ensure real-time behaviour, given the packetization delay and the processing delay at each node (the other nodes are idle while one node is transmitting or receiving)? Can a 2 Mbps RS485 bus provide determinism?

    -Wareendra Nath
