Import of the watch repository from Pebble

Matthieu Jeanson 2024-12-12 16:43:03 -08:00 committed by Katharine Berry
commit 3b92768480
10334 changed files with 2564465 additions and 0 deletions


@ -0,0 +1,135 @@
PULSE Flash Imaging
===================
This document describes the PULSE flash imaging protocol. This protocol
was originally designed for use over PULSEv1, but also works over the
PULSEv2 Best-Effort transport.
The flash imaging protocol is used to write raw data directly to the
external flash. It is a message-oriented protocol encapsulated in PULSE
frames. The primary goals of the protocol are reliability and speed over
high-latency links.
* All client-sent commands elicit responses
* As much as possible, any command can be resent without corrupting the
flashing process. This is to accommodate the situation where the
command was received and acted upon but the response was lost, and the
client retried the command.
* Any notification (server→client message which is not a response to a
command) can be lost without holding up the flashing process. There
must be a way to poll for the status of all operations which elicit
notifications.
* Most of the state is tracked by the client. The server only has to
maintain a minimal, fixed-size amount of state.
> The idempotence of writing to flash is leveraged in the design of this
> protocol to effectively implement a [Selective Repeat ARQ](http://en.wikipedia.org/wiki/Selective_Repeat_ARQ)
> with an unlimited window size without requiring the server to keep
> track of which frames are missing. Any Write Data command to the same
> location in flash can be repeated any number of times with no ill
> effects.
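
The effect of idempotent writes can be illustrated with a small
simulation. This is only a sketch under stated assumptions: the flash
is modelled as an in-memory `bytearray`, the link drops commands and
acknowledgements at random, and the names `flash_over_lossy_link`,
`chunk` and `loss` are hypothetical, not part of the protocol.

```python
import random


def flash_over_lossy_link(image: bytes, chunk: int = 4,
                          loss: float = 0.3, seed: int = 1) -> bytes:
    """Write `image` through a simulated lossy link and return the flash."""
    rng = random.Random(seed)
    # Server side: the only state is the flash contents themselves.
    flash = bytearray(b"\xff" * len(image))
    # Client side: the set of write addresses not yet acknowledged.
    pending = set(range(0, len(image), chunk))
    while pending:
        for address in sorted(pending):
            data = image[address:address + chunk]
            if rng.random() < loss:
                continue  # Write Data command lost in transit
            # Repeating a write to the same location is harmless.
            flash[address:address + len(data)] = data
            if rng.random() < loss:
                continue  # acknowledgement lost; the client will resend
            pending.discard(address)
    return bytes(flash)
```

The server never tracks which writes are missing; the client simply
resends every write it has not seen acknowledged.
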
## Message format
All fields in a message which are more than one octet in length are
transmitted least-significant octet (LSB) first.
All messages begin with a 1-octet Opcode, followed by zero or more data
fields depending on the message. All Address fields are offsets from the
beginning of flash. Address and Length fields are specified in units of
bytes.
### Client Commands
#### 1 - Erase flash region
Address: 4 octets
Length: 4 octets
#### 2 - Write data to flash
Address: 4 octets
Data: 1+ octets
The data length is implied.
#### 3 - Checksum flash region
Address: 4 octets
Length: 4 octets
#### 4 - Query flash region geometry
Region: 1 octet
Region | Description
-------|-------------------
1 | PRF
2 | Firmware resources
#### 5 - Finalize flash region
Region: 1 octet
Inform the server that writing is complete and perform whatever task is
necessary to finalize the data written to the region. This may be a
no-op.
Region numbers are the same as for the "Query flash region geometry"
message.
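
As a rough illustration, the client commands above could be serialized
in Python as follows. The function and constant names are hypothetical;
only the opcodes, field widths and little-endian byte order come from
this document.

```python
import struct

# Opcodes and region numbers from the tables above.
OP_ERASE, OP_WRITE, OP_CHECKSUM, OP_QUERY_REGION, OP_FINALIZE_REGION = range(1, 6)
REGION_PRF, REGION_FIRMWARE_RESOURCES = 1, 2


def erase_command(address: int, length: int) -> bytes:
    # "<" selects little-endian with no padding: opcode, Address, Length.
    return struct.pack("<BII", OP_ERASE, address, length)


def write_command(address: int, data: bytes) -> bytes:
    if not data:
        raise ValueError("Write Data requires at least one data octet")
    # The data length is implied by the size of the message.
    return struct.pack("<BI", OP_WRITE, address) + data


def checksum_command(address: int, length: int) -> bytes:
    return struct.pack("<BII", OP_CHECKSUM, address, length)


def query_region_command(region: int) -> bytes:
    return struct.pack("<BB", OP_QUERY_REGION, region)


def finalize_region_command(region: int) -> bytes:
    return struct.pack("<BB", OP_FINALIZE_REGION, region)
```
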
### Server Responses
#### 128 - ACKnowledge erase command
Address: 4 octets
Length: 4 octets
Complete?: 1 octet
Complete field is zero if the erase is in progress, nonzero when the
erase is complete.
#### 129 - ACKnowledge write command
Address: 4 octets
Length: 4 octets
Complete?: 1 octet
#### 130 - Checksum result
Address: 4 octets
Length: 4 octets
Checksum: 4 octets
The legacy Pebble checksum ("STM32 CRC") of the specified memory is
returned.
#### 131 - Flash region geometry
Region: 1 octet
Address: 4 octets
Length: 4 octets
A length of zero indicates that the region does not exist.
#### 132 - ACKnowledge finalize flash region command
Region: 1 octet
#### 192 - Malformed command
Bad message: 9 octets
Error string: 0+ octets
#### 193 - Internal error
Error string: 0+ octets
Something has gone terribly wrong which prevents flashing from
proceeding.
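
A client might decode the server responses along these lines. This is a
sketch, not a reference parser: `parse_response` and the dictionary keys
are hypothetical, and error strings are left as raw bytes because this
document does not specify their encoding.

```python
import struct


def parse_response(message: bytes) -> dict:
    """Decode one server response into a dictionary of its fields."""
    if not message:
        raise ValueError("empty message")
    opcode, body = message[0], message[1:]
    if opcode in (128, 129):  # erase / write acknowledgement
        address, length, complete = struct.unpack("<IIB", body)
        return {"opcode": opcode, "address": address, "length": length,
                "complete": complete != 0}
    if opcode == 130:  # checksum result
        address, length, checksum = struct.unpack("<III", body)
        return {"opcode": opcode, "address": address, "length": length,
                "checksum": checksum}
    if opcode == 131:  # flash region geometry
        region, address, length = struct.unpack("<BII", body)
        # A length of zero indicates that the region does not exist.
        return {"opcode": opcode, "region": region, "address": address,
                "length": length, "exists": length != 0}
    if opcode == 132:  # finalize acknowledgement
        (region,) = struct.unpack("<B", body)
        return {"opcode": opcode, "region": region}
    if opcode == 192:  # malformed command
        return {"opcode": opcode, "bad_message": body[:9], "error": body[9:]}
    if opcode == 193:  # internal error
        return {"opcode": opcode, "error": body}
    raise ValueError(f"unknown opcode {opcode}")
```
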
<!-- vim: set tw=72: -->

docs/pulse2/history.md Normal file

@ -0,0 +1,165 @@
History of PULSEv2
==================
This document describes the history of the Pebble dbgserial console
leading up to the design of PULSEv2.
In The Beginning
----------------
In the early days of Pebble, the dbgserial port was used to print out
log messages in order to assist in debugging the firmware. These logs
were plain text and could be viewed with a terminal emulator such as
minicom. An interactive prompt was added so that firmware developers and
the manufacturing line could interact with the running firmware. The
prompt mode could be accessed by pressing CTRL-C at the terminal, and
could be exited by pressing CTRL-D. Switching the console to prompt mode
suppressed the printing of log messages. Data could be written into the
external flash memory over the console port by running a prompt command
to switch the console to a special "flash imaging" mode and sending it
base64-encoded data.
This setup worked well enough, though it was slow and a little
cumbersome to use at times. Some hacks were tacked on as time went on,
like a "hybrid" prompt mode which allowed commands to be executed
without suppressing log messages. These hacks didn't work terribly well.
But it didn't really matter as the prompt was only used internally and
it was good enough to let people get stuff done.
First Signs of Trouble
----------------------
The problems with the serial console started becoming apparent when
we started building out automated integration testing. The test
automation infrastructure made extensive use of the serial console to
issue commands to simulate actions such as button clicks, inspect the
firmware state, install applications, and capture screenshots and log
messages. From the very beginning the serial console proved to be very
unreliable for test automation's uses, dropping commands, corrupting
screenshots and other data, and hiding log messages. The test automation
harness which interacted with the dbgserial port became full of hacks
and workarounds, but was still very unreliable. While we wanted to have
functional and reliable automated testing, we didn't have the manpower
at the time to improve the serial console for test automation's use
cases. And so test automation remained frustratingly unreliable for a
long time.
PULSEv1
-------
During the development of Pebble Time, the factory was complaining that
imaging the recovery firmware onto external flash over the dbgserial
port was taking too long and was causing a manufacturing bottleneck. The
old flash imaging mode had many issues and was in need of a replacement
anyway, and improving the throughput to reduce manufacturing costs
finally motivated us to allocate engineering time to replace it.
The biggest reason the flash imaging protocol was so slow was that it
was extremely latency sensitive. After every 768 data bytes sent, the
sender was required to wait for the receiver to acknowledge the data
before continuing. USB-to-serial adapter ICs are used at both the
factory and by developers to interface the watches' dbgserial ports to
modern computers, and these adapters can add up to 16 ms latency to
communications in each direction. The vast majority of the flash imaging
time was wasted with the dbgserial port idle, waiting for the sender to
receive and respond to an acknowledgement.
There were other problems too, such as a lack of checksums. If line
noise (which wasn't uncommon at the factory) corrupted a byte into
another valid base64 character, the corruption would go unnoticed and be
written out to flash. It would only be after the writing was complete
that the integrity was verified, and the entire transfer would have to
be restarted from the beginning.
Instead of designing a new flash imaging protocol directly on top of the
raw dbgserial console, as the old flash imaging protocol did, a
link-layer protocol was designed which the new flash imaging protocol
would operate on top of. This new protocol, PULSE version 1, provided
best-effort multiprotocol datagram delivery with integrity assurance to
any applications built on top of it. That is, PULSE allowed
applications to send and receive packets over dbgserial, without
interfering with other applications simultaneously using the link, with
the guarantee that the packets either will arrive at the receiver intact
or not be delivered at all. It was designed around the use-case of flash
imaging, with the hope that other protocols could be implemented over
PULSE later on. The hope was that this was the first step to making test
automation reliable.
Flash imaging turns out to be rather unique, with affordances that make
it easy to implement a performant protocol without protocol features
that many other applications would require. Writing to flash memory is
an idempotent operation: writing the same bytes to the same flash
address _n_ times has the same effect as writing it just once. And
writes to different addresses can be performed in any order. Because
of these features of flash, each write operation can be treated as a
wholly independent operation, and the data written to flash will be
complete as long as every write is performed at least once. The
communications channel for flash writes does not need to be reliable,
only error-free. The protocol is simple: send a write command packet
with the target address and data. The receiver performs the write and
sends an acknowledgement with the address. If the sender doesn't receive
an acknowledgement within some timeout, it re-sends the write command.
Any number of write commands and acknowledgements can be in-flight
simultaneously. If a write completes but the acknowledgement is lost in
transit, the sender can re-send the same write command and the receiver
can naively overwrite the data without issue due to the idempotence of
flash writes.
The new PULSE flash imaging protocol was a great success, reducing
imaging time from over sixty seconds down to ten, with the bottleneck
being the speed at which the flash memory could be erased or written.
After the success of PULSE flash imaging, attempts were made to
implement other protocols on top of it, with varying degrees of success.
A protocol for streaming log messages over PULSE was implemented, as
well as a protocol for reading data from external flash. There were
attempts to implement prompt commands and even an RPC system using
dynamically-loaded binary modules over PULSE, but they required reliable
and in-order delivery, and implementing a reliable transmission scheme
separately for each application protocol proved to be very
time-consuming and bug-prone.
Other flaws in PULSE became apparent as it came into wider use. The
checksum used to protect the integrity of PULSE frames was discovered to
have a serious flaw, where up to three trailing 0x00 bytes could be
appended to or dropped from a packet without changing the checksum
value. This flaw, combined with the lack of explicit length fields in
the protocol headers, made it much more likely for PULSE flash imaging
to write corrupted data. This was discovered shortly after test
automation switched over to PULSE flash imaging.
Make TA Green Again
-------------------
Around January 2016, it was decided that the issues with PULSE that were
preventing test automation from fully dropping use of the legacy serial
console would best be resolved by taking the lessons learned from PULSE
and designing a successor. This new protocol suite, appropriately
enough, is called PULSEv2. It is designed with test automation in mind,
with the intention of completely replacing the legacy serial console for
test automation, developers and the factory. It is much better at
communicating and synchronizing link state, which solves the problems
test automation ran into when firmware crashes and reboots confused the
test harness. It uses a standard checksum
without the flaws of its predecessor, and packet lengths are explicit.
And it is future-proofed by having an option-negotiation mechanism,
allowing us to add new features to the protocol while allowing old and
new implementations to interoperate.
Applications can choose to communicate with either best-effort datagram
service (like PULSEv1), or reliable datagram service that guarantees
in-order datagram delivery. Having the reliable transport available
made it very easy to implement prompt commands over PULSEv2. And it was
also surprisingly easy to implement a PULSEv2 transport for the Pebble
Protocol, which allows developers and test automation to interact with
bigboards using libpebble2 and pebble-tool, exactly like they can with
emulators and sealed watches connected to phones.
Test automation switched over to PULSEv2 on 2016 May 31. It immediately
cut down test run times and, once some bugs got shaken out, measurably
improved the reliability of test automation. It also made the captured
logs from test runs much more useful as messages were no longer getting
dropped. PULSEv2 was made the default for all firmware developers at the
end of September 2016.
<!-- vim: set tw=72: -->

docs/pulse2/pulse2.md Normal file

@ -0,0 +1,419 @@
PULSEv2 Protocol Suite
======================
Motivation
----------
The initial design of PULSE was shaped by its initial use case of flash
imaging. Flash imaging has a few properties which allowed it to be
implemented on top of a very simplistic wire protocol. Writing to flash
can be split up into any number of atomic write operations that can be
applied in arbitrary order. Flash writes are idempotent: repeatedly
writing the same data to the same flash address does not corrupt the
written data. Because of these properties, it was possible to implement
the flash imaging protocol in a stateless manner simply by ensuring that
every write was applied at least once without concern for out of order
delivery or duplicated datagrams. The PULSE link layer was designed as
simply as possible, guaranteeing only datagram integrity with only
best-effort reliability and sequencing, since it was all that the flash
imaging protocol needed.
As we try to use PULSE for more applications, it has become clear that
flash imaging is a special case. Most applications have some manner of
statefulness or non-idempotent operations, so they need guarantees about
reliable delivery and sequencing of datagrams in order to operate
correctly in the face of lost or corrupted datagrams. The lack of such
guarantees in PULSE has forced these applications to bake sequencing and
retransmissions into the application protocols in an ad-hoc manner,
poorly. This has made the design and implementation of prompt and file
transfer protocols more complex than necessary, and no attempt has yet
been made to tunnel Pebble Protocol over PULSE. It's the [waterbed
theory](http://wiki.c2.com/?WaterbedTheory) at work.
Adding support for reliable, ordered delivery of datagrams will allow
for any application to make use of reliable service simply by requesting
it. Implementation of chatty protocols will be greatly simplified.
Protocol Stack
--------------
PULSEv2 is a layered protocol stack. The link layer provides
integrity-assured delivery of packet data. On top of the link layer is a
suite of transport protocols which provide multiprotocol delivery of
application datagrams with or without guaranteed reliable in-order
delivery. Application protocols use one or more of the available
transports to exchange datagrams between the firmware running on a board
and a host workstation.
Physical Layer
--------------
PULSEv2 supports asynchronous serial byte-oriented full duplex links,
8-N-1, with each octet transmitted least-significant bit first. The link
must transparently pass all octet values. The baud rate is 1,000,000 bps.
> **Why that baud rate?**
>
> 1 Mbaud is a convenient choice as it is the highest frequency which
> divides perfectly into a 16 MHz core clock at 16x oversampling, and
> works with zero error at 64, 80 and 100 MHz (with only 100 MHz
> requiring any fractional division at all). The only downside is that
> it is not a "standard" baud rate, but this is unlikely to be a problem
> as FTDI, PL2303, CP2102 (but not CP2101) and likely others will handle
> 1 Mbaud rates (at least in hardware). YMMV with Windows drivers...
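
The divider arithmetic behind that note can be checked with a few lines
of Python. This sketch assumes a USART whose baud rate divider is
`clock / (16 * baud)` with a fractional part in sixteenths, as on many
microcontrollers; `usart_divider` is a hypothetical helper name.

```python
from fractions import Fraction

BAUD = 1_000_000
OVERSAMPLING = 16


def usart_divider(clock_hz: int) -> Fraction:
    """Exact divider needed to derive BAUD from the given core clock."""
    return Fraction(clock_hz, OVERSAMPLING * BAUD)


for mhz in (16, 64, 80, 100):
    divider = usart_divider(mhz * 1_000_000)
    # The divider is exactly representable if it is a whole number of
    # sixteenths, which means zero baud rate error.
    exact = (divider * 16).denominator == 1
    print(f"{mhz} MHz: divider {float(divider)}, exact: {exact}")
```

The dividers come out as 1, 4 and 5 for 16, 64 and 80 MHz, and 6.25 for
100 MHz, the only case that needs a fractional part.
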
Link Layer
----------
The link layer, in a nutshell, is PPP with custom framing. The entirety
of [RFC 1661](https://tools.ietf.org/html/rfc1661) is normative, except
as noted in this document.
### Encapsulation
PPP encapsulation (RFC 1661, Section 2) is used. The Padding field of
the PPP encapsulation must be empty.
A summary of the frame structure is shown below. This figure does not
include octets inserted for transparency. The fields are transmitted
from left to right.
Flag | Protocol | Information | FCS | Flag
-----|----------|-------------|----------|-----
0x55 | 2 octets | * | 4 octets | 0x55
#### Flag field
Each frame begins and ends with a Flag sequence, which is the octet 0x55
hexadecimal. The flag is used for frame synchronization.
> **Why 0x55?**
>
> It is transmitted as bit pattern `(1)0101010101`, which is really easy
> to spot on an oscilloscope trace or logic analyzer capture, and it
> allows for auto baud rate detection. The STM32F7 USART supports auto
> baud rate detection with an 0x55 character in hardware.
Only one Flag sequence is required between two frames. Two consecutive
Flag sequences constitute an empty frame, which is silently discarded.
#### Protocol field
The Protocol field is used as prescribed by RFC 1661, Section 2. PPP
assigned protocol numbers and their respective assigned protocols should
be used wherever it makes sense. Custom protocols must not be assigned
protocol numbers which overlap any [existing PPP assigned protocol](http://www.iana.org/assignments/ppp-numbers/ppp-numbers.xhtml).
#### Frame Check Sequence field
The Frame Check Sequence is transmitted least significant octet first.
The check sequence is calculated using the [CRC-32](http://reveng.sourceforge.net/crc-catalogue/all.htm#crc.cat.crc-32)
checksum. The parameters of the CRC algorithm are:
width=32 poly=0x04c11db7 init=0xffffffff refin=true refout=true
xorout=0xffffffff check=0xcbf43926 name="CRC-32"
The FCS field is calculated over all bits of the Protocol and
Information fields, not including any start and stop bits, or octets
inserted for transparency. This also does not include the Flag sequence
nor the FCS field itself.
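
The CRC parameters listed above are those of the CRC-32 computed by
Python's `zlib.crc32`, so the FCS can be sketched as below. The helper
names are hypothetical. The Protocol field is packed most-significant
octet first, following RFC 1661, and the FCS least-significant octet
first, as stated above.

```python
import struct
import zlib


def append_fcs(protocol: int, information: bytes) -> bytes:
    """Return Protocol + Information + FCS, before transparency encoding."""
    body = struct.pack(">H", protocol) + information
    fcs = zlib.crc32(body) & 0xFFFFFFFF
    return body + struct.pack("<I", fcs)


def check_fcs(content: bytes) -> bool:
    """Verify the trailing FCS of a decoded frame's contents."""
    if len(content) < 6:  # 2-octet Protocol plus 4-octet FCS
        return False
    body, fcs = content[:-4], content[-4:]
    return struct.unpack("<I", fcs)[0] == (zlib.crc32(body) & 0xFFFFFFFF)
```
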
### Transparency
Transparency is achieved by applying [COBS](https://en.wikipedia.org/wiki/Consistent_Overhead_Byte_Stuffing)
encoding to the Protocol, Information and FCS fields, then replacing any
instances of 0x55 in the COBS-encoded data with 0x00.
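
The following Python sketch shows one way to implement this transparency
scheme. It is an illustration rather than the reference implementation:
all function names are hypothetical, and the COBS routine is a plain
textbook encoder that always emits a final code octet.

```python
FLAG = 0x55


def cobs_encode(data: bytes) -> bytes:
    """Plain COBS: the output never contains 0x00."""
    out = bytearray()
    block = bytearray()
    for byte in data:
        if byte == 0:
            out.append(len(block) + 1)
            out += block
            block.clear()
        else:
            block.append(byte)
            if len(block) == 254:  # maximum run without an implied zero
                out.append(0xFF)
                out += block
                block.clear()
    out.append(len(block) + 1)
    out += block
    return bytes(out)


def cobs_decode(encoded: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(encoded):
        code = encoded[i]
        if code == 0:
            raise ValueError("zero octet in COBS data")
        block = encoded[i + 1:i + code]
        if len(block) != code - 1:
            raise ValueError("truncated COBS block")
        out += block
        i += code
        if code != 0xFF and i < len(encoded):
            out.append(0)  # implied zero between blocks
    return bytes(out)


def encode_frame(content: bytes) -> bytes:
    """Wrap Protocol + Information + FCS octets in Flag sequences."""
    # COBS output has no 0x00, so 0x00 is free to stand in for 0x55.
    stuffed = cobs_encode(content).replace(b"\x55", b"\x00")
    return bytes([FLAG]) + stuffed + bytes([FLAG])


def decode_frame(frame: bytes) -> bytes:
    if len(frame) < 2 or frame[0] != FLAG or frame[-1] != FLAG:
        raise ValueError("frame must start and end with a Flag")
    return cobs_decode(frame[1:-1].replace(b"\x00", b"\x55"))
```

After encoding, 0x55 appears only as the Flag at each end of the frame,
so a receiver can resynchronize by scanning for that octet.
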
### Link Operation
The Link Control Protocol packet format, assigned numbers and state
machine are the same as PPP (RFC 1661), with minor exceptions.
> Do not be put off by the length of the RFC document. Only a small
> subset of the protocol needs to be implemented (especially if there
> are no negotiable options) for an implementation to be conforming.
> All multi-byte fields in LCP packets are transmitted in Network
> (big-endian) byte order. The burden of converting from big-endian to
> little-endian is very minimal, and it lets Wireshark dissectors work
> on PULSEv2 LCP packets just like any other PPP LCP packet.
By prior agreement, peers MAY transmit or receive packets of certain
protocols while the link is in any phase. This is contrary to the PPP
standard, which requires that all non-LCP packets be rejected before the
link reaches the Authentication phase.
Transport Layer
---------------
### Best-Effort Application Transport (BEAT) protocol
Best-effort delivery with very little overhead, similar to PULSEv1.
#### Packet format
Application Protocol | Length | Information
---------------------|----------|------------
2 octets | 2 octets | *
All multibyte fields are in big-endian byte order.
The Length field encodes the number of octets in the Application
Protocol, Length and Information fields of the packet. The minimum value
of the Length field in a valid packet is 4.
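
A BEAT packet could be packed and unpacked as sketched below. The names
`pack_beat` and `unpack_beat` are hypothetical, and ignoring any octets
beyond the stated Length is an assumption of this sketch, not something
the text above specifies.

```python
import struct

HEADER = struct.Struct(">HH")  # Application Protocol, Length (big-endian)


def pack_beat(protocol: int, information: bytes) -> bytes:
    # Length counts the Application Protocol and Length fields too.
    length = HEADER.size + len(information)
    if length > 0xFFFF:
        raise ValueError("information too long for a 2-octet Length")
    return HEADER.pack(protocol, length) + information


def unpack_beat(packet: bytes):
    """Return (protocol, information), or raise ValueError if invalid."""
    if len(packet) < HEADER.size:
        raise ValueError("packet shorter than the BEAT header")
    protocol, length = HEADER.unpack_from(packet)
    if length < HEADER.size or length > len(packet):
        raise ValueError("invalid Length field")
    return protocol, packet[HEADER.size:length]
```
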
BEAT application protocol 0x0001 is assigned to the PULSE Control
Message Protocol (PCMP). When a BEAT packet is received by a conforming
implementation with the Application Protocol field set to an
unrecognized value, a PCMP Unknown Protocol message MUST be sent.
#### BEAT Control Protocol (BECP)
BECP uses the same packet exchange mechanism as the Link Control
Protocol. BECP packets may not be exchanged until LCP is in the Opened
state. BECP packets received before this state is reached should be
silently discarded.
BECP is exactly the same as the Link Control Protocol with the following
exceptions:
* Exactly one BECP packet is encapsulated in the Information field of
Link Layer frames where the Protocol field indicates type 0xBA29 hex.
* Only codes 1 through 7 (Configure-Request, Configure-Ack,
Configure-Nak, Configure-Reject, Terminate-Request, Terminate-Ack and
Code-Reject) are used. Other codes should be treated as unrecognized
and should result in Code-Rejects.
* A distinct set of configure options are used. There are currently no
options defined.
#### Sending BEAT packets
Before any BEAT protocol packets may be communicated, both LCP and BECP
must reach the Opened state. Exactly one BEAT protocol packet is
encapsulated in the Information field of Link Layer frames where the
Protocol field indicates type 0x3A29 hex.
### PUSH (Simplex) transport
Simplex best-effort delivery of datagrams. It is designed for log
messages and other status updates from the firmware to the host. There
is no NCP, no options, no negotiation.
#### Packet format
Application Protocol | Length | Information
---------------------|----------|------------
2 octets | 2 octets | *
All multibyte fields are in big-endian byte order.
The Length field encodes the number of octets in the Application
Protocol, Length and Information fields of the packet. The minimum value
of the Length field in a valid packet is 4.
#### Sending PUSH packets
Packets can be sent at any time regardless of the state of the link,
including link closed. Exactly one PUSH packet is encapsulated in the
Information field of Link Layer frames where the Protocol field
indicates type 0x5021 hex.
### Reliable transport (TRAIN)
The Reliable transport provides reliable in-order delivery service of
multiprotocol application datagrams. The protocol is heavily based on
the [ITU-T Recommendation X.25](https://www.itu.int/rec/T-REC-X.25-199610-I/en)
LAPB data-link layer. The remainder of this section relies heavily on
the terminology used in Recommendation X.25. Readers are also assumed to
have some familiarity with section 2 of the Recommendation.
#### Packet formats
The packet format is, in a nutshell, LAPB in extended mode carrying BEAT
packets.
**Information command packets**
Control | Application Protocol | Length | Information
---------|----------------------|----------|------------
2 octets | 2 octets | 2 octets | *
**Supervisory commands and responses**
Control |
---------|
2 octets |
##### Control field
The control field is basically the same as LAPB in extended mode. Only
Information transfer and Supervisory formats are supported. The
Unnumbered format is not used as such signalling is performed
out-of-band using the TRCP control protocol. The Information command and
the Receive Ready, Receive Not Ready, and Reject commands and responses
are permitted in the control field.
The format and meaning of the subfields in the Control field are
described in ITU-T Recommendation X.25.
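
For orientation, the sketch below builds and parses a 2-octet Control
field following the modulo 128 (extended) formats of Recommendation
X.25, where bit 1 is the low-order bit of each octet. It assumes that
PULSEv2 serializes the field exactly as LAPB does, with the octet
carrying N(S) or the supervisory bits sent first; that ordering and all
of the names here are assumptions, so check them against a real
implementation before relying on them.

```python
# First octet of each supervisory format: bits 1-2 are "1 0", bits 3-4
# select the function, and bits 5-8 are zero.
SUPERVISORY = {"RR": 0x01, "RNR": 0x05, "REJ": 0x09}


def information_control(ns: int, nr: int, poll: bool) -> bytes:
    # Octet 1: bit 1 = 0, bits 2-8 = N(S). Octet 2: bit 1 = P, bits 2-8 = N(R).
    return bytes([(ns & 0x7F) << 1, ((nr & 0x7F) << 1) | int(poll)])


def supervisory_control(kind: str, nr: int, poll_final: bool) -> bytes:
    return bytes([SUPERVISORY[kind], ((nr & 0x7F) << 1) | int(poll_final)])


def parse_control(control: bytes) -> dict:
    if len(control) != 2:
        raise ValueError("Control field is 2 octets")
    first, second = control
    fields = {"nr": second >> 1, "poll_final": bool(second & 1)}
    if first & 1 == 0:
        fields.update(kind="I", ns=first >> 1)
        return fields
    for kind, code in SUPERVISORY.items():
        if first == code:
            fields.update(kind=kind)
            return fields
    # Unnumbered and unknown formats are not used by TRAIN.
    raise ValueError("unsupported Control field")
```
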
##### Application Protocol field
The protocol number for the message contained in the Information field.
This field is only present in Information packets. The Application
Protocol field is transmitted most-significant octet first.
##### Length field
The Length field specifies the number of octets covering the Control,
Application Protocol, Length and Information fields. The Length field is
only present in Information packets. The value of the Length field in a
valid Information packet must be no less than six. The Length field is
transmitted most-significant octet first.
##### Information field
The application datagram itself. This field is only present in
Information packets.
#### TRAIN Control Protocol
The TRAIN Control Protocol, or TRCP for short, is used to set up and
tear down the communications channel between the two peers. TRCP uses
the same packet exchange mechanism as the Link Control Protocol. TRCP
packets may not be exchanged until LCP is in the Opened state. TRCP
packets received before this state is reached should be silently
discarded.
TRCP is exactly the same as the Link Control Protocol with the following
exceptions:
* Exactly one TRCP packet is encapsulated in the Information field of
Link Layer frames where the Protocol field indicates type 0xBA33 hex.
* Only codes 1 through 7 (Configure-Request, Configure-Ack,
Configure-Nak, Configure-Reject, Terminate-Request, Terminate-Ack and
Code-Reject) are used. Other codes should be treated as unrecognized
and should result in Code-Rejects.
* A distinct set of configure options are used. There are currently no
options defined.
The `V(S)` and `V(R)` state variables shall be reset to zero when the
TRCP automaton signals the This-Layer-Up event. All packets in the TRAIN
send queue are discarded when the TRCP automaton signals the
This-Layer-Finished event.
#### LAPB system parameters
The LAPB system parameters used in information transfer have the default
values described below. Some parameter values may be altered through the
TRCP option negotiation mechanism. (NB: there are currently no options
defined, so there is currently no way to alter the default values during
the protocol negotiation phase)
**Maximum number of bits in an I packet _N1_** is equal to eight times
the MRU of the link, minus the overhead imposed by the Link Layer
framing and the TRAIN header. This parameter is not negotiable.
**Maximum number of outstanding I packets _k_** defaults to 1 for both
peers. This parameter is (to be) negotiable. If left at the default, the
protocol will operate with a Stop-and-Wait ARQ.
#### Transfer of application datagrams
Exactly one TRAIN packet is encapsulated in the Information field of
Link Layer frames. A command packet is encapsulated in a Link Layer
frame where the Protocol field indicates 0x3A33 hex, and a response
packet is encapsulated in a Link Layer frame where the Protocol field
indicates 0x3A35 hex. Transfer of datagrams shall follow the procedures
described in Recommendation X.25 §2.4.5 _LAPB procedures for information
transfer_. A cut-down set of procedures for a compliant implementation
which only supports _k=1_ operation can be found in
[reliable-transport.md](reliable-transport.md).
In the event of a frame rejection condition (as defined in
Recommendation X.25), the TRCP automaton must be issued a Down event
followed by an Up event to cause an orderly renegotiation of the
transport protocol and reset the state variables. This is the same as
the Restart option described in RFC 1661. A FRMR response MUST NOT be
sent.
TRAIN application protocol 0x0001 is assigned to the PULSE Control
Message Protocol (PCMP). When a TRAIN packet is received by a conforming
implementation with the Application Protocol field set to an
unrecognized value, a PCMP Unknown Protocol message MUST be sent.
### PULSE Control Message Protocol
The PULSE Control Message Protocol (PCMP) is used for signalling of
control messages by the transport protocols. PCMP messages must be
encapsulated in a transport protocol, and are interpreted within the
context of the encapsulated transport protocol.
> **Why a separate protocol?**
>
> Many of the transports need to communicate the same types of control
> messages. Rather than defining a different way of communicating these
> messages for each protocol, they can use PCMP and share a single
> definition (and implementation!) of these messages.
#### Packet format
Code | Information
--------|------------
1 octet | *
#### Defined codes
##### 1 - Echo Request
When the transport is in the Opened state, the recipient MUST respond
with an Echo-Reply packet. When the transport is not Opened, any
received Echo-Request packets MUST be silently discarded.
##### 2 - Echo Reply
A reply to an Echo-Request packet. The Information field MUST be copied
from the received Echo-Request.
##### 3 - Discard Request
The receiver MUST silently discard any Discard-Request packet that it
receives.
##### 129 - Port Closed
A packet has been received with a port number unrecognized by the
recipient. The Information field must be filled with the port number
copied from the received packet (without endianness conversion).
##### 130 - Unknown PCMP Code
A PCMP packet has been received with a Code field which is unknown to
the recipient. The Information field must be filled with the Code field
copied from the received packet.
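
The rules for the defined codes can be summarized in a small handler.
This is a sketch only: `handle_pcmp` is a hypothetical name, and
returning no reply for empty packets and for codes 2, 129 and 130
(which are themselves replies or notifications) is an assumption that
the text above does not spell out.

```python
from typing import Optional

ECHO_REQUEST, ECHO_REPLY, DISCARD_REQUEST = 1, 2, 3
PORT_CLOSED, UNKNOWN_CODE = 129, 130


def handle_pcmp(packet: bytes, transport_opened: bool) -> Optional[bytes]:
    """Return the PCMP packet to send in reply, or None for no reply."""
    if not packet:
        return None  # assumption: nothing sensible to reply to
    code, information = packet[0], packet[1:]
    if code == ECHO_REQUEST:
        if not transport_opened:
            return None  # silently discarded unless Opened
        # The Information field is copied from the Echo-Request.
        return bytes([ECHO_REPLY]) + information
    if code == DISCARD_REQUEST:
        return None
    if code in (ECHO_REPLY, PORT_CLOSED, UNKNOWN_CODE):
        return None  # assumption: left for the caller to act on
    return bytes([UNKNOWN_CODE, code])
```
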
----
Useful Links
------------
* [The design document for PULSEv2](https://docs.google.com/a/pulse-dev.net/document/d/1ZlSRz5-BSQDsmutLhUjiIiDfVXTcI53QmrqENJXuCu4/edit?usp=sharing),
which includes a draft of this documentation along with a lot of
notes about the design decisions.
* [Python implementation of PULSEv2](https://github.com/pebble/pulse2)
* [Wireshark plugin for dissecting PULSEv2 packet captures](https://github.com/pebble/pulse2-wireshark-plugin)
* [RFC 1661 - The Point to Point Protocol (PPP)](https://tools.ietf.org/html/rfc1661)
* [RFC 1662 - PPP in HDLC-like Framing](https://tools.ietf.org/html/rfc1662)
* [RFC 1663 - PPP Reliable Transmission](https://tools.ietf.org/html/rfc1663)
* [RFC 1570 - PPP LCP Extensions](https://tools.ietf.org/html/rfc1570)
* [RFC 2153 - PPP Vendor Extensions](https://tools.ietf.org/html/rfc2153)
* [RFC 3772 - Point-to-Point Protocol (PPP) Vendor Protocol](https://tools.ietf.org/html/rfc3772)
* [PPP Consistent Overhead Byte Stuffing (COBS)](https://tools.ietf.org/html/draft-ietf-pppext-cobs)
* [ITU-T Recommendation X.25](https://www.itu.int/rec/T-REC-X.25-199610-I/en)
* [Digital Data Communications Message Protocol](http://www.ibiblio.org/pub/historic-linux/early-ports/Mips/doc/DEC/ddcmp-4.1.txt)
<!-- vim: set tw=72: -->


@ -0,0 +1,96 @@
PULSEv2 Reliable Transport
==========================
The purpose of this document is to describe the procedures for the PULSEv2
reliable transport (TRAIN) to be used in the initial implementations, with
support for only Stop-and-Wait ARQ (Automatic Repeat reQuest). Hopefully,
limiting the scope in this way will make it simpler to implement compared to a
more performant Go-Back-N ARQ. This document is a supplement to the description
of TRAIN in [pulse2.md](pulse2.md).
The PULSEv2 reliable transport (TRAIN) is based on X.25 LAPB, which implements
reliable datagram delivery using a Go-Back-N ARQ (Automatic Repeat reQuest)
procedure. Since a Stop-and-Wait ARQ is equivalent to Go-Back-N with a window
size of 1, LAPB can be readily adapted for Stop-and-Wait ARQ. The description in
this document is intended to be compatible with an implementation supporting
the full Go-Back-N LAPB procedures when that implementation is configured with a
window size of 1, so that there is a smooth upgrade path which doesn't require
special cases or compatibility breakages.
Documentation conventions
-------------------------
This document relies heavily on the terminology used in [ITU-T Recommendation
X.25](https://www.itu.int/rec/T-REC-X.25-199610-I/en). Readers are also assumed
to have some familiarity with section 2 of that document.
The term "station" is used in this document to mean "DCE or DTE".
Procedures for information transfer
-----------------------------------
There is no support for communicating a busy condition. It is assumed that a
station in a busy condition will silently drop packets, and that the timer
recovery procedure will be sufficient to ensure reliable delivery of the dropped
packets once the busy condition is cleared. An implementation need not support
sending or receiving RNR packets.
Sending I packets
-----------------
All Information transfer packets must be sent with the Poll bit set to 1. The
procedures from X.25 §2.4.5.1 apply otherwise.
Receiving an I packet
---------------------
When the DCE receives a valid I packet whose send sequence number N(S) is equal
to the DCE receive state variable V(R), the DCE will accept the information
fields of this packet, increment by one its receive state variable V(R), and
transmit an RR response packet with N(R) equal to the value of the DCE receive
state variable V(R). If the received I packet has the Poll bit set to 1, the
transmitted RR packet must be a response packet with Final bit set to 1.
Otherwise the transmitted RR packet should have the Final bit set to 0.
Reception of out-of-sequence I packets
--------------------------------------
Since the DTE should not have more than one packet in-flight at once, an
out-of-sequence I packet would be due to a retransmit: the RR response for the
most recently received I packet got lost, so the DTE re-sent the I packet. Discard
the information fields of the packet and send an RR packet with N(R)=V(R).
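
The two receive procedures above can be sketched as a small Python class. This
is a simplified model, not a full implementation: packets are represented as
tuples instead of encoded Control fields, sequence numbers are assumed to be
modulo 128 (LAPB extended mode), and the names `Receiver` and `on_i_packet` are
hypothetical. Echoing the Poll bit as the Final bit for out-of-sequence packets
is also an assumption.

```python
MODULUS = 128  # sequence number modulus in LAPB extended mode


class Receiver:
    def __init__(self):
        self.vr = 0          # receive state variable V(R)
        self.delivered = []  # datagrams handed to the application

    def on_i_packet(self, ns: int, poll: bool, information: bytes):
        """Handle a valid I packet; return (kind, N(R), Final) to send."""
        if ns == self.vr:
            # In sequence: accept the information field and advance V(R).
            self.delivered.append(information)
            self.vr = (self.vr + 1) % MODULUS
        # Out of sequence means a retransmit whose RR was lost: discard
        # the information field and acknowledge again with N(R) = V(R).
        return ("RR", self.vr, poll)
```
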
Receiving acknowledgement
-------------------------
When correctly receiving an RR packet, the DCE will consider this packet as an
acknowledgement of the most recently-sent I packet if N(S) of the most
recently-sent I packet is equal to the received N(R)-1. The DCE will stop timer
T1 when it correctly receives an acknowledgement of the most recently-sent I
packet.
Since all I packets are sent with P=1, the receiving station is obligated to
respond with a supervisory packet. Therefore it is unnecessary to support
acknowledgements embedded in I packets.
Receiving an REJ packet
-----------------------
Since only one I packet may be in-flight at once, the REJ packet is due to the
RR acknowledgement from the DTE getting lost and the DCE retransmitting the I
packet. Treat it like an RR.
Waiting acknowledgement
-----------------------
The DCE maintains an internal transmission attempt variable which is set to 0
when the transport NCP signals a This-Layer-Up event, and when the DCE correctly
receives an acknowledgement of a sent I packet.
If Timer T1 runs out waiting for the acknowledgement from the DTE for an I
packet transmitted, the DCE will add one to its transmission attempt variable,
restart Timer T1 and retransmit the unacknowledged I packet.
If the transmission attempt variable is equal to N2 (a system parameter), the
DCE will initiate a restart of the transport link.
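
The sending side of these procedures can be sketched in the same style. Again
this is a simplified model under stated assumptions: Timer T1 itself is not
modelled (the caller reports its expiry), sequence numbers are modulo 128, the
names are hypothetical, and checking the attempt counter against N2 before
retransmitting is one interpretation of the text above.

```python
MODULUS = 128  # sequence number modulus in LAPB extended mode


class Sender:
    def __init__(self, n2: int = 3):
        self.n2 = n2  # system parameter N2
        self.on_layer_up()

    def on_layer_up(self):
        """This-Layer-Up: reset V(S) and the transmission attempt variable."""
        self.vs = 0
        self.attempts = 0
        self.unacked = None  # (N(S), information) of the in-flight I packet

    def send(self, information: bytes):
        """Send one I packet with P=1; the caller starts Timer T1."""
        if self.unacked is not None:
            raise RuntimeError("only one I packet may be in flight (k=1)")
        self.unacked = (self.vs, information)
        self.vs = (self.vs + 1) % MODULUS
        return ("I", self.unacked[0], True, information)

    def on_rr(self, nr: int) -> bool:
        """Return True if the RR acknowledges the in-flight I packet."""
        if self.unacked is not None and (self.unacked[0] + 1) % MODULUS == nr:
            self.unacked = None
            self.attempts = 0
            return True  # the caller stops Timer T1
        return False

    on_rej = on_rr  # with k=1, an REJ is treated like an RR

    def on_t1_expired(self):
        """Return the I packet to retransmit, ("RESTART",), or None."""
        if self.unacked is None:
            return None
        if self.attempts == self.n2:
            return ("RESTART",)  # initiate a restart of the transport link
        self.attempts += 1
        ns, information = self.unacked
        return ("I", ns, True, information)  # the caller restarts Timer T1
```
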