Np1sec/incremental consistency

This builds on the server-dictated ordered transcript hashes currently mentioned in SenderKeys.


To summarise: we assume the central server reliably delivers messages to everyone, including the original sender, in the same order. (Discussion of potential failures and recovery will be covered elsewhere.)

Each message has an implicit server-sequence-number ("seqnum"), a receive-parent ("recv-parent" or "parent") and a sender-sequence-number ("own-seqnum"). Semantics of these is covered elsewhere.

Once a message m is received from the server (including own messages sent), a "transcript-hash" may be calculated for it, that commits to that message and all previous messages in the server-dictated order. This consists of:

  • all messages seen by the sender when they sent m, namely:
    • "recv-parent" p and messages with seqnum earlier than p
    • all messages sent by the same sender (messages with own-seqnum earlier than m)
  • as well as all messages inserted by the server, into the server-ordering between p and m

When a new message is sent, the transcript-hash of its recv-parent is included with it.


A message m is fully-acked (from the POV of a given member u) iff, for all recipients r, u has accepted a message by r where seqnum(parent(r)) ≥ seqnum(m).

  • recipients possibly includes u, but certainly excludes m's sender
  • accepted means delivered locally, i.e. received, then decrypted-verified including parent hash checks; all causally-previous messages must already be accepted
    • we assume a server-dictated ordering, so accepting a message at seqnum i means we have already accepted all messages j ≤ i.

Once m is fully-acked, u knows that everyone else has seen m and all messages before it. We'll call the process of gaining that knowledge, "reaching consistency for m" (given context, m might be implicit). We don't have another mechanism for it, so henceforth we'll use "reach consistency" interchangeably with "reach full-ack", though strictly the former is a security property and the latter is a mechanism for achieving it.

An explicit-ack is a message with no user content, that is sent solely for the purpose of declaring (via recv-parent) what the sender has seen. We'll make use of these to ensure consistency is reached in a timely manner, even if no-one wants to send a normal message (that contains user-level contents, and which is an implicit-ack).


We consider two cases: (a) reaching consistency for arbitrary messages during the course of a conversation, and (b) reaching consistency when a user u leaves. Case (b) may be viewed as a special instance of case (a) plus the additional premise that u must reach consistency as soon as possible (because they want to leave), and that they don't care about reaching consistency for any subsequent messages that they might receive after their final "farewell" message.

Case (a): when a member u accepts a non-explicit-ack message m at time t:

  • if u did not send m, and they have not acked m by t+ACK_GRACE_INTERVAL, they should send an explicit-ack
  • if m is not fully-acked (from their POV) by t+(2*BROADCAST_LATENCY)+ACK_GRACE_INTERVAL then issue a local UI warning. Cancel the warning if/when full-ack is reached later.

Case (b): when a member u wants to part, they send a "farewell" message m:

  • Everyone should explicit-ack m as soon as they receive it from the server.
    • This ack may also bundle any cryptographic material needed for other members to agree on a key or keys that exclude u. For example, in the SenderKeys scenario, this would contain the next sender-key, encrypted to everyone except u.
  • When m is fully-acked, u reaches consistency for all previous messages up to m, and may leave.
  • Messages that others send to u that are echoed back after m, will not be acked by u.
    • u won't have a chance to reach consistency for these messages, even if u receives them.
    • Others won't have a chance to check that u read them, though they may be able to check that everyone else did. But u may still read them in theory, since they were encrypted to u.
    • In either case, communicating this meaning to the user, should be already covered by the same UI as for case (a). In the first case, u may shortcut the timeout and just issue the warning straight away, or alternatively not display those messages to the user at all.
  • TBD: need to think about simultaneous parts

Parameters and properties

  • BROADCAST_LATENCY should be based on the transport.
  • ACK_GRACE_INTERVAL should be based on expected user communication rate.

Both of these should be defined to cover common cases (e.g. 95th-percentile) rather than being mean values.

The above checks should be applied for every single message. This guarantees that a warning shows up if we don't reach consistency within the timeout defined above, and works even if people don't manually send messages.

If we set ACK_GRACE_INTERVAL high enough to match the typical interval between a user's messages, then (in theory) the overhead is only incurred once everyone stops talking and we automatically send off one extra round of acks. While the lull in activity continues, there will be no extra messages, since we don't require explicit-acks to be fully-acked themselves. (Preliminary experiments on some code I wrote shows this to be effective.)

We are still vulnerable to a "drop everything" attack, but that can't be helped unless we have unconditionally-periodic heartbeats. Not sure if we want to put these in the upcoming spec.

(The above is basically my msg-notes stuff, except assuming reliable transport, without recovery or flow control, without heartbeats, and adapted to a server-dictated ordering.)

Relative ordering

Ensure that messages received out-of-order are highly visible to the user.

A message m is badly ordered if seqnum(parent(m)) < seqnum(m) - MAX_UNSYNC_COUNT

  • MAX_UNSYNC_COUNT may either be constant, or a linear function of the number of members, TBD.

These messages should be highlighted in some way in the UI that is not too severe. They are still valid messages; the user should just be informed that they refer to older context that may be surprising.

This definition is globally consistent (or else transcript consistency breaks) so it's easier to reason about, and the warning is simpler to explain than MAX_RTD.