NATS Weekly #18

Byron Ruth

Published on Mar 21st, 2022

Week of March 14 - 20, 2022

🗞 Announcements, writings, and projects

A short list of announcements, blog posts, project updates, and other news.

⚡ Releases

Official releases from NATS repos and others in the ecosystem.

📖 Articles

💬 Discussions

GitHub Discussions from various NATS repositories.

💡 Recently asked questions

Questions sourced from Slack, Twitter, or individuals. Responses and examples are in my own words, unless otherwise noted.

How long can a consumer, stream, or account name be?

Naming recommendations and constraints are in the docs. It is recommended that each name is no more than 32 characters. Todd Beets provides the reason for this recommendation:

The filestore mechanism of JetStream uses all three as filesystem directory names. There is a possibility that very long names (alone or taken together) could exceed operating system limits.

This may or may not apply to the filesystem you are deploying NATS to, so if you anticipate running into these limits, be sure to check this for yourself!
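If you want to catch long names early, a small check in your own tooling is enough. This is a minimal sketch: `checkNameLen` and `maxNameLen` are hypothetical names of my own, and the check only covers the 32-character recommendation, not any other naming rule from the docs. It counts bytes rather than characters, on the assumption that filesystem limits are usually expressed in bytes.

```go
package main

import "fmt"

// maxNameLen mirrors the recommendation from the docs. It is a guideline
// applied by this sketch, not a limit read from the server.
const maxNameLen = 32

// checkNameLen is a hypothetical helper that reports an error when a stream,
// consumer, or account name is longer than the recommended length in bytes.
func checkNameLen(kind, name string) error {
	if n := len(name); n > maxNameLen {
		return fmt.Errorf("%s name %q is %d bytes, recommended max is %d",
			kind, name, n, maxNameLen)
	}
	return nil
}

func main() {
	names := []string{"ORDERS", "orders-eu-west-1-priority-customers-2022"}
	for _, name := range names {
		if err := checkNameLen("stream", name); err != nil {
			fmt.Println("warning:", err)
			continue
		}
		fmt.Println("ok:", name)
	}
}
```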

When do I need to think about message order?

The topic of message order comes up quite often. This is understandable since receiving information out of order can be confusing or result in different outcomes. Likewise, a publisher expects its messages to be received in the order it published them.

NATS provides message order guarantees under specific conditions on both the publisher and subscriber side. Most of these conditions should be fairly intuitive, but they are worth highlighting.

With core NATS, when using a single client connection, all published messages will be received by NATS in the order they are published.

nc.Publish("foo", []byte("1"))
nc.Publish("foo", []byte("2"))

1 will always come before 2. If you choose to use this client connection in concurrent parts of your application, then by definition, which message comes first is undefined.

go nc.Publish("foo", []byte("1"))
go nc.Publish("foo", []byte("2"))

There is no guarantee that 1 will come before 2 since each publish is done in a separate goroutine. The published order can only be observed once NATS receives the messages, since they are serialized on receipt prior to being sent to subscribers. The same behavior applies when using two or more client connections. Each has its own order due to the isolated connection, but these are effectively merged and serialized server-side.

nc1.Publish("foo", []byte("1"))
nc2.Publish("foo", []byte("2"))

Again, there is no guarantee which will arrive at the server first since these are different client connections. In practice, the subjects being published to are typically more granular and thus the publishing done per subject often occurs independently.

go func() {
  nc.Publish("foo", []byte("1"))
  nc.Publish("foo", []byte("2"))
}()

go func() {
  nc.Publish("bar", []byte("1"))
  nc.Publish("bar", []byte("2"))
}()

Order is still preserved per subject. Since each publish targets a specific subject, if concurrent publishers use disjoint subjects then the order within each subject is still preserved. If, however, you are publishing concurrently to an overlapping hierarchy, e.g. foo.1 and foo.2, and you expect a subscriber to consume from foo.*, be aware that the total order across those subjects is not known at publish time.

On the subscriber side, core NATS guarantees that the order in which messages are received (serialized from publishers) will be the order they are delivered to subscribers. There could be 100s or 1000s of subscribers for foo.* and they will all receive messages in the same order.

sub1, _ := nc.SubscribeSync("foo.*")
sub2, _ := nc.SubscribeSync("foo.*")

s1msg1, _ := sub1.NextMsg()
s2msg1, _ := sub2.NextMsg()

// string(s1msg1.Data) == string(s2msg1.Data)

Order of messages for the same subject subscription will always be the same if established prior to publishing (see caveat below). The one nuance to be aware of with core NATS is that since messages are not persisted nor is delivery retried, there are two factors that could impact this ordering/equality guarantee.

  • Network issues that would result in a message being dropped for a given subscription. Typically this would be the NATS server identifying a slow consumer/lack of pong response followed by a disconnect. Once the network restores, the client will auto-reconnect. Any messages eligible for the subscription during that interruption would be dropped.
  • Related to the first point, the timing of the subscription may result in a different "first message". Even with the code block above, sub1 is being created before sub2. It's possible a message was published in between those two subscription calls, in which case the first messages are actually different. If, however, these subscriptions are set up prior to any messages being published, then the order guarantee holds, sans the first point above.
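Setting those two caveats aside, the merge-then-fan-out behavior described above can be modeled without a server. The following is an in-memory analogy only, not NATS itself: the `msg` type and the `run` and `inOrder` functions are hypothetical names, a Go channel stands in for the server's serialization point, and two slices stand in for subscribers.

```go
package main

import (
	"fmt"
	"sync"
)

// msg is a toy message: which publisher sent it and its per-publisher counter.
type msg struct {
	pub string
	n   int
}

// run models two publishers, one serializing "server", and two subscribers.
func run(perPublisher int) (sub1, sub2 []msg) {
	inbox := make(chan msg) // the single point where messages are serialized

	var wg sync.WaitGroup
	for _, name := range []string{"A", "B"} {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			// Each publisher sends sequentially, like one client connection.
			for i := 1; i <= perPublisher; i++ {
				inbox <- msg{pub: name, n: i}
			}
		}(name)
	}
	go func() { wg.Wait(); close(inbox) }()

	// The "server" delivers each message in the order it received it.
	for m := range inbox {
		sub1 = append(sub1, m)
		sub2 = append(sub2, m)
	}
	return sub1, sub2
}

// inOrder reports whether each publisher's counters appear as 1, 2, 3, ...
func inOrder(msgs []msg) bool {
	last := map[string]int{}
	for _, m := range msgs {
		if m.n != last[m.pub]+1 {
			return false
		}
		last[m.pub] = m.n
	}
	return true
}

func main() {
	sub1, sub2 := run(3)
	fmt.Println("subscriber 1:", sub1)
	fmt.Println("subscriber 2:", sub2)
	fmt.Println("per-publisher order kept:", inOrder(sub1))
}
```

How A and B interleave can differ from run to run, but both subscribers always see the same interleaving and each publisher's own order is intact.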

If you don't want time and space to impact your messaging, then you can add JetStream, which provides persistence as well as optimistic concurrency control on streams and individual subjects.

Persistence can provide the guarantee of ordered replay/receipt of messages by a subscription regardless of when the subscription is created relative to the publisher. Likewise, if a subscription disconnects, the last consumed message is tracked and it can simply reconnect and continue consumption without dropping a message.

Optimistic concurrency control on the publisher side can be achieved using a header in a published message to indicate the expected last sequence or message ID at the stream level, or the expected last sequence of a specific subject. This provides a way for applications that have concurrent publishers to have control over when order matters.
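To make the "expected last sequence" idea concrete, here is a toy model of the per-subject check. It is a sketch under my own assumptions, not how JetStream is implemented: `toyStream`, `publish`, and `errWrongLastSeq` are hypothetical names, everything lives in memory, and there is no locking since the example runs in a single goroutine.

```go
package main

import (
	"errors"
	"fmt"
)

// errWrongLastSeq is returned when the caller's expectation is stale.
var errWrongLastSeq = errors.New("wrong last sequence")

// toyStream is a hypothetical in-memory stand-in for a stream. It tracks
// only the last sequence per subject, which is all this check needs.
type toyStream struct {
	seq     uint64            // last sequence assigned in the stream
	lastFor map[string]uint64 // last sequence per subject (0 means none)
}

func newToyStream() *toyStream {
	return &toyStream{lastFor: map[string]uint64{}}
}

// publish appends only if expectedLast matches the subject's current last
// sequence, and returns the newly assigned stream sequence.
func (s *toyStream) publish(subject string, expectedLast uint64) (uint64, error) {
	if s.lastFor[subject] != expectedLast {
		return 0, fmt.Errorf("%w: subject %q is at %d, expected %d",
			errWrongLastSeq, subject, s.lastFor[subject], expectedLast)
	}
	s.seq++
	s.lastFor[subject] = s.seq
	return s.seq, nil
}

func main() {
	s := newToyStream()

	// Writers A and B both observed "no messages yet" (0) for the subject.
	seq, err := s.publish("orders.1", 0)
	fmt.Println("writer A:", seq, err)

	// Writer B's expectation is now stale, so its publish is rejected.
	_, err = s.publish("orders.1", 0)
	fmt.Println("writer B:", err)

	// After re-reading the last sequence, writer B can retry.
	seq, err = s.publish("orders.1", 1)
	fmt.Println("writer B retry:", seq, err)
}
```

In the Go client this is exposed through publish options such as `nats.ExpectLastSequence` and `nats.ExpectLastSequencePerSubject`. Check the client docs for the exact behavior rather than relying on this model.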

These JetStream topics have been covered quite a bit in past posts, but I wanted to re-highlight them in the context of this question.

What are some strategies for versioning message schema?

With any cross-process or network communication, there is a step of serializing a native representation of a message (e.g. Go struct, Java class instance, Python dict, JavaScript object, etc.) into a byte array. Likewise when this serialized message is received, it must be deserialized into a native type.

The whole topic of serialization formats, specifically the trade-offs including being self-describing or not, forwards/backwards compatibility semantics, byte representation and code generation requirements, performance, etc. is interesting unto itself. There are many resources out there discussing these trade-offs, so this short topic will assume that a serialization format has been chosen and is consistent among all parties producing or consuming messages.

This may sound obvious, but I would argue that designing message schema should be deliberate and should utilize the language of the domain it is being used for. Assuming the language and the semantics are well understood, this message schema should also be fairly stable. This includes the name of the message itself as well as the properties of the message.

Consumers are inherently expected to conform to this message schema. They don't have a choice, but publishers/schema designers have a few options for evolving this schema contract over time.

  • Don't rename anything.
  • Don't delete any existing properties or message types.
  • Do add new fields to existing message types with default values.
  • Do add new message types even if they overlap with existing ones.
  • No seriously, don't rename or delete anything once you have consumers in production.

The heuristic for adding new fields vs. new message types boils down to whether the semantics have changed. Adding one or more new fields may simply be for including more information because now it's available. Although this is a new version of a message type because it has a different schema, it doesn't mean it needs to be forcefully pushed onto consumers to upgrade. That is the whole point of the default values and having consumer deserialization that can simply ignore properties it is unaware of.

The nuance here is that the schema has changed, so we could call it v2, but this should not be reflected in the type/name of the message. Otherwise, on consumption of the message, the type would be different and the pattern match would fail on existing consumers. Instead, the message type version could be included as metadata in a separate field for tracking purposes.
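One way to picture this is an envelope where the type name and the schema version are separate fields. This is only a sketch: the `Envelope` shape, the field names, and the `handle` function are hypothetical, and the metadata could just as well be carried somewhere other than the payload.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is a hypothetical wrapper: Type is what consumers match on and
// stays stable, SchemaVersion is informational metadata.
type Envelope struct {
	Type          string          `json:"type"`
	SchemaVersion int             `json:"schema_version"`
	Data          json.RawMessage `json:"data"`
}

// handle dispatches on Type only, so a schema bump does not break matching.
func handle(raw []byte) (string, error) {
	var env Envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return "", err
	}
	switch env.Type {
	case "order-placed":
		return fmt.Sprintf("handled order-placed (schema v%d)", env.SchemaVersion), nil
	default:
		return "", fmt.Errorf("unknown message type %q", env.Type)
	}
}

func main() {
	v1 := []byte(`{"type":"order-placed","schema_version":1,"data":{"order_id":"o-1"}}`)
	v2 := []byte(`{"type":"order-placed","schema_version":2,"data":{"order_id":"o-2","currency":"EUR"}}`)
	// Putting the version in the name breaks the match for existing consumers.
	renamed := []byte(`{"type":"order-placed-v2","schema_version":2,"data":{}}`)

	for _, raw := range [][]byte{v1, v2, renamed} {
		out, err := handle(raw)
		fmt.Println(out, err)
	}
}
```

The first two messages are handled by the same branch, while the renamed one falls through to the unknown-type error.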

For new message types, consumers are unaware of these at the start. That is, they are required to update themselves if they want to consume these messages. The same goes for consumers that want to use the new properties of an existing message type, since their native types will need to support deserializing them.

Another consideration is how long-lived these messages are. For events that are stored in perpetuity, this puts the onus on consumers to keep deserialization support for reading old events. The fewer disjoint versions of a schema there are, the easier it is for a consumer to maintain a compatible deserializer for the message type.

If you don't agree with my "don't change or delete anything", I recommend checking out Greg Young's book on event versioning as a starting point for strategies to do migrations, upcasting, etc. This is mostly applicable to event sourcing (storing events in perpetuity), but since it's an extreme case of versioning messages over time, it could be helpful.