Documenting Cloud Pub/Sub Using AsyncAPI
AsyncAPI, “the industry standard for defining asynchronous APIs”, recently released version 2.5.0 of its specification. What is significant about this release is that, for the first time, Google Cloud Pub/Sub is natively supported. As the author of the Cloud Pub/Sub support for AsyncAPI, I would like to spend some time introducing you to this feature.
Background
The AsyncAPI specification does not assume any kind of software topology, architecture or pattern.
The AsyncAPI object model is not tied to any specific “software topology, architecture or pattern”, meaning you should be able to document your asynchronous APIs using AsyncAPI regardless of which one you use. But AsyncAPI does have support for documenting specific software topologies, architectures and patterns using Protocol Bindings.
Protocol Bindings are AsyncAPI’s way of defining protocol-specific documentation, or documentation specific to the “software topology, architecture or pattern” being used. And this is how native Cloud Pub/Sub support was added to AsyncAPI.
This article describes how to document your Cloud Pub/Sub topology using both common AsyncAPI and the newly added Cloud Pub/Sub Protocol Bindings.
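For orientation, the sketch below shows roughly where the pieces discussed in this article sit within a single AsyncAPI 2.5.0 document. The title, version, topic name and message name are hypothetical placeholders, and the empty `googlepubsub` objects only mark where the bindings would go.

```yaml
asyncapi: '2.5.0'
info:
  title: Example Cloud Pub/Sub API   # hypothetical title
  version: '1.0.0'                   # hypothetical version
servers:
  cloudPubSub:
    url: pubsub.googleapis.com
    protocol: googlepubsub           # protocol name used by the Cloud Pub/Sub bindings
channels:
  /projects/your-project/topics/your-topic:
    bindings:
      googlepubsub: {}               # Channel Binding Object (Topic configuration) goes here
components:
  messages:
    yourMessage:
      bindings:
        googlepubsub: {}             # Message Binding Object (Schema details) goes here
```

Everything outside the two `bindings` entries is common AsyncAPI; only the `googlepubsub` objects are specific to Cloud Pub/Sub.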
Google Cloud Pub/Sub Bindings
The Google Cloud Pub/Sub Bindings consist of two binding objects (at this time), available using the googlepubsub protocol. Each of these objects is discussed below.
Channel Binding Object
A channel is an addressable component, made available by the server, for the organization of messages.
The Channel Binding Object is used to document Cloud Pub/Sub’s
Topic object, and contains the pertinent Topic configuration to allow for proper API
interaction. The AsyncAPI documentation is the source of truth for the particulars of
the Cloud Pub/Sub Channel Binding Object configuration, but below is an example (sourced from the documentation):
# ...
channels:
  topic-avro-schema:
    bindings:
      googlepubsub:
        topic: projects/your-project/topics/topic-avro-schema
        schemaSettings:
          encoding: json
          name: projects/your-project/schemas/message-avro
# ...
  topic-proto-schema:
    bindings:
      googlepubsub:
        topic: projects/your-project/topics/topic-proto-schema
        messageRetentionDuration: 86400s
        messageStoragePolicy:
          allowedPersistenceRegions:
          - us-central1
          - us-central2
          - us-east1
          - us-east4
          - us-east5
          - us-east7
          - us-south1
          - us-west1
          - us-west2
          - us-west3
          - us-west4
        schemaSettings:
          encoding: binary
          name: projects/your-project/schemas/message-proto
# ...
The configuration of the Channel Binding Object is pretty straightforward.
Message Binding Object
A message is the mechanism by which information is exchanged via a channel between servers and applications.
The Message Binding Object is used to document Cloud Pub/Sub’s
PubsubMessage object, along with the pertinent parts of the Google Cloud Pub/Sub
Schema object. As with the Channel Binding Object, the Message Binding Object for Cloud
Pub/Sub documents the pertinent PubsubMessage/Schema configuration to allow for proper API interaction. As above, the
AsyncAPI documentation is the source of truth for the particulars of the Cloud
Pub/Sub Message Binding Object configuration, but below is an example (sourced from the documentation):
# ...
components:
  messages:
    messageAvro:
      bindings:
        googlepubsub:
          schema:
            name: projects/your-project/schemas/message-avro
            type: avro
      contentType: application/json
      name: MessageAvro
      payload:
        fields:
        - name: message
          type: string
        name: Message
        type: record
      schemaFormat: application/vnd.apache.avro+yaml;version=1.9.0
    messageProto:
      bindings:
        googlepubsub:
          schema:
            name: projects/your-project/schemas/message-proto
            type: protobuf
      contentType: application/octet-stream
      name: MessageProto
      payload: true
# ...
The AsyncAPI example above contains both the Cloud Pub/Sub Protocol Bindings and common AsyncAPI, which is discussed separately below.
Common AsyncAPI Support
As mentioned earlier, the “AsyncAPI specification does not assume any kind of software topology, architecture or pattern” and a good bit of Cloud Pub/Sub could be documented without the addition of the Cloud Pub/Sub Protocol Bindings mentioned above. The sections below explain how to use the agnostic AsyncAPI support for documenting Cloud Pub/Sub.
Server Object
An object representing a message broker, a server or any other kind of computer program capable of sending and/or receiving data.
The Server Object is used to document where your servers are located. For
Cloud Pub/Sub, there are both HTTP/REST and gRPC endpoints, each of which is
available globally and regionally. While not strictly required, you should document which Cloud Pub/Sub endpoints you
are using. As part of the native Cloud Pub/Sub support, the googlepubsub protocol is now available and should be used as
the protocol value for your Cloud Pub/Sub Server Object(s).
An easy way to document Cloud Pub/Sub servers with AsyncAPI is to leverage the
Server Variable Objects. This allows you to make your Server Object(s) dynamic and
be specific about the Cloud Pub/Sub endpoint(s) your API is using. Below is an example:
# ...
servers:
  cloudPubSub:
    url: '{cloudPubSubEndpoint}.googleapis.com'
    description: The API for Cloud Pub/Sub.
    protocol: googlepubsub
    variables:
      cloudPubSubEndpoint:
        # Default to the global endpoint but allow region-specific endpoints.
        default: pubsub
        description: The Cloud Pub/Sub endpoint region.
# ...
The example above defaults to using the global Cloud Pub/Sub service endpoint, but allows using any supported Cloud
Pub/Sub region. If you want to be specific about which region(s) your Cloud Pub/Sub usage includes, you can add an
enum property to the Server Variable Object. Here is an
example:
# ...
servers:
  cloudPubSub:
    url: '{cloudPubSubEndpoint}.googleapis.com'
    description: The API for Cloud Pub/Sub.
    protocol: googlepubsub
    variables:
      cloudPubSubEndpoint:
        description: The Cloud Pub/Sub endpoint region.
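        # Restrict to only the following region-specific endpoints
        # (regional hosts take the form <region>-pubsub.googleapis.com).
        enum:
        - us-central1-pubsub
        - us-central2-pubsub
        - us-east1-pubsub
        - us-east4-pubsub
        - us-east5-pubsub
        - us-east7-pubsub
        - us-south1-pubsub
        - us-west1-pubsub
        - us-west2-pubsub
        - us-west3-pubsub
        - us-west4-pubsub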
# ...
Channels Object
Holds the relative paths to the individual channel and their operations. Channel paths are relative to servers.
Channels are also known as “topics”, “routing keys”, “event types” or “paths”.
The Channels Object is where AsyncAPI collects information about the
Channel(s) an API exposes. For Cloud Pub/Sub, the “channel path” is analogous to
the Topic name. Below is an example:
# ...
channels:
  /projects/your-project/topics/topic-avro-schema:
    # ...
# ...
There isn’t much to this section other than to point out that the Channel Object’s key corresponds to the Cloud
Pub/Sub Topic name. (This is also where you would define the Channel Binding Object described above.)
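As a sketch of how these pieces can fit together, the fragment below combines the channel key, the Channel Binding Object from the earlier example, and an operation that references a message. The `publish` operation and the `$ref` target are assumptions made for illustration; whether your API documents `publish`, `subscribe`, or both depends on how consumers interact with the Topic.

```yaml
channels:
  /projects/your-project/topics/topic-avro-schema:
    # Channel Binding Object: Topic-level configuration (from the earlier example).
    bindings:
      googlepubsub:
        topic: projects/your-project/topics/topic-avro-schema
        schemaSettings:
          encoding: json
          name: projects/your-project/schemas/message-avro
    # Assumption: consumers of this API publish messages to the Topic.
    publish:
      message:
        $ref: '#/components/messages/messageAvro'
```

Referencing the message under `components` keeps the Schema documentation in one place, even when several Topics share the same Schema.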
Message Object
Describes a message received on a given channel and operation.
The Message Object is where AsyncAPI collects information about the Messages
exchanged over a Channel. While most AsyncAPI Message Object properties are pretty straightforward, like using the
PubsubMessage’s messageId for the value of AsyncAPI’s messageId, there are some compatibility issues between
AsyncAPI and Cloud Pub/Sub when it comes to defining the Message Object’s payload property.
When defining a Cloud Pub/Sub Topic, you are asked to provide (or reference) a Schema for validating messages
published to the Topic. Since Cloud Pub/Sub Schemas are reused across Topics, documenting the
Schema definition itself within the Channel Binding Object does not make sense because a
Schema is not tied to a Topic; the binding’s schemaSettings only references the Schema by name. Another reason that the Schema definition is not documented in the
Channel Binding Object is that AsyncAPI already has a mechanism for defining schemas for message validation:
the payload property (the property affected by compatibility issues) of the Message Object.
Cloud Pub/Sub currently supports two types of schemas: Apache Avro and
Protobuf. Unfortunately, only Avro is supported by AsyncAPI (at this time), and this is the
“compatibility issue” mentioned above. For Avro Schemas, the payload property of the Message Object can be used
normally to describe your Cloud Pub/Sub Schema in AsyncAPI. Below is an example
(complete with the Message Binding Object):
# ...
components:
  messages:
    messageAvro:
      bindings:
        googlepubsub:
          schema:
            name: projects/your-project/schemas/message-avro
            type: avro
      contentType: application/json
      name: MessageAvro
      payload:
        fields:
        - name: message
          type: string
        name: Message
        type: record
      schemaFormat: application/vnd.apache.avro+yaml;version=1.9.0
# ...
But for Protobuf Schemas, the payload property is unusable at this time. For documentation of Protobuf Schema
objects in AsyncAPI, you might consider using a Specification Extension so that you
can at least provide documentation to the API consumer. Below is an example
(the Specification Extension name used is purely for example purposes):
# ...
components:
  messages:
    messageProto:
      bindings:
        googlepubsub:
          schema:
            name: projects/your-project/schemas/message-proto
            type: protobuf
      contentType: application/octet-stream
      name: MessageProto
      payload: true
      x-protobuf-payload: |
        syntax = "proto2";
        message Message {
          required string message = 1;
        }        
# ...
Regardless of the Schema type, there are two properties that need to be discussed before moving on from the
Message Object.
contentType
The content type to use when encoding/decoding a message’s payload. The value MUST be a specific media type (e.g.
application/json).
The value of the contentType property should be set based on the Encoding of the
SchemaSettings. When Encoding is JSON, contentType should be set to application/json. And when Encoding is
BINARY, contentType should be application/octet-stream.
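To make the pairing concrete, here is a minimal sketch that places the two settings side by side for the BINARY case, reusing the names from the earlier examples. It is an illustration of the rule above rather than a complete document.

```yaml
channels:
  /projects/your-project/topics/topic-proto-schema:
    bindings:
      googlepubsub:
        schemaSettings:
          encoding: binary                    # Topic-level Encoding is BINARY...
          name: projects/your-project/schemas/message-proto
components:
  messages:
    messageProto:
      contentType: application/octet-stream   # ...so the message is documented as an octet stream
      payload: true
```

For a Topic whose Encoding is JSON, the same two places would read `encoding: json` and `contentType: application/json`.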
schemaFormat
A string containing the name of the schema format used to define the message payload.
The value of the schemaFormat property depends on whether you’re using Avro or Protobuf for your Message Schema.
When an Avro Schema is used for the PubsubMessage, the value of schemaFormat should be based on the appropriate
Avro media type and Avro version. (See the examples above.) When a Protobuf Schema is used for the PubsubMessage,
schemaFormat should be omitted because AsyncAPI does not support Protobuf natively at this time.
Conclusion
Documenting your APIs is extremely important, and for asynchronous APIs, AsyncAPI is the “industry standard”. Prior to the 2.5.0 release of AsyncAPI, you could document parts of your Cloud Pub/Sub topology, but there was a large gap: there was no way to document Cloud Pub/Sub-specific information that may affect your API consumers. AsyncAPI 2.5.0 adds native Cloud Pub/Sub support, which should make documenting your Cloud Pub/Sub topology with AsyncAPI more complete. So if you’re using Cloud Pub/Sub, you now have no reason not to use AsyncAPI for documenting your APIs.