The checkZonePeers API: Is your availability zone "1" equal to my "1"?

This is a local copy of my article on the FTA blog...

Introduction

Azure Availability Zones (AZs) have been generally available since March 2018. "Availability Zones are physically separate locations within an Azure region. Each Availability Zone consists of one or more datacenters equipped with independent power, cooling, and networking."

Take for example the Azure Region 'West Europe', which is in the 'Geography Europe', and is the Microsoft'ish region name for Amsterdam (in the Netherlands (in Europe (on Earth))). West Europe is equipped with 3 availability zones. Roughly speaking, these availability zones are about 1 millisecond of network latency away from each other, or something like 30-40 km (if you have to walk). If a local disaster renders resources in one AZ unavailable, you hopefully have some resiliency baked into your solution architecture, with failover resources in the second and third AZ. When you spin up resources in Azure, some of these are so-called "zonal services":

A zonal service or resource "supports AZs, and can be deployed to a specific, self-selected availability zone, to achieve more stringent latency or performance requirements."

A zonal resource's ARM JSON representation contains a zones array. For example, let's look at a (zonal) IP address:

{
    "id":            "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/...",
    "type":          "Microsoft.Network/publicIPAddresses",
    "name":          "ipproductionweb",
    "resourceGroup": "webfrontend",
    "location":      "westeurope",
    "zones": [
        "1"
    ],
    ...
}

Great, we now know that our IP address 'lives' in West Europe (Amsterdam), in availability zone "1" (whatever that might mean). This information allows us to place other resources in the same AZ. For example, the network interface card or load balancer that you attach that IP address to certainly has to be in AZ 1 as well. When you place multiple resources into AZ 1, you know they'll be closer together than if one of them were in a different AZ. The AZ number helps you plan out a deployment, by deliberately placing resources in the same AZ, or in different AZs.
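To make the zones array a bit more tangible, here is a minimal sketch that pulls name, location and zones out of a resource's JSON with jq. The JSON is a hand-trimmed copy of the IP address above, and the helper name zone_summary is purely illustrative; treating a missing zones array as "none" is my own convention for this sketch.

```shell
#!/bin/bash
# Hand-trimmed copy of the zonal IP address shown above (illustrative data only).
resource_json='{
  "type": "Microsoft.Network/publicIPAddresses",
  "name": "ipproductionweb",
  "location": "westeurope",
  "zones": [ "1" ]
}'

# zone_summary (hypothetical helper): read resource JSON on stdin and print
# "<name> <location> <zones>". A resource without a zones array is shown as "none".
zone_summary() {
  jq -r '[ .name,
           .location,
           ((.zones // []) | if length == 0 then "none" else join(",") end)
         ] | join(" ")'
}

printf '%s' "${resource_json}" | zone_summary   # prints: ipproductionweb westeurope 1
```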

The availability zone numbers are meant to be used only within a single Azure region: AZ1 in West Europe (Amsterdam) has absolutely nothing to do with AZ1 in North Europe (Dublin in Ireland), except that they might be cooled by the same chilly North Sea air.

Assuming you have a single Azure subscription, that's all you need to know, and you can stop reading here. I appreciate your time and attention, check our official docs, and have a pleasant day.

If you're still with me, chances are that you or your company have multiple Azure subscriptions. You might even have multiple Azure AD tenants. Or you might work with subscriptions from customers or partners, who 'live' underneath the partner's Azure AD tenant. And this is where things get more complex:

The "availability zone 1 in West Europe" in subscription 'X' is NOT necessarily equivalent to "availability zone 1 in West Europe" in another subscription 'Y'!

When naming things 1, 2, 3 and so forth, we humans tend to start using the thingie called "1" first, and once we need a second thing, we use thingie "2", etc. If the Azure team had made Availability Zone "1" (in a certain region) the same for all of us, then that AZ would quickly demonstrate that "cloud is just the illusion of infinite resources": all customers would deploy their main workload into AZ1, only gradually leverage AZ2, and maybe, just maybe, deploy something into AZ3. AZ1 would run hot quickly, while AZ2 would have modest load. And AZ3 would be wondering what business justification got it built in the first place.

Therefore, the Azure team decided to 'shuffle' the mapping of the availability zone identifiers to the 'real/physical' availability zones, on a per-subscription basis. Once a subscription is created, that mapping will remain unchanged for the lifetime of that subscription.

So now imagine you want to deploy a resource (like a database VM) in your own subscription, but you want it to be physically placed in the very same availability zone which my subscription calls "AZ 2". How can you find out which AZ number (in your own subscription) you have to pick, to deploy close to my workload? You can find out using the Azure checkZonePeers API, which gives you a mapping table.

How does the checkZonePeers API work?

The checkZonePeers API lets you retrieve a mapping table, which tells you what other subscriptions call an AZ that your subscription knows under a certain name.

Let's say 'our' reference subscription ID, relative to which we want to determine AZ names (in the westeurope region) is 11111111-1111-1111-1111-111111111111. We ask the API:

Hey, I am 11111111-1111-1111-1111-111111111111, and I am interested in how two other subscription IDs (22222222-2222-2222-2222-222222222222 and 33333333-3333-3333-3333-333333333333) call 'my' AZs in westeurope...

The underlying REST API call looks something like this:

POST /subscriptions/11111111-1111-1111-1111-111111111111/providers/Microsoft.Resources/checkZonePeers?api-version=2020-01-01
Host: management.azure.com
Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsIng1dCI6IjJaUXBKM1V...
Content-Type: application/json

{
  "location": "westeurope",
  "subscriptionIds": [
    "/subscriptions/22222222-2222-2222-2222-222222222222",
    "/subscriptions/33333333-3333-3333-3333-333333333333"
  ]
}

In the URL, you have the subscription ID which is your 'north star', which defines the baseline of what AZ names are normative for you. In the request body, you specify the Azure region and the other subscriptions you want to have a mapping for.

The API's response looks like this:

{
  "subscriptionId": "11111111-1111-1111-1111-111111111111",
  "location": "westeurope",
  "availabilityZonePeers": [
    {
      "availabilityZone": "1",
      "peers": [
        { "subscriptionId": "22222222-2222-2222-2222-222222222222", "availabilityZone": "1" },
        { "subscriptionId": "33333333-3333-3333-3333-333333333333", "availabilityZone": "3" }
      ]
    },
    {
      "availabilityZone": "2",
      "peers": [
        { "subscriptionId": "22222222-2222-2222-2222-222222222222", "availabilityZone": "3" },
        { "subscriptionId": "33333333-3333-3333-3333-333333333333", "availabilityZone": "2" }
      ]
    },
    {
      "availabilityZone": "3",
      "peers": [
        { "subscriptionId": "22222222-2222-2222-2222-222222222222", "availabilityZone": "2" },
        { "subscriptionId": "33333333-3333-3333-3333-333333333333", "availabilityZone": "1" }
      ]
    }
  ]
}

Therefore, you have the following mapping:

Subscription                          AZ X  AZ Y  AZ Z
11111111-1111-1111-1111-111111111111  1     2     3
22222222-2222-2222-2222-222222222222  1     3     2
33333333-3333-3333-3333-333333333333  3     2     1

AZ 1 in our main sub (11111111-...) is AZ "1" as well in 22222222..., but it's called AZ "3" in 33333333-....
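If you don't want to read the mapping off by eye, a jq filter can answer both directions of the question. The following is a small sketch that embeds the sample response from above; the function names peer_zone and my_zone_for are made up for illustration and are not part of any Azure tooling.

```shell
#!/bin/bash
# The sample checkZonePeers response from above (base subscription 11111111-...).
response='{
  "subscriptionId": "11111111-1111-1111-1111-111111111111",
  "location": "westeurope",
  "availabilityZonePeers": [
    { "availabilityZone": "1", "peers": [
      { "subscriptionId": "22222222-2222-2222-2222-222222222222", "availabilityZone": "1" },
      { "subscriptionId": "33333333-3333-3333-3333-333333333333", "availabilityZone": "3" } ] },
    { "availabilityZone": "2", "peers": [
      { "subscriptionId": "22222222-2222-2222-2222-222222222222", "availabilityZone": "3" },
      { "subscriptionId": "33333333-3333-3333-3333-333333333333", "availabilityZone": "2" } ] },
    { "availabilityZone": "3", "peers": [
      { "subscriptionId": "22222222-2222-2222-2222-222222222222", "availabilityZone": "2" },
      { "subscriptionId": "33333333-3333-3333-3333-333333333333", "availabilityZone": "1" } ] }
  ]
}'

# peer_zone <my-zone> <peer-subscription-id>
# What does the peer subscription call the zone that the base subscription calls <my-zone>?
peer_zone() {
  printf '%s' "${response}" | jq -r --arg zone "$1" --arg sub "$2" '
    .availabilityZonePeers[]
    | select(.availabilityZone == $zone)
    | .peers[]
    | select(.subscriptionId == $sub)
    | .availabilityZone'
}

# my_zone_for <peer-subscription-id> <peer-zone>
# Which zone of the base subscription is the one the peer calls <peer-zone>?
my_zone_for() {
  printf '%s' "${response}" | jq -r --arg sub "$1" --arg zone "$2" '
    .availabilityZonePeers[]
    | select(any(.peers[]; .subscriptionId == $sub and .availabilityZone == $zone))
    | .availabilityZone'
}

peer_zone 1 33333333-3333-3333-3333-333333333333     # prints: 3
my_zone_for 22222222-2222-2222-2222-222222222222 2   # prints: 3
```

The second function is the one you need for the "deploy next to my AZ 2" question: the URL subscription is yours, the peer subscription is mine, and the result is the zone number to use in your own deployment.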

Subscriptions in different Azure ADs, but only a single Authorization header?

The aforementioned approach works if all your subscriptions are governed by the very same Azure AD tenant: you fetch the access token, do the lookup for all your subscriptions, and you're done.

However, if these subscriptions are hooked up to different AAD tenants, you need to demonstrate to the checkZonePeers API that you are authorized to access all of them. Unfortunately, the API only accepts a single bearer token in the HTTP Authorization header. How can we supply the other tokens? For such tenant-boundary-spanning API calls, the Azure Resource Manager API has a custom HTTP header, called x-ms-authorization-auxiliary, which can hold up to three 'auxiliary' tokens.

If the primary token (from the base subscription's AAD tenant) goes into Authorization: Bearer ..., you can put up to three additional AAD tokens from other AAD tenants into the x-ms-authorization-auxiliary header.

Please note that all of these tokens need to belong to the same subject, or user. Your Azure AD user must have an account (or be a guest) in all relevant AAD tenants, and of course you need to be a Reader on the underlying subscriptions.

Otherwise, you will get an AuxiliaryTokensInvalidUserIdentity error, indicating that Authentication failed for auxiliary token: The '1' auxiliary tokens are from the client(s) 'live.com#foobar@hotmail.com' which are different from the client of primary identity 'chgeuer@microsoft.com'.
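As a small sketch of how the auxiliary header value can be assembled: each token gets its own "Bearer " prefix, and the entries are separated by a comma and a space. The helper name auxiliary_header and the placeholder token strings are hypothetical; the limit check simply mirrors the "up to three" rule mentioned above.

```shell
#!/bin/bash
# auxiliary_header <token>... (hypothetical helper)
# Print the x-ms-authorization-auxiliary header line for one to three tokens.
auxiliary_header() {
  local value token
  if [ "$#" -lt 1 ] || [ "$#" -gt 3 ]; then
    echo "expected 1 to 3 auxiliary tokens, got $#" >&2
    return 1
  fi
  value="Bearer $1"
  shift
  for token in "$@"; do
    value="${value}, Bearer ${token}"
  done
  printf 'x-ms-authorization-auxiliary: %s\n' "${value}"
}

# Placeholder strings instead of real access tokens.
auxiliary_header "token-from-tenant-2" "token-from-tenant-3"
# prints: x-ms-authorization-auxiliary: Bearer token-from-tenant-2, Bearer token-from-tenant-3
```

The printed line can be handed to curl via --header, next to the regular Authorization header carrying the primary token.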

Enable the Microsoft.Resources/AvailabilityZonePeering feature for your subs

You might have to register usage of that API first. You can check that the feature is enabled, using this command:

az feature show \
   --namespace Microsoft.Resources \
   --name AvailabilityZonePeering \
   --subscription chgeuer-work \
   | jq -r '.properties.state'

If you don't get back "Registered", make sure you register for the feature in all relevant subscriptions.

az feature register \
   --namespace Microsoft.Resources \
   --name AvailabilityZonePeering
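With several subscriptions in play, the check-then-register step can be wrapped in a loop. This is only a sketch: the function name ensure_zone_peering is made up, it assumes you are already logged in with the Azure CLI, and it uses the CLI's generic --subscription, --query and --output arguments instead of piping through jq.

```shell
#!/bin/bash
# ensure_zone_peering <subscription>... (hypothetical helper)
# Register the AvailabilityZonePeering feature wherever it is not yet "Registered".
ensure_zone_peering() {
  local sub state
  for sub in "$@"; do
    state="$(az feature show \
      --namespace Microsoft.Resources \
      --name AvailabilityZonePeering \
      --subscription "${sub}" \
      --query properties.state \
      --output tsv)"
    if [ "${state}" = "Registered" ]; then
      echo "${sub}: already registered"
    else
      echo "${sub}: state is '${state}', registering"
      az feature register \
        --namespace Microsoft.Resources \
        --name AvailabilityZonePeering \
        --subscription "${sub}" > /dev/null
    fi
  done
}

# Usage (subscription names or IDs are yours to fill in):
#   ensure_zone_peering chgeuer-work another-subscription
```

Registration may not complete instantly, so it's worth re-running the check afterwards before calling the API.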

A practical demo script (pure bash, curl and jq)

Below, you can find a demo script, which you of course need to adapt to your environment, i.e. filling in the appropriate Azure AD tenant IDs and Azure subscription IDs.

In my environment, I have three Azure subscriptions and each of them is hooked up to a different Azure AD tenant.

Azure Subscription ID                 Azure Active Directory Tenant ID
11111111-1111-1111-1111-111111111111  aadaadaa-1111-1111-1111-111111111111
22222222-2222-2222-2222-222222222222  aadaadaa-2222-2222-2222-222222222222
33333333-3333-3333-3333-333333333333  aadaadaa-3333-3333-3333-333333333333

In order to make it simple, I'm running this in a Linux shell, under WSL, on my Windows laptop. For each of the (in my case three) required access tokens, I'm doing a device login, so I can simply authenticate in my Windows web browser. If that script detects that it's running within WSL, then I copy the device login's user code into the Windows clipboard, and kick the (Windows-side) default web browser to the device login page, so I just need to Ctrl-V paste the device user code, and do my authentication dance.

In addition to the actual response from the REST API, I output some claim contents from the various access tokens, which looks something like this:

access_token_1 Issuer:   iss="https://sts.windows.net/aadaadaa-1111-1111-1111-111111111111/"
access_token_1 Subject:  sub="djwNuWxvH-6cUIHWIwRBjVanUvsrG5Ty6eJMqcK722U"

access_token_2 Issuer:   iss="https://sts.windows.net/aadaadaa-2222-2222-2222-222222222222/"
access_token_2 Subject:  sub="djwNuWxvH-6cUIHWIwRBjVanUvsrG5Ty6eJMqcK722U"

access_token_3 Issuer:   iss="https://sts.windows.net/aadaadaa-3333-3333-3333-333333333333/"
access_token_3 Subject:  sub="djwNuWxvH-6cUIHWIwRBjVanUvsrG5Ty6eJMqcK722U"

As you can see, the Subject (sub) is the same in all of them (as mentioned earlier, this is needed)...

#!/bin/bash

aad1="aadaadaa-1111-1111-1111-111111111111"
sub1="11111111-1111-1111-1111-111111111111"
aad2="aadaadaa-2222-2222-2222-222222222222"
sub2="22222222-2222-2222-2222-222222222222"
aad3="aadaadaa-3333-3333-3333-333333333333"
sub3="33333333-3333-3333-3333-333333333333"

function deviceLogin {
  local tenant="$1" ; 
  local resource="$2" ;

  deviceResponse="$(curl \
    --silent \
    --request POST \
    --url "https://login.microsoftonline.com/${tenant}/oauth2/v2.0/devicecode" \
    --data-urlencode "client_id=04b07795-8ddb-461a-bbee-02f9e1bf7b46" \
    --data-urlencode "scope=${resource}" \
    )" ;

  device_code="$(echo "${deviceResponse}" | jq -r ".device_code")" ;
  sleep_duration="$(echo "${deviceResponse}" | jq -r ".interval")" ;
  access_token="" ;

  if [[ $(grep --ignore-case Microsoft /proc/version) ]]; then
     # On WSL, copy response code to clipboard, and launch user's web browser
     echo "$( echo "${deviceResponse}" | jq -r ".user_code" )" | iconv -f utf-8 -t utf-16le | clip.exe
     cmd.exe /C "start $( echo "${deviceResponse}" | jq -r ".verification_uri" )"
  fi
 
  while [[ "${access_token}" == "" ]]
  do
      tokenResponse="$(curl \
          --silent \
          --request POST \
          --url "https://login.microsoftonline.com/${tenant}/oauth2/v2.0/token" \
          --data-urlencode "grant_type=urn:ietf:params:oauth:grant-type:device_code" \
          --data-urlencode "client_id=04b07795-8ddb-461a-bbee-02f9e1bf7b46" \
          --data-urlencode "device_code=${device_code}" \
          )" ;
  
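      tokenError="$(echo "${tokenResponse}" | jq -r ".error // empty")" ;
      if [ "${tokenError}" == "authorization_pending" ]; then
        # The user has not finished signing in yet: repeat the instructions and poll again.
        >&2 echo "$(echo "${deviceResponse}" | jq -r ".message")" ;
        sleep "${sleep_duration}" ;
      elif [ -n "${tokenError}" ]; then
        # Any other error (for example an expired device code) is treated as fatal here.
        >&2 echo "Device login failed: ${tokenError}" ;
        return 1 ;
      else
        access_token="$(echo "${tokenResponse}" | jq -r ".access_token")" ;
      fi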
  done ;
  
  echo "${access_token}"
}

function showToken {
  local name="$1" ; 
  local access_token="$2" ;
  claims="$( jq -R 'split(".") | .[1] | @base64d | fromjson' <<< "${access_token}" )"
  
  echo "${name} Issuer:   iss=$( echo "${claims}" | jq .iss )"
  # echo "${name} Audience: aud=$( echo "${claims}" | jq .aud )"
  echo "${name} Subject:  sub=$( echo "${claims}" | jq .sub )"
}

echo "Please login to the tenants using the same user ID..."
arm_api="https://management.azure.com/.default"
access_token_1="$( deviceLogin "${aad1}" "${arm_api}" )"
access_token_2="$( deviceLogin "${aad2}" "${arm_api}" )"
access_token_3="$( deviceLogin "${aad3}" "${arm_api}" )"

#
# Show selected contents of the access tokens, like issuer and subject.
#
showToken "access_token_1" "${access_token_1}"
showToken "access_token_2" "${access_token_2}"
showToken "access_token_3" "${access_token_3}"

location="westeurope"

checkZonePeersBody="$( echo "{}"                                                                \
   | jq --arg x "${location}"            '.location=$x'                                         \
   | jq                                  '.subscriptionIds=[]'                                  \
   | jq --arg x "/subscriptions/${sub2}" '.subscriptionIds[.subscriptionIds | length] |= .+ $x' \
   | jq --arg x "/subscriptions/${sub3}" '.subscriptionIds[.subscriptionIds | length] |= .+ $x' \
)"

curl \
    --silent \
    --request POST \
    --url "https://management.azure.com/subscriptions/${sub1}/providers/Microsoft.Resources/checkZonePeers?api-version=2020-01-01" \
    --header "Authorization: Bearer ${access_token_1}" \
    --header "x-ms-authorization-auxiliary: Bearer ${access_token_2}, Bearer ${access_token_3}" \
    --header "Content-Type: application/json" \
    --data "${checkZonePeersBody}" \
| jq "."
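One caveat on the showToken function in the script: JWT segments are base64url-encoded (using '-' and '_' instead of '+' and '/') and come without padding, so jq's @base64d may not decode every token, depending on the jq version and the token's contents. A more defensive decoder could look like the sketch below. The function name jwt_payload is made up, the sample token is a hand-built dummy (not a real access token), and it assumes a base64 command that understands -d, as GNU coreutils does.

```shell
#!/bin/bash
# jwt_payload <token> (hypothetical helper)
# Print the decoded JSON payload (the second dot-separated segment) of a JWT.
jwt_payload() {
  local payload
  # Take the second segment and map the base64url alphabet back to plain base64.
  payload="$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')"
  # Restore the padding that base64url encoding leaves out.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "${payload}" | base64 -d
}

# Dummy token: only the middle segment matters for this function.
jwt_payload "header.eyJzdWIiOiJhYmMifQ.signature"   # prints: {"sub":"abc"}
```

The output can be piped into jq just like the claims variable in showToken, for example to extract .iss or .sub.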
