Federated credentials from GitHub and GitLab pipelines to Azure
Last updated
This post also appeared in the FastTrack Blog.
Federated credentials / workload identity federation allows your CI/CD pipelines in GitHub and GitLab to access your Azure subscription without any secrets stored in the pipeline config. GitHub's azure/login@v1 action handles this transparently, but I also explain how it works under the hood. GitLab supplies the necessary token directly to your pipeline run.
Both GitHub and GitLab are easy to set up and federate securely with your Azure subscription.
BitBucket can't be set up that way, because tokens issued by BitBucket don't have a predictable subject identifier.
This article demonstrates how to configure, and securely access, an Azure environment from within a GitHub and a GitLab CI/CD pipeline, without having to store credentials on the GitHub/GitLab side. The article also briefly explains why BitBucket currently doesn't support that capability.
CI/CD pipelines often have to interact with an Azure cloud environment, e.g. to upload artifacts to a storage account, read values from a Key Vault or deploy resources. Service principal credentials are a well-established way for such access, but have the disadvantage of using secrets and passwords, which have to be managed securely. Using 'federated identity credentials' (also sometimes called 'workload identity federation'), such cloud access can happen without having to store secrets in the CI/CD pipeline.
Old-fashioned service principals: Traditionally, a service principal credential consists of three values: the client_id of the service principal or app registration, the Azure AD tenant_id where the service principal exists, and a client_secret password needed to fetch a token. Ideally, we want to avoid having to handle a client_secret; these secrets often have a limited lifetime (they expire and have to be updated after a couple of months), and access to them must be protected so that only authorized parties can use them.
Federated credentials: Modern CI/CD systems, such as GitHub or GitLab, allow their users to run pipelines on container-based runners in their infrastructure. As part of these environments, they also expose a service-provider-specific OAuth2 identity provider (IdP) that the code in the CI/CD pipeline can fetch tokens from. The idea behind a federated credential is to say: "In Azure, there is a user-assigned managed identity (or a service principal), and the CI/CD pipeline should be able to use a token from the local IdP to sign in to that UAMI/SP".
So there are two token exchanges:
The CI/CD pipeline somehow talks to the 'local' IdP, says "I am the main branch within this project, please issue me a JWT token which I can then use to sign-in to Azure". That security token has an issuer being GitHub or GitLab, an audience of Azure AD, and the token's subject being information about which CI/CD pipeline is currently running.
The CI/CD pipeline then talks to Azure Active Directory, and exchanges the GitHub- or GitLab-issued token for one that can be used to access the desired Azure resource. The exchange basically says "Here's a security token, showing that I'm this CI/CD pipeline, please give me a token to call into Azure Key Vault (or ARM, or Storage, or whatever it might be)".
The Microsoft Entra 'Workload Identity Federation' docs show in depth how the flow works in general:
In our scenario, the 'external identity provider' is the GitHub/GitLab-internal IdP. Simply speaking,
the CI/CD pipeline fetches the token from GitHub,
fetches the Azure token from Azure AD,
during that request, Azure AD validates the GitHub-issued token by retrieving the external IdP's signing credentials and checking the token signature, and
finally the CI/CD code can access the Azure resources:
Federated credentials are supported by both service principals and user-assigned managed identities (UAMI). In the end, a UAMI under the hood is represented by a service principal in Azure AD, too. The identities of both a service principal and UAMI can be granted access to Azure resources, so both are a fit here, too.
However, the lifecycle management and API surface for these two identities are very different:
A service principal is created and configured within Azure Active Directory (for example by calling az ad app create), and the federated identity credential is added by posting to https://graph.microsoft.com/beta/applications/${applicationObjectId}/federatedIdentityCredentials/ via Microsoft Graph API. Depending on where you work, writing to Azure AD and Graph API might be tightly regulated; many companies prevent regular users from directly creating a service principal, so this route might be challenging.
A user-assigned managed identity, on the other hand, can be completely handled in the Azure Resource Manager (ARM) control plane. A UAMI (and its federated identity configuration) is a first-party ARM object, so that might be more approachable for teams who have full control over their Azure subscription (but lack Azure AD privileges).
Both the service principal configuration (via Microsoft Graph API), as well as the UAMI configuration (via ARM API) require the same configuration data:
name: Each SP or UAMI might have up to 20 different federated credentials configured, and each credential must have a name attribute.
issuer: Each federated credential must have the issuer URL configured. The issuer is something like "https://token.actions.githubusercontent.com" or "https://gitlab.com", or a custom domain name in case of a dedicated GitLab instance. It must be equivalent to the iss claim in the security token. How is it used? Azure AD appends the path .well-known/openid-configuration to the issuer URL, to retrieve the IdP's signing credentials, used to check the signature on the tokens.
audience: An array (with exactly one string) of the aud claim in the federated identity token. By default, this is "api://AzureADTokenExchange", but you can customize that if desired.
subject: The subject value is the sub claim of the security token, and is determined by the CI/CD environment.
For GitHub, this subject for example looks like "repo:chgeuer/azure-workload-identity-github:ref:refs/heads/main", in which chgeuer/azure-workload-identity-github represents the user or organization (chgeuer), and the repository (azure-workload-identity-github), while ref:refs/heads/main indicates a CI/CD pipeline running on the main branch.
For GitLab, this subject looks similar, like "project_path:chgeuer/azure-workload-identity-federation-demo:ref_type:branch:ref:main", i.e. chgeuer being the user, azure-workload-identity-federation-demo being the repository and main being the branch.
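To make the two subject formats concrete, here is a small sketch that composes them from their parts. The function names github_subject and gitlab_subject are hypothetical helpers introduced for illustration; the formats themselves are the ones quoted above, and only the branch-build case is covered.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of any CLI): compose the sub claim
# that a branch build is expected to carry.

# GitHub: repo:<owner>/<repo>:ref:refs/heads/<branch>
github_subject() {
  local owner="$1" repo="$2" branch="$3"
  printf 'repo:%s/%s:ref:refs/heads/%s\n' "$owner" "$repo" "$branch"
}

# GitLab: project_path:<namespace>/<project>:ref_type:branch:ref:<branch>
gitlab_subject() {
  local namespace="$1" project="$2" branch="$3"
  printf 'project_path:%s/%s:ref_type:branch:ref:%s\n' "$namespace" "$project" "$branch"
}

github_subject chgeuer azure-workload-identity-github main
gitlab_subject chgeuer azure-workload-identity-federation-demo main
```

Because Azure compares the configured subject with the token's sub claim as a fixed string, the value has to be known before the pipeline runs.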
Unfortunately, BitBucket handles this differently: For federated credential sign-in to work well, the expected sub claim in the security token must have a predictable value. BitBucket's sub claims look like this: "{ad073b2b-7126-4f19-9eed-1c9b10abe160}:{2b2ac083-d564-4064-8ea1-43e6aeff2b96}:{37416cfa-3260-4c31-bea4-a6b2f29272a7}". The three GUIDs are "{repositoryUuid}:{deploymentEnvironmentUuid}:{stepUuid}". The repositoryUuid and the deploymentEnvironmentUuid are stable, but unfortunately, the third element in the tuple, the stepUuid, is regenerated with each pipeline run. Therefore, each newly issued security token has a different sub claim. Given that an Azure federated identity credential expects a fixed subject, and does not allow semantics like 'Subject claim starts with ... or conforms to a regular expression', BitBucket's tokens can't be used for federated credential flows.
These four values, represented in JSON, in a GitHub configuration would look like this:
while a GitLab config would look like this:
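Assembled from the values described above, the two configurations could be written out as follows. This is a sketch: the file names and the name values (github-main, gitlab-main) are hypothetical labels, the field names follow Microsoft Graph's federated identity credential resource, and the GitLab audience is assumed to be left at the default.

```shell
#!/usr/bin/env bash
# GitHub: pipeline running on the main branch of chgeuer/azure-workload-identity-github
cat > github-credential.json <<'EOF'
{
  "name": "github-main",
  "issuer": "https://token.actions.githubusercontent.com",
  "subject": "repo:chgeuer/azure-workload-identity-github:ref:refs/heads/main",
  "audiences": [ "api://AzureADTokenExchange" ]
}
EOF

# GitLab: pipeline running on the main branch of chgeuer/azure-workload-identity-federation-demo
cat > gitlab-credential.json <<'EOF'
{
  "name": "gitlab-main",
  "issuer": "https://gitlab.com",
  "subject": "project_path:chgeuer/azure-workload-identity-federation-demo:ref_type:branch:ref:main",
  "audiences": [ "api://AzureADTokenExchange" ]
}
EOF
```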
Example on creating a service principal with a federated credential (using a script)
The following bash script gives you an idea of how the service principal would be created (az ad app create), and how you can add the federatedIdentityCredentials JSON via the Graph API:
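A minimal sketch of such a script, under stated assumptions: the function names are hypothetical, the credential JSON is read from a file passed as the second argument, and a recent Azure CLI is assumed (one where az ad app create reports the object ID as id). It is an illustration of the two steps, not the script from the sample repository.

```shell
#!/usr/bin/env bash
# Graph API collection URL for an app registration's federated credentials
graph_fic_url() {
  printf 'https://graph.microsoft.com/beta/applications/%s/federatedIdentityCredentials/\n' "$1"
}

# Usage: create_sp_with_federated_credential <display-name> <credential.json>
# Prints the client ID (appId) of the new app registration.
create_sp_with_federated_credential() {
  local display_name="$1" credential_file="$2"
  local app_object_id app_id

  # Step 1: app registration plus its service principal
  app_object_id="$(az ad app create --display-name "$display_name" --query id --output tsv)" || return 1
  app_id="$(az ad app show --id "$app_object_id" --query appId --output tsv)" || return 1
  az ad sp create --id "$app_id" --output none || return 1

  # Step 2: POST the federated credential JSON to Graph API
  az rest --method POST \
    --uri "$(graph_fic_url "$app_object_id")" \
    --body "@${credential_file}" \
    --output none || return 1

  printf '%s\n' "$app_id"
}

# Example (requires an authenticated az session with Azure AD write permissions):
#   create_sp_with_federated_credential github-demo ./github-credential.json
```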
Example creating a user-assigned managed identity with a federated credential (using infra-as-code)
When creating a UAMI via Bicep, these two steps could be combined in a single representation:
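For comparison, the same two steps (create the identity, attach the federated credential) can also be sketched imperatively with the Azure CLI. This is not the Bicep definition, just an equivalent under assumptions: the function name and its argument order are hypothetical, and the audience is left at the default value.

```shell
#!/usr/bin/env bash
# Usage: create_uami_with_federated_credential <resource-group> <identity-name> <credential-name> <issuer> <subject>
# Prints the client ID of the user-assigned managed identity.
create_uami_with_federated_credential() {
  local resource_group="$1" identity_name="$2" credential_name="$3" issuer="$4" subject="$5"

  # Step 1: the user-assigned managed identity, a plain ARM resource
  az identity create \
    --resource-group "$resource_group" \
    --name "$identity_name" \
    --output none || return 1

  # Step 2: the federated credential, a child resource of the identity
  az identity federated-credential create \
    --resource-group "$resource_group" \
    --identity-name "$identity_name" \
    --name "$credential_name" \
    --issuer "$issuer" \
    --subject "$subject" \
    --audiences "api://AzureADTokenExchange" \
    --output none || return 1

  az identity show \
    --resource-group "$resource_group" \
    --name "$identity_name" \
    --query clientId --output tsv
}

# Example (requires an authenticated az session):
#   create_uami_with_federated_credential my-rg my-uami gitlab-main "https://gitlab.com" \
#     "project_path:chgeuer/azure-workload-identity-federation-demo:ref_type:branch:ref:main"
```

Everything here goes through ARM, so no Azure AD or Graph API permissions are involved.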
Once the service principal or the UAMI is created on the Azure side, you need two pieces of information from Azure:
The Azure Active Directory's Tenant ID, i.e. either the tenant's GUID (something like 942023a6-efbe-4d97-a72d-532ef7337595), or one of the configured domain names (such as chgeuerfte.onmicrosoft.com).
The service principal's or UAMI's client ID, which always is a GUID.
These two values must be configured in the CI/CD environment, as environment variables, or secrets (even though they're strictly speaking not secret).
The following sample shows how to ZIP the repo's source code and upload it into a storage account:
Azure Storage is just one example of an Azure resource that we can access from within GitHub. You can see a few interesting parts in that YAML file:
The YAML file must grant the workflow the id-token: write permission in its permissions section (usually alongside contents: read), so that the GitHub IdP is able to issue a token.
The azure/login@v1 action on GitHub automagically handles all the federated sign-in to an SP or a UAMI. In the sample, we set tenant-id and client-id based on GitHub secret values. If the expected audience on Azure were different from "api://AzureADTokenExchange", that value could also be set through the action's audience input.
A full example of that configuration can be found here: https://github.com/chgeuer/azure-workload-identity-github/
For those interested in understanding what the azure/login@v1 action on GitHub does under the hood, we can mimic it all in a bash script with curl and jq. This allows us to inspect the security tokens along the way. Let's tweak the step in the CI/CD pipeline to just run a shell script:
In this code, we copy the tenant ID and client ID from the secrets into environment variables. In action.sh, we now manually do the two token exchanges, and print out the claims from the JWT:
First, we fetch a token from GitHub's IdP: the environment variable ACTIONS_ID_TOKEN_REQUEST_URL contains the URL of the IdP. The URL-encoded audience goes into the query string, and we supply a GitHub-internal security token from the environment variable ACTIONS_ID_TOKEN_REQUEST_TOKEN as bearer token.
Second, we issue a token issuance request against our Azure AD tenant, in which we specify the UAMI's or SP's client_id, and supply the GitHub-issued JWT as client_assertion.
Then we use jq -R 'split(".") | .[1] | @base64d | fromjson' to extract the claims portion from both JWT tokens, and write out a Markdown-formatted table with the iss, aud and sub claims of both tokens to the file "${GITHUB_STEP_SUMMARY}", so that the table shows up in the CI/CD pipeline's output:
| Token | Claim | Value |
| --- | --- | --- |
| GitHub | Issuer | iss="https://token.actions.githubusercontent.com" |
| GitHub | Audience | aud="api://AzureADTokenExchange" |
| GitHub | Subject | sub="repo:chgeuer/azure-workload-identity-github:ref:refs/heads/main" |
| Azure | Issuer | iss="https://sts.windows.net/***/" |
| Azure | Audience | aud="https://storage.azure.com" |
| Azure | Subject | sub="079fd90b-a298-480a-b951-257d0974f77e" |
In the above output table, GitHub masks (***) the Azure AD tenant in the Azure/Issuer value, because that string comes from the AZURE_TENANT_ID secret.
In the rest of the shell script, the azure_access_token shell variable can be used to call Azure services, in this sample Azure Storage.
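The steps described above can be sketched as a script like the following. It is a minimal sketch, not the action.sh from the sample repository: the function names are hypothetical, the v2.0 token endpoint with a /.default scope is an assumption, and the jq expression from the text is extended with two gsub calls so that URL-safe base64 characters decode cleanly. AZURE_TENANT_ID and AZURE_CLIENT_ID are expected to be exported by the workflow step.

```shell
#!/usr/bin/env bash
audience="api://AzureADTokenExchange"
resource="https://storage.azure.com"

# Azure AD token endpoint (v2.0) of a tenant
token_endpoint() {
  printf 'https://login.microsoftonline.com/%s/oauth2/v2.0/token\n' "$1"
}

# Exchange 1: ask GitHub's IdP for a JWT carrying the desired audience
fetch_github_token() {
  curl --silent --get \
    --header "Authorization: Bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" \
    --data-urlencode "audience=${audience}" \
    "${ACTIONS_ID_TOKEN_REQUEST_URL}" | jq -r '.value'
}

# Exchange 2: trade the GitHub-issued JWT for an Azure AD access token
fetch_azure_token() {
  local github_token="$1"
  curl --silent \
    --data-urlencode "grant_type=client_credentials" \
    --data-urlencode "client_id=${AZURE_CLIENT_ID}" \
    --data-urlencode "client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer" \
    --data-urlencode "client_assertion=${github_token}" \
    --data-urlencode "scope=${resource}/.default" \
    "$(token_endpoint "${AZURE_TENANT_ID}")" | jq -r '.access_token'
}

# Emit iss, aud and sub of a JWT as Markdown table rows
claims_rows() {
  local label="$1" token="$2"
  printf '%s\n' "$token" \
    | jq -R 'split(".") | .[1] | gsub("-"; "+") | gsub("_"; "/") | @base64d | fromjson' \
    | jq -r --arg l "$label" \
        '"| \($l) | iss | \(.iss) |", "| \($l) | aud | \(.aud) |", "| \($l) | sub | \(.sub) |"'
}

main() {
  local github_token azure_access_token
  github_token="$(fetch_github_token)" || return 1
  azure_access_token="$(fetch_azure_token "$github_token")" || return 1
  {
    echo "| Token | Claim | Value |"
    echo "| --- | --- | --- |"
    claims_rows GitHub "$github_token"
    claims_rows Azure "$azure_access_token"
  } >> "${GITHUB_STEP_SUMMARY}"
  # azure_access_token can now be sent as a bearer token to Azure Storage
}

# Run only inside a GitHub Actions job that was granted id-token: write
if [ -n "${ACTIONS_ID_TOKEN_REQUEST_URL:-}" ]; then main; fi
```

No secret is involved at any point: the only credential sent to Azure AD is the short-lived JWT that GitHub issued for this particular run.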
On the GitLab side, we don't have an equivalent of GitHub's azure/login@v1 action, so we follow a pure script-based approach in our GitLab YAML:
One difference is that token issuance within GitLab is handled differently: You don't need to use curl to request a GitLab-issued token from some token endpoint. Instead, you just specify an id_tokens section, in which you name a desired environment variable (ID_TOKEN_FOR_AZURE in this example) and the audience for that token, and GitLab stores the JWT token in your environment variable of choice prior to running your job.
We run the whole job in the Azure CLI Docker image (mcr.microsoft.com/azure-cli:latest), so we can use commands like az login in our script. Inside that script, we can then use az login with the --federated-token option to log in to a given UAMI or service principal, using a federated token from GitLab.
For illustration purposes, we can fetch a secret from a Key Vault (assuming our UAMI is authorized to read that secret), and print out the token contents on screen.
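The login and the Key Vault read described above could be sketched like this. The function names are hypothetical, and AZURE_CLIENT_ID and AZURE_TENANT_ID are assumed to be configured as CI/CD variables; ID_TOKEN_FOR_AZURE is the variable GitLab fills in from the id_tokens section.

```shell
#!/usr/bin/env bash
# Sign in as the UAMI or service principal, presenting the GitLab-issued JWT
login_with_gitlab_token() {
  az login --service-principal \
    --username "${AZURE_CLIENT_ID}" \
    --tenant "${AZURE_TENANT_ID}" \
    --federated-token "${ID_TOKEN_FOR_AZURE}" \
    --output none
}

# Usage: read_secret <vault-name> <secret-name>
# Prints the secret's value; the identity needs permission to read it.
read_secret() {
  az keyvault secret show \
    --vault-name "$1" \
    --name "$2" \
    --query value --output tsv
}

# Example, inside the GitLab job's script section:
#   login_with_gitlab_token && read_secret my-vault my-secret
```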
Given that a service principal, or a UAMI, can have up to 20 federated credentials configured, one can also hook up multiple pipelines (from different providers) to the same Azure identity:
Since March '23, Managed Identity support for Azure DevOps is in public preview. When you're running ADO-based CI/CD pipelines on a compute resource (such as a VM) in your own subscription, you can bind a system-assigned or user-assigned managed identity to that compute resource, and from within your pipeline run access all the Azure resources the managed identity has access to.
Federated identity credentials are currently targeted to allow external identity providers (non-Azure AD) to facilitate sign-in to Azure environments. Given that both Azure DevOps and your Azure resources are all governed by Azure Active Directory, there's not necessarily a need to use a federated credential.
As of now (June '23), federated identity credentials unfortunately don't work across Azure AD tenant boundaries (error code AADSTS700222). Should you want to allow 'inbound' connections from a managed identity in another tenant into resources in your Azure DevOps tenant, check the team's guidance on "Can I add a managed identity from a different tenant to my organization?"
In case this article sparked your interest, and you'd like to go deeper with end-to-end samples, check out the following resources:
https://github.com/chgeuer/azure-workload-identity-github/
CI/CD Provider: GitHub
Azure Identity: Service Principal, created and configured using a script
YAML: Demo of both the 'proper' azure/login@v1 action and a bash script.
Scenario Azure Service: Uploading a file to blob storage
Azure Environment, and GitHub variables, all configured as part of setup.sh
https://github.com/chgeuer/github-action-via-user-assigned-managed-identity-to-keyvault-secret/
CI/CD Provider: GitHub
Azure Identity: A user-assigned managed identity, created using Bicep
YAML: Using all the good pre-defined actions, no fiddling with bash
Scenario Azure Service: Fetch a secret from Key Vault, and base64-encode it so GitHub's secret filter doesn't kick in 😬
https://gitlab.com/chgeuer/azure-workload-identity-federation-demo
CI/CD Provider: GitLab
Azure Identity: A user-assigned managed identity, created using Bicep
YAML: Using az CLI commands for Azure access.
Scenario Azure Service: Fetch a secret from Key Vault, fetching the secret using the az cli.
Minimal "Azure AD Workload identity federation"
In case you're interested in fully understanding how the federated credential / workload identity federation works with Azure, this blog post shows how to create an IdP signing credential, upload it into a publicly accessible location (blob storage), and show how to create a self-issued token, which can be used to sign-in to a federated credential on Azure.