Yes, You Can Run Claude on Azure

You can deploy Anthropic's Claude inside Microsoft Foundry today, then govern it with the same Azure API Management gateway that already controls your OpenAI traffic.

If you only have a minute, here is what you need to know.

Most people assume "AI on Azure" means OpenAI's GPT models. It does not. You can deploy Anthropic's Claude inside Microsoft Foundry today, on your subscription, in your tenant, billed on your Azure invoice.
You can point Claude Code, the coding agent your developers are probably already using, at that Azure deployment. It authenticates with their existing Azure login. No personal API keys, no data leaving your compliance boundary.
The setup is genuinely small: deploy the model, set a few environment variables, run az login. There is a starter kit that provisions the whole thing with one command.
The real prize is governance. Once Claude runs as a Foundry deployment, you front it with Azure API Management, the same gateway that already governs your OpenAI traffic. One set of spending caps, safety screening, and cost tracking now covers both vendors.
The honest caveats matter. Claude on Foundry is in preview, limited to two regions, and does not support Anthropic's server-managed agent features. Governing it through the gateway requires the newer version of API Management. Know the edges before you build on them.

The thing nobody mentioned

Here is a question I have started asking enterprise teams: which AI models can you run inside your own Azure tenant?

The answer comes back fast and confident. "OpenAI. GPT-4, GPT-5, the o-series. That is what Azure AI is."

It is also wrong, or at least two model families out of date. Anthropic's Claude models run on Microsoft Foundry right now. Claude Sonnet 4.6, Opus, Haiku, deployed the same way you deploy a GPT model, called through the same Foundry resource, billed on the same Azure invoice. Not a connector to Anthropic's servers. A deployment in your subscription.

I keep running into the same reaction when I show people this: a pause, then "wait, since when?" So, in the spirit of a useful nudge rather than a lecture: hey, not sure if you knew this. You can run Claude on Azure. And if your organization already standardized on Azure and a gateway in front of it, Claude drops in behind your existing controls without you giving up a thing.

Let me show you how, and then why the gateway part is the part that actually matters.

Why this is more than a curiosity

The reflex reaction is "interesting, but we already have GPT, why would I care." Here is why a platform team should care, and it has nothing to do with which model is smarter this quarter.

Your developers are already using Claude. Claude Code has become one of the more widely adopted agentic coding tools, and the people on your teams who reach for it are doing so on personal accounts, with personal API keys, sending your codebase to an endpoint your security team has never reviewed. That is textbook shadow AI. You can argue with it or you can absorb it.

Running Claude on Foundry turns shadow AI into governed AI without telling your developers to stop using the tool they like. The model they want now lives inside the boundary your security team already controls. Their requests authenticate with their Azure identity through Microsoft Entra ID, so there is no personal key to leak. The traffic stays in your tenant and your region. The cost lands on a bill you already reconcile. And the same monitoring you use for everything else in Azure now sees it.

That is the business case in one sentence: it is the difference between developers using Claude in spite of you and developers using Claude through you.

Shadow AI to governed AI: developers using Claude on personal accounts move inside the tenant boundary, authenticated by Entra ID, with traffic and cost staying in the organization

How it actually works

This is the part where most "you can do X on Azure" articles wave their hands. Here are the concrete steps, because the surprise is how few there are.

Claude on Azure setup steps: deploy the model in Foundry, call it through Anthropic's Messages API, and point Claude Code at the deployment with a handful of environment variables

Deploy the model. Claude models are available in Foundry as a partner offering through the Azure Marketplace. You subscribe to the offering, then deploy your chosen Claude model as a global standard deployment, the same flow as any Foundry model. Today the models are available in two regions, East US 2 and Sweden Central, and you need a paid Azure subscription with a real billing method behind it. If you would rather not click through the portal, Microsoft and Anthropic publish a starter kit that provisions a Foundry account, project, and your Claude deployments with a single azd up, using either Bicep or Terraform.

Call it like Claude, because it is Claude. The deployment exposes Anthropic's own Messages API. Your endpoint takes the shape https://<your-resource>.services.ai.azure.com/anthropic/v1/messages. You call it with the standard Anthropic SDK or plain REST, with the usual anthropic-version header. Authentication is either a Foundry API key or, better, Microsoft Entra ID with a bearer token. Nothing about your existing Anthropic SDK code has to change except where it points and how it authenticates.

Point Claude Code at it. This is the part that surprises people most. Claude Code, the CLI and VS Code agent, has a built-in Foundry mode. You set a handful of environment variables and you are done:

export CLAUDE_CODE_USE_FOUNDRY=1
export ANTHROPIC_FOUNDRY_RESOURCE="your-foundry-resource-name"
export ANTHROPIC_DEFAULT_SONNET_MODEL="your-sonnet-deployment-name"
export ANTHROPIC_DEFAULT_OPUS_MODEL="your-opus-deployment-name"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="your-haiku-deployment-name"
az login

That is the whole thing. Claude Code builds the endpoint from your resource name, picks up your Azure CLI session for authentication, and runs against your Foundry deployment. The /login and /logout commands inside the tool turn off, because identity now comes from Azure, not from a personal Anthropic account. Your developers keep the exact workflow they already know. The traffic just stops leaving the building.

The bottom line on setup: this is an afternoon, not a project.

The part that earns its keep: putting a gateway in front

Deploying Claude on Azure is the table stakes. The reason to do it inside an enterprise is what you can put in front of it.

I wrote a whole piece on this called "The Trust Boundary Moved," and the short version is this: once an AI agent can act, the only place you can actually control it is the single point every request passes through on its way to a model or a tool. That point is the API gateway. In Microsoft's world, that gateway is Azure API Management.

Here is the move that ties this article to that one. API Management does not care whether the model behind it is OpenAI or Claude. Its AI gateway capabilities govern Foundry model deployments across providers, not just Azure OpenAI. So the moment Claude is a Foundry deployment, every control you already built for your GPT traffic extends to it, written once, applied to both.

That means Claude inherits, with no new code:

A spending cap measured in tokens, the unit you actually get billed for, enforced two ways at once. A per-minute speed limit that returns a 429 and tells a runaway agent to wait. A longer quota, hourly through yearly, that returns a 403 and stops it cold when the budget is spent.

Content safety screening on the way in, through the same Prompt Shields integration that catches both the obvious jailbreak and the poisoned instruction hidden in a document the agent read on its own.

Per-consumer cost tracking, so the token spend of the legal team's Claude-powered contract agent shows up tagged and separate from everyone else's, and you can finally answer the question every finance leader eventually asks.

Semantic caching, so a Claude agent that fires the same request fifty different ways in one run stops paying for the same answer fifty times.

One gateway. Two model vendors. One set of policies.

Write the controls once for OpenAI, and Claude inherits them the moment it becomes a Foundry deployment.

Azure API Management gateway controls extended to Claude: token spending caps, content safety screening, per-consumer cost tracking, and semantic caching applied across both OpenAI and Claude traffic

Microsoft has gone a step further and built an AI gateway directly into Foundry's control plane. From the Foundry portal you can set token-per-minute limits and quotas on a model deployment, including Claude, and route the traffic through an associated API Management instance. Limits apply at the project level, so one team's Claude usage cannot starve another's. For a platform team, this is the difference between "we have Claude somewhere" and "we know exactly who is using Claude, what it can reach, what it costs, and where the log is."

Be honest about the edges

This is preview technology wearing an enterprise suit. If you skip this section you will get surprised later, so do not skip it.

The honest edges of Claude on Foundry: preview status, two regions only, Messages API rather than the full Anthropic platform, and v2-tier API Management required for cross-provider governance

Foundry support for Claude is in preview, and the footprint is narrow. Two regions today. A paid subscription with an active pay-as-you-go billing method, which rules out several common subscription types, including credit-only and some partner arrangements. Treat region and subscription support as the first thing you verify, not the last.

Foundry runs the Messages API, not Anthropic's full platform. If your plan depends on Anthropic's server-managed agent features or its server-side refusal fallbacks, those do not run on Foundry. The pattern that does work is the Messages API with your own tool-use loop, which is the right pattern for most enterprise agent work anyway. Just do not architect around a capability the deployment does not expose.

Governing Claude through API Management needs the newer version. This is the same catch I flagged in the governance article, and it bites here too. The cross-provider AI gateway features live on the v2 tiers of API Management. A company that stood up its gateway on an older tier for OpenAI, then wants to bring Claude under the same controls, has migration work to do first. The lesson has not changed: if Claude is anywhere on your roadmap, start on a v2 tier now.

Some of the most useful Foundry-integrated gateway pieces are themselves preview. The single control room that inventories every model, agent, and tool you run is arriving, not arrived. Plan for it; do not yet depend on it.

None of this makes the capability unusable. The core path, deploy Claude, point a governed client at it, cap and screen and track it through API Management, works today. The edges are about knowing which parts are finished.

What to do this week

Confirm the boundary problem you already have. Ask your engineering leads, honestly, who is using Claude Code or the Claude API on personal accounts. The answer is almost never zero. That is your shadow AI exposure, and it is the reason this matters.

Stand up one Claude deployment in Foundry. Use the starter kit. One azd up, a non-production project, a single Sonnet deployment. You are proving the path, not rolling out to the org.

Point one developer's Claude Code at it. Set the five environment variables, run az login, and have them work normally for a day. Confirm the traffic shows up in your Foundry resource and not on a personal account.

Check your API Management tier. If you are on a v2 tier, you can put Claude behind the same policies as your OpenAI traffic now. If you are not, that migration is the real prerequisite, and the sooner you know, the better.

Set the two limits that matter. A per-minute speed cap and a monthly token quota on the Claude deployment, scoped to the team using it. This is the control that turns "we have Claude" into "we have Claude, governed."

The point

The interesting fact is that Claude runs on Azure. The useful fact is that running it there puts it behind the same gateway you already trust with everything else.

Your developers want the best models. Your security team wants the boundary. For two years those have felt like opposing forces, and the gap between them is exactly where shadow AI grows. Claude on Foundry, fronted by API Management, closes it. The model your teams actually want now lives inside the controls your organization actually needs.

So, again, in case nobody mentioned it: you can run Claude on Azure. The only real question is whether it runs through your gateway or around it.

Matthew Kruczek is Managing Director at EY, leading Microsoft domain initiatives within Digital Engineering. Connect with Matthew on LinkedIn to discuss bringing third-party models under enterprise governance for your organization.

References

Microsoft Learn. "Deploy and use Claude models in Microsoft Foundry (preview)." learn.microsoft.com
Microsoft Learn. "Configure Claude Code for Microsoft Foundry." learn.microsoft.com
Microsoft Learn. "Claude models in Microsoft Foundry (preview)." learn.microsoft.com
Microsoft Learn. "AI gateway in Azure API Management." learn.microsoft.com
Microsoft Learn. "Enforce token limits for models." learn.microsoft.com
Microsoft Learn. "Configure AI Gateway in your Foundry resources." learn.microsoft.com
Azure-Samples. "Claude on Foundry starter kit." github.com
Kruczek, Matthew. "The Trust Boundary Moved: Azure API Management Is Now Your Agentic Governance Layer." matthewkruczek.ai

Yes, You Can Run Claude on Azure. And Your Gateway Already Knows What to Do With It.

The thing nobody mentioned

Why this is more than a curiosity

How it actually works

The part that earns its keep: putting a gateway in front

Be honest about the edges

What to do this week

The point

References

Continue Reading

The Trust Boundary Moved: Azure API Management Is Now Your Agentic Governance Layer

You Installed Claude. Now the Hard Work Starts.

Tokenomics: Why a Spend Cap Is the Most Expensive Way to Save Money