
Connecting OpenClaw to Snowflake Cortex


I've been playing around with OpenClaw - an open-source AI assistant framework - and wanted to hook it up to Snowflake's Cortex LLM API. The idea: use enterprise-grade models like Claude Sonnet 4.5 through Snowflake's infrastructure while keeping my existing config as a fallback. Sounds straightforward, right?

Well, it kind of is. And kind of isn't. The integration itself is surprisingly clean, but getting there involved a few detours I didn't expect.

Why Cortex?

Here's the thing most people don't realize about Snowflake Cortex: it exposes a Chat Completions API that's a superset of the OpenAI API. That means any tool that supports OpenAI can - in theory - connect to Snowflake with minimal changes. You get Claude, GPT, Llama, Mistral and others through a single endpoint, all billed through Snowflake credits. Plus the usual enterprise goodies: network policies, PAT tokens with role restrictions, audit logging. All built-in.
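To make that concrete, here's a minimal sketch of the request shape an OpenAI-compatible client would send to Cortex. No network call is made; the URL and the /chat/completions suffix follow the standard OpenAI convention, and the helper function is illustrative, not part of any SDK:

```python
# Sketch: the OpenAI-style chat completions payload Cortex accepts.
# BASE_URL mirrors the endpoint pattern described later in this post.
import json

BASE_URL = "https://myorganization-myaccount.snowflakecomputing.com/api/v2/cortex/v1"

def chat_request(model: str, prompt: str, max_completion_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completions request for Cortex."""
    return {
        "url": f"{BASE_URL}/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            # Cortex expects the newer parameter name, not the legacy max_tokens
            "max_completion_tokens": max_completion_tokens,
        },
    }

req = chat_request("claude-sonnet-4-5", "Say hello in one word")
print(json.dumps(req["body"], indent=2))
```

Because the payload is plain OpenAI shape, any client that lets you override the base URL can talk to this endpoint.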

So far so good.

Setting Up a Least-Privilege Service Account

Rather than reusing an admin account (please don't), I created a dedicated service user. The SNOWFLAKE.CORTEX_USER database role grants access to Cortex LLM functions - nothing more. No data access, no warehouse modification rights.

-- Create a role with only Cortex access
CREATE ROLE JEEVES_ROLE;

-- Grant the Cortex User database role (required for REST API)
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE JEEVES_ROLE;

-- Grant warehouse usage (no OPERATE/MODIFY - just usage)
GRANT USAGE ON WAREHOUSE TASK_WH TO ROLE JEEVES_ROLE;

Since the service runs from a known IP range, I also locked it down with a network rule and policy:

-- Create a network rule for the service's IP range
CREATE OR REPLACE NETWORK RULE ADMIN_DB.NETWORK_POLICY_MGMT.JEEVES_SERVICE
  MODE = INGRESS
  TYPE = IPV4
  VALUE_LIST = ('10.0.1.0/24');  -- Could be a VPC CIDR, a static IP, whatever fits your setup

-- Create a network policy referencing the rule
CREATE NETWORK POLICY JEEVES_POLICY
  ALLOWED_NETWORK_RULE_LIST = (ADMIN_DB.NETWORK_POLICY_MGMT.JEEVES_SERVICE);

Even if the PAT token gets compromised, it can only be used from the allowed IP range. Belt and suspenders.

For the user itself, Snowflake's TYPE = SERVICE is the right choice for programmatic access:

CREATE USER JEEVES
  TYPE = SERVICE
  DEFAULT_ROLE = JEEVES_ROLE
  DEFAULT_WAREHOUSE = TASK_WH
  NETWORK_POLICY = JEEVES_POLICY
  COMMENT = 'AI assistant service account';

GRANT ROLE JEEVES_ROLE TO USER JEEVES;

And for auth, a PAT token with role restriction:

ALTER USER JEEVES ADD PROGRAMMATIC ACCESS TOKEN 
  JEEVES_PAT
  ROLE_RESTRICTION = 'JEEVES_ROLE'
  DAYS_TO_EXPIRY = 365;

The ROLE_RESTRICTION bit is important - it prevents the token from being used with elevated privileges even if someone grants additional roles to the user later. And heads up: the token secret is only shown once at creation time. Save it immediately.

Configuring the Application

The Cortex Chat Completions API endpoint follows this pattern:

https://<org>-<account_name>.snowflakecomputing.com/api/v2/cortex/v1

One thing that tripped me up right away: you need to use your account name, not the account locator. You can check yours with:

SELECT CURRENT_ORGANIZATION_NAME() || '-' || CURRENT_ACCOUNT_NAME();
-- e.g. returns: myorganization-myaccount

So the URL becomes: https://myorganization-myaccount.snowflakecomputing.com/api/v2/cortex/v1
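The assembly rule is simple enough to capture in a tiny helper - a sketch, with the same example values the SQL query above returns (note it's the account name, never the locator):

```python
# Sketch: build the Cortex base URL from organization and account NAME
# (i.e. CURRENT_ACCOUNT_NAME(), not the CURRENT_ACCOUNT() locator).
def cortex_base_url(org: str, account: str) -> str:
    return (
        f"https://{org.lower()}-{account.lower()}"
        ".snowflakecomputing.com/api/v2/cortex/v1"
    )

print(cortex_base_url("MYORGANIZATION", "MYACCOUNT"))
```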

For OpenClaw, I added a new provider using the OpenAI-compatible API:

{
  "env": {
    "CORTEX_API_KEY": "<JEEVES_PAT_TOKEN>"
  },
  "models": {
    "mode": "merge",
    "providers": {
      "cortex": {
        "baseUrl": "https://<orgname>-<account_name>.snowflakecomputing.com/api/v2/cortex/v1",
        "apiKey": "${CORTEX_API_KEY}",
        "api": "openai-completions",
        "models": [
          {
            "id": "claude-sonnet-4-5",
            "name": "Cortex Sonnet 4.5",
            "reasoning": true,
            "input": ["text", "image"],
            "contextWindow": 200000,
            "maxTokens": 16384,
            "compat": {
              "maxTokensField": "max_completion_tokens",
              "supportsDeveloperRole": false
            }
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "cortex/claude-sonnet-4-5",
        "fallbacks": ["snowflake/claude-sonnet-4-5"]
      }
    }
  }
}

See that compat object? Those two settings are doing the heavy lifting. More on that in a second.

The Troubleshooting Journey

Getting this working wasn't exactly a smooth ride. Here's what actually happened.

404 Not Found

My first attempt used the account locator in the URL. 404. The fix turned out to be trivial - use the account name, not the locator:

SELECT CURRENT_ACCOUNT_NAME();  -- Returns your account name (use this)
SELECT CURRENT_ACCOUNT();       -- Returns the account locator (don't use this!)

"developer messages are not supported"

A debugging proxy revealed this one. OpenAI introduced a developer message role for reasoning models like o1. When OpenClaw detects a reasoning model ("reasoning": true), it sends system prompts with role: "developer" instead of role: "system". Cortex doesn't support this.

Fix: "supportsDeveloperRole": false in the model's compat settings.

"max_tokens is deprecated"

Earlier testing with curl already revealed this one. OpenAI's newer API uses max_completion_tokens instead of the legacy max_tokens. Cortex follows this convention strictly and will reject requests using the old parameter.

Fix: "maxTokensField": "max_completion_tokens" in the model's compat settings.
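Taken together, the two compat settings effectively rewrite the outgoing request. A sketch of that transformation - the function name and approach are mine for illustration, not OpenClaw internals:

```python
# Sketch of what the two compat fixes do to an outgoing request:
# rename the legacy max_tokens field, and downgrade the developer
# role to system since Cortex rejects it.
def apply_cortex_compat(payload: dict) -> dict:
    out = dict(payload)
    # maxTokensField: "max_completion_tokens"
    if "max_tokens" in out:
        out["max_completion_tokens"] = out.pop("max_tokens")
    # supportsDeveloperRole: false -> send role "system" instead
    out["messages"] = [
        {**m, "role": "system"} if m.get("role") == "developer" else m
        for m in out.get("messages", [])
    ]
    return out

fixed = apply_cortex_compat({
    "model": "claude-sonnet-4-5",
    "max_tokens": 16384,
    "messages": [{"role": "developer", "content": "Be terse."}],
})
```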

And It Works

After applying both fixes:

$ openclaw agent --local -m "Say hello in one word" --session-id test
Hello!

What a relief.

What I Learned

There are a few debugging lessons worth highlighting. First, use a proxy when debugging API integrations - it reveals the actual request and response bodies, especially when responses are compressed. Second, "OpenAI-compatible" doesn't mean identical. Providers implement different subsets of the API, and the edge cases will get you. And third, 400 errors often have descriptive JSON bodies if you can get past the gzip encoding.
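That last point is worth a two-liner. Error bodies often come back gzip-compressed, which makes them look like binary noise until you decompress them - a small sketch with a simulated error payload:

```python
# A 400 response body often arrives gzip-compressed. Decompressing it
# reveals a perfectly descriptive JSON error. The payload here is
# simulated, not a captured Cortex response.
import gzip
import json

raw = gzip.compress(json.dumps(
    {"error": {"message": "max_tokens is deprecated", "code": "invalid_request"}}
).encode())

body = json.loads(gzip.decompress(raw).decode())
print(body["error"]["message"])
```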

Security Layers

For those keeping score, here's what the setup looks like from a security perspective:

- Authentication: PAT token, not a password
- Authorization: role-restricted to CORTEX_USER only
- Network: IP allowlist enforced via network policy
- Data access: none - the service account can call the Cortex API only
- Privilege escalation: blocked by the PAT's ROLE_RESTRICTION
- Warehouse: USAGE only, no modify rights
- Token lifecycle: expires after 365 days, with rotation capability

The Result

OpenClaw now uses Snowflake Cortex as its primary LLM provider, with an existing Snowflake Anthropic endpoint as a fallback. All AI inference routes through my Snowflake account, which gives me centralized billing, enterprise audit logging, network-layer security, consistent access to the latest models, and automatic failover if the primary is unavailable.

The key takeaways: use TYPE = SERVICE for programmatic users. Always use ROLE_RESTRICTION on PAT tokens. Network policies add a real extra layer for service accounts. The OpenAI-compatible endpoint at /api/v2/cortex/v1 makes integration with existing tools surprisingly straightforward - once you work through the quirks.

If you're running any tool that speaks OpenAI, connecting it to Cortex is worth the effort.

What's Next

Now that the plumbing is in place, the interesting part begins. Cortex gives you access to a whole range of models through a single endpoint - Claude, GPT, Llama, Mistral, and more. That means you can start matching the right model to the right task instead of throwing your most expensive model at everything.

My plan is to set up specialized agents: a lightweight model like Haiku for quick summarization and triage, something like Llama for code generation where you need fast iteration, and Sonnet for the heavy lifting - complex reasoning, architecture decisions, that kind of thing. Same endpoint, same auth, same billing. You just swap the model ID in the config and you're done.
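In config terms, that plan is just more entries in the provider's models list. A hypothetical fragment - the extra model IDs here are assumptions on my part, so check the Cortex model catalog for what's actually available in your region:

```json
{
  "models": [
    { "id": "claude-sonnet-4-5", "name": "Cortex Sonnet 4.5" },
    { "id": "llama3.1-70b", "name": "Cortex Llama 70B" },
    { "id": "mistral-large2", "name": "Cortex Mistral Large" }
  ]
}
```

Each agent then just points its model at cortex/<model-id> and everything else stays the same.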

The cost savings add up quickly. Not every task needs the biggest model, and with Cortex you don't need separate API keys, billing accounts, or provider integrations to find that out. One service account, one network policy, multiple models - each doing what it's best at.


A write-up on that will follow in the next few days - subscribe to catch it!

