Binary Analysis Deep Dive

How to read minified JavaScript inside compiled binaries, and the reasoning chain that turned cryptic variable names into working API calls.

← Back to main OAuth guide

What Kind of Binary Is This?
The strings Command — Your X-Ray Vision
Understanding JavaScript Minification
Search Strategy — How to Find What You Need
Decoding VW8 — The OAuth Config Object
Decoding CH9 — The Beta Header
Decoding eK_ — The OAuth Scopes
Decoding the Refresh Request Shape
Connecting Code to curl — The Translation Process
General Binary Analysis Techniques

1. What Kind of Binary Is This?

The first step in any binary analysis is understanding what you're looking at:

Identifying the binary type

$ file ~/.local/share/claude/versions/2.1.87
Mach-O 64-bit executable arm64

This tells us it's a native macOS ARM binary. But Claude Code is written in TypeScript/JavaScript. So how is it a native binary?

Key Concept: Bun-compiled JavaScript

Claude Code uses Bun (a JavaScript runtime) to compile the TypeScript source into a single native executable. The process works like this:

Bundle: All JS/TS source files are bundled into a single file (like webpack/esbuild)
Minify: Variable names are shortened, whitespace removed
Embed: The minified JS is embedded as a string resource inside a native Bun runtime binary

This means the entire JavaScript source code — minified but structurally intact — is sitting inside the binary as readable ASCII text. The strings command can extract it.


  ┌─────────────────────────────────────────────────────────┐
  │               Mach-O Binary (arm64)                      │
  │  ┌──────────────────────────────────────────────────┐   │
  │  │  Bun Runtime (compiled C/Zig code)               │   │
  │  │  - JavaScript engine                              │   │
  │  │  - Node.js compatibility layer                    │   │
  │  │  - HTTP client, crypto, fs, etc.                  │   │
  │  └──────────────────────────────────────────────────┘   │
  │  ┌──────────────────────────────────────────────────┐   │
  │  │  Embedded JavaScript Bundle (minified)             │   │
  │  │                                                    │   │
  │  │  This is what strings extracts!                   │   │
  │  │                                                    │   │
  │  │  var VW8={BASE_API_URL:"https://...",          │   │
  │  │      TOKEN_URL:"https://...",                      │   │
  │  │      CLIENT_ID:"9d1c250a-..."};                    │   │
  │  │  var CH9="files-api-2025-04-14,oauth-...";     │   │
  │  │  var eK_=["user:inference",...];               │   │
  │  │  ...hundreds of thousands of lines...              │   │
  │  └──────────────────────────────────────────────────┘   │
  │  ┌──────────────────────────────────────────────────┐   │
  │  │  Other embedded resources (icons, etc.)           │   │
  │  └──────────────────────────────────────────────────┘   │
  └─────────────────────────────────────────────────────────┘

2. The `strings` Command — Your X-Ray Vision

What does strings do?

The strings command scans a binary file and extracts every sequence of printable ASCII characters that is at least 4 characters long (by default). It doesn't "decompile" anything — it just finds readable text embedded in the raw bytes.

For a Bun-compiled JS app, strings extracts:

The entire minified JavaScript source (string literals, object keys, URLs, etc.)
Error messages and log strings
Bundled dependency code (AWS SDK, Azure SDK, etc. — lots of noise)
Native Bun runtime strings

The scale of output

How much text is embedded?

$ strings ~/.local/share/claude/versions/2.1.87 | wc -l
~2,400,000 lines

$ strings ~/.local/share/claude/versions/2.1.87 | wc -c
~180 MB of text

That's 2.4 million lines of extracted strings. You cannot read this manually. You must use targeted search patterns (grep) to find what you need.

The Noise Problem

Claude Code bundles many third-party libraries (AWS SDK, Azure Identity SDK, various Node.js packages). A naive search like grep "token" returns tens of thousands of results from these libraries. The art of binary analysis is crafting searches that filter out the noise and find Anthropic-specific code.

3. Understanding JavaScript Minification

Before we can interpret what we find, we need to understand what minification does to JavaScript code.

What a bundler/minifier does

// ── ORIGINAL SOURCE CODE (what developers wrote) ──────────────

const OAUTH_CONFIG = {
  BASE_API_URL: "https://api.anthropic.com",
  TOKEN_URL: "https://platform.claude.com/v1/oauth/token",
  CLIENT_ID: "9d1c250a-e61b-44d9-88ed-5944d1962f5e",
};

const BETA_FEATURES = "files-api-2025-04-14,oauth-2025-04-20";
const API_VERSION = "2023-06-01";

const DEFAULT_SCOPES = [
  "user:inference",
  "user:inference:claude_code",
  "user:sessions:claude_code",
  "user:mcp_servers",
  "user:file_upload"
];

// ── AFTER MINIFICATION (what ends up in the binary) ───────────

VW8={BASE_API_URL:"https://api.anthropic.com",TOKEN_URL:"https://platform.claude.com/v1/oauth/token",CLIENT_ID:"9d1c250a-e61b-44d9-88ed-5944d1962f5e"}
CH9="files-api-2025-04-14,oauth-2025-04-20"
bH9="2023-06-01"
eK_=["user:inference","user:inference:claude_code","user:sessions:claude_code","user:mcp_servers","user:file_upload"]

What gets renamed vs. what stays

Element	Renamed?	Why?
Variable names `OAUTH_CONFIG` → `VW8`	Yes — always	Safe to rename because only internal code references them
Object property keys `BASE_API_URL`, `CLIENT_ID`	No — usually preserved	Properties might be accessed via dynamic string lookup (`obj["CLIENT_ID"]`), or used in APIs — the minifier can't safely rename them
String literals `"https://api.anthropic.com"`	No — never	Strings are data, not identifiers. Renaming them would change program behavior
String keys in JSON/requests `"grant_type"`, `"refresh_token"`	No — never	These are wire protocol names — they must match what the server expects
Function names `refreshAccessToken` → `_S`	Yes — usually	Internal function references are safe to shorten

The Golden Rule of Binary Analysis

String literals and object property names survive minification. This means URLs, API keys, header values, request field names, error messages, and scope strings are all readable in the binary exactly as the developer wrote them. Only variable names and function names get mangled.

4. Search Strategy — How to Find What You Need

With 2.4 million lines to search, strategy matters. Here's the general approach:

Strategy: Start with what you know, follow the thread

Anchor on Known Strings

Search for strings you already know must exist — URLs, domain names, header names that you saw in HTTP responses, error messages you received.

Read the Surrounding Context

When you find a hit, look at what's around it. Minified code is dense — a single line often contains an entire function or config object.

Identify Structure

Object literals {key:"value",...}, array literals [a,b,c], and function signatures retain their shape. You can read the structure even with mangled variable names.

Test Your Hypothesis

Turn your reading into a curl command and test it. A 200 or 429 means you're right. A 400 means the format is wrong. A 404 means wrong endpoint.

Filtering out noise

The binary contains massive amounts of third-party code. Key filtering patterns:

# GOOD — target Anthropic-specific strings
strings binary | grep -E "anthropic|claude" | grep -v "github\|docs\|mailto"

# GOOD — search for URLs with specific path patterns
strings binary | grep -oE "https://[^ \"']*anthropic[^ \"']*"

# GOOD — extract JSON-like object literals
strings binary | grep -oE '\{[^}]*refresh_token[^}]*\}'

# BAD — too broad, drowns in AWS/Azure SDK results
strings binary | grep "token"         # → tens of thousands of hits
strings binary | grep "refresh"       # → hundreds of Azure MSAL hits
strings binary | grep "client_id"     # → Azure, Google, Notion, GitHub hits

The -oE Flag is Your Best Friend

grep -oE 'pattern' prints only the matched portion, not the entire line. Since minified JS lines can be hundreds of thousands of characters long, this is essential. Without -o, a single match could dump 500KB of text.

5. Decoding `VW8` — The OAuth Config Object

This is the most information-rich discovery. Let me walk through exactly how I found it and how I knew what it meant.

The search that found it

I was looking for the eK_ variable (the scopes array) because an earlier search had shown it held OAuth scopes. I searched for its name:

$ strings /path/to/claude | grep "eK_" | head -5

One of the results contained this massive line (reformatted for readability):

The raw extracted code Annotated

eK_=[A1H,_S,
  "user:sessions:claude_code",
  "user:mcp_servers",
  "user:file_upload"],

a96=Array.from(new Set([...EW8,...eK_])),

VW8={
  BASE_API_URL:"https://api.anthropic.com",
  CONSOLE_AUTHORIZE_URL:"https://platform.claude.com/oauth/authorize",
  CLAUDE_AI_AUTHORIZE_URL:"https://claude.com/cai/oauth/authorize",
  CLAUDE_AI_ORIGIN:"https://claude.ai",
  TOKEN_URL:"https://platform.claude.com/v1/oauth/token",
  API_KEY_URL:"https://api.anthropic.com/api/oauth/claude_cli/create_api_key",
  ROLES_URL:"https://api.anthropic.com/api/oauth/claude_cli/roles",
  CONSOLE_SUCCESS_URL:"https://platform.claude.com/buy_credits?...",
  CLAUDEAI_SUCCESS_URL:"https://platform.claude.com/oauth/code/success?app=claude-code",
  MANUAL_REDIRECT_URL:"https://platform.claude.com/oauth/code/callback",
  CLIENT_ID:"9d1c250a-e61b-44d9-88ed-5944d1962f5e",
  OAUTH_FILE_SUFFIX:"",
  MCP_PROXY_URL:"https://mcp-proxy.anthropic.com",
  MCP_PROXY_PATH:"/v1/mcp/{server_id}"
}

How I knew what `VW8` was

Observe

The object has property names like BASE_API_URL, TOKEN_URL, CLIENT_ID, AUTHORIZE_URL. These are SCREAMING_SNAKE_CASE constants — a convention for configuration constants.

Observe

The values are all URLs related to OAuth: authorization endpoints, token endpoints, redirect URLs. This is clearly an OAuth configuration object.

Infer

Property names survived minification because the minifier detected they were accessed by string key elsewhere (e.g., config["CLIENT_ID"] or via Object.keys()). The variable name VW8 is meaningless — it was originally something like OAUTH_DEFAULTS or CLAUDE_CONFIG.

Infer

TOKEN_URL points to platform.claude.com, not api.anthropic.com. This tells us token operations (refresh, exchange) go to a different host than API calls.

Verify

We tested curl -s https://platform.claude.com/v1/oauth/token ... and got a 429 (rate limited) instead of a 404 (not found), confirming the endpoint exists and accepts requests.

6. Decoding `CH9` — The Beta Header

This was the single most critical discovery — without this header, nothing works.

How I found it

After the first API call failed with "OAuth authentication is currently not supported", I knew there was something extra needed. I searched for beta-related strings:

$ strings /path/to/claude | grep -E "anthropic-beta|oauth-20" | sort -u

But this returned noise. The breakthrough came from a broader URL-extraction search:

$ strings /path/to/claude | grep -E "api\.anthropic\.com" \
  | grep -v "github\|docs\|mailto\|billing" | sort -u

This returned a large chunk of minified code, and within it was this fragment:

CH9="files-api-2025-04-14,oauth-2025-04-20",bH9="2023-06-01"

How I knew these were HTTP headers

Observe the Values

"files-api-2025-04-14,oauth-2025-04-20" — This is a comma-separated list of feature identifiers with dates. Anthropic uses this exact format for their anthropic-beta header (documented in their public API docs for features like "prompt caching").

Observe the Sibling

"2023-06-01" is right next to it. This is the well-known anthropic-version header value — it's in every Anthropic API example. Two header values defined side by side strongly suggests they're used together in request headers.

Pattern Match

The value "oauth-2025-04-20" contains "oauth" — and our problem was that OAuth didn't work. This beta flag literally says "enable OAuth" with a date. This is the missing piece.

Variable Naming Pattern

CH9 and bH9 share a suffix (H9). In minified code, variables defined near each other often get sequential names. Two header values would be defined in the same block of code → similar minified names.

Verify

Added anthropic-beta: files-api-2025-04-14,oauth-2025-04-20 to the curl request → HTTP 200. The exact same request without it → HTTP 401. Confirmed.

Why the minifier didn't rename the string values

Remember: CH9 is a variable name — the minifier renamed it. But "files-api-2025-04-14,oauth-2025-04-20" is a string literal — it's data that gets sent over the wire. The minifier can never change string values because that would break the program. This is why binary analysis of JS apps is so productive: all the important data survives minification.

7. Decoding `eK_` — The OAuth Scopes

The search

To find the refresh token request format, I searched for code patterns containing refresh_token as a JSON/object key:

$ strings /path/to/claude | grep -oE '\{[^}]*refresh_token[^}]*\}' | grep -v azure -i

This clever regex (\{[^}]*refresh_token[^}]*\}) matches any {...} block containing the text refresh_token, extracting just the object literal. The -v azure filters out Azure SDK noise.

One of the results was:

{let q={grant_type:"refresh_token",refresh_token:H,client_id:m8().CLIENT_ID,scope:((_?.length)?_:eK_).join(" ")}}

How I decoded this line

Observe the Object Shape

This is an object literal being assigned to q. The keys are: grant_type, refresh_token, client_id, scope. These are standard OAuth2 token request parameters — they match the RFC 6749 Section 6 spec exactly.

Observe m8().CLIENT_ID

m8() is a minified function call that returns an object with a CLIENT_ID property. Since we already found VW8 has a CLIENT_ID property, m8() likely returns VW8 (or a wrapper around it). The minifier renamed the function but kept the property name.

Observe the Scope Logic

((_?.length) ? _ : eK_).join(" ") means: "If a scopes argument was passed, use it; otherwise use the default scopes array (eK_), then join with spaces." This tells us eK_ is the default scopes array.

Infer the Full Request

Combining everything: POST to VW8.TOKEN_URL with body grant_type=refresh_token&refresh_token=...&client_id=VW8.CLIENT_ID&scope=eK_.join(" ")

Then I searched for eK_ to find its definition:

eK_=[A1H,_S,"user:sessions:claude_code","user:mcp_servers","user:file_upload"]

Three string literals are visible directly. A1H and _S are minified references. From another nearby line:

eK_,CLAUDE_AI_INFERENCE_SCOPE:()=>_S,ALL_OAUTH_SCOPES:()=>a96

The export name CLAUDE_AI_INFERENCE_SCOPE tells us _S = "user:inference:claude_code" (or similar). And by process of elimination with standard Anthropic scope naming, A1H = "user:inference".

8. Decoding the Refresh Request Shape

Beyond the request body, I also found the response shape in the same search:

{access_token:O,refresh_token:T=H,expires_in:z}

Decoding the destructuring

This is a JavaScript destructuring assignment. In readable form:

// Minified:
{access_token:O, refresh_token:T=H, expires_in:z}

// De-minified (what the developer wrote):
{
  access_token: newAccessToken,        // O = the new access token
  refresh_token: newRefreshToken = oldRefreshToken,  // T=H means: default to old if not present
  expires_in: expiresIn               // z = seconds until expiry
}

Reading Destructuring with Defaults

refresh_token: T = H means "extract the refresh_token field into variable T; if it's undefined, use H as the default." Since H is the parameter name for the old refresh token (seen in the request object above), this tells us: the server may or may not return a new refresh token — if it doesn't, the old one is still valid.

9. Connecting Code to curl — The Translation Process

Here's how each piece of minified JS maps to the final curl command:

For the usage check (API call)

Minified Code Fragment	What It Means	curl Translation
`VW8.BASE_API_URL`	API base URL	`https://api.anthropic.com`
`/v1/messages` (found via URL search)	Messages endpoint path	Appended to base URL
`Authorization: Bearer` + token from keychain	OAuth Bearer auth	`-H "Authorization: Bearer $TOKEN"`
`bH9="2023-06-01"`	API version header	`-H "anthropic-version: 2023-06-01"`
`CH9="files-api-2025-04-14,oauth-2025-04-20"`	Beta features header	`-H "anthropic-beta: oauth-2025-04-20"`

For the token refresh

Minified Code Fragment	What It Means	curl Translation
`VW8.TOKEN_URL`	Token endpoint	`https://platform.claude.com/v1/oauth/token`
`grant_type:"refresh_token"`	OAuth2 grant type	`grant_type=refresh_token`
`refresh_token:H`	The refresh token parameter	`refresh_token=$REFRESH_TOKEN`
`m8().CLIENT_ID` → `VW8.CLIENT_ID`	OAuth client identifier	`client_id=9d1c250a-e61b-44d9-88ed-5944d1962f5e`
`eK_.join(" ")`	Space-separated scopes	`scope=user:inference%20user:inference:claude_code%20...`
`.join(" ")` → scopes joined with spaces	Content type must be form-encoded	`-H "content-type: application/x-www-form-urlencoded"`

How I knew it was form-encoded, not JSON

The code builds a plain object q = {grant_type: ..., ...} and there's no JSON.stringify visible. In the HTTP/OAuth world, when a token endpoint receives parameters, the standard format is application/x-www-form-urlencoded per RFC 6749. I also confirmed this by trying JSON first (which returned 400) and then form-encoded (which returned 429 — valid but rate-limited).

10. General Binary Analysis Techniques

Summary of techniques used, applicable to any Bun/Electron/Node.js compiled binary:

Technique 1: URL Extraction

# Extract all URLs mentioning a domain
strings binary | grep -oE "https://[^ \"']*yourdomain[^ \"']*" | sort -u

When to use: First step in any analysis. Gives you all API endpoints, auth URLs, CDN paths, etc.

Technique 2: Object Literal Extraction

# Extract {key:value} objects containing a specific key
strings binary | grep -oE '\{[^}]*keyword[^}]*\}'

When to use: When you know a field name (like refresh_token) and want to see the full request/response shape.

Limitation: Doesn't work for nested objects (the regex stops at the first }).

Technique 3: Identifier Frequency Analysis

# Find minified names related to a concept
strings binary | grep -oE '[A-Za-z_][A-Za-z0-9_]*Refresh[A-Za-z0-9_]*' | sort | uniq -c | sort -rn

When to use: To get an overview of how a concept (auth, refresh, token) is used across the codebase. High-frequency names are usually important.

Technique 4: Contextual Sibling Search

# Find what's defined near a known string
strings binary | grep -B2 -A2 "known_string"

When to use: When you've found one piece (e.g., the API version string) and want to find related values defined nearby (e.g., the beta header).

Caveat: strings output lines don't correspond to source code lines. Long minified JS might be one enormous string that spans one "line." Use -o (only matching) to manage output size.

Technique 5: Export/Module Boundary Analysis

# Find exported names (these retain semantic meaning)
strings binary | grep -oE '[A-Z_]{2,}:' | sort | uniq -c | sort -rn | head -20

When to use: Module export names (like CLAUDE_AI_INFERENCE_SCOPE) are preserved because they're part of the public API between modules. They give you semantic context for the minified variable names they reference.

Technique 6: Negative Filtering

# Filter OUT known noise sources
strings binary | grep "pattern" | grep -iv "azure\|aws\|google\|msal\|microsoft\|github\|notion"

When to use: Always. Bundled SDKs are the #1 source of noise. Build up a negative filter as you encounter junk libraries. The negative filters I used progressively in this project:

azure — Azure Identity SDK (massive)
aws — AWS SDK
google — Google auth libraries
msal — Microsoft Authentication Library
github — GitHub links in comments
notion — Notion MCP integration

Technique 7: UUID/Key Extraction

# Find UUID-shaped strings (potential client_ids, API keys)
strings binary | grep -oE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' | sort | uniq -c | sort -rn

When to use: When you need to find client IDs, tenant IDs, or other identifiers. Cross-reference with known values from other discoveries.

The Meta-Technique: Iterative Refinement

Binary analysis is never one search. It's a loop:

Search for something you know (a URL, an error message)
Find nearby unknowns (minified variable names)
Search for those unknowns to find their definitions
Those definitions reveal new knowns → go to step 1

Each iteration expands your understanding. The error message "OAuth authentication is currently not supported" → led to searching for "oauth" strings → found CH9 → found its value → found bH9 next to it → led to searching for VW8 → found the entire config → which contained TOKEN_URL and CLIENT_ID.