Two vLLM Flaws: API-Key Bypass (CVE-2026-48746, CVSS 9.1) & Dependency Confusion (CVE-2026-54232)


Makoto Horikawa

Backend Engineer / AWS / Django

2026.06.23 · 7 min read


Key takeaways

vLLM, the go-to engine for self-hosting LLMs, has two serious flaws. CVE-2026-48746 (CVSS 9.1) lets attackers bypass the API key and use the AI API without authentication; CVE-2026-54232 (CVSS 8.8) is a dependency confusion in the Docker build that runs attacker code as root on the build machine. Updating to 0.22.1 or later resolves both.

vLLM, the go-to software for serving large language models (LLMs) at high speed, has two high-severity flaws. The first, CVE-2026-48746 (CVSS 9.1), lets anyone bypass the configured API key and use the AI API. The second, CVE-2026-54232 (CVSS 8.8), allows a third party's code to slip in during a Docker build. Both were published on June 22, 2026.

vLLM lets you run an LLM on your own servers and expose an "OpenAI-compatible API," and it is widely used across AI development and operations. The fixes landed in 0.22.0 (auth bypass) and 0.22.1 (dependency confusion), so updating to 0.22.1 or later resolves both.

Software	vLLM (LLM inference / API server)
Flaw 1	CVE-2026-48746 · API-key bypass (auth bypass) · CVSS 9.1 · no authentication required
Flaw 2	CVE-2026-54232 · dependency confusion leading to root-level code execution at build time · CVSS 8.8
Affected	Flaw 1: 0.3.0 to <0.22.0 · Flaw 2: <0.22.1
Fixed in	0.22.1 and later (resolves both)
Published	June 22, 2026

Who is at risk, and what is the damage

The one with the broadest reach is the API-key bypass (CVE-2026-48746). The target is operators who expose a vLLM API server and rely on "just the API key" for protection. vLLM has a setting (--api-key or VLLM_API_KEY) meant to ensure "only holders of this key can use the API," and many teams assume that is enough.

But by exploiting this flaw, an attacker who does not know the API key can slip past the authentication check and use the vLLM API freely. No login and no user action are needed. A door you thought was locked turns out to have a way around it.

The damage takes three main forms. First, since running an LLM needs expensive GPUs, free-riders can flood it with requests, wasting compute, driving up costs, and slowing or taking down the service. Second, the models hosted on that vLLM instance, the embedded instructions (system prompts), and internal data can be accessed without authorization. Third, an LLM stood up for internal use can be quietly used by outsiders, who may extract confidential information through it or use it to generate inappropriate content. That is why the update and exposure review below are urgent.

What vLLM actually is

vLLM is an inference and serving engine that runs LLMs fast and memory-efficiently. Born in a lab at UC Berkeley, it is now a staple of AI infrastructure with about 75,000 GitHub stars and over 2,000 contributors. Its signature feature is an "OpenAI-compatible API," letting you point code written for OpenAI's API at a vLLM you run yourself.

vLLM is a convenient base for companies and researchers who want to run LLMs on their own servers, but as an API server it also opens a door to the outside, so the impact is large when that entrance's defense is breached. Vulnerabilities in vLLM are found and fixed on an ongoing basis (for example CVE-2026-22778, a remote code execution via video processing), so users need a routine for tracking updates.

What is happening, technically

CVE-2026-48746: an auth bypass that skips the API key (CVSS 9.1)

It is classified as CWE-444 (HTTP request/response smuggling) and was found in a source-code audit by security firm X41 D-Sec. vLLM's authentication middleware decides from the URL path whether a request targets a protected endpoint (the /v1 API). That path, however, is reconstructed on the assumption that the front ASGI web server and Starlette interpret the request the same way. An attacker can exploit a mismatch in that reconstruction so that a request appears to fall outside the authenticated scope and still reaches the /v1 API without the configured API key. The attack works over the network and requires no authentication and no user interaction. Affected versions are 0.3.0 up to but not including 0.22.0; the fix is in 0.22.0.

CVE-2026-54232: dependency confusion in the Docker build → root-level code execution (CVSS 8.8)

This is CWE-427 (Uncontrolled Search Path Element), a so-called dependency confusion. vLLM's Dockerfile pulls a package called flashinfer-jit-cache from a custom index, but that package name was not registered on PyPI (the standard Python package repository), and a loose global setting (UV_INDEX_STRATEGY="unsafe-best-match") was in effect. So if an attacker registers a package of the same name on PyPI, it can be pulled in during the Docker image build, allowing arbitrary code execution as root on the build machine. Exploitation requires someone to build the image (user interaction required, UI:R in the CVSS vector), but in environments that auto-build via CI/CD this is a realistic threat. The fix is in 0.22.1.
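
Why the index strategy matters can be seen in a simplified model of package resolution. The sketch below is not uv's resolver; it only mimics the documented difference between the first-index and unsafe-best-match strategies. The index names and version numbers are hypothetical, and only the package name comes from the advisory.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(name, indexes, strategy):
    """Pick (index_name, version) for `name` from an ordered list of indexes.

    indexes: list of (index_name, {package_name: [versions]}) in priority order.
    """
    if strategy not in ("first-index", "unsafe-best-match"):
        raise ValueError(f"unknown strategy: {strategy}")
    candidates = []
    for index_name, packages in indexes:
        versions = packages.get(name, [])
        if not versions:
            continue
        if strategy == "first-index":
            # Stop at the first index that knows the package at all.
            return index_name, max(versions, key=parse)
        candidates.extend((index_name, v) for v in versions)
    if not candidates:
        raise LookupError(f"{name} not found on any index")
    # unsafe-best-match: highest version wins, whichever index serves it.
    return max(candidates, key=lambda c: parse(c[1]))


# Hypothetical state: the custom index has the real package, and an
# attacker has registered the same name on PyPI with a huge version.
INDEXES = [
    ("custom", {"flashinfer-jit-cache": ["0.2.0"]}),
    ("pypi", {"flashinfer-jit-cache": ["99.0.0"]}),
]
```

With these inputs, resolve("flashinfer-jit-cache", INDEXES, "first-index") stays on the custom index, while "unsafe-best-match" selects the attacker's 99.0.0 from PyPI. That is the core of a dependency confusion: a name that is unclaimed on the public index plus a strategy that lets any index win.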

Confirmed vs. still unknown

✓ Confirmed facts

✓ CVE-2026-48746 bypasses API-key protection without authentication (CWE-444, CVSS 9.1, affects 0.3.0 to <0.22.0, fixed in 0.22.0) (NVD)
✓ CVE-2026-54232 is a Dockerfile dependency confusion leading to root RCE at build time (CWE-427, CVSS 8.8, fixed in 0.22.1) (NVD)
✓ Updating to 0.22.1 or later resolves both

? Not yet confirmed

? Whether either flaw has been exploited in the wild: neither is on CISA KEV at the time of writing
? Severity ratings differ: NVD scores CVE-2026-48746 at CVSS 9.1, while the project's GHSA initially rated it Moderate. Either way, an auth bypass warrants action

What to do now

The top priority is to update vLLM to 0.22.1 or later, which resolves both the auth bypass and the dependency confusion. If you run anything earlier than 0.22.1, treat it as affected, whether it is a test or a production environment.

As a stopgap for the API-key bypass, do not expose the vLLM API server directly to the internet. Keep it behind your internal network or a VPN, and put real authentication in front of it (a reverse proxy or API gateway). The flaw is a reason to drop the assumption that "setting an API key makes it safe to expose."

The dependency confusion matters if you build your own Docker image. In that case, pin and explicitly specify the source index, enable package hash verification, and claim your own build namespace so that nobody else can register the package names your build relies on. Keeping a way to continuously check your OSS dependencies helps you spot supply-chain problems like this quickly.
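
Package hash verification comes down to comparing a downloaded artifact's digest against a value pinned in advance. The sketch below shows only that core idea with Python's standard library; the function names and sample bytes are hypothetical. In a real build you would more likely rely on the installer's own hash-checking mode (for example pip's --require-hashes) than on custom code.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_hex: str) -> bool:
    """True only if the artifact matches the pinned digest."""
    # compare_digest does a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())


# Hypothetical example: the digest is recorded when the dependency is vetted.
trusted_bytes = b"contents of the vetted wheel"
pinned_digest = sha256_hex(trusted_bytes)
```

Here verify_artifact(trusted_bytes, pinned_digest) succeeds, while a same-named package with different contents fails the check, so a look-alike package from another index would be rejected before it is installed.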

Summary

Of vLLM's two disclosed flaws, CVE-2026-48746 is a CVSS 9.1 auth bypass that lets attackers use the AI API without authentication by skipping the configured API key. CVE-2026-54232 is a dependency confusion (CVSS 8.8) where a third party's package slips into the Docker build and runs as root on the build machine. Both are resolved by updating to 0.22.1.

As running your own LLM becomes routine, both the entrance (the API server) and the build process have turned out to be attack surfaces. Beyond updating, use this moment to review how exposed your API is and the provenance of the packages your build pulls in.