DeepSeek-V4 Preview Released: 1M Context, Two Models, and API Migration Notes

Based on DeepSeek's official news page published on April 24, 2026, this article summarizes the key points of DeepSeek-V4 Preview, including V4-Pro, V4-Flash, 1M context, agent-focused optimizations, API model changes, and the retirement notice for older models.

DeepSeek released the DeepSeek-V4 Preview on April 24, 2026. Based on the official announcement page, the update centers on a few clear themes: 1M context, a two-model lineup with V4-Pro and V4-Flash, dedicated optimization for agent scenarios, and API-side model migration.

If we reduce the release to one sentence, the main signal is this: DeepSeek is not just trying to make a stronger model. It is pushing ultra-long context and agent capabilities toward something that is ready for practical deployment.

1. What was released this time

According to the official page, DeepSeek-V4 Preview mainly includes two product lines:

  • DeepSeek-V4-Pro
  • DeepSeek-V4-Flash

The official descriptions are also very direct:

  • DeepSeek-V4-Pro: 1.6T total / 49B active params
  • DeepSeek-V4-Flash: 284B total / 13B active params

The naming already makes the strategy clear. This is not a single-model upgrade. DeepSeek is launching a higher-end model and a more cost-efficient model at the same time.

V4-Pro is positioned around the performance ceiling, with DeepSeek saying it can compete with the world’s top closed-source models. V4-Flash, by contrast, is positioned around speed, efficiency, and lower cost, making it more suitable for workloads that care more about latency and API pricing.

2. 1M context is the most visible headline

One of the most prominent lines on the official page is: “Welcome to the era of cost-effective 1M context length.”

DeepSeek is not merely saying the model supports long context. It is presenting 1M context as a default capability of this generation. The page is explicit that:

  • 1M context is now the default standard across official DeepSeek services
  • Both V4-Pro and V4-Flash support 1M context

The importance of this is not just that you can fit more tokens. It directly affects tasks like:

  • Understanding large codebases
  • Long-document Q&A and information synthesis
  • Multi-turn agent workflows
  • Complex tasks spanning multiple files, tools, and stages

When the context window is large enough, the model is less likely to lose track of earlier material midway through a task, and workflows no longer need to re-feed the same content repeatedly. That matters a lot for agentic coding and complex knowledge work.
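To make that concrete, here is a minimal sketch of packing an entire small codebase into a single request. It uses the OpenAI-compatible endpoint described in section 7; the base_url shown and the 4-characters-per-token budgeting heuristic are illustrative assumptions, not figures from the announcement.

```python
# Sketch: pack a small repo into one request under the 1M-token window.
# base_url and the 4-chars-per-token heuristic are illustrative
# assumptions, not figures from the announcement.
from pathlib import Path
from openai import OpenAI

MAX_CONTEXT = 1_000_000      # advertised window
CHARS_PER_TOKEN = 4          # rough budgeting heuristic

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com",
)

def pack_repo(root: str, budget: int = MAX_CONTEXT // 2) -> str:
    """Concatenate source files until a conservative token budget is hit."""
    parts, used = [], 0
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(errors="ignore")
        cost = len(text) // CHARS_PER_TOKEN
        if used + cost > budget:
            break
        parts.append(f"# FILE: {path}\n{text}")
        used += cost
    return "\n\n".join(parts)

resp = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{
        "role": "user",
        "content": pack_repo("./my_project")
                   + "\n\nSummarize this codebase's architecture.",
    }],
)
print(resp.choices[0].message.content)
```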

3. What V4-Pro is mainly emphasizing

From the wording on the official page, DeepSeek-V4-Pro focuses on three things:

  • Agentic coding capability
  • World knowledge
  • Reasoning ability

The page says V4-Pro reaches open-source SOTA on agentic coding benchmarks. It also claims V4-Pro leads current open models in world knowledge, trailing only Gemini-3.1-Pro overall, and states that its math, STEM, and coding performance surpasses current open models while rivaling top closed-source models.

In other words, V4-Pro is not positioned as a simple question-answering model. It is aimed much more at high-difficulty reasoning, complex coding, and long-horizon task execution.

4. V4-Flash is not just a cut-down version

Another notable point is that DeepSeek does not present V4-Flash as a low-end model. Instead, it stresses that the model is already strong enough for many practical tasks.

According to the announcement, V4-Flash:

  • Has reasoning ability that comes close to V4-Pro
  • Performs on par with V4-Pro on simple agent tasks
  • Uses fewer parameters, responds faster, and is more economical for API usage

That means the lineup is not a sharply split “one flagship, one entry-level” structure. It is closer to:

  • V4-Pro: optimize for higher performance and a stronger ceiling
  • V4-Flash: optimize for lower latency and better cost efficiency

For developers, that is often the more practical combination, because many production tasks do not need the theoretically strongest model. They need one that is strong enough, fast enough, and affordable enough.

5. The release puts clear emphasis on agent optimization

Another strong signal from the announcement page is that DeepSeek is actively pushing V4 toward agent use cases.

The page says DeepSeek-V4 has been seamlessly integrated with several leading AI agents, including:

  • Claude Code
  • OpenClaw
  • OpenCode

DeepSeek also says that V4 is already being used in its in-house agentic coding workflows.

That means the target is no longer limited to chat or ordinary completion. The model is being positioned for longer workflows: reading code, understanding structure, calling tools, generating outputs, and connecting the whole process together.

If you have been paying attention to coding agents recently, this is worth noticing. Model providers are no longer only competing on benchmarks. They are also competing on whether the model can actually plug into real workflows.

6. Structural innovation is serving long context efficiency

On the technical side, the page summarizes this release’s structural work as:

  • token-wise compression
  • DSA (DeepSeek Sparse Attention)

The direction is clear: make long context cheaper and more efficient while reducing compute and memory cost as much as possible.

The announcement page does not go into full technical detail, but it at least suggests that DeepSeek is not relying only on brute-force scaling to support longer windows. It is also making architecture-level optimizations specifically for long-context efficiency.
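The page does not describe how DSA works internally, so the following is only a toy illustration of the general sparse-attention idea, not DeepSeek's implementation: each query attends to just its top-k highest-scoring keys, so attention cost scales with k instead of the full sequence length.

```python
# Toy top-k sparse attention in NumPy -- an illustration of the general
# idea only, NOT DeepSeek's actual DSA implementation.
import numpy as np

def sparse_attention(Q, K, V, k=64):
    """Each query attends only to its top-k highest-scoring keys."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                     # (n_q, n_k)
    # NOTE: this toy still computes every score; a real system would
    # select candidates first and never materialize the full matrix.
    topk = np.argpartition(scores, -k, axis=-1)[:, -k:]
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, topk, 0.0, axis=-1)       # 0 keeps, -inf drops
    weights = np.exp(scores + mask - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                # (n_q, d_v)

rng = np.random.default_rng(0)
n, d = 4096, 64
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(sparse_attention(Q, K, V, k=64).shape)          # (4096, 64)
```

A production implementation would avoid computing the full score matrix in the first place; the toy above only illustrates the selection step that makes long windows cheaper.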

For actual users, that often matters more than just seeing a bigger context number, because real usability depends on more than whether 1M is technically available. It also depends on:

  • Whether speed stays acceptable
  • Whether cost stays acceptable
  • Whether long-context tasks remain stable in practice

7. The API is already available, but model migration matters

The official page clearly states that the API is available today.

The migration path is also relatively simple:

  • Keep the same base_url
  • Switch the model name to deepseek-v4-pro or deepseek-v4-flash

The page also says both models support:

  • 1M context
  • Dual Thinking / Non-Thinking modes
  • OpenAI ChatCompletions
  • Anthropic APIs

That means if you already use the DeepSeek API, the upgrade path is not especially difficult. The main work is updating model names and validating behavior.
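A minimal migration sketch, assuming the OpenAI SDK and DeepSeek's current public base_url (the announcement itself only says to keep base_url unchanged and swap the model name; how the Thinking / Non-Thinking toggle is selected is not spelled out, so it is omitted here):

```python
# Migration sketch: keep the same base_url, swap only the model name.
# The API key placeholder is illustrative.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com",   # unchanged from before
)

resp = client.chat.completions.create(
    # before: model="deepseek-chat" or model="deepseek-reasoner"
    model="deepseek-v4-flash",             # or "deepseek-v4-pro"
    messages=[{"role": "user", "content": "Hello, V4!"}],
)
print(resp.choices[0].message.content)
```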

8. The retirement schedule for old models is explicit

For developers, one of the most important details on the page is actually the retirement notice for older models.

DeepSeek explicitly says:

  • deepseek-chat
  • deepseek-reasoner

will be fully retired and inaccessible after July 24, 2026, 15:59 UTC.

The page also notes that these two models are currently being routed to the non-thinking and thinking modes of deepseek-v4-flash.

That means if your project still directly references deepseek-chat or deepseek-reasoner, now is the time to plan the migration instead of waiting until the formal shutdown date gets close.
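For codebases that reference the old names in many places, a small compatibility shim can reduce the cutover to one change per call site. The helper below is hypothetical, not something DeepSeek ships; it simply mirrors the routing described above by pointing both retired names at deepseek-v4-flash.

```python
# Hypothetical compatibility shim -- not an official DeepSeek helper.
# Mirrors the routing described above: both retired names now resolve
# to deepseek-v4-flash (non-thinking and thinking modes respectively).
RETIRED_MODELS = {
    "deepseek-chat": "deepseek-v4-flash",      # -> non-thinking mode
    "deepseek-reasoner": "deepseek-v4-flash",  # -> thinking mode
}

def resolve_model(name: str) -> str:
    """Map a retired model name to its V4 replacement, if one exists."""
    replacement = RETIRED_MODELS.get(name)
    if replacement is not None:
        print(f"warning: {name} is retired after 2026-07-24; "
              f"using {replacement} instead")
        return replacement
    return name

# resolve_model("deepseek-chat")  ->  "deepseek-v4-flash"
```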

9. How to read this release

If we compress the update into a few main takeaways, they look like this:

  • DeepSeek is turning 1M context from a premium feature into a default standard
  • The two-model strategy is clearer: one targets performance ceiling, one targets speed and cost efficiency
  • Agent capability has been moved into a very central role
  • The API upgrade path is relatively direct, but the old-model retirement timeline needs attention soon

For general users, the most visible change may be that long documents, long code contexts, and long workflows become easier to fit into one session.
For developers, the more important point is that if you are already building agents, coding assistants, knowledge workflows, or complex automation pipelines, this generation is very clearly designed for those scenarios.

This is not just a routine model update from DeepSeek. It reads more like a clearer statement of its next product direction: ultra-long context, agent optimization, and more practical API readiness.
