MIT TECHNOLOGY REVIEW
DeepSeek V4: China’s New AI Powerhouse Explained
DeepSeek’s new open-source V4 model challenges U.S. AI dominance by offering high performance at lower costs, fueling China’s drive for tech self-reliance.
HOST
From DailyListen, I'm Alex. DeepSeek just dropped V4, their latest AI model, and it's got the tech world buzzing. A Chinese startup claims this 1-trillion-parameter beast matches top U.S. models on coding benchmarks while running way cheaper. Stocks dipped, VCs are calling it a wake-up call like Sputnik in 1957. Why does this matter now, especially with U.S.-China tensions? We're joined by Priya, our technology analyst, who tracks how these models shift real-world power in AI.
PRIYA
What this unlocks is handling million-token prompts that dwarf what most models manage. DeepSeek V4 packs a 1M-token context window—think entire codebases or hour-long video transcripts in one go. Paired with Engram conditional memory, it remembers key patterns across those lengths without bloating compute. Internal benchmarks reported by Reuters show it beating the Claude and GPT series on extremely long code prompts. Pre-release claims hit 80-85% on SWE-bench, up from V3's levels. For developers, that means debugging massive repos without chopping them up. But here's the catch: these are company-reported numbers. Independent tests on Arena.ai put V4 Pro third among open-source models, 14th overall in the code arena—strong, but not unchallenged.
HOST
That million-token context sounds huge. A million words is like five thick novels. How does Engram make that practical without exploding costs?
PRIYA
Engram offloads static retrieval to cheaper DRAM, so it doesn't hammer the GPU every time. V4's MoE architecture activates just 32B parameters per token out of 1T total—most of the model sleeps. Add Huawei Ascend chips, which cost less per inference hour than Nvidia A100 or H100 clusters. Result? V4 runs inference dirt cheap compared to U.S. rivals. A V4 Lite at 200B parameters keeps the 1M context but scales down for broader use. Sitepoint notes a 10-fold leap over V3.2 on some benchmarks—V3.2 scored just 5 points, no typo. Developers get Claude-level coding help without the premium price tag.
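The sparse activation Priya describes, where only a sliver of a huge model runs per token, can be sketched as top-k expert routing. This is a minimal illustrative version: the 64 experts, top-8 routing, and 16-dim vectors are toy values, not DeepSeek's actual configuration.

```python
import numpy as np

def moe_forward(x, experts, gate_w, top_k=8):
    """Route one token through a sparse Mixture-of-Experts layer.

    Only the top_k highest-scoring experts run, so compute per token
    scales with top_k, not with the total expert count.
    """
    scores = x @ gate_w                      # router logits, one per expert
    top = np.argsort(scores)[-top_k:]        # indices of the chosen experts
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                 # softmax over the chosen experts
    # Weighted sum of the selected experts' outputs; the rest stay idle.
    return sum(w * experts[i](x) for i, w in zip(top, weights))

# Toy setup: 64 experts, but each token touches only 8 of them.
rng = np.random.default_rng(0)
dim, num_experts = 16, 64
mats = [rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(num_experts)]
experts = [lambda x, W=W: x @ W for W in mats]
gate_w = rng.standard_normal((dim, num_experts))

out = moe_forward(rng.standard_normal(dim), experts, gate_w)
print(out.shape)  # (16,)
```

The same principle, scaled up, is how a 1T-parameter model can activate only 32B parameters per token.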
HOST
Huawei chips instead of Nvidia—that's bold, given U.S. sanctions. Does this prove China can ditch Nvidia dependence?
PRIYA
V4's the first DeepSeek model tuned for Huawei Ascend, testing if homegrown silicon closes the gap. DeepSeek skipped giving Nvidia or AMD early access, per The Information—unusual, since chipmakers usually optimize ahead. But Ascend's lower inference costs give V4 a pricing edge. Morphllm.com details three architecture tweaks over V3: bigger MoE sparsity, Engram memory, native multimodal for text, images, videos. It topped Arena.ai's Vibe Code Benchmark as the number one open-weights model, beating Kimi K2.6 and even Gemini 3.1 Pro. Still, no word on training compute or dataset details—those gaps leave questions about how they hit these scores without Nvidia-scale resources.
HOST
Open-source weights make it accessible, but Taiwan banned it in government ops over security risks, and the U.S. Navy warned against use. Does V4's power amplify those concerns?
PRIYA
Security flags are real. Taiwan cited data exposure risks to China; the Navy pointed to ethical and security issues. DeepSeek runs under government censorship—queries on sensitive topics get blocked, unlike U.S. models shaped by corporate policies. V4's open-source release floods GitHub with weights anyone can fine-tune, raising misuse fears for code gen or worse. Yet the open model still competes with closed ones: second in Vals AI's comprehensive index, just 0.07% behind the leader. Marc Andreessen called DeepSeek R1 "AI's Sputnik moment" on X, warning that U.S. over-regulation hands China the lead. V4 fuels that debate—China's self-reliance push scores a win, but trust barriers persist for Western users.
HOST
Andreessen advises Trump on tech policy. His Sputnik analogy nods to 1957's space race kickoff. Is V4 shifting AI geopolitics like that?
PRIYA
Last week, the U.S. President met Nvidia's Jensen Huang at the White House over China's AI rise—DeepSeek's part of that worry. V4 matches Anthropic's Claude-Opus-4.6 and tops Alibaba's Qwen-3.5 on coding, math, STEM per company claims. As a 2026 launch from a 2023-founded firm, it challenges Big Tech exclusivity. Founder Liang Wenfeng's hedge fund background helped bootstrap without billions. But no validated real-world tests yet—Reddit's r/singularity post hypes benchmarks, yet gaps in MMLU or MATH comparisons mean we wait for independents. Geopolitically, it loosens Nvidia reliance, but U.S. sanctions forced this path.
HOST
V4 claims 80-85% SWE-bench and outperforms on long code. What's the everyday impact for, say, a software engineer?
PRIYA
Engineers feed V4 a full 1M-token repo—tens of thousands of lines—and get fixes that span files. Native multimodal handles code plus screenshots or video walkthroughs of bugs. Arena.ai calls it a "significant leap" from V3.2; it "overwhelmingly" led open-source in Vibe Code. Against GPT-5.4 and Gemini-3.1, internal tests show wins on long prompts. Pricing stays low via MoE's 32B active params and Ascend efficiency. A busy dev skips manual slicing and cuts hours off refactors. Drawback: censorship blocks some prompts, and no post-release user data confirms daily wins over Claude.
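A quick way to sanity-check that "feed the whole repo" workflow is to estimate the source tree's token count before sending it. This sketch uses the rough 4-characters-per-token heuristic, which is an approximation, not any tokenizer DeepSeek actually ships; the file extensions are illustrative too.

```python
from pathlib import Path

def estimate_repo_tokens(root, exts=(".py", ".js", ".ts"), chars_per_token=4):
    """Rough token estimate for a source tree (~4 chars per token).

    Helps decide whether a repo plausibly fits a 1M-token context
    window in one prompt, instead of chopping it into chunks.
    """
    total_chars = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in exts:
            total_chars += len(path.read_text(errors="ignore"))
    return total_chars // chars_per_token

# If estimate_repo_tokens("my_repo") stays under 1_000_000, the whole
# tree can plausibly go into a single long-context prompt.
```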
HOST
Multimodal input for video too—that's new. But earlier Janus Pro for visuals wasn't as big a splash as chatbots. Does V4 change that?
PRIYA
V4 builds on Janus Pro's visual understanding with native text-image-video fusion in its 1M context. Process a 30-minute tutorial video, extract code snippets, rewrite in Python—all in one pass. That's key for fields like robotics or AR dev, where video data explodes. But V4 inherits DeepSeek's limits: government filters on politics, unlike U.S. models' corporate biases. Benchmarks shine—90% HumanEval claimed—but no independent MMLU or GPQA scores yet. For pros, it means cheaper long-context multimodal without OpenAI bills. Risks? Navy-style warnings mean enterprises hesitate.
HOST
No training details in reports—no compute FLOP, dataset size, or cost. How'd a sanctioned startup build a 1T-param model?
PRIYA
Gaps abound there. DeepSeek dodged Nvidia bans via efficiency tricks like MoE and multi-token prediction from R1 days—predicts multiple tokens at once, no feedback loop. Lee noted most models predict one word; DeepSeek trains for chains. V3 proved small teams beat GPU hordes. V4 likely scaled that on Ascend clusters, but without numbers, it's guesswork. They boasted parity with OpenAI pre-release, sparking Monday's stock frenzy. Success? Arena.ai ranks confirm coding strength. But unverified training leaves skeptics wondering if it's sustainable at this scale.
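The multi-token prediction Priya mentions can be sketched as a training loss: instead of one head predicting only the next token, several heads each predict a different future offset, so every step supervises a short chain. This toy NumPy version (the head count, vocab size, and token ids are made up) just computes the combined cross-entropy.

```python
import numpy as np

def multi_token_loss(logits, targets):
    """Cross-entropy averaged over several future positions at once.

    logits:  (num_heads, vocab) -- one prediction head per future offset
             (t+1, t+2, ...), scored in a single forward pass.
    targets: (num_heads,)       -- the true token id at each offset.
    """
    # Numerically stable softmax per head.
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    # Negative log-likelihood of each head's target token.
    nll = -np.log(probs[np.arange(len(targets)), targets])
    return nll.mean()

# Toy example: predict 3 future tokens from a vocabulary of 10.
rng = np.random.default_rng(1)
logits = rng.standard_normal((3, 10))
loss = multi_token_loss(logits, np.array([4, 1, 7]))
```

The efficiency claim is that one forward pass yields gradient signal for several positions, rather than one token per pass.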
HOST
V4 Lite at 200B params keeps the million-token window. Who grabs that first?
PRIYA
Startups and indie devs—run it locally or on cheap cloud without enterprise budgets. Same Engram memory, MoE sparsity, but a lighter footprint. Morphllm.com's guide hints at API starts, though illustrative. It ranked third open-source in the Arena code arena, second in the Vals index. Compared with V3.2's 5-point flop, V4's leap shows iteration pays. China angle: it pushes self-reliance, but Western bans like Taiwan's limit its spread. No licensing fine print yet, so adoption hinges on security audits.
HOST
Frenzy upended markets Monday after last month's hype. U.S. tech fell sharply in January 2025 after R1. Is V4 repeating that?
PRIYA
Yes—R1 topped GPT-4o benchmarks and crashed stocks on January 27. V4's open-source Vibe Code dominance reignites it. Andreessen's warning: U.S. rules cede ground. But no controversies beyond the bans—no major flaws reported. DeepSeek's mobile app surged in January 2025; V4 could explode that. Limits? Sitepoint.com hedged some sections as "projected," but the March 2026 launch did happen. For listeners, it means affordable AI coding tools now compete globally—watch enterprise pilots.
HOST
Three reasons V4 matters: long-context power for real codebases, cost edge via MoE and Ascend, open-source rivalry to closed giants. But bans and gaps temper the hype.
PRIYA
Spot on. First, 1M context plus Engram nails long prompts where Claude falters. Second, 32B active params on cheap Huawei chips slash bills—key for scale. Third, topping open-source charts pressures OpenAI, Anthropic to match efficiency. China's play reduces Nvidia lock-in, but security risks slow Western uptake. No training disclosures mean we track independents. DeepSeek, from 2023 upstart to benchmark king, forces everyone to adapt.
HOST
I'm Alex. V4 spotlights AI's new battleground—tech, cost, geopolitics. Dig into Arena.ai or morphllm.com yourself. Thanks for listening to DailyListen.
Sources
1. DeepSeek V4: Architecture, Benchmarks, and API Guide (2026)
2. What is DeepSeek? Here's a quick guide to the Chinese AI company | PBS News
3. DeepSeek V4 Benchmarks! : r/singularity - Reddit
4. Deepseek V4 First Wave Reviews: Huge Success or Big Flop ... - 36氪
5. DeepSeek V4 Released: What's New in the Latest Model ...
6. DeepSeek | Rise, Technologies, Impact, & Global Response
7. Deepseek introduces new technologies to the AI world - The Daily Cardinal
8. 12 Major Developments Since DeepSeek-R1's Release Last Month
9. DeepSeek AI: Company Overview, Founding team, Culture and ...
10. Three reasons why DeepSeek's new model matters
11. Three reasons why DeepSeek's new model V4 matters
Original Article
Three reasons why DeepSeek’s new model V4 matters
MIT Technology Review · April 24, 2026