Qwen's former lead argues AI is moving from reasoning to agents
MarkTechPost reports on Junyang Lin's view that hybrid thinking is being overtaken by agentic systems trained around environments, tools, and rewards.
MarkTechPost reports on former Qwen lead Junyang Lin's argument that AI progress is shifting from training better stand-alone reasoning models toward training agents. Lin says agentic reinforcement learning needs infrastructure, environments, and reward systems rather than just hybrid thinking modes. The piece also flags reward hacking as a central unresolved risk as teams move more capability into long-running tool-using agents.
Key details: Junyang Lin previously led work on Qwen; his thesis is that the field is moving from model training toward agent training; the argument centers on environments, reward design, and reward-hacking risk.
Why it matters: It captures a live strategy shift in frontier AI: from bigger reasoning modes toward infrastructure for autonomous agents.