Tencent Unveils Low-Cost AI Optimization Method

Tencent's Breakthrough in Cost-Efficient AI Optimization

Tencent AI Lab has developed Training-Free GRPO (Group Relative Policy Optimization), an approach to optimizing large language models without traditional parameter fine-tuning. The method substantially reduces computational cost while delivering performance improvements comparable to fine-tuning.

How Training-Free GRPO Works

The method converts experiential knowledge into token-level prior information, allowing a model to improve without any change to its parameters. It dynamically maintains an external experience knowledge base and supplies that knowledge to the model as context, so capabilities improve while the underlying model stays frozen.
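
The idea of an external, dynamically maintained knowledge base can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the paper's implementation: the `ExperienceBank` class, its `add`/`modify`/`delete` operations, the prompt wording, and the `echo_model` stand-in for a frozen language model are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ExperienceBank:
    """Hypothetical external store of natural-language lessons."""
    entries: Dict[int, str] = field(default_factory=dict)
    _next_id: int = 0

    def add(self, lesson: str) -> int:
        """Store a new lesson and return its id."""
        entry_id = self._next_id
        self.entries[entry_id] = lesson
        self._next_id += 1
        return entry_id

    def modify(self, entry_id: int, lesson: str) -> None:
        """Rewrite an existing lesson in place."""
        if entry_id not in self.entries:
            raise KeyError(entry_id)
        self.entries[entry_id] = lesson

    def delete(self, entry_id: int) -> None:
        """Drop a lesson that is no longer useful."""
        self.entries.pop(entry_id, None)

    def as_prior(self) -> str:
        """Render the lessons as text to prepend to a prompt."""
        if not self.entries:
            return ""
        lines = [f"- {text}" for _, text in sorted(self.entries.items())]
        return "Lessons from earlier attempts:\n" + "\n".join(lines) + "\n\n"


def answer(model: Callable[[str], str], bank: ExperienceBank, question: str) -> str:
    """Query the frozen model with the experience bank supplied as context."""
    return model(bank.as_prior() + "Question: " + question)


def echo_model(prompt: str) -> str:
    """Stand-in for a frozen LLM: returns the prompt it was given."""
    return prompt


bank = ExperienceBank()
first = bank.add("Check boundary cases before finalizing an answer.")
bank.add("Verify each search result against the question.")
bank.modify(first, "Check boundary cases and units before finalizing an answer.")
prompt = answer(echo_model, bank, "What is 2 + 2?")
print(prompt)
```

Only the contents of the bank change between calls; the model callable is never updated, which is the property that removes the need for gradient computation in this sketch.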

Performance Improvements

Tests on DeepSeek-V3.1-Terminus showed notable gains:

  • Mathematical reasoning: Accuracy increased from 80% to 82.7% on AIME24 and from 67.9% to 73.3% on AIME25
  • Web search tasks: Pass@1 metric improved from 63.2% to 67.8%

The method achieved these results using just 100 cross-domain training samples, whereas traditional approaches typically require thousands.

Cost Comparison

The reported costs differ by a factor of roughly 580:

  • Traditional fine-tuning: ~70,000 RMB
  • Training-Free GRPO: ~120 RMB

The savings come primarily from avoiding computationally intensive operations like gradient backpropagation and parameter updates.
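
As a quick check of the arithmetic behind these figures, the short Python sketch below computes the cost ratio from the two approximate numbers quoted above. The variable names are illustrative, and the inputs are the article's rounded estimates rather than exact accounting.

```python
# Approximate costs quoted in the article, in RMB.
traditional_rmb = 70_000
training_free_rmb = 120

# Share of the traditional cost that Training-Free GRPO requires.
cost_share = training_free_rmb / traditional_rmb

# How many times cheaper the training-free approach is.
reduction_factor = traditional_rmb / training_free_rmb

print(f"Cost share: {cost_share:.2%}")
print(f"Reduction factor: {reduction_factor:.0f}x")
```

The share comes out at about 0.17%, consistent with a figure of under 0.2% of the traditional cost.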

Implications for AI Development

This result could broaden access to advanced AI optimization:

  • Enables smaller organizations with limited resources to enhance model performance
  • Maintains model generalization across domains
  • Opens new possibilities for efficient continuous learning systems

The research team acknowledges that further testing is needed across broader task categories beyond mathematical reasoning and information retrieval.

Paper Reference: Training-Free GRPO on arXiv

Key Points:

  • Achieves results similar to traditional fine-tuning at under 0.2% of the cost
  • Works by updating external knowledge bases rather than model parameters
  • Demonstrated effectiveness in mathematical and search tasks
  • Particularly valuable for resource-constrained organizations
