Yuan 3.0 Ultra: New AI Model for Smarter, Faster Tech

Analysis of expert load distribution shows two main phases during training:

Early Stage: Expert loads fluctuate widely because the experts start from random initialization.
Stable Stage: Expert loads settle down. The ranking of experts by how much data they process stays mostly the same.

Small Load Rule (α): This targets experts that handle far less work than the average expert.
Total Load Rule (β): This finds the experts that do the least work overall.
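
The two rules can be pictured with a small sketch. The article does not give exact formulas, so everything here is an assumption made for illustration: the function name `low_load_experts`, reading α as a fraction of the average load, reading β as a fraction of the total load, and requiring an expert to match both rules before it is flagged.

```python
from typing import List, Set


def low_load_experts(loads: List[float], alpha: float, beta: float) -> Set[int]:
    """Return indices of experts flagged by both hypothetical load rules."""
    total = sum(loads)
    mean = total / len(loads)

    # Small-load rule (assumed form): load is below alpha times the average.
    small = {i for i, load in enumerate(loads) if load < alpha * mean}

    # Total-load rule (assumed form): walk experts from least to most loaded
    # and keep those whose running sum stays within beta of the total load.
    least: Set[int] = set()
    running = 0.0
    for i in sorted(range(len(loads)), key=lambda k: loads[k]):
        running += loads[i]
        if running > beta * total:
            break
        least.add(i)

    # Assumption: an expert must satisfy both rules to be flagged.
    return small & least


# Made-up token counts for six experts, not data from the article.
loads = [120.0, 5.0, 98.0, 2.0, 110.0, 65.0]
flagged = low_load_experts(loads, alpha=0.2, beta=0.05)
print(sorted(flagged))  # [1, 3]
```

With these made-up loads, experts 1 and 3 fall below 20% of the average load and together carry less than 5% of the total, so both rules pick them out.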

Faster Hardware and Better Expert Setup

Method	TFLOPS per GPU
Base Model (1515B)	62.14
DeepSeek-V3 Aux Loss	80.82
Yuan3.0 Ultra (LAEP)	92.60

Model Pruning: Accounted for a 32.4% efficiency gain.
Expert Rearrangement: Accounted for a 15.9% efficiency gain.
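
As a quick check on the table, the relative throughput gain over the base model can be computed directly from the TFLOPS figures. This is plain arithmetic on the numbers above; the helper name `gain_over_base` is just for illustration.

```python
# TFLOPS per GPU, copied from the table above.
tflops = {
    "Base Model (1515B)": 62.14,
    "DeepSeek-V3 Aux Loss": 80.82,
    "Yuan3.0 Ultra (LAEP)": 92.60,
}

base = tflops["Base Model (1515B)"]


def gain_over_base(name: str) -> float:
    """Percentage throughput gain of a method relative to the base model."""
    return (tflops[name] / base - 1.0) * 100.0


for name in tflops:
    print(f"{name}: {gain_over_base(name):+.1f}%")
```

The LAEP row works out to roughly a 49% gain over the base model, which is close to the sum of the pruning and rearrangement figures (32.4% + 15.9% = 48.3%). The article does not state how the two contributions combine, so treat that comparison as approximate.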

Less Overthinking with New RIRM Method

r_min=0: Best for quick, direct answers.
r_max=3: The maximum number of reflection checks allowed.
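
To make the two bounds concrete, here is a minimal sketch of a reward that falls as the number of checks grows. The article does not give the RIRM formula, so the linear shape, the function name `reflection_reward`, and the clamping behaviour are all assumptions for illustration, not the published method.

```python
def reflection_reward(num_reflections: int, r_min: int = 0, r_max: int = 3) -> float:
    """Hypothetical reward in [0, 1] that falls as the reflection count grows."""
    if r_max <= r_min:
        raise ValueError("r_max must be greater than r_min")
    # Clamp the count into [r_min, r_max] so extra checks cannot go below 0.
    clamped = max(r_min, min(num_reflections, r_max))
    # Assumed linear decay: 1.0 at r_min, 0.0 at r_max.
    return (r_max - clamped) / (r_max - r_min)


for n in range(5):
    print(n, round(reflection_reward(n), 2))
```

Under this assumed shape, a direct answer with no checks gets the full reward, and anything at or beyond r_max gets none, which is one simple way to discourage overthinking.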

How Yuan 3.0 Ultra Does on Business Tests

Test	What it Tests	Yuan3.0 Ultra Score	Top Competitor Score
Docmatix	Multimodal RAG	67.4%	48.4% (GPT-5.2)
ChatRAG	Text Search (Avg)	68.2%	53.6% (Kimi K2.5)
MMTab	Table Questions	62.3%	66.2% (Kimi K2.5)
SummEval	Summaries	62.8%	49.9% (Claude Opus 4.6)
Spider 1.0	Text-to-SQL	83.9%	82.7% (Kimi K2.5)
BFCL V3	Using Tools	67.8%	78.8% (Gemini 3.1 Pro)

