Thursday, March 13, 2025

Mixture of experts: The method behind DeepSeek’s frugal success




China’s DeepSeek has pulled off an AI miracle—building a top-tier artificial intelligence model while spending far less than its American rivals. At a time when AI giants are burning billions on GPUs and power-hungry data centers, this start-up has figured out a way to do more with less.
The secret? A mix of smart engineering, a clever neural network design, and some good old-fashioned mathematical efficiency.
Big AI, Small Budget
Most AI firms stack their data centers with thousands of GPUs. Meta’s latest AI model reportedly ran on 16,000 specialized chips, each costing around $40,000. DeepSeek? It says it needed only about 2,000. Its total compute cost? Roughly $6 million by its own account, about a tenth of what Meta is rumored to have spent.
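For a sense of scale, the figures quoted above can be put through some quick arithmetic. This is only a back-of-the-envelope sketch in Python: the variable names are made up for illustration, and the numbers are the reported ones, not audited costs.

```python
# Figures as reported in the article; treat them as rough estimates.
meta_chips = 16_000
deepseek_chips = 2_000
price_per_chip_usd = 40_000  # reported price of the chips Meta used

# How many times more chips Meta reportedly used.
chip_ratio = meta_chips / deepseek_chips

# What 16,000 chips would cost at the reported price per chip.
meta_chip_outlay_usd = meta_chips * price_per_chip_usd

print(f"Meta used {chip_ratio:.0f}x as many chips")
print(f"Meta's chips alone: ${meta_chip_outlay_usd:,}")
```

Note that the $6 million figure is a compute cost, so it is not directly comparable to a chip purchase total.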
The ‘Mixture of Experts’ Trick
The key to DeepSeek’s frugal success? A method called “mixture of experts.” Traditional AI models try to learn everything in one giant neural network, and the whole network fires for every request. That’s like stuffing all knowledge into a single brain: inefficient and power-hungry.
DeepSeek, instead, split the system into many smaller, specialized networks. Loosely speaking, picture one for poetry, one for coding, another for biology, and so on. Each “expert” focused on its domain, while a “generalist” network acted as a bridge, coordinating them. Because only a few experts are switched on for any given input, most of the system sits idle at each step, and that is where the savings come from.
Think of it like a newsroom: specialist reporters cover specific beats, while an editor connects the dots.
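To make the idea concrete, here is a minimal Python sketch of a mixture-of-experts layer. It is a toy under stated assumptions, not DeepSeek’s actual code: the class name `TinyMoE`, the sizes, the random weights and the choice of two active experts out of eight are all invented for illustration. A small “router” scores the experts, only the top-scoring few do any work, and one always-on network stands in for the “generalist.”

```python
import math
import random


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyMoE:
    """Toy layer: one shared 'generalist' plus top-k routed 'experts'.

    Every expert here is just a random square matrix (an assumption
    made to keep the example short).
    """

    def __init__(self, dim, num_experts, top_k, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

        self.experts = [rand_matrix(dim, dim) for _ in range(num_experts)]
        self.shared = rand_matrix(dim, dim)           # always-on generalist
        self.router = rand_matrix(num_experts, dim)   # one score row per expert
        self.top_k = top_k

    def forward(self, x):
        # 1. The router gives every expert a score for this input.
        scores = [dot(row, x) for row in self.router]
        # 2. Only the top_k highest-scoring experts are selected.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = order[: self.top_k]
        weights = softmax([scores[i] for i in chosen])
        # 3. The generalist always runs; the chosen experts add a weighted share.
        out = [dot(row, x) for row in self.shared]
        for w, i in zip(weights, chosen):
            expert_out = [dot(row, x) for row in self.experts[i]]
            out = [o + w * e for o, e in zip(out, expert_out)]
        return out, chosen


moe = TinyMoE(dim=4, num_experts=8, top_k=2)
output, chosen = moe.forward([1.0, 0.5, -0.3, 2.0])
print("experts used:", chosen, "out of", len(moe.experts))
```

In this sketch six of the eight experts never run for the given input, which is the source of the savings: the model can hold a lot of knowledge overall while paying for only a slice of it on each step.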
The Decimal Game
If that wasn’t enough, DeepSeek also squeezed efficiency out of pure mathematics. AI models rely on mind-boggling amounts of number crunching, typically storing each number in 16 bits of memory. DeepSeek? They slashed it to 8 bits, halving memory use and speeding up calculations.
Losing precision sounds risky, right? Not really. Just like rounding π to 3.14 works for most practical uses, trimming decimals didn’t noticeably hurt the AI’s performance. And where it mattered, such as when adding up the results of many multiplications, DeepSeek kept the running totals in 32 bits, giving them the best of both worlds.
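The trade-off can be tried out in a few lines of Python. This is a sketch under stated assumptions rather than DeepSeek’s method: it uses simple 8-bit integer rounding as a stand-in for the 8-bit floating-point format used on real AI chips, and the function names `quantize_8bit` and `low_precision_dot` are made up for illustration. The inputs are squeezed into 8 bits, while the running total is kept in a wider number.

```python
def quantize_8bit(values):
    """Round floats to integers in [-127, 127], sharing one scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(v / scale) for v in values], scale


def low_precision_dot(a, b):
    """Multiply 8-bit values, but add the products up in a wide accumulator."""
    qa, scale_a = quantize_8bit(a)
    qb, scale_b = quantize_8bit(b)
    # Python integers never overflow; hardware would use a 32-bit
    # accumulator here so the running total keeps its accuracy.
    total = sum(x * y for x, y in zip(qa, qb))
    # Undo the scaling to get back to ordinary numbers.
    return total * scale_a * scale_b


a = [0.12, -1.5, 3.14159, 0.7]
b = [2.0, 0.25, -0.5, 1.1]

exact = sum(x * y for x, y in zip(a, b))
approx = low_precision_dot(a, b)
print("full precision:", exact)
print("8-bit inputs:  ", approx)
```

The two printed results differ slightly but stay close, which mirrors the point about rounding π: each individual number is coarser, yet the final answer is still good enough for the job.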
Why Didn’t Others Do It?
AI giants like OpenAI and Google’s DeepMind have the brains and the budget, so why didn’t they crack this code first? Simple: risk.
Building AI models is expensive, and experimenting with new techniques can burn millions with no guarantee of success. DeepSeek took that gamble—and it paid off.
Now that they’ve published their findings, the industry is taking note. AI development just got a whole lot cheaper. The question is—who will be the next to follow suit?




