Monday, August 11, 2025

Mixture of experts: The method behind DeepSeek’s frugal success




China’s DeepSeek has built a top-tier artificial intelligence model while spending far less than its American rivals. At a time when AI giants are burning billions on GPUs and power-hungry data centers, the start-up has found a way to do more with less.
The secret? A mix of smart engineering, a clever neural network design, and some good old-fashioned mathematical efficiency.
Big AI, Small Budget
Most AI firms stack their data centers with thousands of GPUs. Meta’s latest AI model reportedly ran on 16,000 specialized chips, each costing around $40,000. DeepSeek says it needed only about 2,000. Its reported compute cost for training was roughly $6 million, about a tenth of what Meta is rumored to have spent.
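The figures in this section can be put side by side with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted in the article; the variable names are illustrative. Note that the price of buying chips is not the same thing as the compute cost of one training run, so the two dollar figures are not directly comparable.

```python
# All inputs are the article's reported or rumored figures, not verified data.
META_CHIPS = 16_000               # chips reportedly used for Meta's latest model
DEEPSEEK_CHIPS = 2_000            # chips DeepSeek reportedly used
CHIP_PRICE_USD = 40_000           # approximate price per chip
DEEPSEEK_COMPUTE_USD = 6_000_000  # DeepSeek's reported compute cost

# How many times more chips Meta reportedly used.
chip_ratio = META_CHIPS / DEEPSEEK_CHIPS

# Sticker price of Meta's chips alone (hardware, not a training bill).
meta_chip_outlay = META_CHIPS * CHIP_PRICE_USD

# "About a tenth" implies a Meta figure roughly ten times DeepSeek's.
implied_meta_compute = DEEPSEEK_COMPUTE_USD * 10

print(f"Chip ratio: {chip_ratio:.0f}x")
print(f"Meta chip outlay: ${meta_chip_outlay:,}")
print(f"Implied Meta compute spend: ${implied_meta_compute:,}")
```

The gap between the chip outlay and the implied compute spend shows why the comparison is best read as an order-of-magnitude claim.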
The ‘Mixture of Experts’ Trick
The key to DeepSeek’s frugal success? A method called “mixture of experts.” Traditional AI models run every input through one giant neural network. That’s like consulting a single brain about everything: inefficient and power-hungry.
DeepSeek instead split the system into many smaller “expert” networks. The popular shorthand is one for poetry, one for coding, another for biology, and so on, although in practice each expert’s specialty is learned during training rather than assigned by topic. For any given input only a few experts are switched on, which is where the savings come from, while a “generalist” network acts as a bridge, coordinating them.
Think of it like a newsroom: specialist reporters cover specific beats, while an editor connects the dots.
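The newsroom picture can be sketched in code. This is a toy illustration under stated assumptions, not DeepSeek’s implementation: the names `router`, `experts`, `shared_expert`, and `moe_layer` are hypothetical, each “expert” is just a small random matrix, and the sizes are far smaller than any real model. A router scores every expert, only the top few run, and an always-on shared network plays the generalist.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def make_matrix(rng, rows, cols):
    """Random weights standing in for a trained network."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def moe_layer(x, router, experts, shared_expert, top_k):
    """Send x to the top_k highest-scoring experts, plus the shared expert."""
    gate = softmax(matvec(router, x))  # one probability per expert
    ranked = sorted(range(len(experts)), key=lambda i: gate[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(gate[i] for i in chosen)  # renormalize over the chosen experts
    weights = {i: gate[i] / norm for i in chosen}
    out = matvec(shared_expert, x)  # the always-on "generalist"
    for i, w in weights.items():
        y = matvec(experts[i], x)  # only the chosen experts do any work
        out = [o + w * v for o, v in zip(out, y)]
    return out, weights

rng = random.Random(0)
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2  # toy sizes, assumed for illustration
router = make_matrix(rng, NUM_EXPERTS, DIM)
experts = [make_matrix(rng, DIM, DIM) for _ in range(NUM_EXPERTS)]
shared_expert = make_matrix(rng, DIM, DIM)
x = [rng.uniform(-1.0, 1.0) for _ in range(DIM)]

output, weights = moe_layer(x, router, experts, shared_expert, TOP_K)
print(f"experts used: {sorted(weights)} of {NUM_EXPERTS}")
```

With these toy settings, six of the eight experts are never evaluated for this input. Skipping that work is the source of the savings.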
The Decimal Game
If that wasn’t enough, DeepSeek also squeezed efficiency out of the arithmetic itself. AI models rely on enormous amounts of number crunching, typically storing each number in 16 bits. DeepSeek cut that to 8 bits for much of its training, halving the memory those numbers occupy and speeding up calculations.
Losing precision sounds risky, right? Not really. Just as rounding π to 3.14 works for most practical uses, trimming decimals didn’t noticeably hurt the AI’s performance. And where it mattered, DeepSeek carried the results of its multiplications at 32-bit precision, giving it the best of both worlds.
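The trade-off can be tried out in a few lines. This is a minimal sketch, not DeepSeek’s method. It uses simple integer quantization (whole numbers from -127 to 127 plus one shared scale factor) as a stand-in for an 8-bit floating-point format, and a Python integer as the wide accumulator standing in for the 32-bit step. The function names `quantize` and `dot_low_precision` are hypothetical.

```python
import random

def quantize(values):
    """Map floats to integers in [-127, 127] plus one shared scale factor."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # avoid a zero scale
    return [round(v / scale) for v in values], scale

def dot_low_precision(a, b):
    """Dot product using 8-bit-sized inputs and a wide running total."""
    qa, scale_a = quantize(a)
    qb, scale_b = quantize(b)
    acc = 0  # wide accumulator: Python integers do not overflow
    for x, y in zip(qa, qb):
        acc += x * y  # products of small integers, summed exactly
    return acc * scale_a * scale_b  # convert back to the original units

rng = random.Random(0)
a = [rng.uniform(-1.0, 1.0) for _ in range(1024)]
b = [rng.uniform(-1.0, 1.0) for _ in range(1024)]

exact = sum(x * y for x, y in zip(a, b))
approx = dot_low_precision(a, b)
print(f"full precision: {exact:.4f}")
print(f"8-bit inputs:   {approx:.4f}")
print(f"difference:     {abs(exact - approx):.4f}")
```

Each input is rounded coarsely, but the rounding errors are small and partly cancel over many terms, and keeping the running total in a wider type prevents them from compounding through overflow or repeated rounding.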
Why Didn’t Others Do It?
AI giants like OpenAI and Google’s DeepMind have the brains and the budget, so why didn’t they crack this code first? Simple: risk.
Building AI models is expensive, and experimenting with new techniques can burn millions with no guarantee of success. DeepSeek took that gamble—and it paid off.
Now that they’ve published their findings, the industry is taking note. AI development just got a whole lot cheaper. The question is—who will be the next to follow suit?




