• AbouBenAdhem@lemmy.world · 4 days ago

    TurboQuant, meanwhile, could lead to efficiency gains and systems that require less memory during inference. But it wouldn’t necessarily solve the wider RAM shortages driven by AI, given that it only targets inference memory, not training — the latter of which continues to require massive amounts of RAM.

    I didn’t realize the RAM shortage was mostly due to training—I would have thought inference was at least as big a factor.
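
    The inference-side savings from quantization can be sketched with a back-of-envelope calculation (illustrative numbers, not from the article; the model size and bit widths below are assumptions):

    ```python
    # Illustrative sketch: memory needed just to hold model weights
    # at different precisions (ignores activations, KV cache, etc.).
    def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
        """GiB required to store n_params weights at the given precision."""
        return n_params * bits_per_weight / 8 / 2**30

    n = 7e9  # a hypothetical 7B-parameter model
    for bits in (32, 16, 4):
        print(f"{bits:>2}-bit: {weight_memory_gib(n, bits):.1f} GiB")
    ```

    Dropping from 32-bit to 4-bit weights cuts the footprint roughly 8×, which is why quantization schemes help serving—but none of this touches the gradients and optimizer state that training has to keep in memory.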

    • Dran@lemmy.world · 4 days ago

      Inference is dirt cheap in comparison. Hundreds to thousands of concurrent users can be served by hardware costing in the high thousands to low tens of thousands of dollars.

      Training those same foundational models takes weeks to months of time on tens to hundreds of millions of dollars’ worth of hardware.