Models Compression - Search News

AI Model Compression for $1,000: Ora Computing Uses Quantum Physics to Beat Hardware Lock-In

Vienna startup Ora Computing raised €3.5M and proved a 70-billion-parameter large language model can be compressed for under ...

13d

Context compression finally works in production: new research cuts LLM input 16x without the accuracy hit

LCLMs compress LLM context before decoding: 8.8x faster at 16x compression, beating every KV cache method tested. Open-sourced by NYU and Columbia.

Tech.eu

Ora Computing raises €3.5M to build the efficiency layer of the AI stack

Ora Computing is developing software that makes AI models smaller, faster, and more efficient. Its technology helps reduce ...

SiliconANGLE

Report: AI model compression startup Multiverse seeking €500M funding round

Multiverse Computing SL, a startup with technology that reduces the hardware footprint of artificial intelligence models, is reportedly raising new capital. Sources told Bloomberg today the Spanish ...

The Manila Times

Multiverse Computing Launches Pulsar 16B in collaboration with NVIDIA: Frontier-Grade Reasoning at Half the Parameters

The new open reasoning model delivers 30B-class intelligence in a 16B-parameter footprint, with 3.1B active parameters, validated independently on NVIDIA accelerated computing infrastructure.

Ars Technica

Google’s TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x

Even if you don’t know much about the inner workings of generative AI models, you probably know they need a lot of memory. Hence, it is currently almost impossible to buy a measly stick of RAM without ...
