Tag Archives: AI efficiency

DeepSeek tests “sparse attention” to slash AI processing costs

The attention bottleneck

In AI, “attention” is a term for a software technique that determines which words in a text are most relevant to understanding each other. Those relationships map out context, and context builds meaning in language. For example, in the sentence “The bank raised interest rates,” attention helps the model establish that “bank”… Read More »
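For readers who want to see the idea concretely, here is a minimal sketch of standard scaled dot-product attention in NumPy. This illustrates the general technique the excerpt describes, not DeepSeek's sparse variant; the token count and dimensions are arbitrary illustrative choices.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Scores measure how relevant each token is to every other token.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Softmax turns each row of scores into weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output vector is a weighted mix of the value vectors.
    return weights @ V, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))  # 5 tokens, e.g. "The bank raised interest rates"
out, w = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (5, 8): one context-mixed vector per token
```

Because every token attends to every other token, the score matrix grows quadratically with sequence length, which is the cost that sparse-attention schemes aim to cut.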

Matrix multiplication breakthrough could lead to faster, more efficient AI models

Computer scientists have discovered a new way to multiply large matrices faster than ever before by eliminating a previously unknown inefficiency, reports Quanta Magazine. This could eventually accelerate AI models like ChatGPT, which… Read More »
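As a baseline for what such breakthroughs improve on, here is the schoolbook matrix-multiplication algorithm, which performs roughly n³ scalar multiplications for n×n matrices; the research the article covers concerns lowering that asymptotic exponent. This sketch is purely illustrative and is not the algorithm from the article.

```python
def matmul_naive(A, B):
    # Schoolbook algorithm: n * m * p scalar multiplications
    # for an (n x m) matrix times an (m x p) matrix.
    n, m = len(A), len(A[0])
    p = len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            for j in range(p):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_naive(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```

Faster schemes (Strassen's algorithm and its descendants, including the "laser method" line of work) trade some of these multiplications for additions, which is why shaving the exponent matters so much at the matrix sizes AI models use.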