
Why Embeddings Won't Save AI Search



  • Theoretical limits of vector embeddings
  • New benchmarks make old problems worse
  • Yannic Kilcher takes apart the industry hype

The arXiv paper On the Theoretical Limitations of Embedding-Based Retrieval (arXiv:2508.21038) [1] confirms what researchers have been whispering for years: vector embeddings have theoretical limits that scaling cannot get around. The authors do not deny that embeddings are useful for classic retrieval tasks, but they warn that when the same tools are applied to reasoning, instruction following, or code generation, the problems become fundamental rather than merely technical.

The problem is not that embeddings fail to work. They do work, but only within a narrow range set by their mathematical nature. When they are asked to capture 'relevance' in a context that is not static (e.g. dynamic queries or multimodal scenarios), the system falls apart.
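A toy illustration of what "limits set by their mathematical nature" can mean, loosely following the paper's argument that the embedding dimension restricts which top-k document sets any query can return. This sketch is not taken from the paper: the documents, the vectors, and the helper `reachable_top_k` are hypothetical, and relevance is assumed to be a plain dot product.

```python
import math
from itertools import combinations


def reachable_top_k(doc_vectors, queries, k):
    """Return every distinct top-k document set retrieved by at least one query."""
    reachable = set()
    for q in queries:
        # Dot-product relevance score of each document for this query.
        scores = [sum(qi * di for qi, di in zip(q, d)) for d in doc_vectors]
        ranked = sorted(range(len(doc_vectors)), key=lambda i: scores[i], reverse=True)
        reachable.add(frozenset(ranked[:k]))
    return reachable


def unit(angle_deg):
    """Unit vector in the plane at the given angle (degrees)."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))


# Every possible 2-document result set for 3 documents: {0,1}, {0,2}, {1,2}.
all_pairs = {frozenset(pair) for pair in combinations(range(3), 2)}

# 1-D embeddings: a nonzero scalar query ranks documents either ascending or
# descending, so the middle document always makes the top 2 (made-up values).
docs_1d = [(-1.0,), (0.2,), (1.5,)]
queries_1d = [(-2.0,), (-0.5,), (0.5,), (2.0,)]
reachable_1d = reachable_top_k(docs_1d, queries_1d, k=2)

# 2-D embeddings: three documents spread around the unit circle, queried from
# 360 directions (offset by half a degree to avoid exact ties).
docs_2d = [unit(90), unit(210), unit(330)]
queries_2d = [unit(i + 0.5) for i in range(360)]
reachable_2d = reachable_top_k(docs_2d, queries_2d, k=2)

print(sorted(sorted(s) for s in all_pairs - reachable_1d))  # [[0, 2]]
print(reachable_2d == all_pairs)                            # True
```

With one dimension, the pair {0, 2} can never be the result, no matter how the query is chosen; one extra dimension makes all three pairs reachable. The same kind of counting argument, applied to realistic dimensions and much larger collections, is what makes the limit structural rather than a training problem.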

That is not a bug, it is a feature. Or, as the authors of the paper would say: 'Warning: Rant'.

Yannic Kilcher, whose video analysis [2] of the paper has already drawn attention in the technical community, stresses that the industry is moving toward ever more ambitious benchmarks without having fixed the underlying shortcomings.

Reasoning and coding benchmarks only widen the gap between promise and reality

Instead of improving embeddings for specific domains, the trend is to push them into more and more tasks, and then to be surprised when results range from 'good' to 'completely wrong'. What the paper does not say explicitly, but implies, is that these limits hurt most for companies that treat embeddings as a universal solution.

If your product depends on search that has to understand nuance (e.g. legal documents or medical diagnoses), embeddings will sooner or later let you down. That is a matter of system design, not of model scale.

Reactions in the community are divided: some researchers see the paper as a healthy dose of skepticism, while others think the authors take the criticism too far. On GitHub and tech forums [3], discussions of alternatives are already appearing, such as hybrid approaches that combine embeddings with classic retrieval methods.
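One common way to build such a hybrid is to run a lexical ranker and an embedding ranker separately and merge their result lists. The sketch below uses reciprocal rank fusion (RRF) as one possible merging rule; it is an illustration only, not something the paper or the forum threads prescribe. The document IDs and both input rankings are made up, and `k=60` is just the conventional default for the smoothing constant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document gets 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending total score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical outputs of two independent retrievers for the same query.
lexical_ranking = ["contract_a", "contract_b", "memo_c"]    # e.g. keyword/BM25-style
embedding_ranking = ["memo_c", "contract_a", "contract_b"]  # e.g. cosine similarity

fused = reciprocal_rank_fusion([lexical_ranking, embedding_ranking])
print(fused)  # ['contract_a', 'memo_c', 'contract_b']
```

RRF looks only at rank positions, so the two retrievers do not need comparable score scales. That makes it a convenient first thing to try before tuning weighted score combinations.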

There is no simple fix, but it is clear that the way embeddings are used has to change. Rather than chasing universal solutions, the focus should shift to specific tasks and domains. The limitations of embeddings, and the ways to work around them, also need to be better understood. That will take collaboration among researchers, developers, and industry. One thing is certain: embeddings will not save AI search, but they can be an important part of the solution.

