tomalley8342@lemmy.world to Technology@lemmy.world • Announcing ARC-AGI-3 - A benchmark that tests if AI can explore, learn, and adapt in unfamiliar situations. Humans score 100%. Frontier AI scores 0.26%. • English • 3 days ago
They didn’t say “100% of humans can solve this benchmark”, they said “humans can solve 100% of this benchmark”.