ayyy@sh.itjust.works to Technology@lemmy.world • Announcing ARC-AGI-3 - A benchmark that tests if AI can explore, learn, and adapt in unfamiliar situations. Humans score 100%. Frontier AI scores 0.26%. · 2 days ago
The humans literally didn’t score 100% though. Why lie?