@arcine - lemmy.ruv.wtf

arcine@jlai.lu

0 Posts
1 Comment

Joined 1 month ago

Cake day: February 15th, 2026


arcine@jlai.lu to Technology@lemmy.world • Announcing ARC-AGI-3 - A benchmark that tests if AI can explore, learn, and adapt in unfamiliar situations. Humans score 100%. Frontier AI scores 0.26%.
3 days ago
Try spelling things phonetically (example: “faux net tick alley” for “phonetically”). That’s one of my benchmarks, and AI fails it almost every time.

If the input is at all long, or deliberately padded with lots of words on a theme unrelated to the coded message, decoding it becomes impossible for the AI.
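
For anyone who wants to try this, here is a minimal Python sketch of how such a test prompt could be assembled. It is only one guess at a setup, not the commenter's actual method: the names `build_probe` and `passes`, the nautical distractor theme, and the lenient substring check are all hypothetical. The only part taken from the comment is the coded phrase itself.

```python
import random


def build_probe(coded_words, distractor_words, n_distractors=20, seed=0):
    """Hide a phonetically spelled message inside themed filler words.

    Hypothetical helper: the filler is drawn at random from one unrelated
    theme, and the coded words are kept together and in order so the
    message still sounds right when read aloud.
    """
    rng = random.Random(seed)
    filler = [rng.choice(distractor_words) for _ in range(n_distractors)]
    # Pick a random position in the filler for the contiguous coded run.
    pos = rng.randint(0, len(filler))
    return " ".join(filler[:pos] + list(coded_words) + filler[pos:])


def passes(model_answer, expected):
    """Lenient check (an assumption): the answer contains the expected word."""
    return expected.lower() in model_answer.lower()


# Coded phrase from the comment; read aloud it sounds like "phonetically".
coded = ["faux", "net", "tick", "alley"]
# Hypothetical distractor theme, unrelated to the hidden message.
theme = ["harbor", "anchor", "tide", "sail", "mast", "reef"]

prompt = build_probe(coded, theme, n_distractors=12, seed=1)
print(prompt)
```

Raising `n_distractors` gives the longer, theme-heavy input described above. The model's reply would then be passed to `passes(reply, "phonetically")`.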
