o3 “ARC AGI” postmortem megathread: why things got heated, what went wrong, and what it all means

submitted by
Style Pass
2024-12-22 17:30:06

Kevin Roose, of Hard Fork and NYT, was so impressed with OpenAI’s rollout that he joked “of course they have to announce AGI the day my vacation starts”.

For many people, what sealed the deal, or led them to conclude, wrongly, that o3 necessarily “must be a step to AGI”, was o3’s performance on @fchollet’s ARC-AGI.

1. As NYU prof Brenden Lake pointed out, the test should never have been called ARC-AGI. Even Chollet acknowledged this in his blog, saying “it’s not an acid test for AGI”. At *most* the test is necessary for AGI; it certainly isn’t sufficient. Critical things like factuality, compositionality, and common sense aren’t even addressed.

2. The video should have been much clearer about what was actually tested and what was actually trained. To the average listener it may have sounded like the AI took the test cold, with a few sample items, like a human would, but that’s not actually what happened.

3. What was actually done - pretraining on what I believe was hundreds of public examples - is NOT comparable to what humans require. Such pretraining is not uncommon in the field, but it was not made clear in the video. Altman saying that the test wasn’t “targeted” added to the confusion.
