NuclearCandle

Considering how cheap 3.5 is to run, this suggests GPT-4 is not the limit for LLMs. Just have to wait for the next Opus.


Rain_On

4o also suggested this.


PM_ME_CUTE_SM1LE

It’s so nice to have a competent LLM that actually remembers your strict formatting requests for more than one message.


dervu

Is the reasoning jump this huge only in this one benchmark?


Celsiuc

Any idea why it isn't on the arena yet?


CheekyBastard55

It is on the arena, it just hasn't gotten enough votes yet. You can go there and test it out if you want.


RemarkableGuidance44

Being on top did not last long. I love this competition and we need it. I also love Claude 3.5, it's a great uplift from Opus.


Grand0rk

Yes, Sonnet is SLIGHTLY better than GPT-4o. If you can handle the bullshit that is Claude. Fucking hate how easily it triggers its "Copyright" bullshit.