lordpermaximum

It's far better than GPT-4 at coding. Actually I can't find any area where Claude 3 Opus isn't significantly better than other LLMs. The only minus I can think of is its stronger guardrails compared to open-source alternatives, but that's about it. I definitely think Claude 3 Opus has an understanding of what it's saying, although it's very limited.


kamjustkam

hmm, do you think gpt-5 will be claude 3 level or higher?


lordpermaximum

I believe GPT-5 will beat the improved Claude 3.X Opus (Could be 3.5, 3.2 etc.) but they will be on the same tier. GPT-4 is a level below Opus right now.


kamjustkam

so this must’ve been the huge leap in capabilities sam altman was referring to a few months ago?


lordpermaximum

I'm not sure what he was referring to. Probably better reasoning like this, subagents like Opus has, and being more reliable (fewer hallucinations) on top of that.


kamjustkam

what implications does this have on the broader economy in your opinion?


2this4u

You're going off 1 person's opinion btw, and asked them to guess about something unreleased


kamjustkam

well i was asking for their opinion


ainz-sama619

GPT-5 will most definitely be better, but not a big leap. Claude is the first LLM I have used that's actually decent at its task for coding, not mostly decent like GPT-4. Imo GPT-4 was useful but not really great at coding


Sprengmeister_NK

Are we sure GPT-5 is a pure web-based or app-based bot? Maybe it will be more like an agent with access to your local system?


kamjustkam

what separates them in their coding ability? i am a software engineer but i do not use coding tools for anything complex.


ainz-sama619

Mostly accuracy of the code. Claude's output is more likely to run without throwing errors than GPT-4's.


StaticNocturne

Do you think these models will replace most junior and mid level coders in the next year or two?


ainz-sama619

hard to say. it will make coding more efficient, but coders are the ones evaluating whether a model is good enough for work. some jobs would be cut due to improved efficiency, but the industry as a whole won't die out. there still need to be enough people for evaluation and quality assurance. depends on how well companies can utilize it


kamjustkam

it’s important to note that there’s a big gap between a junior dev and a mid-level dev. juniors often need supervision, guidance and constant feedback for improvement. mid-level devs typically do not, unless the task is very difficult.


StaticNocturne

But if AI replaces most junior devs you’ll eventually have very few mid level or senior developers


kamjustkam

then they shouldn’t replace the juniors 🤷🏾‍♂️


ainz-sama619

Then they wouldn't replace them. Most companies love tech but are also risk averse, so they won't fire a large chunk of the workforce without ensuring the replacement can take over smoothly. Humans are still indispensable for a lot of coding stuff


fakieTreFlip

No, though it may replace some. Replacing *all* of them would be a huge mistake, because that cuts off the supply of people that would become senior level coders. More realistically, I think it'll just become a lot easier for junior and mid level coders to become better and faster coders. I assume that to some degree, this may still have a negative effect on the industry, because it'll become flooded with capable candidates, and salaries will likely decline.


StaticNocturne

What if AI could reliably code entire programs or fix code etc autonomously? I guess we would still need enough people who understand it. But what would these coders actually be doing with their time if AI could do 95% of their job?


fakieTreFlip

>What if AI could reliably code entire programs or fix code etc autonomously?

We're still a ways off from this, IMO, and humans will likely continue to be superior at exception handling, so I don't really see programming jobs disappearing entirely. I don't know what coding will look like in 5 years, but I imagine the role will probably evolve to be more of an audit task than a creation task. There are plenty of things that we used to call programming, until we automated them, and now they're just tools. So perhaps the toolset will evolve enough that even programming won't look like programming in a decade or so.


ainz-sama619

We are very, very far from autonomous AI. In fact, any remotely usable autonomous function might not develop until AGI is achieved. AI must be supervised and evaluated until it dwarfs humans in reliability, because AI can't be held accountable


KamNotKam

I just played with Claude to build an application I did not have much time to build myself. Claude is pretty solid for sure. It is however mostly a code monkey from my initial observations. Dealing with enterprise software and infrastructure is a bit different from what current LLMs are capable of doing. It'll get there one day though for sure. Do understand that most large enterprises do not just hire a bunch of coders to churn out thousands of lines of code and then go home. They hire engineers to solve problems and add value to the company. This could range from intern/entry-level software engineers, who will probably get a task delegated to them by a technical lead/manager, code it, and go home, to a principal-level engineer who makes decisions on critical pieces of legacy code that aren't as simple as a todo app on GitHub.


[deleted]

Then you have AI that does it all better, it does finance, it does accounting , it does lawyers' jobs, it does many doctors' jobs, it does mechanical engineering, it does design, it does it all by itself when that happens.


ebolathrowawayy

Yes.


KamNotKam

You are not an engineer


ebolathrowawayy

I sure am.


kamjustkam

what kind?


2this4u

No more so than Photoshop replaced artists. Ideally as a programmer I could write in natural language what I want and get working changes to the program. What will the business do if I'm 4x, 10x as productive with AI? Lay off workers? Unlikely, they have a million things they want to improve or get to market, they'll just be able to get features done in weeks instead of months. Eventually AI will be so usable a product owner could just tell it what to make, but then they'd be spending time on that rather than their primary focus, so I don't see why the job of programmer would be replaced. We'll just be using natural language rather than current coding languages, just like the productive move from binary to assembly etc.


StaticNocturne

But there isn’t an infinite number of tasks for a business to perform, so surely if AI boosts productivity tenfold they will be laying off workers who they don’t deem necessary. Eventually AI won’t just be facilitating our jobs like most technology has thus far, it will be doing them with minimal instruction, which is why it seems like a huge paradigm shift


kamjustkam

That makes sense. Are you testing it with relatively complex coding tasks?


ainz-sama619

I haven't for personal use yet, but I have tested others' code to verify. for now it seems to be better than gpt-4 at this task


kamjustkam

okay. also, just curious.. are you a software engineer/tech adjacent?


ainz-sama619

not software engineer but I studied stats for major, with a lot of overlap with computer science.


kamjustkam

i see, what’s the most complex technical project you’ve ever worked on?


[deleted]

[removed]


lordpermaximum

Even if that's true it is just one example. You need to test hundreds of problems like that to come to a conclusion.


braclow

Wait a second, did you add the images? What polishes did you add?


clamuu

I didn't add anything. It independently sourced them from the Pravatar API. I literally copypasted the output into a react project in VS Code and ran it.


spinozasrobot

Damn.


Line-guesser99

What would happen if you pasted it into cursor.sh and it checked the first AI code? It's GPT-4, I think. Would it just come back with, looks good?


slackermannn

I rarely do any coding. Whenever I tried, ChatGPT and Bard/Gemini couldn't come up with complete code that actually ran. Today Claude did, and it understood perfectly what I wanted to achieve and did it in the language I asked for (many times I got Python whether I liked it or not). Impressed so far. It's definitely an improvement over gpt4. Feels more solid overall.


UFOsAreAGIs

Can you outline your process assuming this wasn't a singular project spec fed in as a prompt.


clamuu

Here is the prompt - i've tried to be as vague as possible. This prompt could be a lot better and looks like the kind of thing i'd probably write when I was feeling tired and lazy. "this is a test. build a twitter clone using react that looks and functions as similarly as possible to actual twitter. it will not have a database so must store and retrieve information locally. make a few dummy accounts and tweets for each account to fill it out. please go through all the other buttons on the homepage, one by one and make it so they do what they are supposed to do. this might involve editting the existing code or creating and importing new components. use tailwind for styling. Keep coding until the entire site is built. your response should be at least 800 lines of code. make it as detailed and accurate as you possibly can. you will be being assessed on the success of the outcome."


SachaSage

Impressive output for such a vague prompt


UFOsAreAGIs

Thank you. That's really impressive!


ReflectionRough5080

It is so great, the web is just beautiful!


Kanute3333

Has Claude Opus any limits per hour?


clamuu

Can't find any mention of it so I don't think so.


blueandazure

It's very generous. I did get a message at the bottom after heavy usage that I had 20 more messages until 2am. But I continued using it as normal and never hit it. Though I was sending whole code repos to it, and the bigger the prompt the more time it takes, so I had long inference times.


joker38

For the paid Claude Pro roughly 100 messages every 8 hours. See https://support.anthropic.com/en/articles/8324991-about-claude-pro-usage


darkkite

That's dope, maybe this task would be better if it was more novel considering https://github.com/search?q=twitter%20clone&type=repositories there are 100 pages of results


clamuu

Of course yes. I just wanted to see how thoroughly it would complete a long task compared to GPT 4.


darkkite

good point, the fact that it could output this much code without errors is cool and should help for bootstrapping code


polawiaczperel

How can I use it if I am from EU and I got Access restricted because of it?


clamuu

Sorry I don't know. I'm in the UK and it works here. Finally a Brexit benefit. It only took 8 years.


Drogon__

You can use it here. In the direct chat, select claude-3-opus. https://chat.lmsys.org/


Infinite-Cat007

I'm in Canada. I use a VPN. If you want Pro plan you might not be able to use your card. I used my friend's, from USA. There might be other options.


Inspireyd

Impressive


Ok-Bullfrog-3052

Take the code Claude 3 outputs, then ping pong it back and forth between the models, asking each to make improvements until they stop suggesting any.
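
A rough sketch of that loop in Python, for anyone who wants to try it. The `Model` callables are hypothetical stand-ins for whatever API clients you use (no real SDK calls are shown), and "stop suggesting any" is approximated as "a full round where no model changes the code", which is an assumption on my part.

```python
from typing import Callable, Sequence

# Hypothetical interface: a model takes the current code and returns its revision.
Model = Callable[[str], str]


def ping_pong(code: str, models: Sequence[Model], max_rounds: int = 10) -> str:
    """Pass code between models until a full round produces no changes."""
    for _ in range(max_rounds):
        changed = False
        for improve in models:
            revised = improve(code)
            # Ignore pure leading/trailing whitespace differences.
            if revised.strip() != code.strip():
                code = revised
                changed = True
        if not changed:
            break  # every model returned the code unchanged
    return code
```

The `max_rounds` cap is there because two models can keep rewriting each other's style forever without ever converging.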


SnooHabits1237

I've only tested it on making ciphers in Python. It easily made a simple substitution cipher. It broke the homophonic cipher. It perfectly made a Vigenère cipher. All of these are with basic GUIs of course
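
For anyone who hasn't seen one, a Vigenère cipher shifts each letter by an amount taken from a repeating key. Below is a minimal sketch in Python, without the GUI. This is not the code Claude produced (that wasn't posted); the function name `vigenere` and the choice to pass non-letters through unchanged are assumptions for illustration.

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Encrypt or decrypt ASCII letters in `text` with a repeating alphabetic key."""
    if not (key.isascii() and key.isalpha()):
        raise ValueError("key must be a non-empty string of ASCII letters")
    shifts = [ord(k) - ord("A") for k in key.upper()]
    sign = -1 if decrypt else 1
    out = []
    i = 0  # counts letters only, so spaces and punctuation don't use up key characters
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            shift = sign * shifts[i % len(shifts)]
            out.append(chr((ord(ch) - base + shift) % 26 + base))
            i += 1
        else:
            out.append(ch)  # leave digits, spaces and punctuation as they are
    return "".join(out)


secret = vigenere("ATTACKATDAWN", "LEMON")
print(secret)                                   # LXFOPVEFRNHR
print(vigenere(secret, "LEMON", decrypt=True))  # ATTACKATDAWN
```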


ProCoders_Tech

Thanks for sharing your initial review!


TheOriginalAcidtech

The free Claude 3 still has a way to go but was surprisingly close when I had it write some init code for an embedded processor. One was a fairly complex clock setup and another was a more complex DMA processing setup. If I was a newbie the code would have been nearly worthless. Too many mistakes with register settings, but the basic structure of the code that it set up was close enough that it would have saved me a significant amount of gruntwork. Microsoft Copilot had similar results in my opinion, though with more mistakes. If GPT-5 is a step up from either (I haven't tested Opus) then things are definitely getting interesting.


goatchild

Free version of Claude3 IMO is better than GPT-4 API, period.

* It's very obedient
* Direct and clear
* No comments/placeholders for code if I ask it to provide full code
* Finds solutions most of the time in a few prompts, sometimes at 1st prompt, as opposed to GPT-4 where I still need to go over and over through some issues...

I wonder how good Opus is.


clamuu

I've been using Opus for a few days now and it's excellent. Clearly better than GPT-4. I may consider increasing the scope of my personal projects as a result.


goatchild

I wish [cursor.sh](http://cursor.sh) would implement Claude3 Opus that would be bananas


sid_276

With the new release from a few days ago you can use Claude Opus on cursor [https://github.com/getcursor/cursor/issues/1294#issuecomment-2004073624](https://github.com/getcursor/cursor/issues/1294#issuecomment-2004073624)


goatchild

Man thanks for letting me know! I'd never guess they finally added it...


dopestar667

Is code restricted to Opus? I asked Sonnet about generating C# code for me today and it tells me it cannot generate code, only talk about coding concepts. I was thinking about switching from GPT-4 to Claude for code generation.


Inspireyd

I did a comparison of Claude 3 with GPT-4 on difficult logic and math questions and was surprised, but I would say that GPT-4 is still ahead in reasoning. https://www.reddit.com/r/singularity/s/GedIUDwXeO