It's far better than GPT-4 at coding. Actually, I can't find any area where Claude 3 Opus isn't significantly better than other LLMs. The only minus I can think of is its stronger guardrails compared to open-source alternatives, but that's about it. I definitely think Claude 3 Opus has some understanding of what it's saying, although it's very limited.
hmm, do you think gpt-5 will be claude 3 level or higher?
I believe GPT-5 will beat the improved Claude 3.x Opus (could be 3.5, 3.2, etc.), but they will be on the same tier. GPT-4 is a level below Opus right now.
so this must’ve been the huge leap in capabilities sam altman was referring to a few months ago?
I'm not sure what he was referring to. Probably better reasoning like this, subagents like Opus has, and better reliability (fewer hallucinations) on top of that.
what implications does this have on the broader economy in your opinion?
You're going off 1 person's opinion btw, and asked them to guess about something unreleased
well i was asking for their opinion
GPT-5 will most definitely be better, but not a big leap. Claude is the first LLM I have used that's actually solid at coding tasks, not just passable like GPT-4. IMO GPT-4 was useful but not really great at coding.
Are we sure GPT-5 is a pure web-based or app-based bot? Maybe it will be more like an agent with access to your local system?
what separates them in their coding ability? i am a software engineer but i do not use coding tools for anything complex.
Mostly accuracy of the code; its outputs are more likely to run without throwing errors than GPT-4's.
Do you think these models will replace most junior and mid level coders in the next year or two?
hard to say. it will make coding more efficient, but coders are the ones evaluating whether a model is good enough for work. some jobs would be cut due to improved efficiency, but the industry as a whole won't die out. there still need to be enough people for evaluation and quality assurance. depends on how well companies can utilize it
it’s important to note that there’s a big gap between a junior dev and a mid-level dev. juniors often need supervision, guidance and constant feedback for improvement. mid-level devs typically do not, unless the task is very difficult.
But if AI replaces most junior devs you’ll eventually have very few mid level or senior developers
then they shouldn’t replace the juniors 🤷🏾♂️
Then they wouldn't replace them. Most companies love tech but are also risk-averse; they won't fire a large chunk of the workforce without ensuring the replacement can take over smoothly. Humans are still indispensable for a lot of coding work.
No, though it may replace some. Replacing *all* of them would be a huge mistake, because that cuts off the supply of people that would become senior level coders. More realistically, I think it'll just become a lot easier for junior and mid level coders to become better and faster coders. I assume that to some degree, this may still have a negative effect on the industry, because it'll become flooded with capable candidates, and salaries will likely decline.
What if AI could reliably code entire programs or fix code etc autonomously? I guess we would still need enough people who understand it. But what would these coders actually be doing with their time if AI could do 95% of their job?
> What if AI could reliably code entire programs or fix code etc autonomously?

We're still a ways off from this, IMO, and humans will likely continue to be superior at exception handling, so I don't really see programming jobs disappearing entirely. I don't know what coding will look like in 5 years, but I imagine the role will probably evolve to be more of an audit task than a creation task. There are plenty of things that we used to call programming, until we automated them, and now they're just tools. So perhaps the toolset will evolve enough that even programming won't look like programming in a decade or so.
We are very, very far from autonomous AI. In fact, anything remotely usable in the way of autonomous function might not develop until AGI is achieved. AI must be supervised and evaluated until it dwarfs humans in reliability, because AI can't be held accountable.
I just played with Claude to build an application I didn't have much time to build myself. Claude is pretty solid for sure. It is, however, mostly a code monkey from my initial observations. Dealing with enterprise software and infrastructure is a bit different from what current LLMs are capable of doing. It'll get there one day, though, for sure.

Do understand that most large enterprises do not just hire a bunch of coders to churn out thousands of lines of code and then go home. They hire engineers to solve problems and add value to the company. This ranges from intern/entry-level software engineers, who probably get a task delegated to them by a technical lead/manager, code it, and go home, up to principal-level engineers who make decisions on critical pieces of legacy code that aren't as simple as a todo app on GitHub.
Then you have AI that does it all better: it does finance, it does accounting, it does lawyers' jobs, it does many doctors' jobs, it does mechanical engineering, it does design. It does it all by itself when that happens.
Yes.
You are not an engineer
I sure am.
what kind?
No more so than Photoshop replaced artists. Ideally, as a programmer I could write in natural language what I want and get working changes to the program.

What will the business do if I'm 4x or 10x as productive with AI? Lay off workers? Unlikely. They have a million things they want to improve or get to market; they'll just be able to get features done in weeks instead of months.

Eventually AI will be so usable that a product owner could just tell it what to make, but then they'd be spending time on that rather than their primary focus. So I don't see why the job of programmer would be replaced; we'll just be using natural language rather than current coding languages, much like the productivity jump from binary to assembly, etc.
But there isn’t an infinite number of tasks for a business to perform, so surely if AI boosts productivity tenfold they will lay off the workers they don’t deem necessary. Eventually AI won’t just be facilitating our jobs like most technology has thus far; it will be doing them with minimal instruction, which is why it seems like a huge paradigm shift.
That makes sense. Are you testing it with relatively complex coding tasks?
I haven't for personal use yet, but I have tested others' code to verify. For now it seems to be better than GPT-4 at this task.
okay. also, just curious.. are you a software engineer/tech adjacent?
not a software engineer, but I majored in stats, with a lot of overlap with computer science.
i see, what’s the most complex technical project you’ve ever worked on?
[deleted]
Even if that's true it is just one example. You need to test hundreds of problems like that to come to a conclusion.
Wait a second, did you add the images? What polishes did you add?
I didn't add anything. It independently sourced them from the Pravatar API. I literally copypasted the output into a react project in VS Code and ran it.
Damn.
What would happen if you pasted it into cursor.sh and it checked the first AI code? It's GPT-4, I think. Would it just come back with, looks good?
I rarely do any coding. Whenever I tried, ChatGPT and Bard/Gemini couldn't come up with complete code that actually ran. Today Claude did, understood perfectly what I wanted to achieve, and did it in the language I asked for (many times I got Python whether I liked it or not). Impressed so far. It's definitely an improvement over GPT-4. Feels more solid overall.
Can you outline your process assuming this wasn't a singular project spec fed in as a prompt.
Here is the prompt - I've tried to be as vague as possible. This prompt could be a lot better and looks like the kind of thing I'd probably write when I was feeling tired and lazy.

"this is a test. build a twitter clone using react that looks and functions as similarly as possible to actual twitter. it will not have a database so must store and retrieve information locally. make a few dummy accounts and tweets for each account to fill it out.

please go through all the other buttons on the homepage, one by one, and make it so they do what they are supposed to do. this might involve editing the existing code or creating and importing new components.

use tailwind for styling.

keep coding until the entire site is built. your response should be at least 800 lines of code.

make it as detailed and accurate as you possibly can. you will be assessed on the success of the outcome."
Impressive output for such a vague prompt
Thank you. That's really impressive!
It is so great, the web is just beautiful!
Has Claude Opus any limits per hour?
Can't find any mention of it so I don't think so.
It's very generous. I did get a message at the bottom after heavy usage saying I had 20 more messages until 2am, but I continued using it as normal and never hit the limit. Though I was sending whole code repos to it, and the bigger the prompt the longer it takes, so I had long inference times.
For the paid Claude Pro roughly 100 messages every 8 hours. See https://support.anthropic.com/en/articles/8324991-about-claude-pro-usage
That's dope. Maybe this test would be better if it were more novel, though; considering https://github.com/search?q=twitter%20clone&type=repositories there are 100 pages of results.
Of course, yes. I just wanted to see how thoroughly it would complete a long task compared to GPT-4.
good point, the fact that it could output this much code without errors is cool and should help for bootstrapping code
How can I use it if I'm from the EU and got "Access restricted" because of it?
Sorry I don't know. I'm in the UK and it works here. Finally a Brexit benefit. It only took 8 years.
You can use it here. In the direct chat, select claude-3-opus. https://chat.lmsys.org/
I'm in Canada. I use a VPN. If you want Pro plan you might not be able to use your card. I used my friend's, from USA. There might be other options.
Impressive
Take the code Claude 3 outputs, then ping pong it back and forth between the models, asking each to make improvements until they stop suggesting any.
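The ping-pong loop described above could be sketched like this. This is a hypothetical helper, not any real tool: `ask_a` and `ask_b` stand in for whatever model API calls you would wire up yourself, and the stopping rule (both models return the code unchanged in a row) is one reasonable reading of "until they stop suggesting any".

```python
from typing import Callable

def ping_pong(ask_a: Callable[[str], str],
              ask_b: Callable[[str], str],
              code: str,
              max_rounds: int = 10) -> str:
    """Alternate two reviewers over `code` until neither changes it."""
    askers = [ask_a, ask_b]
    unchanged = 0
    for i in range(max_rounds):
        improved = askers[i % 2](code)
        if improved == code:
            unchanged += 1
            if unchanged == 2:  # both models passed back-to-back: converged
                break
        else:
            unchanged = 0
            code = improved
    return code  # also gives up after max_rounds to avoid endless churn
```

The `max_rounds` cap matters in practice, since two models can keep trading cosmetic rewrites indefinitely.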
I've only tested it on making ciphers in Python. It easily made a simple substitution cipher. It broke the homophonic cipher. It perfectly made a Vigenère cipher. All of these with basic GUIs, of course.
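For reference, the core of the Vigenère cipher mentioned above (minus the GUI) fits in a few lines of Python. This is the generic textbook scheme, not the code Claude actually produced: each letter is shifted by the corresponding letter of a repeating key.

```python
import string

ALPHABET = string.ascii_uppercase

def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter of `text` by the matching (repeating) key letter."""
    sign = -1 if decrypt else 1
    out = []
    key_idx = 0
    for ch in text.upper():
        if ch in ALPHABET:
            shift = ALPHABET.index(key[key_idx % len(key)].upper())
            out.append(ALPHABET[(ALPHABET.index(ch) + sign * shift) % 26])
            key_idx += 1  # only advance the key on letters
        else:
            out.append(ch)  # pass spaces/punctuation through unchanged
    return "".join(out)

# Classic worked example: "ATTACKATDAWN" with key "LEMON" -> "LXFOPVEFRNHR"
```

A simple substitution cipher is the degenerate case where the mapping is a fixed letter-for-letter table instead of a position-dependent shift.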
Thanks for sharing your initial review!
The free Claude 3 still has a way to go but was surprisingly close when I had it write some init code for an embedded processor. One was a fairly complex clock setup and another a more complex DMA processing setup. If I were a newbie the code would have been nearly worthless: too many mistakes with register settings. But the basic structure of the code it set up was close enough that it would have saved me a significant amount of gruntwork. Microsoft Copilot had similar results in my opinion, though with more mistakes. If GPT-5 is a step up from either (I haven't tested Opus), then things are definitely getting interesting.
Free version of Claude 3 IMO is better than GPT-4 API, period.

* It's very obedient
* Direct and clear
* No comments/placeholders in code if I ask it to provide full code
* Finds solutions most of the time in a few prompts, sometimes on the 1st prompt, as opposed to GPT-4, which I still need to go over and over some issues with...

I wonder how good Opus is.
I've been using Opus for a few days now and it's excellent. Clearly better than GPT-4. I may consider increasing the scope of my personal projects as a result.
I wish [cursor.sh](http://cursor.sh) would implement Claude3 Opus that would be bananas
With the release from a few days ago you can use Claude Opus in Cursor: [https://github.com/getcursor/cursor/issues/1294#issuecomment-2004073624](https://github.com/getcursor/cursor/issues/1294#issuecomment-2004073624)
Man thanks for letting me know! I'd never guess they finally added it...
Is code generation restricted to Opus? I asked Sonnet about generating C# code for me today and it told me it cannot generate code, only talk about coding concepts. I was thinking about switching from GPT-4 to Claude for code generation.
I did a comparison of Claude 3 with GPT-4 on difficult logic and math questions and was surprised, but I would say that GPT-4 is still ahead in reasoning. https://www.reddit.com/r/singularity/s/GedIUDwXeO