cross-posted from: https://lemmy.zip/post/49954591
“No Duh,” say senior developers everywhere.
The article explains that vibe code is often close to functional, but not quite, requiring developers to go in and find where the problems are - resulting in a net slowdown of development rather than productivity gains.
Then there’s the issue of finding an agreed-upon way of tracking productivity gains, a glaring omission given the billions of dollars being invested in AI.
According to Bain & Company, companies will need to fully commit to AI to realize the gains they’ve been promised.
“Fully commit” to see the light? That… sounds more like a kind of religion, not like critical or even rational thinking.
It’s been clear that the best use of AI in a professional environment is as an assistant.
I don’t want something doing my job for me. I just want it to help me find something or to point out possible issues.
Of course, AI isn’t there yet. It doesn’t like reading through multiple large files. It doesn’t “learn” from you and what you’re doing, only what it’s “learned” before. It can’t pick up on your patterns over time. It doesn’t remember what your various responsibilities are. If I work in a file today, it’s not going to remember in a month when I work on it again.
And it might never get there. We’ve been rapidly approaching the limits of AI, with two major problems. First, the resources needed for scaling are growing exponentially: doubling the training data and computing resources won’t produce a model that’s twice as good. Second, overtraining is now a concern. We’re discovering that models can produce worse results if they receive too much training data.
And, obviously, it’s terrible for the environment and a waste of resources and electricity.
I have to say that I am surprised by how many people are bashing on AI coding.
I personally use AI tools like Cursor and Claude Code to help me with code, and I have to say that they are incredibly helpful with managing huge codebases and fixing repetitive elements. Obviously, I have to know a lot about coding and system design and whatnot, and I have to write many instructions, so it can’t ‘replace programmers’ or anything. But in the hands of those who already know their stuff, it really does speed things up significantly.
I have no idea how other people are using it that is inspiring all these reports on how awful AI coding is. Do non-programmers just enter Claude and write “make program!!!” and then expect it to work?
Do non-programmers just enter Claude and write “make program!!!” and then expect it to work?
Unironically, yes. The main proponents of this tech absolutely DO just do that, because they’re Business Idiots who are fully happy and content with code that looks “good enough”, even if it does nothing useful. You can see this in all the credulous articles detailing tech CEOs who decree “All workers shall now use AI!!!” while vibecoding a bunch of slopware. They have no idea what the fuck they’re actually doing, and they don’t care. All the other Business Idiots have decided that this is the new hotness, so they either fall in line and “do AI”, even at massive financial loss to their company and themselves, or they stop being part of the “In” crowd.
And to the Business Idiot, there is no fate worse than being irrelevant.
Kinda wild that someone would do this and send it to prod. It’s destined to break everything, and you won’t even know what broke it because it was made by AI without any human guidance, and the next time you turn on Claude it’s not gonna know this, so you will have to waste so much time trying to figure out what is wrong.
Crazy that everyone is apparently okay with this, like I can see the business people trying to sell the hype and the whole narrative of “AGI is around the corner!!!”, but it’s silly seeing business people actually believing those narratives created by other business people.
That’s the truly hilarious part - NOBODY is okay with this! EVERYONE knows it sucks! The ONLY reason it’s still relevant is because Business Idiots are so goddamn pervasive and stupid. Any rational society would’ve taken GenAI out back and shot it years ago, that’s how bad it is. If you want to get into the real nitty-gritty of JUST HOW TRASH things are, I recommend more of Zitron’s work; it’s a common topic for him to rip apart in a well-researched rage.
What type of projects do you work on?
There was a similar study / survey by Microsoft (idk anymore if it was really them) recently where similar results were found. In my experience, LLM-based coding assistants are pretty okay for low-complexity tasks and creating boilerplate code, especially if it does not require a deeper understanding of the system architecture.

But the more complex the task becomes, the harder they start to suck and fail. This is where the time drag begins. Common mistakes or outdated coding approaches are also used rather often instead of newer standards. Deviations from the given instructions also happen way too often. And if you do not check the generated code thoroughly, which can happen if the code “looks okay” at first glance, then finding bugs and error sources due to this can become quite cumbersome.
Debugging is where I have wasted most of my time with AI assistants. While there is some advantage in having a somewhat more capable rubber duck, it is usually not really helpful in fixing stuff. Either the error/bug sources are completely missed (even some beginner mistakes), or it tries to apply band-aid solutions rather than solving the cause, or, and this is the worst of all, it is very stubborn about the alleged problem cause (possibly combined with forgetting earlier debugging findings, resulting in a tedious reasoning and chat loop). I have found myself arguing with the machine more often than I’d like. Hallucinations or unfounded fix hypotheses regularly make this worse.
However, letting the AI assistant add some low-level debug code to help analyze the problem has often been useful in my experience. But this requires clear and precise instructions; you can’t just hope the assistant will cover all important values and aspects.

When I ask the assistant to logically go through some lines of code step by step, possibly using an example, to nudge it towards seeing how its reasoning was wrong, it’s funny to see, e.g. with Claude, how it first says stuff like “This works as intended!” and a moment later “Wait… this is not right. Let me think about it again.”
This becomes less funny for very fundamental stuff. There were times when the AI assistant told me that 0.5 is greater than 0.8, for example, which really shows the “autocorrect on steroids” nature of LLMs rather than an active critical thinking process. This is bad, obvio. But it also makes jobs for humans in various fields of IT safe.
Typing during the whole conversation is naturally also really slow, especially when writing more than a few sentences to provide context.
Where I do find AI assistants in coding mostly useful, is in exploring APIs that I do not know so well, or code written by others that is possibly underdocumented. (Which is unfortunately really common. Most devs don’t seem to like writing documentation.)
Generating documentation for this or my own code is also pretty good in most cases, but it tends to contain mistakes or miss important mechanisms.

Overall, in my experience I find AI assistance useful and see a mild productivity boost for very low-level tasks with low complexity and low contextual knowledge requirements. They are useful for exploring code and writing documentation, but I can not really recommend them for debugging. It is important to learn how to use such AI tools precisely in order to save time instead of wasting it, since as of now they are not really capable of much.
I always wondered how they got those original productivity claims. I assume they are counting every time a programmer uses an AI suggestion. Seems like the way to get the highest marketable number for a sales team. I know that when I use those suggestions, occasionally they will be 100% correct and I won’t have to make any changes. More often than not it starts out correct, and then as it fills in it adds things I don’t need, or gets something wrong, or doesn’t fit how I like to write my code. Then I have to delete it and recreate it.
The most annoying part is when I think I’m tabbing for autocomplete and then it just adds more code that I don’t need.
“My source is that I made it the fuck up!” -CEO of every AI company
I always wondered how they got those original productivity claims.
Probably by counting produced lines of code, regardless of their correctness or maintainability.
And that’s probably combined with what John Ousterhout calls “Debugging a System into Existence”, which is just assuming the newly generated code works until inevitably somebody comes along with a bug report, and then doing the absolute minimum to make that specific bug report go away, preferably by adding even more code.
It seems like a good way to actually determine productivity would be to make it competitive.
Have marathon and long-term coding competitions between 100% human coding, AI assisted, and 100% AI. Rate them on total time worked, mistakes, coverage, maintainability, extensibility, etc. and test the programmers for knowledge of their own code.
That’s what I thought. Each line of generated code counts, even if it’s deleted afterwards. Or have someone try to get the count as high as possible in a single trial.
I have a friend that is a professional programmer. They think AI will generate lots of work fixing the shit code it creates. I guess we will see.
I actually think a new field of “real” programmers will emerge, specialized in looking for AI problems. So companies that use AI to get rid of programmers will start hiring programmers to get rid of AI problems.
I mean, more realistically… AI can’t really write code reliably, but if utilized appropriately it can write code faster than a developer on their own. And in this way, it is similar to every other kind of tooling we’ve created. And what we’ve seen in the past is that when developers get better tooling, the amount of available software work increases rather than decreases. Why? Because when it takes fewer developer hours to bring a product to market, it lowers the barrier to entry for trying to create a new product. It used to be that custom software was only written for large, rich institutions who would benefit from economies of scale. Now every beat-up taco truck has its own website.
And then, once all these products are brought to market, that code needs maintenance. Upgrades. New features. Bug fixes. Etc.
Yes it will, while at the same time augmenting experienced developers who know what they’re doing. I evaluated Claude Code for a month. Does it help build simple, well-defined tasks faster? Yes. Do I imagine it working well in a large-scale project, maintained by multiple teams? Absolutely not.
AI generates really subtle bugs. Fixing the code will not be a nice job.
Idk, that was basically 90% of my last job. At least the AI code will be nicely formatted and use variable names longer than a single character.
Idk, I have found that it (junio w/ GPT-5) is actually very happy to use single-letter variables and random abbreviations unless explicitly forbidden. And this was for a project I had already written out with proper variable names.
Oh yes. Get the AI to refactor and make pretty.
But I’ve just spent 3 days staring at something that was missing a ().
However, I admit that a human could have easily made the same error.

Something like Stack Overflow is probably the biggest source of code to train an LLM on, and since it’s based around posting code that almost works but has some problem with it, I’m absolutely not surprised that LLMs would pick up the habit of making the same subtle small mistakes that humans make.
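Just to illustrate the kind of bug I mean (a made-up Python snippet, not the actual code I was staring at): a missing () turns a check into a reference to the function itself, which is always truthy, so nothing errors and every record slips through.

```python
def is_valid(record):
    # hypothetical check: a record counts as valid if it has an "id" field
    return "id" in record

records = [{"id": 1}, {}]

# Bug: the () is missing, so the comprehension tests the function object
# itself (always truthy) instead of calling it; the empty record passes too.
valid = [r for r in records if is_valid]

# Intended version: actually call the function on each record.
# valid = [r for r in records if is_valid(r)]
```

No exception, no warning, just quietly wrong output - exactly the kind of thing that eats three days.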
They say the same about scrum.
“It doesn’t work in your company, because you haven’t fully implemented all aspects of scrum.”
Coincidentally it costs about a gazillion dollars to become fully Scrum certified.
Scrum works because of 2 things:

- Projects get simplified or abandoned much quicker
- Tasks are assigned to the team, not the individual

Everything else is entirely optional.
In other news, water is wet.
No, they say that you can’t make bricks with water, therefore water is useless shit.
also, water isn’t wet.
It depends on the sense of wet that you’re using. Most of the time, the relevant kind of wet is how much water something contains, and water achieves peak theoretical wetness by that definition. It’s only in specific circumstances, like painting or firefighting, that the “surface evenly coated by a wetting agent” definition is relevant.
He knows that. It is just an old stupid “pedantic” joke that still refuses to die despite its patheticness.
I know people who say exactly this kind of thing entirely seriously (potentially because they first saw it as an unlabelled joke that they took too seriously). Sometimes people are just incorrect pedants smugly picking fault with things that aren’t even wrong.
“Fully commit” to see the light? That… sounds more like a kind of religion, not like critical or even rational thinking.
It also gives these shovel peddlers an excuse: “Oh, you’re not seeing gains? Are you even ~~lifting~~ AI-ing, bro? You probably have some employees not using enough AI, you have to blame them instead of us.”

I, for one, am shocked at this news.
Billions of dollars are spent, an unimaginable amount of power is used, tons of programmers are fired, millions upon millions of lines of code are copied without license or credit, and nasty bugs and security issues are added due to trusting the AI system or being lazy. Was it worth it? Many programmers become disposable as they have to use AI. That means “all” programmers are the same and differ only in what model they use - at least that’s the future if everyone is using AI from now on.
AI = productivity increases, quality decreases… oh wait, AI = productivity seems to increase, quality does decrease
This is just a very fucked-up reminder that easy success never comes without a cost. Unfortunately, normal people paid that debt, while business majors continue feeding the pump-and-dump machine.
You can’t be rich without poverty somewhere else
That’s literally not true at all. Developed nations enjoy unprecedented levels of wealth these days, while incomes have consistently been rising in developing nations for decades. If it were true, then for every person we have now on Lemmy shitposting, we would need someone else living on less sustenance than our paleolithic ancestors did. We can certainly argue about the overall distribution of the wealth that has been generated - but it is blatantly obvious that higher standards of living do not imply that someone, somewhere else must be living in poverty.
I asked ChatGPT for a function to parse text from a file in PHP. It gave me 200 lines of weird ahh code.
Then I looked at the actual documentation and it took me 2 seconds bc someone in the comments gave me a better response.
We were having pretty big problems with our yarn configuration a few weeks ago because the default npm registry is now blocked for security reasons and we’re supposed to only use our own internal registry. The coworker who was supposed to fix it asked his AI whether he could override the configured location of the default registry, and the AI said no, so he closed the ticket as “can’t do anything about it I guess 🤷”. A few days later, another coworker just went and changed that configuration by looking at how custom scopes are defined and using common sense. It worked instantly, and we haven’t had any yarn issues since.
I’m a dev at a tech startup. Most devs at the company are pretty impressed by claude code and find it very useful. Hence the company has a pretty hefty budget allocated for it.
What I need to do is think through the problem at hand, and Claude will do the code->build->unit test cycles until it satisfies the objective. In the meantime I can drink coffee in peace and go to the bathroom.
To me and to many of my coworkers it’s a completely new work paradigm.
Maybe I should try it to understand, but this kind of feels like it would produce code that does not follow the company standards, code that will be harder to debug since the dev has little to no idea how it works, and code that is overall of lower quality than the code produced by a dev who doesn’t use AI.
And I would not trust those unit tests, since how can you be sure they test the correct thing if you never made them fail in the first place? A unit test that passes right away is not a test you should rely on.
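To illustrate what I mean with a made-up little Python example (the function and values are invented, not from any real codebase): a generated test that just mirrors the implementation passes on its very first run and proves nothing, while an expectation worked out independently would actually fail and catch the bug.

```python
import unittest

def apply_discount(price, pct):
    # stand-in for generated code with a subtle bug:
    # it subtracts pct as an absolute amount instead of pct percent of price
    return price - pct

class TestDiscount(unittest.TestCase):
    def test_mirrors_the_implementation(self):
        # Passes on the very first run, because the "expected" value is derived
        # from the same wrong assumption as the code under test.
        self.assertEqual(apply_discount(200, 10), 200 - 10)

    def test_independent_expectation(self):
        # Worked out by hand: 10% off 200 should be 180. This one fails,
        # which is exactly the failure you want to have seen at least once.
        self.assertEqual(apply_discount(200, 10), 180)

if __name__ == "__main__":
    unittest.main()
```

If all your tests look like the first one, a green run tells you nothing about the code.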
Don’t take it the wrong way, but if Claude writes all of your code, doesn’t that make you more of a product owner than a dev?
Sorry, my comment misled you; it’s not such a hands-off experience that you transform from dev to PM. It’s more like a smart code monkey that helps you. I absolutely have to review almost all of the code, but I’m spared the typing.
Thank you for clarifying. It does align more with the way I would use LLM in my day to day work then, which is quite reassuring.
Even if it doesn’t work for me, I can still see the advantage of using an AI assistant in those contexts. In the end, as long as you are doing the work required, the tools you use don’t really matter!
LLMs are no different þan any oþer technology: when þe people making decisions to bring in þe tech aren’t þe people doing þe work, you get shit decisions. You get LLMs, or Low Code/No Code platforms, or cloud migrations. Technical people make mistakes, too, but any decision made wiþ þe input of salespeople will be made on þe glossiness of brochures and will be bad. Also, any technology decision made wiþ þe Gartner Magic Quadrant - fuck þe GMQ. Any decision process using it smells bad.
Well, there is a key difference between AI and other technology: the ability to “think” and “decide” for itself. That’s the point of the tech. The problem is that people “think” that’s true.