• 0 Posts
  • 138 Comments
Joined 2 years ago
Cake day: June 21st, 2023

  • Why not?

    Are you asking the author or people in general? If the author didn’t answer “why not” for you, then I can.

    Yes, I’ve used Claude. Let’s skip that part.

    If you don’t know how to write or identify defensive code, you can’t know if the LLM generated defensive code. So in order for an LLM to be trusted to generate defensive code, it needs to do so 100% of the time, or very close to that.

    You seem to be under the impression that Claude does so, but you presumably can tell if code is written with sufficient guards and tests. You know to ask the LLM to evaluate and revise the code. Someone without experience will not know to ask that.

    Speaking now from my experience, after using Claude for work to write tests, I came out of that project with no additional experience writing tests. I had to do another personal project after that to learn the testing library we used. Had that work project given me sufficient time to actually do the work, I’d have spent some time learning the testing library we used. That was unfortunately not the case.

    The tests Claude generated were too rigid. They didn’t test important functionality of the software. They tested exact inputs/outputs using localized output values, meaning that changing localizations was potentially enough to break tests. They tested cases that didn’t need to be tested, like whether certain dependency calls were done in a specific order (those calls were done in parallel anyway). Claude wrote some good tests, but also a lot of additional tests that weren’t needed, and it skipped some tests that were needed.
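To make "too rigid" concrete, here is a minimal sketch of the two patterns. Everything in it (`MESSAGES`, `cart_summary`, `refresh`, and the test names) is hypothetical and made up for illustration; it is not the code from that project.

```python
from unittest.mock import Mock, call

# Hypothetical localized strings and code under test (illustration only).
MESSAGES = {"en": {"total": "Total"}}

def cart_summary(prices, locale="en"):
    total = round(sum(prices), 2)
    label = f"{MESSAGES[locale]['total']}: {total:.2f}"
    return {"total": total, "label": label}

def refresh(cache, index):
    # These two calls are independent; their order is an implementation detail.
    cache.clear()
    index.rebuild()

def rigid_label_test():
    # Pins the exact localized string, so a wording change breaks the test.
    assert cart_summary([10.0, 2.5])["label"] == "Total: 12.50"

def robust_label_test():
    # Checks the behavior that matters: the computed total and its formatting.
    summary = cart_summary([10.0, 2.5])
    assert summary["total"] == 12.5
    assert "12.50" in summary["label"]

def rigid_order_test():
    # Pins the call order even though nothing depends on it.
    parent = Mock()
    refresh(parent.cache, parent.index)
    assert parent.mock_calls == [call.cache.clear(), call.index.rebuild()]

def robust_order_test():
    # Only requires that each dependency was called exactly once.
    cache, index = Mock(), Mock()
    refresh(cache, index)
    cache.clear.assert_called_once()
    index.rebuild.assert_called_once()
```

Changing `MESSAGES["en"]["total"]` to another word makes `rigid_label_test` fail while `robust_label_test` still passes, and swapping the two lines in `refresh` does the same to the order tests.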

    As a tool to help someone who already knows what they’re doing, it can be useful. It’s not a good tool for people who don’t know what they’re doing.



  • Mixins are composition! They don’t describe what a type is (“circle” is a “shape”, etc) but rather what they can do (“circle” can have its area calculated, it can be drawn, it can be serialized, etc). Mixins in Python just so happen to be implemented by adding base classes.

    Inheritance itself isn’t really a problem. It usually only matters when you have unnecessarily deep hierarchies, where a change in a base class can change functionality in dozens of classes in an unintentional way. Similarly, it can add complexity once the hierarchy is deep enough, but only really if you throw too much into the base classes.

    Python’s ABCs are more of interfaces though, which is why despite Python using base classes to “inherit” them, a lot of that is really composition (or putting a class together from parts) rather than inheriting and overriding implementation details from a parent/grandparent/etc type.
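A small sketch of that distinction, with made-up class names (`HasArea`, `JsonMixin`, `Circle`, `Invoice`) chosen only for illustration: the ABC states what a type can do, and the mixin bolts a capability onto otherwise unrelated classes.

```python
import json
import math
from abc import ABC, abstractmethod

class HasArea(ABC):
    """Interface-style ABC: declares a capability, carries no implementation."""

    @abstractmethod
    def area(self) -> float:
        ...

class JsonMixin:
    """Mixin: adds serialization to any class that stores plain attributes."""

    def to_json(self) -> str:
        return json.dumps(vars(self), sort_keys=True)

class Circle(HasArea, JsonMixin):
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2

class Invoice(JsonMixin):
    """Not a shape at all, but it can reuse the same serialization part."""

    def __init__(self, number: str, amount: float) -> None:
        self.number = number
        self.amount = amount

print(Circle(1.0).to_json())
print(Invoice("A-1", 9.5).to_json())
```

`Circle` and `Invoice` share no "is a" relationship; they are just assembled from the same part, which is why this reads more like composition than a deep hierarchy.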




  • I miss the days when it was simpler as well. Back before there were botnets with hundreds of thousands of compromised routers across several countries that could send tens of terabytes per second of data to your server for a sustained period of time. Back before there were thousands of bots crawling every IP and domain imaginable for exposed, abusable ports and wp-admin endpoints. Back before people started to compete in how many 9s of uptime they supported (before killing that all with LLMs anyway).

    Sadly, we can’t go back to those times. Doing so with a production service would not end well.

    The issue is not npm. Npm is a solution to a problem, even if it isn’t perfect.

    The issue is we live in a different landscape.

    Eclipse was great, having used it in the past, but its features are not exclusive to Eclipse. I can do the same inlining and extracting of code in vscode with code actions. The compile times weren’t seconds for me in the past, but they are for me now. Vite helps that even more (though that’s comparing JS to Java).


  • I agree in general with the list, but there is some stuff I disagree with still. For example, the very first section: “Work on more than one thing”.

    “Like a CPU thread, if you’re responsible for multiple streams of work, you can deal with one stream getting blocked by rolling onto another one.”

    This is written from the perspective of the developer, not the stakeholders. Compared to a CPU, you are a single thread. You cannot work on two things at the same time. What this is referring to is not parallelism, but a form of concurrency. Like a CPU thread, when two tasks are being executed concurrently, one task is always blocked. This means that while you, the developer, are always working, you also are always blocking at least one task, meaning you are also always blocked on at least one task.

    Instead of working on two tasks at once, pick up the second task only when the first becomes blocked.
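As a toy model of that rule, here is a sketch of a single worker that stays on its current task and only switches when the task reports that it is blocked. The names `task` and `run_single_worker` are invented for this example, and it assumes a blocked task is unblocked again by the time the worker comes back to it.

```python
from collections import deque

def task(name, steps):
    """Hypothetical task: each step is either "work" or "blocked"."""
    for step in steps:
        yield name, step

def run_single_worker(tasks):
    """Stay on the current task; switch only when it reports "blocked"."""
    queue = deque(tasks)
    log = []
    while queue:
        current = queue.popleft()
        for name, step in current:
            log.append((name, step))
            if step == "blocked":
                # Park the task at the back and pick up the next one.
                queue.append(current)
                break
        # A task that runs to completion is simply not re-queued.
    return log

log = run_single_worker([
    task("A", ["work", "blocked", "work"]),
    task("B", ["work", "work"]),
])
for entry in log:
    print(entry)
```

Task B is only touched after A blocks, and A is finished once B is done, so at no point is the worker juggling two active tasks.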

    I believe this might be what the author was trying to convey, but the title, some wording in the section, and the bullet point at the end (“Working on at least two things at a time, so when one gets blocked you can switch to the other”) contradict that and give the impression that you should always be working on two or more things at a time.

    “use as normal a developer stack as possible.”

    This, I mostly agree with, but I disagree with the wording. You should be using the same tools as the rest of your team when the tool matters. However, using different Git interfaces shouldn’t matter. I’d argue the same holds true for editors as long as the editors all have the features needed for the project.

    For application work, some variety in dev environments can help you find bugs sooner even. Using different environments for development lets you test different environments naturally. For services, this is less relevant.


  • This is a super interesting approach to JS. Conceptually, it’s really cool. In practice, I don’t think I’d do it (at least for any projects I can think of) because explaining it to others would be difficult and representing complex logic as “commands” sounds a bit difficult.

    In a weird way, it reminds me of actor frameworks though. The difference is of course the separation of effects.

    One thing I wish the author would have done, though, is add some type hints. I know it’s about JS, but even some jsdoc types would have helped. It was a bit hard to know at first what the input types were to these functions.



  • Yep. This was the difference between a silent, recoverable error and a loud failure.

    It seems like they’re planning to remove all potential panics based on the end of their article. This would be a good idea considering the scale of the service’s usage.

    (Also, for anyone who’s not reading the article, the unwrap caused the service to crash, but wasn’t the source of the issues to begin with. It was just what toppled over first.)



  • “monitoring how they are used is good to identify if people are actually more productive with it”

    Unfortunately, many jobs skipped this step. The marketing on AI tools should be illegal.

    Far too many CEOs are promised that their employees will do more with less, so of course they give their employees more to do and make them use AI, then fire employees because the remaining ones are supposed to be more productive.

    Some are. Many aren’t.

    Like your comparison, the issue is that it’s not the right tool for every job, nor is it the right tool for everyone. (Whether it’s the right tool for anyone is another question of course, but some people feel more productive with it at times, so I’ll just leave it at that.)

    Anyway, I’m fortunate enough to be in a position where AI is only strongly encouraged, but not forced. My friend was not though. Then he used it because he had to, despite it being useless to him. Then he, a chunk of his management chain, and half his department were fired. Nobody was hired to replace them.





  • Is this your first time here?

    Your account is brand new and you’ve already made three posts related to JPlus in this community in one day. Please tell me you’re joking with this one.

    This post is a GitHub link to the project. Cool, I love seeing new projects, especially when the goal is to make it harder to write buggy code.

    The other post is an article that immediately links to the GitHub. The GitHub contains a link at the top to, from what I can tell, the exact same article. Both the article and the GitHub README explain what JPlus is and how to use it.

    Why is this two posts when they contain the same information and link to each other directly at the top?





  • The conclusion of this experiment is objectively wrong when generalized. At work, to my disappointment, we have been trying for years to make this work, and it has been failure after failure (and I wish we’d just stop, but eventually we moved to more useful stuff like building tools adjacent to the problem, which is honestly the only reason I stuck around).

    There are several reasons why this approach cannot succeed:

    1. The outputs of LLMs are nondeterministic. Most problems require determinism. For example, REST API standards require idempotency from some kinds of requests, and an LLM that isn’t run with both a fixed seed and a temperature of 0 will return different responses at least some of the time.
    2. Most real-world problems are not simple input-output machines. When calling, let’s say for example, an API to post a message to Lemmy, that endpoint does a lot of work. It needs to store the message in the database, federate the message, and verify that the message is safe. It also needs to validate the user’s credentials before all of this, and it needs to record telemetry for observability purposes. LLMs are not able to do all this. They might, if you’re really lucky, be able to generate code that does this, but a single LLM call can’t do it by itself.
    3. Some real-world problems operate on unbounded input sizes. Context sizes are constrained and, as currently designed, cannot handle unbounded inputs. See signal processing for an example of this: it is a problem an LLM cannot solve because it cannot even receive the input.
    4. LLM outputs cannot be deterministically improved. You can make changes to prompts and so on but the output will not monotonically improve when doing this. Improving one result often means sacrificing another result.
    5. The kinds of models you want to run are not in your control. Using Claude? OK, Anthropic updated the model, and now your outputs have all changed and you need to update your prompts again. This fucked us over many times.
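To illustrate the first point in the list, here is a toy sampler, not a real LLM. The logits and the function names `sample_token` and `generate` are assumptions made up for this sketch; it only models the sampling step, where temperature and seed decide whether two runs agree.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from hypothetical logits."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=logits.__getitem__)
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    # Softmax weights, shifted by the peak for numerical stability.
    weights = [math.exp(value - peak) for value in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def generate(logits, temperature, seed=None, length=20):
    """Sample `length` tokens; seed=None means a fresh, unpredictable stream."""
    rng = random.Random(seed)
    return [sample_token(logits, temperature, rng) for _ in range(length)]

LOGITS = [2.0, 1.5, 0.5]  # made-up scores for three tokens

print(generate(LOGITS, temperature=0))             # same output on every run
print(generate(LOGITS, temperature=1.0, seed=42))  # repeatable for this seed
print(generate(LOGITS, temperature=1.0))           # may differ between runs
```

Only the first two calls are safe to rely on for something like an idempotent request, and that is before a provider swaps the model underneath you (point 5).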

    The list keeps going on. My suggestion? Just don’t. You’ll spend less time implementing the thing than trying to get an LLM to do it. You’ll save operating expenses. You’ll be less of an asshole.