Off-and-on trying out an account over at @tal@oleo.cafe due to scraping bots bogging down lemmy.today to the point of near-unusability.

  • 1 Post
  • 143 Comments
Joined 2 years ago
Cake day: October 4th, 2023



  • It’s not, and I think that Excel is often used where other tools would be more appropriate because of existing expertise with Excel, but you don’t necessarily need to use a database for all tasks where a bunch of data gets stored.

    I have plenty of scripts that deal with large amounts of schlorped up data that just leave it in a text file, and Unix has a long and rich tradition and toolset for using text files for data storage and processing data in them in bulk.

    GNU R, a statistics package, has a lot of tools to schlorp up data from many sources, including scraping it from the web, and storing it in large data frames to be processed and maybe visualized. It’s probably rather more performant than databases for some kinds of bulk data processing.

    Okay, so…is it appropriate here?

    One thing that spreadsheets can be handy for is for making specialized calculators that plonk some data into some simple model and spit out a result. Having, say, the current temperature in a given city may be a perfectly reasonable input to make available to a spreadsheet, I think.
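
    As a rough illustration of the text-file approach mentioned above, here is a minimal Python sketch that aggregates tab-separated records without any database. The file layout (a city name and a temperature per line), the sample values, and the function name are all made up for the example.

```python
import io
from collections import defaultdict


def mean_by_key(lines):
    """Average the numeric second column per first-column key.

    Expects tab-separated "key<TAB>value" lines; blank lines and lines
    starting with '#' are skipped.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, value = line.split("\t")
        totals[key] += float(value)
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Hypothetical sample data; a real script would pass an open file instead.
sample = io.StringIO("# city\ttemp_c\nOslo\t4.0\nOslo\t6.0\nCairo\t30.0\n")
averages = mean_by_key(sample)
```

    Because the function only iterates over lines, it should handle files much larger than memory, which is a big part of why plain text plus a small script is often enough.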


  • I’m not familiar with FreshRSS, but assuming that there’s something in the protocol that lets a reader push up a “read” bit on a per-article basis (this page references a “GReader” API), I’d assume that that’d depend on the client, not the server.

    If the client attempts an update and fails and that causes it to not retry again later, then I imagine that it wouldn’t work. If it does retry until it sets the bit, I’d imagine that it does work. The FreshRSS server can’t really be a factor, because it won’t know whether the client has tried to talk to it when it’s off.

    EDIT: Some of the clients in the table on the page I linked to say that they “work offline”, so I assume that the developers at least have some level of disconnected operation in mind.

    The RSS readers I’ve always used are strictly pull. They don’t set bits on the server, and any “read” flag lives only on the client.
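
    To make the “retry until it sets the bit” case concrete, here is a minimal Python sketch of a client that queues read-state updates while the server is unreachable. It is not how any particular FreshRSS client works and it does not model the GReader API; `ReadStateQueue`, `push`, and the fake server are hypothetical names invented for the illustration.

```python
from collections import deque


class ReadStateQueue:
    """Client-side queue of "mark as read" updates awaiting the server."""

    def __init__(self, push):
        # `push` is a hypothetical callable that sends one article id to the
        # server and raises ConnectionError if the server is unreachable.
        self._push = push
        self._pending = deque()

    def mark_read(self, article_id):
        self._pending.append(article_id)
        self.flush()

    def flush(self):
        """Try to send pending updates; keep whatever fails for later."""
        while self._pending:
            try:
                self._push(self._pending[0])
            except ConnectionError:
                return False  # server is down; retry on the next flush
            self._pending.popleft()
        return True

    def pending(self):
        return list(self._pending)


# Demo with a fake server that is offline at first.
server_up = False
received = []


def fake_push(article_id):
    if not server_up:
        raise ConnectionError("server unreachable")
    received.append(article_id)


queue = ReadStateQueue(fake_push)
queue.mark_read("a1")
queue.mark_read("a2")
pending_while_down = queue.pending()
server_up = True
queue.flush()
```

    A client built this way ends up with the server in sync once it comes back. A client that drops the update on the first failure does not, and the server has no way to tell the difference.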


  • tal@lemmy.today to Selfhosted@lemmy.world, “LVM question” (English, 9 days ago)

    Secondly, is there a benefit to creating an LVM volume with a btrfs filesystem vs just letting btrfs handle it?

    Like, btrfs on top of LVM versus btrfs? Well, the former gives you access to LVM features. If you want to use lvmcache or something, you’d want it on LVM.


    1. The best engineers are obsessed with solving user problems.

    Ehh. Not sure I agree. I mean, I think that there is a valid insight that it’s important to keep track of what problem you’re actually trying to solve, and that that problem needs to translate to some real world, meaningful thing for a human.

    But I also think that there are projects that are large enough that it’s entirely reasonable to be a perfectly good engineer who isn’t dealing with users much at all, where you’re getting solid requirements that have been drawn up by someone else. If you’re trying to, say, improve the speed at which Zip data decompression happens, you probably don’t need to spend a lot of time going back to the original user problems. Maybe someone needs to do so, but that doesn’t need to be the focus of every engineer.

    1. Bias towards action. Ship. You can edit a bad page, but you can’t edit a blank one.

    I think I’d go with a more specific “It’s generally better to iterate”. Get something working, keep it working, and make incremental improvements.

    There are exceptions out there, but I think that they are rare.

    1. At scale, even your bugs have users.

    With enough users, every observable behavior becomes a dependency - regardless of what you promised. Someone is scraping your API, automating your quirks, caching your bugs.

    This creates a career-level insight: you can’t treat compatibility work as “maintenance” and new features as “real work.” Compatibility is product.

    This is one thing that I think that Microsoft has erred on in a number of cases. Like, a lot of the value in Windows to a user is a consistent workflow where they can use their existing expertise. People don’t generally want their workflow changed. Even if you can slightly improve a workflow, the re-learning cost is high. And people want to change their workflow on their own schedule, not to have things change underfoot. People don’t like being forced to change their workflow.

    The fastest way to learn something better is to try teaching it.

    I don’t know if it’s the fastest, but I do think that you often really discover how embarrassingly large the gaps in your own understanding are when you teach it.

    A little kid asking “why” can be a humbling experience.




  • From my /etc/resolv.conf on Debian trixie, which isn’t using openresolv:

    # Third party programs should typically not access this file directly, but only
    # through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
    # different way, replace this symlink by a static file or a different symlink.
    

    I mean, if you want to just write a static resolv.conf, I don’t think that you normally need to have it flagged immutable. You just put the text file you want in place of the symlink.
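
    For what it’s worth, swapping the symlink for a static file can be sketched in Python like this. The demo runs in a scratch directory rather than the real /etc, and the file names and the 192.0.2.x nameserver addresses (a documentation-only range) are placeholders I picked for the example.

```python
import os
import tempfile


def replace_with_static_file(path, contents):
    """Replace `path` (symlink or regular file) with a regular file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename it over
    # the old entry so there is never a moment with no file in place.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(contents)
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, path)


# Demo in a scratch directory rather than the real /etc.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "managed-resolv.conf")
link = os.path.join(workdir, "resolv.conf")
with open(target, "w") as f:
    f.write("nameserver 192.0.2.1\n")
os.symlink(target, link)
replace_with_static_file(link, "nameserver 192.0.2.53\n")
```

    The rename replaces the symlink itself, so the file it used to point at is left untouched. Whether some other service later overwrites the static file depends on what is managing DNS on the system.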


  • Also, when you talk about fsck, what could be good options for this to check the drive?

    I’ve never used proxmox, so I can’t advise how to do so via the UI it provides. As a general Linux approach, though, if you’re copying from a source Linux filesystem, it should be possible to unmount it — or boot from a live boot Linux CD, if that filesystem is required to run the system — and then just run fsck /dev/sda1 or whatever the filesystem device is.


  • I’d suspect that too. Try just reading from the source drive or just writing to the destination drive and see which causes the problems. Could also be a corrupt filesystem; probably not a bad idea to try to fsck it.

    IME, on a failing disk, you can get I/O blocking as the system retries, but it usually won’t freeze the system unless your swap partition/file is on that drive. Then, as soon as the kernel goes to pull something from swap on the failing drive, everything blocks. If you have a way to view the kernel log (e.g. you’re looking at a Linux console or have serial access or something else that keeps working), you’ll probably see kernel log messages. Might try swapoff -a before doing the rsync to disable swap.

    At first I was under suspicion was temperature.

    I’ve never had it happen, but it is possible for heat to cause issues for hard drives; I’m assuming that OP is checking CPU temperature. If you’ve ever copied the contents of a full disk, the case will tend to get pretty toasty. I don’t know if the firmware will slow down operation to keep temperature sane — all the rotational drives I’ve used in the past have had temperature sensors, so I’d think that it would. Could try aiming a fan at the things. I doubt that that’s it, though.
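
    If it helps to check the swap theory above before running swapoff -a, here is a small Python sketch that reads /proc/swaps-style text and reports whether any active swap area sits on a given disk. The function names and the sample text are mine, and the prefix match is a simplification: it will not catch a swap file living on a filesystem on that disk.

```python
def swap_devices(proc_swaps_text):
    """Return the Filename column from /proc/swaps-style text."""
    lines = proc_swaps_text.splitlines()
    # The first line is the column header; each later line is one swap area.
    return [line.split()[0] for line in lines[1:] if line.strip()]


def swap_on_disk(proc_swaps_text, disk):
    """True if any active swap area's path starts with `disk`."""
    return any(dev.startswith(disk) for dev in swap_devices(proc_swaps_text))


# Hypothetical sample in the layout the kernel uses for /proc/swaps.
sample = (
    "Filename\t\t\t\tType\t\tSize\t\tUsed\t\tPriority\n"
    "/dev/sdb2                               partition\t8388604\t\t0\t\t-2\n"
)
on_sdb = swap_on_disk(sample, "/dev/sdb")
on_sda = swap_on_disk(sample, "/dev/sda")
```

    On a live system you would pass in the contents of /proc/swaps instead of the sample string.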