So I had a micro PC that was running one of my core services and it only supports NVMe drives. Unfortunately, this little guy cooked itself and I’m not in a position to replace the drive. The system is still good and is fairly powerful, so I want to be able to reuse it.
I’m thinking I want to set up some kind of netboot appliance on another server so I can boot the system without ever having a local disk. One thing I want to do is run some Docker images (specifically Frigate), but I won’t be able to write anything to persistent storage locally. NFS shares are common in my setup.
Is it even possible to make a ‘gold image’ of a Docker host and have it netboot? I expect that memory limitations (16GB) will be my main issue, but I’m just trying to think of how to bring this system back into use. I have two NAS appliances that I can use for backend long-term storage (where I keep my Docker files and non-database files anyway), so it shouldn’t be too difficult to have some kind of easily editable storage solution. I don’t want to use USB drives as persistent storage due to lifespan concerns from using them in production environments.
I’d be pretty wary of using any system that “cooked” an NVMe. That’s not the sign of an actually healthy system.
Was the failure just heat damage?
I’m actually not 100% sure what killed the drive. It could have been an issue with the drive wearing out, but my services didn’t write much locally and it wasn’t super old, so I assume it’s a heat issue with a fanless micro system. I try to write everything important to my NASes so I don’t have to worry about random hardware failures, but this one didn’t have backups configured before it failed. Other than the drive issue, it’s been solid for 1.5-2 years of near constant uptime.
Unless you are writing petabytes, the NVMe did not just “wear” out. You probably shouldn’t do anything until you’ve figured out what caused this failure.
Consumer SSDs generally only have a 200-600 TBW rating, not petabytes. It’s pretty easy to wear one out in a few years when it’s installed in a server.
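For a rough sense of scale, here is a back-of-the-envelope sketch of how long a drive takes to reach its TBW rating. The daily write figures are made-up illustrations, not measurements from anyone's system in this thread, and the math ignores write amplification, so real wear could be faster.

```python
def years_until_tbw(tbw_rating_tb: float, writes_gb_per_day: float) -> float:
    """Years of service before cumulative host writes reach the TBW rating."""
    total_gb = tbw_rating_tb * 1000  # drive vendors rate endurance in decimal TB
    return total_gb / writes_gb_per_day / 365


if __name__ == "__main__":
    # Hypothetical workloads: light logging, steady recording, heavy recording.
    for rating_tb in (200, 600):
        for daily_gb in (20, 200, 1000):
            years = years_until_tbw(rating_tb, daily_gb)
            print(f"{rating_tb} TBW at {daily_gb} GB/day: {years:.1f} years")
```

Under those assumptions, a 200 TBW drive taking 200 GB/day of writes would hit its rating in under three years, while the same drive at 20 GB/day would last decades.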
Yeah, I didn’t think that was a realistic possibility. Given that it was a bitty fanless NUC-style system, I’m leaning more toward a heat death, as I originally surmised.
E: though another person suggested a Frigate misconfig could have worn the drive out early.
Is the drive totally dead? Curious what SMART would report.
My gut feeling is that it’s probably cheaper to buy a replacement M.2 than to spend the hours of time to get netboot working, but it could be a fun project!
I might be able to hook it up to a USB NVMe reader, but when I initially tried I barely got any recognition of the drive from the OS. My primary system is Windows, so I might get more info from one of my Linux systems, just haven’t had the fucks to give to the dead drive. As for a replacement drive, funds are scarce and time/learning is (comparatively) free. Someone else suggested Kubernetes, so I might look into that to see if it can accomplish what I’m looking for.
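If the drive does enumerate on one of the Linux boxes, something like the sketch below could pull the wear and temperature numbers out of SMART. It assumes smartmontools 7.0 or newer is installed (for JSON output), that it runs as root, and that the device path is `/dev/nvme0`, which is only a placeholder; behind a USB bridge the drive may show up under a different path and smartctl may need an extra `-d` device-type option.

```python
import json
import subprocess


def read_nvme_report(device: str) -> dict:
    """Run smartctl and return its JSON report for the given device."""
    # smartctl sets nonzero exit bits for warnings, so the exit code is not checked.
    proc = subprocess.run(
        ["smartctl", "--json", "-a", device],
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)


def summarize(report: dict) -> dict:
    """Pick out the NVMe health fields most relevant to wear vs. heat."""
    log = report.get("nvme_smart_health_information_log", {})
    units = log.get("data_units_written")
    return {
        "percentage_used": log.get("percentage_used"),
        "temperature_c": log.get("temperature"),
        "critical_warning": log.get("critical_warning"),
        "media_errors": log.get("media_errors"),
        # NVMe reports writes in units of 1000 * 512 bytes.
        "tb_written": None if units is None else units * 512_000 / 1e12,
    }


if __name__ == "__main__":
    # "/dev/nvme0" is an assumed path; adjust for the actual system.
    print(summarize(read_nvme_report("/dev/nvme0")))
```

A high `percentage_used` or a `tb_written` near the drive's TBW rating would point at wear; low values alongside temperature warnings would point more toward heat.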
Modern mini PCs often place the NVMe near other components that run hot, and that’s what kills NVMe drives, since they need to be cooled too. You can try adding cooling pads and small heatsinks here and there and isolating the components from each other, but many mini PCs have this flaw nowadays.
Yeah, pretty much what I guessed. The drive came with a cooling pad, but it didn’t do much at all.