Since Linux 4.20, there is a new metric under the hood: Pressure Stall Information (PSI). It currently exposes three critical metrics from the kernel: CPU, IO and memory pressure. You can read the values from the files under /proc/pressure. I am not going to explain it in detail, since this page already does that pretty well. I've found it much more useful than the plain load average because of its granularity.
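To give an idea of what reading those files looks like, here is a minimal sketch. It assumes the usual line format of the PSI files ("some avg10=... avg60=... avg300=... total=..."), and the helper name psi_avg10 is made up for illustration.

```shell
# psi_avg10 FILE KIND: print the 10-second average from the "some" or
# "full" line of a PSI file. (Hypothetical helper, not a standard tool.)
psi_avg10() {
    awk -v kind="$2" '$1 == kind { sub(/^avg10=/, "", $2); print $2 }' "$1"
}

for res in cpu io memory; do
    f="/proc/pressure/$res"
    # The files are missing on kernels built or booted without PSI.
    [ -r "$f" ] || continue
    echo "$res some avg10: $(psi_avg10 "$f" some)"
done
```

The "some" line counts time in which at least one task was stalled on the resource, while "full" counts time in which all non-idle tasks were stalled at once.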
A user gets a warning about a disk, sees that it is full and starts to investigate. After removing a few GB, there is a little problem: the df and du commands give conflicting output. Where is my free space? A Google search turns up people saying "check out lsof | grep deleted", and the user realizes this really is a thing on Linux: a process keeps a file's blocks allocated for as long as it holds the file open, even after the file has been deleted. The user restarts the process, and everything seems fine now.
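If lsof isn't installed, the same information can be dug out of /proc directly. This is only a rough sketch, assuming a Linux-style /proc where the fd symlinks of unlinked files end in " (deleted)"; the function name list_deleted_open is hypothetical.

```shell
# list_deleted_open [PROC_ROOT]: print "PID FD TARGET" for every open
# file descriptor whose target has already been unlinked.
list_deleted_open() {
    root="${1:-/proc}"
    for fd in "$root"/[0-9]*/fd/*; do
        # Unreadable or vanished entries are simply skipped.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *' (deleted)')
                pid=${fd#"$root"/}
                pid=${pid%%/*}
                echo "$pid ${fd##*/} $target"
                ;;
        esac
    done
}

list_deleted_open
```

Run it as root to see the descriptors of other users' processes as well.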
Scripting across different Unices is a bit painful, especially if you have a completely mixed environment with plenty of obsolete systems in it. I was trying to get the MD5 sum of something quickly. Here is the result: Sure, you can use the md5sum command on every system if you install it, but this seems like the most painless way. It still fails on some old Solaris releases (like 8/9), but luckily I can skip those :)
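The original snippet is not reproduced on this page, so what follows is not the author's script, only a sketch of the general idea: try whichever MD5 tool the system happens to have. It assumes a POSIX shell with "command -v" (which very old Bourne shells lack), and the function name portable_md5 is made up.

```shell
# portable_md5 FILE: print the MD5 hex digest of FILE using the first
# tool found on this system. (Hypothetical helper.)
portable_md5() {
    if command -v md5sum >/dev/null 2>&1; then
        # GNU coreutils; reading stdin keeps the file name out of the output.
        md5sum < "$1" | awk '{ print $1 }'
    elif command -v md5 >/dev/null 2>&1; then
        # BSD and macOS; -q prints only the digest.
        md5 -q "$1"
    elif command -v openssl >/dev/null 2>&1; then
        # Output is "(stdin)= <digest>" or just the digest, so take the last field.
        openssl md5 < "$1" | awk '{ print $NF }'
    elif command -v digest >/dev/null 2>&1; then
        # Solaris 10 and later.
        digest -a md5 "$1"
    else
        echo "no MD5 tool found" >&2
        return 1
    fi
}
```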
I've discovered that systemd added a container utility called systemd-nspawn. It's basically chroot on steroids. (No, don't think Docker.) I decided to give it a shot even though they don't consider it stable yet. I tried to implement a bit of encryption. Data is normally a sitting duck on unencrypted bare-metal servers (mainly because encryption seems hard, or because you trust your data center & country). If someone reboots the server and adds "rescue" to the GRUB kernel line, (s)he will get a root prompt, and it's goodbye to sensitive personal/commercial info!
I couldn't find an easy & reliable way to check whether memcached is running (for HA tools). The check scripts on the net generally stall when memcached hits its maximum number of connections. So I wrote a dirty script that times out after 1 second on an unsuccessful try. It's ugly, but you get the idea. Why the if? Because my version of netcat wasn't exiting with the correct error status on timeout for some reason.
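The script itself is not shown on this page, so here is only a rough sketch of that kind of check, not the original. It assumes a netcat that supports the -w timeout option and relies on memcached answering the text-protocol "version" command with a line starting with "VERSION"; the function name check_memcached and the default host and port are assumptions.

```shell
# check_memcached [HOST] [PORT]: succeed only if something that talks the
# memcached text protocol answers within about one second.
check_memcached() {
    host="${1:-127.0.0.1}"
    port="${2:-11211}"
    # netcat's exit status is deliberately ignored; only the reply is judged.
    reply=$(printf 'version\r\nquit\r\n' | nc -w 1 "$host" "$port" 2>/dev/null) || true
    case "$reply" in
        VERSION*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage, e.g. from an HA health check:
#   check_memcached 127.0.0.1 11211 || echo "memcached is not answering"
```

Judging the reply text instead of the exit code sidesteps netcat builds that report success even after a timeout.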
Ext3 allows only about 32K subdirectories inside a single directory, and I needed to create ~180K. Crap, I need another filesystem, which might be ext4, xfs or something like it. But I can't shut down a production server or plug in another disk. So I created one file which is big enough for my needs (let's say 100G):
    truncate -s 100G my_ext4_blockfile
Let's format the file:
    mkfs.ext4 my_ext4_blockfile
Now, you can mount the file wherever you like:
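The mount step could look roughly like this. It is a minimal sketch, assuming root privileges and a loop-capable mount; the mount point /mnt/ext4_in_a_file and the helper name mount_blockfile are made up for illustration.

```shell
# Hypothetical mount point; any empty directory will do.
MOUNT_POINT=/mnt/ext4_in_a_file

# mount_blockfile IMAGE DIR: loop-mount a filesystem image on DIR.
mount_blockfile() {
    [ -f "$1" ] || { echo "no such image file: $1" >&2; return 1; }
    mkdir -p "$2" && mount -o loop "$1" "$2"
}

# Usage (as root):
#   mount_blockfile my_ext4_blockfile "$MOUNT_POINT"
#   df -h "$MOUNT_POINT"
```

Because truncate creates a sparse file, the image only consumes real disk space as data is written into the mounted filesystem.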