I don’t bother to back up the media files that are served to devices. I have backups of the script that created the rip for each file, so I can just re-create it if need be. For the same reason, I just use sha1sum on the media files, and run a check each month to see if anything has changed. The scripts that create a file update the sha1sum automatically. The hardware RAID controller does the patrol read/consistency check.
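In case it helps anyone copy the idea, here's roughly what that monthly check looks like; the paths and the mail bit are just placeholders for however your own rip scripts lay things out:

```bash
# Build the manifest (my rip scripts add an entry for each new file they create)
sha1sum /srv/media/movies/*.mkv > /srv/media/movies/SHA1SUMS

# Monthly cron job: re-verify everything, only report mismatches or missing files
sha1sum --check --quiet /srv/media/movies/SHA1SUMS \
  || echo "media checksum mismatch on $(hostname)" | mail -s "NAS integrity check" root
```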
I do back up all the metadata (posters, .nfo files, etc.), though, because that’s really small.
Media isn’t usually worth going to massive trouble to preserve, although I am still rather dismayed at losing the original Beavis and Butthead collection with commentaries.
The problem does indeed seem to have occurred when Seagate started playing with platter sizes. The failed drives in my case had a three-year warranty, and they failed after around 28 months. I didn’t return the disks, however, as I suspected the shipping costs would outweigh the value of the replacements. I also have two Corsair DDR3 1600MHz DIMMs that I replaced three times under their lifetime warranty, until I realised that the replacements were duff as well.
Make sure that it’s a good controller, and not something like a FakeRAID.
As far as I’m concerned, a card isn’t a real RAID controller without a backup capacitor for the on-board cache RAM. Pretty much the minimum I use is an LSI 9265-8i, although my chassis that support 12Gbps SAS use more modern controllers, even though I don’t have any disks that support that speed yet.
I learned how to do this stuff by protecting petabytes of data…at that scale, you learn that defining what you really need to protect, and how much protection it needs is the key to being able to get the job done.
I can confirm most of what you state here, except that I don’t use LVM (in my experience it has a performance impact and makes filesystem management more complex). And regarding proprietary RAID hardware: even though I have a MicroServer Gen8, I use software RAID. If a hardware RAID controller dies and you can’t find a replacement for it, you’ll definitely need those backup disks.
And also - as you said - backups are essential! Never do without them.
PS: For backups you may want to take a look at “Borgbackup” … an amazing piece of software! For my media partitions I tend to use rsync + an external disk, but for everything else I use borgbackup!
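To make that concrete, here's a rough sketch of the split; the repo location, mount points and paths are all just examples:

```bash
# Media: plain rsync mirror to the external disk
rsync -aHv --delete /srv/media/ /mnt/ext/media/

# Everything else: deduplicated, pruned borg archives
borg init --encryption=repokey /mnt/backup/borg                              # once
borg create --stats /mnt/backup/borg::'{hostname}-{now}' /etc /home /srv/nextcloud
borg prune --keep-daily=7 --keep-weekly=4 --keep-monthly=6 /mnt/backup/borg
```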
Now you’ve got me worried. I have a 4x6TB RAID5 array in my QNAP and it’s nearly full, so I’m thinking of slowly replacing each 6TB with an 8TB disk. These are WD REDs and have been totally fine so far. I always spread out the buying to avoid getting disks from the same batch.
Can you explain why RAID5 is not recommended for disks of that size, please?
Linux software RAID has been as fast as HW RAID for a long time and it is still infinitely more flexible.
For versioned backups, I’ve been using rdiff-backup for about 8 yrs. Pretty happy with that. The interface is very much like rsync, but with versioning of ownership, permissions, ACLs, etc. Pull or push. Best to use Pull for better security.
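For anyone who hasn't seen it, the pull setup looks roughly like this with the classic command syntax (host and paths are made up):

```bash
# Pull over SSH: /backups/host1 holds the current mirror plus reverse increments
rdiff-backup user@host1::/home /backups/host1/home

# List increments, restore a file as it was 10 days ago, and expire old history
rdiff-backup --list-increments /backups/host1/home
rdiff-backup -r 10D /backups/host1/home/me/notes.txt /tmp/notes.txt
rdiff-backup --remove-older-than 60D /backups/host1/home
```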
I used to think that re-ripping from the original purchased media was a sufficient backup. Then I thought about all the TIME it took over the years and came to the conclusion that a few $150 disks sure would make recovery easier.
Sounds like we all have extensive experience doing these things, which has shaped our solutions. I’d rather have 2 cheap baskets for my eggs, than 1 super-safe basket. Mom was right.
So Sam, what sort of NAS device/advice were you thinking of? Leaning towards the $800 commercial units, or more in the sub-$200 range that can have disks added as needed? Internal disks or external? eSATA or USB3? What are you thinking? Got any requirements?
I can see why someone beginning a new backup solution would definitely want to take a look at borg. Last fall, I was having a discussion about it on the Ubuntu Forums, trying to compare it with rdiff-backup. Both sides were relatively happy with their chosen solution. The borg guys didn’t have numbers for how long backups took, or how much storage about 60 days of daily backups needed on average.
I know that rdiff-backup is like rsync for the first backup, then just 2-3 minutes for daily backups after that. 60 days of increments need about 1.2x the storage of the original data, as a rough estimate.
hard drive - RAID 5 with big SATA disks. Do or don't? - Server Fault - one of the storage vendors there ran a test failure of a RAID5 array built from 4TB disks. After 60 days of trying to rebuild the array following the failed disk, he stopped the test. I had a 4TB disk fail here about a year ago. I wasn’t using RAID5, and just restoring the data onto the replacement disk took 26 hrs on a quiet system. Use RAID1 or RAID10 on large disks.

There was a presentation about this at SELF in either 2017 or 2016; sorry, I don’t remember which. The video should be on youtube. I’ve been attending the SELF conference http://www.southeastlinuxfest.org/ for the last 5 years. Good group of folks, and pretty cheap for 3 days of knowledge from industry experts. Redhat is just down the road, so they send 15 of their experts. I learned a huge amount about Linux containers there - not just the how-to, but the reasons WHY you need to do some things and WHY doing it the way most docker users do it is bad for security.
The question I ask myself is whether rotating disks shouldn’t be replaced by SSDs.
Then again, losing data on an SSD is much more abrupt. If a rotating disk starts to fail, you usually get a smartd warning first, and the disk dies a while later. That has always given me time to prepare the replacement and make sure the last backups were OK, etc.
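For reference, the kind of early warning I mean comes from smartmontools; something like this, with the device name obviously depending on your box:

```bash
# One-off check: overall health plus the raw attribute table
smartctl -H -A /dev/sda

# Then let the smartd daemon watch it; example /etc/smartd.conf line
# (monitor everything, short self-test Saturdays at 03:00, mail root on trouble):
#   /dev/sda -a -o on -S on -s (S/../../6/03) -m root
```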
On an SSD, a failure usually involves the electronics. So when it fails, it fails hard and right now, with no warning. And you can’t really rely on SMART data the way we’re used to reading it on spinning disks.
Way more errors show up on an SSD than on a normal disk, all the time, but apparently that is normal. My desktop SSD is 6 years old; if its numbers came from a spinning disk, I would have written it off already!
What do you guys think, SSD or not SSD, for a NAS? (IMHO it should not be an issue, as the files don’t change that much.)
I wouldn’t use an SSD for a NAS even if I were forced to. SSDs are great for their read/write speeds, not for bulk storage.
And they are still expensive nowadays, so I would gladly spend 100 € on a 4TB HDD instead of buying a 512GB SSD.
Yeah, some hosting companies have started using SSDs, but mainly for small hosting plans where you install db-intensive but not particularly disk-hungry applications (WordPress). And they, of course, have the money to do so without thinking twice.
For years, I used a couple of Dlink DNS-323’s hacked with the funplug. When I needed to expand, I added a dual SATA enclosure to a Pi3 I was using to monitor my UPS, and it performs better than the Dlinks! With the new Pi3 B+ running Raspbian stretch, they are faster and more responsive. I have two 4 TB disks in a Vantec MX-400 enclosure.
My OSMC is on a Pi 3 also! They work great together.
Got a Gen8 MicroServer from HP running OMV from an SSD + 3 drives (with the option for a fourth).
I’m also using SnapRAID as a kind of failsafe/backup. I know RAID != backup, but I’m only talking about my movie collection here - nothing that can’t be recovered if everything failed (re-rip and re-download), and I don’t have the money for a real backup solution for that much data.
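In case it helps, a SnapRAID setup is roughly this simple; the mount points below are just placeholders for however your disks are laid out:

```bash
# /etc/snapraid.conf (example layout: one parity disk covering two data disks)
#   parity  /mnt/parity1/snapraid.parity
#   content /mnt/disk1/snapraid.content
#   content /mnt/disk2/snapraid.content
#   data d1 /mnt/disk1/
#   data d2 /mnt/disk2/
#   exclude *.tmp

snapraid sync         # recompute parity after adding/changing files
snapraid scrub -p 5   # verify a few percent of the array against parity
snapraid status       # overall health / how stale the parity is
```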
My Nextcloud instance and other important personal files are also “secured” this way, but additionally get a “real” backup copy onto an external drive every month.
I can really recommend OMV; all the basic NAS functions can be controlled through the web interface. If you want to do more, you may have to drop into the shell for some things (it’s a full Debian OS) or use a Docker container (which is also available through the web interface).
My “NAS” is currently running my home network shares, a Nextcloud instance, Let’s Encrypt certificate renewals, and two Docker containers for Home Assistant and an MQTT server.
RAID disk sets created by LSI RAID controllers are completely transportable to any other LSI controller that is at least as full featured (as far as RAID levels go). In addition, mdadm can also read the LSI array metadata and get you access to the data.
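Very roughly, the mdadm route looks like this, assuming the LSI set used on-disk metadata that mdadm understands (e.g. DDF); device names are hypothetical:

```bash
# Inspect the foreign RAID metadata on the member disks
mdadm --examine /dev/sd[bcd]

# Let mdadm assemble whatever arrays it recognises from that metadata; read-only is safer
mdadm --assemble --scan --readonly
cat /proc/mdstat
```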
I use SSD as cache for some LVM volumes, but I’m serving iSCSI that stores VM images, so random write is important. I don’t have any SSD cache backing the LVM where my media sits.
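For the curious, the lvmcache wiring is roughly this, assuming a VG called vg0, a slow LV vg0/iscsi, and the SSD already added to the VG (all names made up):

```bash
# Carve a cache pool out of the SSD and attach it to the slow LV
lvcreate --type cache-pool -L 100G -n iscsi_cache vg0 /dev/nvme0n1
lvconvert --type cache --cachepool vg0/iscsi_cache --cachemode writethrough vg0/iscsi

# Detach again later; dirty blocks are flushed back to the slow LV first
lvconvert --uncache vg0/iscsi
```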
SSDs are too expensive for media use, at least for me.
Also, SMART data only shows disk issues about 78% of the time, pre-failure, so count yourself lucky if you’ve experienced 100% pre-fail SMART warnings. I haven’t seen that personally or at an enterprise with over 50K servers.
Consumer SSDs aren’t made to be used 24/7/365 and MLC writes add up quicker than expected. SLC SSDs are too expensive for most consumers still.
After my MLC SSD failed last month (or was it 2 months ago?), I started suspending my laptop hoping I’d get more than 3 yrs of life. We shall see. The device started showing issues a year earlier - the file system would switch to read-only mode, but a reboot would remove the error and all was fine for another month or so. I checked the system logs and nothing related to storage ever showed up beyond the read failure that caused the read-only remount.
For an enterprise, just buy the enterprise 100k write cycle SSDs and they should last beyond the useful life of a server. In a business, that life is somewhere between 5-7 yrs. At home, I’ve been getting 10+ yrs from systems since the Core2Duo CPUs were released.
A few years ago, I noticed some of my spinning-rust drives were 7 yrs old and I freaked out - started replacing them ASAP. Then I relegated those old disks to temporary use in docking ports/stations. I have 6 Seagate 320G drives that were used in a RAID5 system for 7 yrs; each is still working, and last time I checked the SMART data, it was all 100% fine.
It is not the fatal warnings that I look at; it is the overall state that hdparm reports to me.
Currently, I have 8 disks from my old NAS; 6 show signs of aging and only 2 are still 100% OK.
The problem with disks is not while they are spinning. It tends to show when you shut the server down and power it back up: sometimes some of the disks won’t come back up. That is the real problem!
My desktop PC (Mac Mini 6,2: i7 CPU, 16GB RAM and a 250GB SSD) has been running for almost 6 years now.
But it runs under Linux, and I make sure that all temporary files and log files go to tmpfs (a RAM-backed temporary filesystem). That way the constant small writes don’t hit the drive itself.
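The fstab entries for that look roughly like this; the sizes are just what I happen to use, and remember anything in /var/log is gone after a reboot:

```
# /etc/fstab: RAM-backed mounts so the chatty writes never touch the SSD
tmpfs  /tmp      tmpfs  defaults,noatime,mode=1777,size=2G   0  0
tmpfs  /var/tmp  tmpfs  defaults,noatime,size=512M           0  0
tmpfs  /var/log  tmpfs  defaults,noatime,size=512M           0  0
```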
I do the same on my servers’ main drives, and so far no issues.
There is a lot one can do with SSDs when one knows what to do.
How do you guys set caching, hot-swap, performance, etc.? I usually turn everything off and tell the disks to spin down pretty quickly. But since I’m not on RAID, many of my disks don’t need to be accessed every day.
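For what it's worth, I do the spin-down per disk with hdparm; the device and timeout below are only examples:

```bash
# -S 242 = spin down after 60 minutes idle; -B 127 = APM level that still allows spin-down
hdparm -S 242 -B 127 /dev/sdb

# Check whether the disk is spinning without waking it up
hdparm -C /dev/sdb
```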
Can anyone explain this statement please? Is it purely down to the time to rebuild the array if a disk fails, and the inherent risk of another disk failing while doing so?
That is also my understanding. Because of the long rebuild time after a failed or replaced disk, the risk of another failure during that window can get unacceptably high. RAID6 is meanwhile often recommended by hardware vendors, but that is also a good marketing argument to sell more disks. RAID6 has a disadvantage for write I/O, since it requires more write activity than RAID5.
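Rough numbers behind that, assuming the commonly quoted consumer URE rate of one unrecoverable read error per 10^14 bits:

```bash
# Rebuilding a 4x6TB RAID5 after one failure means reading the 3 surviving disks in full:
#   3 x 6 TB = 1.44e14 bits, so at 1 URE per 1e14 bits you statistically expect ~1.4
#   unrecoverable read errors during the rebuild, any one of which can abort it on many controllers.
echo "scale=2; (3 * 6 * 8 * 10^12) / 10^14" | bc    # ~1.44 expected read errors
```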
They really aren’t significantly different with modern RAID controllers. Both suffer from the read-modify-write issue when a write isn’t aligned with the RAID stripe size, but writes can be done in parallel, so all disks get written at the same time. The extra time for parity computation on RAID6 is insignificant.
Any final difference can be mitigated with battery-backed cache RAM on the controller, plus controller-based SSD caching.
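For concreteness, the textbook small-random-write penalty (idealised, ignoring cache, so treat it as an upper bound on the difference; the disk count and per-disk IOPS are made up):

```bash
# RAID5 small random write: read old data + read old parity + write both = 4 I/Os
# RAID6 adds a second parity read and write                              = 6 I/Os
N=8; D=150    # hypothetical: 8 spindles at ~150 random IOPS each
echo "RAID5 random-write IOPS ~ $(( N * D / 4 ))"   # ~300
echo "RAID6 random-write IOPS ~ $(( N * D / 6 ))"   # ~200
```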
@nabsltd I don’t want to start a controversy, but RAID6 I/O overhead on mechanical disks can be crucial, especially if you’re interested in performance and your HDDs already operate near 50% utilization on RAID5. But you’re right: for normal consumers there is no need to run their disks at constant performance thresholds, and so RAID6 is the safer solution.