Data Loss, Data Recovery, Backup and LTO Tape

onecircles
Posts: 333
Joined: Wed Mar 26, 2014 7:07 am

Post by onecircles »

Hello everybody, I've been away for a while because I suffered a data loss. I wanted to share what I learned over the course of this process. TL;DR: I lost my data, got it back, and now I can return to BUZZING. WOO!

Data Loss -

RAID -

I believed that I was doing my best to prevent a data loss. My data was originally on a mirrored RAID 1 array across two physical drives in my system. I ran into problems with this long ago, when the onboard RAID controller on my ITX motherboard gave me issues that prevented the computer from booting and forced me to reinstall my OS. This was a separate issue that did not affect my data, but more than a year ago it did make me reinstall and reconfigure Buzz, including all my MIDI mappings.

I admit that part of the reason I decided to use a RAID array, although I knew there were some drawbacks, was a sort of 'cool factor'. I just wanted to do it because it seemed like a cool system. I'm now of the opinion that you should not use a RAID array with any mission-critical data unless you have a dedicated, highly reliable RAID card. This is not possible in my system because it is an ITX system and my PCIe slot is occupied by my Lynx interface. I never had to rebuild a broken RAID array, but I get the impression that it's the kind of thing that can really make you tear your hair out, and that it is sometimes impossible.

The only advantage of a RAID 1 (mirroring) setup over a single drive is redundancy against hardware failure. If you delete your data through human error, or if the data is somehow corrupted as you are saving it, the deletion or corruption is written to both drives. It DOES, however, protect you from a single drive failure.

Sync -

After running into issues with the onboard RAID controller on my Asus Z97I-Plus motherboard, I decided to move to a sync regimen rather than a RAID setup. The advantage of a sync regimen over RAID is that it gives you the opportunity to notice a problem if it arises and then work to fix it. If something goes wrong in a RAID setup, the fault will propagate across all disks simultaneously. If you develop a problem in a sync setup, hopefully you still have all your data on the second drive and can fix things from there.

I decided to use a utility from Microsoft called SyncToy. I had good results for a time, but later noticed that my backup drive had a different amount of free space from my primary drive. The backup was not the same size as the original, so I could see that there was a problem. The tool was set up to sync all changes from the first drive to the second drive, so this indicated a problem with the sync program itself. I decided to remedy the problem by deleting the contents of the backup drive and then re-syncing the first drive to the second drive. Because of the way that SyncToy works, this was an error on my part. I deleted the backup drive, and when I synced I expected the primary drive to sync to the backup drive, but instead the backup drive synced to the primary drive, causing all my data to be deleted.

I wanted this kind of relationship
primary drive ---> backup drive

But what I had was this relationship
primary drive <---> backup drive

A change made on either copy was synced to the other copy. I deleted the contents of the second drive, intending to re-sync the first drive to the second so that the copy would be identical to the original. Instead, the deletion was synced back and the primary drive was erased. This was human error.
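To make the difference concrete, here is a minimal sketch of the one-way relationship (primary ---> backup) in Python. It is not how SyncToy works internally; the function name `mirror_one_way` and the "copy if missing, different size, or older" rule are my own assumptions for illustration. The key property is that it only ever reads from the primary and only ever writes to the backup.

```python
import shutil
from pathlib import Path


def mirror_one_way(primary: Path, backup: Path) -> list[Path]:
    """Copy new or changed files from primary to backup. Never modifies primary."""
    copied = []
    for src in sorted(primary.rglob("*")):
        if not src.is_file():
            continue
        dst = backup / src.relative_to(primary)
        # Copy when the backup file is missing, has a different size, or is older.
        if (not dst.exists()
                or dst.stat().st_size != src.stat().st_size
                or dst.stat().st_mtime < src.stat().st_mtime):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also copies timestamps
            copied.append(dst)
    return copied
```

With this kind of tool, emptying the backup drive and running it again simply refills the backup. Nothing that happens on the backup side can delete anything on the primary. Note that this sketch never removes files from the backup either, so stale files there would have to be cleaned up by hand.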

If I had used a reliable file copy utility like shadowcopy or TeraCopy rather than a sync program, I would not have had this issue. I would have deleted the contents of the second drive, copied the first drive to the second, and everything would have gone as I expected.

When copying large amounts of data, never ever use the built-in Windows file copy. If it fails while copying one file, the whole transfer is interrupted. Files can fail to copy for a variety of reasons, including having too long a file name. The Windows file copy utility is completely unreliable for large amounts of data. I have lately been using TeraCopy and have had great results.
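The behaviour I want from a bulk copy can be sketched in a few lines of Python: try every file, and if one fails, write it down and carry on. This is only an illustration of the idea, not what TeraCopy does internally; `copy_tree_tolerant` is a made-up name.

```python
import shutil
from pathlib import Path


def copy_tree_tolerant(src_root: Path, dst_root: Path) -> list[tuple[Path, str]]:
    """Copy every file under src_root; record failures instead of aborting."""
    failures = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        try:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
        except OSError as exc:
            # e.g. path too long, permission denied, read error
            failures.append((src, str(exc)))
    return failures
```

The returned list tells you exactly which files need attention afterwards, while everything else has already been copied.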

Data Recovery -

When files are 'deleted', as I understand it, all that happens is that the file system's records for those files (the 'headers') are marked as containing no data. Unless you do a true forensic delete, where every location on the disk is overwritten with new data, nothing is actually lost. So if you have an appropriate program, it is possible to reconstruct those records and regain the files.

If you suffer a data loss, it is imperative that you stop using that drive. The data is still present on the drive, but if you add any new files to it, they will be written into the 'empty' space that contains the files that you lost. Additionally, before attempting a data recovery it is a good idea to make a complete copy of the drive. You must make a 'forensic copy', which is a copy that images the entire contents of the drive, including 'empty' space. I used Macrium Reflect to do this. It is a longer process, but if the file recovery goes wrong, or if the drive itself suffers a hardware failure during the process, a forensic copy will allow you to restore the state of the drive, or copy it to a new drive and attempt the data recovery from there.
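For anyone curious what 'imaging the entire contents' means in practice, here is a rough Python sketch: read the source from the first byte to the last and write it all to an image file, free space included, with a checksum so the image can be verified later. This is not a replacement for Macrium Reflect. The name `image_device` is made up, the source path is an assumption (something like `/dev/sdb` on Linux or `\\.\PhysicalDrive1` on Windows, opened with administrator rights), and the sketch does not handle bad sectors.

```python
import hashlib


def image_device(source: str, image_path: str, chunk_size: int = 4 * 1024 * 1024) -> str:
    """Copy every byte of source into image_path; return the SHA-256 of the data."""
    digest = hashlib.sha256()
    with open(source, "rb") as src, open(image_path, "wb") as img:
        while True:
            chunk = src.read(chunk_size)  # sequential read, 4 MiB at a time
            if not chunk:
                break
            img.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()
```

The image must be written to a different drive from the one being recovered, otherwise it would overwrite the very data you are trying to save.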

Data Recovery Software -

I wanted to save money (of course) so I tried all the free software first. None of the free programs found my files. Many of the paid programs allow you to install the software and do a scan, but you must purchase the software before you can attempt the data recovery. I tried all the major offerings on the market. The process took a few weeks in itself. In the end the software that gave me the best results was Stellar Data Recovery, which is also sold as OnTrack Data Recovery. If you purchase the software from Stellar, which is the company that created the software, you have the option of purchasing a lifetime license. Many of the commercial offerings only allow a 1-year license. I decided that if I was going to invest in some data recovery software, I wanted to be able to use it forever. So this is another reason I chose Stellar Data Recovery.

The scans take a tremendous amount of time, about 9 hours per scan for my hard drives, but they really find everything that can be found. The recovery process itself takes even longer. In my case I believe it took 3 or 4 days of uninterrupted CPU time.

It turned out to be a very good thing that I had two copies of the data that were deleted in different ways. One drive was formatted, and one drive had all its files deleted by SyncToy. The files that were recovered were different on each drive, but across both drives I was able to recover all my data.

LTO Tape -

A proper backup should not be in the same physical location as the original. I needed a new system that would prevent this problem happening again. I also needed space to manage all the drive images, and incremental states of the drives as my recovery project progressed. I did some research and learned about LTO (Linear Tape Open) tape.

I didn't know it, but magnetic data tape has been undergoing constant development alongside CD, DVD, hard drives and all the other data storage technologies that we are more familiar with. It's really become an amazing technology. The current generation of LTO tape (LTO-8) can hold up to 30 TB per tape with compression (12 TB native)! It offers the best price per GB of any storage technology, and each new generation of LTO tape has roughly DOUBLE the capacity of the previous generation. LTO drives are backwards compatible: drives up to LTO-7 can read tapes from two generations back, while LTO-8 drives only go back one generation. So an LTO-5 tape can be read by an LTO-7 tape drive, but not by an LTO-8 drive.

The advantages are -

Low cost per GB.
Highly reliable enterprise-level solution.
Extremely fast data transfers. The interface runs at 8 Gbit/s for Fibre Channel LTO-5 drives and 6 Gbit/s for SAS LTO-5 drives. In my case, I can write to the drive at 90 megabytes per second.
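To put the 90 megabytes per second figure in perspective, here is a quick back-of-the-envelope calculation in Python. The 1.5 TB figure is the native (uncompressed) capacity of an LTO-5 tape; that the tape in question is LTO-5 is an assumption for the example, and `hours_to_write` is just a helper name I made up.

```python
def hours_to_write(capacity_bytes: float, rate_bytes_per_second: float) -> float:
    """Hours needed to stream capacity_bytes at a constant sequential rate."""
    return capacity_bytes / rate_bytes_per_second / 3600


LTO5_NATIVE_BYTES = 1.5e12   # 1.5 TB native capacity of one LTO-5 tape
MY_WRITE_RATE = 90e6         # 90 MB/s, the rate quoted above

print(round(hours_to_write(LTO5_NATIVE_BYTES, MY_WRITE_RATE), 2))  # prints 4.63
```

So at that rate a full tape takes a bit under five hours of uninterrupted sequential writing.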

The disadvantages are -

Extremely slow random reads and writes: If you are doing any operation with an LTO tape that requires it to write data to a large number of locations on the tape, it literally has to rewind and fast forward over and over again in a process that can take days or WEEKS! You can avoid this by only writing and reading sequential files.

Specialized hardware required: SAS, Fibre Channel or Thunderbolt.

Proprietary software: LTO tape is not a commonly used technology. Software can be difficult to acquire, and in some cases, free software is not available.
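On the slow random access point above: one common way to keep everything sequential is to bundle a folder into a single archive file on a hard drive first, and then write that one file to the tape in a single pass. Here is a minimal Python sketch using the standard `tarfile` module. The function name `bundle_for_tape` is made up, and the step of actually writing the archive to the tape depends on your tape software, so it is not shown.

```python
import tarfile
from pathlib import Path


def bundle_for_tape(source_dir: Path, archive_path: Path) -> int:
    """Pack source_dir into one uncompressed tar file; return the number of files added."""
    count = 0
    # archive_path should live outside source_dir so the archive does not include itself.
    with tarfile.open(archive_path, "w") as tar:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                tar.add(path, arcname=path.relative_to(source_dir).as_posix())
                count += 1
    return count
```

Half a million small files then become one large file, which the drive can stream from start to finish without rewinding.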

The cheapest drives available are FC (Fibre Channel) drives. These drives are capable of extremely fast transfer speeds, but they require a fiber optic SAN (Storage Area Network), which means fiber optic cards and a fiber optic switch as well as basic knowledge of fiber optic networking, which is a whole different beast from traditional Ethernet networking. One intriguing possibility in this space is the LTO tape library, an automated system that stores a large number of LTO tapes and treats them as one large storage unit. These require a fiber optic SAN and archiving software like Veeam. These systems can have truly MASSIVE amounts of storage, easily in the hundreds of terabytes.

If you decide to try LTO tape, you should get a SAS tape drive. This will require a SAS card, and SAS cables, but if you have the proper hardware and drivers, it's as easy as setting up a typical DVD drive. It's also very fast, assuming you get a decent SAS interface.

The final step -

Since my data was in two places, deleted in two different ways, after I recovered the data I wanted to do a byte-level compare of the two sets of data and then merge them into a single data set containing everything. The only software I found that could do this is called 'Beyond Compare'. This turned out to be a very important step. There's no way I could have done it manually. I have about 500,000 files. After running the comparison, I saw that, for whatever reason, some files were recoverable on one drive and some were recoverable on the other. By having my data in two places, doing a byte-level compare and then merging the data sets while keeping all differences, I was able to save everything. I also discovered a small amount of data corruption across the two data sets, but in every such case I have both versions of the file, maximizing my chances of having a working file. In most cases the corrupted files (files that show byte-level differences across the two copies) are actually still functional, but in some cases they are truly corrupted, and only one copy is functional.
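The compare-and-merge idea can be sketched in Python as well. This is not how Beyond Compare works internally, just a minimal illustration of the logic: files found in only one set are copied over, identical files are copied once, and files that differ at the byte level are both kept. The name `merge_recovered` and the `.from_a` / `.from_b` suffixes are my own assumptions.

```python
import filecmp
import shutil
from pathlib import Path


def merge_recovered(set_a: Path, set_b: Path, merged: Path) -> list[Path]:
    """Merge two recovered trees into merged; return relative paths that differ."""
    conflicts = []
    rel_paths = {p.relative_to(set_a) for p in set_a.rglob("*") if p.is_file()}
    rel_paths |= {p.relative_to(set_b) for p in set_b.rglob("*") if p.is_file()}
    for rel in sorted(rel_paths):
        a, b, out = set_a / rel, set_b / rel, merged / rel
        out.parent.mkdir(parents=True, exist_ok=True)
        if a.is_file() and b.is_file():
            if filecmp.cmp(a, b, shallow=False):  # shallow=False compares contents
                shutil.copy2(a, out)
            else:
                # Keep both versions so either one can be tried later.
                shutil.copy2(a, out.with_name(out.name + ".from_a"))
                shutil.copy2(b, out.with_name(out.name + ".from_b"))
                conflicts.append(rel)
        else:
            # Only one set has this file, so take whichever copy exists.
            shutil.copy2(a if a.is_file() else b, out)
    return conflicts
```

The returned list is the set of files worth opening by hand to see which copy still works.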

So I guess that's everything. It's been a very lengthy and stressful process. I lost every file I ever had, and then got them all back. I treated the setback as an opportunity, and now I have an LTO tape backup system, giving me practically unlimited, low-cost expandable storage and an off-site backup. I am happy that I can return to making electronic music.

Feel free to ask if you have any questions. Also, if any Buzzer has a data loss, I would be happy to help out free of charge using my new data recovery system.
HerrFornit
Posts: 435
Joined: Sat Feb 25, 2017 12:27 pm
Location: Dortmund

Re: Data Loss, Data Recovery, Backup and LTO Tape

Post by HerrFornit »

Hi onecircles,

sounds like hell man! thanx for sharing your experience.

welcome back. :)

Besides the automatic full system backup on an internal HDD (containing my full Buzz installation), I regularly use a simple robocopy batch file for incremental backups of my songs to an external HDD.
IXix
Posts: 1113
Joined: Wed Nov 23, 2011 3:24 pm

Re: Data Loss, Data Recovery, Backup and LTO Tape

Post by IXix »

Glad you got it all back. Well done! 8-)