md5deep and hashdeep on Synology DS212+

My personal photos are stored on my desktop computer, and I keep backups of them on my Synology DS212+ and on my old faithful Linksys NSLU2 with an external disk.

The issue I have now is that, to keep backup times short, I only back up the current year because, well, the other years are static data. Previous years are already backed up and that data is not supposed to change. Right? So what if corruption happens? How do I detect it? How can I be warned of such an event? Right now I have no way to check whether my digital photos from 2003 are fully intact in all three of my copies. And if I detect corruption in one file/photo, which copy is the correct one?
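On that last question, one plausible answer with three copies is a simple majority vote: if two copies have the same checksum and the third differs, the third is the suspect. The sketch below only illustrates that idea. It uses the standard md5sum tool as a stand-in, and the function name odd_copy_out is made up for this example.

```shell
#!/bin/sh
# Majority vote over three copies of the same file (illustrative sketch).
# Prints "ok" if all three match, the path of the odd copy if two agree,
# or "undecided" if all three checksums differ.
odd_copy_out() {
    h1=$(md5sum < "$1" | cut -d' ' -f1)
    h2=$(md5sum < "$2" | cut -d' ' -f1)
    h3=$(md5sum < "$3" | cut -d' ' -f1)
    if [ "$h1" = "$h2" ] && [ "$h2" = "$h3" ]; then
        echo "ok"
    elif [ "$h1" = "$h2" ]; then
        echo "$3"
    elif [ "$h1" = "$h3" ]; then
        echo "$2"
    elif [ "$h2" = "$h3" ]; then
        echo "$1"
    else
        echo "undecided"
    fi
}
```

Calling it with the three paths of one photo (desktop, DiskStation, NSLU2) would point at the copy to replace. It cannot decide anything when all three differ.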

So I’m investigating this issue. One set of tools for creating checksums, and later verifying that everything is OK and no corruption has appeared, is the md5deep and hashdeep programs. The sources are available at this link: http://md5deep.sourceforge.net/start-hashdeep.html

These are the instructions for cross-compiling these tools for the ARM-based Synology DS212+. As usual, this is done on a Linux desktop/server machine.

1st) Set up the cross-compiling environment: Check out this link on the topic: https://primalcortex.wordpress.com/2015/01/04/synology-ds-crosscompiling-eclipseibm-rsmb-mqtt-broker/

2nd) Setting up the cross compiling environment variables is now a bit different:

export INSTALLDIR=/usr/local/arm-marvell-linux-gnueabi
export PATH=$INSTALLDIR/bin:$PATH
export TARGETMACH=arm-marvell-linux-gnueabi
export BUILDMACH=i686-pc-linux-gnu
export CROSS=arm-marvell-linux-gnueabi
export CC=${CROSS}-g++
export LD=${CROSS}-ld
export AS=${CROSS}-as
export AR=${CROSS}-ar
export GCC=${CROSS}-g++
export CXX=${CROSS}-g++

We will use the C++ compiler.
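Before running configure it can save time to confirm that the cross tools are really reachable through PATH. This is just a small sanity check I would suggest, not part of the hashdeep build. The function name check_toolchain is made up, and the tool list simply mirrors the exports above.

```shell
#!/bin/sh
# Check that the cross tools referenced by the exports above are on PATH.
# CROSS falls back to the prefix used in this post if it is not set.
CROSS="${CROSS:-arm-marvell-linux-gnueabi}"

check_toolchain() {
    missing=0
    for tool in g++ ld as ar; do
        if ! command -v "${CROSS}-${tool}" >/dev/null 2>&1; then
            echo "missing: ${CROSS}-${tool}" >&2
            missing=1
        fi
    done
    # Returns 0 only when every tool was found.
    return "$missing"
}
```

Running check_toolchain after the exports prints the tools that cannot be found, if any, and returns non-zero in that case.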

3rd) Create a working directory and clone the repository to your local machine: git clone https://github.com/jessek/hashdeep.git

4th) Change to the hashdeep directory and execute the following commands:

[pcortex@pcortex:hashdeep]$ sh bootstrap.sh
[pcortex@pcortex:hashdeep]$ ./configure --host=arm-marvell-linux-gnueabi
[pcortex@pcortex:hashdeep]$ make

And that’s it. If everything went well, the commands are available in the hashdeep/src directory:

[pcortex@pcortex:src]$ file hashdeep
hashdeep: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.3, for GNU/Linux 2.6.16, not stripped

We can now copy the commands to the DiskStation and start using them.

Now all I need is to devise a method that creates and checks the hashes/signatures of each year’s photos and warns me if a difference is detected. I’m thinking of using the audit mode of the hashdeep command and checking a couple of years each day: for example, on Mondays check 2003 and 2013, on Tuesdays check 2004 and 2014, and so on.
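A minimal sketch of how such a rotation could look, written for a plain sh so it should also run on the DiskStation. Everything here is an assumption rather than a finished solution: the paths /volume1/photo and /volume1/photo/.hashes, the one-folder-per-year layout and the function names are hypothetical. The hashdeep options used are -c (algorithms), -r (recursive), -l (relative paths), -k (known-hash file) and -a (audit mode). For simplicity the rotation uses the year offset modulo 7, so Monday gets 2003 and 2010 rather than 2003 and 2013, but every year is still checked once a week.

```shell
#!/bin/sh
# Weekly rotation of hashdeep audits, one photo folder per year (sketch).
PHOTO_ROOT="${PHOTO_ROOT:-/volume1/photo}"       # hypothetical photo share
HASH_DIR="${HASH_DIR:-/volume1/photo/.hashes}"   # hypothetical store for known-hash files
FIRST_YEAR="${FIRST_YEAR:-2003}"

# Print the years assigned to a weekday (1 = Monday ... 7 = Sunday),
# from FIRST_YEAR up to and including the year given as second argument.
years_for_weekday() {
    dow="$1"
    last="$2"
    y="$FIRST_YEAR"
    out=""
    while [ "$y" -le "$last" ]; do
        if [ $(( (y - FIRST_YEAR) % 7 )) -eq $(( dow - 1 )) ]; then
            out="${out:+$out }$y"
        fi
        y=$(( y + 1 ))
    done
    echo "$out"
}

# Create the known-hash file for one year. Run once, while the copy is known to be good.
create_hashes() {
    ( cd "$PHOTO_ROOT" && hashdeep -c md5,sha256 -r -l "$1" > "$HASH_DIR/$1.txt" )
}

# Audit one year against its known-hash file.
audit_year() {
    ( cd "$PHOTO_ROOT" && hashdeep -a -r -l -k "$HASH_DIR/$1.txt" "$1" )
}

# Audit the years that belong to today. Assumes hashdeep exits non-zero
# when the audit fails; if unsure, look at the pass/fail line it prints instead.
audit_today() {
    for year in $(years_for_weekday "$(date +%u)" "$(date +%Y)"); do
        audit_year "$year" || echo "WARNING: audit failed for $year" >&2
    done
}
```

The idea would be to call create_hashes once per year folder and then run audit_today from a daily scheduled task, sending whatever it writes to stderr by mail.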

Backup your digital photographs

When film ruled, you had the prints and the negatives, and that was about it. If you lost the negatives, or they were destroyed or damaged, there was almost no way to get them back. But at least the photo prints might still be available.

With digital cameras, all photos are computer files: easier to maintain, easier to copy, and yes, also easier to lose. I know people who have lost huge numbers of photos, including unique family photos, and didn’t even have a print of them, just because everything was stashed on one computer and one hard disk. You know, hard disks do fail…

So when I switched to digital, I kept worrying about how to make sure that, if disaster strikes, at least the loss is not a huge one.

So I’m documenting here my digital photography backup process. This method has two parts: an easy one that everyone can do, and a complex one that just makes sure some loose ends are covered…

The easy part:

I use an external card reader to read my camera’s SD cards. Why? Card readers are normally faster than connecting the camera directly, and during the copy process the camera is safely stored. Also, there is no fiddling with USB cables and low batteries. My Nikon D80 uses SD cards, and an external USB 2.0 card reader costs around 3€.

I also use Picasa to organize and browse my photos. So I copy all the photos from the cards to a meaningfully named folder (e.g. 2008-Holidays_Portugal) under a root folder named Photos, which Picasa scans automatically. After Picasa has scanned all the photos, it’s time for quality control.

Please notice that at this point I’ve not deleted anything from the cards. The cards are like the original master copy.

Using Picasa, I view and check every single photo that was copied. It can be a time-consuming process, but it’s necessary: if a corrupted photo shows up, you can check the card to see whether the corruption happened during the copy process or not.

At this point, still don’t format your cards. First make a backup of your Picasa folder!

So now you have all your photos checked, and it’s time for the first backup.

I use an external USB hard disk with its own power supply. Except while backing up, this disk is kept disconnected from the PC and from the mains. Why? If a power surge happens it can fry my computer, but it will not fry the hard disk and its power supply…

So run a Picasa backup to your external hard disk. If you want, you can also copy the photos to DVD, although my experience with DVDs has not been very good.

(If you have several hard disks, you can make multiple backups and, for example, store one of the disks in a safe.)

So now you have three copies of your photos: the master original on your SD cards, one on your computer’s hard disk and one on your external hard disk.

After seeing that all photos are OK, I format all the used cards using the camera. Why? First, formatting in the camera makes sure the card is usable by it. Second, if you don’t do this and the card still holds photos the next time you use the camera, you may be left wondering whether you have already copied them or not…

So, now you are safe, right?

Well, more or less.

This is the point where the easy process ends. With the above instructions you are almost 100% covered, and if you store your external hard disk in a fireproof vault, it will be safe from fire and theft. Yet it is not safe from its own failure, so you should use at least two different hard disks, preferably from different vendors, so that you are not hit by a bad batch of disks from the same vendor…

The complex part:

Even with no hardware problems on the hard disks, data corruption happens due to software or hardware glitches. So I have a second and a third level of data protection. These are based on the rsync command and on the Linksys NSLU2 with the SlugOS firmware, which allows a full-blown Linux operating system to run on a device that consumes around 5W… (http://www.nslu2-linux.org). I’ve installed the DeltaCopy program on all my Windows PCs.

This allows me to back up the folders and files that I want, namely my digital photography files, to the rsync daemon on my NSLU2. You can start the backup process manually or you can schedule it for a time when your home computer is most likely to be on.

So all the data from my home computer and my wife’s laptop is backed up to the NSLU2’s external hard disk. This is just like the Picasa backup, but with the possibility of backing up other files as well.

To protect against data corruption and inadvertent changes to files, the NSLU2 also runs the rsnapshot program. In practice this is like Apple Time Machine or TimeVault on Linux, without the fancy interface.

So now I have a full history of all my files and their changes, and if by any chance I find a corrupted file or a change that I want to revert, I can go to the snapshot folders and retrieve a version of the file from up to a year ago if necessary.

Finally, because the NSLU2 can’t run inside a fireproof vault, I do an off-site backup using rsync over ssh from my Kubuntu laptop located at my workplace. Ideally it would be another NSLU2 in a datacenter, but let’s keep this simple…

Because this is personal data, my work computer has an external hard disk just for this. To prevent any access to my files if it is stolen, all the backups are written to a TrueCrypt volume.

So at around two o’clock in the morning, my work computer connects through ssh to my home NSLU2 and rsyncs all changes into the TrueCrypt volume.
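The nightly pull could look roughly like the sketch below. The host name, user and both paths are placeholders and not my real values, and the function name pull_photos is made up. The only rsync options used are -a (archive), -z (compress) and -e ssh. The script refuses to run when the destination directory is missing, on the assumption that this directory only exists inside the mounted TrueCrypt volume; otherwise the files would land unencrypted on the laptop’s own disk.

```shell
#!/bin/sh
# Pull the photo share from the home NSLU2 into the encrypted volume (sketch).
# Could be started by cron around 02:00, e.g.:  0 2 * * * /path/to/this/script
SRC="${SRC:-backup@home.example.org:/media/disk/photos/}"   # placeholder source
DEST="${DEST:-/media/truecrypt1/photos/}"                   # placeholder mount path

pull_photos() {
    # Assumption: this directory lives inside the TrueCrypt volume,
    # so it is only present while the volume is mounted.
    if [ ! -d "$DEST" ]; then
        echo "destination $DEST not available, is the volume mounted?" >&2
        return 1
    fi
    rsync -az -e ssh "$SRC" "$DEST"
}
```

For an unattended run the ssh login has to work without a password prompt, for example with a key pair. Calling pull_photos at the end of the script, or from a wrapper, does the actual transfer.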

If the laptop and/or the hard disk is stolen, nobody will be able to read the data on it.

I hope this gives some ideas on how a complete procedure for backing up your photographs can be set up.