I recently got married. Part of this process is getting a handoff of digital files from different vendors. What to do with these files? Archive them of course! And then share them with family.
Some photographers offer a hosted image site (perhaps white-label SmugMug) or similar, but I’m planning for this marriage to last decades, and it seems unlikely that the photographer’s site will persist that long. Given the tradeoff, my wife and I preferred delivery of the raw digital files over a glossy website. I trust myself to keep these files over the decades.
The problem is there currently aren’t any good products that handle archiving sentimental files like photos and movies.
Even though I run an agency that in theory could build the perfect product, the reality is that photos are hard. Someone else, hopefully funded by a large brand, should solve this problem.
This blog is me writing down what I need so I can talk more intelligently about this at parties and startup gatherings.
The setup: What I want out of a photos product
This section describes my background and motivation for approaching a photos product. The photos space is so varied that I think it’s important to explain my goals and where I am coming from.
tl;dr ‐ I currently back up 1.2 TB of photos and movies to 5 disconnected hard drives, S3, and Glacier. I am only looking for another place to store these photos and movies, and it would be a bonus if I could share all photos without having to host my own site generated by sigal.
What photos I have today
You may notice how there are few of my pictures on this post. I’m not an avid photographer, or even a good photographer. However, I am sentimental about the photographs I have. They remind me where I’ve been, who I’ve known, and what moments we’ve shared together. Most of them are of people and the places we know. They are among the most “priceless” possessions I own.
It’s important to remember that these numbers and statistics are not about files. They are modest measures of sentimentality.
In total, I have
1.2 TB of photos and movies that I “cannot lose.”
That’s not intended to invite competition. I expect that I am middle of the pack with regard to total photo storage. These photos include:
- All of my photos from 2000
- All of my wife’s photos from 2002
- All of my parents’ photos from 2000
- All of my grandparents’ photos from 2001
- All of my wife’s parents’ photos from 2004
- 200 GB of wedding photos and 400 GB of wedding videos
- 100 GB of digitized 8mm film from the 1950s
- 120 GB of digitized VHS tapes from the 1980s and 90s
I estimate that, of the 1.2 TB, about 400 GB are photos. Of the photos, the vast majority of files are JPEG files, but we switched to “shooting in the RAW” in 2013. I estimate now that RAW files account for 80% of the space needed to store “new photos.”
Running ls and grep together, it seems of the 400 GB, about 100 GB are from the RAW photos. These “RAW” files (actually NEF files) sit alongside their JPG counterparts, so any photo solution that requires me to separate NEF files from JPG files is a non-starter.
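A rough Python equivalent of that ls-and-grep tally might look like the following. It is only a sketch: the function name `bytes_by_extension` is made up for illustration, and it assumes the whole collection sits under a single root folder.

```python
import os
from collections import defaultdict


def bytes_by_extension(root):
    """Walk `root` and total the file sizes (in bytes) per lowercase extension."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Lowercase so that ".NEF" and ".nef" land in the same bucket.
            ext = os.path.splitext(name)[1].lower()
            totals[ext] += os.path.getsize(os.path.join(dirpath, name))
    return dict(totals)
```

Dividing `bytes_by_extension(root)[".nef"]` by `1e9` gives a figure in GB that can be compared against the estimate above.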
How my photos are organized
I don’t have the best way to organize photos, but it’s important to explain what I do.
In short, we organize photos following what is now called a “slug” in the blogosphere. Obviously, I set this up back in 2002 before I had heard the word “blog.” You would configure this as /:category/yyyy/:title/ in Django or similar.
If that doesn’t make sense, let me explain in prose what we do. All of our photos are organized at a top level by “source” (i.e. me, my wife, my parents, me before I met my wife, etc.). Then, within each source photos are grouped by year into folders and then grouped again into “topics.” Each topic stores the dump of the photos straight from the memory card.
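The source/year/topic layout can be sketched in code as well. This is a minimal illustration, not the tooling actually used: the `Topic` tuple, the `parse_topic` function, and the example folder names are all hypothetical.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class Topic(NamedTuple):
    source: str  # who the photos came from, e.g. "me" or "parents"
    year: int    # four-digit year folder
    topic: str   # the memory-card dump for one event


def parse_topic(relative_path):
    """Split a path relative to the photo root into (source, year, topic)."""
    parts = PurePosixPath(relative_path).parts
    if len(parts) < 3:
        raise ValueError(f"expected source/year/topic, got {relative_path!r}")
    source, year, topic = parts[:3]
    if not (len(year) == 4 and year.isdigit()):
        raise ValueError(f"expected a four-digit year, got {year!r}")
    return Topic(source, int(year), topic)
```

For example, `parse_topic("me/2013/ski-trip/DSC_0001.NEF")` yields the source `me`, the year `2013`, and the topic `ski-trip`. Any file that sits deeper in the tree still resolves to its topic folder.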
What alternative online photo products we’ve tried
We’ve tried a few alternatives. Basic shortcomings of each:
- iPhoto or Aperture - Opaque photo library that changes “original” files. Have to use Time Machine to back up (insufficient). I tried it for a year. Thank god for backups.
- Windows Live Photo Gallery - My favorite, but discontinued, and I don’t own Windows any more
- Facebook - Privacy concerns, and Facebook changes “original” files. Doesn’t support RAW.
- Dropbox - Privacy concerns (there is Content ID), an unfavorable reputation for propagating deletes, I am just not comfortable with the brand, and it costs money.
- iCloud - Changes original files, too little space, doesn’t support RAW, and can’t view in browser.
- OneDrive - Micro$oft (I’m an old hacker who remembers the 90s), some versions change “original” files.
- Flickr - Privacy concerns, changes “original” files (specifically renames the original files), only photos included in 1 TB limit (as far as I know).
- S3 and Glacier - What I use today, but family members can’t browse.
Current backup system today
In short, I don’t trust one service or option. We store the 1.2 TB on 5 disconnected hard drives, S3, and Glacier.
There is 1 master, a WD Passport Ultra from Costco (a USB drive). This is the drive I plug into my computer when I want to read or write photos. It’s the drive I take with me to relatives’ houses.
There is 1 slave, a second WD Passport Ultra from Costco that sits in a firesafe in my house. Each week or month, I run rsync between the master and slave. I also run awscli to push changes to S3 and Glacier.
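The weekly-or-monthly routine could be wrapped in a small script along these lines. Treat it as a sketch under assumptions: the post does not say which flags are used, so rsync's archive mode (`-a`) is a guess, the mount points and bucket URL passed in are placeholders, and the Glacier step is left out because its details are not described.

```python
import subprocess


def backup_commands(master, slave, bucket_url):
    """Build the two commands: master -> slave via rsync, master -> S3 via awscli."""
    # Trailing slashes tell rsync to copy the directory's contents, not the directory itself.
    master_dir = master.rstrip("/") + "/"
    slave_dir = slave.rstrip("/") + "/"
    return [
        ["rsync", "-a", master_dir, slave_dir],        # assumed flag: archive mode
        ["aws", "s3", "sync", master_dir, bucket_url],  # bucket_url is a placeholder
    ]


def run_backup(commands, dry_run=True):
    """Print each command, and run it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure
```

Defaulting to `dry_run=True` means nothing touches the drives until the printed commands have been read over, which suits a backup that is run by hand a few times a year.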
A 3rd hard drive (unspecified) lives in another state (unspecified) that would be unaffected by natural disasters which may affect my home. This is backed up once a year during the holidays.
Yes, this is all manual work. I probably spend 10 hours a year handling backups.
If there are files I want to share with family members, I put them into a separate folder, publish them using sigal, and send a link over mail or message.
Do I wish this system were automated? Sure. But no such product exists on the market.
And to me, the fact that 4 of the hard drives are physically disconnected is a feature. One of the risks I have to worry about is malicious code attacking my computer and either deleting, ransoming, or otherwise corrupting my files.
Also, some of the files I have are from the 1950s (not the files, but their content). This stuff doesn’t need to be changed. It just needs to be stored. Should a natural disaster occur, it would be a shame to lose pictures from the camera roll, but it would be another disaster to lose those files.
The perfect product
What do I wish were available? I wish there were a product that:
- Has a “master” version of photos and videos all available from anywhere for me and people I share with.
- Is private by default.
- Preserves the binary files. Does not change them. JPEG is lossy, so every change has a “cost” even to the metadata.
- Preserves the organization I’ve already done, but it’s ok to let me search and filter in other dimensions.
- Allows me to back up the entire collection to different hard drives and store these hard drives in safe deposit boxes.
- Allows me to add family members to my account.
- Supports NEF and JPG files sitting side-by-side in the same directory.
- Has a reasonable price based on usage, not an over-priced fixed-fee subscription.
If you know of such a product, please tell me. If you are interested in building such a product, you will be a hero. I’m not worried about lock-in if I can synchronize the data with offline hard drives, since presumably the next product will gladly import from such a hard drive.
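The “does not change the binary files” requirement is one that can be checked from the outside. One possible approach, sketched here with hypothetical function names, is to record a SHA-256 checksum for every file before uploading and compare it against what the product hands back.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large videos never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def changed_files(before, after):
    """List paths from `before` that are missing or different in `after`."""
    return sorted(p for p in before if after.get(p) != before[p])
```

Because the manifest keys are relative paths, the same comparison also flags renamed files, so a service that rewrites metadata or renames originals would show up in `changed_files`.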