paper-thin-bits

Tech articles, mini tech blog

Storing encrypted data in Google Drive

XKCD Opinions on Privacy

Preface

By now I reckon you already know that Google Drive offers an astounding 15 GB of storage for free [1]. That’s quite a lot!

Microsoft’s OneDrive offers just about 5 GB [2] for free and Dropbox, although a pioneer in the field, only about 2 GB [3].

Google Drive seems the clear winner here in terms of offering a greater amount of storage to its users for no cost at all. Needless to say, free online storage is a great way to store your documents, backups and data. However, if you are like me, keeping sensitive documents, data backups or SQL exports unprotected in the cloud is really not a nice option. Google do offer extended security via a 2-Step verification process [4] which makes it quite hard for your account to get compromised, but the human factor [5] in security cannot be discarded.

Furthermore, call me paranoid, but the following text in Google’s Terms of Service [6] leaves me with a feeling of unease:

Our automated systems analyze your content (including emails) to provide you personally relevant product features, such as customized search results, tailored advertising, and spam and malware detection. This analysis occurs as the content is sent, received, and when it is stored.

So the question is, can we store our documents or data backups on the Google Drive safely encrypted?

Goal

Our goal here is to establish a way for server or desktop stored files to be encrypted and uploaded to a Google Drive folder. To achieve this, a couple of tools may be used in tandem. The first one would be GnuPG [7] for encryption and the second a Google Drive Client tool for synchronization.

The synchronization will be one way only, meaning files will only be uploaded from the server or desktop machine to a designated Google Drive folder. Here is a simple diagram of the intended workflow:

Action Diagram

Required Tools

What is basically needed here is a way to collect, encrypt and synchronize files. Collecting files may be done with shell scripts, e.g., Bash on Unix or PowerShell on Windows.

For encryption one may use GnuPG, either version 1.4 or 2.1. It should be installed using the system’s package manager, e.g., Aptitude or Homebrew, or alternatively downloaded from GnuPG’s website.

For synchronization an open source Google Drive CLI Client written in Go is available. It’s a binary tool that supports an impressive range of platforms.

Setup

Generate GnuPG Keys

If you already have a GnuPG key pair that you want to use for encryption, you may skip this step completely.

Alright, now I’m going to first start with a fair warning that dealing with GnuPG could, at times, be somewhat irritating. Cryptography is generally complex and unfortunately GPG doesn’t make it a lot easier, but we are going to do a minimal set of steps, so it should all be fine.

First, you need a place to store your GPG key rings. A key ring contains one or more public and/or private keys. By default, this would be a place in your home dir, i.e., /home/user/.gnupg. I recommend using your own isolated directory for this setup. This makes it easier to manage the keys for this particular scenario only.

$ cd /var/local
$ mkdir mygpgkeys && chmod 700 mygpgkeys
$ export GNUPGHOME="/var/local/mygpgkeys"

Let’s generate the GnuPG key pair.

$ cd mygpgkeys
$ gpg --gen-key

Fill in a name and an email address. Although recommended, the email address does not need to be a real one. You will then be asked for a passphrase to secure your private key. Needless to say, choose a good, long one. Depending on the available entropy on your system [8], it might take a while for your key to be generated.

gpg (GnuPG) 2.1.21; Copyright (C) 2017 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

gpg: keybox '/var/local/mygpgkeys/pubring.kbx' created
Note: Use "gpg2 --full-generate-key" for a full featured key generation dialog.

GnuPG needs to construct a user ID to identify your key.

Real name: Max Mustermann
Email address: maxmustermann@example.org
You selected this USER-ID:              
    "Max Mustermann <maxmustermann@example.org>"

Change (N)ame, (E)mail, or (O)kay/(Q)uit? o
...
gpg: /var/local/mygpgkeys/trustdb.gpg: trustdb created
gpg: key 1A2B04F0E994DAF4 marked as ultimately trusted
gpg: directory '/var/local/mygpgkeys/openpgp-revocs.d' created
gpg: revocation certificate stored as '/var/local/mygpgkeys/openpgp-revocs.d/31208B5D45320FAE3D7E7FE21A2B04F0E994DAF4.rev'
public and secret key created and signed.

pub   rsa2048 2017-05-25 [SC] [expires: 2019-05-25]
        31208B5D45320FAE3D7E7FE21A2B04F0E994DAF4
        31208B5D45320FAE3D7E7FE21A2B04F0E994DAF4
uid                      Max Mustermann <maxmustermann@example.org>
sub   rsa2048 2017-05-25 [E] [expires: 2019-05-25]

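GnuPG stores the generated keys in key ring files inside the GNUPGHOME directory. The file names depend on the GnuPG version: 1.4 uses pubring.gpg and secring.gpg, while 2.1 keeps public keys in pubring.kbx and secret keys in the private-keys-v1.d directory. You can get a glimpse of the available keys in your GPG key ring by running the following command: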

$ gpg --list-keys

-------------------------------------------------------------------
pub   rsa2048 2017-05-25 [SC] [expires: 2019-05-25]
        31208B5D45320FAE3D7E7FE21A2B04F0E994DAF4
uid           [ultimate] Max Mustermann <maxmustermann@example.org>
sub   rsa2048 2017-05-25 [E] [expires: 2019-05-25]

A successfully generated key pair is all that is required to continue to the synchronization step.
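Before moving on, it may be worth checking from a script that the key is really there. The helper below is a minimal sketch: have_key is a name I made up, and it relies only on gpg --list-keys returning a non-zero exit code when no key matches.

```shell
# Return success if a public key matching KEY exists in the active key ring.
# The key ring is whatever GNUPGHOME points at (or the default if it is unset).
have_key() {
    gpg --batch --list-keys "$1" >/dev/null 2>&1
}

# Example:
# have_key "maxmustermann@example.org" || { echo "Key not found!"; exit 1; }
```

A check like this could sit at the top of any script that encrypts with the key, so a wrong GNUPGHOME is caught early.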

Setup sync folder

The first requirement is a Google Drive folder where you’ll be uploading your encrypted files to. Notice that the unique id of the folder is already available in the browser URL.

Sync Folder
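If you would rather not pick the id out of the address bar by hand, a small helper can do it. This is only a sketch: drive_folder_id is a hypothetical name, and it assumes the id is the path segment right after "/folders/" in the URL, which is how the folder URLs looked at the time of writing.

```shell
# Print the folder id from a Google Drive folder URL.
# Assumes the id is the path segment that follows "/folders/".
drive_folder_id() {
    printf '%s\n' "$1" | sed -n 's#^.*/folders/\([A-Za-z0-9_-]*\).*$#\1#p'
}

# Example (the id is the one used throughout this article):
# drive_folder_id "https://drive.google.com/drive/folders/0BxNabiLkX8lpSGN3UVRmWEVQWWM"
```

The function prints nothing when the URL does not contain a folder segment, so an empty result means the URL was not a folder link.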

Next, you need to initialize the gdrive CLI and configure a sync folder on the hard drive.

To create a sync folder run:

$ mkdir mysyncfolder && cd mysyncfolder

To initialize the gdrive CLI, run the command below using the folder id you got from Google Drive. You will need to manually copy and paste the authorization URL into your browser and follow the steps to authorize the gdrive CLI to upload files on your behalf.

$ gdrive sync upload . 0BxNabiLkX8lpSGN3UVRmWEVQWWM -c /Users/max/.gdrive-test-sync

Authentication needed
Go to the following url in your browser:
https://accounts.google.com/o/oauth2/auth?access_type=offline&client_id=someverylongstringeg.apps.googleusercontent.com&redirect_uri=...

Enter verification code: 7/TWTDPBq32xYiRqqAlWp4tkcVBYgL3n5qsD_bfzXr8B0
Starting sync...
Collecting local and remote file information...
Found 0 local files and 0 remote files
Sync finished in 2.095311029s

Notice that the gdrive CLI access token parameters are saved to a manually specified path, i.e., in /Users/max/.gdrive-test-sync. This should normally be a location that only your user is allowed to access.
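One way to make sure of that is to lock the directory down right after the first run. The function below is a sketch with a made-up name (restrict_to_owner); it uses nothing but chmod and ls, and prints the resulting mode so you can see what happened.

```shell
# Restrict a directory to its owner and report the resulting mode.
# The path passed in is whatever you gave to gdrive via -c.
restrict_to_owner() {
    dir="$1"
    [ -d "$dir" ] || { echo "Not a directory: $dir" >&2; return 1; }
    chmod 700 "$dir" || return 1
    # ls -ld prints the mode string first, e.g. drwx------
    ls -ld "$dir" | cut -c1-10
}

# Example: restrict_to_owner "/Users/max/.gdrive-test-sync"
```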

Usage

Linux and macOS

Let’s write a Bash script that will encrypt all your files from a given folder on the hard drive and then upload those files to the specified Google Drive folder.

$ touch sync.sh && chmod +x sync.sh

Open the file in an editor and add the following. Check the comments to adjust the paths where needed.

#!/bin/sh

# command -v is the portable way to locate a tool from /bin/sh
GPG=$(command -v gpg)
GDRIVE=$(command -v gdrive)

# checks if the required tools are available
if [ -z "$GPG" ]; then
    echo "GPG not found!"
    exit 1
fi
if [ -z "$GDRIVE" ]; then
    echo "gdrive not found!"
    exit 2
fi


####################################
## configurations
####################################

##### !IMPORTANT! #####
## Make sure to adjust the paths to your system
#######################

# key to encrypt with
GPG_KEY="maxmustermann@example.org"
export GNUPGHOME="/var/local/mygpgkeys"
# destination in gdrive
GDRIVE_DEST="0BxNabiLkX8lpSGN3UVRmWEVQWWM"
# local sync folder path
GDRIVE_SYNC_DIR="/var/local/mysyncfolder"
# gdrive config
GDRIVE_CONFIG_PATH="/Users/max/.gdrive-test-sync"

####################################
## encrypt source files
####################################
echo

FOLDER="/<PATH-TO-MY-IMPORTANT-FILES-OR-BACKUPS>"

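for f in "$FOLDER"/*; do
    # skip sub-directories and the literal pattern left over when the folder is empty
    [ -f "$f" ] || continue
    echo "Encrypting $f ..."
    # quoting keeps file names with spaces intact
    echo y | "$GPG" --recipient="$GPG_KEY" -e "$f"
    GPG_FILE="$f.gpg"

    BASE_NAME=$(basename "$GPG_FILE")
    echo "Moving to $GDRIVE_SYNC_DIR/$BASE_NAME ..."
    mv "$GPG_FILE" "$GDRIVE_SYNC_DIR/$BASE_NAME"
done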

## ...you may add more folders here

####################################
## sync everything
####################################

echo
echo "Sync files with Google Drive ..."

"$GDRIVE" sync upload "$GDRIVE_SYNC_DIR" "$GDRIVE_DEST" -c "$GDRIVE_CONFIG_PATH"

####################################
## cleanup
####################################

echo
# never run the rm below with an empty variable
if [ -n "$GDRIVE_SYNC_DIR" ]; then
    echo "Removing encrypted files from $GDRIVE_SYNC_DIR ..."
    rm -f "$GDRIVE_SYNC_DIR"/*
fi

Now this is a very simple script that may be further extended, for example to traverse a folder tree or to produce and encrypt a tar archive composed of many small files. I leave this up to you to adjust as needed.
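To give an idea of the tar variant, here is a minimal sketch. The function name archive_folder and the archive naming are my own assumptions, not part of the script above; it only packs a folder and prints the path of the archive, which you would then hand to gpg.

```shell
# Pack a folder into a single gzip-compressed tar archive and print its path.
# The archive is named after the folder, e.g. /tmp/out/docs.tar.gz for .../docs.
archive_folder() {
    src="$1"
    out_dir="$2"
    [ -d "$src" ] || { echo "Not a directory: $src" >&2; return 1; }
    name=$(basename "$src")
    archive="$out_dir/$name.tar.gz"
    # -C keeps paths inside the archive relative to the folder's parent
    tar -czf "$archive" -C "$(dirname "$src")" "$name" || return 1
    printf '%s\n' "$archive"
}

# Example, reusing the variables from sync.sh:
# ARCHIVE=$(archive_folder "$FOLDER" /tmp) && "$GPG" --recipient="$GPG_KEY" -e "$ARCHIVE"
```

Keep in mind that the archive itself is not encrypted, so it should be removed once the .gpg file has been produced.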

Alright, let’s run the sync.sh script.

$ ./sync.sh 

Encrypting LICENSE ...
Moving to /var/local/mysyncfolder/LICENSE.gpg ...
Encrypting Makefile ...
Moving to /var/local/mysyncfolder/Makefile.gpg ...
Encrypting MobileDevice.h ...
Moving to /var/local/mysyncfolder/MobileDevice.h.gpg ...
Encrypting README.md ...
Moving to /var/local/mysyncfolder/README.md.gpg ...

Sync files with Google Drive ...
Starting sync...
Collecting local and remote file information...
Found 4 local files and 4 remote files

4 local files has changed
[0001/0006] Updating LICENSE.gpg -> TestSync/LICENSE.gpg
[0002/0006] Updating Makefile.gpg -> TestSync/Makefile.gpg
[0003/0006] Updating MobileDevice.h.gpg -> TestSync/MobileDevice.h.gpg
[0004/0006] Updating README.md.gpg -> TestSync/README.md.gpg
Sync finished in 7.074881652s

Removing encrypted files from /var/local/mysyncfolder ...

If you now take a look in Google Drive, you’ll find your encrypted files placed in the target folder.

Synchronized files

So what if you need this done on a regular basis? Simple as cake. Just set up a cron job to invoke the sync.sh script, say at 00:30 every day, e.g.,

# m h  dom mon dow   command
30 0 * * * /var/local/sync.sh >> /var/log/sync.log

Windows

The sync workflow and script on Windows systems are pretty much the same. Of course, you would need the Windows version of GnuPG, Gpg4win, and a Windows binary of the gdrive CLI.

@echo off
setlocal

REM ################################
REM ## configurations
REM ################################
SET RC=0
SET "CURRENT_DIR=%cd%"

SET "GDRIVE=C:\SYNC_TEST\gdrive-windows-x64.exe"

REM # key to encrypt with
SET "GPG_KEY=maxmustermann@example.org"
SET "GNUPGHOME=C:\SYNC_TEST\mygpgkeys"
REM # destination in gdrive
SET "GDRIVE_DEST=0BxNabiLkX8lpSGN3UVRmWEVQWWM"
REM # local sync folder path
SET "GDRIVE_SYNC_DIR=C:\SYNC_TEST\mysyncfolder"
REM # gdrive config
SET "GDRIVE_CONFIG_PATH=C:\Users\<MY_USERNAME>\AppData\Local\gdrive-test"

REM ####################################
REM ## encrypt source files
REM ####################################

SET "FOLDER=C:\SYNC_TEST\sourcefiles"

REM PUSHD needs the target path, otherwise POPD has nothing to restore
PUSHD "%FOLDER%"

for %%f in (*) do (
	echo Encrypting %%f ...
	gpg --recipient="%GPG_KEY%" -e "%%f"
	if errorlevel 1 (
		goto error_encrypt
	)

	echo Moving %%f.gpg to %GDRIVE_SYNC_DIR% ...
	move "%%f.gpg" "%GDRIVE_SYNC_DIR%"
)
POPD


REM ####################################
REM ## sync everything
REM ####################################

echo Sync files with Google Drive ...

PUSHD "%GDRIVE_SYNC_DIR%"
"%GDRIVE%" sync upload "%GDRIVE_SYNC_DIR%" %GDRIVE_DEST% -c "%GDRIVE_CONFIG_PATH%"
POPD

REM ####################################
REM ## cleanup
REM ####################################
:cleanup

REM skip the delete if the sync folder cannot be entered
PUSHD "%GDRIVE_SYNC_DIR%" || goto end
echo Removing encrypted files from %GDRIVE_SYNC_DIR% ...
del /F /Q *
POPD

goto end

:error_encrypt
echo Error encrypting file.
SET RC=1
goto end

:end
REM report failure to the caller, e.g. the Task Scheduler
exit /b %RC%

Running the sync.cmd script would produce the following:

> sync.cmd
Encrypting LICENSE ...
Moving LICENSE.gpg to C:\SYNC_TEST\mysyncfolder ...
        1 file(s) moved.
Encrypting Makefile ...
Moving Makefile.gpg to C:\SYNC_TEST\mysyncfolder ...
        1 file(s) moved.
Encrypting MobileDevice.h ...
Moving MobileDevice.h.gpg to C:\SYNC_TEST\mysyncfolder ...
        1 file(s) moved.
Encrypting README.md ...
Moving README.md.gpg to C:\SYNC_TEST\mysyncfolder ...
        1 file(s) moved.
Sync files with Google Drive ...
Starting sync...
Collecting local and remote file information...
Found 4 local files and 6 remote files

4 local files has changed
[0001/0004] Updating LICENSE.gpg -> TestSync\LICENSE.gpg
[0002/0004] Updating Makefile.gpg -> TestSync\Makefile.gpg
[0003/0004] Updating MobileDevice.h.gpg -> TestSync\MobileDevice.h.gpg
[0004/0004] Updating README.md.gpg -> TestSync\README.md.gpg
Sync finished in 4.2520023s
Removing encrypted files from C:\SYNC_TEST\mysyncfolder ...

You may use the Windows Task Scheduler [9] to run the sync.cmd script at a desired time or interval.

Decrypt files

Alright, the encrypt and sync process is ready and this is awesome, but how does one get back their content in case they need it? Downloading the files from Google Drive is not an issue, but how about decrypting the content?

Here is a simple shell script that decrypts all encrypted gpg files in a directory.

#!/bin/sh

if [ ! -d "$1" ]; then
    echo "No input folder specified!"
    exit 1
fi

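GPG=$(command -v gpg)
if [ -z "$GPG" ]; then
    echo "GPG not found!"
    exit 1
fi

FOLDER="$1"

for f in "$FOLDER"/*.gpg; do
    # skip the literal pattern left over when nothing matches
    [ -f "$f" ] || continue
    echo "Decrypt $f ..."
    # GnuPG 2.1 additionally needs --pinentry-mode loopback to accept the passphrase here
    echo "<your-gpg-key-password>" | "$GPG" --passphrase-fd 0 --batch --yes "$f"
done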

Simply pass the directory path where all encrypted files reside and run the script to get all of them decrypted.

$ cd my-gdrive-downloaded-files && ./decrypt .

Decrypt ./LICENSE.gpg ...
gpg: encrypted with 2048-bit RSA key, ID E31A85E6729D5490, created 2017-05-25
    "Max Mustermann <maxmustermann@example.org>"
Decrypt ./Makefile.gpg ...
gpg: encrypted with 2048-bit RSA key, ID E31A85E6729D5490, created 2017-05-25
    "Max Mustermann <maxmustermann@example.org>"
Decrypt ./MobileDevice.h.gpg ...
gpg: encrypted with 2048-bit RSA key, ID E31A85E6729D5490, created 2017-05-25
    "Max Mustermann <maxmustermann@example.org>"
Decrypt ./README.md.gpg ...
gpg: encrypted with 2048-bit RSA key, ID E31A85E6729D5490, created 2017-05-25
    "Max Mustermann <maxmustermann@example.org>"

That’s it. Now go have cake!
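If keeping the passphrase inside the script bothers you, gpg can also read it from a file. The sketch below assumes GnuPG 2.1 or later (for --pinentry-mode loopback); decrypt_file and the passphrase file path are hypothetical names of my choosing.

```shell
# Decrypt one .gpg file, reading the passphrase from a file instead of the script.
# The passphrase file is a hypothetical path; keep it readable by your user only.
decrypt_file() {
    in="$1"
    passfile="$2"
    out="${in%.gpg}"          # strip the .gpg suffix for the output name
    gpg --batch --yes --pinentry-mode loopback \
        --passphrase-file "$passfile" \
        --output "$out" --decrypt "$in"
}

# Example: decrypt_file ./README.md.gpg "$HOME/.sync-passphrase"
```

On GnuPG 1.4 the --pinentry-mode option does not exist and would have to be dropped.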

Conclusion

So, in the end, what is all this good for? I personally use it to keep server backups safely stored in the cloud. It probably doesn’t make sense to store frequently changing documents using this method, because this way you will not be able to edit them using the Google Docs suite. But everything else, including binary files, PDFs, archives, etc., could be safely encrypted and put on Google Drive.

Oh, and one more thing …just don’t lose your GPG key pair. ;)
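A backup of the key pair is a couple of commands away. This is a rough sketch, not a full key management procedure: backup_keys and the two output file names are my own invention, and the exported secret key should of course end up somewhere offline, not in the sync folder.

```shell
# Export the public and secret key for KEY into DEST_DIR as ASCII-armored files.
backup_keys() {
    key="$1"
    dest="$2"
    mkdir -p "$dest" && chmod 700 "$dest" || return 1
    gpg --armor --output "$dest/public.asc" --export "$key" || return 1
    gpg --armor --output "$dest/secret.asc" --export-secret-keys "$key" || return 1
    echo "Keys for $key exported to $dest"
}

# Example: backup_keys "maxmustermann@example.org" /media/usb/keybackup
```

GnuPG 2.1 will ask for the key's passphrase when exporting the secret key.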

Annex - Metadata

One topic that I did not consider in this article is metadata [10]. You may have noticed that the original filenames in the scripts above are always preserved when uploaded to Google Drive. The name and size of a file may be enough for an automated analysis to infer quite a lot about the purpose and contents of that file. A simple countermeasure is to replace each filename with a salted hash of itself, which keeps the original names out of Google Drive.

Here is a modification of the encryption script for Linux/macOS that hashes the filenames using an arbitrary salt value and writes a CSV index that maps each original filename to its hashed name.

# a registry of hashed files
INDEX="/var/local/gdrive_index.csv"
SALT="someveryveryveryveryveryveryverylongstring"

## reset index contents
echo "FILENAME;HASHED NAME" > $INDEX

####################################
## Encrypt source files
####################################
echo

FOLDER="/<PATH-TO-MY-IMPORTANT-FILES-OR-BACKUPS>"
 
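for f in "$FOLDER"/*; do
	# skip sub-directories and the literal pattern left over when the folder is empty
	[ -f "$f" ] || continue
	echo "Encrypting $f ..."
	echo y | "$GPG" --recipient="$GPG_KEY" -e "$f"

	BASE_NAME=$(basename "$f")
	# openssl may prefix the digest with a label such as "(stdin)= "; keep only the hex value
	HASHED=$(echo "$BASE_NAME.$SALT" | openssl dgst -sha256 | awk '{print $NF}')
	echo "$BASE_NAME;$HASHED" >> "$INDEX"

	GPG_FILE="$f.gpg"
	HASHED_GPG_FILE="$HASHED.gpg"

	echo "Moving to $GDRIVE_SYNC_DIR/$HASHED_GPG_FILE ..."
	mv "$GPG_FILE" "$GDRIVE_SYNC_DIR/$HASHED_GPG_FILE"
done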
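Once the names are hashed you need the index to find anything again. Here is a small lookup sketch; original_name is a hypothetical helper, and it assumes the two-column "FILENAME;HASHED NAME" layout written by the script above.

```shell
# Print the original filename for a hashed name, using the CSV index.
# Expects the "FILENAME;HASHED NAME" layout, with a header on the first line.
original_name() {
    index="$1"
    hashed="${2%.gpg}"        # accept the name with or without .gpg
    awk -F';' -v h="$hashed" 'NR > 1 && $2 == h { print $1 }' "$index"
}

# Example: original_name /var/local/gdrive_index.csv "<hashed-name>.gpg"
```

Filenames that contain a semicolon would confuse this simple parser. Also note that the index reveals all the original names, so it should stay on your machine or be encrypted like everything else.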

References

  1. GDrive Pricing Guide - google.com/drive/pricing/
  2. Microsoft OneDrive Plans - onedrive.live.com/about/en-us/plans
  3. How much does Dropbox cost? - dropbox.com/help/billing/cost
  4. Google’s 2-Step Verification - google.com/landing/2step
  5. The human factor is key to good security - computerweekly.com/opinion/The-human-factor-is-key-to-good-security
  6. Google Terms of Service - google.com/intl/en/policies/terms
  7. What’s GnuPG? - gnupg.org/faq/gnupg-faq.html#whats_gnupg
  8. GPG does not have enough entropy - serverfault.com/questions/214605/gpg-does-not-have-enough-entropy
  9. Schedule a Task - technet.microsoft.com/en-us/library/cc748993(v=ws.11).aspx
  10. Metadata - en.wikipedia.org/wiki/Metadata