We will be investigating Dropbox's delta (incremental) sync feature. When a large file is updated, this feature uploads only the parts of the file that changed rather than the whole file. This is very efficient when working with large files that need to be synced to your cloud provider.

Delta sync advantages:

  • Saves network bandwidth
  • Saves upload time
  • Saves storage space
  • Creates a file history for recovery

A possible disadvantage is the computational overhead of calculating the delta. The advantages mostly outweigh this, though: network I/O is generally more expensive than CPU cycles.
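To make the trade-off concrete, here is a minimal sketch of one way a delta could be computed: hash the file in fixed-size blocks and upload only the blocks whose hash changed. This is an illustration only, not Dropbox's actual algorithm. The 4096-byte block size and the function names are our own assumptions.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indices of blocks in new that differ from old or did not exist in old."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [i for i, h in enumerate(new_h) if i >= len(old_h) or old_h[i] != h]


original = os.urandom(1024 * 1024)   # 1MB of random data
modified = bytearray(original)
modified[500_000] ^= 0xFF            # invert one byte
modified[500_001] ^= 0xFF            # and its neighbour
delta = changed_blocks(original, bytes(modified))
print(f"{len(delta)} of {len(block_hashes(original))} blocks need uploading")
```

The CPU cost is one hash pass over the file, while the network cost drops from the whole file to the changed blocks. A fixed-block scheme like this only copes with in-place edits; insertions that shift the data need a rolling-checksum approach such as the one rsync uses.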

We will look at the file size threshold at which Dropbox initiates a delta sync instead of uploading the entire file. For comparison, we will run the same tests against Google Drive to see how it handles small changes inside large files.

DROPBOX


TEST 1 (50MB)

We generated a 50MB file filled with random data using an online generator (pinetools.com). We then dropped the file in our Dropbox folder and timed the upload until the tray icon showed all files up to date.

Upload speed is approximately 8Mbit/s. The 50MB file uploaded in approximately 60 seconds.
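As a rough sanity check on that figure, the ideal transfer time can be worked out from the file size and link speed. The sketch below assumes 1MB = 8Mbit and ignores protocol overhead, so a measured time somewhat above the ideal is expected. The function name is our own.

```python
def full_upload_seconds(size_mb: float, link_mbit_per_s: float) -> float:
    """Ideal time to send size_mb megabytes over a link of link_mbit_per_s megabits per second."""
    return size_mb * 8 / link_mbit_per_s


ideal = full_upload_seconds(50, 8)
print(f"ideal full upload of 50MB at 8Mbit/s: {ideal:.0f}s")
```

The ideal works out to 50 seconds, which is in line with the roughly 60 seconds we measured.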

Next we made a small internal change by altering 2 bytes inside the file using a hex editor. We then measured how long Dropbox took to finish syncing after we pressed the save button. It took 8 seconds.
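We used a hex editor, but the same 2-byte change can be scripted if you want to repeat the test. The following is a minimal sketch with helper names of our own choosing. It runs against a small temporary file, so point the path at your synced folder for a real test.

```python
import os
import tempfile


def make_random_file(path: str, size_bytes: int) -> None:
    """Write size_bytes of random data to path."""
    with open(path, "wb") as f:
        f.write(os.urandom(size_bytes))


def flip_bytes(path: str, offset: int, count: int = 2) -> None:
    """Invert count bytes in place starting at offset; the file size is unchanged."""
    with open(path, "r+b") as f:
        f.seek(offset)
        chunk = f.read(count)
        f.seek(offset)
        f.write(bytes(b ^ 0xFF for b in chunk))


def read_all(path: str) -> bytes:
    with open(path, "rb") as f:
        return f.read()


path = os.path.join(tempfile.mkdtemp(), "test.bin")
make_random_file(path, 1024 * 1024)            # 1MB stand-in for the 50MB test file
before = read_all(path)
flip_bytes(path, offset=len(before) // 2)      # change 2 bytes in the middle
after = read_all(path)

differing = [i for i in range(len(before)) if before[i] != after[i]]
print(f"changed offsets: {differing}, size unchanged: {len(before) == len(after)}")
```

Inverting the bytes guarantees they really differ from the originals, which overwriting with a fixed value would not.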

It's clear that Dropbox already uses delta syncing at the 50MB level. Otherwise it would have taken another 60 seconds to upload the changed file.

TEST 2 (25MB)

Next we halved the file size to 25MB. The upload took 31 seconds, as expected. After changing 2 bytes inside the file and saving, the sync took 6 seconds. So once again we can safely say that delta syncing is happening at the 25MB level.

TEST 3 (10MB)

Next we uploaded a 10MB file. The upload took 17 seconds, which is still long enough to tell a full upload from a delta upload. The sync after the 2-byte change took 6.5 seconds. We can still assume that delta syncing happens at the 10MB level.

TEST 4 (5MB)

5MB took 10.5 seconds to upload in full. The changed upload took 6 seconds. Delta sync still active at 5MB level.

TEST 5 (2MB)

For the 2MB test we limited Dropbox to 100KB/s upload so that the difference between a full upload and a changed upload would be easier to measure. As expected, the full upload took longer (25 seconds). The changed upload after the 2-byte change took 5 seconds. So delta sync is still active.

TEST 6 (500KB)

500KB took 8.5 seconds to upload and the changed upload took 3.6 seconds.


GOOGLE DRIVE


TEST 1

Next we compare it to Google Drive, whose sync client is now called "Backup and Sync from Google". Dropping the 25MB file into the Google folder took 38 seconds to sync. Changing 2 bytes and saving the file took another 39 seconds to finish syncing.

TEST 2

10MB took 23 seconds to upload, and the changed version took 23 seconds as well.

The rest of the tests all showed the same result: the changed upload took as long as the full upload. We have left out the details for the sake of brevity.


To summarize:


              25MB             10MB             5MB              2MB (limit 100KB/s upload)
              FULL   CHANGED   FULL   CHANGED   FULL   CHANGED   FULL   CHANGED
Dropbox       31s    6s        17s    6.5s      10s    6s        25s    2s
Google Drive  38s    39s       23s    23s       13s    13s       27s    27s
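One way to read the summary figures is to express each changed upload as a fraction of the full upload. The short sketch below does that with the times copied from the summary above. The dictionary layout and function name are our own.

```python
# (full, changed) sync times in seconds, copied from the summary above.
results = {
    "Dropbox": {"25MB": (31, 6), "10MB": (17, 6.5), "5MB": (10, 6), "2MB": (25, 2)},
    "Google Drive": {"25MB": (38, 39), "10MB": (23, 23), "5MB": (13, 13), "2MB": (27, 27)},
}


def changed_ratio(full: float, changed: float) -> float:
    """Changed-upload time as a fraction of the full-upload time."""
    return changed / full


for service, sizes in results.items():
    row = ", ".join(
        f"{size}: {changed_ratio(full, changed):.0%}"
        for size, (full, changed) in sizes.items()
    )
    print(f"{service}: {row}")
```

A ratio near 100% means the whole file was sent again. A ratio well below that suggests only part of the file was uploaded, although at small sizes fixed per-sync overhead dominates the numbers.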


Compressed Files Insight and Disproving Filecloud Claims

Googling 'delta sync' turns up a Filecloud blog post that calls delta sync a myth and hyperbole. It also claims that file compression renders delta sync useless and that the only useful scenario is uncompressed files such as logs.

Let's run a test on a 30MB file with random non-repeating data inside a compressed zip file:


(Curiously, the non-repeating data means that the file is larger after compression).

RESULTS

The full upload took 32s.
Next we'll add a 1KB text file to the zip file, and measure the time to sync: 4s.
Next we change some random internal bytes using a hex editor and measure the sync time: 7 seconds.

This shows that delta sync can work independently of file type or contents. The delta comparison most likely happens at the byte level, which makes it completely file type agnostic: it simply doesn't matter what the contents of the file are. The advantages remain clear, which contradicts the above-mentioned blog.
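A likely reason the zip test synced so quickly is that adding a file to an archive leaves most of its existing bytes untouched. The sketch below illustrates this with Python's zipfile module, which in append mode writes the new entry over the old directory at the end of the archive and then rewrites that directory. This is an assumption about the archiver: other zip tools may rewrite the whole file. The helper name is our own.

```python
import io
import os
import zipfile


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b share."""
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            return i
    return n


payload = os.urandom(1024 * 1024)   # incompressible random data

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("random.bin", payload)
before = buf.getvalue()

# Append a 1KB text file to the existing archive.
with zipfile.ZipFile(buf, "a", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("note.txt", "x" * 1024)
after = buf.getvalue()

shared = common_prefix_len(before, after)
print(f"archive grew from {len(before)} to {len(after)} bytes")
print(f"the first {shared} bytes are unchanged")
```

In this sketch the entire compressed payload survives the append unchanged, so a byte-level delta only has to carry the new entry and the rewritten directory at the end of the file.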

CONCLUSION

Dropbox applies a delta sync algorithm to all the file sizes we tested. There doesn't seem to be a threshold below which it uploads the entire file instead of performing a delta sync. Dropbox may upload full files at extremely small sizes (e.g. 5KB or so), but at that point delta sync becomes irrelevant.

Google Drive uploads the full file every time, regardless of size. This is inefficient and bandwidth intensive.

In extreme cases this becomes almost unusable. Imagine a scenario where you're storing files ranging from 500MB to a few gigabytes: any change to those files will prompt a full upload in services such as Google Drive, OneDrive etc.

Delta sync opens up possibilities such as keeping a virtual machine image inside a cloud provider folder and keeping the local and cloud copies in sync.

Hopefully this will help you make an informed decision when choosing a cloud storage provider.


Credits:

Symmetric IT is an IT support provider in Auckland.