Active Storage - Migrate between providers, from local to amazon


Recently, I've come to like the ActiveStorage functionality in newer Rails versions (5.2 upwards).

As ActiveStorage is relatively new, some things are still missing, like validations. On the other hand, standardizing on an attachment solution brings many advantages: better documentation, community support, and extensions that are easier to build and use, like previewers, encoders, etc.

In this guide I want to show how we migrated an existing ActiveStorage integration from the local filesystem into the "Cloud". Thanks to the modularity of the service layer, the same approach works for migrating between any two services.

I. Add new service to storage.yml

local:
  service: Disk
  root: <%= Rails.root.join("storage") %>

# Use `rails secrets:edit` (or `rails credentials:edit` on 5.2+) to set the
# AWS values under the aws: key (access_key_id, secret_access_key, region, bucket)
amazon:
  service: S3
  access_key_id: <%= Rails.application.secrets.dig(:aws, :access_key_id) %>
  secret_access_key: <%= Rails.application.secrets.dig(:aws, :secret_access_key) %>
  region: <%= Rails.application.secrets.dig(:aws, :region) %>
  bucket: <%= Rails.application.secrets.dig(:aws, :bucket) %>

Make sure your new service is referenced in config/storage.yml.
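Once the migration below has run, the app is switched over by pointing the environment config at the new service key. A minimal sketch (the symbol must match the key in storage.yml):

```ruby
# config/environments/production.rb
# Flip this only AFTER the migration script has copied all blobs.
Rails.application.configure do
  config.active_storage.service = :amazon
end
```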

II. Add migration script

First, add a migration script that you deploy to your production host. You could also run it from a Rails migration if you like, but we prefer keeping a separate scripts directory and loading the script from the Rails console. Also add all the credentials the new service adapter requires to config/storage.yml and your secrets/credentials file, but don't switch the active service just yet.
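Loading such a script from the console might look like this (the scripts directory and filename are our convention, not a Rails default):

```ruby
# In a Rails console on the production host:
load Rails.root.join("scripts", "migrate_active_storage.rb")
```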

The first part of the script defines an ActiveStorage::Downloader, which has not been released in any Rails 5.2 version yet. If you are running Rails 6, you might not need that part (check whether this commit has already been released).

# Part 1 - AS::Downloader
module ActiveStorage
  class Downloader #:nodoc:
    def initialize(blob, tempdir: nil)
      @blob    = blob
      @tempdir = tempdir
    end

    def download_blob_to_tempfile
      open_tempfile do |file|
        download_blob_to file
        verify_integrity_of file
        yield file
      end
    end

    private
      attr_reader :blob, :tempdir

      def open_tempfile
        file = Tempfile.open([ "ActiveStorage-#{blob.id}-", blob.filename.extension_with_delimiter ], tempdir)

        begin
          yield file
        ensure
          file.close!
        end
      end

      def download_blob_to(file)
        file.binmode
        blob.download { |chunk| file.write(chunk) }
        file.flush
        file.rewind
      end

      def verify_integrity_of(file)
        unless Digest::MD5.file(file).base64digest == blob.checksum
          raise ActiveStorage::IntegrityError
        end
      end
  end
end

module AsDownloadPatch
  # define a blob.open method
  def open(tempdir: nil, &block)
    ActiveStorage::Downloader.new(self, tempdir: tempdir).download_blob_to_tempfile(&block)
  end
end
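The integrity check in verify_integrity_of can be tried in isolation: ActiveStorage stores a Base64-encoded MD5 digest for each blob and recomputes it after download to catch truncated or corrupted files. A standalone sketch in plain Ruby (no Rails required):

```ruby
require "digest"
require "tempfile"

# What the blobs table holds in its checksum column:
stored_checksum = Digest::MD5.base64digest("hello world")

Tempfile.create("blob-demo") do |file|
  file.binmode
  file.write("hello world") # stands in for blob.download
  file.flush

  # Same check as ActiveStorage::Downloader#verify_integrity_of:
  actual = Digest::MD5.file(file.path).base64digest
  raise "IntegrityError" unless actual == stored_checksum
  puts "checksum ok: #{actual}"
end
```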

Then follows the migration code: we grab each blob, download it locally, and push it into the target service.

# ... below into the same file

# Rails.application.config.to_prepare do
#  wrap the include in a to_prepare block if you intend to run the code in a
#  development environment.
# end
ActiveStorage::Blob.send(:include, AsDownloadPatch)

def migrate(from, to)
  configs = Rails.configuration.active_storage.service_configurations
  from_service = ActiveStorage::Service.configure from, configs
  to_service   = ActiveStorage::Service.configure to, configs

  ActiveStorage::Blob.service = from_service

  puts "#{ActiveStorage::Blob.count} Blobs to go..."
  ActiveStorage::Blob.find_each do |blob|
    print '.'
    blob.open do |tf|
      checksum = blob.checksum
      to_service.upload(blob.key, tf, checksum: checksum)
    end
  end
end

migrate(:local, :amazon)
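Before flipping config.active_storage.service, it can be worth a sanity check that every key actually arrived on the target service. A hypothetical helper (the verify name is ours; ActiveStorage::Service#exist? is public API):

```ruby
# Run in the Rails console after the migration:
# confirm every blob key now exists on the target service.
def verify(to)
  configs = Rails.configuration.active_storage.service_configurations
  to_service = ActiveStorage::Service.configure(to, configs)

  missing = ActiveStorage::Blob.pluck(:key).reject { |key| to_service.exist?(key) }
  if missing.empty?
    puts "All blobs present on #{to}."
  else
    puts "#{missing.size} blobs missing, e.g. #{missing.first(10).inspect}"
  end
end

verify(:amazon)
```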

As you can see, because all ActiveStorage services share a common interface, copying between them is relatively easy.

On the other hand, if you are migrating from S3 to S3, you might not need this (expensive) script and can simply copy the bucket contents by other means, e.g. `aws s3 sync s3://old-bucket s3://new-bucket`.

Full script

module ActiveStorage
  class Downloader #:nodoc:
    def initialize(blob, tempdir: nil)
      @blob    = blob
      @tempdir = tempdir
    end

    def download_blob_to_tempfile
      open_tempfile do |file|
        download_blob_to file
        verify_integrity_of file
        yield file
      end
    end

    private
      attr_reader :blob, :tempdir

      def open_tempfile
        file = Tempfile.open([ "ActiveStorage-#{blob.id}-", blob.filename.extension_with_delimiter ], tempdir)

        begin
          yield file
        ensure
          file.close!
        end
      end

      def download_blob_to(file)
        file.binmode
        blob.download { |chunk| file.write(chunk) }
        file.flush
        file.rewind
      end

      def verify_integrity_of(file)
        unless Digest::MD5.file(file).base64digest == blob.checksum
          raise ActiveStorage::IntegrityError
        end
      end
  end
end

module AsDownloadPatch
  def open(tempdir: nil, &block)
    ActiveStorage::Downloader.new(self, tempdir: tempdir).download_blob_to_tempfile(&block)
  end
end

Rails.application.config.to_prepare do
  ActiveStorage::Blob.send(:include, AsDownloadPatch)
end

def migrate(from, to)
  configs = Rails.configuration.active_storage.service_configurations
  from_service = ActiveStorage::Service.configure from, configs
  to_service   = ActiveStorage::Service.configure to, configs

  ActiveStorage::Blob.service = from_service

  puts "#{ActiveStorage::Blob.count} Blobs to go..."
  ActiveStorage::Blob.find_each do |blob|
    print '.'
    blob.open do |tf|
      checksum = blob.checksum
      to_service.upload(blob.key, tf, checksum: checksum)
    end
  end
end

migrate(:local, :amazon)