Cloning Firebase Firestore Data

Although Firebase began development in 2011 and was bought by Google in 2014, it is still very much a work in progress. Firestore, Firebase’s NoSQL database, was launched in 2017 and is still in public beta.

One very desirable feature that has not yet been implemented in Firebase is the ability to clone Projects. While an elegant clone of a complete Firebase Project remains unavailable, this document will help you clone the data from one Firestore Database to another with only a few commands. The process as a whole is actually quite simple. The only real challenge was finding out which commands open up the access rights to the data Bucket.

It would be my hope that this document is very quickly obviated by the addition of Project cloning as a native feature in Firebase, but there is no telling how long it will take for this wish to come true.

Create destination Project

The very first thing that you need to do is to create the destination Project that will receive the data. That is, make your dataless clone.

There is nothing special to take into account when creating the destination Project, other than that it needs to have the same Location as your source Project. For some reason, Google does not yet allow copying data between different geographic locations.

I am assuming that you already know how to create a Firebase Project, so I won’t include any instructions for this step.

Upgrade to paid

There is one very important caveat to note up front: this data cloning won’t be free. In order to clone the data, we will need to upgrade the Firebase Project from the “Spark” plan to the “Blaze” plan and then create Buckets in Google’s Cloud Console to temporarily hold the data we want to clone. Please refer to the Firestore documentation for information on Google’s Firestore pricing. You can also change the billing of your Project in the Cloud Console, but for our purposes it is easier to change it in Firebase.

The Project billing type is found at the very bottom left of your browser window. Note that you will need to enable billing on both the source and destination Projects.

Once you click on “Upgrade” you will receive a dialog on the various pricing plans.  Click on the “Select Plan” button under Blaze and you’re good to go.

It is important to note that, by default, Google limits you to 2 Projects on a paid plan. This isn’t much of an issue for testing purposes, as you only need 2 paid Projects, and once the cloning is complete you can easily return to the free Spark plan.

Create a Database

After your destination Firebase Project is created and set to the Blaze plan, you will want to create your Firestore Database in it. Open your destination Project, select the “Database” menu item and click on the subsequent “Create Database” button.

After the Database is created, you will be asked which security rules you would like to start with. For the purposes of our work, you will want to start in test mode. You can always manually change the rules after the data is imported.
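For reference, test-mode rules simply leave the Database open to everyone. A minimal sketch of what such a ruleset looks like (the exact rules Firebase generates for you may differ, e.g. by including an expiry date):

```
rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    // WARNING: test mode – anyone can read and write.
    // Lock this down after the import is done.
    match /{document=**} {
      allow read, write: if true;
    }
  }
}
```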

That’s all for Firebase right now. Google Cloud Console is next.

Create a Bucket

Once you have your Firebase Projects and Databases set up, head on over to Google’s Cloud Platform, click on the “To Console” button, open the Navigation menu, scroll down to “Storage” and select “Browser”.

It could be that your Firebase Project is not selected when you get into the Storage management facility. Double check this by looking at the Project selected and changing it if necessary. The name of the project is the first element to the right of the “Google Cloud Platform” menu text.

What can be a bit confusing is that each Firebase Project has both a Name and an ID. While you can change the Name of the Project at any time and it only needs to be unique among your own Projects, the ID cannot (currently) be changed after it is created and must be globally unique across all Projects in Google. No – you don’t have to dig through a global list of Project IDs – Google selects a unique ID for you when the Firebase Project is created, usually the Project Name followed by a random number if it is not the first global instance of the Name.

If you are unsure what the Project ID for your Firebase Project is, go to your Firebase Project settings.
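Alternatively, the Cloud Shell (introduced below) can list your Project IDs from the command line. These are standard gcloud commands and assume you are logged into the correct Google account:

```shell
# List all Projects (and their IDs) visible to your account
gcloud projects list

# Show which Project the shell is currently configured to use
gcloud config get-value project
```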

Once the correct Project is selected, you will be shown a list of Buckets associated with the Project. Create a Bucket that will hold the data you are about to export by clicking on the “CREATE BUCKET” button.

One thing you will see very quickly when you create your Bucket is that, just like Project IDs, all Bucket names have to be unique across the whole Google Cloud Storage platform. So, the most obvious Bucket names are already taken. In this case, Google isn’t (currently) suggesting alternate names or suffixing duplicate names with random numbers, so you’ll have to be a little creative here.

When you make your Bucket, ensure that the Location you select is consistent with your Firebase Project’s Location. Everything else can be left at default values for the purpose of this exercise.
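If you prefer the command line over the browser UI, the Bucket can also be created from the Cloud Shell. This is a sketch using gsutil; the bracketed placeholders are yours to fill in:

```shell
# Create a Bucket in the same Location as your Firebase Projects.
# -p selects the owning Project, -l the Location (e.g. europe-west1).
gsutil mb -p [SOURCE_PROJECT_ID] -l [LOCATION] gs://[BUCKET_NAME]
```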

Cloud Console work

Once your Bucket is created, you’ll need to go to the Cloud Shell, which is accomplished by clicking on the first icon on the top right hand side of your browser window.

If everything goes according to plan, the shell window will open at the bottom of your browser window. It could take a while to initialise the shell, so patience will be a virtue today.

Set your Project

The first command you will want to enter is to ensure that you are in the correct Project. Google is kind enough to remind you of the syntax for this command, which is:

gcloud config set project [PROJECT_ID]

Note that you will not need to quote the Project ID, but it is case sensitive. Interestingly, it will allow you to enter a subset of the Project ID. For example, the Project “experiment-with-firebase” will be chosen when you enter “experiment” or even “ex”, assuming that this is the only Project ID in your account that starts with that string of characters.

Export data from Firestore

The next step is to copy the data from your Project to the newly created Bucket with the command:

gcloud beta firestore export gs://[BUCKET_NAME]

The export can take a while, but it will be clear from the Cloud Shell when it is complete and if it is successful.
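If you only want to clone part of your data, the export command also accepts a --collection-ids flag. A sketch, where the collection names are purely illustrative:

```shell
# Export everything in the Database:
gcloud beta firestore export gs://[BUCKET_NAME]

# Or export only selected collections ("users" and "orders" are example names):
gcloud beta firestore export --collection-ids='users','orders' gs://[BUCKET_NAME]
```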

Move to Destination Project

Once the export is complete you will want to change to the Project for your cloning destination. Use the ID of the Project that you created at the beginning of this procedure.

gcloud config set project [DESTINATION_PROJECT_ID]

Change Access Rights

Now you will want to set the access control of the Bucket.  I have been told that you MUST be in the recipient Project to do this, but don’t see any logic that would dictate this.

gsutil acl ch -u [RIGHTS_RECIPIENT]:R gs://[BUCKET_NAME]

The RIGHTS_RECIPIENT is the email address of the Google account that should receive the read rights. It strikes me as odd that this is required: even if both the source and destination Projects and the data Bucket are under the same Google account, for some reason I need to explicitly grant myself access.
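You can verify that the rights were actually granted by dumping the Bucket’s access control list, using a standard gsutil command:

```shell
# Show the Bucket's current ACL; the entry for [RIGHTS_RECIPIENT]
# should now include the READ role.
gsutil acl get gs://[BUCKET_NAME]
```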

Import data into your Firestore Project

For the grand finale (drumroll please), you will issue the command to import the Bucket data into your Firestore Project.

gcloud beta firestore import gs://[BUCKET_NAME]/[TIMESTAMPED_DIRECTORY]

Note that you don’t just import the Bucket name.  You have to go into the Bucket and copy the name of the subdirectory that was automatically created for you with the “export” command.  The import command will look for a file that has the prefix of the timestamp and the suffix of “.overall_export_metadata”.
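If you don’t want to copy the directory name from the browser, you can also list the Bucket from the Cloud Shell. The timestamp shown here is purely a hypothetical example:

```shell
# List the top level of the Bucket to find the export directory
gsutil ls gs://[BUCKET_NAME]/

# Example (hypothetical) result:
#   gs://[BUCKET_NAME]/2019-01-15T10:00:00_12345/
# That directory name is what you pass to the import command.
```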

A couple of things are worth noting. First, when you do the import, the correct structure is maintained – as you would hope, and maybe even expect. Second, if you already have data in your destination Project, it will be overwritten. This was helpful when I was experimenting, because it meant that I didn’t have to delete the existing data manually first. I don’t know if there is a command available to prevent the data from being replaced and would welcome comments that could clarify this.
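Putting it all together, the whole Cloud Shell portion of the procedure can be sketched as follows. All bracketed placeholders are yours to fill in, and the timestamped directory name must be read from the Bucket after the export completes:

```shell
# 1. Select the source Project and export its Firestore data
gcloud config set project [SOURCE_PROJECT_ID]
gcloud beta firestore export gs://[BUCKET_NAME]

# 2. Switch to the destination Project
gcloud config set project [DESTINATION_PROJECT_ID]

# 3. Grant yourself read rights on the Bucket
gsutil acl ch -u [RIGHTS_RECIPIENT]:R gs://[BUCKET_NAME]

# 4. Find the timestamped export directory, then import it
gsutil ls gs://[BUCKET_NAME]/
gcloud beta firestore import gs://[BUCKET_NAME]/[TIMESTAMPED_DIRECTORY]
```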

Did I screw anything up or leave out important information? Please let me know how this blog post can be improved!