
Datafold VPC Deployment on GCP

Info: VPC deployments are an Enterprise feature. Please email sales@datafold.com to enable your account.

Create a Domain Name (optional)

You can use either your own domain (for example, datafold.domain.tld) or a Datafold-managed domain (for example, yourcompany.dedicated.datafold.com).

Customer Managed Domain Name

Create a DNS A-record for the domain where Datafold will be hosted. For the DNS record, there are two options:

  • Public-facing: When the domain is publicly available, we will provide an SSL certificate for the endpoint.
  • Internal: Datafold can also run disconnected from the internet. This requires an internal DNS record (for example, in a private Cloud DNS zone) that points to the Datafold instance. You can provide your own certificate for the SSL connection.

Once the deployment is complete, you will point that A-record to the IP address of the Datafold service.
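Once the A-record has been updated, you may want to confirm that it resolves to the expected address. The following is a minimal sketch using only the Python standard library. The function names and the IP address in the usage note are hypothetical, and the check assumes the machine running it uses the same DNS (public or internal) as your Datafold users.

```python
import socket
from typing import Callable, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the sorted, de-duplicated IPv4 addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})


def a_record_points_to(
    hostname: str,
    expected_ip: str,
    resolver: Callable[[str], List[str]] = resolve_ipv4,
) -> bool:
    """Return True if expected_ip is among the addresses hostname resolves to."""
    try:
        return expected_ip in resolver(hostname)
    except socket.gaierror:
        # The name does not resolve (yet), for example before DNS propagation.
        return False
```

For example, a_record_points_to("datafold.domain.tld", "203.0.113.10") returns True only once the record resolves to that (placeholder) address. The resolver parameter exists so the check can be exercised without network access.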

Create a New Project

For isolation reasons, it is best practice to create a new project within your GCP organization. Please call it something like yourcompany-datafold to make it easy to identify.

After a minute or so, you should receive confirmation that the project has been created. Afterward, you should be able to see the new project.
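If you prefer the command line to the console, the project can also be created with the gcloud CLI. The sketch below only builds the command string; it does not run it. The helper name and the company value are hypothetical, and the check reflects the GCP project ID rules (6 to 30 characters; lowercase letters, digits, and hyphens; starts with a letter; does not end with a hyphen).

```python
import re
import shlex

# GCP project ID rules: 6-30 characters, lowercase letters, digits and
# hyphens, starting with a letter and not ending with a hyphen.
PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def project_create_command(company: str) -> str:
    """Build a `gcloud projects create` command for a <company>-datafold project."""
    project_id = f"{company}-datafold"
    if not PROJECT_ID_RE.fullmatch(project_id):
        raise ValueError(f"{project_id!r} is not a valid GCP project ID")
    return shlex.join(["gcloud", "projects", "create", project_id])


if __name__ == "__main__":
    # "yourcompany" is a placeholder; substitute your own company name.
    print(project_create_command("yourcompany"))
```

Depending on how your organization is set up, you may also need to pass an organization or folder to gcloud; that is left out of this sketch.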

Set IAM Permissions

Navigate to the IAM tab in the sidebar and click Grant Access to invite Datafold to the project.

Add your Datafold solutions engineer as a principal. There are two options for assigning IAM permissions to Datafold engineers:

  1. Assign them as an owner of your project.
  2. Assign the extended set of Minimal IAM Permissions.

The owner role is only required temporarily while we configure and test the initial Datafold deployment. We'll inform you when it is ok to revoke this permission and provide us with only the Minimal IAM Permissions.

Required APIs

The following GCP APIs must also be enabled to run Datafold:

  1. Compute Engine API
  2. Secret Manager API
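Both APIs can be enabled from the console or with the gcloud CLI. As a minimal sketch, the snippet below builds the corresponding `gcloud services enable` command. The helper name and the project ID are hypothetical; the service names are the standard identifiers for these two APIs.

```python
import shlex

# Display name -> service identifier used by `gcloud services enable`.
REQUIRED_SERVICES = {
    "Compute Engine API": "compute.googleapis.com",
    "Secret Manager API": "secretmanager.googleapis.com",
}


def enable_services_command(project_id: str) -> str:
    """Build one gcloud command that enables every required API in project_id."""
    return shlex.join(
        ["gcloud", "services", "enable", *REQUIRED_SERVICES.values(),
         f"--project={project_id}"]
    )


if __name__ == "__main__":
    # "yourcompany-datafold" is a placeholder project ID.
    print(enable_services_command("yourcompany-datafold"))
```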

Datafold also uses the following GCP APIs, which are enabled by default when the project is created:

  1. Cloud Logging API
  2. Cloud Monitoring API
  3. Cloud Storage
  4. Service Networking API

Once the access has been granted, make sure to notify Datafold so we can initiate the deployment.

Minimal IAM Permissions

Because we work in a project dedicated to Datafold, we have no direct access to your other resources unless explicitly configured (e.g., VPC Peering). The following IAM roles are required to update and maintain the infrastructure.

cloudsql.admin
compute.loadBalancerAdmin
compute.networkAdmin
compute.securityAdmin
compute.storageAdmin
container.admin
container.clusterAdmin
iam.roleViewer
iam.serviceAccountUser
iap.tunnelResourceAccessor
storage.admin
viewer
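As an illustration, the roles above could be granted with one `gcloud projects add-iam-policy-binding` call per role. The sketch below generates those commands without running them. It assumes the short names listed here correspond to predefined roles with the usual `roles/` prefix; the helper name, project ID, and member email are hypothetical placeholders.

```python
import shlex
from typing import List, Sequence

# Role names as listed above; the "roles/" prefix is added when building commands.
MINIMAL_ROLES = (
    "cloudsql.admin",
    "compute.loadBalancerAdmin",
    "compute.networkAdmin",
    "compute.securityAdmin",
    "compute.storageAdmin",
    "container.admin",
    "container.clusterAdmin",
    "iam.roleViewer",
    "iam.serviceAccountUser",
    "iap.tunnelResourceAccessor",
    "storage.admin",
    "viewer",
)


def grant_commands(
    project_id: str, member: str, roles: Sequence[str] = MINIMAL_ROLES
) -> List[str]:
    """Build one add-iam-policy-binding command per role for the given member."""
    return [
        shlex.join(
            ["gcloud", "projects", "add-iam-policy-binding", project_id,
             f"--member={member}", f"--role=roles/{role}"]
        )
        for role in roles
    ]


if __name__ == "__main__":
    # Placeholder project and principal; use the address Datafold gives you.
    for command in grant_commands("yourcompany-datafold", "user:engineer@example.com"):
        print(command)
```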

Some roles are needed only occasionally, for example during the first deployment. Because these roles are IAM-related, we will ask for them as temporary permissions when required.

iam.roleAdmin
iam.securityAdmin
iam.serviceAccountKeyAdmin
iam.serviceAccountAdmin
serviceusage.serviceUsageAdmin
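Because these roles are temporary, it can help to prepare the grant and the matching revocation together. The sketch below builds both sets of gcloud commands, using `add-iam-policy-binding` and `remove-iam-policy-binding`. It assumes the names above map to predefined roles with the `roles/` prefix; the helper name, project ID, and member email are hypothetical placeholders.

```python
import shlex
from typing import Dict, List

# Temporary roles as listed above.
TEMPORARY_ROLES = (
    "iam.roleAdmin",
    "iam.securityAdmin",
    "iam.serviceAccountKeyAdmin",
    "iam.serviceAccountAdmin",
    "serviceusage.serviceUsageAdmin",
)


def temporary_role_commands(project_id: str, member: str) -> Dict[str, List[str]]:
    """Build paired grant and revoke commands for each temporary role."""

    def build(action: str) -> List[str]:
        return [
            shlex.join(
                ["gcloud", "projects", action, project_id,
                 f"--member={member}", f"--role=roles/{role}"]
            )
            for role in TEMPORARY_ROLES
        ]

    return {
        "grant": build("add-iam-policy-binding"),
        "revoke": build("remove-iam-policy-binding"),
    }


if __name__ == "__main__":
    commands = temporary_role_commands(
        "yourcompany-datafold", "user:engineer@example.com"
    )
    print("\n".join(commands["grant"]))
    print("\n".join(commands["revoke"]))
```

Keeping the revoke list next to the grant list makes it easier to remove exactly the roles that were added once Datafold confirms they are no longer needed.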