Datafold VPC Deployment on AWS
Learn how to deploy Datafold in a Virtual Private Cloud (VPC) on AWS.
INFO
VPC deployments are an Enterprise feature. Please email sales@datafold.com to enable your account.
Create a Domain Name (optional)
You can either use your own domain (for example, datafold.domain.tld) or a Datafold-managed domain (for example, yourcompany.dedicated.datafold.com).
Customer Managed Domain Name
Create a DNS A-record for the domain where Datafold will be hosted. For the DNS record, there are two options:
- Public-facing: When the domain is publicly available, we will provide an SSL certificate for the endpoint.
- Internal: It is also possible to have Datafold disconnected from the internet. This would require an internal DNS (for example, AWS Route 53) record that points to the Datafold instance. It is possible to provide your own certificate for setting up the SSL connection.
Once the deployment is complete, you will point that A-record to the IP address of the Datafold service.
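For the internal option, the A-record can be described as a Route 53 change batch. The sketch below is only an illustration: the function name build_a_record_change, the TTL of 300 seconds, and the 203.0.113.10 address are assumptions, and the real IP address only becomes known once the deployment is complete.

```python
import json


def build_a_record_change(domain, ip_address, ttl=300):
    """Build a Route 53 change batch that creates or updates one A-record.

    The helper name and the default TTL are illustrative assumptions.
    """
    return {
        "Comment": "Point the Datafold domain at the Datafold service",
        "Changes": [
            {
                # UPSERT creates the record if missing, otherwise updates it.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": domain,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ip_address}],
                },
            }
        ],
    }


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only placeholder, not a Datafold address.
    batch = build_a_record_change("datafold.domain.tld", "203.0.113.10")
    print(json.dumps(batch, indent=2))
```

The printed JSON has the shape expected by the --change-batch argument of aws route53 change-resource-record-sets, together with the ID of your own hosted zone.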
Give Datafold Access to AWS
To set up Datafold, create a separate AWS account within your organization where we can deploy Datafold. This follows AWS best practices for granting third-party access.
Create a separate AWS account for Datafold
First, create a new account for Datafold. Go to My Organization to add an account.
Click Add an AWS Account.
You can name this account anything that helps identify it clearly. In our examples, we name it Datafold. Make sure that the email address of the owner isn’t used by another account.
When you click the Create AWS Account button, you’ll be returned to the organization screen and see a notification that the new account is being created. Refresh the page after a few minutes, and the account should appear in the organization's account list.
Grant Third-Party access to Datafold
To make sure that deployment runs as expected, your Datafold Support Engineer may need access to the Datafold-specific AWS account that you created. The access can be revoked after the deployment if needed.
To grant access, log into the account created in the previous step. You can switch to the newly created account using the Switch Role page.
By default, the role name is OrganizationAccountAccessRole.
Click Switch Role to log in to the Datafold account.
Grant Access to Datafold
Next, we need to allow Datafold to access the account. We do this by allowing the Datafold AWS account to access your AWS workspace. Go to the IAM page or type IAM in the search bar.
Go to the Roles page, and click the Create Role button.
Select Another AWS Account, and use account ID 710753145501, which is Datafold’s account ID. Select Require MFA and click Next: Permissions.
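For reference, the console choices in this step correspond to a role trust policy along the following lines. This is a sketch of what such a policy typically looks like, not a copy of what your console will generate; the helper name build_trust_policy is an assumption made for illustration.

```python
import json

# Datafold's AWS account ID, as given in the step above.
DATAFOLD_ACCOUNT_ID = "710753145501"


def build_trust_policy(trusted_account_id, require_mfa=True):
    """Build a trust policy letting another AWS account assume a role.

    Hypothetical helper: it mirrors the "Another AWS Account" and
    "Require MFA" console options as a plain dictionary.
    """
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if require_mfa:
        # Only allow the role to be assumed by sessions authenticated with MFA.
        statement["Condition"] = {"Bool": {"aws:MultiFactorAuthPresent": "true"}}
    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    print(json.dumps(build_trust_policy(DATAFOLD_ACCOUNT_ID), indent=2))
```

After creating the role, you can compare this output with the role's Trust relationships tab to confirm that only the Datafold account is trusted and that MFA is required.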
On the Permissions page, attach the AdministratorAccess policy so that Datafold can manage the resources within the account, or see Minimal IAM Permissions for a narrower set of permissions.
Next, you can set Tags; however, they are not a requirement.
Finally, give the role a name of your choice. Be careful not to duplicate the account name. If you named the account Datafold in an earlier step, you may want to name the role Datafold-role.
Click Create Role to complete this step.
Now that the role is created, you should be routed back to a list of roles in your organization.
Click on your newly created role to get a sharable link for the account and store this in your password manager. When setting up your deployment with a support engineer, Datafold will use this link to gain access to the account.
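The link shown on the role page is normally a console switch-role URL. As a rough sketch of its shape, the helper below assembles one; the function name build_switch_role_url and the account ID 123456789012 are placeholders for illustration, so always prefer the link the console actually displays.

```python
from urllib.parse import urlencode


def build_switch_role_url(account_id, role_name, display_name=None):
    """Assemble an AWS console switch-role link for a role in an account.

    Hypothetical helper; account_id is the ID of the account that owns
    the role (the dedicated Datafold account you created).
    """
    params = {"account": account_id, "roleName": role_name}
    if display_name:
        params["displayName"] = display_name
    # Keep "/" unescaped so role names that include a path stay readable.
    return "https://signin.aws.amazon.com/switchrole?" + urlencode(params, safe="/")


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID, not a real one.
    print(build_switch_role_url("123456789012", "Datafold-role"))
```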
After validating the deployment with your support engineer, and making sure that everything works as it should, we will let you know when it’s clear to revoke the credentials.
Minimal IAM Permissions
Because we work in an account dedicated to Datafold, there is no direct access to your resources unless explicitly configured (e.g., VPC Peering). The following IAM policy is required to update and maintain the infrastructure.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "acm:AddTagsToCertificate",
        "acm:DeleteCertificate",
        "acm:DescribeCertificate",
        "acm:GetCertificate",
        "acm:ListCertificates",
        "acm:ListTagsForCertificate",
        "acm:RemoveTagsFromCertificate",
        "acm:RequestCertificate",
        "acm:UpdateCertificateOptions",
        "autoscaling:*",
        "ec2:*",
        "eks:*",
        "elasticloadbalancing:*",
        "iam:GetPolicy",
        "iam:GetPolicyVersion",
        "iam:GetOpenIDConnectProvider",
        "iam:GetRole",
        "iam:GetRolePolicy",
        "iam:GetUserPolicy",
        "iam:GetUser",
        "iam:ListAccessKeys",
        "iam:ListAttachedRolePolicies",
        "iam:ListGroupsForUser",
        "iam:ListInstanceProfilesForRole",
        "iam:ListPolicies",
        "iam:ListPolicyVersions",
        "iam:ListRolePolicies",
        "iam:PassRole",
        "iam:TagOpenIDConnectProvider",
        "iam:TagPolicy",
        "iam:TagRole",
        "iam:TagUser",
        "kms:CreateAlias",
        "kms:CreateGrant",
        "kms:CreateKey",
        "kms:Decrypt",
        "kms:DeleteAlias",
        "kms:DescribeKey",
        "kms:DisableKey",
        "kms:GenerateDataKey",
        "kms:GetKeyPolicy",
        "kms:GetKeyRotationStatus",
        "kms:ListAliases",
        "kms:ListResourceTags",
        "kms:PutKeyPolicy",
        "kms:RevokeGrant",
        "kms:ScheduleKeyDeletion",
        "kms:TagResource",
        "logs:CreateLogGroup",
        "logs:DeleteLogGroup",
        "logs:DescribeLogGroups",
        "logs:ListTagsLogGroup",
        "logs:PutRetentionPolicy",
        "logs:TagResource",
        "rds:*",
        "s3:*"
      ],
      "Resource": "*"
    }
  ]
}
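If you want to review which services this policy opens up completely, a small script can list the service-wide wildcards. The sketch below is illustrative only: wildcard_services is a hypothetical helper, and POLICY_EXCERPT is a shortened copy of the policy above used to keep the example brief.

```python
from typing import List


def wildcard_services(policy: dict) -> List[str]:
    """Return the services granted as "service:*" in Allow statements."""
    services = set()
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            # IAM allows a single action to be given as a plain string.
            actions = [actions]
        for action in actions:
            service, _, name = action.partition(":")
            if name == "*":
                services.add(service)
    return sorted(services)


# Shortened excerpt of the policy above, for illustration only.
POLICY_EXCERPT = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "acm:ListCertificates",
                "autoscaling:*",
                "ec2:*",
                "eks:*",
                "elasticloadbalancing:*",
                "iam:PassRole",
                "rds:*",
                "s3:*",
            ],
            "Resource": "*",
        }
    ],
}

if __name__ == "__main__":
    print(wildcard_services(POLICY_EXCERPT))
```

Running the same function on the full policy, loaded with json.load, should report the same six services, since the remaining ACM, IAM, KMS, and CloudWatch Logs entries are individual actions.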
We need some additional permissions from time to time, for example, during the first deployment. Since those are IAM-related, we will ask for temporary permissions when required.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:AttachRolePolicy",
        "iam:CreateAccessKey",
        "iam:CreateOpenIDConnectProvider",
        "iam:CreatePolicy",
        "iam:CreateRole",
        "iam:CreateUser",
        "iam:DeleteAccessKey",
        "iam:DeleteOpenIDConnectProvider",
        "iam:DeletePolicy",
        "iam:DeleteRole",
        "iam:DeleteRolePolicy",
        "iam:DeleteUser",
        "iam:DeleteUserPolicy",
        "iam:DetachRolePolicy",
        "iam:PutRolePolicy",
        "iam:PutUserPolicy"
      ],
      "Resource": "*"
    }
  ]
}
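One possible way to keep these permissions temporary is to add an expiry condition when you attach them, so the grant lapses even if nobody remembers to detach the policy. The sketch below is an assumption on our side, not a required step: build_expiring_policy is a hypothetical helper that wraps the actions above in a DateLessThan condition on the aws:CurrentTime key.

```python
import json
from datetime import datetime, timedelta, timezone

# The temporary IAM actions listed in the policy above.
TEMPORARY_IAM_ACTIONS = [
    "iam:AttachRolePolicy", "iam:CreateAccessKey",
    "iam:CreateOpenIDConnectProvider", "iam:CreatePolicy",
    "iam:CreateRole", "iam:CreateUser", "iam:DeleteAccessKey",
    "iam:DeleteOpenIDConnectProvider", "iam:DeletePolicy",
    "iam:DeleteRole", "iam:DeleteRolePolicy", "iam:DeleteUser",
    "iam:DeleteUserPolicy", "iam:DetachRolePolicy",
    "iam:PutRolePolicy", "iam:PutUserPolicy",
]


def build_expiring_policy(actions, valid_for, now=None):
    """Build a policy whose Allow statement stops matching after a deadline.

    Hypothetical helper. `now` must be timezone-aware if provided;
    it defaults to the current UTC time.
    """
    now = now or datetime.now(timezone.utc)
    deadline = (now + valid_for).astimezone(timezone.utc)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",
                # Requests made at or after this UTC timestamp are not allowed.
                "Condition": {
                    "DateLessThan": {
                        "aws:CurrentTime": deadline.strftime("%Y-%m-%dT%H:%M:%SZ")
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = build_expiring_policy(TEMPORARY_IAM_ACTIONS, timedelta(hours=24))
    print(json.dumps(policy, indent=2))
```

The 24-hour window is only an example; agree on the actual duration with your support engineer, and detach the policy once the work is done regardless of the expiry.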