> ## Documentation Index
> Fetch the complete documentation index at: https://docs.datafold.com/llms.txt
> Use this file to discover all available pages before exploring further.

# PostgreSQL

<Note>
  **INFO**

  Column-level Lineage is supported for AWS Aurora and RDS Postgres and *requires* Cloudwatch to be configured.
</Note>

**Steps to complete:**

1. [Run SQL script and create schema for Datafold](/integrations/databases/postgresql#run-sql-script-and-create-schema-for-datafold)
2. [Configure your data connection in Datafold](/integrations/databases/postgresql#configure-in-datafold)

## Run SQL script and create schema for Datafold

To connect to Postgres, you need to create a user with read-only access to all tables in all schemas, write access to Datafold-specific schema for temporary tables:

```Bash theme={null}
/* Datafold utilizes a temporary dataset to materialize scratch work and keep data processing in your warehouse. */

CREATE SCHEMA datafold_tmp;

/* Create a datafold user */

CREATE ROLE datafold WITH LOGIN ENCRYPTED PASSWORD 'SOMESECUREPASSWORD';

/* Give the datafold role write access to the temporary schema */

GRANT ALL ON SCHEMA datafold_tmp TO datafold;

/* Make sure that the postgres user has read permissions on the tables */

GRANT USAGE ON SCHEMA <myschema> TO datafold;
GRANT SELECT ON ALL TABLES IN SCHEMA <myschema> TO datafold;

```

Datafold utilizes a temporary schema, named `datafold_tmp` in the above script, to materialize scratch work and keep data processing in the your warehouse.

## Configure in Datafold

| Field Name                  | Description                                                     |
| --------------------------- | --------------------------------------------------------------- |
| Name                        | A name given to the data connection within Datafold             |
| Host                        | The hostname address for your database; default value 127.0.0.1 |
| Port                        | Postgres connection port; default value is 5432                 |
| User                        | The user role created in our SQL script, named datafold         |
| Password                    | The password created in our SQL script                          |
| Database Name               | The name of the Postgres database you want to connect to        |
| Schema for temporary tables | The schema (datafold\_tmp) created in our SQL script            |

Click **Create**. Your data connection is ready!

***

## Column-level Lineage with Aurora & RDS

This will guide you through setting up Column-level Lineage with AWS Aurora & RDS using CloudWatch.

**Steps to complete:**

1. [Setup Postgres with Permissions](#run-sql-script)
2. [Increase the logging verbosity of Postgres](#increase-logging-verbosity) so Datafold can parse lineage
3. [Set up an account for fetching the logs from CloudWatch.](#connect-datafold-to-cloudwatch)
4. [Configure your data connection in Datafold](#configure-in-datafold)

### Run SQL Script

To connect to Postgres, you need to create a user with read-only access to all tables in all schemas, write access to Datafold-specific schema for temporary tables:

```Bash theme={null}
/* Datafold utilizes a temporary dataset to materialize scratch work and keep data processing in the your warehouse. */

CREATE SCHEMA datafold_tmp;

/* Create a datafold user */

CREATE ROLE datafold WITH LOGIN ENCRYPTED PASSWORD 'SOMESECUREPASSWORD';

/* Give the datafole role write access to the temporary schema */

GRANT ALL ON SCHEMA datafold_tmp TO datafold;

/* Make sure that the postgres user has read permissions on the tables */

GRANT USAGE ON SCHEMA <myschema> TO datafold;
GRANT SELECT ON ALL TABLES IN SCHEMA <myschema> TO datafold;

```

### Increase logging verbosity

<Frame>
  <img src="https://mintcdn.com/datafold/9DgdnO4sVNte36u-/images/psql_aurora_dbs-89843982d984ed977c0254adca7a5fa0.png?fit=max&auto=format&n=9DgdnO4sVNte36u-&q=85&s=9714c4922ccea50fd944c73765583084" width="1277" height="820" data-path="images/psql_aurora_dbs-89843982d984ed977c0254adca7a5fa0.png" />
</Frame>

Then, create a new `Parameter Group`. Database instances run with default parameters that do not include logging verbosity. To turn on the logging verbosity, you'll need to create a new Parameter Group. Hit **Parameter Groups** on the menu and create a new Parameter Group.

<Frame>
  <img src="https://mintcdn.com/datafold/9DgdnO4sVNte36u-/images/psql_aurora_parameter_group-044563cebd48ae81a9d22ab2319d160e.png?fit=max&auto=format&n=9DgdnO4sVNte36u-&q=85&s=6bffe4b697e92e129c84c504967b3b4e" width="1277" height="886" data-path="images/psql_aurora_parameter_group-044563cebd48ae81a9d22ab2319d160e.png" />
</Frame>

Next, select the `aurora-postgresql10` parameter group family. This depends on the cluster that you're running. For Aurora serverless, this is the appropriate family.

Finally, set the `log_statement` enum field to `mod` - meaning that it will log all the DDL statements, plus data-modifying statements. Note: This field isn't set by default.

<Frame>
  <img src="https://mintcdn.com/datafold/9DgdnO4sVNte36u-/images/psql_aurora_logstatement-6f0ba20fd7217047ae62fd01cbfa50d4.png?fit=max&auto=format&n=9DgdnO4sVNte36u-&q=85&s=b1243c6b87aaa2c4b9efea52f51b8b61" width="1682" height="927" data-path="images/psql_aurora_logstatement-6f0ba20fd7217047ae62fd01cbfa50d4.png" />
</Frame>

After saving the parameter group, go back to your database, and select the database cluster parameter group.

<Frame>
  <img src="https://mintcdn.com/datafold/9DgdnO4sVNte36u-/images/psql_aurora_clustergroup-6a1c25e3eae1563130b7565a5b5f0ba7.png?fit=max&auto=format&n=9DgdnO4sVNte36u-&q=85&s=5fed528e2e74cbc506ca4ddda648eede" width="1682" height="927" data-path="images/psql_aurora_clustergroup-6a1c25e3eae1563130b7565a5b5f0ba7.png" />
</Frame>

### Connect Datafold to CloudWatch

Start by creating a new user to isolate the permissions as much as possible. Go to IAM and create a new user.

<Frame>
  <img src="https://mintcdn.com/datafold/9DgdnO4sVNte36u-/images/psql_aurora_iam_user-0d82fc2408ab78e2bd94b20e0e2d363e.png?fit=max&auto=format&n=9DgdnO4sVNte36u-&q=85&s=aa54c7573e13e8eaff8e277bd549b692" width="1682" height="927" data-path="images/psql_aurora_iam_user-0d82fc2408ab78e2bd94b20e0e2d363e.png" />
</Frame>

Next, create a new group named `CloudWatchLogsReadOnly` and attach the `CloudWatchLogsReadOnlyAccess` policy to it. Next, select the group.

<Frame>
  <img src="https://mintcdn.com/datafold/9DgdnO4sVNte36u-/images/psql_aurora_user_permissions-f48596fc79a01aea8d9aadd3688381ce.png?fit=max&auto=format&n=9DgdnO4sVNte36u-&q=85&s=130f10b905ba8f0c7360b671ec6459b1" width="1682" height="927" data-path="images/psql_aurora_user_permissions-f48596fc79a01aea8d9aadd3688381ce.png" />
</Frame>

When reviewing the user, it should have the freshly created group attached to it.

<Frame>
  <img src="https://mintcdn.com/datafold/9DgdnO4sVNte36u-/images/psql_aurora_user_review-637256675791599a381ee290bd7e05b7.png?fit=max&auto=format&n=9DgdnO4sVNte36u-&q=85&s=6f96f214f462c92d2304250cad4c11e5" width="1682" height="927" data-path="images/psql_aurora_user_review-637256675791599a381ee290bd7e05b7.png" />
</Frame>

After confirming the new user you should be given the `Access Key` and `Secret Key`. Save these two codes securely to finish configurations on Datafold.

The last piece of information Datafold needs is the CloudWatch Log Group. You will find this in CloudWatch under the Log Group section in the sidebar. It will be formatted as `/aws/rds/cluster/<my_cluster_name>/postgresql`.

<Frame>
  <img src="https://mintcdn.com/datafold/9DgdnO4sVNte36u-/images/psql_aurora_log_group-5dd6c4c2728cf4d55352976449d05c12.png?fit=max&auto=format&n=9DgdnO4sVNte36u-&q=85&s=0178488e4be5c0b38a28d569bf0d799f" width="1682" height="927" data-path="images/psql_aurora_log_group-5dd6c4c2728cf4d55352976449d05c12.png" />
</Frame>

### Configure in Datafold

| Field Name                    | Description                                                                                                                             |
| ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| Name                          | A name given to the data connection within Datafold                                                                                     |
| Host                          | The hostname address for your database; default value 127.0.0.1                                                                         |
| Port                          | Postgres connection port; default value is 5432                                                                                         |
| User                          | The user role created in the SQL script; datafold                                                                                       |
| Password                      | The password created in the SQL permissions script                                                                                      |
| Database Name                 | The name of the Postgres database you want to connect to                                                                                |
| AWS Access Key                | The Access Key provided in the [Connect Datafold to CloudWatch](/integrations/databases/postgresql#connect-datafold-to-cloudwatch) step |
| AWS Secret                    | The Secret Key provided in the [Connect Datafold to CloudWatch](/integrations/databases/postgresql#connect-datafold-to-cloudwatch) step |
| Cloudwatch Postgres Log Group | The path of the Log Group; formatted as /aws/rds/cluster/\<my\_cluster\_name>/postgresql                                                |
| Schema for temporary tables   | The schema created in the SQL setup script; datafold\_tmp                                                                               |

Click **Create**. Your data connection is ready!
