Enable your PostgreSQL database to accept connections from
Cloud Data Fusion. To do this securely, we recommend that you use a
private Cloud Data Fusion instance.
Required roles
To get the permissions that you need to connect to a PostgreSQL database, ask your administrator to grant you the following IAM roles:
- Dataproc Worker (roles/dataproc.worker) on the Dataproc service account in the project that contains the cluster
- Cloud Data Fusion Runner (roles/datafusion.runner) on the Dataproc service account in the project that contains the cluster
- To use Cloud SQL without the Cloud SQL Auth Proxy: Cloud SQL Client (roles/cloudsql.client) on the project that contains the Cloud SQL instance
Store your PostgreSQL password as a secure key
Store your PostgreSQL password as a secure key so that it is encrypted in your
Cloud Data Fusion instance. For more information about keys, see
Cloud KMS.
1. In the Cloud Data Fusion UI, click System admin > Configuration.
2. Click Make HTTP Calls.
3. In the dropdown menu, choose PUT.
4. In the path field, enter namespaces/default/securekeys/pg_password.
5. In the Body field, enter {"data":"POSTGRESQL_PASSWORD"}. Replace POSTGRESQL_PASSWORD with your PostgreSQL password.
6. Click Send.
The Response field notifies you of any errors.
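The same PUT call can be sketched outside the UI. The following Python example only builds the request and does not send it. It assumes that the instance's API endpoint serves the path from the steps above under a /v3/ prefix (the CDAP REST convention) and that a bearer access token authorizes the call. The endpoint URL, the token, and the function name are hypothetical placeholders, not values from this page.

```python
import json
import urllib.request


def build_secure_key_request(api_endpoint, access_token, password,
                             namespace="default", key_name="pg_password"):
    """Build (but do not send) a PUT request that stores a secure key.

    Assumption: the UI path namespaces/<ns>/securekeys/<key> is served
    under <api_endpoint>/v3/ when called directly.
    """
    url = (f"{api_endpoint.rstrip('/')}/v3/namespaces/{namespace}"
           f"/securekeys/{key_name}")
    # Same body as in the Body field of the Make HTTP Calls page.
    body = json.dumps({"data": password}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical endpoint and token, for illustration only.
    request = build_secure_key_request(
        "https://datafusion.example.invalid/api",
        "ACCESS_TOKEN",
        "POSTGRESQL_PASSWORD",
    )
    print(request.get_method(), request.full_url)
```

To send the request, you could pass it to urllib.request.urlopen. Avoid hardcoding the real password in a script; read it from a prompt or a secret store instead.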
Connect to Cloud SQL for PostgreSQL
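1. In the Cloud Data Fusion UI, click the menu icon and go to the Wrangler page.
2. Click Add connection.
3. Choose Database as the source type to connect.
4. Under Google Cloud SQL for PostgreSQL, click Upload.
5. Upload a JAR file that contains your PostgreSQL driver. Your JAR file must follow the format NAME-VERSION.jar. If your JAR file doesn't follow this format, rename it before you upload.
6. Click Next.
7. Enter the driver's name, class name, and version in the fields.
8. Click Finish.
9. In the Add connection window that opens, click Google Cloud SQL for PostgreSQL. Your JAR name should appear under Google Cloud SQL for PostgreSQL.
10. Fill in the required connection fields. In the Password field, select the secure key you stored previously. This ensures that your password is retrieved using Cloud KMS.
11. In the Connection string field, enter your connection string in the following format:
    jdbc:postgresql://google/DATABASE_NAME?cloudSqlInstance=INSTANCE_CONNECTION_NAME&socketFactory=com.google.cloud.sql.postgres.SocketFactory&useSSL=false
    Replace the following:
    DATABASE_NAME: the Cloud SQL database name as listed in the Databases tab of the instance details page.
    INSTANCE_CONNECTION_NAME: the Cloud SQL instance connection name as displayed in the Overview tab of the instance details page.
    For example:
    jdbc:postgresql://google/postgres?cloudSqlInstance=dis-demo:us-central1:pgsql-1&socketFactory=com.google.cloud.sql.postgres.SocketFactory&useSSL=false
12. Enable the Cloud SQL Admin API.
13. Click Test connection to verify that a connection to the database can be established.
14. Click Add connection.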
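Two of the steps above depend on exact string formats: the driver JAR must be named NAME-VERSION.jar, and the connection string for Cloud SQL for PostgreSQL takes the form jdbc:postgresql://google/DATABASE_NAME?cloudSqlInstance=INSTANCE_CONNECTION_NAME&socketFactory=com.google.cloud.sql.postgres.SocketFactory&useSSL=false. The following Python sketch checks a file name and assembles a connection string before you paste either into the UI. The function names are hypothetical. The rule that VERSION starts with a digit and the rule that an instance connection name has at least three colon-separated parts are assumptions made for this example, not requirements stated on this page.

```python
import re

# Assumption: VERSION begins with a digit (for example 42.7.3).
# The page itself only requires the shape NAME-VERSION.jar.
_JAR_NAME = re.compile(r"^(?P<name>.+)-(?P<version>\d[\w.]*)\.jar$")

SOCKET_FACTORY = "com.google.cloud.sql.postgres.SocketFactory"


def parse_driver_jar(filename):
    """Return (name, version) if filename looks like NAME-VERSION.jar, else None."""
    match = _JAR_NAME.match(filename)
    if match is None:
        return None
    return match.group("name"), match.group("version")


def build_connection_string(database_name, instance_connection_name):
    """Assemble the JDBC connection string for Cloud SQL for PostgreSQL."""
    # Assumption: the instance connection name contains at least
    # PROJECT:REGION:INSTANCE, separated by colons.
    parts = instance_connection_name.split(":")
    if len(parts) < 3 or not all(parts):
        raise ValueError(
            f"unexpected instance connection name: {instance_connection_name!r}"
        )
    return (
        f"jdbc:postgresql://google/{database_name}"
        f"?cloudSqlInstance={instance_connection_name}"
        f"&socketFactory={SOCKET_FACTORY}"
        "&useSSL=false"
    )


if __name__ == "__main__":
    # Sample values for illustration only.
    print(parse_driver_jar("postgresql-42.7.3.jar"))
    print(build_connection_string("postgres", "dis-demo:us-central1:pgsql-1"))
```

If parse_driver_jar returns None, rename the JAR file before you upload it. The returned name and version are the kind of values that the driver fields ask for after the upload.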
After your PostgreSQL database is connected, you can apply transformations to
your data (in Wrangler), create a pipeline,
and write your output to a sink (in Studio).