Process data with Cloud Data Fusion
Cloud Data Fusion provides a Dataplex Universal Catalog Source plugin to read data from Dataplex Universal Catalog entities (tables) that reside on Cloud Storage or BigQuery assets. The Dataplex Universal Catalog Source plugin lets you treat data in Cloud Storage assets as tables and filter the data with SQL queries.
Before you begin
Create a Cloud Data Fusion instance, if you don't have one. This plugin is available in instances that run in Cloud Data Fusion version 6.6 or later.
The source data must already be part of a Dataplex Universal Catalog zone and an asset (either a Cloud Storage bucket or a BigQuery dataset).
To use tables from Cloud Storage, you must configure a metastore for your lake.
For data to be read from Cloud Storage entities, a Dataproc Metastore must be attached to the lake (see the metastore sketch after this list).
CSV data in Cloud Storage entities isn't supported.
In the Dataplex Universal Catalog project, enable Private Google Access on the subnetwork, which is usually set to default, or set internal_ip_only to false (see the Private Google Access sketch after this list).
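If the lake doesn't have a Dataproc Metastore attached yet, one way to associate an existing service with it is sketched below. This assumes the gcloud dataplex surface and its --metastore-service flag; the lake name (my-lake), region (us-central1), and metastore service (my-metastore) are placeholder values, not names from this guide.

    # Attach an existing Dataproc Metastore service to the Dataplex lake (placeholder names).
    gcloud dataplex lakes update my-lake \
        --location=us-central1 \
        --metastore-service=projects/PROJECT_ID/locations/us-central1/services/my-metastore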
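To enable Private Google Access from the command line rather than the console, a minimal sketch, assuming the subnetwork is the default one in us-central1 (adjust the name and region to your network):

    # Turn on Private Google Access for the subnetwork that the Dataproc cluster uses.
    gcloud compute networks subnets update default \
        --region=us-central1 \
        --enable-private-ip-google-access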
Limitations
For Cloud Storage assets: this plugin doesn't support reading from CSV files. It supports reading from the JSON, Avro, Parquet, and ORC formats.
For Cloud Storage assets: Partition Start Date and Partition End Date aren't applicable.
Required roles
To get the permissions that you need to manage roles, ask your administrator to grant you the following IAM roles on the Dataproc service agent and the Cloud Data Fusion service agent (service-CUSTOMER_PROJECT_NUMBER@gcp-sa-datafusion.iam.gserviceaccount.com):
Dataplex Developer (roles/dataplex.developer)
Dataplex Data Reader (roles/dataplex.dataReader)
Dataproc Metastore Metadata User (roles/metastore.metadataUser)
Cloud Dataplex Service Agent (roles/dataplex.serviceAgent)
Dataplex Metadata Reader (roles/dataplex.metadataReader)
For more information about granting roles, see Manage access to projects, folders, and organizations. You might also be able to get the required permissions through custom roles or other predefined roles.
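One way to grant these roles is gcloud projects add-iam-policy-binding, run once per role. The sketch below grants a single role to the Cloud Data Fusion service agent; repeat it for each role in the list and, likewise, for the Dataproc service agent. PROJECT_ID and CUSTOMER_PROJECT_NUMBER are placeholders.

    # Grant one of the required roles to the Cloud Data Fusion service agent.
    # Repeat for each role listed above, and for the Dataproc service agent as well.
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member="serviceAccount:service-CUSTOMER_PROJECT_NUMBER@gcp-sa-datafusion.iam.gserviceaccount.com" \
        --role="roles/dataplex.dataReader"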
Add the plugin to your pipeline
1. In the Google Cloud console, go to the Cloud Data Fusion Instances page. This page lets you manage your instances.
2. Click View instance to open your instance in the Cloud Data Fusion UI.
3. Go to the Studio page, expand the Source menu, and click Dataplex.
Configure the plugin
After you add this plugin to your pipeline on the Studio page, click the Dataplex Universal Catalog source to configure its properties.
For more information about configurations, see the Dataplex Source reference.
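The source properties typically identify the lake, zone, and entity (table) to read; see the reference linked above for the full list. If you need to look up which zones and assets exist before filling in those properties, the following sketch assumes the gcloud dataplex zones and assets list commands, with my-lake and my-zone as placeholder names:

    # List the zones in a lake, then the assets in a zone, to find the values
    # to enter in the source properties (placeholder names shown).
    gcloud dataplex zones list --lake=my-lake --location=us-central1
    gcloud dataplex assets list --lake=my-lake --zone=my-zone --location=us-central1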
Optional: Get started with a sample pipeline
Sample pipelines are available, including an SAP source to Dataplex Universal Catalog sink pipeline and a Dataplex Universal Catalog source to BigQuery sink pipeline.
To use a sample pipeline, open your instance in the Cloud Data Fusion UI, click Hub > Pipelines, and select one of the Dataplex Universal Catalog pipelines. A dialog opens to help you create the pipeline.
[[["Mudah dipahami","easyToUnderstand","thumb-up"],["Memecahkan masalah saya","solvedMyProblem","thumb-up"],["Lainnya","otherUp","thumb-up"]],[["Sulit dipahami","hardToUnderstand","thumb-down"],["Informasi atau kode contoh salah","incorrectInformationOrSampleCode","thumb-down"],["Informasi/contoh yang saya butuhkan tidak ada","missingTheInformationSamplesINeed","thumb-down"],["Masalah terjemahan","translationIssue","thumb-down"],["Lainnya","otherDown","thumb-down"]],["Terakhir diperbarui pada 2025-08-19 UTC."],[[["\u003cp\u003eCloud Data Fusion's Dataplex Source plugin allows reading data from Dataplex entities (tables) located on Cloud Storage or BigQuery assets, treating data in Cloud Storage as tables with SQL filtering capabilities.\u003c/p\u003e\n"],["\u003cp\u003eUsing this plugin requires a Cloud Data Fusion instance version 6.6 or later, and the source data must reside in a Dataplex zone and asset.\u003c/p\u003e\n"],["\u003cp\u003eTo read from Cloud Storage, a metastore must be configured for the lake and the data must be in JSON, Avro, Parquet, or ORC formats, as CSV is not supported.\u003c/p\u003e\n"],["\u003cp\u003eSpecific IAM roles, including Dataplex Developer, Dataplex Data Reader, Dataproc Metastore Metadata User, Cloud Dataplex Service Agent, and Dataplex Metadata Reader, are required to manage roles and utilize this plugin.\u003c/p\u003e\n"],["\u003cp\u003eSample pipelines, such as SAP source to Dataplex sink and Dataplex source to BigQuery sink, are available in the Cloud Data Fusion UI under the Hub section.\u003c/p\u003e\n"]]],[],null,["# Process data with Cloud Data Fusion\n\n[Cloud Data Fusion](/data-fusion) provides a Dataplex Universal Catalog Source plugin\nto read data from Dataplex Universal Catalog entities (tables) residing on\nCloud Storage or BigQuery assets. The Dataplex Universal Catalog Source\nplugin lets you treat data in Cloud Storage assets as tables and filter\nthe data with SQL queries.\n\nBefore you begin\n----------------\n\n- [Create a Cloud Data Fusion instance](/data-fusion/docs/how-to/create-instance),\n if you don't have one. This plugin is available in instances that run in\n Cloud Data Fusion version 6.6 or later.\n\n- The source data must already be part of a Dataplex Universal Catalog\n [zone](/dataplex/docs/add-zone) and an [asset](/dataplex/docs/manage-assets)\n (either a Cloud Storage bucket or a BigQuery dataset).\n\n- To use tables from Cloud Storage, you must configure a metastore\n for your lake.\n\n- For data to be read from Cloud Storage entities,\n Dataproc Metastore must be attached to the lake.\n\n- CSV data in Cloud Storage entities isn't supported.\n\n- In the Dataplex Universal Catalog project, enable Private Google Access on the\n subnetwork, which is usually set to `default`, or set `internal_ip_only` to\n `false`.\n\n### Limitations\n\n- For Cloud Storage assets: this plugin does not support reading from\n CSV files. 
It supports reading from JSON, Avro, Parquet, and ORC formats.\n\n- For Cloud Storage assets: **Partition Start Date** and **Partition\n End Date** aren't applicable.\n\n### Required roles\n\n\nTo get the permissions that\nyou need to manage roles,\n\nask your administrator to grant you the\nfollowing IAM roles on the Dataproc service agent and the Cloud Data Fusion service agent (service-\u003cvar translate=\"no\"\u003eCUSTOMER_PROJECT_NUMBER\u003c/var\u003e@gcp-sa-datafusion.iam.gserviceaccount.com):\n\n- [Dataplex Developer](/iam/docs/roles-permissions/dataplex#dataplex.developer) (`roles/dataplex.developer`)\n- [Dataplex Data Reader](/iam/docs/roles-permissions/dataplex#dataplex.dataReader) (`roles/dataplex.dataReader`)\n- [Dataproc Metastore Metadata User](/iam/docs/roles-permissions/metastore#metastore.metadataUser) (`roles/metastore.metadataUser`)\n- [Cloud Dataplex Service Agent](/iam/docs/roles-permissions/dataplex#dataplex.serviceAgent) (`roles/dataplex.serviceAgent`)\n- [Dataplex Metadata Reader](/iam/docs/roles-permissions/dataplex#dataplex.metadataReader) (`roles/dataplex.metadataReader`)\n\n\nFor more information about granting roles, see [Manage access to projects, folders, and organizations](/iam/docs/granting-changing-revoking-access).\n\n\nYou might also be able to get\nthe required permissions through [custom\nroles](/iam/docs/creating-custom-roles) or other [predefined\nroles](/iam/docs/roles-overview#predefined).\n\nAdd the plugin to your pipeline\n-------------------------------\n\n1. In the Google Cloud console, go to the Cloud Data Fusion **Instances** page.\n\n [Go to Instances](https://console.cloud.google.com/data-fusion/locations/-/instances)\n\n This page lets you manage your instances.\n2. Click **View instance** to open your instance in the Cloud Data Fusion\n UI.\n\n3. Go to the **Studio** page, expand the **Source** menu, and click **Dataplex**.\n\nConfigure the plugin\n--------------------\n\nAfter you add this plugin to your pipeline on the **Studio** page, click\nthe Dataplex Universal Catalog source to configure its properties.\n\nFor more information about configurations, see the\n[Dataplex Source](https://cdap.atlassian.net/wiki/spaces/DOCS/pages/1766817793/Google+Dataplex+Batch+Source) reference.\n\nOptional: Get started with a sample pipeline\n--------------------------------------------\n\nSample pipelines are available, including an SAP source to\nDataplex Universal Catalog sink pipeline and a Dataplex Universal Catalog source to\nBigQuery sink pipeline.\n\nTo use a sample pipeline, open your instance in the Cloud Data Fusion UI,\nclick **Hub \\\u003e Pipelines**, and select one of the\nDataplex Universal Catalog pipelines. A dialog opens to help you create the\npipeline.\n\nWhat's next\n-----------\n\n- [Ingest data with Cloud Data Fusion](/dataplex/docs/ingest-with-data-fusion) using the Dataplex Universal Catalog Sink plugin."]]