pandas.read_iceberg
- pandas.read_iceberg(table_identifier, catalog_name=None, *, catalog_properties=None, row_filter=None, selected_fields=None, case_sensitive=True, snapshot_id=None, limit=None, scan_properties=None)
- Read an Apache Iceberg table into a pandas DataFrame.
- Added in version 3.0.0.
- Warning: read_iceberg is experimental and may change without warning.
- Parameters:
- table_identifier : str
- Table identifier.
- catalog_name : str, optional
- The name of the catalog.
- catalog_properties : dict of {str: str}, optional
- Properties used in addition to the catalog configuration.
- row_filter : str, optional
- A filter expression, given as a string, that selects the rows to return.
- selected_fields : tuple of str, optional
- The names of the columns to include in the returned DataFrame.
- case_sensitive : bool, default True
- If True, column matching is case sensitive.
- snapshot_id : int, optional
- Snapshot ID to time travel to. By default, the table is scanned as of the current snapshot ID.
- limit : int, optional
- The maximum number of rows to return in the scan result. By default, all matching rows are fetched.
- scan_properties : dict of {str: obj}, optional
- Additional table properties to use for this scan, given as a dictionary of key-value pairs.
 
- Returns:
- DataFrame
- DataFrame based on the Iceberg table. 
 
- See also:
- read_parquet
- Read a Parquet file. 
- Examples:
>>> df = pd.read_iceberg(
...     table_identifier="my_table",
...     catalog_name="my_catalog",
...     catalog_properties={"s3.secret-access-key": "my-secret"},
...     row_filter="trip_distance >= 10.0",
...     selected_fields=("VendorID", "tpep_pickup_datetime"),
... )
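The snapshot_id and limit parameters can be combined to preview a table as of an earlier snapshot without fetching every row. The following is a minimal sketch, not part of pandas: the helper name preview_snapshot and its reader argument are hypothetical and exist only so the call can be exercised without a live catalog. It assumes pandas 3.0 or later with its optional Iceberg dependency installed when the default reader is used.

```python
def preview_snapshot(table_identifier, catalog_name, snapshot_id, n_rows=5, reader=None):
    """Return at most ``n_rows`` rows of a table as of ``snapshot_id``.

    ``reader`` is a hypothetical injection point (an assumption of this
    sketch); when omitted, ``pandas.read_iceberg`` is used.
    """
    if reader is None:
        # Imported lazily so the helper can be defined without pandas installed.
        import pandas as pd

        reader = pd.read_iceberg
    return reader(
        table_identifier,
        catalog_name=catalog_name,
        snapshot_id=snapshot_id,  # scan the table as of this snapshot
        limit=n_rows,  # cap the number of rows returned by the scan
    )
```

With a configured catalog, a call such as preview_snapshot("my_table", "my_catalog", snapshot_id=some_id) would return a DataFrame, where some_id stands for a snapshot ID that actually exists in the table.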