INSPECT_PARQUET

Returns a one-row table of metadata for a staged Parquet file, with the following columns:

| Column | Description |
|---|---|
| created_by | The entity or source responsible for creating the Parquet file |
| num_columns | The number of columns in the Parquet file |
| num_rows | The total number of rows or records in the Parquet file |
| num_row_groups | The count of row groups within the Parquet file |
| serialized_size | The size of the Parquet file on disk (compressed) |
| max_row_groups_size_compressed | The size of the largest row group (compressed) |
| max_row_groups_size_uncompressed | The size of the largest row group (uncompressed) |

SQL Syntax

INSPECT_PARQUET('@<path-to-file>')

SQL Examples

This example retrieves the metadata from a staged sample Parquet file named books.parquet. The file contains two records:

Transaction Processing,Jim Gray,1992
Readings in Database Systems,Michael Stonebraker,2004
-- Show the staged file
LIST @my_internal_stage;

┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│      name     │  size  │        md5       │         last_modified         │      creator     │
├───────────────┼────────┼──────────────────┼───────────────────────────────┼──────────────────┤
│ books.parquet │    998 │ NULL             │ 2023-04-19 19:34:51.303 +0000 │ NULL             │
└──────────────────────────────────────────────────────────────────────────────────────────────┘

-- Retrieve metadata from the staged file
SELECT * FROM INSPECT_PARQUET('@my_internal_stage/books.parquet');

┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│             created_by             │ num_columns │ num_rows │ num_row_groups │ serialized_size │ max_row_groups_size_compressed │ max_row_groups_size_uncompressed │
├────────────────────────────────────┼─────────────┼──────────┼────────────────┼─────────────────┼────────────────────────────────┼──────────────────────────────────┤
│ parquet-cpp version 1.5.1-SNAPSHOT │           3 │        2 │              1 │             998 │                            332 │                              320 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
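
Because INSPECT_PARQUET appears in the FROM clause of the query above, its result can presumably be projected like any other table. The following is a minimal sketch, assuming the same stage and file as above and assuming the function accepts an explicit column list (this page only demonstrates SELECT *):

```sql
-- Assumption: column projection works on the function's result;
-- only SELECT * is shown in the example above.
-- Fetch just the row and row-group counts for the staged file.
SELECT num_rows, num_row_groups
FROM INSPECT_PARQUET('@my_internal_stage/books.parquet');
```

Selecting only the columns you need keeps the output narrow, which can help when checking row-group layout without the wide result shown earlier.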
Last modified March 27, 2024 at 12:01 PM EST: adding databend functions and removing PostGID and MADLib (b049aed)