Google Sheets connector
Zipstack Cloud features a powerful SQL querying engine on top of many types of connectors, including those from Trino, some custom connectors and connectors from the open source Airbyte project. The underlying native connectors are Trino's connectors. Additionally, some parts of the documentation for these connectors have been adapted from the connector documentation found in Trino's open source project.
The Google Sheets connector allows reading Google Sheets spreadsheets as tables in Zipstack Cloud.
Configuration properties
The following configuration properties are available:
| Property name | Description |
|---|---|
credentials-path | Path to the Google API JSON key file |
credentials-key | The base64 encoded credentials key |
metadata-sheet-id | Sheet ID of the spreadsheet, that contains the table mapping |
max-data-cache-size | Maximum number of spreadsheets to cache, defaults to 1000 |
data-cache-ttl | How long to cache spreadsheet data or metadata, defaults to 5m |
read-timeout | Timeout to read data from spreadsheet, defaults to 20s |
Credentials
The connector requires credentials in order to access the Google Sheets API.
Open the Google Sheets API page and click the Enable button. This takes you to the API manager page.
Select a project using the drop down menu at the top of the page. Create a new project, if you do not already have one.
Choose Credentials in the left panel.
Click Manage service accounts, then create a service account for the connector. On the Create key step, create and download a key in JSON format.
The key file needs to be available on the Trino coordinator and workers.
Set the credentials-path configuration property to point to
this file. The exact name of the file does not matter --- it can be
named anything.
Alternatively, set the credentials-key configuration property.
It should contain the contents of the JSON file, encoded using base64.
Metadata sheet
The metadata sheet is used to map table names to sheet IDs. Create a new metadata sheet. The first row must be a header row containing the following columns in this order:
Table Name
Sheet ID
Owner (optional)
Notes (optional)
See this example sheet as a reference.
The metadata sheet must be shared with the service account user, the one for which the key credentials file was created. Click the Share button to share the sheet with the email address of the service account.
Set the metadata-sheet-id configuration property to the ID of
this sheet.
Querying sheets
The service account user must have access to the sheet in order for Trino to query it. Click the Share button to share the sheet with the email address of the service account.
The sheet needs to be mapped to a Trino table name. Specify a table name
(column A) and the sheet ID (column B) in the metadata sheet. To refer
to a specific range in the sheet, add the range after the sheet ID,
separated with #. If a range is not provided, the connector loads only
10,000 rows by default from the first tab in the sheet.
The first row of the provided sheet range is used as the header and will determine the column names of the Trino table. For more details on sheet range syntax see the google sheets docs.
API usage limits
The Google Sheets API has usage
limits, that may
impact the usage of this connector. Increasing the cache duration and/or
size may prevent the limit from being reached. Running queries on the
information_schema.columns table without a schema and table name
filter may lead to hitting the limit, as this requires fetching the
sheet data for every table, unless it is already cached.
Type mapping
Because Trino and Google Sheets each support types that the other does
not, this connector modifies some types <type-mapping-overview> when
reading data.
Google Sheets type to Trino type mapping
| Google Sheets type | Trino type |
|---|---|
TEXT | VARCHAR |
No other types are supported.
SQL support
The connector provides globally available <sql-globally-available> and
read operation <sql-read-operations> statements to access data and
metadata in Google Sheets.
Table functions
The connector provides specific table functions </functions/table> to
access Google Sheets.
sheet(id, range) -> table
The sheet function allows you to query a Google Sheet directly without
specifying it as a named table in the metadata sheet.
For example, for a catalog named example:
SELECT *
FROM
TABLE(example.system.sheet(
id => 'googleSheetIdHere'));
A sheet range or named range can be provided as an optional range
argument. The default sheet range is $1:$10000 if one is not provided:
SELECT *
FROM
TABLE(example.system.sheet(
id => 'googleSheetIdHere',
range => 'TabName!A1:B4'));