ArrowFlight table engine
The ArrowFlight table engine lets ClickHouse read from and write to remote datasets over the Apache Arrow Flight protocol. Data is exchanged with external Flight-enabled servers in the columnar Arrow format, which is designed for high-throughput transfer.
Creating a Table
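The general form of the statement can be sketched as follows. This is a reconstruction from the engine parameters described in this section, not a verbatim grammar; square brackets mark optional parts, and `table_name`, `name1`, and `type1` are placeholders.

```sql
-- Sketch of the general form; bracketed parts are optional.
CREATE TABLE [IF NOT EXISTS] [db.]table_name (name1 [type1], name2 [type2], ...)
    ENGINE = ArrowFlight('host:port', 'dataset_name' [, 'username', 'password']);
```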
Engine Parameters
- `host:port`: Address of the remote Arrow Flight server. If the port is omitted, the default port `8815` is used. String.
- `dataset_name`: Identifier of the dataset on the Flight server (used as a PATH descriptor or in a `SELECT *` query, depending on the `arrow_flight_request_descriptor_type` setting). String.
- `username`: Username for basic HTTP authentication. String.
- `password`: Password for basic HTTP authentication. String.
If username and password are omitted, authentication is not used (this works only if the Arrow Flight server allows unauthenticated access).
The column list is optional — if omitted, the schema is inferred from the remote Arrow Flight server via GetSchema.
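As a minimal sketch of these optional parts, the two statements below omit the column list so that the schema is inferred. The host name, dataset name, table names, and credentials are hypothetical placeholders, and the second statement only works if the server accepts unauthenticated clients.

```sql
-- Hypothetical address, dataset, and credentials.
-- No column list: the schema is inferred from the server via GetSchema.
CREATE TABLE flight_inferred
    ENGINE = ArrowFlight('flight.example.com:8815', 'sample_dataset', 'flight_user', 'flight_password');

-- Port omitted (default 8815 applies) and no credentials: unauthenticated access.
CREATE TABLE flight_anonymous
    ENGINE = ArrowFlight('flight.example.com', 'sample_dataset');
```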
Named Collections
The engine supports named collections for storing connection parameters:
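A minimal sketch, assuming the collection name is passed as the engine's only argument, as with other ClickHouse engines that accept named collections. The collection name, table name, address, and credentials are hypothetical.

```sql
-- Hypothetical connection details stored once under a collection name.
CREATE NAMED COLLECTION arrow_flight_conn AS
    host = '127.0.0.1',
    port = 9005,
    dataset = 'sample_dataset',
    user = 'flight_user',
    password = 'flight_password';

-- Assumption: the collection name replaces the positional engine arguments.
CREATE TABLE remote_flight_data
    ENGINE = ArrowFlight(arrow_flight_conn);
```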
Named collection parameters:
| Parameter | Required | Default | Description |
|---|---|---|---|
| `host` or `hostname` | No | "" | Server hostname. |
| `port` | Yes | - | Server port. |
| `dataset` | Yes | - | Dataset name or descriptor. |
| `use_basic_authentication` | No | true | Enable basic authentication. |
| `user` or `username` | If auth enabled | - | Username for authentication. |
| `password` | No | "" | Password for authentication. |
| `enable_ssl` | No | false | Enable TLS encryption. |
| `ssl_ca` | No | "" | Path to the CA certificate file for TLS verification. |
| `ssl_override_hostname` | No | "" | Override the hostname checked during TLS verification. |
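The TLS-related parameters could be combined in a collection along these lines. This is a sketch under assumptions: the certificate path, host names, and credentials are hypothetical, and writing the boolean flag as `1` is an assumption about how the value is parsed.

```sql
-- Hypothetical TLS-enabled connection; every value is a placeholder.
CREATE NAMED COLLECTION arrow_flight_tls AS
    host = 'flight.example.com',
    port = 8815,
    dataset = 'sample_dataset',
    user = 'flight_user',
    password = 'flight_password',
    enable_ssl = 1,                            -- assumption: 1 is accepted as true
    ssl_ca = '/etc/ssl/certs/flight_ca.pem',   -- hypothetical CA certificate path
    ssl_override_hostname = 'flight.internal'; -- only if the certificate names a different host
```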
Settings
- `arrow_flight_request_descriptor_type`: Controls how the dataset name is sent to the Flight server. Possible values: `path` (default, sends the name as a PATH descriptor) or `command` (sends a CMD descriptor with `SELECT * FROM <dataset>`). Use `command` for Flight servers that expect SQL commands (e.g., Dremio).
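A sketch of switching to the CMD descriptor for a SQL-speaking Flight server. The address, dataset, and table name are hypothetical. The text does not say whether the setting is read when the table is created or when it is queried, so this example sets it for the session before both steps.

```sql
-- Send the dataset name as a CMD descriptor (SELECT * FROM <dataset>).
SET arrow_flight_request_descriptor_type = 'command';

-- Hypothetical server and dataset; the schema is inferred from the server.
CREATE TABLE flight_sql_data
    ENGINE = ArrowFlight('flight.example.com:8815', 'sample_dataset');

SELECT * FROM flight_sql_data LIMIT 10;
```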
Usage Example
Reading data from a remote Arrow Flight server:
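The following is an illustrative sketch, not output from a real deployment. The address `127.0.0.1:9005`, the dataset `sample_dataset`, and the column names are hypothetical, and the declared columns must match the schema the server actually returns.

```sql
-- Hypothetical table backed by a dataset on a local Flight server.
CREATE TABLE remote_flight_data
(
    id UInt32,
    name String,
    value Float64
) ENGINE = ArrowFlight('127.0.0.1:9005', 'sample_dataset');

-- Rows are fetched from the Flight server at query time.
SELECT * FROM remote_flight_data ORDER BY id;
```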
Inserting data into a remote Arrow Flight server:
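A minimal sketch, assuming a table named `remote_flight_data` with columns `id UInt32`, `name String`, and `value Float64` has already been created with the ArrowFlight engine. The table name and values are hypothetical, and the remote server must accept writes for this dataset.

```sql
-- Assumes remote_flight_data (id UInt32, name String, value Float64)
-- already exists with ENGINE = ArrowFlight(...).
INSERT INTO remote_flight_data (id, name, value) VALUES
    (1, 'alpha', 1.5),
    (2, 'beta', 2.5);
```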
Notes
- If columns are specified in the `CREATE TABLE` statement, they must match the schema returned by the Flight server.
- If columns are omitted, the schema is inferred automatically from the remote server.
- Both reading (`SELECT`) and writing (`INSERT`) are supported.
- The `arrow_flight_request_descriptor_type` setting controls whether the dataset name is sent as a PATH descriptor or as a CMD descriptor wrapping a `SELECT *` query.