Databricks Setup Guide

Last updated: November 6, 2024

To connect Databricks so that Aleph can query its data directly, Aleph needs a Databricks user or service principal with read access to the relevant tables.

We suggest creating a separate database user or service principal for Aleph, with access only to the tables you need to consume in Aleph. If you already have a database user or service principal for the finance team, we recommend assigning it and the Aleph user or service principal the same role, so that any table made available to finance is also available in Aleph.

Using a Personal Access Token

To connect Databricks to Aleph using a personal access token, you need to provide us with the following information:

  • Host (e.g., dbc-a1b2345c-d6e7.cloud.databricks.com)

  • Path (e.g., sql/protocolv1/o/1234567890123456/1234-567890-abcdefgh)

  • Token (e.g., dapi12345678901234567890123456789012)

  • Database

  • Schema
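Before sharing these values, you may want to confirm that they work together. The following is a minimal sketch in Python, assuming the open-source databricks-sql-connector package is installed; the helper names build_pat_connection_args and check_connection are hypothetical and not part of Aleph or Databricks. It only normalizes the Host, Path, and Token values and runs a trivial query, and it does not verify access to a specific database or schema.

```python
from urllib.parse import urlparse


def build_pat_connection_args(host: str, path: str, token: str) -> dict:
    """Return keyword arguments for databricks.sql.connect() (hypothetical helper)."""
    host = host.strip()
    # Accept either a bare hostname or a full URL and keep only the hostname.
    if "://" in host:
        host = urlparse(host).hostname or ""
    host = host.rstrip("/")
    if not host:
        raise ValueError("Host is empty")
    if not path.strip():
        raise ValueError("Path is empty")
    if not token.strip():
        raise ValueError("Token is empty")
    return {
        "server_hostname": host,
        "http_path": path.strip(),
        "access_token": token.strip(),
    }


def check_connection(args: dict) -> bool:
    """Open a connection and run a trivial query to confirm the credentials work."""
    # Imported lazily so the helper above works without the connector installed.
    # Assumption: `pip install databricks-sql-connector` has been run.
    from databricks import sql

    with sql.connect(**args) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            return cursor.fetchone()[0] == 1
```

Calling check_connection(build_pat_connection_args(host, path, token)) with your own values should return True if the token can reach the SQL endpoint; an exception usually points to a wrong host, path, or token.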

Using OAuth M2M

We also support service principal authentication using OAuth machine-to-machine (M2M). For this, we need the following information:

  • Host (e.g., dbc-a1b2345c-d6e7.cloud.databricks.com)

  • Path (e.g., sql/protocolv1/o/1234567890123456/1234-567890-abcdefgh)

  • Client ID

  • Client Secret
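To check that a Client ID and Client Secret pair is valid before sharing it, you can request an access token yourself. The sketch below uses only the Python standard library and assumes the workspace-level token endpoint /oidc/v1/token with the all-apis scope, as described in Databricks' OAuth M2M documentation; that endpoint is not stated in this guide, and build_token_request is a hypothetical helper name.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(host: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth client-credentials token request."""
    # Assumption: workspace-level token endpoint, per Databricks OAuth M2M docs.
    url = f"https://{host}/oidc/v1/token"
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # The client ID and secret are sent as HTTP Basic credentials.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Sending the request with urllib.request.urlopen(request) and parsing the JSON response should yield an access_token field when the credentials are valid. An HTTP 401 response typically indicates a wrong Client ID or Client Secret.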

If you have an IP whitelist, you must add our server IP, 35.202.250.148, to it.
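If your whitelist is maintained as a list of addresses or CIDR ranges, a quick check like the following can confirm that the Aleph server IP is covered. This is a sketch using Python's standard ipaddress module; the is_allowed helper and the sample entries are hypothetical, and the format of your actual whitelist may differ.

```python
import ipaddress

ALEPH_SERVER_IP = "35.202.250.148"  # server IP given in this guide


def is_allowed(allowlist: list[str], ip: str = ALEPH_SERVER_IP) -> bool:
    """Return True if `ip` matches any single address or CIDR range in `allowlist`."""
    address = ipaddress.ip_address(ip)
    # strict=False lets entries such as "35.202.250.148/24" be read as a network.
    return any(
        address in ipaddress.ip_network(entry, strict=False)
        for entry in allowlist
    )
```

For example, is_allowed(["10.0.0.0/8"]) returns False, which means the Aleph IP still needs to be added.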

To share this information securely, you can use Doppler Share or your preferred alternative. For security reasons, we recommend choosing an appropriate expiration time for the share link.

Important: credential information is sensitive, so share it only with authorized individuals.