Backend Setup
In order to have the data coming from the front end APIs go anywhere, you'll need to set up your back end. That'll allow you to:
- Watch the data flow through to the data warehouse
- Interact with the features using the Causal Web Tools.
This is a limited beta, and we cannot guarantee that we will be able to provide logins for everyone. Accounts are provided on a first-come, first-served basis.
Please follow the instructions for your specific cloud provider below.
Setting up on AWS
For the beta, you need to create an AWS bucket and Glue database to which Causal will write data. We provide a Python PEX image that can do all this automatically.
Download the package like this:
wget https://tech.causallabs.io/causal-aws-setup.pex
In order to run the script, you'll need to provide AWS credentials. The recommended way to do that is to install the AWS command line interface:
- Download and install the CLI using the instructions found here
- Perform a quick setup
If you don't want to install the CLI, you can also provide the credentials to the script using the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
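If you go the environment-variable route, a quick check that both variables are set can save a failed run. The following is a minimal sketch, not part of the Causal tooling; the helper name missing_aws_credentials is hypothetical, and only the two variable names come from the text above.

```python
import os

# The two variables named above; everything else here is illustrative.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_credentials(env=None):
    """Return the required credential variables that are unset or empty.

    `env` defaults to the current process environment; passing a dict
    makes the function easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_aws_credentials() with no arguments inspects the current shell environment; an empty list means both variables are present.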
Run the command line below, replacing the variables as follows:
- region: The region where you want your data to reside. This should be the region where you do most of your data processing.
- environment_name: A name to identify which environment the data comes from. For example, if you have a staging environment and a production environment, you can run the command twice: once with the name "staging" and once with the name "production".
- bucket_name: What you'd like to call the S3 bucket. Since all S3 buckets share the same namespace, this has to be a unique name that no one else on S3 has used before.
- db_name: The name of the Glue database where the warehouse metadata will be saved. It must be a database devoted only to Causal data, because non-Causal data will be dropped. If you don't specify this, it defaults to "Causal-environment_name".
$ python3 causal-aws-setup.pex --region <region> --name <environment_name> --bucket <bucket_name> --glue-database <db_name>
## result should look like this:
## {'name': 'getting-started', 'bucket': 'getting-started-xxxx', 'region': 'us-east-1', 'awsExtId': 'xxxx', 'role': 'yyy', 'gluedb': 'zzzz'}
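If you set up several environments, it may help to assemble the command and check its output programmatically. The sketch below is an illustration under two assumptions: that the script prints its result as a single Python-style dict literal, as the sample output suggests, and that the six keys in that sample are always present. The helper names build_setup_command and parse_setup_result are hypothetical; the flags are the ones shown in the command above.

```python
import ast

# Keys taken from the sample output above; assumed to always be present.
EXPECTED_KEYS = {"name", "bucket", "region", "awsExtId", "role", "gluedb"}


def build_setup_command(region, environment_name, bucket_name, db_name=None):
    """Build the argument list for the setup script.

    --glue-database is omitted when db_name is None, so the script
    falls back to its documented default.
    """
    cmd = [
        "python3", "causal-aws-setup.pex",
        "--region", region,
        "--name", environment_name,
        "--bucket", bucket_name,
    ]
    if db_name is not None:
        cmd += ["--glue-database", db_name]
    return cmd


def parse_setup_result(line):
    """Parse one line of script output into a dict and check its keys."""
    result = ast.literal_eval(line.strip())
    if not isinstance(result, dict):
        raise ValueError("expected a dict literal")
    missing = EXPECTED_KEYS - result.keys()
    if missing:
        raise ValueError("missing keys: " + ", ".join(sorted(missing)))
    return result
```

The list returned by build_setup_command can be passed to subprocess.run, and the parsed dict is the information to forward for each environment.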
Send that information for each environment to support@causallabs.io or your dedicated Slack channel, and we will set up your account and send you your credentials.
For each environment, the credentials include:
- An environment id, which uniquely identifies the getting started development environment, and
- A security token, which allows you to perform operations in this environment.
A security token can access a subset of your environments. Typically you'd have one for reading and writing to staging/development environments, and another for reading and writing to production. Contact support when you want to set up more than one environment and we will arrange the correct permissions.
Setting up on Azure
Causal will write your warehouse data to a container in Azure Blob storage. In order to do so, we need the URL of the storage container and a shared access token through which Causal's ETL processes will update the data.
For this reason, we highly recommend that you create a container specifically to hold Causal data.
In order to create a container for Causal:
- Go to your Azure storage account (or make a new one).
- Click on the "+ Container" button
- Give it a name and set the access level to "Private (no anonymous access)"
- Click the create button.
- Select the container in the resulting list view.
- Click on "Settings > Shared access tokens"
- Click "Permissions" and select Read, Add, Create, Write, Delete, and List
- Select an appropriate expiry time (you'll have to set up a new token when this one expires)
- Make sure "HTTPS only" is selected for Allowed Protocols
- Click "Generate SAS token and URL"
- Send the "Blob SAS URL" to your Causal support person.
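Before sending the Blob SAS URL, you may want to confirm that it matches the choices made in the steps above. The following is a rough sketch that inspects the standard SAS query parameters: sp (permissions, where r, a, c, w, d, l stand for Read, Add, Create, Write, Delete, List), spr (allowed protocol), and se (expiry time). The function name check_sas_url is hypothetical, and the check is advisory only; it does not contact Azure or validate the signature.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# Read, Add, Create, Write, Delete, List
REQUIRED_PERMISSIONS = set("racwdl")


def check_sas_url(url, now=None):
    """Return a list of problems found in a Blob SAS URL (empty if none)."""
    now = now or datetime.now(timezone.utc)
    problems = []
    parts = urlsplit(url)
    params = parse_qs(parts.query)

    if parts.scheme != "https":
        problems.append("URL does not use https")

    granted = set(params.get("sp", [""])[0])
    missing = REQUIRED_PERMISSIONS - granted
    if missing:
        problems.append("missing permissions: " + "".join(sorted(missing)))

    if params.get("spr", [""])[0] != "https":
        problems.append("token is not restricted to HTTPS")

    expiry_text = params.get("se", [""])[0]
    try:
        # fromisoformat on older Pythons rejects a trailing "Z", so swap it.
        expiry = datetime.fromisoformat(expiry_text.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry <= now:
            problems.append("token has already expired")
    except ValueError:
        problems.append("could not read expiry time")

    return problems
```

An empty list means the URL looks consistent with the settings above; anything else is a hint to regenerate the token before sending it.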
After that, we will set up your environment so when you launch an impression server with the appropriate environment id and security token, it will stream your logged data to this container.