Skip to main content

Metrics Tables

When you define metrics and experiments in Causal, we store the metric data in your data warehouse so you can do exploratory analysis. You can use these tables to figure out how different subsets of your traffic are performing differently on your site.

Because Causal allows you to monitor a wide variety of metrics, the warehouse tables that store the data have slightly different formats for each type. This section describes the tables for each type of metric.

note

You do not need to use these tables to evaluate experiments in Causal. The tools UI does that analysis automatically (by reading these tables). These are here so that you can do further explorations, including integrating the other data in your data warehouse.

Per Metrics

For each session field that has a @per directive Causal creates a table of metric values that can be grouped by that field.

The table is named by concatenating the field name with _metrics. So for metrics based on the field userId, the metric table will be called user_id_metrics. There is one row per session, and the format is as follows:

  • ds (partition key) Matches the partition for the rows sessionid.
  • hh (partition key) Matches the partition for the rows sessionid.
  • metric_id: (partition key) The metric id, as specified in the Causal Tools UI.
  • @persistent_key field name: The persistant key value for the session.
  • session_id: Causal's internal session identifier. See Session Keys.
  • @per field name: The value for the @per field.
  • value: The sum of all the matching events that occurred in this session.

Session Metrics

The session_metrics table contains metric values that are grouped by session.

  • ds (partition key) Matches the partition for the rows sessionid.
  • hh (partition key) Matches the partition for the rows sessionid.
  • metric_id: (partition key) The metric id, as specified in the Causal Tools UI.
  • @persistent_key field name: The device on which the matching events occured.
  • session_id: The session where these events occured, or null.
  • count: The number of sessions this row represents.
  • value: The sum of all the matching events that occurred in this session.

The schema is basically the same as the Per Metrics above with one important difference. There is a single row that has a null sessionid that represents all the sessions in which the metric was never triggered.

Often metrics can be very sparse. For example, a click through metric may hover around 3%. With the above scheme, we store that metric data using about 97% less space than a simple row per session table.

So, it you have 100 sessions, and three hits in a partition, you'll get:

  • one row where count = 97 and value = 0 and sessionid is null
  • 3 rows where count = 1 and value > 0 and sessionid is not null

Impression Metrics

The feature_metrics table contains metric values that are grouped by feature impression. The table still has 1 row per session in order to reduce the required storage and improve performance, but we add some columns to support the fact that many impressions can occur in each session:

  • ds (partition key) Matches the partition for the rows sessionid.
  • hh (partition key) Matches the partition for the rows sessionid.
  • metric_id: (partition key) The metric id, as specified in the Causal Tools UI.
  • @persistent_key field name: The device on which the matching events occured.
  • session_id: The session where these events occured, or null.
  • count: The number off relevant impressions in this session.
  • sum: The sum of all the matching events that occurred in this session.
  • sumsq: The squared sum of all the matching events that occurred in this session. This can be used by your data scientists to estimate the variance of the statistic.

So, lets say we have a clicks per page metric defined in our system. One session sees 2 pages, getting 3 and 4 clicks respectively, and 100 sessions don't see the relevant page. The feature_metrics table will contain the following in that partition:

  • one row where sessionid is not null, count is 2, sum is 7, and sumsq is 25.
  • one row where sessionid is null, count is 100, sum is 0 and sumsq is 0.