
Overview

The Problem

Data maintenance can be a major time sink in any fast-moving company. As your product evolves, the data that you collect around its features will change often. This is especially true if you have a new product or are doing a lot of experimentation. As you learn how visitors use your product, you'll improve it and will want to collect data around those improvements. Current solutions require a team of data engineers and supporting software to manage this change.

On top of that, maintaining backwards compatibility in your data warehouse is critical to understanding how your product behaves over time, and it dramatically increases the effort required. As the data you collect changes, it can be difficult to be sure that the older partitions in your data warehouse are still compatible. When they are not, warehouse queries can fail with obscure serialization errors, making large chunks of past data inaccessible.

Furthermore, because maintaining your data takes so much effort, the data structures in your warehouse will either lag behind or drift away from what is actually happening in your application. In some cases, since the maintenance is manual, it also introduces data processing errors.

Our Solution

Causal was designed with these difficulties in mind. We solve your data maintenance problem using several elegant techniques unique to the Causal tool set.

As mentioned before, the Causal compiler creates both the front-end data collection API and the back-end data warehouse tables from the same specification. That means the two are always synchronized and accurate.
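To make the single-specification idea concrete, here is a minimal Python sketch. It is not FDL and not Causal's generated code: the `Field` class, the `SPEC` list, the type names, and both functions are hypothetical stand-ins. It only illustrates how one spec can drive both a warehouse table definition and a front-end collection function, so the two cannot disagree.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a single source-of-truth spec (not FDL syntax).
@dataclass(frozen=True)
class Field:
    name: str
    type: str  # one of "string", "int", "float", "bool"


SPEC = [Field("product_id", "string"), Field("rating", "int")]

# Assumed type mappings, chosen for illustration only.
SQL_TYPES = {"string": "VARCHAR", "int": "BIGINT",
             "float": "DOUBLE PRECISION", "bool": "BOOLEAN"}
PY_TYPES = {"string": str, "int": int, "float": float, "bool": bool}


def warehouse_ddl(table: str, spec: list[Field]) -> str:
    """Back end: derive the table definition from the spec."""
    cols = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in spec)
    return f"CREATE TABLE {table} ({cols})"


def collect(spec: list[Field], **values) -> dict:
    """Front end: accept only the fields and types the spec declares."""
    expected = {f.name: PY_TYPES[f.type] for f in spec}
    if set(values) != set(expected):
        raise TypeError(
            f"expected fields {sorted(expected)}, got {sorted(values)}")
    for name, value in values.items():
        # Exact type match, so a bool is not silently accepted as an int.
        if type(value) is not expected[name]:
            raise TypeError(f"{name} must be {expected[name].__name__}")
    return values
```

Because both functions read the same `SPEC`, adding or removing a field there changes the table definition and the accepted front-end payload together.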

When you remove obsolete data from your front-end data collection, it is automatically removed from the data warehouse. [1] When you collect new data on the front end, it automatically appears in the data warehouse. All of this happens without extra work from data engineers. You save engineering effort and you avoid programming errors in the ETL.

While doing this, the compiler also keeps track of the current state of your data warehouse to make sure that you don't accidentally break things. Because FDL is a type-safe language, the compiler can automatically compare your data spec with the tables in your warehouse and determine whether you are about to make a breaking change. If you are, it gives you a compile-time error so that you can fix the problem before it breaks your warehouse.
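The kind of comparison described above can be sketched roughly as follows. This is an illustration in Python, not the Causal compiler's actual logic: `check_change`, `BreakingChangeError`, and the column-to-type dictionaries are hypothetical, and the rule that only a type change on an existing column counts as breaking is an assumption made for the example.

```python
class BreakingChangeError(Exception):
    """Raised when a spec change would be incompatible with existing data."""


def check_change(warehouse: dict[str, str],
                 spec: dict[str, str]) -> dict[str, list[str]]:
    """Compare warehouse columns with the new spec (name -> type).

    Assumption for this sketch: added and removed columns are safe,
    while changing the type of an existing column is breaking.
    """
    added = sorted(spec.keys() - warehouse.keys())
    removed = sorted(warehouse.keys() - spec.keys())
    retyped = sorted(name for name in spec.keys() & warehouse.keys()
                     if spec[name] != warehouse[name])
    if retyped:
        details = ", ".join(
            f"{n}: {warehouse[n]} -> {spec[n]}" for n in retyped)
        raise BreakingChangeError(f"incompatible type change ({details})")
    return {"added": added, "removed": removed}


current = {"product_id": "string", "rating": "int"}

# Adding one column and dropping another passes the check.
print(check_change(current, {"product_id": "string", "review": "string"}))

# Changing the type of an existing column is rejected before deployment.
try:
    check_change(current, {"product_id": "string", "rating": "string"})
except BreakingChangeError as err:
    print("rejected:", err)
```

Running the check before anything is deployed is what turns a potential warehouse failure into an error the developer sees while editing the spec.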

The following sections walk through how you manage changes to your data collection with Causal.


  1. While still being available in older views.