The Kimball API is a self-hosted analytics service that takes events from your application and generates predictions about user behaviors.
Talk to us about how we can help with your analytics/prediction problems!
See our documentation about how to install and use the Kimball API. The remainder of this file contains information related to developing the Kimball API itself.
Use nix!
The shell.nix file contains the required development tools. Use nix-shell to get started.
make deps app shell
make live-js
export TAG=kimball
docker build -t $TAG .
docker run -v ${PWD}/config/example.config:/kimball/app.config -it -e LOG_LEVEL=debug $TAG
An Erlang configuration file will be loaded from /kimball/app.config as part of the Release/Dockerfile. A user configuration file can be placed there to provide more complex configuration described below.
Information can be included about the deployment to help in future diagnosis of problems. Right now this is the "site" and "cluster" information.
[{features, [
{site, "Site name"},
{cluster, "Cluster name"}
]}].
The initial bloom filter type, size, and error probability can be configured for counters. This is configured as a list of entries, each with a regular expression matching the name of the filter.
- pattern - The regular expression
- type - bloom_fixed_size or bloom_scalable
- date_cohort - weekly or omitted. Whether to automatically generate counters for each week to track events over time.
- size - The fixed size or initial size (for bloom_scalable filters)
- error_probability - Bloom filter error probability
Example
[{features, [
{counters, #{
init => [
#{pattern => ".*",
type => bloom_fixed_size,
date_cohort => weekly,
size => 10000,
error_probability => 0.01}
]
}}
]}].
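To get a feel for what size and error_probability imply, the textbook Bloom filter formulas relate an expected item count n and a target false-positive rate p to a bit count m = -n ln(p) / (ln 2)^2 and a hash count k = (m / n) ln 2. The Python sketch below applies those formulas to the example values. It assumes size means the expected number of items, and it is not taken from Kimball's implementation, which may size its filters differently.

```python
import math
from typing import Tuple


def bloom_parameters(expected_items: int, error_probability: float) -> Tuple[int, int]:
    """Return (bits, hash_count) from the textbook optimal Bloom filter formulas."""
    if expected_items <= 0:
        raise ValueError("expected_items must be positive")
    if not 0.0 < error_probability < 1.0:
        raise ValueError("error_probability must be between 0 and 1")
    # m = -n * ln(p) / (ln 2)^2, rounded up to a whole number of bits
    bits = math.ceil(-expected_items * math.log(error_probability) / (math.log(2) ** 2))
    # k = (m / n) * ln 2, rounded to the nearest whole hash function
    hash_count = max(1, round(bits / expected_items * math.log(2)))
    return bits, hash_count


# Values from the example config: size => 10000, error_probability => 0.01
bits, hashes = bloom_parameters(10000, 0.01)
print(f"{bits} bits (~{bits / 8 / 1024:.1f} KiB), {hashes} hash functions")
```

Under these assumptions the example works out to a little under 10 bits per item and 7 hash functions, so tightening error_probability costs memory roughly in proportion to the logarithm of the target rate.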
The application can forward event streams via gRPC to external services. This is currently experimental and not well tested, and it may carry a significant performance cost.
Proto file available at src/proto/features_proto.proto.
Configure it like this:
[{features, [
{external_grpc_event_targets, [{"127.0.0.1", 8079}]}
]}].
Requests to the prediction API can also query an external service for predictions and include them in the Kimball API response.
Proto file available at src/proto/features_proto.proto.
Configure it like this:
[{features, [
{external_grpc_prediction_targets, [{"service name", "127.0.0.1", 8079}]}
]}].
- ADDITIONAL_NAMESPACES - A comma-separated list of namespaces to sync feature config to. This should include any namespaces where you intend to run sidecars.
- ANALYTICS_HOST - Where to forward analytic events if this process isn't storing them directly. Sidecar mode uses this to know which api-mode process to forward to.
- API_PORT - (default 8080) Port where the HTTP API will be available.
- AWS_ACCESS_KEY_ID - Credentials for interacting with AWS
- AWS_SECRET_ACCESS_KEY - Credentials for interacting with AWS
- FEATURES_MODE - Which mode to start the application in
  - api (default) - Fully featured API server, storing state in configmaps
  - sidecar - Read-only API meant to be deployed as a sidecar. Reads features from the /features/data volume in Kubernetes.
- KUBERNETES_MEMORY_LIMIT - The container limit in bytes, used for computing the metric memory_remaining_bytes
- NAMESPACE - Namespace to use for reading/writing in Kubernetes
- S3_BUCKET - AWS S3 Bucket to use for storage
- S3_HOST - AWS S3 Host to use for storage. This will attempt to auto configure when running in AWS.
- GCS_BUCKET - Google Cloud Storage Bucket to use for storage
- GOOGLE_APPLICATION_CREDENTIALS - Path to a JSON Service Account Key
- STORAGE_PATH_PREFIX - Path prefix to use when storing files in S3/GCS. Defaults to the installation namespace.
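As a rough illustration of how these variables might be passed to a container, the Python sketch below assembles a docker run command line with one -e flag per variable, in the same style as the LOG_LEVEL example earlier in this file. The helper docker_run_command is hypothetical, the host name is a placeholder, and the exact value format ANALYTICS_HOST expects is an assumption.

```python
import shlex
from typing import Dict


def docker_run_command(tag: str, env: Dict[str, str]) -> str:
    """Build a `docker run` command line that passes each variable with -e."""
    args = ["docker", "run"]
    for name in sorted(env):  # sorted so the output is stable
        args += ["-e", f"{name}={env[name]}"]
    args.append(tag)
    return shlex.join(args)  # shell-quotes any value that needs it


# Hypothetical values: the host name is a placeholder, not a real endpoint.
sidecar_env = {
    "FEATURES_MODE": "sidecar",
    "ANALYTICS_HOST": "kimball-api.example.internal",
}
command = docker_run_command("kimball", sidecar_env)
print(command)
```

Variables left out of the dictionary fall back to the defaults listed above, such as API_PORT 8080 and FEATURES_MODE api.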
More advanced, less commonly used configuration is available through additional app.config options:
counter_startup_delay - int - Milliseconds between starting each counter when the application begins/resets. This rate limits counter startup to prevent a thundering herd as the application starts.
[{features, [
{counter_startup_delay, 1}
]}].
System metrics are available at /metrics
Some important ones:
- kimball_counters - The number of counters registered with the router. Equivalent to the number of events tracked.
- kimball_persist_counters_managed - The number of counters the persistence manager triggered in the last run. Should track, but lag, kimball_counters.
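One way to watch that lag is to scrape /metrics and compare the two values. The Python sketch below assumes the endpoint returns the Prometheus text exposition format and that both metrics are exposed without labels; neither is confirmed here. The sample text and its numbers are invented for illustration.

```python
import re
from typing import Dict, Tuple

# Matches `name value` or `name{labels} value`, the usual shape of a text sample.
# Simplification: label values containing "}" are not handled.
_SAMPLE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(\{[^}]*\})?\s+(\S+)")


def parse_samples(text: str) -> Dict[Tuple[str, str], float]:
    """Map (metric name, raw label string) to value, skipping comments and blanks."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, labels, value = match.groups()
        try:
            samples[(name, labels or "")] = float(value)
        except ValueError:
            continue  # ignore samples whose value is not a number
    return samples


def persistence_lag(samples: Dict[Tuple[str, str], float]) -> float:
    """Counters registered minus counters handled in the last persistence run."""
    return samples[("kimball_counters", "")] - samples[("kimball_persist_counters_managed", "")]


# Hypothetical scrape output; the values are made up.
EXAMPLE = """\
# TYPE kimball_counters gauge
kimball_counters 42
kimball_persist_counters_managed 40
"""
lag = persistence_lag(parse_samples(EXAMPLE))
print(f"persistence lag: {lag:g} counters")
```

A lag that keeps growing rather than settling would suggest the persistence manager is falling behind.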
Metrics for each counter are available at /metrics/counters
- kimball_counter - Event counters
- kimball_counter_weekly - Per-week counters if date_cohort => weekly is set.
Metrics for goal/event predictions are available at /metrics/predictions
- kimball_bayes_prediction - Prediction that users who complete the event named by the event label will complete the goal named by the goal label
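To rank events by how strongly they predict a goal, the event and goal labels can be read off each kimball_bayes_prediction sample. The Python sketch below assumes /metrics/predictions returns the Prometheus text exposition format with those two labels on every sample. The event names, goal names, and values are hypothetical.

```python
import re
from typing import List, NamedTuple


class Prediction(NamedTuple):
    event: str
    goal: str
    value: float


_LINE = re.compile(r"^kimball_bayes_prediction\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)")
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_predictions(text: str) -> List[Prediction]:
    """Return predictions sorted from highest to lowest value."""
    predictions = []
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match is None:
            continue  # not a kimball_bayes_prediction sample
        labels = dict(_LABEL.findall(match.group("labels")))
        if "event" not in labels or "goal" not in labels:
            continue
        predictions.append(
            Prediction(labels["event"], labels["goal"], float(match.group("value")))
        )
    return sorted(predictions, key=lambda p: p.value, reverse=True)


# Hypothetical scrape output; names and numbers are invented.
EXAMPLE = '''\
kimball_bayes_prediction{event="opened_email",goal="purchased"} 0.08
kimball_bayes_prediction{event="viewed_pricing",goal="purchased"} 0.31
'''
ranked = parse_predictions(EXAMPLE)
best = ranked[0]
print(f"{best.event} -> {best.goal}: {best.value}")
```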
git checkout trunk
git pull
TAG=$(date +"%Y.%m.%d")
git tag ${TAG}
git push origin ${TAG}
- Find outdated dependencies with npm outdated and update them in package.json
- Run npm update
... I think
Apache 2.0. Copyright Get Kimball Inc. 2020