Tip
If you are looking for how to build challenges for CTFp, please check out the CTF Pilot's Challenges Template and CTF Pilot's Challenge Toolkit repositories.
CTFp (CTF Pilot's CTF Platform) is a CTF platform designed to host large-scale Capture The Flag (CTF) competitions, with a focus on scalability, resilience, and ease of use.
The platform uses Kubernetes as the underlying orchestration system, where the management, scoreboard, and challenge infrastructure are deployed as Kubernetes resources. It then leverages GitOps through ArgoCD for managing the platform's configuration and deployments, including the CTF challenges.
CTFp acts as the orchestration layer for deploying and managing the platform, while utilizing a variety of CTF Pilot's components to provide the full functionality of the platform.
CTFp provides a CLI tool for managing the deployment of the platform, but it is possible to use the individual Terraform components directly if desired. To manage the platform after initial deployment, you will primarily interact with the Kubernetes cluster using kubectl, ArgoCD, and the other monitoring systems deployed.
Important
In order to run CTFp properly, you will need to have a working knowledge of Cloud, Kubernetes, Terraform/OpenTofu, GitOps, and CTFd.
The platform is designed to work with CTF Pilot's Challenges ecosystem, to ensure secure hosting of CTF challenges.
This platform is not intended for beginners, and it is assumed that you have prior experience with these technologies and systems.
Incorrect handling of Kubernetes resources can lead to data loss, downtime, and security vulnerabilities.
Incorrectly configured challenges may lead to security vulnerabilities or platform instability.
This platform deploys real-world infrastructure and will incur costs when deployed.
- CTFp - CTF Pilot's CTF Platform
- Table of Contents
- Features
- Quick start
- How to run
- Pre-requisites
- Environments
- Configuring the platform
- CLI Tool
- Commands
  - init - Initialize Platform Configuration
  - generate-keys - Generate SSH Keys
  - insert-keys - Insert SSH Keys into Configuration
  - generate-images - Generate Custom Server Images
  - generate-backend - Generate Terraform Backend Configuration
  - deploy - Deploy Platform Components
  - destroy - Destroy Platform Components
- Workflow Overview
- Guides
- Architecture
- Getting help
- Contributing
- Background
- License
- Code of Conduct
CTFp offers a wide range of features to facilitate the deployment and management of CTF competitions. Below is an overview of the key features:
- Infrastructure & Deployment
- Multi-environment support with isolated configurations for Test, Dev, and Production
- Component-based architecture with four deployable components: Cluster, Ops, Platform, and Challenges
- Infrastructure as Code using Terraform/OpenTofu with automated state management and S3 backend
- Multi-region Kubernetes clusters on Hetzner Cloud with configurable node types and auto-scaling
- Custom server images generation using Packer
- Cloudflare DNS integration for management, platform, and CTF zones
- Operations & Monitoring
- GitOps workflow powered by ArgoCD for automated deployments
- Comprehensive monitoring with Prometheus, Grafana, and metrics exporters
- Log aggregation via Filebeat to Elasticsearch
- Traefik ingress controller with SSL certificate management (cert-manager)
- Discord webhook notifications for platform events
- Automated descheduling for optimal resource distribution
- Scoreboard
- Customizable CTFd scoreboard deployment allowing for bring-your-own CTFd configuration
- Auto deployment of CTFd configuration providing a ready-to-use CTFd instance
- Flexible CTF settings supporting a large portion of CTFd's configuration options
- S3 storage configuration for challenge files and user uploads in CTFd
- Clustered database setup with MariaDB operator and automated backups to S3
- Redis caching with Redis operator for ease of use
- Automatic deployment of CTFd pages from GitHub
- Challenge Management
- Full support for CTF Pilot's Challenges ecosystem, including KubeCTF integration
- Support for three challenge deployment modes: Isolated, Shared, and Instanced
- Git-based deployment with branch-specific configurations
- IP whitelisting for challenge access control
- Custom fallback pages for errors and instancing states
- CLI Tool
- Simple command-line interface for managing the deployment and lifecycle of the platform
- Modular commands for initializing, deploying, destroying, and managing components
- Environment management for handling multiple deployment environments (Test, Dev, Prod)
- State management with automated backend configuration, with states stored in S3
- Plan generation and review before applying changes
- Under 20 minutes deployment time for the entire platform (excluding image generation)
- Fully configured through configuration files for easy setup and management
Tip
This is a quick start guide for getting the platform up and running, and acts as a quick reference guide.
If it is your first time working with CTFp, we recommend going through the full documentation for a more in-depth understanding of the platform and its components.
To use the CTFp CLI tool, you first need to clone the repository:
git clone https://github.com/ctfpilot/ctfp
cd ctfp

First, you need to initialize the platform configuration for your desired environment (test, dev, prod):
./ctfp.py init

Note
You can add --test, --dev or --prod to specify the environment you want to initialize.
The default environment is test (--test).
The environment flags are used in all commands except generate-images, which instead asks which Hetzner Cloud project to use when generating images.
Next, you need to fill out the configuration located in the automated.<env>.tfvars file.
In order to deploy, ensure you have SSH keys created, and inserted into your configuration:
./ctfp.py generate-keys --insert

To create the server images used for the Kubernetes cluster nodes, run:
./ctfp.py generate-images

To use the Terraform modules, you need to generate the backend configuration for each component:
./ctfp.py generate-backend cluster <bucket> <region> <endpoint>
./ctfp.py generate-backend ops <bucket> <region> <endpoint>
./ctfp.py generate-backend platform <bucket> <region> <endpoint>
./ctfp.py generate-backend challenges <bucket> <region> <endpoint>

Replace <bucket>, <region>, and <endpoint> with your S3 bucket details.
Finally, you can deploy the entire platform with:
./ctfp.py deploy all

To destroy the entire platform, run:
./ctfp.py destroy all

all can be replaced with any of the individual components: cluster, ops, platform, challenges.
To interact with the cluster, run the following command to configure your kubectl context:
source kubectl.sh [test|dev|prod]

source is required to set the environment variables in your current shell session.
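To verify that the context was picked up, a quick check might look like this (assuming the test environment):

```bash
# Load the kubectl context for the test environment and confirm node access
source kubectl.sh test
kubectl get nodes
```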
To deploy the platform, the following software needs to be installed on your local machine:
- OpenTofu (Alternative version of Terraform)
- Packer - For initial generation of server images
- Kubectl - For interacting with the Kubernetes cluster
- hcloud CLI tool - For interacting with the Hetzner Cloud API (Otherwise use the Hetzner web interface)
- SSH client - For connecting to the servers
- Python 3 - For running the CTFp CLI tool
- Python package python-hcl2 - Required by the CTFp CLI tool for parsing Terraform configuration files
And the following is required in order to deploy the platform:
- Hetzner Cloud account with one or more Hetzner Cloud projects
- Hetzner Cloud API Token - For authenticating with the Hetzner Cloud API
- Hetzner S3 buckets - For storing the Terraform state files, backups, and challenge data. We recommend using 3 separate buckets with separate access keys for security reasons
- Cloudflare account
- Cloudflare API Token - For authenticating with the Cloudflare API
- 3 Cloudflare-managed domains - For allowing the system to allocate a domain for the Kubernetes cluster. Used to allocate management, platform, and challenge domains.
- SMTP mail server - To allow CTFd to send emails to users (Password resets, notifications, etc.). The system is set up to allow outbound connections to Brevo SMTP on port 587.
- Discord channels to receive notifications. One for monitoring alerts and one for first-blood notifications.
- GitHub repository following CTF Pilot's Challenges template for CTF challenges and CTFd pages - A Git repository containing the CTF challenges to be deployed. This should be your own private repository using the CTF Pilot Challenges Template as a base. This may also contain the pages to be used in CTFd.
- GitHub repository containing the CTFd configuration - We recommend forking CTF Pilot's CTFd configuration repository.
- Access tokens to access the GitHub repositories and container registry - Fine-grained personal access token and Personal Access Tokens (PAT) with read access to the repositories containing the CTF challenges and CTFd configuration and GitHub container registry. We recommend setting up a bot account for this purpose.
- Elasticsearch endpoint - Elasticsearch instance with an endpoint and user credentials for log aggregation. Used to connect Filebeat to Elasticsearch.
CTFp supports three different environments for deployment:
- Test: Intended for testing and experimentation. This environment is suitable for trying out new features, configurations, and updates without affecting the production environment. It is recommended to use smaller server sizes and fewer nodes to minimize costs.
- Dev: Intended for development and staging purposes. This environment is suitable for testing new challenges, configurations, and updates before deploying them to production. It should closely resemble the production environment in terms of server sizes and configurations, but can still be scaled down to save costs.
- Prod: Intended for hosting live CTF competitions. This environment should be configured for high availability, performance, and security. It is recommended to use larger server sizes, more nodes, and robust configurations to ensure a smooth experience for participants.
The environments are configured through separate automated.<env>.tfvars files, allowing for isolated configurations and deployments.
In the CLI tool, you can specify the environment using the --test, --dev, or --prod flags in the commands. If no flag is provided, the default environment is test.
Tip
To understand the full configuration options and their implications, please refer to the documentation in the automated.<env>.tfvars or template.automated.tfvars file.
To configure the platform, you need to configure the automated.<env>.tfvars file located in the root of the repository.
It contains a number of configuration options for the platform.
Each configuration option is documented within the file, together with its possible values.
When the CLI tool runs, an automated check verifies that all required values are filled out correctly.
Therefore, be sure to fill out all required values before attempting to deploy the platform.
Optional values are commented out by default, and can be left as-is if the default value is acceptable.
The configuration file is the single source of truth for the platform's configuration, and is used by the CLI tool to deploy and manage the platform.
If configuration in the configuration file is changed, the changes will be applied to the platform during the next deployment.
If the platform is manually changed outside of the CLI tool, the changes will be reverted during the next deployment.
Important
The template.automated.tfvars file is git tracked, and MUST NOT be changed in the repository to include sensitive information.
Instead, copy the file to automated.<env>.tfvars and fill out the values there.
The automated.<env>.tfvars files are git ignored, and will not be tracked by git.
The file can be initialized using the ./ctfp.py init command.
The main configuration file does not expose every option of each component, and certain situations may require advanced configuration that is not included in the main configuration file.
These options are either intended to be static, or require manual configuration through the individual Terraform components.
Changing these options may lead to instability or data loss, and should be done with caution.
The CTFp CLI tool is a Python script that can be executed directly from the command line, and manages the deployment and lifecycle of the CTFp platform.
Prerequisites:
1. Install required Python dependencies:

   pip install -r requirements.txt

   This installs python-hcl2, which is required for parsing Terraform configuration files.

2. Ensure the script has executable permissions:

   chmod +x ctfp.py
Running the CLI tool:
You can now run commands directly:
./ctfp.py <command> [options]

Alternatively, you can always run it explicitly with Python:
python3 ctfp.py <command> [options]

Both methods are functionally equivalent. The direct execution method (first example) is more convenient for regular use.
Tip
You can run any command with the --help flag to get more information about the command and its options.
For example: ./ctfp.py deploy --help
Available commands:
- init - Initialize Platform Configuration
- generate-keys - Generate SSH Keys
- insert-keys - Insert SSH Keys into Configuration
- generate-images - Generate Custom Server Images
- generate-backend - Generate Terraform Backend Configuration
- deploy - Deploy Platform Components
- destroy - Destroy Platform Components
Below is a detailed overview of each available command:
Initializes the platform configuration for a specified environment by creating an automated.<env>.tfvars file based on the template.
Syntax:
./ctfp.py init [--force] [--test|--dev|--prod]

Options:
- --force: Force overwrite the configuration file if it already exists (by default, the tool prompts before overwriting)
- --test: Initialize TEST environment (default)
- --dev: Initialize DEV environment
- --prod: Initialize PROD environment
Example:
./ctfp.py init --test
./ctfp.py init --prod --force

Output: Creates automated.test.tfvars, automated.dev.tfvars, or automated.prod.tfvars in the repository root.
Generates SSH keys (ed25519) required for accessing the cluster nodes. Optionally inserts the base64-encoded keys directly into the configuration file.
Syntax:
./ctfp.py generate-keys [--insert] [--test|--dev|--prod]

Options:
- --insert: Automatically insert the generated keys into the automated.<env>.tfvars file
- --test: Generate keys for TEST environment (default)
- --dev: Generate keys for DEV environment
- --prod: Generate keys for PROD environment
Example:
./ctfp.py generate-keys --insert --test
./ctfp.py generate-keys --dev

Output: Creates keys/k8s-<env>.pub (public key) and keys/k8s-<env> (private key) in the keys/ directory.
Manually inserts previously generated SSH keys into the configuration file. Useful if keys were generated separately or if you need to update existing keys.
Syntax:
./ctfp.py insert-keys [--test|--dev|--prod]

Options:
- --test: Insert keys for TEST environment (default)
- --dev: Insert keys for DEV environment
- --prod: Insert keys for PROD environment
Example:
./ctfp.py insert-keys --test
./ctfp.py insert-keys --prod

Prerequisite: Keys must already exist in the keys/ directory.
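If you prefer to generate the keys yourself instead of using generate-keys, an equivalent manual step might look like this (following the keys/k8s-<env> naming convention described above; the comment string is arbitrary):

```bash
# Generate an ed25519 key pair for the test environment without a passphrase,
# then insert it into the configuration
ssh-keygen -t ed25519 -f keys/k8s-test -C "ctfp-test" -N ""
./ctfp.py insert-keys --test
```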
Generates custom Packer images for Kubernetes cluster nodes. These images are used when provisioning the cluster infrastructure on Hetzner Cloud.
Syntax:
./ctfp.py generate-images

Note
The generate-images command does not use environment flags. It requires you to select the Hetzner Cloud project interactively during execution.
Output: Packer creates and uploads custom images to your Hetzner Cloud project.
Time: This is typically the longest-running operation, taking 5-15 minutes.
Generates the Terraform backend configuration file (backend.tf) for the specified environment. This file configures the S3 backend for storing Terraform state files.
Syntax:
./ctfp.py generate-backend <component> <bucket> <region> <endpoint>

Arguments:
- <component>: Component for which to generate the backend configuration: cluster, ops, platform, or challenges
- <bucket>: Name of the S3 bucket to use for storing the Terraform state
- <region>: Region where the S3 bucket is located
- <endpoint>: Endpoint URL for the S3-compatible storage. For example, nbg1.your-objectstorage.com for Hetzner Cloud Object Storage in the nbg1 region.
Example:
./ctfp.py generate-backend cluster ctfp-cluster-state nbg1 nbg1.your-objectstorage.com
./ctfp.py generate-backend platform ctfp-platform-state fsn1 fsn1.your-objectstorage.com

Output: Creates an HCL configuration for the specified component's Terraform backend in the backend/generated/ directory.
See more about this command in the backend directory.
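Since each component needs its own backend configuration, a small loop can save some typing when all four components share the same bucket (placeholders as in the quick start):

```bash
# Generate backend configurations for all four components against the same bucket
for component in cluster ops platform challenges; do
  ./ctfp.py generate-backend "$component" <bucket> <region> <endpoint>
done
```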
Deploys one or more components of the platform to the specified environment. Can deploy individual components or the entire platform at once.
Syntax:
./ctfp.py deploy <component> [--auto-apply] [--test|--dev|--prod]

Arguments:
- <component>: Component to deploy: cluster, ops, platform, challenges, or all
  - cluster: Provisions Kubernetes infrastructure on Hetzner Cloud
  - ops: Deploys operational tools (ArgoCD, monitoring, logging, ingress)
  - platform: Deploys CTFd scoreboard and associated services
  - challenges: Deploys CTF challenges infrastructure
  - all: Deploys all components in sequence
Options:
- --auto-apply: Automatically apply Terraform changes without interactive prompts (use with extreme caution)
- --test: Deploy to TEST environment (default)
- --dev: Deploy to DEV environment
- --prod: Deploy to PROD environment
Example:
./ctfp.py deploy all --test
./ctfp.py deploy cluster --prod
./ctfp.py deploy platform --dev --auto-apply

Deployment Order: When deploying all, components are deployed in this order: cluster → ops → platform → challenges. Each component must be successfully deployed before the next begins.
Output: Creates Terraform state files in the terraform/ directory and outputs deployment status and timing information.
Warning
Destroying the platform will delete all data associated with the environment, including databases, user data, and challenge instances. This action cannot be undone. Always ensure you have backups before destroying production environments.
Destroys one or more components of the platform. This is the reverse of deploy and tears down infrastructure, databases, and services.
Syntax:
./ctfp.py destroy <component> [--auto-apply] [--test|--dev|--prod]

Arguments:
- <component>: Component to destroy: cluster, ops, platform, challenges, or all
Options:
- --auto-apply: Automatically confirm destruction without interactive prompts (use with extreme caution)
- --test: Destroy TEST environment (default)
- --dev: Destroy DEV environment
- --prod: Destroy PROD environment
Example:
./ctfp.py destroy all --prod
./ctfp.py destroy challenges --test --auto-apply

Destruction Order: When destroying all, components are destroyed in reverse order: challenges → platform → ops → cluster. This ensures dependencies are properly cleaned up.
The workflow for deploying and managing CTFp can be summarized in the following key phases:
1. Setup Phase:
   - Clone the repository and generate backend configurations.
2. Preparation Phase:
   - Generate custom server images (one-time setup per Hetzner project).
   - Generate SSH keys.
   - Create needed pre-requisites.
   - Configure the platform using the automated.<env>.tfvars file.
3. Deployment Phase:
   - Deploy components in sequence: Cluster → Ops → Platform → Challenges.
   - Use deploy all for automated deployment or deploy components individually.
4. Live Operations:
   - Monitor the platform using tools like ArgoCD, Grafana, and Prometheus.
   - Manage challenges, and apply updates as needed.
5. Teardown Phase:
   - Destroy components in reverse order: Challenges → Platform → Ops → Cluster.
   - Use destroy all for automated teardown or destroy components individually.
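As a rough end-to-end sketch of these phases in shell form (test environment, with the same placeholders as in the quick start):

```bash
# Setup & preparation
./ctfp.py init --test                                            # then fill out automated.test.tfvars
./ctfp.py generate-keys --insert --test
./ctfp.py generate-images                                        # interactive Hetzner project selection
./ctfp.py generate-backend cluster <bucket> <region> <endpoint>  # repeat for ops, platform, challenges

# Deployment
./ctfp.py deploy all --test

# Live operations
source kubectl.sh test
kubectl get nodes

# Teardown
./ctfp.py destroy all --test
```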
Tip
When upgrading existing clusters, it is recommended to drain node pools before changing their sizes, to avoid disruption of running workloads.
Update one node pool at a time, to minimize the impact on the cluster.
When updating the sizes of nodes in an existing cluster, it is important to follow a specific procedure to ensure a smooth transition and avoid downtime or data loss.
Below are the steps to update the sizes of nodes in an existing cluster:
1. Drain the Node Pool: Before making any changes, drain the node pool that you intend to update. This will safely evict all workloads from the nodes in the pool, allowing them to be rescheduled on other nodes in the cluster.

   # List nodes
   kubectl get nodes
   # Drain each node in the node pool
   kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

   You will need to repeat this for each node in the node pool. You can use tools such as draino to automate this process; a scripted sketch is also shown after this list.

2. Update the Configuration: Modify the automated.<env>.tfvars file to reflect the new sizes for the nodes in the node pool. Ensure that you only change the sizes for the specific node pool you are updating.

3. Deploy the Changes: Use the CTFp CLI tool to deploy the changes to the cluster. This will apply the updated configuration and resize the nodes in the specified node pool.

   ./ctfp.py deploy cluster --<env>

   Replace <env> with the appropriate environment flag (--test, --dev, or --prod).

4. Monitor the Deployment: Keep an eye on the deployment process to ensure that the nodes are resized correctly and that there are no issues. You can use kubectl get nodes to check the status of the nodes in the cluster.

5. Uncordon the Node Pool: Once the nodes have been resized and are ready, uncordon the node pool to allow workloads to be scheduled on the nodes again.

   kubectl uncordon <node-name>

   Repeat this for each node in the node pool.

6. Verify the Changes: Finally, verify that the workloads are running correctly on the resized nodes and that there are no issues in the cluster.

7. Repeat for Other Node Pools: If you have multiple node pools to update, repeat the above steps for each node pool, one at a time.
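For larger pools, the drain step can be scripted. A minimal sketch, assuming the pool's nodes can be selected by a label (the label below is a placeholder; adjust it to however your node pool is labelled):

```bash
#!/usr/bin/env bash
# Drain every node matching a (placeholder) pool label, one node at a time
set -euo pipefail

POOL_LABEL="<pool-label-key>=<pool-label-value>"

for node in $(kubectl get nodes -l "$POOL_LABEL" -o name); do
  echo "Draining ${node}..."
  kubectl drain "${node}" --ignore-daemonsets --delete-emptydir-data
done
```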
Warning
Changing node sizes can lead to temporary disruption of workloads.
Always ensure that you have backups of critical data before making changes to the cluster configuration.
Changes to the scale_type will only affect new nodes being created, and will not resize existing nodes, as the deployment of these nodes is done as resources are needed.
You may need to manually intervene to resize existing nodes if required, or delete them, forcing the system to create new nodes with the updated sizes. However, this may lead to downtime for workloads running on the nodes being deleted.
Note
Downscaling nodes may not be possible, depending on the initial size of the nodes and the new size.
Hetzner does not support downsizing nodes if they were initially created with a larger size.
In such cases, the nodes will need to be deleted, forcing the system to create new nodes with the desired size.
To deploy a new challenge, you will need to add the challenge to the configuration file, and then deploy the changes to the platform.
Challenges are split into three types:
- static - Static challenge, often with a handout (files, puzzles, etc.).
- shared - Challenge with a single instance for all teams to connect to.
- instanced - Challenge with individual instances for each team.
The challenge should be formatted using the CTF Pilot's Challenges Template, and built using the CTF Pilot's Challenge Toolkit and CTF Pilot's Challenge Schema.
In the configuration file, you will need to add the challenge under the Challenges configuration section.
For static files, add the challenge under the challenges_static list:
challenges_static = {
<category> = [
"<challenge-slug>"
]
}

For shared challenges, add the challenge under the challenges_shared list:
challenges_shared = {
<category> = [
"<challenge-slug>"
]
}

For instanced challenges, add the challenge under the challenges_instanced list:
challenges_instanced = {
<category> = [
"<challenge-slug>"
]
}

An example of this, using CTF Pilot's Challenges example repository, would look like this:
challenges_static = {
forensics = ["oh-look-a-flag"],
}
challenges_shared = {
web = ["the-shared-site"],
}
challenges_instanced = {
web = ["where-robots-cannot-search"],
misc = ["a-true-connection"],
}

In order to deploy the new challenge, you need to deploy the challenges component using the CLI tool:

./ctfp.py deploy challenges --<env>

To remove a challenge, delete it from the configuration file, and then deploy the challenges component again.
Challenge changes are automatically and continuously deployed through ArgoCD, so no manual intervention is required after the initial deployment.
Challenge updates are handled through the Git repository containing the challenges.
If a challenge's slug has been changed, you need to remove the old slug from the configuration file, and add the new slug. For this, follow the Deploying a new challenge guide.
To deploy a new page to CTFd, you will need to add the page to a Git repository that should be formatted using the CTF Pilot's Challenges Template, and built using the CTF Pilot's Challenge Toolkit and CTF Pilot's Page Schema.
In the configuration file, you will need to add the page under the Pages configuration section.
For pages, add the page under the pages list:
pages = [
"<page-slug>"
]

An example of this, using the CTF Pilot's Challenges example repository, would look like this:

pages = ["index"]

In order to deploy the new page, you need to deploy the platform component using the CLI tool:

./ctfp.py deploy platform --<env>

To remove a page, you need to remove it from the configuration file, and then deploy the platform component again.
Page changes are automatically and continuously deployed through ArgoCD, so no manual intervention is required after the initial deployment.
The CLI tool is designed to cover a wide range of deployment scenarios, but your specific setup may require customization within the individual Terraform components.
Each component is located in its own directory, and can be deployed manually using OpenTofu/terraform commands.
However, be aware that the CLI tool also manages the Terraform backend configuration, and you will need to set this up manually if you choose to deploy the components manually.
Documentation is located within each component directory, explaining the configuration options and how to deploy the component manually.
A template tfvars file is also located in each component directory in tfvars/template.tfvars, explaining the configuration options available for that component.
By default, the platform is set up to create automated backups of the database every 15 minutes, and store them in the configured S3 bucket.
You can restore the database from any available backup by timestamp.
To restore the database from a backup, follow these steps:
1. Identify the Backup: Determine the timestamp of the backup you want to restore from. Backups are stored in the S3 bucket specified in the configuration file, under the s3_bucket setting. You can list the backups using your S3 management tool or CLI.

2. Create a restore resource: The MariaDB operator provides an easy-to-use restore resource that can be used to restore the database from a backup.
   Create a YAML file named mariadb-restore.yaml with the following content:

   apiVersion: k8s.mariadb.com/v1alpha1
   kind: Restore
   metadata:
     name: restore
     namespace: db
   spec:
     mariaDbRef:
       name: ctfd-db
       namespace: db
     backupRef:
       name: db-backup-ctfd-db
     targetRecoveryTime: 2025-07-17T20:25:00Z

   Replace the targetRecoveryTime value with the desired timestamp in RFC 3339 format. The time does not need to be exact, as the restore operation will restore to the nearest available backup before the specified time.
   This requires the platform to be running, with the database operator and platform component both deployed, as this will set up the necessary resources for the restore operation.

3. Apply the restore resource: Apply the restore resource using kubectl:

   kubectl apply -f mariadb-restore.yaml

4. Monitor the restore process: Monitor the restore process by checking the status of the restore resource:

   kubectl -n db get restore

5. Cleanup: Once the restore is complete, you can delete the restore resource:

   kubectl -n db delete -f mariadb-restore.yaml
If you are restoring the full platform, you need to first deploy the cluster, ops, platform, and challenges components, before applying the restore resource.
After this, follow the "Restoring the CTFd-manager" guide to restore the CTFd-manager data.
If you want to restore the database to another MariaDB instance, you can copy the backup files from the S3 bucket, and use the MariaDB tools to restore the database manually.
The backup files are cleartext SQL dump files.
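A minimal sketch of such a manual restore, assuming S3 access through the AWS CLI and network access to the target MariaDB instance (bucket, endpoint, host, and credentials are placeholders):

```bash
# List and download a backup dump from the S3 bucket
aws s3 ls "s3://<backup-bucket>/" --endpoint-url "https://<endpoint>"
aws s3 cp "s3://<backup-bucket>/<backup-file>.sql" ./backup.sql --endpoint-url "https://<endpoint>"

# Import the cleartext SQL dump into another MariaDB instance
mariadb -h <db-host> -u <db-user> -p <database-name> < backup.sql
```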
The CTFd-manager is responsible for maintaining page and challenge states within CTFd, and has local configuration to keep track of what challenges are deployed and their IDs within CTFd.
To avoid a mismatch between this local state and the restored database, and to ensure the manager can correctly connect to and manage the challenges, it is important to restore the CTFd-manager data alongside the database.
You must manually update the challenge IDs in the challenge manager. In order to do this, the following flow can be used:

1. Retrieve the current challenge-id mapping from the ctfd-manager:

   kubectl -n challenge-config get configmap ctfd-challenges -o yaml > challenges.yaml

2. Open the challenges.yaml file and update the challenge IDs. (See the CTFd dashboard for challenge names and IDs.)

3. Apply the updated challenge mapping:

   kubectl -n challenge-config apply -f challenges.yaml

4. Generate a new access token for the CTFd manager. This is done on the admin user in CTFd.

5. Update the access token in the secrets for the CTFd manager:

   kubectl -n challenge-config edit configmap ctfd-access-token

6. Replace the token value with the new access token generated in step 4.

7. Restart the ctfd-manager to ensure it picks up the new configs:

   kubectl -n challenge-config rollout restart deployment ctfd-manager

   If it does not pick up the data, you can empty out the challenge-configmap-hashset configmap to force a reload.
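One way to empty that configmap is a merge patch that drops its data section entirely (a sketch; verify against your deployment before running):

```bash
# Clear the hash-set configmap so the ctfd-manager re-syncs all challenge data
kubectl -n challenge-config patch configmap challenge-configmap-hashset \
  --type merge -p '{"data": null}'
```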
The CTFd manager is now updated with the new challenge IDs and access token.
The system should therefore self-heal, filling in files and any missing elements of the challenges.
If you are restoring the full platform, you need to first deploy the cluster, ops, platform, and challenges components, before applying the restore resource.
You need to restore the CTFd-manager after restoring the database. You may restore the CTFd-manager before deploying the challenges component, but the configmap ctfd-challenges will then be empty, and you will need to manually format it.
CTFp is composed of four main components, each responsible for different aspects of the platform's functionality:
- Cluster: Responsible for provisioning and managing the underlying Kubernetes cluster infrastructure on Hetzner Cloud.
  This includes setting up the necessary servers, networking, and storage resources required for the cluster to operate.
  This can be found in the cluster directory, and as the cluster component in the CLI tool.
- Ops (Operations): Focuses on deploying and managing the operational tools and monitoring systems for the platform.
  This includes setting up ArgoCD, monitoring, logging, ingress controllers, and other essential services that ensure the smooth operation of the platform.
  This can be found in the ops directory, and as the ops component in the CLI tool.
- Platform: Handles the deployment and configuration of the CTFd scoreboard and its associated services.
  This includes setting up the database, caching, and storage solutions required for the scoreboard to function effectively.
  This can be found in the platform directory, and as the platform component in the CLI tool.
- Challenges: Manages the deployment and configuration of the CTF challenges.
  This includes setting up the necessary resources and configurations to host and manage the challenges securely and efficiently.
  This can be found in the challenges directory, and as the challenges component in the CLI tool.
Each component is designed to be modular and can be deployed independently or together, allowing for flexibility in managing the platform's infrastructure and services.
The CTFp repository is structured as follows:
ctfp/
├── backend/ # Terraform backend configurations
├── keys/ # Generated SSH keys
├── terraform/ # Terraform plans
├── tf-modules/ # Reusable Terraform modules
├── cluster/ # Cluster component Terraform configurations
├── ops/ # Ops component Terraform configurations
├── platform/ # Platform component Terraform configurations
├── challenges/ # Challenges component Terraform configurations
├── ctfp.py # CTFp CLI tool
├── kubectl.sh # Script for configuring kubectl context
├── README.md # This README file
├── requirements.txt # Python dependencies for the CLI tool
├── template.automated.tfvars # Template for CTFp CLI configuration
└── ...                        # Other files and directories, such as license, contributing guidelines, etc.

The above figure details how the different components come together to form the complete CTFp platform.
It highlights the central elements: CTFd, DB Cluster, Redis, CTFd-manager, KubeCTF, monitoring and deployment flow.
The figure serves as an overview of the platform's architecture, and therefore does not include all components and services involved in the platform.
The Cluster component is responsible for provisioning and managing the Kubernetes cluster infrastructure on Hetzner Cloud.
It deploys a kube-hetzner cluster within the Hetzner Cloud environment, setting up the necessary servers, networking, and storage resources required for the cluster to operate.
Specifically, it handles:
- Cluster provisioning: Creating and configuring the Kubernetes cluster using Hetzner Cloud resources.
- Node management: Setting up and managing the worker nodes that will run the workloads.
  This includes configuring node pools, scaling, and updating nodes as needed, along with setting up the node-autoscaler for automatic scaling based on demand.
- Networking: Configuring the network settings to ensure proper communication between cluster components.
  This includes setting up a private network, configuring VPN connectivity between the nodes, and setting up Flannel CNI for pod networking.
  It opens the required firewall rules to allow communication between nodes, and outbound connections to required services.
- Storage: Setting up the storage controller (CSI) to use Hetzner Block Storage volumes.
- Traefik proxy: Deploying Traefik as the ingress controller for managing incoming traffic to the cluster.
If an alternative cluster setup is desired, the Cluster component can be replaced with a different Kubernetes cluster, as long as it meets the requirements for running the platform.
Cluster requirements:
The Kubernetes cluster used for CTFp must meet the following requirements:
- Kubernetes version 1.33 or higher
- Traefik ingress controller, with correctly configured load balancer
- Persistent storage support (CSI). You may use whatever storage solution you prefer, as long as it supports dynamic provisioning of Persistent Volumes, and is set as the default storage class.
- Provides a kubeconfig file for the cluster, to allow the CLI tool to interact with the cluster. This config should have full admin access to the cluster.
- Has at least a single node with the taint cluster.ctfpilot.com/node=scaler:PreferNoSchedule for running challenge instances (a sketch of applying this taint is shown after this list).
  May be skipped if no instanced challenges are to be deployed, or if you change the taints in the challenge deployment configuration.
- Enough resources to run the platform components.
  This depends on the CTFd setup, challenges, and CTF size.
- Has correct firewall rules to allow outbound connections to required services, such as logging aggregation, SMTP servers, Discord, Cloudflare API, GitHub, and reverse connections from challenges (if they need internet access).
- Flannel CNI installed for networking.
- Cert-manager is not installed, as it is managed by the Ops component.
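Applying this taint to an existing node could look like the following (the node name is a placeholder):

```bash
# Mark a node for challenge-instance scheduling
kubectl taint nodes <node-name> cluster.ctfpilot.com/node=scaler:PreferNoSchedule
```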
The Ops component is responsible for deploying and managing the operational tools, services, and configurations required for the platform to function.
It deploys essential infrastructure components on top of the Kubernetes cluster, providing foundational services that other platform components depend on. This component must be deployed after the Cluster and before the Platform and Challenges components.
Specifically, it deploys the following:
- ArgoCD: GitOps continuous delivery tool used to deploy and manage applications within the Kubernetes cluster. ArgoCD continuously synchronizes the cluster state with Git repositories, enabling declarative infrastructure management.
- Cert-manager: Certificate management system for automating TLS/SSL certificate provisioning and renewal. It integrates with Cloudflare for DNS validation challenges.
- Traefik configuration: Deploys additional Helm chart configuration for the Traefik ingress controller already present in the cluster, enabling advanced routing and middleware features, along with additional logging with Filebeat log aggregation.
- Descheduler: Continuously rebalances the cluster by evicting workloads from nodes, ensuring optimal resource utilization and distribution across available nodes.
- Error Fallback: Deploys CTF Pilot's Error Fallback page service, providing custom error pages for HTTP error responses (e.g., 404, 502, 503).
- Filebeat: Log aggregation and forwarding system that sends logs to Elasticsearch or other log aggregation services, enabling centralized logging and analysis.
- MariaDB Operator: Kubernetes operator for managing MariaDB database instances. Allows automated provisioning, scaling, and management of MySQL-compatible databases.
- Redis Operator: Kubernetes operator for managing Redis cache instances. Enables automated deployment and management of Redis clusters for caching and data storage.
- Prometheus & Grafana Stack: Comprehensive monitoring and visualization solution. Prometheus scrapes metrics from cluster components, while Grafana provides dashboards for monitoring cluster health, resource usage, and application performance. Custom dashboards for Kubernetes, CTFd, and KubeCTF are included.
- Alertmanager: Alerting system integrated with Prometheus, used to send notifications based on defined alerting rules. Configured to send alerts to Discord channels for monitoring purposes.
The Platform component is responsible for deploying and managing the CTFd scoreboard and its associated services.
It handles the complete setup of the CTF competition's scoring system, database infrastructure, and management services. The Platform component must be deployed after both the Cluster and Ops components, as it depends on services provided by the Ops component.
Specifically, it deploys the following:
- CTFd: The main CTF scoreboard application. This is deployed as a customizable instance that manages team registration, challenge submissions, scoring, and leaderboards. It deploys using the provided CTFd configuration from the defined GitHub repository. See CTF Pilot's CTFd configuration for more information.
- CTFd-manager: A companion service for CTFd that provides automated configuration management and administrative functions. It handles initial setup of CTFd and continuous synchronization of pages and challenges.
- MariaDB database cluster: A highly available database cluster for storing CTFd data, user accounts, challenge information, and competition state. Deployed using the MariaDB Operator with automated backups to S3.
- Redis caching layer: A Redis cluster for caching CTFd data and improving performance.
- S3 storage configuration: Integration with S3-compatible object storage for storing challenge files, user uploads, and other assets uploaded to CTFd.
- Metrics and monitoring: Deploys metrics exporters and monitoring configurations specific to the CTFd instance for tracking performance and availability.
- Pages deployment: Automatically deploys CTF-related pages (e.g., rules, schedule, information pages) from the defined GitHub repository using CTFd-manager.
- Traefik ingress configuration: Sets up ingress routing rules to expose CTFd and related services through the Traefik ingress controller.
- Initial CTFd setup: Configures initial CTFd settings, such as competition name, start/end times, and other global settings using CTFd-manager.
The Platform automatically sets up Kubernetes secrets and configurations for the components deployed, so that this information is not required to be tracked within Git.
This means that critical secrets are stored within Kubernetes secrets once the Platform component is deployed.
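If you need to inspect one of these generated values, for example while debugging, a generic kubectl lookup such as the following can be used (the namespace, secret, and key names are placeholders, as they depend on the component in question):

```bash
# List the secrets in a component's namespace and decode a single value
kubectl -n <namespace> get secrets
kubectl -n <namespace> get secret <secret-name> -o jsonpath='{.data.<key>}' | base64 -d
```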
Backups of the database are automatically created and stored in the configured S3 storage, allowing for disaster recovery and data retention. Currently backups are configured to run every 15 minutes, and retained for 30 days.
Backups are stored as cleartext SQL dump files, so ensure that the S3 storage has proper access policies in place to prevent unauthorized access.
The Challenges component is responsible for managing the deployment and configuration of CTF challenges within the platform.
It handles the infrastructure setup required to host, isolate, and manage challenges across the Kubernetes cluster. Challenge instances can be deployed in different modes (static, shared or instanced), and the component manages the networking, resource allocation, and lifecycle of challenge containers. The Challenges component must be deployed after the Cluster, Ops, and Platform components.
Specifically, it manages the following:
- Challenge deployment infrastructure: Sets up the necessary Kubernetes resources for hosting challenges. This includes namespaces, network policies, and RBAC configurations for proper challenge isolation and access control.
- KubeCTF integration: Integrates with KubeCTF to enable dynamic challenge instance management. KubeCTF handles the creation, scaling, and destruction of challenge instances.
- Challenge mode support: Supports three deployment modes:
- Static challenges: Challenges that are deployed as static files (e.g., forensics challenges) and are only deployed to CTFd through CTFd-manager.
- Shared challenges: Challenges that have a single instance shared among all teams (e.g., web challenges). This is deployed through ArgoCD.
- Instanced challenges: Challenges that have individual instances for each team (e.g., dynamic web challenges). This is managed through KubeCTF.
- IP whitelisting: Implements IP-based access control to challenges, allowing restrictions on which IPs or networks can access specific challenges. For public access, the 0.0.0.0/0 CIDR can be used.
- Custom fallback pages: Deploys custom error pages for various challenge states (e.g., instancing fallback page for when a challenge is being provisioned).
- Challenge deployment and configuration management: Deploys challenge deployment configurations through ArgoCD, allowing for GitOps-style management of challenge definitions and updates, controlling it through defined GitHub repository and defined challenge slugs to be deployed.
Challenges are deployed and managed through Git repositories, with configurations defined in challenge definition files. Use the CTF Pilot's Challenge Toolkit and CTF Pilot's Challenges Template for challenge development.
By default, the CTF Pilot's Challenge Toolkit deployment templates use taints to control which nodes challenge instances are scheduled on. Therefore, the cluster must have at least one node with the taint cluster.ctfpilot.com/node=scaler:PreferNoSchedule if using Instanced challenges, to ensure challenge instances are properly scheduled and deployed.
The challenge deployment system utilizes a combination of GitOps principles and dynamic instance management to efficiently deploy and manage CTF challenges.
It is built to use CTF Pilot's Challenge Toolkit and CTF Pilot's Challenges Template for preparing the challenge definitions, and ArgoCD for deploying the challenge configurations to the Kubernetes cluster.
Here, ArgoCD continuously monitors the defined GitHub repository for changes, and automatically applies updates to the cluster.
Static challenges are deployed as configurations for CTFd through CTFd-manager, while Shared challenges are deployed as single instances through ArgoCD. Instanced challenges are managed through KubeCTF, where ArgoCD deploys deployment templates to KubeCTF.
Container images can be stored in any container registry, as long as the Kubernetes cluster has access to pull the images.
By default, pull secrets are configured for GitHub Container Registry, and are currently not configurable through the platform configuration.
Any additional pull secrets must be created manually in the cluster, and referenced in the challenge deployment configuration.
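Creating such an additional pull secret manually might look like this (the secret name, registry, and credentials are placeholders; the namespace should match where the challenge is deployed, e.g. ctfpilot-challenges):

```bash
# Create an image pull secret for an additional container registry
kubectl -n ctfpilot-challenges create secret docker-registry <pull-secret-name> \
  --docker-server=<registry-host> \
  --docker-username=<username> \
  --docker-password=<access-token>
```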
For more information on how to develop challenges, see the CTF Pilot's Challenge Toolkit and CTF Pilot's Challenges Template. An example challenges repository can be found at CTF Pilot's Challenges example repository.
The following diagrams provide an overview of CTFp's cluster and challenge networking setups.
CTFp requires three domains, as it configures different services under different domains:
- Management domain: Used for accessing the management services, such as ArgoCD, Grafana, and Prometheus.
  This domain should only be distributed to administrators.
- Platform domain: Used for accessing the CTFd scoreboard and related services.
  This domain is distributed to participants for accessing the CTF platform.
- CTF domain: Used for accessing the challenges.
  This domain is also distributed to participants for accessing the challenges.
The platform does not require you to allocate the full top-level domain (TLD) for CTFp, as subdomains for each of the three domains can be configured.
Management and Platform domains are configured to be proxied through Cloudflare, to take advantage of their CDN and DDoS protection services.
CTF domain is not proxied, as challenges often require direct access to the challenge instances.
Domain management is built into the system, and DNS entries are therefore automatically created and managed through Cloudflare's API.
Hetzner Cloud's Load Balancers are used to distribute incoming traffic to the Traefik ingress controllers deployed on each node in the cluster.
Within the cluster, Traefik handles routing of incoming requests to the appropriate services based on the configured ingress rules.
The network is shared between nodes using Hetzner Cloud's private networking, ensuring efficient and secure communication between cluster components.
As described in the Cluster networking section, CTFp utilizes three main domains for different purposes.
Challenges are accessed through the CTF domain, which is specifically designated for hosting and serving challenge instances, and are therefore not proxied through Cloudflare; they point directly to the Hetzner Cloud Load Balancers.
This load balancer is set up to forward all incoming traffic to the Traefik ingress controllers deployed within the Kubernetes cluster.
Traefik supports TCP and HTTP(S) routing, allowing it to handle a wide range of challenge types and protocols.
However, a limited number of middleware options are available for TCP routing, so ensure that your challenges are compatible with the available features.
IP whitelisting is implemented at the ingress level, allowing challenges to restrict access based on IP addresses or CIDR ranges.
By default, HTTP(S) traffic is configured with fallback middleware, providing custom error pages for various HTTP error responses (e.g., 404, 502, 503).
When an instanced challenge is being provisioned, the custom error page will inform the user that the challenge is being started and automatically refresh the page until the challenge is ready.
Shared and Instanced challenges are deployed within either ctfpilot-challenges or ctfpilot-challenges-instanced namespaces, while static challenges are only deployed to CTFd through CTFd-manager.
The two namespaces are configured with network policies to restrict any outgoing local traffic, allowing only outbound internet access.
Challenges can therefore not talk to each other, nor communicate across multiple deployments.
If your challenge requires multiple containers, they need to be deployed within the same challenge deployment, and set up in a sidecar pattern.
Cluster DNS is not available for challenges, so any service discovery must be handled through external DNS services. Challenges allow for multiple endpoints to be defined, across both HTTP(S) and TCP protocols.
TCP endpoints are handled either through a custom Traefik port (only available for shared TCP challenges), or as an SSL TCP endpoint using SNI routing (recommended).
Hetzner limits the number of ports available for Load Balancers, so ensure that you plan accordingly when deploying challenges requiring TCP endpoints using custom ports.
Currently, configuring custom ports for TCP endpoints is not supported through the platform configuration, and must be set up manually after deployment, or manually in the cluster Terraform module.
SSL TCP connections can be made using one of the following command examples:
# Using openssl
openssl s_client -connect <challenge-domain>:443 -servername <challenge-domain>
# Netcat
ncat --ssl <challenge-domain> 443

The netcat command is the one displayed in the CTFd plugin for Kube-CTF.
We understand that this increases the complexity of challenge connection, but it provides a way to easily and dynamically allocate TCP endpoints without the need for managing multiple ports on the Load Balancer.
If you need help or have questions regarding CTFp, you can reach out through the following channels:
- GitHub Issues: You can open an issue in the CTFp GitHub repository for bug reports, feature requests, or general questions.
- Discord: Join the CTF Pilot Discord server to engage with the community, ask questions, and get support from other users and contributors.
The project is delivered as-is, and we do not provide official support services. However, we encourage community engagement and collaboration to help each other out.
Contributors and maintainers may assist with questions and issues as time permits.
We welcome contributions of all kinds, from code and documentation to bug reports and feedback!
Please check the Contribution Guidelines (CONTRIBUTING.md) for detailed guidelines on how to contribute.
CTFp is a dual-licensed project. To maintain the ability to distribute contributions across all our licensing models, all code contributions require signing a Contributor License Agreement (CLA).
You can review the CLA here. CLA signing happens automatically when you create your first pull request.
To administrate the CLA signing process, we are using CLA assistant lite.
A copy of the CLA document is also included in this repository as CLA.md.
Signatures are stored in the cla repository.
CTF Pilot started as a CTF platform project, originating in Brunnerne.
The goal of the project is to provide a scalable, resilient, and easy-to-use CTF platform for hosting large-scale Capture The Flag competitions, starting with BrunnerCTF 2025.
The project is still in active development, and we welcome contributions from the community to help improve and expand the platform's capabilities.
CTFp is licensed under a dual license, the PolyForm Noncommercial License 1.0.0 for non-commercial use, and a Commercial License for commercial use.
You can find the full license for non-commercial use in the LICENSE.md file.
For commercial licensing, please contact The0Mikkel.
Without commercial licensing, the platform MUST NOT be used for commercial purposes, including but not limited to:
- Hosting CTF competitions for profit
- Hosting a CTF as a commercial organization, even if the CTF itself is free or only provided to internal users
- Offering CTF hosting as a paid service
- Using the platform in any commercial product or service
We encourage all modifications and contributions to be shared back with the community, for example through pull requests to this repository.
We also encourage all derivative works to be publicly available under PolyForm Noncommercial License 1.0.0.
The license terms must be followed at all times.
For information regarding how to contribute, see the contributing section above.
CTF Pilot is owned and maintained by The0Mikkel.
Required Notice: Copyright Mikkel Albrechtsen (https://themikkel.dk)
We expect all contributors to adhere to our Code of Conduct to ensure a welcoming and inclusive environment for all.