Environment Variables

The project is configured through environment variables defined in a .env file at the project root. Docker Compose reads this file and injects the variables into each service container.

Overview

An example configuration file is provided as .env.ml.example. The recommended setup path is:

just setup

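This generates secrets and creates the .env file. To create the file manually instead, copy the example and restrict its permissions; you then need to fill in the secret values yourself (see Security Considerations):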

cp .env.ml.example .env
chmod 600 .env
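
After creating the file, it can be useful to confirm that it parses and that its permissions are owner-only. The following is a minimal sketch, not part of the project: the helper names `parse_env_file` and `is_owner_only` are hypothetical, and the parser handles only plain KEY=VALUE lines (no interpolation or multi-line values).

```python
import os
import stat
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and any wrapping quote characters.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def is_owner_only(path):
    """Return True if group and others have no permission bits (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

For example, `is_owner_only(".env")` should return True after running chmod 600 .env on a POSIX system.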

Complete Reference

Quantum Pipeline Settings

| Variable | Default | Description |
| --- | --- | --- |
| LOG_LEVEL | INFO | Logging verbosity: DEBUG, INFO, WARNING, ERROR. |
| SIMULATION_METHOD | statevector | Qiskit Aer simulation method: statevector, density_matrix, automatic, stabilizer, unitary, extended_stabilizer, matrix_product_state, superop, tensor_network. |
| CONTAINER_TYPE | unknown | Label identifying the container hardware config (e.g. CPU, GPU_GTX1060_6GB). Set in Docker Compose per service, read by the performance monitor and VQE runner for tagging results. |
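
To illustrate how these three settings might be read with their documented defaults, here is a minimal sketch. The function name `load_pipeline_settings` and the returned dictionary shape are assumptions for illustration; the project's own settings module may validate differently or not at all.

```python
import os

# Allowed values copied from the table above.
VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}
VALID_METHODS = {
    "statevector", "density_matrix", "automatic", "stabilizer", "unitary",
    "extended_stabilizer", "matrix_product_state", "superop", "tensor_network",
}


def load_pipeline_settings(env=None):
    """Read the pipeline settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    log_level = env.get("LOG_LEVEL", "INFO").upper()
    method = env.get("SIMULATION_METHOD", "statevector")
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"Unsupported LOG_LEVEL: {log_level}")
    if method not in VALID_METHODS:
        raise ValueError(f"Unsupported SIMULATION_METHOD: {method}")
    return {
        "log_level": log_level,
        "simulation_method": method,
        "container_type": env.get("CONTAINER_TYPE", "unknown"),
    }
```

Passing a plain dictionary as `env` keeps the function easy to test without touching the real process environment.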

IBM Quantum (Optional)

These variables configure the IBM Quantum cloud backend. They are only needed when running VQE on real hardware or on IBM cloud simulators via QiskitRuntimeService.

| Variable | Default | Description |
| --- | --- | --- |
| IBM_RUNTIME_CHANNEL | - | IBM Quantum channel: ibm_quantum, ibm_cloud, or local. |
| IBM_RUNTIME_INSTANCE | - | IBM Quantum service instance identifier. |
| IBM_RUNTIME_TOKEN | - | IBM Quantum API token. |
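
Because these variables are optional, code that uses them has to handle the unset case. The sketch below collects them into keyword arguments named after the `channel`, `token`, and `instance` parameters of QiskitRuntimeService. The helper `ibm_runtime_kwargs` is hypothetical, and treating a missing token as "IBM backend not configured" is an assumption, not the project's documented behavior.

```python
import os


def ibm_runtime_kwargs(env=None):
    """Return kwargs for QiskitRuntimeService, or None if no token is set."""
    env = os.environ if env is None else env
    token = env.get("IBM_RUNTIME_TOKEN")
    if not token:
        # Assumption: without a token the IBM backend is considered disabled.
        return None
    kwargs = {"token": token}
    for env_name, kwarg in (
        ("IBM_RUNTIME_CHANNEL", "channel"),
        ("IBM_RUNTIME_INSTANCE", "instance"),
    ):
        value = env.get(env_name)
        if value:
            kwargs[kwarg] = value
    return kwargs
```

If qiskit-ibm-runtime is installed and the result is not None, it could be passed as `QiskitRuntimeService(**kwargs)`.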

Kafka Configuration

| Variable | Default | Description |
| --- | --- | --- |
| KAFKA_VERSION | 4.2.0 | Apache Kafka Docker image tag. |
| KAFKA_SERVERS | kafka:9092 | Kafka bootstrap servers. Also read by the CLI entrypoint at runtime; when set, it overrides the --kafka-bootstrap-servers flag. |
| KAFKA_EXTERNAL_HOST_IP | localhost | IP address or hostname for external Kafka access. |
| KAFKA_EXTERNAL_PORT | 9094 | External Kafka listener port. |
| KAFKA_INTERNAL_PORT | 9092 | Internal Kafka listener port. |
| KAFKA_NODE_ID | 1 | KRaft node ID. |
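
The precedence described for KAFKA_SERVERS (environment variable wins over the CLI flag) can be sketched as follows. This is an illustration only: the function `resolve_bootstrap_servers` is hypothetical, and using kafka:9092 as the flag's own default is an assumption rather than something taken from the project's CLI.

```python
import argparse
import os


def resolve_bootstrap_servers(argv, env=None):
    """Return the bootstrap servers, letting KAFKA_SERVERS override the flag."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    # Assumed flag default; the real CLI may define a different one.
    parser.add_argument("--kafka-bootstrap-servers", default="kafka:9092")
    args, _unknown = parser.parse_known_args(argv)
    # A non-empty environment value takes precedence over the flag.
    return env.get("KAFKA_SERVERS") or args.kafka_bootstrap_servers
```

Note the consequence of this ordering: inside a container where KAFKA_SERVERS is injected by Docker Compose, passing the flag on the command line has no effect.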

Schema Registry

| Variable | Default | Description |
| --- | --- | --- |
| SCHEMA_REGISTRY_VERSION | 8.2.0 | Docker image tag. |
| SCHEMA_REGISTRY_TOPIC | _schemas | Internal Kafka topic for schema storage. |
| SCHEMA_REGISTRY_HOSTNAME | schema-registry | Schema Registry hostname. |
| SCHEMA_REGISTRY_PORT | 8081 | HTTP listener port. |
| SCHEMA_REGISTRY_URL | http://schema-registry:8081 | Full URL for Schema Registry. Used by batch simulation containers in Docker Compose. Not present in .env.ml.example; the default is set in the compose file. |

Garage Configuration (S3-compatible Storage)

| Variable | Default | Description |
| --- | --- | --- |
| GARAGE_VERSION | v2.2.0 | Garage Docker image tag. |
| GARAGE_S3_API_PORT | 3901 | S3 API port. |
| GARAGE_RPC_PORT | 3900 | Internal RPC port. |
| GARAGE_ADMIN_PORT | 3903 | Admin API port. |
| GARAGE_WEB_PORT | 3902 | Web endpoint port. |
| GARAGE_RPC_SECRET | - | 32-byte hex RPC secret (generated by just setup). |
| GARAGE_ADMIN_TOKEN | - | 32-byte hex admin token (generated by just setup). |
| S3_ACCESS_KEY | - | S3 access key (generated by just setup after first start). |
| S3_SECRET_KEY | - | S3 secret key (generated by just setup after first start). |
| S3_REGION | garage | S3 region identifier. |
| S3_ENDPOINT | http://garage:3901 | S3 endpoint URL. |
| S3_RAW_BUCKET | raw-results | Bucket for raw VQE results. |
| S3_FEATURES_BUCKET | features | Bucket for processed feature datasets. |
| S3_ICEBERG_BUCKET | warehouse | Bucket for Iceberg warehouse data. |
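
As an illustration of how the S3 variables fit together, the sketch below maps them onto the keyword arguments that `boto3.client("s3", ...)` accepts. Whether the project uses boto3 is an assumption, and `s3_client_kwargs` is a hypothetical helper; the block itself does not import boto3, so it runs without it.

```python
import os


def s3_client_kwargs(env=None):
    """Build connection kwargs for an S3 client pointed at Garage."""
    env = os.environ if env is None else env
    missing = [k for k in ("S3_ACCESS_KEY", "S3_SECRET_KEY") if not env.get(k)]
    if missing:
        # The credentials have no default; they must be generated first.
        raise KeyError(f"Missing required variables: {', '.join(missing)}")
    return {
        "endpoint_url": env.get("S3_ENDPOINT", "http://garage:3901"),
        "region_name": env.get("S3_REGION", "garage"),
        "aws_access_key_id": env["S3_ACCESS_KEY"],
        "aws_secret_access_key": env["S3_SECRET_KEY"],
    }
```

With boto3 installed, the result could be used as `boto3.client("s3", **s3_client_kwargs())`. Failing early on missing credentials gives a clearer error than an authentication failure at request time.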

Airflow Configuration

| Variable | Default | Description |
| --- | --- | --- |
| AIRFLOW_VERSION | 3.1.8 | Airflow base image tag. |
| AIRFLOW_POSTGRES_USER | airflow | PostgreSQL username. |
| AIRFLOW_POSTGRES_PASSWORD | airflow-password | PostgreSQL password. |
| AIRFLOW_POSTGRES_DB | airflow | PostgreSQL database name. |
| AIRFLOW_FERNET_KEY | - | Fernet encryption key for connection credentials. See Security Considerations. |
| AIRFLOW_WEBSERVER_SECRET_KEY | - | Secret key for webserver session signing. See Security Considerations. |
| AIRFLOW_JWT_SECRET | - | JWT secret for Airflow 3.x API authentication. See Security Considerations. |
| AIRFLOW_WEBSERVER_PORT | 8084 | Host port for the Airflow web UI. |
| AIRFLOW_DAGS_PAUSED_AT_CREATION | True | Whether new DAGs start paused. |
| AIRFLOW_LOAD_EXAMPLES | False | Load Airflow example DAGs. |
| AIRFLOW_ADMIN_USERNAME | admin | Initial admin username. |
| AIRFLOW_ADMIN_PASSWORD | admin | Initial admin password. |
| AIRFLOW_ADMIN_FIRSTNAME | Admin | Admin user first name. |
| AIRFLOW_ADMIN_LASTNAME | User | Admin user last name. |
| AIRFLOW_ADMIN_EMAIL | admin@example.com | Admin user email. |
| POSTGRES_VERSION | 16-alpine | PostgreSQL Docker image tag. |

Spark Configuration

| Variable | Default | Description |
| --- | --- | --- |
| SPARK_VERSION | 4.0.2 | Spark version (used in image tags). |
| SPARK_MASTER_HOST | spark-master | Spark master hostname. |
| SPARK_MASTER_PORT | 7077 | Spark master RPC port. |
| SPARK_WORKER_MEMORY | 2G | Memory allocated per Spark worker. |
| SPARK_WORKER_CORES | 2 | CPU cores allocated per Spark worker. |
| SPARK_DEFAULT_QUEUE | default | Default resource queue for Spark jobs. |

MLflow Configuration

| Variable | Default | Description |
| --- | --- | --- |
| MLFLOW_VERSION | v3.10.1 | MLflow Docker image tag. |
| MLFLOW_PORT | 5000 | MLflow tracking server port. |
| MLFLOW_TRACKING_URI | http://localhost:5000 | MLflow tracking server URI. Used by the quantum_pipeline.ml.tracking module to locate the tracking server. Not set in .env.ml.example (defaults to http://localhost:5000 in code). |

Monitoring Configuration

These variables control the performance monitoring subsystem. The code default for MONITORING_ENABLED is false (in settings.py), but .env.ml.example sets it to true so monitoring is on by default in Docker Compose deployments.

| Variable | Default | Description |
| --- | --- | --- |
| MONITORING_ENABLED | false (code) / true (.env example) | Enable performance metric collection. Accepts true, 1, yes, on. |
| MONITORING_INTERVAL | 10 | Metric collection interval in seconds. |
| PUSHGATEWAY_URL | http://localhost:9091 | Prometheus PushGateway URL for pushing performance metrics. |
| MONITORING_EXPORT_FORMAT | prometheus | Export format: json, prometheus, or both (comma-separated, e.g. json,prometheus). |
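
The boolean and comma-separated formats above can be parsed along these lines. This is a sketch under stated assumptions, not the code in settings.py: the function names are hypothetical, and case-insensitive matching and whitespace trimming are assumptions.

```python
import os

# Truthy strings listed in the MONITORING_ENABLED row.
TRUTHY = {"true", "1", "yes", "on"}
KNOWN_FORMATS = {"json", "prometheus"}


def parse_bool(value):
    """Return True only for the documented truthy strings."""
    return value.strip().lower() in TRUTHY


def parse_export_formats(value):
    """Split a comma-separated format list and reject unknown entries."""
    formats = [part.strip().lower() for part in value.split(",") if part.strip()]
    unknown = set(formats) - KNOWN_FORMATS
    if unknown:
        raise ValueError(f"Unknown export format(s): {sorted(unknown)}")
    return formats


def load_monitoring_settings(env=None):
    """Read the monitoring variables using the code-level defaults."""
    env = os.environ if env is None else env
    return {
        "enabled": parse_bool(env.get("MONITORING_ENABLED", "false")),
        "interval": int(env.get("MONITORING_INTERVAL", "10")),
        "pushgateway_url": env.get("PUSHGATEWAY_URL", "http://localhost:9091"),
        "export_formats": parse_export_formats(
            env.get("MONITORING_EXPORT_FORMAT", "prometheus")
        ),
    }
```

Any value outside the truthy set, including an empty string, is treated as disabled in this sketch.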

Docker Build Configuration

| Variable | Default | Description |
| --- | --- | --- |
| DOCKER_GID | 970 | GID of the docker group on the host. Used to build the Airflow image so it can access the Docker socket for batch generation DAGs. Check with stat -c '%g' /var/run/docker.sock. |
| CUDA_ARCH | 8.6 (Dockerfile) / 6.1 (.env.ml.example) | CUDA compute capability for the GPU image build. The Dockerfile.gpu default is 8.6 (Ampere); .env.ml.example overrides to 6.1 (Pascal) to match the project's test hardware. 6.1 = Pascal, 7.5 = Turing, 8.6 = Ampere, 8.9 = Ada Lovelace. |
| QUANTUM_PIPELINE_HOST_ROOT | /home/zweiss/code/quantum-pipeline | Absolute path to the quantum-pipeline repo on the host. Used by docker-compose.ml.yaml and the batch generation script (scripts/generate_ml_batch.py) to resolve volume mount paths. The compose file defaults to the maintainer's local path; override this for your own system. |
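
The DOCKER_GID check and the CUDA_ARCH mapping can also be done from Python. In this sketch, `socket_gid` reads the same owning group ID that stat -c '%g' prints, and `describe_cuda_arch` looks up the architecture names listed above; both helper names are hypothetical.

```python
import os

# Compute capability to architecture name, as listed in the CUDA_ARCH row.
CUDA_ARCH_NAMES = {
    "6.1": "Pascal",
    "7.5": "Turing",
    "8.6": "Ampere",
    "8.9": "Ada Lovelace",
}


def socket_gid(path="/var/run/docker.sock"):
    """Return the numeric group ID that owns the given path."""
    return os.stat(path).st_gid


def describe_cuda_arch(value):
    """Map a compute capability string to its architecture name."""
    return CUDA_ARCH_NAMES.get(value, "unknown")
```

If `socket_gid()` returns something other than the DOCKER_GID value in .env, update the variable before building the Airflow image.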

R2 Sync Configuration

Optional: these variables are only needed if you sync data to Cloudflare R2.

| Variable | Default | Description |
| --- | --- | --- |
| R2_ACCOUNT_ID | - | Cloudflare account ID. |
| R2_ACCESS_KEY_ID | - | R2 access key. |
| R2_SECRET_ACCESS_KEY | - | R2 secret key. |
| R2_BUCKET | qp-data | R2 bucket name. |

Security Considerations

Never commit .env to version control, and always set restrictive permissions on it (chmod 600 .env). Generate strong keys for Airflow:

# Fernet key
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"

# Webserver secret key
python -c "import secrets; print(secrets.token_hex(32))"

# JWT secret (Airflow 3.x)
python -c "import secrets; print(secrets.token_hex(32))"

The just setup script generates these automatically.
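
For reference, the same kinds of values can be produced in one place using only the standard library. This is a sketch, not the actual just setup implementation: the function names are hypothetical, and it relies on the fact that a Fernet key is 32 random bytes encoded as URL-safe base64.

```python
import base64
import os
import secrets


def generate_fernet_key():
    """Return 32 random bytes as URL-safe base64, the format Fernet keys use."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode()


def generate_secrets():
    """Return fresh values for the secret variables that have no default."""
    return {
        "AIRFLOW_FERNET_KEY": generate_fernet_key(),
        "AIRFLOW_WEBSERVER_SECRET_KEY": secrets.token_hex(32),
        "AIRFLOW_JWT_SECRET": secrets.token_hex(32),
        # 32 random bytes rendered as 64 hex characters.
        "GARAGE_RPC_SECRET": secrets.token_hex(32),
        "GARAGE_ADMIN_TOKEN": secrets.token_hex(32),
    }


if __name__ == "__main__":
    for name, value in generate_secrets().items():
        print(f"{name}={value}")
```

Running the block prints KEY=VALUE lines that can be pasted into .env. The S3 access and secret keys are not included because, as noted above, they are created after Garage first starts.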

For tighter security, restrict service port exposure by binding to 127.0.0.1 (e.g., "127.0.0.1:8081:8081") and use unique passwords for each service.