Examples¶
Real-world usage examples demonstrating common Quantum Pipeline workflows, from simple simulations to full data pipeline integration.
Example 1: Simple H2 Simulation¶
Goal: Run a minimal VQE simulation for a hydrogen molecule (H2) to verify the pipeline is working.
Command¶
python quantum_pipeline.py \
-f data/molecules.json \
--basis sto3g \
--max-iterations 50 \
--optimizer L-BFGS-B
Expected Output¶
QuantumPipeline - INFO - Loading molecule data from data/molecules.json
VQERunner - INFO - Processing molecule 1:
Molecule:
Multiplicity: 1
Charge: 0
Unit: Angstrom
Geometry:
H [0.0, 0.0, 0.0]
H [0.0, 0.0, 0.74]
Masses:
H 1.008
H 1.008
VQERunner - INFO - Generating hamiltonian based on the molecule...
VQERunner - INFO - Hamiltonian generated in 0.088 seconds.
VQERunner - INFO - Mapping fermionic operator to qubits
VQERunner - INFO - Problem mapped to qubits in 0.023 seconds.
VQERunner - INFO - Running VQE procedure...
VQESolver - INFO - Initializing Aer simulator backend...
VQESolver - INFO - Aer simulator backend initialized.
VQESolver - INFO - Initializing the ansatz...
VQESolver - INFO - Ansatz initialized.
VQESolver - INFO - Optimizing ansatz and hamiltonian...
VQESolver - INFO - Ansatz and hamiltonian optimized.
VQESolver - INFO - Starting the minimization process with max iterations equal to 50.
VQESolver - INFO - Simulation via Aer completed in 5.23 seconds and 50 iterations.
VQERunner - INFO - VQE procedure completed in 5.32 seconds
VQERunner - INFO - Result provided in 5.43 seconds.
Notes¶
- The
sto3gbasis set is the smallest and fastest, suitable for quick tests. - H2 maps to 4 qubits with
sto3g, making it an ideal test case. - The expected ground state energy for H2 at 0.74 Angstrom bond length is approximately -1.137 Hartree.
- Use this pattern to verify your installation before running larger simulations.
Example 2: Multi-Molecule Batch Processing¶
Goal: Process multiple molecules (H2, LiH, H2O) in a single run using the same simulation configuration.
Molecule Data File¶
Create a file data/molecules.json containing multiple molecules:
[
{
"symbols": ["H", "H"],
"coords": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.74]],
"multiplicity": 1,
"charge": 0,
"units": "angstrom",
"masses": [1.008, 1.008]
},
{
"symbols": ["Li", "H"],
"coords": [[0.0, 0.0, 0.0], [0.0, 0.0, 1.6]],
"multiplicity": 1,
"charge": 0,
"units": "angstrom",
"masses": [6.941, 1.008]
},
{
"symbols": ["O", "H", "H"],
"coords": [[0.0, 0.0, 0.0], [0.757, 0.586, 0.0], [-0.757, 0.586, 0.0]],
"multiplicity": 1,
"charge": 0,
"units": "angstrom",
"masses": [15.999, 1.008, 1.008]
}
]
Command¶
python quantum_pipeline.py \
-f data/molecules.json \
--basis sto3g \
--max-iterations 100 \
--optimizer L-BFGS-B \
--ansatz-reps 2 \
--shots 1024 \
--report
Expected Behavior¶
- The pipeline processes each molecule sequentially.
- For each molecule, it generates the Hamiltonian, maps to qubits, and runs VQE.
- A PDF report is generated in
gen/containing results for all molecules. - Processing time varies by molecule complexity: H2 (seconds), LiH (tens of seconds), H2O (minutes).
Notes¶
- Larger molecules require more qubits: H2 uses 4, LiH uses 12, H2O uses 14 (with
sto3g). - Use
--reportto generate a consolidated PDF with convergence plots for each molecule. - For large batches, consider adding
--enable-performance-monitoringto track resource usage.
Example 3: GPU-Accelerated Research¶
Goal: Run a high-accuracy simulation using GPU acceleration with the cc-pvdz basis set for publication-quality results.
Command¶
python quantum_pipeline.py \
-f data/molecules.json \
--basis cc-pvdz \
--convergence \
--threshold 1e-8 \
--optimizer L-BFGS-B \
--ansatz-reps 5 \
--shots 4096 \
--optimization-level 3 \
--gpu \
--simulation-method statevector \
--report \
--enable-performance-monitoring
Expected Behavior¶
- GPU acceleration is enabled, offloading statevector operations to the NVIDIA GPU.
- The
cc-pvdzbasis set provides high accuracy but requires significantly more qubits (e.g., 58 qubits for H2O). - Convergence mode runs until the energy change falls below \(10^{-8}\) Hartree.
- Performance monitoring tracks CPU, GPU, and memory usage throughout the simulation.
- A detailed PDF report is generated with convergence curves and energy analysis.
Notes¶
- GPU acceleration provides 10-100x speedup for large circuits compared to CPU-only execution.
- L-BFGS-B is the recommended optimizer for GPU workloads due to its efficient convergence.
- The
cc-pvdzbasis set generates many more qubits thansto3g. Ensure you have sufficient GPU memory (6+ GB recommended). - Monitor GPU utilization with
nvidia-smiin a separate terminal during the run. - For circuits exceeding GPU memory, consider switching to
--simulation-method tensor_network.
Example 4: Full Data Pipeline¶
Goal: Run a VQE simulation with Kafka streaming enabled, demonstrating the complete data pipeline from simulation to storage.
Prerequisites¶
Start the infrastructure stack using Docker Compose:
This starts Kafka, Schema Registry, Kafka Connect, MinIO, Airflow, and Spark.
Command¶
python quantum_pipeline.py \
-f data/molecules.json \
--basis sto3g \
--max-iterations 150 \
--optimizer L-BFGS-B \
--gpu \
--simulation-method statevector \
--kafka \
--servers kafka:9092 \
--topic vqe_decorated_result \
--acks all \
--retries 3
Expected Behavior¶
- The VQE simulation runs with GPU acceleration.
- Upon completion, results are serialized using Avro format.
- The Avro schema is registered with the Schema Registry (auto-generated from result structure).
- The serialized result is published to the Kafka topic.
- Kafka Connect picks up the message and writes it to MinIO as an Avro file.
- The result is stored in a path like:
Topic Naming Convention¶
The Kafka topic is automatically suffixed with experiment metadata:
This encodes: molecule index (mol0), atoms (HH), iterations (it150), basis set (sto3g), and backend (aer_simulator_statevector_gpu).
Notes¶
- The
--kafkaflag is required to enable streaming. Without it, other Kafka parameters are ignored. - Use
--acks allin production to ensure data durability across Kafka replicas. - The Schema Registry caches schemas for repeated simulations with the same configuration.
- After Kafka Connect transfers data to MinIO, Apache Airflow can trigger Spark processing for feature engineering.
Example 5: Convergence-Based Optimization¶
Goal: Use convergence-based stopping instead of a fixed iteration count, allowing the optimizer to run until the energy stabilizes.
Command¶
python quantum_pipeline.py \
-f data/molecules.json \
--basis sto3g \
--convergence \
--threshold 1e-6 \
--optimizer L-BFGS-B \
--shots 2048
Expected Behavior¶
- The optimizer runs until the energy change between consecutive iterations falls below \(10^{-6}\) Hartree.
- The total number of iterations is determined automatically based on convergence.
- For well-behaved problems (H2 with L-BFGS-B), convergence typically occurs within 30-80 iterations.
- For more complex molecules, convergence may require hundreds of iterations.
Threshold Guidelines¶
| Threshold | Precision Level | Typical Use Case |
|---|---|---|
1.6e-3 |
Chemical accuracy (~1 kcal/mol) | Fast prototyping |
1e-6 |
Standard (1 microHartree) | Production runs |
1e-8 |
High precision (10 nanoHartree) | Publication-quality research |
Notes¶
- Convergence mode (
--convergence) and fixed iterations (--max-iterations) are mutually exclusive. Using both will raise aValueError. - Convergence mode is recommended for production runs because it adapts to the problem difficulty.
- Pair with
--reportto visualize the convergence curve in the generated PDF. - The convergence threshold is passed to
scipy.optimize.minimizeas thetolparameter.
Example 6: Configuration Save and Load¶
Goal: Save a simulation configuration for reproducibility and reload it later for repeated experiments.
Step 1: Run and Save Configuration¶
python quantum_pipeline.py \
-f data/molecules.json \
--basis cc-pvdz \
--convergence \
--threshold 1e-6 \
--optimizer L-BFGS-B \
--ansatz-reps 2 \
--shots 2048 \
--gpu \
--simulation-method statevector \
--dump
This creates a configuration file in run_configs/ with a timestamped filename:
{
"file": "data/molecules.json",
"basis": "cc-pvdz",
"convergence_threshold_enable": true,
"convergence_threshold": 1e-6,
"optimizer": "L-BFGS-B",
"ansatz_reps": 2,
"shots": 2048,
"backend": {
"local": true,
"optimization_level": 3,
"method": "statevector",
"gpu": true,
"noise_backend": null
}
}
Step 2: Reload Configuration¶
Step 3: Reload with Overrides¶
# Load saved config but change the optimizer
python quantum_pipeline.py \
--load run_configs/config_20250615.json \
--optimizer COBYLA
# Load saved config but switch to a different basis set
python quantum_pipeline.py \
--load run_configs/config_20250615.json \
--basis sto3g
Notes¶
--dumpand--loadare mutually exclusive. You cannot save and load in the same run.- CLI arguments override loaded configuration values, allowing selective parameter changes.
- Use
--dumpfor every production run to maintain an audit trail of experiment configurations. - Configuration files are stored in
run_configs/with timestamps for easy identification.
Example 7: Custom Python Script¶
Goal: Use the VQERunner API programmatically in a custom Python script for advanced workflows, automated parameter sweeps, or integration with external tools.
Script¶
from quantum_pipeline.runners.vqe_runner import VQERunner
from quantum_pipeline.configs.module.backend import BackendConfig
# Configure the backend
backend_config = BackendConfig(
local=True,
optimization_level=3,
min_num_qubits=None,
filters=None,
simulation_method='statevector',
gpu=False,
gpu_opts=None,
noise=None,
)
# Initialize the VQE runner
runner = VQERunner(
filepath='data/molecules.json',
basis_set='sto3g',
max_iterations=100,
optimizer='L-BFGS-B',
ansatz_reps=2,
default_shots=1024,
report=False,
kafka=False,
backend_config=backend_config,
)
# Run the simulation
runner.run()
# Process results
for result in runner.run_results:
print(f"Molecule: {result.molecule.molecule_data.symbols}")
print(f"Ground state energy: {result.vqe_result.minimum:.6f} Ha")
print(f"Iterations: {len(result.vqe_result.iteration_list)}")
print(f"VQE time: {result.vqe_time:.2f} seconds")
print(f"Total time: {result.total_time:.2f} seconds")
print("---")
Parameter Sweep Example¶
from quantum_pipeline.runners.vqe_runner import VQERunner
optimizers = ['L-BFGS-B', 'COBYLA', 'BFGS', 'SLSQP']
results_summary = []
for opt in optimizers:
runner = VQERunner(
filepath='data/molecules.json',
basis_set='sto3g',
max_iterations=100,
optimizer=opt,
ansatz_reps=2,
default_shots=1024,
)
runner.run()
for result in runner.run_results:
results_summary.append({
'optimizer': opt,
'energy': result.vqe_result.minimum,
'iterations': len(result.vqe_result.iteration_list),
'time': result.vqe_time,
})
# Compare optimizer performance
for entry in results_summary:
print(f"{entry['optimizer']:12s} | "
f"Energy: {entry['energy']:12.6f} Ha | "
f"Iterations: {entry['iterations']:4d} | "
f"Time: {entry['time']:.2f}s")
Notes¶
- The
VQERunnerclass accepts the same parameters as the CLI but as constructor arguments. - Use
BackendConfig,ProducerConfig, andSecurityConfigfor structured configuration. - Programmatic usage is ideal for parameter sweeps, automated benchmarks, and integration with analysis notebooks.
- Results are returned as
VQEDecoratedResultobjects containing the full simulation data, including per-iteration energy values.
Example 8: Thesis Experiment Reproduction¶
Goal: Reproduce the experiments from the thesis using the provided environment configuration file.
Step 1: Set Up Environment¶
Copy the thesis environment example and customize it for your system:
Review and update the key settings in .env:
# Quantum Pipeline Settings
MAX_ITERATIONS=50
LOG_LEVEL=INFO
SIMULATION_METHOD=statevector
BASIS_SET=sto3g
# Kafka Configuration
KAFKA_SERVERS=kafka:9092
# MinIO Configuration
MINIO_ROOT_USER=quantum-admin
MINIO_ROOT_PASSWORD=quantum-secret-key
# Performance Monitoring
QUANTUM_PERFORMANCE_ENABLED=true
QUANTUM_PERFORMANCE_COLLECTION_INTERVAL=10
QUANTUM_PERFORMANCE_PUSHGATEWAY_URL=http://monit:9091
QUANTUM_PERFORMANCE_EXPORT_FORMAT=json,prometheus
Step 2: Start the Full Stack¶
This starts the entire infrastructure:
- Quantum Pipeline container (with GPU support)
- Apache Kafka and Schema Registry
- Kafka Connect (S3 sink to MinIO)
- MinIO object storage
- Apache Airflow (DAG orchestration)
- Apache Spark (feature engineering)
- Prometheus and Grafana (monitoring)
Step 3: Monitor the Experiment¶
The thesis configuration runs the pipeline with convergence-based optimization, GPU acceleration, and Kafka streaming enabled. Monitor progress through:
# View pipeline logs
docker logs -f quantum-pipeline
# Check GPU utilization
nvidia-smi -l 1
# View Kafka topics
docker exec -it kafka kafka-topics --list --bootstrap-server localhost:9092
Step 4: Access Results¶
- MinIO Console:
http://localhost:9002- browse stored Avro files - Airflow UI:
http://localhost:8084- view DAG execution status - Grafana: Access dashboards for real-time performance metrics
Resource Allocation Reference¶
The thesis experiments used the following resource distribution:
| Component | CPUs | RAM | GPU |
|---|---|---|---|
| CPU Pipeline | 3 | 16 GB | -- |
| GPU Pipeline | 3 | 16 GB | GTX 1060 |
| Infrastructure | 1 | 24 GB | -- |
| Monitoring | 0.5 | 8 GB | -- |
Notes¶
- The
.env.thesis.examplefile contains the exact settings used in the thesis experiments. - The thesis compared CPU vs. GPU performance across multiple molecules and optimizers.
- Three separate pipeline containers were used: one CPU-only and two GPU-accelerated (each with a different GPU).
- Results were streamed via Kafka to MinIO, then processed by Spark through Airflow DAGs for feature engineering.
- All IBM Quantum credentials in the example file are placeholders and must be replaced with valid tokens if cloud backend comparison is desired.
Quick Reference¶
| Example | Use Case | Key Flags |
|---|---|---|
| 1. Simple H2 | Installation test | --basis sto3g --max-iterations 50 |
| 2. Batch Processing | Multi-molecule runs | --report |
| 3. GPU Research | High-accuracy GPU | --gpu --basis cc-pvdz --convergence |
| 4. Data Pipeline | Kafka streaming | --kafka --servers kafka:9092 |
| 5. Convergence | Adaptive stopping | --convergence --threshold 1e-6 |
| 6. Save/Load | Reproducibility | --dump / --load |
| 7. Python API | Programmatic usage | VQERunner class |
| 8. Thesis | Full reproduction | .env.thesis.example |
Next Steps¶
- Configure parameters in detail: Configuration Reference
- Understand optimizer choices: Optimizers
- Choose a simulation method: Simulation Methods
- Deploy with Docker: Docker Compose