Configure in-transit encryption
This topic describes how to enable or disable in-transit encryption based on msgr2 for ACP distributed storage.
Overview
Ceph msgr2 is the second generation of the Ceph messenger protocol. It supports two connection modes:
- crc: authenticates the peers and validates data integrity, but does not encrypt payload data.
- secure: encrypts traffic on the wire and provides cryptographic integrity protection.
In ACP distributed storage, in-transit encryption is controlled by CephCluster.spec.network.connections.encryption.enabled.
Limitations and prerequisites
Before enabling this feature, pay attention to the following restrictions:
- ACP version: v4.3.0 and later.
- OS and kernel (Ceph daemon and client nodes): kernel 5.11 and later. Ubuntu 22.04 and later.
- kernel
In-transit encryption increases CPU overhead and may reduce throughput or increase latency, especially on busy storage nodes or low-frequency CPUs. Evaluate the impact in a staging environment first.
Enable in-transit encryption for a new cluster
If the storage cluster has not been created yet, add the following fields to the CephCluster manifest before creation:
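A minimal sketch of that fragment, built from the field path named in the Overview. It is written as a shell heredoc so it can be saved and reviewed before being merged into the full manifest. The file name encryption-fragment.yaml is hypothetical, and the rest of the CephCluster spec is omitted.

```shell
# Write the encryption fragment to a scratch file (hypothetical file name).
# Merge these keys into the spec of the CephCluster manifest before applying it.
cat > encryption-fragment.yaml <<'EOF'
spec:
  network:
    connections:
      encryption:
        enabled: true   # use msgr2 secure mode for traffic on the wire
EOF

# Print the fragment for review.
cat encryption-fragment.yaml
```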
After the cluster is created, verify that:
- CephFS PVCs can still be mounted successfully
- RBD and CephFS workloads on all nodes use supported kernel versions
Enable in-transit encryption after deployment
If the cluster is already running, changing only the encryption switch is the lowest-risk approach.
Step 1. Confirm node kernel versions
Run the following command on all Kubernetes nodes that mount Ceph volumes and confirm that the kernel version meets the prerequisite:
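A minimal sketch of such a check, assuming GNU sort with version ordering (sort -V) is available on the node. The helper name kernel_at_least is hypothetical. It reads the running kernel with uname -r and compares it with the 5.11 minimum from the prerequisites.

```shell
# Minimum kernel version taken from the prerequisites.
required="5.11"

# kernel_at_least MIN CURRENT: succeed when CURRENT is MIN or newer.
kernel_at_least() {
  min="$1"
  cur="${2%%-*}"   # drop the distribution suffix, for example -91-generic
  lowest="$(printf '%s\n%s\n' "$min" "$cur" | sort -V | head -n 1)"
  [ "$lowest" = "$min" ]
}

current="$(uname -r)"
if kernel_at_least "$required" "$current"; then
  echo "OK: kernel $current meets the $required minimum"
else
  echo "UNSUPPORTED: kernel $current is older than $required"
fi
```

From a workstation with cluster access, kubectl get nodes -o wide also lists a KERNEL-VERSION column for every node, which avoids logging in to each one.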
If some worker nodes do not meet the requirement, do not enable transport encryption on a production cluster until those nodes are upgraded.
Step 2. Enable encryption in CephCluster
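One way to change only the encryption switch on a running cluster is a JSON merge patch, sketched below. The namespace rook-ceph and the CephCluster name ceph-cluster are assumptions, not values confirmed by this topic, so override them for your environment. The helper encryption_patch is hypothetical.

```shell
# Assumed names; override them to match your environment.
NAMESPACE="${NAMESPACE:-rook-ceph}"     # hypothetical namespace
CLUSTER="${CLUSTER:-ceph-cluster}"      # hypothetical CephCluster name

# encryption_patch VALUE: print a merge patch that sets only the encryption switch.
encryption_patch() {
  printf '{"spec":{"network":{"connections":{"encryption":{"enabled":%s}}}}}' "$1"
}

if command -v kubectl >/dev/null 2>&1; then
  kubectl -n "$NAMESPACE" patch cephcluster "$CLUSTER" \
    --type merge -p "$(encryption_patch true)" \
    || echo "patch failed; check the namespace and cluster name" >&2
else
  echo "kubectl not found; the patch would be: $(encryption_patch true)"
fi
```

A merge patch leaves every field it does not mention unchanged, which keeps the change limited to the encryption switch.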
Step 3. Wait for the configuration to take effect
After the configuration is updated:
- Check whether related Pods restart normally
- Recreate a test Pod that mounts a CephFS or RBD PVC
- Confirm I/O works as expected
Disable transport encryption
Disable encryption on an existing cluster
To disable only transport encryption and keep msgr2 available:
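A minimal sketch, assuming the same field path as in the Overview. The namespace and CephCluster name are placeholders to override. Because a merge patch changes only the fields it names, other connection settings on the cluster are left as they are.

```shell
# Assumed names; override them to match your environment.
NAMESPACE="${NAMESPACE:-rook-ceph}"     # hypothetical namespace
CLUSTER="${CLUSTER:-ceph-cluster}"      # hypothetical CephCluster name

# Merge patch that turns the encryption switch off and touches nothing else.
DISABLE_PATCH='{"spec":{"network":{"connections":{"encryption":{"enabled":false}}}}}'

if command -v kubectl >/dev/null 2>&1; then
  kubectl -n "$NAMESPACE" patch cephcluster "$CLUSTER" \
    --type merge -p "$DISABLE_PATCH" \
    || echo "patch failed; check the namespace and cluster name" >&2
else
  echo "kubectl not found; the patch would be: $DISABLE_PATCH"
fi
```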
Verification
After enabling the feature, verify the cluster from both the Kubernetes side and the Ceph side.
Check the CephCluster settings
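The setting can be read back with a JSONPath query, sketched below. The namespace and CephCluster name are assumptions to override. The query prints only the value of the encryption switch.

```shell
# Assumed names; override them to match your environment.
NAMESPACE="${NAMESPACE:-rook-ceph}"     # hypothetical namespace
CLUSTER="${CLUSTER:-ceph-cluster}"      # hypothetical CephCluster name

# JSONPath to the switch described in the Overview.
FIELD='{.spec.network.connections.encryption.enabled}'

if command -v kubectl >/dev/null 2>&1; then
  kubectl -n "$NAMESPACE" get cephcluster "$CLUSTER" -o jsonpath="$FIELD" \
    || echo "query failed; check the namespace and cluster name" >&2
  echo   # jsonpath output has no trailing newline
else
  echo "kubectl not found; the query would read $FIELD"
fi
```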
Confirm that the output shows spec.network.connections.encryption.enabled set to true.
Check client compatibility
After in-transit encryption is enabled:
- clients using msgr2 secure should connect normally
- clients configured with non-encrypted modes such as legacy or crc will fail to connect
Check workload mounts
Create or restart a test workload that mounts:
- a CephFS PVC
- an RBD PVC
Then verify:
- the Pod starts successfully
- the filesystem can be read and written
- no mount-related errors appear in CSI or workload logs
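On the node that runs the test Pod, the connection mode of CephFS kernel mounts can also be inspected directly. The sketch below assumes that the kernel client lists ms_mode among the mount options in /proc/mounts. That is an assumption about the client, not something this topic states, and a mount that does not show the option is reported as such. The helper ms_mode_of is hypothetical.

```shell
# ms_mode_of OPTIONS: print the ms_mode value from a comma-separated option string.
ms_mode_of() {
  printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^ms_mode=//p'
}

# List CephFS kernel mounts on this node together with their ms_mode, if shown.
if [ -r /proc/mounts ]; then
  while read -r _dev mountpoint fstype options _rest; do
    [ "$fstype" = "ceph" ] || continue
    mode="$(ms_mode_of "$options")"
    echo "$mountpoint ms_mode=${mode:-not shown}"
  done < /proc/mounts
fi
```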
Troubleshooting suggestions
If enabling encryption causes mount failures or service interruptions, check the following items first:
- Node kernel version does not satisfy the requirement.
- Some nodes or external clients do not support msgr2 secure, or are still configured with ms_mode=legacy or ms_mode=crc.
- Network policies, firewalls, or security groups do not allow port 3300.
- CPU resources are insufficient after encryption is enabled.
If the change affects production workloads, disable encryption first and then investigate compatibility and performance bottlenecks.
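For the port item in the list above, a quick reachability test from an affected node can narrow things down. This is a minimal sketch that assumes bash (for /dev/tcp redirection) and the coreutils timeout command are available. MON_HOST is a hypothetical variable that you set to a Ceph monitor address, and port_open is a hypothetical helper.

```shell
# port_open HOST PORT: succeed if a TCP connection opens within 3 seconds.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

MON_HOST="${MON_HOST:-}"   # hypothetical: address of a Ceph monitor

if [ -n "$MON_HOST" ]; then
  if port_open "$MON_HOST" 3300; then
    echo "port 3300 on $MON_HOST is reachable"
  else
    echo "port 3300 on $MON_HOST is NOT reachable"
  fi
else
  echo "set MON_HOST to a Ceph monitor address to test port 3300"
fi
```

A successful connection only shows that the port is open along the network path. It does not confirm that the client negotiated secure mode.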
Performance impact
ACP cannot provide a fixed percentage for the overhead of msgr2 secure. The actual impact depends on CPU model, whether the CPU provides AES acceleration, network bandwidth, I/O size, and whether the workload is CPU-bound or network-bound.
In practice:
- latency usually increases slightly, and the increase is often more visible on small I/O or latency-sensitive workloads
- CPU usage usually increases on both clients and Ceph daemons because traffic must be encrypted and integrity-protected
- the impact is typically more noticeable on high-throughput workloads, slower CPUs, or environments without strong AES acceleration
As an operational estimate, when modern x86 CPUs with AES-NI are used, a reasonable starting expectation is:
- average latency increase: about 5% to 15%
- CPU usage increase on storage and client nodes handling heavy I/O: about 10% to 30%
These values are an engineering estimate rather than a product guarantee. Before enabling encryption in production, benchmark a representative workload in a staging environment and compare at least the following metrics:
- average and P99 read/write latency
- client node CPU usage
- OSD node CPU usage
- throughput and IOPS
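When comparing a baseline run with an encrypted run, each metric can be reduced to a relative change and set against the estimates above. The sketch below uses awk for the arithmetic. The helper pct_change is hypothetical and the sample numbers are illustrative only, not measurements.

```shell
# pct_change BASELINE ENCRYPTED: print the relative change in percent, one decimal.
pct_change() {
  awk -v base="$1" -v enc="$2" 'BEGIN { printf "%.1f\n", (enc - base) / base * 100 }'
}

# Illustrative sample values, not measured results.
baseline_p99_ms=4.0
encrypted_p99_ms=4.5

echo "P99 latency change: $(pct_change "$baseline_p99_ms" "$encrypted_p99_ms")%"
```

With these sample values the change is 12.5%, which falls inside the 5% to 15% latency estimate. The same helper applies to CPU usage, throughput, and IOPS.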