The CIA triad – Cloud Security Fundamentals

Not to be confused with the Central Intelligence Agency of the same acronym, CIA stands for confidentiality, integrity, and availability. It is a widely used information security model that helps an organization protect its sensitive information and critical assets from unauthorized access:

Figure 1.6 – The CIA triad (https://devopedia.org/images/article/178/8179.1558871715.png)

The preceding diagram depicts the CIA triad. Let’s understand its attributes in detail.

Confidentiality

Confidentiality ensures that sensitive information is kept private and accessible only to authorized individuals or entities. It aims to prevent the unauthorized disclosure of information, protecting it from being accessed or viewed by unauthorized users. Let’s understand this by looking at the example of an organization’s payroll system. The confidentiality aspect of the payroll system ensures that employee salary information, tax details, and other sensitive financial data are kept private and accessible only to authorized personnel. Unauthorized access to such information can lead to privacy breaches, identity theft, or financial fraud.
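To make this concrete, here is a minimal, hypothetical sketch (not taken from any specific payroll product) of enforcing confidentiality through a role-based access check; the role names, user model, and in-memory data are assumptions for illustration only:

```python
# Hypothetical sketch: enforcing confidentiality with a role-based access check.
# Role names, the user model, and the in-memory "database" are assumptions.
from dataclasses import dataclass

AUTHORIZED_ROLES = {"payroll_admin", "hr_manager"}  # roles allowed to read salaries

@dataclass
class User:
    username: str
    role: str

def get_salary(user: User, employee_id: str, payroll_db: dict) -> float:
    """Return salary data only to users with an authorized role."""
    if user.role not in AUTHORIZED_ROLES:
        raise PermissionError(f"{user.username} is not authorized to view payroll data")
    return payroll_db[employee_id]

payroll_db = {"E100": 85000.0}
print(get_salary(User("alice", "payroll_admin"), "E100", payroll_db))  # allowed
# get_salary(User("bob", "developer"), "E100", payroll_db)             # raises PermissionError
```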

Integrity

Integrity maintains the accuracy and trustworthiness of data by preventing unauthorized modifications. It ensures that information remains accurate, trustworthy, and unaltered, safeguarding it against unauthorized modifications, deletions, or tampering throughout its life cycle. Let’s understand integrity using the same payroll system example. The integrity aspect of the payroll system ensures that the data remains accurate and unchanged throughout its life cycle. Any unauthorized modification to payroll data could lead to incorrect salary payments, tax discrepancies, or compliance issues.
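As an illustration, the following sketch shows one common way to detect unauthorized modification of a payroll record using an HMAC from Python’s standard library; the key handling and record format are simplified assumptions, not a prescribed design:

```python
# Hypothetical sketch: detecting tampering with a payroll record via an HMAC.
# The secret key, record fields, and serialization are illustrative assumptions.
import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-securely-managed-secret"

def sign(record: dict) -> str:
    payload = json.dumps(record, sort_keys=True).encode()
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()

def verify(record: dict, signature: str) -> bool:
    return hmac.compare_digest(sign(record), signature)

record = {"employee_id": "E100", "salary": 85000}
tag = sign(record)

record["salary"] = 95000      # unauthorized modification
print(verify(record, tag))    # False -> integrity violation detected
```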

Availability

Availability ensures that information and services are accessible and operational when needed, without disruptions. It focuses on preventing disruptions or denial of service so that authorized users can access the information and services they require without interruption. Let’s understand availability using the same payroll system example. The availability aspect of the payroll system ensures that it is accessible and functional when needed. Payroll processing is critical for employee satisfaction and business operations, and any disruption to the system could result in delayed payments or other financial issues.
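To illustrate the availability attribute, here is a small, hypothetical health-check sketch with retries and a timeout; the endpoint URL and thresholds are assumptions, and a real payroll system would rely on redundancy and monitoring rather than a single script:

```python
# Hypothetical sketch: checking that a payroll endpoint is reachable, with
# retries and a timeout. The URL, retry count, and backoff are assumptions.
import time
import urllib.request

def is_available(url: str, retries: int = 3, timeout: float = 2.0) -> bool:
    for attempt in range(1, retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.status == 200
        except OSError:               # covers URLError and connection timeouts
            time.sleep(attempt)       # simple linear backoff before the next attempt
    return False

print(is_available("https://payroll.example.internal/health"))
```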

Overall, the CIA triad provides a framework for organizations to develop effective cybersecurity strategies. By focusing on confidentiality, integrity, and availability, organizations can ensure that their systems and data are protected from a wide range of threats, including cyberattacks, data breaches, and other security incidents.

The Zero Trust model – Cloud Security Fundamentals

With the exponential growth of cloud technology and the mobile workforce, the corporate network perimeter has been redefined. The traditional perimeter-based security approach has proven ineffective as resources are hosted in multi-cloud and hybrid scenarios. Today, organizations need a new security model that can provide secure access to their resources, irrespective of where they are accessed from and regardless of the user or application environment. A Zero Trust security model helps embrace the mobile workplace and protects identities, devices, apps, and data wherever they are located.

The Zero Trust model operates on the principle of “trust no one, verify everything, every time.” This means that all users, devices, applications, and data that flow within an organization’s network should be verified explicitly before being granted access to resources:

Figure 1.8 – The Zero Trust model (https://www.itgovernance.co.uk/blog/wp-content/uploads/2015/07/PPT-Diagram-Blog.png)

Zero Trust guiding principles

The Zero Trust model has three principles based on NIST guidelines:

  • Verify explicitly: The “verify explicitly” principle of Zero Trust means that access should be granted only after a user’s or device’s identity and security posture have been verified and authenticated. This requires strong authentication mechanisms, such as MFA, which require users to provide additional forms of authentication beyond just a password, such as a fingerprint scan, facial recognition, or a one-time code. Devices, in turn, must be assessed and verified before they are granted access to resources within an organization’s network. This involves evaluating the device’s security posture to ensure that it meets a minimum set of security standards, such as having the latest security patches, running up-to-date antivirus software, and having strong passwords or other authentication mechanisms in place. Devices that do not meet these security standards are either denied access or granted limited access until they can be remediated and brought up to the required standards (a minimal policy-check sketch follows this list).
  • Least privilege access: Least privilege access combines Just-in-Time (JIT) access, which elevates permissions only when they are required to perform specific tasks, with Just Enough Administration (JEA), which reverts users to the minimal default access needed for day-to-day tasks.
  • Minimize the blast radius: This refers to the assume breach mindset, where you build your defenses with the worst-case scenario in mind so that, even if an external or internal breach occurs, the impact on the organization is minimal. Network segmentation, end-to-end encryption, advanced threat detection, and deeper analytics visibility are some practices that minimize the blast radius.
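The following is a minimal, hypothetical sketch of the “verify explicitly” principle as a per-request policy check; the attributes, posture checks, and decision values are assumptions for illustration and do not represent any specific Zero Trust product:

```python
# Hypothetical sketch: every access request is evaluated explicitly against
# identity (including MFA) and device posture before a decision is made.
# Attribute names, posture checks, and decision values are assumptions.
from dataclasses import dataclass

@dataclass
class AccessRequest:
    user_authenticated: bool
    mfa_passed: bool
    device_patched: bool
    antivirus_up_to_date: bool
    resource_sensitivity: str   # "low" or "high"

def evaluate(req: AccessRequest) -> str:
    if not (req.user_authenticated and req.mfa_passed):
        return "deny"                                   # identity not verified explicitly
    if not (req.device_patched and req.antivirus_up_to_date):
        # Non-compliant device: deny sensitive resources, allow limited access otherwise
        return "deny" if req.resource_sensitivity == "high" else "limited"
    return "allow"

print(evaluate(AccessRequest(True, True, False, True, "high")))  # deny
print(evaluate(AccessRequest(True, True, True, True, "high")))   # allow
```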

These guiding principles help us understand the baseline on which we define the conditions for the Zero Trust model. Now, let’s understand which guidelines apply to which pillars.

Hybrid cloud configuration – Appendix: Preflight before Onboarding

If your design requires establishing a connection to on-premises, several configuration changes have to be made to enable the connection. If you also need to configure HCX for migration, it adds some complexity to the deployment. The following table lists the relevant configuration items to consider for a hybrid cloud deployment:

| Configuration section | Configuration item | Description |
| --- | --- | --- |
| Network configuration | VPN | Policy-based or route-based. See the networking section in Chapter 2 for more details on VPNs. |
| | AWS DX (see Figure 12.5) | You can choose to use the AWS DX service to gain predictable latency and possibly higher throughput for your workload. You can leverage the following: (1) AWS DX provisioned as a private VIF to your SDDC; (2) an AWS DX VIF connected to an AWS DX Gateway (DXGW) – in this case, you will use an SDDC group and a vTGW to connect your SDDC(s) to the DXGW; (3) cloud connector service providers, which can offer an alternative by sharing cloud connectivity lines. From the SDDC perspective, the connection would still be in the form of a private VIF or a connection to a DXGW. |
| | Dynamic routing support | VMware Cloud on AWS supports only the BGP dynamic routing protocol. You can filter incoming/outgoing routes and/or announce 0.0.0.0/0 to route all SDDC traffic through the selected connection. If you have multiple connections from on-premises to the cloud, it is important to synchronize the routing information (e.g., avoid announcing 0.0.0.0/0 through DX and specific subnets through a route-based VPN). |
| SDDC management (see Figure 12.6) | vCenter Server | Reconfigure to use a private IP. |
| | NSX Manager | Reconfigure to use a private IP. |
| | HCX Manager | Reconfigure to use a private IP. |
| Firewall | Management Gateway Firewall | Ensure the on-premises CIDRs that require access to vCenter/NSX Manager/HCX Manager are included in the management firewall rules. |
| | Compute Gateway Firewall | Ensure you add on-premises CIDRs and map them to the DX/VPN interface. |
| Migration Service | Activate HCX | HCX Enterprise is included with VMware Cloud on AWS SDDC. |
| | Pair HCX managers | Configure a pairing between on-premises and the cloud. You can have multiple site pairs if needed. |
| | Configure a network profile (see Figure 12.7) | Configure HCX on VMware Cloud on AWS to use the “directConnectNetwork1” network profile. Add a non-overlapping private CIDR (different from the SDDC management network); a CIDR overlap check is sketched after this table. HCX will use this network to establish connectivity between the appliances. The SDDC workflow will automatically add the subnet to the BGP route distribution and create the required firewall rules. |
| | Create a service mesh | Override the network uplink configuration to use the directConnectNetwork1 network profile while configuring the service mesh. |
| | Configure network extension | The HCX Network Extension service can extend vSphere vDS VLAN-based port groups to the cloud. You can enable high availability for your NE appliances (you need to configure an HA group before extending a VLAN). |
| Migrate workloads | Identify VMs to be migrated | Identify the VMs building an application and migrate them as part of the same migration group. |
| | Select migration type | Select between vMotion, bulk migration, and Replication-Assisted vMotion (RAV). See Chapter 3, which covers HCX migrations in great depth, for more details. |
| | Configure schedule | Use this option to define the switchover/start of vMotion. If using bulk migration or RAV, you need to make sure HCX has enough time to replicate virtual machine data. |

Table 12.2 – Hybrid Cloud configuration details
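As referenced in the network profile row of Table 12.2, the CIDR you add for the HCX uplink must not overlap the SDDC management network. The following is a small, hypothetical validation sketch using Python’s ipaddress module; the CIDR values are placeholders, not recommendations:

```python
# Hypothetical sketch: verify that a proposed HCX "directConnectNetwork1"
# network profile CIDR does not overlap the SDDC management CIDR.
# Both CIDR values below are placeholder assumptions.
import ipaddress

sddc_management_cidr = ipaddress.ip_network("10.2.0.0/16")   # assumed SDDC management network
hcx_uplink_cidr = ipaddress.ip_network("10.3.100.0/24")      # proposed network profile CIDR

if hcx_uplink_cidr.overlaps(sddc_management_cidr):
    raise ValueError("The HCX uplink CIDR overlaps the SDDC management network")
print("CIDR does not overlap the SDDC management network; safe to use in the network profile")
```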

You can review the Direct Connect configuration in Figure 12.5.

Figure 12.5 – AWS DX VIF attached to an SDDC

You can review the FQDN configuration in Figure 12.6:

Figure 12.6 – Configure vCenter Server, HCX, and NSX FQDN resolution

You can review the configuration of HCX to leverage an AWS Direct Connect (DX) connection in Figure 12.7:

Figure 12.7 – VMware Cloud on AWS HCX network profile: uplink over AWS DX

Next steps

Now that you have completed the basic SDDC setup and connected the SDDC to on-premises, you can use the following list to get further information about the services and next steps:

Accessing and configuring the VMware Cloud Console – Appendix: Preflight before Onboarding

There are a couple of steps required before you can start consuming VMware Cloud on AWS. You use the VMware Cloud Console to provision a VMware Cloud on AWS SDDC. If you are already using any of the VMware Cloud services, you can simply log in to the VMware Cloud Console and look for VMware Cloud on AWS in the Services inventory:

Figure 12.1 – VMware Cloud Console Services inventory

However, if this is the first time you’re using VMware Cloud services, you first need to get access to the VMware Cloud Console.

The following steps outline the procedure to get started with the VMware Cloud Console:

  1. Receive a welcome email: Upon processing your purchase, VMware will send an email with an activation link. Use this link to log in to the VMware Cloud Console.
    NOTE
    VMware will send the activation link to the email address designated as the “Fund owner”.
  2. Set up an Organization: An Organization provides an authentication boundary for your VMware Cloud services. Each Organization can be entitled to different services. A user can access multiple Organizations and switch between them in the VMware Cloud Console.
  3. Set up VMware Cloud service accounts: After gaining initial access to the VMware Cloud Console and creating an Organization, you can entitle user accounts to access VMware Cloud on AWS. You can use manual assignment, or you can federate the VMware Cloud Console with your identity provider. If your design includes federation for the VMware Cloud Console, it’s important to configure the federation feature before you deploy the VMware Cloud on AWS SDDC.
  4. Create a term subscription: If you purchased a term subscription, it’s important to create a subscription object in the VMware Cloud Console before you deploy an SDDC. Creating a subscription matching your purchase is the organization’s responsibility – VMware does not pre-create a subscription in your VMware Cloud Organization. Make sure you have all the details of your purchase contract before creating a subscription, including the following:
    • AWS Region
    • Host count and host type
    • Subscription type – flexible or standard
    • Subscription duration – 1 year or 3 years

Figure 12.2 – Creating a subscription for VMware Cloud on AWS
NOTE
You can deploy a VMware Cloud on AWS SDDC without creating a subscription. In this case, VMware will use on-demand prices for billing. If you purchased a subscription but did not create a subscription object in the VMware Cloud Console, on-demand prices will be applied. On-demand prices will also be applied if you deploy your SDDC using a different AWS Region or host type than your subscription, or use more hosts than it covers.

Purchasing and onboarding – Appendix: Preflight before Onboarding

In this chapter, we will cover the most important configuration items you need when you deploy the SDDC and configure a hybrid cloud environment.

You will find detailed descriptions of the configuration steps and items in the previous chapters of this book.

Purchasing and onboarding

When purchasing the service and preparing for the first SDDC deployment, you need to choose a couple of options. These options may have a large impact on the further operations of the service, so make sure your choices are well thought out, as you will not be able to change some of them moving forward.

Purchasing and funding

When purchasing the service, you can select one of the following options:

  • A direct VMware purchase
  • AWS resell
  • Purchasing through a Managed Service Provider (MSP)

VMware Cloud on AWS supports all three routes to market. Depending on your purchase strategy, you may find one or the other better suited to your needs.

Note

Some services available for VMware Cloud on AWS can only be purchased directly from VMware, for example, Microsoft host-based licenses for workloads on VMware Cloud on AWS.

When purchasing from VMware, you can choose how you want to pay for the service:

  • VMware Purchasing Programs: You can select from a range of programs, most of which offer so-called Credits. You can use Credits toward payment for VMware Cloud on AWS. Consult a VMware sales representative for more details about the available programs. (More details on VMware Purchasing Programs can be found here: https://customerconnect.vmware.com/web/vmware/spp-landing.)
  • Pay by invoice: You can activate pay by invoice using the VMware Cloud Console.
  • Pay with a credit card: Applicable for small purchases up to $25,000.

Consumption options

When deploying VMware Cloud on AWS SDDC, you have a choice between the following:

  • Subscription: Your commitment to buy a certain amount of host capacity for a defined period. When purchasing a subscription, you select the AWS Region, host type, and the number of hosts. You can pay upfront or monthly. If purchasing from VMware or AWS, you can select the following:
    • Flexible subscription: The terms of the subscription (number of hosts, region, host types) can be changed over time (limitations apply)
    • Standard subscription: The terms of the subscription are fixed and cannot be changed
  • On-demand: You can run VMware Cloud on AWS SDDC using on-demand prices. You are free to select the region, host type, and the number of hosts.

Typically, a standard 3-year term subscription is the most cost-effective option, while on-demand prices are the highest. Depending on your use case, one or another option might work better. In our experience, a flexible subscription is the right balance between flexibility and cost savings.

Networking – Knowing the Best Practices, FAQs, and Common Pitfalls

The network communication with workloads deployed on your SDDC is a key part of the overall user experience and probably one of the most complex design areas. Network configuration is under the organization’s control; VMware only provides the underlying network connectivity on top of the AWS hardware infrastructure.

Let’s highlight the most common network misconfigurations:

  • Insufficient connectivity between on-premises and the VMware Cloud on AWS SDDC

It’s a common practice to initially configure an IPSec VPN over the internet to achieve basic connectivity between on-premises and the SDDC and to secure the traffic flow. However, a VPN tunnel over the internet is not suitable for a mass migration of workloads. Live vMotion over the internet is not supported, and unpredictable bandwidth and latency make the migration timeline hard to plan. For a large-scale migration and/or a hybrid cloud use case, you need to plan for a dedicated private connection to your SDDC.
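As a rough, hypothetical illustration of why link capacity dominates the migration timeline, the following back-of-the-envelope sketch estimates bulk transfer time for an assumed data set over an internet VPN versus a dedicated connection; all figures are assumptions and ignore change rates, protocol overhead, and concurrency:

```python
# Hypothetical estimate: bulk-transfer time for an assumed 100 TB data set.
# Bandwidth figures are illustrative assumptions, not service guarantees.
def transfer_days(data_tb: float, usable_mbps: float) -> float:
    bits = data_tb * 8 * 10**12               # TB -> bits (decimal units)
    seconds = bits / (usable_mbps * 10**6)    # Mbps -> bits per second
    return seconds / 86400

data_tb = 100
print(f"VPN over the internet (~200 Mbps usable): {transfer_days(data_tb, 200):.1f} days")
print(f"Dedicated DX link (~5 Gbps usable):       {transfer_days(data_tb, 5_000):.1f} days")
```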

  • Underestimating Layer 2 network extension complexity

HCX and/or NSX Standalone Edge provide a unique feature – the ability to stretch a Layer 2 broadcast domain for a selected VLAN and allow the workload to retain its original IP addresses. This feature is enormously helpful for migrating applications seamlessly without impacting the client configuration. On the other hand, it has several trade-offs, impacting workload availability and/or performance:

    • For workloads deployed on a Layer 2 extended segment (even with the MON feature enabled), all traffic sent to destinations residing outside of the SDDC network will first reach the default gateway, located on-premises. This may cause unexpectedly high latency when accessing workloads residing in native AWS VPCs, including the connected VPC.
    • Workloads have a clear dependency on the on-premises default gateway. If the link between on-premises and the SDDC stops functioning, the workloads on the extended leg of the segment will not be able to reach the default gateway and communicate with external destinations.
    • Undersized HCX Layer 2 extension appliances: All broadcast traffic within the VLAN must traverse the extension appliances on both sides of the tunnel. If an appliance is overloaded and/or does not have enough resources, the workloads residing in the SDDC drop external connections. This scenario is often observed with entry-level clusters based on the i3.metal host type. You can scale out by deploying multiple extension appliance pairs and distributing extended segments between them.
    • Extension appliance availability: As mentioned earlier, the Layer 2 extension has a direct dependency on the HCX appliance. If the appliance stops working, becomes corrupted, or restarts, network communication is affected. If you plan to maintain the extension after the migration is complete, use the HA feature of HCX extension appliances. Bear in mind that, for a complex environment with many extended VLANs, configuring HA will consume additional compute and storage resources on both sides of the environment, including the SDDC. You may need to scale out the vSphere cluster hosting the appliances on the SDDC side, incurring additional costs.
    • Security concerns: Many security teams do not allow a Layer 2 extension over the public internet, as it poses security risks and exposes sensitive broadcast traffic to the internet. When not properly addressed in the design phase, this might drastically affect your migration plans if you were planning to live migrate and retain IP addresses. The best solution is to use a dedicated DX line and pass the extension traffic over the DX, which should address most of the security team’s concerns.
  • Unidentified network dependencies after migration

Many organizations claim that performance suffers after migrating workloads to the cloud. Some of these concerns are due to not following best practices during migration; however, in many cases, the cause has nothing to do with the SDDC. For a complex distributed application where not all components were properly identified and migrated to the cloud, traffic may take additional hops traversing the WAN link(s), adding unforeseen latency to the application. An example is the migration of a SQL Server data warehouse where the centralized integration service (SSIS) was left on-premises, causing all the data to first be moved back to on-premises and then retransmitted to the SDDC. The impact of this configuration on the application was measured as a 300% increase in OLAP cube generation time. Troubleshooting and searching for affected traffic flows can be a complex and time-consuming task. VMware Aria Operations for Networks can help you visualize the traffic flows for a selected application.

Storage – Knowing the Best Practices, FAQs, and Common Pitfalls

Storage resources are crucial for storing an application’s data. Your design, implementation, and operations should account for both capacity and performance requirements. We will review the most common misconfigurations and/or suboptimal design choices:

  • Sizing

Storage resources have two different dimensions – storage capacity and storage performance. When sizing an environment, very often only one of these dimensions, in most cases capacity, is considered. This approach is a direct path to failure: even if your SDDC has enough storage capacity to host your workload, the resulting performance is in many cases inadequate and will lead to lengthy and costly escalations.

When sizing storage, make sure to follow the recommendations of VMware Cloud Sizer (https://vmc.vmware.com/sizer), both in terms of capacity and performance. Double-check your sizing assumptions and tweak them using the advanced sizer if needed.

Figure 11.5 – VMware Cloud Sizer – Sizing Assumptions
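As a simple, hypothetical illustration of sizing along both dimensions, the sketch below aggregates assumed per-VM capacity and IOPS figures; the numbers are placeholders and are no substitute for VMware Cloud Sizer:

```python
# Hypothetical sketch: a storage sizing input must cover capacity AND performance.
# All per-VM figures are illustrative assumptions.
vm_count = 200
capacity_per_vm_gib = 150     # assumed average provisioned capacity per VM
iops_per_vm = 120             # assumed average steady-state IOPS per VM

required_capacity_tib = vm_count * capacity_per_vm_gib / 1024
required_iops = vm_count * iops_per_vm

print(f"Required usable capacity: {required_capacity_tib:.1f} TiB")
print(f"Required aggregate IOPS:  {required_iops}")
# A cluster that satisfies the capacity figure can still be undersized for the
# IOPS figure (or vice versa) - both must fit within the planned SDDC.
```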

  • Storage policies

vSAN is very easy and intuitive to manage with storage policies directly in vCenter. There’s no need to work with the storage team, and it’s easy to make changes. However, this can work against you. You may be tempted to use RAID 5 for all your workloads to free up more space than you’d get with RAID 1, but RAID 5 has known performance implications, especially for workloads with predominantly small writes, which incur significant overhead. If the initial sizing was done with a RAID 5 configuration, you may not have enough hosts to switch to RAID 1 if needed. If you find yourself in this situation, decide whether you can split some of the VMDKs and dedicate small VMDKs to particular data types – the database transaction log and tempdb are good candidates for such optimizations.
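To illustrate the capacity side of that trade-off, here is a small arithmetic sketch comparing the raw vSAN capacity consumed under RAID 1 mirroring (2x for FTT=1) and RAID 5 erasure coding (roughly 1.33x for a 3+1 layout); the usable-capacity figure is an assumption:

```python
# Hypothetical arithmetic: raw vSAN capacity consumed by 10 TiB of usable data.
# RAID 1 (FTT=1) keeps two full copies; RAID 5 (FTT=1) uses a 3 data + 1 parity layout.
usable_tib = 10.0                        # assumed usable data set

raid1_raw_tib = usable_tib * 2.0         # mirroring overhead
raid5_raw_tib = usable_tib * 4.0 / 3.0   # erasure-coding overhead (~1.33x)

print(f"RAID 1 raw consumption: {raid1_raw_tib:.1f} TiB")
print(f"RAID 5 raw consumption: {raid5_raw_tib:.2f} TiB")
# RAID 5 frees space, but the parity writes add overhead for small-write-heavy workloads.
```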