Sunday, June 28, 2020

Cloud Performance, Scalability, and High Availability

We often use performance, scalability, and high availability interchangeably. There are, however, differences between these three items. In this lesson, we’ll look at these differences in the context of the cloud.


Performance

Performance is the throughput of a system under a given workload for a specific time. For an application this could be:

  • The time it takes for an application to finish a task. For example, running a query on a database server to fetch all staff records.
  • The response time for an application to act upon a user request. For example, a user that requests a webpage.
  • The load of a system, measured in the volume of transactions. For example, a web server that processes 500 requests per second.

In the cloud, we validate performance through measurement and testing. Here are examples of items you should measure:

  • Resource usage:
    • CPU load
    • Memory usage
    • Disk I/O
    • Read/write database queries
  • Application statistics:
    • Number of requests
    • Response time

Performance measurement is an ongoing process; it never ends. You can use the cloud provider's tools or external tools.
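As an illustration of the application statistics above, here is a minimal sketch (plain Python, no cloud SDK; the function name and units are my own) that derives requests per second and average response time from the request durations logged in one measurement window:

```python
def summarize(durations_ms, window_s):
    """Summarize one measurement window.

    durations_ms: response time of each request in the window, in milliseconds
    window_s: length of the measurement window in seconds
    """
    count = len(durations_ms)
    return {
        "requests_per_second": count / window_s,
        "avg_response_ms": sum(durations_ms) / count if count else 0.0,
    }

# A 60-second window in which three requests completed:
stats = summarize([100, 200, 300], window_s=60)
```

In practice, a monitoring agent collects these numbers continuously and ships them to a service such as CloudWatch, which does the aggregation for you.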

Performance requirements change when there are new business requirements or when you add new features to your application.

If you use a public cloud, you also need to consider the bandwidth and delay of your WAN connection to the public cloud.

Networklessons.com runs on Amazon AWS. Here are two screenshots of how we measure performance.

AWS CloudWatch EC2 CPU Utilization
This screenshot shows the average CPU utilization of the EC2 instances (virtual machines) that host the networklessons.com website.


AWS CloudWatch RDS MySQL Write IOPS
This screenshot shows the average write IOPS of the networklessons.com website's database.

Scalability

Scalability is the ability of a system to handle an increase in demand without impacting the application’s performance or availability.

When demand is too high and there are not enough resources, performance suffers. There are two types of scalability:

  • Vertical: scale up or down:
    • Add or remove resources:
      • CPU
      • Memory
      • Storage
  • Horizontal: scale out or in:
    • Add or remove systems

For example, we can increase the number of CPU cores and memory in a web server (vertical) or we can increase the number of web servers (horizontal).

We can scale horizontally or vertically to prevent a lack of resources from affecting our performance and availability. Here is a screenshot of the Amazon AWS auto scaling policy we use for the networklessons.com web servers:

AWS EC2 Spot Fleet Autoscaling
Networklessons.com runs on three EC2 instances (virtual machines). When there is more demand and the average CPU utilization exceeds 50%, we scale out to a maximum of 10 EC2 instances.
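The scale-out policy above can be sketched as a simple decision function. The thresholds and the function name below are hypothetical, chosen to mirror the policy in the screenshot (target of 50% CPU, three to ten instances):

```python
def desired_capacity(current, avg_cpu, target=50.0, min_size=3, max_size=10):
    """Decide the next instance count from the average CPU utilization."""
    if avg_cpu > target:
        # Scale out, but never beyond the fleet maximum.
        return min(current + 1, max_size)
    if avg_cpu < target / 2:
        # Scale in when the fleet is clearly underutilized.
        return max(current - 1, min_size)
    return current
```

A real autoscaler, such as an AWS target tracking policy, adjusts capacity proportionally to how far the metric is from the target rather than one instance at a time; this sketch only shows the decision logic.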

Elasticity

What if you scale up or out and demand decreases? The advantage of the cloud is that you can scale down or in whenever you want, so you pay only for the resources you need. We call this elasticity, and most cloud providers call it autoscaling. For example:

  • Amazon AWS Auto Scaling
  • Microsoft Azure Autoscale
  • Google Cloud Platform Autoscaling

Public cloud providers seem to have infinite compute and storage capacity. Cloud providers like Amazon AWS, Azure, and Google Cloud need to keep enough resources in reserve for their customers. You can use this unused capacity with spot instances and save money. However, when the provider needs the capacity back, you lose the instance.

High Availability

High availability (HA) means the application remains available with little or no interruption.

We achieve high availability when an application continues to operate even when one or more underlying components fail: for example, a router, switch, firewall, or server.

We achieve HA by implementing the same components multiple times (redundancy). For example:

  • Running two web servers instead of one
  • Running the same database on two servers, a master and a slave

We also need a failover mechanism that is aware of the state of the other components. For example:

  • A load balancer that detects that a web server is offline.
  • A slave database server that detects that the master database server is offline.
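A minimal sketch of the first failover mechanism: a load balancer that only forwards traffic to web servers that pass a health check. The function names and the health-check callback are my own, plain Python, not a real load balancer API:

```python
def healthy_backends(servers, is_up):
    """Return only the servers that currently pass the health check."""
    return [server for server in servers if is_up(server)]

def pick_backend(servers, is_up):
    """Route a request to the first healthy server, or fail if none remain."""
    healthy = healthy_backends(servers, is_up)
    if not healthy:
        raise RuntimeError("no healthy backends")
    return healthy[0]
```

Real load balancers probe each backend on an interval and take a server out of rotation only after several consecutive failed checks, to avoid flapping.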

In cloud computing, there are two things to consider:

  • Cloud Provider HA
  • Customer HA

Cloud Provider HA

When you run a virtual machine at a cloud provider (IaaS), the cloud provider offers HA for all underlying layers:

  • Networking
  • Storage
  • Servers
  • Virtualization

The cloud provider ensures that the failure of one component (for example a physical server) does not take your virtual machine down.

Cloud providers offer multiple regions and availability zones. Regions are spread worldwide, and within a region, there are multiple availability zones.

Cloud Provider Region Availability Zone
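To benefit from multiple availability zones, you have to spread your instances across them, so that the loss of one zone never takes down every instance. A sketch of the idea, using hypothetical instance and zone names:

```python
from itertools import cycle

def spread_across_zones(instances, zones):
    """Assign instances to availability zones round-robin (hypothetical helper)."""
    zone_cycle = cycle(zones)
    return {instance: next(zone_cycle) for instance in instances}

# Three web servers spread over two availability zones:
placement = spread_across_zones(["web1", "web2", "web3"],
                                ["eu-west-1a", "eu-west-1b"])
```

In practice, services such as an AWS Auto Scaling group do this placement for you when you configure them with multiple subnets in different availability zones.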

Customer HA

It’s up to the customer to use the cloud provider’s services to build an HA solution. You have zero redundancy if you install an application on a single virtual machine in a single region.

Take down the virtual machine, and the application is unavailable. Instead of installing a database server on a virtual machine yourself, you might be better off with a PaaS solution that offers redundant database servers. For example:

  • Google Cloud SQL
  • Amazon AWS RDS

Conclusion

You learned about the differences between cloud performance, scalability, and high availability. I hope you enjoyed this lesson. If you have questions, please leave a comment.
