AWS Well-Architected Framework Summary

What is the AWS Well-Architected Framework?

  • The AWS Well-Architected Framework is a structured approach to help architects, developers, and cloud professionals build secure, high-performing, resilient, and efficient infrastructure on Amazon Web Services (AWS).
  • It helps you understand the pros and cons of decisions you make while building systems on AWS.
  • It provides a way for you to consistently measure your architectures against best practices and identify areas for improvement.
  • The AWS Well-Architected Tool (AWS WA Tool) provides recommendations for making your workloads more reliable, secure, efficient, and cost-effective.

Why Use the AWS Well-Architected Framework?

  • Using the framework helps you learn architectural best practices for designing and operating secure, reliable, efficient, cost-effective, and sustainable workloads in the AWS Cloud.
  • Having well-architected systems greatly increases the likelihood of business success.

Six Pillars of the AWS Well-Architected Framework

The AWS Well-Architected Framework is based on six pillars:

  1. Operational Excellence: Focuses on running and monitoring systems, and continually improving processes and procedures.
    • Key functions include automating changes, responding to events, and defining standards to manage daily operations.
    • Design principles:
      • Perform operations as code.
      • Make frequent, small, reversible changes.
      • Refine operations procedures frequently.
      • Anticipate failure.
  2. Security: Focuses on protecting data, systems, and assets.
    • Design principles:
      • Implement a strong identity foundation.
      • Enable traceability.
      • Apply security at all layers.
      • Automate security best practices.
      • Protect data in transit and at rest.
      • Keep people away from data.
      • Prepare for security events.
  3. Reliability: Encompasses the ability of a workload to perform its intended function correctly and consistently when it’s expected to.
    • Design principles:
      • Automatically recover from failure.
      • Test recovery procedures.
      • Scale horizontally to increase aggregate workload availability.
      • Stop guessing capacity.
      • Manage change in automation.
  4. Performance Efficiency: Focuses on using computing resources efficiently to meet system requirements, and to maintain that efficiency as demand changes and technologies evolve.
    • Design principles:
      • Democratize advanced technologies.
      • Go global in minutes.
      • Use serverless architectures.
      • Experiment more often.
      • Consider mechanical sympathy.
  5. Cost Optimization: Involves running systems to deliver business value at the lowest price point.
    • Design principles:
      • Implement cloud financial management.
      • Adopt a consumption model.
      • Measure overall efficiency.
      • Stop spending money on undifferentiated heavy lifting.
      • Analyze and attribute expenditure.
  6. Sustainability: Addresses the long-term environmental, economic, and societal impact of your business activities.
    • Design principles:
      • Understand your impact.
      • Establish sustainability goals.
      • Maximize utilization.
      • Anticipate and adopt new, more efficient hardware and software offerings.
      • Use managed services.
      • Reduce the downstream impact of your cloud workloads.
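
The Reliability principle "scale horizontally to increase aggregate workload availability" can be illustrated with a little arithmetic. The sketch below is only a simplified model: it assumes replicas fail independently of each other and that the workload stays available as long as at least one replica is healthy. The function name is hypothetical and not part of any AWS API.

```python
def aggregate_availability(replica_availability: float, replicas: int) -> float:
    """Availability of a workload served by redundant replicas.

    Assumption: failures are independent, and one healthy replica is
    enough to serve the workload.
    """
    if not 0.0 <= replica_availability <= 1.0:
        raise ValueError("replica_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The workload is down only when every replica is down at the same time.
    return 1.0 - (1.0 - replica_availability) ** replicas


if __name__ == "__main__":
    for count in (1, 2, 3):
        print(count, f"{aggregate_availability(0.99, count):.6f}")
```

Under these assumptions, replicas that are each available 99% of the time give roughly 99.99% with two replicas and 99.9999% with three. Real failures are often correlated, so treat these figures as an upper bound rather than a prediction.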

AWS Design Principles

Some general AWS design principles are outlined below:

  • Scalability: Increasing resources to meet demand.
    • Scaling horizontally: Increasing the number of resources.
    • Scaling vertically: Increasing the specifications of an individual resource.
  • Disposable Resources Instead of Fixed Servers: Automating the setup of new resources along with their configuration and code.
    • Infrastructure as Code: Applying techniques, practices, and tools from software development to make your whole infrastructure reusable, maintainable, extensible, and testable.
  • Automation: AWS handles the details of resource management, such as resource provisioning, load balancing, auto-scaling, and monitoring, allowing you to focus on resource deployment.
    • Serverless Management and Deployment: Shifting your focus to automation of your code deployment, with AWS handling management tasks.
    • Alarms and Events: Continuous monitoring of resources, initiating events when certain metrics or conditions are met.
  • Loose Coupling: Reducing interdependencies in a system by allowing components to interact through specific interfaces.
    • Well-Defined Interfaces: Using interfaces such as RESTful APIs to reduce interdependencies.
    • Service Discovery: Allowing applications deployed as smaller services to be consumed without prior knowledge of their network topology details.
    • Asynchronous Integration: Using intermediate durable storage for interacting components that do not need an immediate response.
    • Distributed Systems Best Practices: Building applications that handle component failure gracefully.
  • Services, Not Servers: Using managed services and serverless architectures to reduce operational complexity.
    • Managed Services: Providing building blocks for developers, such as databases, machine learning, analytics, queuing, search, email, and notifications.
    • Serverless Architectures: Enabling the building of both event-driven and synchronous services without managing server infrastructure.
  • Databases: Choosing the right database technology for each workload, including relational databases, NoSQL databases, data warehouses, and graph databases.
    • Search Functionalities: Enabling querying of datasets that are not precisely structured, with features like customizable result ranking, faceting for filtering, synonyms, and stemming.
  • Managing Increasing Volumes of Data: Using a data lake approach to store massive amounts of data in a central location.
  • Removing Single Points of Failure: Using redundancy, failure detection, and durable data storage to ensure high availability.
    • Introducing Redundancy: Using standby redundancy and active redundancy to ensure availability.
    • Detect Failure: Using health checks and log collection.
    • Durable Data Storage: Using synchronous replication, asynchronous replication, and quorum-based replication.
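    • Automated Multi-Data Center Resilience: Utilizing AWS Regions and Availability Zones (Multi-AZ Principle).
    • Fault Isolation and Traditional Horizontal Scaling: Using sharding, and shuffle sharding in particular, so that a failure affects only a subset of users rather than all of them.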
  • Optimize for Cost: Using right-sizing, elasticity, and various purchasing options to minimize costs.
  • Caching: Using application data caching and edge caching to improve performance and reduce costs.
  • Security: Using AWS features for defense in depth, sharing security responsibility with AWS, reducing privileged access, implementing security as code, and using real-time auditing.
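
Of the principles above, shuffle sharding is probably the least self-explanatory, so here is a minimal sketch of the idea. Each customer is mapped to a small, stable, pseudo-random subset of workers, so a problem caused by one customer can only affect the workers in that customer's shard. The function names, worker names, and the hash-based seeding are assumptions made for illustration; they are not taken from any AWS service.

```python
import hashlib
import math
import random
from typing import List, Sequence


def shuffle_shard(customer_id: str, workers: Sequence[str], shard_size: int) -> List[str]:
    """Pick a stable pseudo-random subset of workers for one customer."""
    if not 0 < shard_size <= len(workers):
        raise ValueError("shard_size must be between 1 and the number of workers")
    # Seed a private generator from a hash of the customer ID so the same
    # customer is always mapped to the same shard.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return sorted(rng.sample(list(workers), shard_size))


def distinct_shards(worker_count: int, shard_size: int) -> int:
    """Number of different shards that can be formed from the worker pool."""
    return math.comb(worker_count, shard_size)


if __name__ == "__main__":
    pool = [f"worker-{i}" for i in range(8)]
    for customer in ("customer-a", "customer-b", "customer-c"):
        print(customer, shuffle_shard(customer, pool, 2))
    print("possible shards:", distinct_shards(len(pool), 2))
```

With 8 workers and a shard size of 2 there are 28 possible shards, so two customers are fully exposed to each other's failures only when they happen to draw the same pair. With plain sharding into 4 fixed pairs, by contrast, every customer shares a complete shard with roughly a quarter of all customers.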

Additional Best Practices

Additional best practices for building applications in the AWS Cloud:

  1. Decouple your components (loose coupling): Build components that do not have tight dependencies on each other so that if one component fails, the others will continue to work.
  2. Think parallel: Implement parallelization whenever possible and automate processes.
  3. Implement elasticity: Automate the deployment process and streamline configuration and build processes to allow the system to scale in and out to meet demand without human intervention.
  4. Design for failure: Assume that components will fail and design your architecture for high availability.
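
Practices 1 and 4 can be sketched together: a producer and a consumer that only share a queue, with the consumer treating handler failures as a normal event. The example below uses an in-memory `deque` as a stand-in for a durable queue service, and the names `enqueue`, `drain_queue`, and `max_attempts` are hypothetical. It is a minimal sketch under those assumptions, not a production pattern.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Message = Tuple[str, int]  # (body, number of failed attempts so far)


def enqueue(messages: Deque[Message], body: str) -> None:
    """Producer side: hand the work to the queue and return immediately."""
    messages.append((body, 0))


def drain_queue(
    messages: Deque[Message],
    handler: Callable[[str], None],
    max_attempts: int = 3,
) -> Tuple[List[str], List[str]]:
    """Consumer side: process messages, retrying failures a bounded number of times.

    Returns (processed, dead_lettered).
    """
    processed: List[str] = []
    dead_lettered: List[str] = []
    while messages:
        body, attempts = messages.popleft()
        try:
            handler(body)
            processed.append(body)
        except Exception:
            # Design for failure: one bad message must not stop the consumer.
            if attempts + 1 < max_attempts:
                messages.append((body, attempts + 1))  # retry later
            else:
                dead_lettered.append(body)  # give up and set aside for inspection
    return processed, dead_lettered
```

The producer never calls the handler directly, so a failing or slow consumer does not break it. Messages that keep failing are set aside after `max_attempts` tries instead of blocking the rest of the queue.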

Scalability and Elasticity

  • Cloud Elasticity: The ability of a cloud to automatically expand or shrink infrastructure resources as demand rises and falls.
  • Cloud Scalability: The ability to handle a growing workload while maintaining good performance, by adding resources.
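
Elasticity can be made concrete with a simple proportional scaling rule: resize the fleet so that average utilization moves back toward a target, within fixed bounds. This is a simplified sketch and not the algorithm that any AWS scaling service actually uses; the function and parameter names are hypothetical.

```python
import math


def desired_capacity(
    current_instances: int,
    observed_utilization: float,
    target_utilization: float,
    min_instances: int,
    max_instances: int,
) -> int:
    """Instance count that would bring average utilization back to the target."""
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    if min_instances > max_instances:
        raise ValueError("min_instances must not exceed max_instances")
    # Total load is roughly current_instances * observed_utilization;
    # divide by the target to get the fleet size that carries it.
    raw = math.ceil(current_instances * observed_utilization / target_utilization)
    # Clamp to the configured bounds so the fleet never scales to zero or without limit.
    return max(min_instances, min(max_instances, raw))
```

For example, 4 instances at 90% utilization with a 60% target scale out to 6, while 6 instances at 20% utilization scale in to 2. The same rule handles both directions, which is what distinguishes elasticity from one-way growth.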

High Availability and Fault Tolerance

  • High Availability: The system almost always stays up, though it may run in a degraded state during a failure.
  • Fault Tolerance: The system almost always stays up, and users see no noticeable difference during a failure.
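
The difference between the two can be seen in a toy capacity model. It assumes identical, interchangeable nodes and looks only at how much of peak demand can still be served after a failure. Real fault tolerance also depends on things like state replication and failover time, which this sketch ignores. The function name is hypothetical.

```python
def capacity_after_failures(
    deployed_nodes: int,
    nodes_needed_at_peak: int,
    failed_nodes: int,
) -> float:
    """Fraction of peak demand that can still be served after some nodes fail."""
    if nodes_needed_at_peak < 1:
        raise ValueError("nodes_needed_at_peak must be at least 1")
    healthy = max(deployed_nodes - failed_nodes, 0)
    # Capacity beyond peak demand is spare, so the fraction is capped at 1.0.
    return min(healthy / nodes_needed_at_peak, 1.0)


if __name__ == "__main__":
    # Exactly enough nodes for peak: one failure leaves a degraded but running system.
    print(capacity_after_failures(deployed_nodes=4, nodes_needed_at_peak=4, failed_nodes=1))
    # One spare node: the same failure is absorbed without losing capacity.
    print(capacity_after_failures(deployed_nodes=5, nodes_needed_at_peak=4, failed_nodes=1))
```

In this model, a fleet sized exactly for peak drops to 75% capacity when one of four nodes fails, which matches the "degraded state" of high availability. Adding a spare node keeps capacity at 100% through the same failure, which is the behavior fault tolerance aims for, at the cost of paying for idle capacity.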