From: AWS Cloud Practitioner Essentials
- ran out of interest in taking notes about 60% of the way through

Terminology

Cloud Computing: on-demand delivery of IT resources over the internet, with pay-as-you-go pricing

AWS APIs

all AWS services have APIs for you to use to manage everything

options for interacting with AWS (these all use AWS APIs)
- AWS Management Console (browser based)
- AWS Command Line Interface (CLI)
- AWS Software Development Kits (SDKs)
- other tools are available
    - AWS Elastic Beanstalk (for EC2 environments)
    - AWS CloudFormation (has broader support)
        - use JSON or YAML config files (CloudFormation templates) to declare the setup you want
        - update a template and AWS will handle the API calls for all the setup
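
A minimal sketch of what such a template could look like, built as a Python dictionary and serialized to JSON. `AWS::S3::Bucket` is a real resource type, but the logical name `NotesBucket` and the description text are made up for illustration, and a real template would usually declare more than this.

```python
import json

# A tiny CloudFormation-style template: one S3 bucket with versioning enabled.
# "NotesBucket" is a hypothetical logical name used only for this example.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative template declaring a single S3 bucket",
    "Resources": {
        "NotesBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "VersioningConfiguration": {"Status": "Enabled"},
            },
        }
    },
}

# Serialize to the JSON text that would be saved as the template file.
template_json = json.dumps(template, indent=2)
print(template_json)
```

The point of the declarative style is that the file states the end result (a bucket exists, versioning is on) and leaves the order of API calls to AWS.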

CloudTrail

logs AWS API calls

Amazon Elastic Compute Cloud (EC2)

for computations

compute-as-a-service model

can share host server with other instances (virtual machines)
- multitenancy
- hypervisor manages the server running multiple instances
- instances are secure and separate from each other

vertically scaling an instance
- take more or less resources (memory, cpus)

instance types
- each type is grouped under an instance family, and is optimized for different tasks
- families
    - general purpose: balanced resources, ok for web services and small/medium databases
    - compute optimized: intensive computing, ok for gaming servers, heavily used web services, and batch processing
    - memory optimized: for processing large datasets in memory
    - accelerated computing: hardware accelerators, ok for graphics, data pattern matching, floating point calculations
    - storage optimized: for high sequential read/write access to large datasets, ok for distributed file systems, data warehousing, and OLTP (online transaction processing)

auto-scaling of instances
- scale up: add more power/resources to the existing instances
- scale out: add more instances
- set min and max capacity, plus desired capacity (the amount to use by default)
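
A small sketch of how min, max, and desired capacity relate: whatever instance count is requested, the group stays inside the configured bounds. The function name and the numbers are hypothetical, and this is not the actual Auto Scaling algorithm.

```python
def clamp_capacity(requested: int, minimum: int, maximum: int) -> int:
    """Keep a requested instance count inside the [minimum, maximum] range."""
    if minimum > maximum:
        raise ValueError("minimum capacity cannot exceed maximum capacity")
    return max(minimum, min(requested, maximum))


# Hypothetical group: at least 2 instances, at most 10, 4 by default.
MIN_CAPACITY, MAX_CAPACITY, DESIRED_CAPACITY = 2, 10, 4

print(clamp_capacity(DESIRED_CAPACITY, MIN_CAPACITY, MAX_CAPACITY))  # 4
print(clamp_capacity(25, MIN_CAPACITY, MAX_CAPACITY))                # 10
print(clamp_capacity(0, MIN_CAPACITY, MAX_CAPACITY))                 # 2
```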

Container Orchestration Tools

uses Docker containers
- the container includes the code, any dependencies, and configuration
- containers are isolated from each other similar to a virtual machine

cluster: a group of compute resources (such as EC2 instances) that run your containers

these can run on EC2, but don't have to
- Amazon Elastic Container Service (ECS)
- Amazon Elastic Kubernetes Service (EKS)
- you can host these on Fargate instead

Elastic Load Balancing (ELB)

load balancer: directing/routing traffic among the available instances of a service

Elastic Load Balancer
- AWS's flavor of load balancer that you can turn on
- a managed service
- a regional service
- cost of one ELB does not change when the number of servers it is balancing between changes
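
To make "routing traffic among the available instances" concrete, here is a sketch of one simple strategy, round robin. This is an assumption for illustration only: the notes do not say which algorithm ELB uses, and the class and instance names are made up.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand each incoming request to the next instance in a repeating order."""

    def __init__(self, instances):
        instances = list(instances)
        if not instances:
            raise ValueError("at least one instance is required")
        self._next_instance = cycle(instances)

    def route(self, request):
        # The request itself is ignored here; only the rotation matters.
        return next(self._next_instance)


balancer = RoundRobinBalancer(["instance-a", "instance-b", "instance-c"])
for request_id in range(5):
    print(request_id, "->", balancer.route(request_id))
```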

Amazon Simple Queue Service (SQS)

AWS messaging and queuing system

payload: the data contained in a message

queue: holds messages until they are processed
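
An in-memory stand-in for the idea of a queue holding payloads until a consumer is ready. This is not the SQS API; `send_message` and `receive_message` here are local helper names, and the strict first-in, first-out ordering is a simplification.

```python
from collections import deque

# The queue holds message payloads until something processes them.
queue = deque()


def send_message(payload):
    """Producer side: add a payload to the end of the queue."""
    queue.append(payload)


def receive_message():
    """Consumer side: take the oldest payload, or None if the queue is empty."""
    return queue.popleft() if queue else None


send_message({"order_id": 1})
send_message({"order_id": 2})

# The consumer can be slower than the producer; nothing is lost in between.
print(receive_message())  # {'order_id': 1}
print(receive_message())  # {'order_id': 2}
print(receive_message())  # None
```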

Amazon Simple Notification Service (SNS)

sends out messages
- can send a message to a specific place
- or publish a message and have subscribers
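
A sketch of the publish/subscribe pattern described above, in plain Python. The `Topic` class is hypothetical and only mirrors the idea; real SNS subscribers would be endpoints such as queues, functions, or email addresses.

```python
class Topic:
    """Deliver every published message to all current subscribers."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


email_inbox, audit_log = [], []

topic = Topic()
topic.subscribe(email_inbox.append)  # first subscriber
topic.subscribe(audit_log.append)    # second subscriber

# One publish call fans out to every subscriber.
topic.publish("order shipped")
print(email_inbox, audit_log)
```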

Serverless Compute Options

serverless: you cannot see or access the infrastructure (provisioning, scaling, maintenance is all handled by AWS)
- your code runs on servers, but you don't need to provision or manage them

AWS Lambda

a function with a trigger
- one instance per trigger that occurs
- runtime limited to 15 minutes
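
A sketch of what "a function with a trigger" looks like in Python. A function taking `(event, context)` is the usual shape of a Python Lambda handler; the `name` field in the event and the response body are assumptions made up for this example.

```python
import json


def lambda_handler(event, context):
    """Runs once per trigger; `event` carries the trigger's data."""
    name = event.get("name", "world")  # hypothetical event field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local call for experimentation; in AWS the service invokes the handler.
response = lambda_handler({"name": "notes"}, None)
print(response)
```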

Fargate

a serverless way to host ECS/EKS

Amazon CloudFront

CloudFront is a Content Delivery Network (CDN)

ex of a CDN: cache a copy of data physically close to a city so that users there can access it quickly

uses edge locations around the world to be close to users
- edge locations are different than regions
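
A toy model of the caching idea behind a CDN: the first request at an edge location goes back to the origin, later requests are served from the local copy. The `EdgeLocation` class and the sample content are hypothetical, and real CDNs also handle expiry and invalidation, which this sketch leaves out.

```python
# Content held at the origin (for example, in one region).
ORIGIN = {"/index.html": "<h1>hello</h1>"}


class EdgeLocation:
    """Cache copies of origin content close to users."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}
        self.origin_fetches = 0

    def get(self, path):
        if path not in self.cache:
            # Cache miss: take the slow trip to the origin once.
            self.origin_fetches += 1
            self.cache[path] = self.origin[path]
        return self.cache[path]


edge = EdgeLocation(ORIGIN)
edge.get("/index.html")  # miss, fetched from the origin
edge.get("/index.html")  # hit, served from the edge cache
print(edge.origin_fetches)  # 1
```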

Storage

block level storage
- for files
- updates just the pieces of storage that have been edited
- ex: your hard drive

instance store volumes
- physical storage attached to the server your EC2 instance is running on
- this data is deleted if you stop your instance (because when it is restarted, it will likely be on a different server)
- so it is for ephemeral data

Amazon Elastic Block Store (EBS)

attach EBS volumes to your EC2 instance
- this data persists when your EC2 instance stops and restarts
- allows incremental backups (snapshots)
- volumes can be sized up to 16 TiB

an availability zone resource

Amazon Simple Storage Service (S3)

data is stored as objects in buckets
- unlimited storage
- max object size is 5 Terabytes
- can version objects (save previous versions)
- create multiple buckets
- security for who can access objects

tiers
- S3 standard: high durability, data stored in at least three facilities at a time
- S3 static web hosting: for static web pages hosted from an S3 bucket
- S3 infrequent access (S3 Standard-IA): for backups and disaster recovery
- S3 glacier: for long term archives, can lock "vaults" to stop edits

lifecycle policies
- can move data between tiers automatically
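
A sketch of a lifecycle rule and what it means for an object as it ages. The dictionary follows the shape that S3 lifecycle configurations use (for example, as the `LifecycleConfiguration` argument to boto3's `put_bucket_lifecycle_configuration`), but the rule ID, prefix, and day counts are assumptions, and `storage_class_at` is a local helper, not an AWS call.

```python
# Hypothetical rule: backups move to Standard-IA after 30 days,
# then to Glacier after 90 days.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-old-backups",
            "Filter": {"Prefix": "backups/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}


def storage_class_at(age_days, rule):
    """Return the tier an object would be in after `age_days` under `rule`."""
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = lifecycle_configuration["Rules"][0]
for age in (10, 45, 400):
    print(age, storage_class_at(age, rule))
```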

optimized for write once, read many

every object automatically has a url

object storage: each object is a complete file, each update will cause the entire object to be saved as a whole
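
To compare the two update models, this sketch counts how many bytes get rewritten when one character of a file changes: block storage rewrites only the changed block, object storage rewrites the whole object. The 4-byte block size and the sample data are chosen only to keep the example readable.

```python
BLOCK_SIZE = 4  # unrealistically small, for illustration only


def split_into_blocks(data: bytes) -> list[bytes]:
    """Cut the data into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def changed_block_indexes(old: bytes, new: bytes) -> list[int]:
    """Indexes of blocks in `new` that differ from (or do not exist in) `old`."""
    old_blocks = split_into_blocks(old)
    new_blocks = split_into_blocks(new)
    return [
        index
        for index, block in enumerate(new_blocks)
        if index >= len(old_blocks) or old_blocks[index] != block
    ]


old_file = b"aaaabbbbcccc"
new_file = b"aaaaXbbbcccc"  # one byte edited

block_bytes_written = len(changed_block_indexes(old_file, new_file)) * BLOCK_SIZE
object_bytes_written = len(new_file)  # the entire object is saved again
print(block_bytes_written, object_bytes_written)  # 4 12
```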

Amazon Elastic File System (EFS)

a managed Linux file system

multiple EC2 instances can access it at once

automatically scales to give you more storage

a regional resource

Databases

Amazon Relational Database Service (RDS)

supports: mysql, postgresql, oracle, microsoft sql server, and others

"lift-and-shift" migration: copies your local database to the cloud
- AWS Database Migration Service (DMS)
    - homogeneous: source and target are the same database type (such as MySQL to MySQL)
    - heterogeneous: use the AWS Schema Conversion Tool to convert the schema for a different target database type

built for business analytics

for realtime read/write functionality

Amazon Aurora

a database solution compatible with MySQL or PostgreSQL
- with data replication
- and continuous backups

for realtime read/write functionality

Amazon DynamoDB

a serverless database
- you don't need to manage the infrastructure
- create tables of items and attributes
- millisecond response time
- a nosql database, a nonrelational database
- so can only handle simple queries against one table at a time

for realtime read/write functionality

Amazon Redshift

data warehouse
- for historical analytics and querying multiple databases
- historical data refers to data that is set (no longer being edited)

Amazon DocumentDB

for content management

Amazon Neptune

a graph database, like for social networks

Amazon Quantum Ledger Database (QLDB)

100% immutability, no data can be removed from the audits

CloudWatch

TODO find out more about custom metrics, CloudWatch alarms, and CloudWatch dashboards

Logs Insights

CloudWatch > Logs Insights

query example

fields @timestamp, StatusCode, @message
| sort @timestamp desc
| filter @message like "provisioning"

fields: show the timestamp, StatusCode, and message
sort: by timestamp descending
filter: keep messages that include the text "provisioning"

fields can be parsed with regex
each * part will be output as a new named field

 # with @message format 'stuff stuff,"@l":"Debug",stuff stuff'

fields @timestamp, @message
| parse @message '*"@l":"*"*' as message_before, loggingType, message_after
| filter loggingType = "Error"
| display loggingType, message_after #or comment out this line to see full messages as normal
| sort @timestamp desc
| limit 20

another way to get this filtering is

fields @timestamp as Timestamp, @@l as LogLevel, ServiceName, @@m
| filter LogLevel = "Error"
| sort @timestamp desc
| limit 20

or more specifically for current job

fields @timestamp as Timestamp, @@l as LogLevel, ServiceName, @@m
| filter LogLevel != "Error" and LogLevel != "Verbose" and LogLevel != "Debug" and not isblank(LogLevel)
| sort @timestamp desc
| limit 20
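
To check what the `parse` pattern `'*"@l":"*"*'` pulls out of a message before running it in the console, the same split can be approximated locally with a regular expression. This is only an approximation of the wildcard matching (non-greedy groups stand in for each `*`), and the sample message is the placeholder format from the comment above.

```python
import re

# Each * in the parse pattern becomes a named group; the literal text
# "@l":" ... " between the wildcards is kept as-is.
PATTERN = re.compile(
    r'^(?P<message_before>.*?)"@l":"(?P<loggingType>.*?)"(?P<message_after>.*)$'
)


def parse_message(message):
    """Return the three parsed fields, or None when the pattern is absent."""
    match = PATTERN.match(message)
    return match.groupdict() if match else None


sample = 'stuff stuff,"@l":"Debug",stuff stuff'
parsed = parse_message(sample)
print(parsed)
```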

useful fields

| filter StatusCode like /^5/

looking for 5xx response codes

trying to find error responses

fields @timestamp, StatusCode, @message
| sort @timestamp desc
| filter not isblank(StatusCode)

Amazon Route 53

Amazon Route 53: a Domain Name System (DNS) service; DNS translates website names into IP addresses
- runs in edge locations
- can also register domain names

routing policies
- latency-based routing
- geolocation dns: routing is based on where the user is located
- geoproximity routing
- weighted round robin
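
A sketch of the weighted idea from the list above: each endpoint gets a share of requests in proportion to its weight. The endpoint names and the 90/10 split are hypothetical, and picking with a seeded random generator is just one simple way to model the behavior, not how Route 53 is implemented.

```python
import random


def pick_endpoint(weights, rng):
    """Choose one endpoint, with probability proportional to its weight."""
    endpoints = list(weights)
    return rng.choices(endpoints, weights=[weights[e] for e in endpoints], k=1)[0]


# Hypothetical setup: send most traffic to "blue", a small share to "green".
weights = {"blue": 90, "green": 10}
rng = random.Random(42)  # seeded so repeated runs give the same counts

counts = {name: 0 for name in weights}
for _ in range(1000):
    counts[pick_endpoint(weights, rng)] += 1

print(counts)
```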

Amazon Virtual Private Cloud (VPC)

your own private network within AWS
- define your ip range
- can define what traffic can enter the VPC

internet gateway (IGW): an access point for public traffic to enter your VPC
- (this is not automatic)

virtual private gateway: an access point for private traffic (encrypted internet traffic) to enter your VPC
- (this is not automatic)
- make a VPN connection between a private network and your VPC

AWS direct connect:
- a dedicated private connection (not shared with normal internet traffic)
- a dedicated fiber connection (wire) between your data center and an AWS data center

gateways are configured to be "traffic in" or "traffic out"

subnets: used to control access to the gateways
- public subnets have access to the internet gateways
- has a Network ACL
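
A sketch of defining an IP range and carving it into subnets, using Python's standard `ipaddress` module. The `10.0.0.0/16` range and the public/private labels are assumptions for illustration; nothing here talks to AWS.

```python
import ipaddress

# Hypothetical VPC range.
vpc_range = ipaddress.ip_network("10.0.0.0/16")

# Split the range into /24 subnets and take the first two.
subnets = list(vpc_range.subnets(new_prefix=24))
public_subnet, private_subnet = subnets[0], subnets[1]

print(vpc_range.num_addresses)  # 65536
print(public_subnet)            # 10.0.0.0/24
print(private_subnet)           # 10.0.1.0/24

# Check which subnet a given address falls into.
print(ipaddress.ip_address("10.0.1.25") in private_subnet)  # True
```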

network access control lists (Network ACL)
- define which packets are allowed to cross the subnet boundary, in either direction
- stateless: checks every single packet, with no memory of earlier traffic
- the default network ACL allows all inbound and outbound traffic

security groups
- EC2 instance level access security
- what traffic is allowed to access a particular EC2 instance?
- by default, no incoming traffic is accepted and all outgoing traffic is allowed
- stateful: has some memory for what to allow in
- will allow the responses to outgoing packets back in
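
A toy comparison of stateless and stateful filtering. Both classes are hypothetical models, not AWS APIs, and a "connection" is just a label: the stateless filter checks inbound and outbound rules independently, while the stateful one remembers outgoing connections and lets their responses back in.

```python
class StatelessAcl:
    """Network-ACL-style: each direction is checked against its own rules."""

    def __init__(self, inbound_allowed, outbound_allowed):
        self.inbound_allowed = set(inbound_allowed)
        self.outbound_allowed = set(outbound_allowed)

    def allow_outbound(self, connection):
        return connection in self.outbound_allowed

    def allow_inbound(self, connection):
        # No memory: a response is judged like any other inbound packet.
        return connection in self.inbound_allowed


class StatefulSecurityGroup:
    """Security-group-style: remembers connections the instance opened."""

    def __init__(self, inbound_allowed=()):
        self.inbound_allowed = set(inbound_allowed)
        self._open_connections = set()

    def send_outbound(self, connection):
        # All outgoing traffic is allowed; remember it for the response.
        self._open_connections.add(connection)
        return True

    def allow_inbound(self, connection):
        return (
            connection in self.inbound_allowed
            or connection in self._open_connections
        )


acl = StatelessAcl(inbound_allowed=[], outbound_allowed=["api-call"])
group = StatefulSecurityGroup()

group.send_outbound("api-call")
print(acl.allow_inbound("api-call"))    # False: needs its own inbound rule
print(group.allow_inbound("api-call"))  # True: response to remembered traffic
```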

Security

shared responsibility model: AWS manages security "of" the cloud, customer manages the security "in" the cloud

aws responsible
- physical layer
- network layer
- hypervisor layer

customer
- operating system (when on EC2, when not serverless)
- applications
- data

AWS root account user: the user with complete access to your whole system
- do not use this often, create other users with granular access

multifactor authentication (MFA)
- requires a randomized token to login

AWS Identity And Access Management (AWS IAM)

create new users
- by default, they have no permissions
- create an IAM policy and associate it to a user to grant them access
- IAM policy: a json document that lists the api calls the user can make
- can also use groups to connect users and policies
- can use roles to give temporary permissions for an amount of time
- when an identity assumes a role, it gives up its previous permissions and uses the role's permissions for that time
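
A sketch of an IAM policy document and the "no permissions by default" rule. The `Version`, `Statement`, `Effect`, `Action`, and `Resource` keys are the standard policy elements; the bucket name is made up, and `is_allowed` is a heavily simplified local check (it ignores resources, wildcards, and conditions), not how IAM actually evaluates requests.

```python
import json

# Hypothetical policy: read-only access to one example bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetObject"],
            "Resource": [
                "arn:aws:s3:::example-notes-bucket",
                "arn:aws:s3:::example-notes-bucket/*",
            ],
        }
    ],
}


def is_allowed(policy, action):
    """Default deny: allow only if a statement allows it and none denies it."""
    allowed = False
    for statement in policy["Statement"]:
        if action in statement["Action"]:
            if statement["Effect"] == "Deny":
                return False
            allowed = True
    return allowed


print(json.dumps(policy, indent=2))
print(is_allowed(policy, "s3:GetObject"))            # True
print(is_allowed(policy, "ec2:TerminateInstances"))  # False
```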

principle of least privilege: give users minimal access to what they need

AWS Organizations

manage multiple AWS accounts, separate resources
- centralized management
- consolidated billing
- hierarchical groupings of accounts
- organize into OU (Organizational Units)
- service and api action access control

AWS Global Infrastructure

About

Amazon has data centers located all over the world
- they are organized into regions
- this is helpful for providing high availability, because a disaster in one region does not block processing in other regions
- regions are built to be geographically close to high-demand population centers
- the regions are interconnected
- users can decide which region to run their services out of
- your data is only stored and handled in the region(s) you specify (for security)
- the AWS features available vary between regions
- pricing varies between regions

availability zone (AZ "ay-zed"): one or more discrete data centers
- each region is made up of multiple availability zones
- a single EC2 instance will only run in one data center
- run multiple instances so they can run in multiple data centers, and even in multiple availability zones
- the geographically distributed availability zones provide high availability

AWS Outposts

Amazon will install a little data center inside your building
- Amazon will manage this data center
- make use of AWS infrastructure and services in your on-premises data center