Friday, June 23, 2017

AWS : S3




What is Object Storage?
     Data is stored as objects (files).
     There is no hierarchy: all objects sit at the same level in a flat namespace. There is no notion of a directory, although key prefixes can be used to emulate one.
     Each object also carries metadata describing it.
     Each object has a globally unique identifier, which lets an application retrieve the object without knowing the physical location of the data. All a developer needs to know is the object identifier, and S3 does the rest.
     Objects are immutable: you can’t modify an object in place. You can add a new version, delete it, or replace it.
     No streaming: there are alternative ways to achieve it. [google]
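
The "ways around" the flat namespace usually rely on key prefixes and a delimiter such as `/`. The sketch below emulates that idea in plain Python, without calling S3. The function name `list_level` and the sample keys are hypothetical and only illustrate the technique.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Emulate a directory listing over a flat set of object keys.

    Returns (objects, common_prefixes): the keys that sit directly under
    `prefix`, and the "sub-directories" rolled up at the next delimiter.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # The key continues past a delimiter, so roll it up as a "folder".
            common_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Hypothetical keys stored in a single bucket.
keys = ["readme.txt", "photos/2017/a.jpg", "photos/2017/b.jpg", "photos/index.html"]
print(list_level(keys))                    # (['readme.txt'], ['photos/'])
print(list_level(keys, prefix="photos/"))  # (['photos/index.html'], ['photos/2017/'])
```

The bucket still holds four objects at one level. The "folders" exist only as a convention over the key names.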
 

https://aws.amazon.com/s3/faqs/
What is S3 object size?
  • The total volume of data and number of objects you can store are unlimited. 
  • Individual Amazon S3 objects can range in size from a minimum of 0 bytes to a maximum of 5 terabytes.
  • Data is stored as key-value pairs.
Basically, S3 stores each object as a key-value pair:
  • Key: the name of the object.
  • Value: the data (a sequence of bytes).
  • Version ID: important for versioning of the object.
  • Metadata (data about what you are storing): e.g. creation date.
  • Subresources: not covered here.

Does S3 have a universal namespace?
Yes. Bucket names must be unique across all AWS accounts.
How can I upload large objects?
  • The largest object that can be uploaded in a single PUT is 5 gigabytes.
  • For objects larger than 100 megabytes, customers should consider using the Multipart Upload capability.
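
As a rough illustration of how Multipart Upload splits an object, the sketch below plans the byte ranges of each part. It does not call S3. The function name `plan_parts` and the 100 MiB default part size are assumptions. The 5 MiB minimum part size and the 10,000-part limit are S3's documented multipart limits.

```python
MIB = 1024 ** 2
MIN_PART = 5 * MIB   # S3 minimum part size (the last part may be smaller)
MAX_PARTS = 10_000   # S3 maximum number of parts per multipart upload


def plan_parts(object_size, part_size=100 * MIB):
    """Return (part_number, offset, length) tuples covering the whole object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Integer ceiling division: the smallest part size that fits in MAX_PARTS parts.
    smallest_allowed = -(-object_size // MAX_PARTS)
    part_size = max(part_size, MIN_PART, smallest_allowed)
    parts, offset, number = [], 0, 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


parts = plan_parts(250 * MIB)
print(len(parts))   # 3
print(parts[-1])    # (3, 209715200, 52428800)
```

Each tuple corresponds to one part that could be uploaded independently and retried on its own if it fails.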
How can I delete large numbers of objects?
You can use Multi-Object Delete to delete large numbers of objects from Amazon S3. This feature allows you to send multiple object keys in a single request to speed up your deletes. Amazon does not charge you for using Multi-Object Delete.
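
A single Multi-Object Delete request accepts up to 1,000 keys, so a larger key list has to be split into batches. The following is a minimal sketch of that batching in plain Python. The helper name `batch_keys` and the sample keys are hypothetical, and the actual request (for example boto3's `delete_objects`) is only described in a comment rather than executed.

```python
def batch_keys(keys, batch_size=1000):
    """Yield lists of at most `batch_size` keys, one list per delete request."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch = []
    for key in keys:
        batch.append(key)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


# Hypothetical keys to delete.
keys = [f"logs/{i:05d}.txt" for i in range(2500)]
batches = list(batch_keys(keys))
print([len(b) for b in batches])  # [1000, 1000, 500]

# Each batch maps onto one request payload, e.g. the `Delete` argument of
# boto3's delete_objects(Bucket=..., Delete=payload):
payload = {"Objects": [{"Key": k} for k in batches[0]]}
```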


Storage Class


RRS (Reduced Redundancy Storage): can be used to store objects that you can afford to lose because they are reproducible. For example, thumbnails generated from photos can be regenerated.





1.    S3-IA: Infrequent Access:
a.    Same performance as Standard
b.    Cheaper storage
c.    Extra retrieval fees
d.    So it is mainly useful for backups and disaster recovery.
e.    The storage class is set at the object level, i.e. you can mix Standard objects and Infrequent Access objects in the same S3 bucket.


Where is my data stored?
You specify a region when you create your Amazon S3 bucket. Within that region, your objects are redundantly stored on multiple devices across multiple facilities.

First Step

Second Step



What is the number of Amazon S3 buckets that a user can provision?
By default, customers can provision up to 100 buckets per AWS account. However, you can increase your Amazon S3 bucket limit by visiting AWS Service Limits.

What's a US Standard region?
The US Standard Region has been renamed to the US East (Northern Virginia) Region to be consistent with AWS regional naming conventions.

Does the price depend on the region I choose?
Yes. AWS charges less where its costs are lower. For example, costs are lower in the US East (Northern Virginia) region than in the US West (Northern California) region.

How much does Amazon S3 cost?
With Amazon S3, you pay only for what you use. There is no minimum fee.
There is no Data Transfer charge for data transferred within an Amazon S3 Region via a COPY request. Data transferred via a COPY request between Regions is charged at rates specified on the pricing section of the Amazon S3 detail page.

Can I delete a bucket?
Only if it's empty. Once deleted, the name becomes available for anyone to reuse.

How is billing done?
Billing is based on:
  • 'Byte-Hours' (converted to 'GB-Months'), i.e. the number of bytes stored multiplied by the number of hours they are stored.
  • The number of GET / PUT / POST requests.

If I use S3 versioning, will I be charged for both versions?
Yes.


Example
1) Day 1 of the month: You perform a PUT of 4 GB (4,294,967,296 bytes) on your bucket.
2) Day 16 of the month: You perform a PUT of 5 GB (5,368,709,120 bytes) within the same bucket using the same key as the original PUT on Day 1.
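
The storage usage for this example can be worked out as follows. This is a small sketch that assumes versioning is enabled (so the Day 1 object is kept after the Day 16 PUT) and that the month has 31 days. The helper name `byte_hours` is hypothetical.

```python
GB = 1024 ** 3
HOURS_PER_DAY = 24
DAYS_IN_MONTH = 31  # assumption: a 31-day month


def byte_hours(size_bytes, days_stored):
    """Bytes stored multiplied by the number of hours they are stored."""
    return size_bytes * days_stored * HOURS_PER_DAY


# The 4 GB version is stored for all 31 days; the 5 GB version is stored
# from Day 16 through Day 31, i.e. 16 days.
total_byte_hours = byte_hours(4 * GB, 31) + byte_hours(5 * GB, 16)
gb_months = total_byte_hours / GB / (DAYS_IN_MONTH * HOURS_PER_DAY)

print(total_byte_hours)     # 5257039970304
print(round(gb_months, 3))  # 6.581
```

Both versions contribute to the bill: about 6.581 GB-Months in total, rather than the 5 GB of the latest version alone.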


Note
  1. There is also the option of a Requester Pays bucket configuration.
  2. Once versioning is enabled, it cannot be disabled; it can only be suspended.
  3. If an object's version ID is null, versioning was not enabled when the object was stored.
  4. Cross-Region Replication can be enabled, but you need to enable versioning first.
 

What is S3 Standard - Infrequent Access?
Amazon S3 Standard - Infrequent Access (Standard - IA) is an Amazon S3 storage class for data that is accessed less frequently, but requires rapid access when needed.

S3 Standard - Infrequent Access provides the same performance as S3 Standard storage.


Standard - IA is designed for long-lived, but infrequently accessed data that is retained for months or years. Data that is deleted from Standard - IA within 30 days will be charged for a full 30 days.
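
The 30-day minimum can be expressed as a one-line rule. The sketch below only computes the number of days that are billed. It deliberately leaves out prices, and the function name `billable_days` is hypothetical.

```python
MIN_STORAGE_DAYS = 30  # Standard - IA minimum storage duration from the text above


def billable_days(days_stored):
    """Days charged for a Standard - IA object that was stored for `days_stored` days."""
    return max(days_stored, MIN_STORAGE_DAYS)


print(billable_days(10))  # 30: deleted early, still charged for the full 30 days
print(billable_days(45))  # 45
```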

Q. What charges will I incur if I change storage class of an object from Standard-IA to Standard with a copy request?
You will incur charges for a Standard-IA copy request and a Standard-IA data retrieval.

S3 : Glacier
  • Cost is $0.004 per gigabyte per month. Lifecycle transition requests into Amazon Glacier cost $0.05 per 1,000 requests
  • Use lifecycle (archival) rules to move objects into Glacier.
  • You can retrieve 10 GB of your Amazon Glacier data per month for free
  • Amazon Glacier is designed for use cases where data is retained for months, years, or decades.  
  • Deleting data that is archived to Amazon Glacier is free if the objects being deleted have been archived in Amazon Glacier for three months or longer. If an object archived in Amazon Glacier is deleted or overwritten within three months of being archived, there will be an early deletion fee.
What are Amazon S3 event notifications?
Amazon S3 event notifications can be sent in response to actions in Amazon S3 like PUTs, POSTs, COPYs, or DELETEs. Notification messages can be sent through either Amazon SNS, Amazon SQS, or directly to AWS Lambda.
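
To illustrate what a consumer of these notifications does, here is a minimal sketch of a Lambda-style handler that extracts the event name, bucket, and key from a notification message. The handler name and the sample event are hypothetical, and the sample contains only the fields the sketch reads, not a full notification.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Return (event name, bucket, key) for each record in an S3 notification."""
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in the notification message.
        key = unquote_plus(s3["object"]["key"])
        results.append((record["eventName"], s3["bucket"]["name"], key))
    return results


# Trimmed-down, hypothetical notification for a PUT of "photos/my cat.jpg".
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "photos/my+cat.jpg", "size": 1024},
            },
        }
    ]
}
print(handler(sample_event))
# [('ObjectCreated:Put', 'example-bucket', 'photos/my cat.jpg')]
```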

There are no additional charges from Amazon S3 for event notifications. You pay only for use of Amazon SNS or Amazon SQS to deliver event notifications, or for the cost of running the AWS Lambda function.

Can I use S3 to host static websites?
  • Yes.
  • It also supports website redirects.
  • There is no additional charge for hosting static websites on Amazon S3. The same pricing dimensions of storage, requests, and data transfer apply to your website objects.



What data consistency model does Amazon S3 employ?
Amazon S3 buckets in all Regions provide
  1. Read-after-write consistency for PUTs of new objects.
  2. Eventual consistency for overwrite PUTs and DELETEs.
What does this mean?


     When you PUT a new object, it is immediately available to read.
     When you update or delete an existing object, you get eventual consistency: the update or delete may not be visible immediately, because S3 replicates data in the backend and it takes time for the change to reach every replica.
 

S3 & BitTorrent
Any publicly available data in Amazon S3 can be downloaded via the BitTorrent protocol. Simply add the ?torrent parameter at the end of your GET request in the REST API.

What checksums does Amazon S3 employ to detect data corruption?

Amazon S3 uses a combination of Content-MD5 checksums and cyclic redundancy checks (CRCs) to detect data corruption. Amazon S3 performs these checksums on data at rest and repairs any corruption using redundant data. In addition, the service calculates checksums on all network traffic to detect corruption of data packets when storing or retrieving data.
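
On the client side, a Content-MD5 value is the base64 encoding of the 128-bit MD5 digest of the request body. The sketch below computes it with the standard library. The function name `content_md5` is hypothetical, and sending the header with an actual request is left out.

```python
import base64
import hashlib


def content_md5(data: bytes) -> str:
    """Base64-encoded MD5 digest of `data`, the format used by Content-MD5."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


print(content_md5(b""))  # 1B2M2Y8AsgTpgAmY7PhCfg==
```

If the value sent with an upload does not match what the service computes over the received bytes, the data was corrupted in transit and the upload can be retried.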


How does S3 achieve eleven 9s of durability?
When processing a request to store data, the service will redundantly store your object across multiple facilities before returning SUCCESS. Amazon S3 also regularly verifies the integrity of your data using checksums.


Can I setup a trash, recycle bin, or rollback window on my Amazon S3 objects to recover from deletes and overwrites?

You can use Lifecycle rules along with Versioning to implement a rollback window for your Amazon S3 objects. For example, with your versioning-enabled bucket, you can set up a rule that archives all of your previous versions to the lower-cost Glacier storage class and deletes them after 100 days, giving you a 100 day window to roll back any changes on your data while lowering your storage costs.
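
A rule like the one described above could be written as the following lifecycle configuration. This is a sketch that only builds and prints the configuration. The helper name, the rule ID, and the one-day delay before archiving are assumptions. The resulting dictionary follows the shape accepted by boto3's `put_bucket_lifecycle_configuration` as its `LifecycleConfiguration` argument, which is not called here.

```python
import json


def rollback_window_rule(window_days=100, archive_after_days=1):
    """Lifecycle configuration that archives, then expires, previous versions."""
    return {
        "Rules": [
            {
                "ID": f"rollback-window-{window_days}d",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to the whole bucket
                # Move previous (noncurrent) versions to the Glacier storage class.
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": archive_after_days, "StorageClass": "GLACIER"}
                ],
                # Permanently delete previous versions once the window has passed.
                "NoncurrentVersionExpiration": {"NoncurrentDays": window_days},
            }
        ]
    }


config = rollback_window_rule()
print(json.dumps(config, indent=2))
```

The rule only affects noncurrent versions, so it has an effect only on a bucket where versioning is enabled.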

Does Amazon S3 support data access auditing?
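Yes. You can optionally configure a bucket to create access log records for all requests made against it. These records contain details about the request, such as the request type, the resources specified in the request, and the time and date the request was processed.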
Object tags

S3 Object Tags are key-value pairs applied to S3 objects which can be created, updated or deleted at any time during the lifetime of the object. With these, you can create Identity and Access Management (IAM) policies, set up S3 Lifecycle policies, and customize storage metrics.

Up to ten tags can be added to each S3 object and you can use either the AWS Management Console, the REST API, the AWS CLI, or the AWS SDKs to add object tags.

Object Tags can be replicated across regions using Cross-Region Replication.

Object Tags are priced at $0.01 per 10,000 tags per month.

Q. What is S3 Analytics – Storage Class Analysis?
With storage class analysis, you can analyze storage access patterns and transition the right data to the right storage class. This new S3 Analytics feature automatically identifies infrequent access patterns to help you transition storage to Standard-IA. You can configure a storage class analysis policy to monitor an entire bucket, a prefix, or object tag. Once an infrequent access pattern is observed, you can easily create a new lifecycle age policy based on the results.
Storage Class Analysis is updated on a daily basis on the S3 Management Console.

S3 Inventory
S3 Inventory provides a CSV (Comma Separated Values) flat-file output of your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or a shared prefix.

S3 Inventory can be used as a ready-made input into a big data job or workflow application instead of the synchronous S3 LIST API, saving the time and compute resources it takes to call and process the LIST API response.
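
As an example of using the inventory output as job input, the sketch below sums object sizes per top-level prefix from inventory-style CSV text. The column list (`Bucket`, `Key`, `Size`) is an assumption for illustration. In a real report the columns depend on the fields chosen when the inventory is configured. The function name and the sample rows are hypothetical.

```python
import csv
import io

# Assumed column order for this sketch; the CSV rows themselves carry no header.
SCHEMA = ["Bucket", "Key", "Size"]


def total_size_by_prefix(csv_text, delimiter="/"):
    """Sum object sizes in bytes, grouped by the first segment of each key."""
    totals = {}
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=SCHEMA)
    for row in reader:
        top_level = row["Key"].split(delimiter, 1)[0]
        totals[top_level] = totals.get(top_level, 0) + int(row["Size"])
    return totals


# Hypothetical inventory rows.
sample = (
    '"example-bucket","photos/a.jpg","2048"\n'
    '"example-bucket","photos/b.jpg","1024"\n'
    '"example-bucket","readme.txt","10"\n'
)
print(total_size_by_prefix(sample))  # {'photos': 3072, 'readme.txt': 10}
```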

How do I get started with S3 CloudWatch Metrics?
You can use the AWS Management Console to enable the generation of 1-minute CloudWatch metrics for your S3 bucket or configure filters for the metrics using a prefix or object tag.


You can use CloudWatch to set thresholds on any of the storage metrics counts, timers, or rates and fire an action when the threshold is breached.


