The BucketClass CRD represents a structure that defines bucket policies relating to data placement, namespace properties, replication policies and more.
Note that placement-bucketclass and namespace-bucketclass both use the same CR, and the difference lies inside the bucket class' spec section, more specifically the presence of either the placementPolicy or namespacePolicy key.
A placement bucket class defines a policy for standard buckets - i.e. NooBaa buckets that are backed by backingstores. The data placement capabilities are built as a multi-layer structure. The layers are, bottom-up:
- Spread Layer - list of backing-stores, aggregates the storage of multiple stores.
- Mirroring Layer - list of spread-layers, async-mirroring to all mirrors, with locality optimization (will allocate on the closest region to the source endpoint). Mirroring requires at least two backing-stores.
- Tiering Layer - list of mirroring-layers, push cold data to next tier.
A namespace bucket class defines a policy for namespace buckets - i.e. NooBaa buckets that are backed by namespacestores. There are several types of namespace policies:
- Single - a single namespace store is used for both read and write operations on the target bucket
- Multi - a single namespace store is used for write operations, and a list of namespace stores can be used for read operations
- Cache - functions similarly to Single, except with an additional TTL key, which dictates the time-to-live of the cached data
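As an illustration of how the three policy types differ, the following Python sketch maps a namespace policy to its write target and read targets. The function name `resolve_targets` is hypothetical, and the dictionary layout simply mirrors the namespacePolicy fields used in the YAML examples later in this document; it is not NooBaa's implementation. Treating the hub resource of a Cache policy as both the read and the write target is an assumption based on the statement that Cache functions similarly to Single.

```python
def resolve_targets(namespace_policy: dict) -> tuple:
    """Return (write_target, read_targets) for a namespacePolicy dict.

    Hypothetical helper for illustration only.
    """
    policy_type = namespace_policy["type"]  # case sensitive: Single, Multi or Cache
    if policy_type == "Single":
        # One namespace store serves both reads and writes.
        resource = namespace_policy["single"]["resource"]
        return resource, [resource]
    if policy_type == "Multi":
        # One store for writes, a list of stores for reads.
        multi = namespace_policy["multi"]
        return multi["writeResource"], list(multi["readResources"])
    if policy_type == "Cache":
        # Assumption: the hub resource behaves like the Single resource.
        hub = namespace_policy["cache"]["hubResource"]
        return hub, [hub]
    raise ValueError(f"unknown namespace policy type: {policy_type!r}")
```

For example, a Multi policy with writeResource aws-s3-ns and readResources aws-s3-ns and azure-blob-ns resolves to one write target and two read targets.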
Cache bucketclasses work by saving read objects in a chosen backingstore, which leads to faster access times in the future. In order to make sure that the cached object is not out of sync with the one in the remote target, an ETag comparison might be run upon read, depending on the TTL that the user chooses. The TTL can fall in one of three categories:
- Negative (e.g. -1) - when the user knows there are no out of band writes, they can use a negative TTL, which means no revalidations are done; if the object is in the cache, it is returned without an ETag comparison. This is the most performant option.
- Zero (0) - the cache will always compare the object's ETag before returning it. This option has a performance cost of getting the ETag from the remote target on each object read. This is the least performant option.
- Positive (denoted in milliseconds, e.g. 3600000 equals one hour) - once an object has been read and saved in the cache, the chosen amount of time must pass before the object's ETag is compared again.
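The three TTL categories can be summarized as a small decision function. This is a minimal Python sketch, not NooBaa's actual code: the name `needs_etag_check` and the `age_ms` parameter (time since the object was last validated) are hypothetical, and whether the comparison happens exactly at the TTL boundary or just after it is an assumption.

```python
def needs_etag_check(ttl_ms: int, age_ms: int) -> bool:
    """Decide whether a cached object should be revalidated on read.

    ttl_ms: the bucket class TTL, in milliseconds.
    age_ms: hypothetical input, milliseconds since the last validation.
    """
    if ttl_ms < 0:
        # Negative TTL: no revalidation, the cached copy is always returned.
        return False
    if ttl_ms == 0:
        # Zero TTL: the ETag is compared on every read.
        return True
    # Positive TTL: compare again only once the TTL has elapsed
    # (the inclusive boundary is an assumption).
    return age_ms >= ttl_ms
```

With a TTL of 3600000, an object validated ten minutes ago (600000 ms) is served from the cache without a comparison, while one validated two hours ago is revalidated first.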
It is possible to set a bucketclass-wide replication policy, that will be inherited and used by all future buckets created under that bucketclass.
A replication policy is a JSON-compliant string which defines an array of rules:
- Each rule is an object containing a rule_id, a destination bucket, and an optional filter key that contains a prefix field.
- When a filter with a prefix is provided, only object keys that match the prefix will be replicated.
Replication policy:
A bucket class defines a replication policy for all future NooBaa buckets that use it. The policy is a JSON-compliant array of rules (examples are provided at the bottom of this section).
- Each rule is an object that contains the following keys:
  - rule_id - which identifies the rule
  - destination_bucket - which dictates the target NooBaa buckets that the objects will be copied to
  - (optional) {"filter": {"prefix": <>}} - if the user wishes to filter the objects that are replicated, the value of this field can be set to a prefix string
  - (optional, log-based optimization, see below) sync_deletions - can be set to a boolean value to indicate whether deletions should be replicated
  - (optional, log-based optimization, see below) sync_versions - can be set to a boolean value to indicate whether object versions should be replicated
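To make the rule semantics concrete, here is a short Python sketch that parses a policy and works out which destination buckets an object key would be replicated to. The helper names, the second rule (rule-2) and its bucket (second.bucket) are hypothetical and exist only for illustration; treating a rule without a filter as matching every key is an assumption drawn from the filter being optional.

```python
import json

# Two illustrative rules: one filtered by prefix, one unfiltered (hypothetical).
POLICY_JSON = """
[
  {"rule_id": "rule-1", "destination_bucket": "first.bucket", "filter": {"prefix": "ba"}},
  {"rule_id": "rule-2", "destination_bucket": "second.bucket"}
]
"""

def rule_matches(rule: dict, key: str) -> bool:
    """A rule without a filter is assumed to match every object key."""
    prefix = rule.get("filter", {}).get("prefix", "")
    return key.startswith(prefix)

def destinations_for(policy: list, key: str) -> list:
    """Return the destination buckets whose rules match the given key."""
    return [rule["destination_bucket"] for rule in policy if rule_matches(rule, key)]

policy = json.loads(POLICY_JSON)
```

Under these assumptions, `destinations_for(policy, "banana")` yields both buckets, whereas a key such as "apple" only matches the unfiltered rule.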
In addition, when the bucketclass is backed by namespacestores, each policy can be set to optimize replication by utilizing logs (configured and supplied by the user, currently only supports AWS S3 and Azure Blob):
- (optional) log_replication_info - an object that contains data related to log-based replication optimization:
  - (optional on AWS) endpoint_type - this field can be set to an appropriate endpoint type (currently, only AZURE is supported)
  - (necessary on AWS) {"logs_location": {"logs_bucket": <>}} - this field should be set to the location of the AWS S3 server access logs
  - (optional on AWS)
An example of an AWS replication policy with log optimization:
'{"rules":[{"rule_id":"aws-rule-1", "destination_bucket":"first.bucket", "filter": {"prefix": "a."}}], "log_replication_info": {"logs_location": {"logs_bucket": "logsarehere"}}}'
An example of an Azure replication policy with log optimization:
'{"rules":[{"rule_id":"azure-rule-1", "sync_deletions": true, "sync_versions": false, "destination_bucket":"first.bucket"}], "log_replication_info": {"endpoint_type": "AZURE"}}'
These policies can also be saved as files and passed to the NooBaa CLI. In that case, please note it's necessary to omit the outer single quotes.
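As a sketch of the file-based approach, the Python snippet below serializes the AWS example policy to a JSON file. The helper name `write_policy_file` is hypothetical. Because the file is produced by a JSON serializer, it contains only the JSON document itself, without the outer single quotes that shell quoting requires on the command line.

```python
import json

# The AWS example policy from above, expressed as a Python dictionary.
aws_policy = {
    "rules": [
        {
            "rule_id": "aws-rule-1",
            "destination_bucket": "first.bucket",
            "filter": {"prefix": "a."},
        }
    ],
    "log_replication_info": {"logs_location": {"logs_bucket": "logsarehere"}},
}

def write_policy_file(policy: dict, path: str) -> None:
    """Write the policy as plain JSON (no surrounding quotes) to path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(policy, handle, indent=2)
```

The resulting file can then be referenced with the --replication-policy flag, as in the replication example near the end of this document.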
- A backing store name may appear in more than one bucket class but may not appear more than once in a certain bucket class.
- The operator CLI currently only supports a single tier placement policy for a bucket class.
- Thus, YAML must be used to create a bucket class with a placement policy that has multiple tiers.
- Before creating standard buckets, the user must first create a placement bucketclass which contains a placement policy.
- Before creating namespace buckets, the user must first create a namespace bucketclass which contains a namespace policy.
- A namespace bucket class of type cache must contain both a placement and a namespace policy.
- A namespace bucket class of type single/multi must contain a namespace policy.
- Placement policy is case sensitive and should be of value Mirror or Spread when more than one backingstore is provided.
- Namespace policy is case sensitive and should be of values Single, Multi or Cache.
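Several of the constraints above can be checked mechanically before a bucket class is submitted. The following Python sketch does so for a spec represented as a dictionary; the function `validate_spec` and its error messages are hypothetical, it covers only the constraints listed here, and it does not reproduce the operator's own validation.

```python
VALID_PLACEMENTS = {"Mirror", "Spread"}            # case sensitive
VALID_NAMESPACE_TYPES = {"Single", "Multi", "Cache"}  # case sensitive

def validate_spec(spec: dict) -> list:
    """Return a list of constraint violations (empty if none were found)."""
    errors = []
    placement = spec.get("placementPolicy")
    namespace = spec.get("namespacePolicy")
    if placement is None and namespace is None:
        errors.append("spec needs a placementPolicy or a namespacePolicy")
    if placement is not None:
        seen = set()
        for index, tier in enumerate(placement.get("tiers", [])):
            stores = tier.get("backingStores", [])
            for store in stores:
                # A backing store may not repeat within one bucket class.
                if store in seen:
                    errors.append(f"backing store {store!r} appears more than once")
                seen.add(store)
            # Placement is only required when a tier has several stores.
            if len(stores) > 1 and tier.get("placement") not in VALID_PLACEMENTS:
                errors.append(f"tier {index}: placement must be Mirror or Spread")
    if namespace is not None:
        policy_type = namespace.get("type")
        if policy_type not in VALID_NAMESPACE_TYPES:
            errors.append("namespace policy type must be Single, Multi or Cache")
        elif policy_type == "Cache" and placement is None:
            # Cache bucket classes need both policies.
            errors.append("a Cache bucket class also needs a placementPolicy")
    return errors
```

A spec with two backing stores and placement set to lowercase "mirror" would be reported as invalid, since the value is case sensitive.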
- The operator will verify that the bucket class is valid - i.e. that the backingstores and namespacestores exist and can be accessed and used.
- Changes to a bucket class spec will be propagated to buckets that were instantiated from it.
- Other than that, the bucket class is passive: it simply waits for new buckets to use it.
It is possible to check a resource's status in several ways, including:
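- kubectl get bucketclass <NAME> -n <NAMESPACE> -o yaml (kubectl cannot retrieve a resource by name across all namespaces, so -A only works when <NAME> is omitted, in which case it lists bucketclasses from all cluster namespaces)
- kubectl describe bucketclass <NAME>
- noobaa bucketclass status <NAME>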
Below is an example of a healthy bucket class' status, as retrieved with the first command:
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: noobaa-default-class
  namespace: app-namespace
spec:
  ...
status:
  conditions:
  - lastHeartbeatTime: "2019-11-05T13:50:50Z"
    lastTransitionTime: "2019-11-07T07:03:58Z"
    message: noobaa operator completed reconcile - bucket class is ready
    reason: BucketClassPhaseReady
    status: "True"
    type: Available
  - lastHeartbeatTime: "2019-11-05T13:50:50Z"
    lastTransitionTime: "2019-11-07T07:03:58Z"
    message: noobaa operator completed reconcile - bucket class is ready
    reason: BucketClassPhaseReady
    status: "False"
    type: Progressing
  - lastHeartbeatTime: "2019-11-05T13:50:50Z"
    lastTransitionTime: "2019-11-05T13:50:50Z"
    message: noobaa operator completed reconcile - bucket class is ready
    reason: BucketClassPhaseReady
    status: "False"
    type: Degraded
  - lastHeartbeatTime: "2019-11-05T13:50:50Z"
    lastTransitionTime: "2019-11-07T07:03:58Z"
    message: noobaa operator completed reconcile - bucket class is ready
    reason: BucketClassPhaseReady
    status: "True"
    type: Upgradeable
  phase: Ready
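A status like the one above can also be inspected programmatically. The Python sketch below assumes the bucket class has already been loaded into a dictionary (for instance from JSON output of kubectl); the function `is_healthy` is hypothetical, and its definition of healthy (phase Ready, Available "True", Degraded "False") is an assumption derived from this example rather than an official rule.

```python
def is_healthy(bucketclass: dict) -> bool:
    """Return True if the status matches the healthy example above."""
    status = bucketclass.get("status", {})
    # Index the conditions by type; condition statuses are strings, not booleans.
    conditions = {c["type"]: c["status"] for c in status.get("conditions", [])}
    return (
        status.get("phase") == "Ready"
        and conditions.get("Available") == "True"
        and conditions.get("Degraded") == "False"
    )
```

Note that the condition values are compared as the strings "True" and "False", which is how they appear in the status.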
Please note that CLI (noobaa) examples need NooBaa to run under app-namespace, despite the fact that bucketclasses are supported in all namespaces.
Single tier, single backing store, Spread placement:
noobaa -n app-namespace bucketclass create placement-bucketclass bc --backingstores bs --placement Spread
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - bs
      placement: Spread
Single tier, two backing stores, Spread placement:
noobaa -n app-namespace bucketclass create placement-bucketclass bc --backingstores bs1,bs2 --placement Spread
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - bs1
      - bs2
      placement: Spread
Single tier, two backing stores, Mirror placement:
noobaa -n app-namespace bucketclass create placement-bucketclass bc --backingstores bs1,bs2 --placement Mirror
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - bs1
      - bs2
      placement: Mirror
Two tiers (only achievable by applying a YAML at the moment) - single backing stores per tier, Spread placement in tiers:
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - bs1
      placement: Spread
    - backingStores:
      - bs2
      placement: Spread
Two tiers (only achievable by applying a YAML at the moment) - two backing stores per tier, Spread placement in first tier and Mirror in second tier:
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - bs1
      - bs2
      placement: Spread
    - backingStores:
      - bs3
      - bs4
      placement: Mirror
Namespace bucketclass, a single read and write resource in Azure:
noobaa -n app-namespace bucketclass create namespace-bucketclass single bc --resource azure-blob-ns
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  namespacePolicy:
    type: Single
    single:
      resource: azure-blob-ns
Namespace bucketclass, a single write resource in AWS, multiple read resources in AWS and Azure:
noobaa -n app-namespace bucketclass create namespace-bucketclass multi bc --write-resource aws-s3-ns --read-resources aws-s3-ns,azure-blob-ns
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  namespacePolicy:
    type: Multi
    multi:
      writeResource: aws-s3-ns
      readResources:
      - aws-s3-ns
      - azure-blob-ns
Namespace bucketclass, cache stored in noobaa-default-backing-store, objects are read from and written to IBM COS:
noobaa -n app-namespace bucketclass create namespace-bucketclass cache bc --hub-resource ibm-cos-ns --ttl 36000 --backingstores noobaa-default-backing-store
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  namespacePolicy:
    type: Cache
    cache:
      caching:
        ttl: 36000
      hubResource: ibm-cos-ns
  placementPolicy:
    tiers:
    - backingStores:
      - noobaa-default-backing-store
Namespace bucketclass with replication to first.bucket:
/path/to/json-file.json is the path to a JSON file which defines the replication policy
noobaa -n app-namespace bucketclass create namespace-bucketclass single bc --resource azure-blob-ns --replication-policy=/path/to/json-file.json
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  namespacePolicy:
    type: Single
    single:
      resource: azure-blob-ns
  replicationPolicy: [{ "rule_id": "rule-1", "destination_bucket": "first.bucket", "filter": {"prefix": "ba"}}]
Bucket class in a namespace other than the NooBaa system namespace. <TARGET-NOOBAA-SYSTEM-NAMESPACE> is the namespace where the NooBaa system is deployed:
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  labels:
    noobaa-operator: <TARGET-NOOBAA-SYSTEM-NAMESPACE>
    app: noobaa
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - noobaa-test-backing-store