Deleted Clusters

| Metric | Target |
| --- | --- |
| RPO | 1.5 hours. Backups happen every hour and can take tens of minutes to complete. |
| RTO | 4 hours |
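
The RPO figure can be read as the backup interval plus the time a backup needs to finish: if a failure hits just before a running backup completes, the last usable backup is the previous one. A minimal sketch of that arithmetic, assuming a 60-minute interval and an illustrative 30-minute backup duration (the text above only says "tens of minutes"; the function name is hypothetical):

```python
from datetime import timedelta


def worst_case_data_loss(interval: timedelta, backup_duration: timedelta) -> timedelta:
    """Worst case: the failure happens just before the next backup completes,
    so everything written since the previous backup started is lost."""
    return interval + backup_duration


# Assumed values: hourly backups, 30 minutes per backup (illustrative only).
rpo = worst_case_data_loss(timedelta(hours=1), timedelta(minutes=30))
print(rpo)  # 1:30:00
```

A longer backup duration pushes the worst case past the 1.5-hour target.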

Accidental cluster deletion via API

Either a Crate superuser or the customer has accidentally deleted a cluster that should not have been deleted. Since the cluster was removed completely (including the backups), a special procedure needs to be followed to perform the restore.

Support article: https://github.com/crate/support/issues/283.

Accidental cluster deletion in Kubernetes

A cluster was deleted directly in Kubernetes. This differs from a deletion via the API: the cluster is still present in brain and the backups have not been removed from S3.

If the CrateDB CRD has not been lost:

Support article: https://github.com/crate/support/issues/271.

If the Project/CrateDB CRD has been lost:

We regularly back up projects and cratedbs on all of our production clusters. The manifests, and partially also the secrets, are committed to the crate GitHub repository holzhammer. holzhammer also backs up some of the cloud namespaces, e.g. cloud-app and templates.

As soon as crate-operator, k8s-operator, and replicator are running, the projects/cratedbs can be applied for the purpose of disaster recovery. Mileage varies depending on the D/R scenario you are facing.
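
As a rough illustration of re-applying backed-up manifests, the sketch below orders them by their `kind` field and prints the matching `kubectl apply -f` commands without running them. Applying Project resources before CrateDB resources is an assumption, not something the procedure above states. The file names and helper functions are hypothetical.

```python
import re
from typing import Dict, List

# Assumed ordering (not confirmed by the procedure): Projects first, then CrateDBs.
KIND_ORDER = {"Project": 0, "CrateDB": 1}


def manifest_kind(text: str) -> str:
    """Return the value of the top-level `kind:` field, or "" if absent."""
    match = re.search(r"^kind:\s*(\S+)", text, flags=re.MULTILINE)
    return match.group(1) if match else ""


def apply_commands(manifests: Dict[str, str]) -> List[str]:
    """Build `kubectl apply` commands, sorted by assumed kind order, then path."""
    ordered = sorted(
        manifests,
        key=lambda path: (KIND_ORDER.get(manifest_kind(manifests[path]), 99), path),
    )
    return [f"kubectl apply -f {path}" for path in ordered]


# Hypothetical file names and minimal manifest bodies, for illustration only.
example = {
    "cratedb-example.yaml": "apiVersion: cloud.crate.io/v1\nkind: CrateDB\n",
    "project-example.yaml": "apiVersion: cloud.crate.io/v1\nkind: Project\n",
}
for command in apply_commands(example):
    print(command)
```

Printing the commands instead of executing them leaves room to review the order against the actual D/R scenario before anything touches the cluster.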

Support article: https://github.com/crate/support/issues/285.

A special note about brain

brain is a special case because of a chicken-and-egg problem. The information about this cluster is held in brain itself, so the CrateDB CRD for it cannot easily be re-created using the instructions above.

To avoid a headache, the brain CRD is listed below. It is also saved with the current settings at https://github.com/crate/holzhammer/blob/master/manifests/k8s.aks1.westeurope.azure/2cba0c6b-c390-443b-b007-9137a55018b6/cratedb-3b511b64-07c5-4926-86ee-99fbcf4eda15.yaml:

apiVersion: cloud.crate.io/v1
kind: CrateDB
metadata:
  labels:
    cloud.crate.io/channel: stable
    cloud.crate.io/organization-id: 21095b48-a276-46a7-95fd-64479f4d409d
    cloud.crate.io/organization-name: crate-io
    cloud.crate.io/project-id: 2cba0c6b-c390-443b-b007-9137a55018b6
    cloud.crate.io/project-name: brain-production
  name: 3b511b64-07c5-4926-86ee-99fbcf4eda15
  namespace: 2cba0c6b-c390-443b-b007-9137a55018b6
spec:
  backups:
    aws:
      accessKeyId:
        secretKeyRef:
          key: access-key-id
          name: aws-backup-credentials
      basePath: 2cba0c6b-c390-443b-b007-9137a55018b6
      bucket:
        secretKeyRef:
          key: bucket
          name: aws-backup-credentials
      cron: 17 * * * *
      region:
        secretKeyRef:
          key: region
          name: aws-backup-credentials
      secretAccessKey:
        secretKeyRef:
          key: secret-access-key
          name: aws-backup-credentials
  cluster:
    externalDNS: brainprod.aks1.westeurope.azure.cratedb.net.
    imageRegistry: crate
    license:
      secretKeyRef:
        key: license-key
        name: cratedb-license
    name: brainprod
    settings:
      cluster.routing.allocation.awareness.attributes: zone
      cluster.routing.allocation.awareness.force.zone.values: 1,2,3
    ssl:
      keystore:
        secretKeyRef:
          key: keystore.jks
          name: keystore-letsencrypt-3b511b64-07c5-4926-86ee-99fbcf4eda15
      keystoreKeyPassword:
        secretKeyRef:
          key: keystore-key-password
          name: keystore-passwords
      keystorePassword:
        secretKeyRef:
          key: keystore-password
          name: keystore-passwords
    version: 4.6.4
  nodes:
    data:
    - name: hot
      replicas: 3
      resources:
        cpus: 2
        disk:
          count: 1
          size: "237000000000"
          storageClass: crate-premium
        heapRatio: 0.25
        memory: "16106127360"
  users:
  - name: prod
    password:
      secretKeyRef:
        key: password
        name: user-password-3b511b64-07c5-4926-86ee-99fbcf4eda15-0
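
The manifest above references several Kubernetes secrets through `secretKeyRef` entries, and these presumably need to exist in the namespace before the cluster can come up. A minimal sketch that lists the referenced secret names from a manifest's text, using only the standard library (the helper name is hypothetical and the sample is a shortened excerpt of the manifest above):

```python
import re


def referenced_secrets(manifest: str) -> set:
    """Collect the `name:` values found under every `secretKeyRef:` block."""
    names = set()
    lines = manifest.splitlines()
    for index, line in enumerate(lines):
        if line.strip() != "secretKeyRef:":
            continue
        indent = len(line) - len(line.lstrip())
        # Scan the more deeply indented lines that belong to this block.
        for child in lines[index + 1:]:
            if child.strip() and len(child) - len(child.lstrip()) <= indent:
                break
            match = re.match(r"\s*name:\s*(\S+)", child)
            if match:
                names.add(match.group(1))
    return names


# Shortened excerpt of the CrateDB manifest, for illustration.
sample = """\
spec:
  backups:
    aws:
      accessKeyId:
        secretKeyRef:
          key: access-key-id
          name: aws-backup-credentials
  cluster:
    license:
      secretKeyRef:
        key: license-key
        name: cratedb-license
"""
print(sorted(referenced_secrets(sample)))
```

Run against the full manifest, this approach should surface five distinct secrets: the backup credentials, the license, the Let's Encrypt keystore, the keystore passwords, and the user password.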

And the project:

apiVersion: cloud.crate.io/v1
kind: Project
metadata:
  name: 2cba0c6b-c390-443b-b007-9137a55018b6
spec:
  credentials:
    aws:
      accessKeyId: [CREATE A NEW ONE]
      bucket: cratedb-backup-westeurope-azure
      region: eu-west-1
      secretAccessKey: [CREATE A NEW ONE]
  organization:
    id: 21095b48-a276-46a7-95fd-64479f4d409d
    name: crate-io
  project:
    id: 2cba0c6b-c390-443b-b007-9137a55018b6
    name: brain-production
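
The Project manifest above contains `[CREATE A NEW ONE]` placeholders that have to be replaced with fresh AWS credentials before it is applied. A small sketch of a pre-flight check that reports which fields still hold the placeholder (the helper name is hypothetical and the sample is an excerpt of the manifest above):

```python
PLACEHOLDER = "[CREATE A NEW ONE]"


def unfilled_fields(manifest: str) -> list:
    """Return the field names whose value is still the placeholder."""
    fields = []
    for line in manifest.splitlines():
        key, _, value = line.partition(":")
        if value.strip() == PLACEHOLDER:
            fields.append(key.strip())
    return fields


# Excerpt of the Project manifest, for illustration.
sample = """\
spec:
  credentials:
    aws:
      accessKeyId: [CREATE A NEW ONE]
      bucket: cratedb-backup-westeurope-azure
      region: eu-west-1
      secretAccessKey: [CREATE A NEW ONE]
"""
print(unfilled_fields(sample))  # ['accessKeyId', 'secretAccessKey']
```

An empty list means every placeholder has been replaced. It does not confirm that the credentials themselves are valid.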