Setting Up a Cluster

While running Zenko on a single machine is desirable for certain use cases, a clustered operating environment is required for high-availability deployments. If you can set up a Kubernetes cluster on your own, review the General Cluster Requirements and skip to Install Zenko. Otherwise, download the latest stable version of MetalK8s and follow its instructions to establish a working Kubernetes instance on your cluster.

Most of the complexity of installing Zenko on a cluster involves deploying the cluster itself. Scality supports MetalK8s, an open source Kubernetes engine optimized for the Zenko use case. The following section describes general cluster requirements that have been tested on MetalK8s. Because MetalK8s is designed to operate without support from public cloud resources, the following sizing requirements are assumed sufficient for hosted cloud Kubernetes clusters, where such resources are available on demand.

General Cluster Requirements

Setting up a cluster requires at least three machines (these can be VMs) running CentOS 7.4 or later. The recommended minimum for a high-availability Zenko production service is the Standard Architecture, a five-node cluster with three master/etcd nodes. The Compact Architecture, a three-node configuration, is also supported. The cluster must have an odd number of nodes to provide a quorum. You must have SSH access to these machines, and they must have SSH access to each other.
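The SSH requirement above covers two directions: from your workstation to every machine, and from every machine to every other machine. The following Python sketch only builds the list of commands that would verify both; it does not run them. The host names and the "centos" login are hypothetical placeholders, and the use of the OpenSSH BatchMode and ConnectTimeout options is an assumption about how you might want such a check to behave (fail fast instead of prompting for a password).

```python
from itertools import permutations
from typing import List


def ssh_check_commands(hosts: List[str], user: str = "centos") -> List[List[str]]:
    """Build argv lists that test operator-to-host and host-to-host SSH access.

    `user` and the host names are placeholders for illustration only.
    """
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    base = ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5"]
    commands = []

    # Direction 1: the operator's workstation must reach every machine.
    for host in hosts:
        commands.append(base + [f"{user}@{host}", "true"])

    # Direction 2: every machine must reach every other machine.
    for src, dst in permutations(hosts, 2):
        hop = ["ssh", "-o", "BatchMode=yes", f"{user}@{dst}", "true"]
        commands.append(base + [f"{user}@{src}"] + hop)

    return commands
```

Each returned list can be passed to subprocess.run; an exit status of 0 indicates that the connection succeeded. A three-machine cluster yields nine checks: three from the workstation and six between machines.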

Important

Three-server clusters can continue to operate without service disruption or data loss if one node fails. Five-server clusters can operate without disruption or loss if two nodes fail.

Each machine acting as a Kubernetes node must also have at least one disk available to provision storage volumes.

Once you have set up a cluster, you cannot change the size of the machines on it.
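The fault-tolerance figures for three-server and five-server clusters follow from majority-quorum arithmetic: a cluster keeps a quorum as long as more than half of its nodes are healthy. The following Python sketch works through that arithmetic. The function name is illustrative and is not part of Zenko or MetalK8s.

```python
def tolerated_failures(nodes: int) -> int:
    """Return how many nodes can fail while a strict majority remains."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    quorum = nodes // 2 + 1  # smallest strict majority
    return nodes - quorum


for size in (3, 4, 5):
    print(size, "nodes tolerate", tolerated_failures(size), "failure(s)")
```

A four-node cluster tolerates only one failure, the same as a three-node cluster, which is why an odd node count is required: adding a single node to an odd-sized cluster raises the quorum without raising the number of failures the cluster can survive.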

Service and Component Architecture

Zenko consists of the following stateful and stateless services.

Stateful Services

For the following stateful services, each node has a copy of the data. Though their terminology varies, each service employs the same strategy for maintaining availability. A primary service acts on data and transfers it to replica instances on the other nodes. If the service running as the primary fails (due to either an internal error or a node failure), the remaining replica services elect a new primary to continue operation. On a three-node cluster, no data is lost unless two nodes fail. On a five-node cluster, no data is lost unless three nodes fail.

If a replica node fails, the primary continues operation without interruption or an election.

The following stateful services conform to this failover strategy:

  • MongoDB
  • Redis
  • Kafka
  • ZooKeeper
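The failover strategy described above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not how MongoDB, Redis, Kafka, or ZooKeeper actually run elections: the class name, the node names, and the rule of promoting the first healthy replica are all hypothetical, and the model simply assumes that no primary can be elected once a majority of nodes is lost.

```python
from typing import List, Optional, Set


class ReplicaGroup:
    """Toy model of one stateful service replicated across cluster nodes."""

    def __init__(self, nodes: List[str]) -> None:
        self.nodes = list(nodes)
        self.failed: Set[str] = set()
        self.primary: Optional[str] = self.nodes[0]

    def healthy(self) -> List[str]:
        return [n for n in self.nodes if n not in self.failed]

    def has_majority(self) -> bool:
        return len(self.healthy()) > len(self.nodes) // 2

    def fail(self, node: str) -> bool:
        """Mark a node as failed. Return True if an election took place."""
        self.failed.add(node)
        if not self.has_majority():
            # Assumption of this model: without a majority, no primary.
            self.primary = None
            return False
        if node == self.primary:
            # Hypothetical election rule: promote the first healthy replica.
            self.primary = self.healthy()[0]
            return True
        # A replica failed; the primary continues without an election.
        return False
```

In this model, failing a replica on a three-node group leaves the primary in place, failing the primary triggers an election, and failing a second node leaves the group without a primary.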

Stateless Services

The following stateless services are based on a transactional model. If a service fails, Kubernetes automatically reschedules the process on an available node.

Lifecycle Services

  • Lifecycle Bucket Processor
  • Lifecycle Conductor
  • Lifecycle Object Processor
  • Garbage Collection (GC) Consumer

Replication Services

  • Replication Data Processor
  • Replication Populator
  • Replication Status Processor

APIs

  • CloudServer API
  • Backbeat API

Monitoring Services

  • Prometheus
  • Grafana

Out-of-Band Services

  • Ingestion Consumer
  • Ingestion Producer
  • Cosmos Operator
  • Cosmos Scheduler

Orbit Management Layer

  • CloudServer Manager