Kubernetes Pets and PetSets
Kubernetes is a container orchestration platform that runs and manages containers. It shifts the focus of container deployment from the machine to the application. This shift introduces a layer of abstraction and removes dependencies between the application and its physical deployment. Decoupling services from the details of low-level physical deployment enables better service management.
For anything to scale, you need to provide some kind of abstraction. We have seen this in networking, where separating the underlay from the overlay enables networks to support millions of tenants. Kubernetes allows the deployment of applications to a “sea of abstracted compute”, enabling a self-healing, orchestrated infrastructure. While this type of scaling and deployment has been useful for stateless services, it falls short in the stateful world.
Contrasting Application Models
As applications serve a growing user base around the globe, single-cluster and single-data-centre solutions no longer suffice. Clustered applications and federations enable workloads to spread across multiple locations and container clusters for improved efficiency and scale.
When you examine the different application types and scaling modes ( for example, scale-up and scale-out clustering ), contrasting deployment models emerge. There are substantial differences between deploying and running a single application and running applications that operate within a cluster. Different application architectures require different deployment solutions and network identities. For example, a database node requires persistent volumes, and a node within a cluster may take part in elections where identity is important.
PetSets Alpha Resource
Kubernetes has recently improved its stateful support with the introduction of a new object called a PetSet. PetSets are currently an alpha resource, available in Kubernetes release 1.3. A PetSet is an object that holds a group of Pets, that is, stateful applications that require stable hostnames, persistent disks, a structured lifecycle process and group identity.
PetSets are used for non-homogeneous sets of instances where each POD has a stable, distinguishable identity. A distinguishable identity is defined in terms of stable network and storage:
A ) Stable network identities such as DNS and hostname.
B ) Stable storage identity.
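To make the two identities above more concrete, here is a minimal Python sketch of how a Pet's identity could be derived from nothing more than the PetSet name and an ordinal. The naming patterns, the `pet_identity` helper, the `data` claim template and the `cluster.local` domain are assumptions made for illustration, not something taken from a specific Kubernetes release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PetIdentity:
    """Identity that stays the same across restart / reschedule."""
    hostname: str      # stable hostname, for example "web-0"
    dns_name: str      # stable DNS record for the Pet
    volume_claim: str  # name of the storage that belongs to this Pet


def pet_identity(petset: str, ordinal: int, service: str,
                 namespace: str = "default",
                 cluster_domain: str = "cluster.local",
                 claim_template: str = "data") -> PetIdentity:
    # Assumption: the hostname is "<petset>-<ordinal>".
    hostname = f"{petset}-{ordinal}"
    # Assumption: the DNS record lives under the service that fronts the set.
    dns_name = f"{hostname}.{service}.{namespace}.svc.{cluster_domain}"
    # Assumption: the claim name combines a template name with the hostname.
    volume_claim = f"{claim_template}-{hostname}"
    return PetIdentity(hostname, dns_name, volume_claim)


if __name__ == "__main__":
    for i in range(3):
        print(pet_identity("web", i, service="nginx"))
```

Because the identity is a pure function of the set name and the ordinal, calling the helper again after a reschedule yields exactly the same hostname, DNS name and claim. That is the property a Pet relies on.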
Before PetSets, stateful applications were supported but exceedingly difficult to deploy and manage, especially when it came to distributed stateful clusters. PODs had random names that could not be relied upon. Now, with the introduction of the PetSet controller and Pets, Kubernetes has sharpened its support for stateful and distributed stateful applications.
PODs and their shortcomings
Kubernetes enables the specification of applications as a POD file, expressed in either YAML or JSON format. The file specifies which containers are to be in a POD. PODs are the smallest deployment unit in Kubernetes and present a number of challenges for some stateful services.
They do not offer a singleton pattern and are ephemeral by design. PODs are mortal: they are born and they die, but they are never resurrected. When a POD dies, it is permanently gone and is replaced with a new instance and a fresh identity. This model of operation may suit some applications but falls short for others that want to retain identity and storage across restart / reschedule.
Replication Controllers and their shortcomings
If you want a POD to be resurrected, you need what’s known as a Replication Controller ( RC ). The Replication Controller enables a singleton pattern, with the ability to set the replica count to a specific number of PODs. The introduction of the RC was certainly a step in the right direction, as it makes sure the correct number of replicas is running at any given time. The RC works alongside services that sit in front of the PODs it manages, using labels to map inbound requests to certain PODs. Services provide a level of abstraction so that the application endpoint never changes.
RCs are good for application deployments that require weak, uncoupled identities, where the naming of individual PODs doesn’t matter to the application architecture. But they lack certain functionality that the new PetSet controller provides. You could say that a PetSet controller is an enhanced RC in shiny new clothes.
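The RC behaviour described above can be sketched as a small reconcile loop in Python. This is a toy model, not the real controller: the `reconcile` and `select` helpers, the dictionary layout of a POD and the five-character random suffix are all assumptions made for illustration.

```python
from __future__ import annotations

import random
import string


def random_suffix(rng: random.Random, length: int = 5) -> str:
    """Build the throwaway part of a POD name."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def reconcile(running: list[dict], desired: int, name: str,
              labels: dict, rng: random.Random) -> list[dict]:
    """Return a POD list with exactly `desired` entries.

    Surplus PODs are dropped; missing PODs are replaced by new ones that
    get a fresh random name (assumption: identity is never reused).
    """
    pods = list(running[:desired])
    while len(pods) < desired:
        pods.append({"name": f"{name}-{random_suffix(rng)}",
                     "labels": dict(labels)})
    return pods


def select(pods: list[dict], selector: dict) -> list[dict]:
    """Pick PODs whose labels match every key/value in the selector,
    the way a service maps requests to PODs by label rather than by name."""
    return [p for p in pods
            if all(p["labels"].get(k) == v for k, v in selector.items())]


if __name__ == "__main__":
    rng = random.Random()
    pods = reconcile([], 3, "web", {"app": "web"}, rng)
    survivors = pods[1:]                      # one POD dies
    healed = reconcile(survivors, 3, "web", {"app": "web"}, rng)
    print([p["name"] for p in healed])        # replica count is restored
    print(len(select(healed, {"app": "web"})))
```

The replica count is restored and the service still finds three PODs through the label, but the replacement carries a new name. For cattle that is fine; for an application that cares which member it is talking to, it is the gap PetSets aim to close.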
“Pets and Cattle”
The best way to understand PetSets is to view cloud infrastructure through the “Pets and Cattle” metaphor. A “Pet” is a special snowflake that you have emotional ties to and that requires special handling, for example when it’s sick. “Cattle”, by contrast, are viewed as an easily replaceable commodity. Cattle are similar enough to each other that you can treat them all as equals. The application does not get hurt too much if one of the cattle dies or needs to be replaced.
Cattle refers to stateless applications and Pets refers to stateful, “build once, run anywhere” applications.
A stateless application takes in a request and responds, but nothing is left behind to fulfill subsequent connections. The pattern is called stateless because any other independent instance can fulfill the subsequent request / response. Stateful applications store data for further use. Stateful applications are grouped into what’s known as a PetSet object. The PetSet controller takes a family-oriented approach, as opposed to the traditional RC, which is mostly concerned with the number of replicas. Under an RC, PODs are viewed as stateless, disposable units that can be removed and interchanged without affecting the application too much. The Pets, on the other hand, are groups of stateful PODs requiring stronger, distinguishable identities.
Within a PetSet, Pets ( stateful applications ) have a unique, distinguishable identity. The identities stick and do not change on restart / reschedule. Each Pet has an explicit purpose / role in life that is known throughout the family, and startup is carried out in a definitive, structured order that fits its responsibility within the application’s framework. Initially, the cattle approach forced us to view cloud components as anonymous resources. However, this style of thinking does not fit all application requirements. Stateful applications require us to rethink and adopt the new Pet-style approach.
Workloads and Application types suitable for Pets
Stateful applications within a PetSet object require unique identities such as :
- Stable hostname
- Ordinal index
The PetSet object supports clustered applications that require stricter membership and identity requirements such as :
- Discovery of peers for quorum
- Startup / teardown ordering
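The ordinal index is what makes the ordering and quorum requirements above workable. The Python sketch below shows one way to reason about it; the helper names, the reverse order for teardown and the simple-majority quorum rule are assumptions made for illustration, not behaviour quoted from the PetSet implementation.

```python
from __future__ import annotations

from typing import Callable


def startup_order(petset: str, replicas: int) -> list[str]:
    """Pets come up in ordinal order: <petset>-0 first."""
    return [f"{petset}-{i}" for i in range(replicas)]


def teardown_order(petset: str, replicas: int) -> list[str]:
    """Assumption: teardown walks the ordinals in reverse."""
    return list(reversed(startup_order(petset, replicas)))


def start_all(petset: str, replicas: int,
              is_ready: Callable[[str], bool]) -> list[str]:
    """Start Pets one at a time and stop at the first Pet that is not
    ready, so a later Pet never starts before an earlier one."""
    started = []
    for name in startup_order(petset, replicas):
        if not is_ready(name):
            break
        started.append(name)
    return started


def quorum_size(replicas: int) -> int:
    """Assumption: quorum is a simple majority of the members."""
    return replicas // 2 + 1


if __name__ == "__main__":
    print(startup_order("zk", 3))
    print(teardown_order("zk", 3))
    print(quorum_size(3))
```

The `is_ready` callback stands in for whatever health check the application exposes. The point of the sketch is only that order is derived from the ordinal, which is something randomly named PODs cannot offer.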
Workloads that benefit from PetSets include, for example :
- NoSQL databases – clustered software like Cassandra, Zookeeper, and etcd requiring stable membership.
- Relational databases – MySQL or PostgreSQL requiring persistent volumes.
Applications have different roles and responsibilities, requiring different deployment models. A Cassandra cluster has strict membership and identity requirements: certain nodes are designated as seed nodes, which are used during startup to discover the cluster. They come up first and act as the connection points that all other nodes contact to get information about the cluster. All nodes require at least one seed node, and all nodes within a cluster must have the same seed nodes. Without a seed node, no node can join the cluster, so the seeds’ role is vital to the application framework.
ZooKeeper or etcd requires the identification of peers and of the instances that clients should contact. Other databases have a master / slave model where the master has unidirectional control over the slave. The master ( “primary” ) server has a different role and different identity requirements from those of the slave. Running these types of services properly requires more complex features in Kubernetes.
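With stable names, a seed list like the one Cassandra needs can be written down before any node exists. The following Python sketch treats the lowest ordinals as seeds; the `seed_hosts` helper, the DNS pattern and the choice of the first Pets as seeds are assumptions made for illustration.

```python
from __future__ import annotations


def seed_hosts(petset: str, service: str, seed_count: int,
               namespace: str = "default",
               cluster_domain: str = "cluster.local") -> list[str]:
    """Assumption: the first `seed_count` Pets act as seeds and each is
    reachable at <petset>-<ordinal>.<service>.<namespace>.svc.<domain>."""
    return [f"{petset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
            for i in range(seed_count)]


def seeds_setting(hosts: list[str]) -> str:
    """Join the seeds into one comma-separated string, the form in which
    every node is handed the same seed list."""
    return ",".join(hosts)


if __name__ == "__main__":
    hosts = seed_hosts("cassandra", "cassandra", seed_count=2)
    print(seeds_setting(hosts))
```

Every node can compute the same string independently, which satisfies the requirement that all nodes in the cluster agree on their seeds. With randomly named PODs there would be nothing stable to put in that list.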
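As a rough illustration of peer identification, the Python sketch below builds a ZooKeeper-style member list from ordinals. The helper names and the `<petset>-<ordinal>.<service>` host pattern are assumptions; ports 2888 and 3888 are only the values conventionally used in ZooKeeper examples, and the ordinal is shifted by one on the assumption that member ids start at 1.

```python
from __future__ import annotations


def my_id(hostname: str) -> int:
    """Derive a member id from a stable hostname such as "zk-2".
    Assumption: ids start at 1, so the ordinal is shifted by one."""
    ordinal = int(hostname.rsplit("-", 1)[1])
    return ordinal + 1


def server_lines(petset: str, service: str, replicas: int,
                 peer_port: int = 2888, election_port: int = 3888) -> list[str]:
    """Build one "server.<id>=<host>:<peer port>:<election port>" line per
    member, identical on every member of the ensemble."""
    lines = []
    for ordinal in range(replicas):
        host = f"{petset}-{ordinal}"
        lines.append(
            f"server.{my_id(host)}={host}.{service}:{peer_port}:{election_port}"
        )
    return lines


if __name__ == "__main__":
    for line in server_lines("zk", "zk-svc", 3):
        print(line)
    print("this member's id:", my_id("zk-1"))
```

Each member learns who it is from its own hostname and who its peers are from the shared list. Both pieces depend on names that survive restart / reschedule.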