The absolute statement is false.

FaaS & Kubernetes & ?

(NOTE: this article includes content translated by a machine)

After skimming the basic concepts and components of k8s, my impression is that it is very large and “beautiful”, and that it takes hands-on experience to understand. So I decided to learn k8s through practice. Serverless is a very popular concept right now, and its most common manifestation at the moment seems to be FaaS. The term is rather mysterious and has no clear, precise definition. The goal of this article is to create a useless “FaaS” framework. Much of what is said here about Serverless/FaaS is hearsay that I filled in myself, and my use of k8s probably does not follow best practices.

Ideas

FaaS

One

This is probably the usual microservices-framework routine, except that the business logic is abstracted away completely: functional requirements are declared in configuration files (for example, dependencies between functions are declared rather than coded), and the framework handles communication and even governance. When the image is built, the framework and the functions are compiled and packaged together. The problem is that every language then needs such a framework. That could probably be addressed with some form of sidecar, similar to ServiceMesh, or by translating functions written in other languages into a common language.

Two

Function execution is driven by an event flow, and interaction goes through a message bus. The flow is very different from the RPC approach, and the framework layer may be slightly simpler.

For example, with Kafka as the message system, each Function has two Topics: Input and Output. The externally exposed Gateway pushes events into the InputTopic, then consumes the results from the OutputTopic and returns them to the user. The framework consumes the InputTopic, passes each input to the function, and pushes the output into the OutputTopic. This approach does not seem well suited to simple WebService scenarios. The Gateway has to generate a unique ID, or use some other method, to match each Output Message to its Input Message. In addition, each Gateway and Function instance needs its own Topic or Partition; otherwise an Output may be consumed by the wrong instance, or every instance has to consume and filter out a lot of messages that are useless to it.
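The ID-matching idea above can be sketched as follows. This is only an in-memory illustration, assuming Go channels as stand-ins for the two Topics; `Message`, `Gateway`, `runFunction` and `shout` are hypothetical names, not part of Kafka or of the useless project.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Message is a hypothetical event envelope. CorrelationID ties an output
// back to the input that produced it.
type Message struct {
	CorrelationID string
	Payload       string
}

// Gateway publishes inputs and routes outputs back to the waiting caller.
type Gateway struct {
	mu      sync.Mutex
	nextID  int
	pending map[string]chan string // correlation ID -> waiting caller
	input   chan<- Message
}

func NewGateway(input chan<- Message, output <-chan Message) *Gateway {
	g := &Gateway{pending: make(map[string]chan string), input: input}
	go g.dispatch(output)
	return g
}

// dispatch consumes the output "topic" and wakes up the matching caller.
// Outputs with an unknown ID are dropped, which is the wasted consumption
// that appears when several gateways share one output topic.
func (g *Gateway) dispatch(output <-chan Message) {
	for msg := range output {
		g.mu.Lock()
		ch, ok := g.pending[msg.CorrelationID]
		delete(g.pending, msg.CorrelationID)
		g.mu.Unlock()
		if ok {
			ch <- msg.Payload
		}
	}
}

// Call publishes one event and blocks until its result arrives.
func (g *Gateway) Call(payload string) string {
	g.mu.Lock()
	g.nextID++
	id := fmt.Sprintf("req-%d", g.nextID)
	ch := make(chan string, 1) // buffered so dispatch never blocks on it
	g.pending[id] = ch
	g.mu.Unlock()

	g.input <- Message{CorrelationID: id, Payload: payload}
	return <-ch
}

// runFunction plays the framework's role: consume the input topic, call the
// user function, publish the result with the same correlation ID.
func runFunction(input <-chan Message, output chan<- Message, fn func(string) string) {
	for msg := range input {
		output <- Message{CorrelationID: msg.CorrelationID, Payload: fn(msg.Payload)}
	}
}

// shout is the example "function" being served.
func shout(s string) string { return strings.ToUpper(s) }

func main() {
	input := make(chan Message, 16)
	output := make(chan Message, 16)
	go runFunction(input, output, shout)

	gw := NewGateway(input, output)
	fmt.Println(gw.Call("hello"))
}
```

With a real broker the `pending` map would stay the same, but the dispatch loop would also see outputs destined for other Gateway instances unless each instance had its own Topic or Partition.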

When using a message system like Kafka, you may need some way to decouple it from the specific function. For example, with Kafka’s Consumer Group mechanism, a Partition cannot be consumed by more than one Consumer in the group at the same time, so binding the Consumer and the function together limits the processing power of the function. Increasing the number of Partitions triggers a Consumer Group rebalance and possibly data movement, bringing potential problems and bottlenecks. If you do not use a Consumer Group, you have to manage and schedule Partition consumption yourself. Apache BookKeeper, or Apache Pulsar which is built on it, has a storage model that is arguably better suited here than Kafka’s and may be a better choice.
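To make the Consumer Group limit concrete, here is a toy model. It assumes a plain round-robin assignment, which is a simplification and not Kafka's actual assignor; `assignPartitions` and `activeConsumers` are hypothetical helpers.

```go
package main

import "fmt"

// assignPartitions spreads partitions over consumers round-robin.
// Each partition gets exactly one owner, mirroring the rule that a
// partition is consumed by at most one member of a consumer group.
func assignPartitions(partitions, consumers int) map[int][]int {
	assignment := make(map[int][]int, consumers)
	for c := 0; c < consumers; c++ {
		assignment[c] = nil // register the consumer even if it ends up idle
	}
	if consumers == 0 {
		return assignment
	}
	for p := 0; p < partitions; p++ {
		owner := p % consumers
		assignment[owner] = append(assignment[owner], p)
	}
	return assignment
}

// activeConsumers counts the consumers that own at least one partition.
func activeConsumers(assignment map[int][]int) int {
	active := 0
	for _, owned := range assignment {
		if len(owned) > 0 {
			active++
		}
	}
	return active
}

func main() {
	for _, consumers := range []int{2, 4, 8} {
		a := assignPartitions(4, consumers)
		fmt.Printf("partitions=4 consumers=%d active=%d\n", consumers, activeConsumers(a))
	}
}
```

With four partitions, going from four to eight function instances adds no active consumers, which is why tying one consumer to each function instance caps throughput at the partition count.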

k8s

I saw an implementation that adds a component called a Trigger Controller, which consumes Kafka and forwards inputs to the corresponding Functions. Controllers are typically deployed as one primary with several backups for High Availability (HA), so it does not seem to be a component that can be scaled horizontally to increase throughput. I did not look into the details of how it is used, and of course this event-driven approach is not designed specifically for Web Services.

Since this is purely for learning k8s, I chose the simpler routine. k8s is used as follows:

  • The Function is represented by CRD.
  • The custom Controller is used to track and manage the creation and destruction of Function resources, etc.
    • HA is provided through the lock and election APIs offered by k8s (backed by etcd), which avoids both a single point of failure and the state conflicts caused by multiple masters
    • Function images have to be compiled and packaged manually; the runtime includes a simple HTTP server, and the function is just injected into it
    • After observing the creation of a new Function resource:
      • Create a Service for service discovery and load balancing (ignore Ingress for now)
      • Create a Deployment containing the Function
      • Create HorizontalPodAutoscaler
    • On deletion, just remove the corresponding resources. Using ownerReferences can simplify this step
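The controller steps listed above can be modelled roughly as below. This is a sketch with plain Go structs, not the real client-go or apimachinery types: `Function`, `Child`, `desiredChildren`, `missingChildren` and `collectGarbage` are hypothetical names, and `OwnerUID` only imitates what an ownerReferences entry does.

```go
package main

import "fmt"

// Function is a simplified stand-in for the custom resource (fields are hypothetical).
type Function struct {
	Name  string
	UID   string
	Image string
}

// Child is a simplified stand-in for a Kubernetes object created for a Function.
// OwnerUID plays the role of an ownerReferences entry pointing at the Function.
type Child struct {
	Kind     string
	Name     string
	OwnerUID string
}

// desiredChildren lists the objects the controller wants to exist for fn.
func desiredChildren(fn Function) []Child {
	kinds := []string{"Service", "Deployment", "HorizontalPodAutoscaler"}
	children := make([]Child, 0, len(kinds))
	for _, kind := range kinds {
		children = append(children, Child{Kind: kind, Name: fn.Name, OwnerUID: fn.UID})
	}
	return children
}

// missingChildren returns the desired objects that are not in existing yet,
// i.e. what one reconcile pass would have to create.
func missingChildren(fn Function, existing []Child) []Child {
	have := make(map[Child]bool, len(existing))
	for _, c := range existing {
		have[c] = true
	}
	var missing []Child
	for _, c := range desiredChildren(fn) {
		if !have[c] {
			missing = append(missing, c)
		}
	}
	return missing
}

// collectGarbage mimics what ownerReferences buy us: once the owner is gone,
// every object that points at it is removed without the controller's help.
func collectGarbage(deletedOwnerUID string, existing []Child) []Child {
	var kept []Child
	for _, c := range existing {
		if c.OwnerUID != deletedOwnerUID {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	fn := Function{Name: "hello", UID: "uid-1", Image: "example/hello:latest"}
	existing := []Child{{Kind: "Service", Name: "hello", OwnerUID: "uid-1"}}

	for _, c := range missingChildren(fn, existing) {
		fmt.Printf("create %s/%s\n", c.Kind, c.Name)
	}
	existing = desiredChildren(fn)
	fmt.Println("left after deleting the Function:", len(collectGarbage(fn.UID, existing)))
}
```

Comparing desired against existing objects on every pass keeps the creation step idempotent, so a controller that restarts or loses an election can simply run it again.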

Also, k8s does not support scaling down to zero out of the box (the HorizontalPodAutoscaler keeps at least one replica by default), and on careful thought scaling to zero seems unrealistic anyway.
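One way to see why zero is awkward is the basic ratio rule described in the Kubernetes HorizontalPodAutoscaler documentation: desired replicas = ceil(current replicas × current metric / target metric). The sketch below implements only that rule plus min/max clamping; it leaves out tolerance, stabilization windows and pod readiness handling, and `desiredReplicas` is a hypothetical helper, not a Kubernetes API.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies the basic HPA ratio rule and clamps the result.
func desiredReplicas(current int, currentMetric, targetMetric float64, minReplicas, maxReplicas int) int {
	desired := int(math.Ceil(float64(current) * currentMetric / targetMetric))
	if desired < minReplicas {
		desired = minReplicas
	}
	if desired > maxReplicas {
		desired = maxReplicas
	}
	return desired
}

func main() {
	// Two pods at 90% CPU with a 50% target: scale up.
	fmt.Println(desiredReplicas(2, 90, 50, 1, 10))
	// Four mostly idle pods: scale down, but never below the minimum of one.
	fmt.Println(desiredReplicas(4, 10, 50, 1, 10))
	// Zero pods: nothing reports a metric and the product stays zero,
	// so this rule alone can never bring the first pod back.
	fmt.Println(desiredReplicas(0, 0, 50, 0, 10))
}
```

Once the replica count reaches zero, something outside this loop, for example a gateway that holds the first request, would have to trigger the scale-up, and that request pays the full cold-start cost.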

Implementation

It is worth mentioning that minikube may keep failing to start if it is given too few resources. The following is for reference only:

minikube start --cpus=2 --memory=2048 --disk-size=2g

The specific process of implementation will not be expanded, see https://github.com/damnever/useless

The Kubernetes (k8s) API feels somewhat messy because there is so much of it. Golang is simple but feels like it is missing something: there is no line-length limit, the basic flow control is little more than if/else, and indentation ends up eight spaces wide (other languages probably have their own issues; this is purely for the sake of ranting, haha)… That said, although my experience with it has been brief, overall k8s is quite good. Essentially, everything is defined as a resource that supports CRUD (Create, Read, Update, Delete). As a platform, the approach is commendable: many built-in, basic, and core components and features are implemented on top of the same core framework and processes.

Postscript

The original intention of Serverless/FaaS is very good. Its main purpose is similar to that of microservices frameworks and ServiceMesh: to take the “dirty work” off the developer’s hands more cleanly while making full use of the advantages of cloud computing. The problem to solve is probably how to do that “dirty work” elegantly; if it could also bootstrap and govern itself, there would not be much left for programmers to do…

If you want to implement a production-ready “FaaS” framework product, there are many problems to solve, including but not limited to:

  • In what form do functions execute? If it’s not “Fire-and-Forget”, would there be reliance on global shared state within the process?!
  • Support for basic functionalities like multi-language, dependency package management, environment variables, logs, metrics, etc.
  • Configuration management
  • CI/CD
  • A Web IDE or other similar tools are essential
  • Trimming and converting function inputs/outputs
  • Full-featured gateway support, such as HTTP CORS, and even other protocols
  • Auto-scaling
  • Minimizing additional resource overhead
    • Reduce resource consumption due to container and language tool startups, initialization, and destruction
      • For example, using local cache or P2P to quickly pull corresponding containers for functions
      • Need to avoid overhead other than function execution as much as possible, because the granularity of functions is already very small
    • Many problems similar to the CGI programming model
      • For example, how to elegantly resolve dependencies on external services like databases?
      • If functions access databases directly, there will be a huge overhead, such as being unable to reuse connections
      • Or use a separate persistent Worker/Service for database operations: the function publishes its output to the message bus, and the Worker/Service subscribes and performs the corresponding database operation? Ahh.. some form of communication is needed either way
    • Maybe various user-mode software and high-end hardware can solve some problems
  • Caching features: dimensions, consistency requirements
  • Organizing function sets through Namespace/Group
  • Many workflow engine features
    • Scheduled or even custom scheduling engines
    • Function sets with dependency and priority relationships
    • Lifecycle and state machine management, etc.
  • Various active and passive Hooks or other ways to conveniently export states and interact with external components
  • The granularity of functions is smaller than that of microservices, so things will certainly be more chaotic. Until better standards and processes are established, you will surely run into more pitfalls, which makes governance indispensable
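The database item in the list above (CGI-style functions versus a persistent Worker fed through a message bus) can be illustrated with a toy comparison. Everything here is hypothetical: `fakeDB` only counts how many connections are opened, and a Go channel stands in for the message bus.

```go
package main

import "fmt"

// fakeDB stands in for a real database and counts opened connections.
type fakeDB struct{ opened int }

type conn struct{}

func (db *fakeDB) connect() *conn {
	db.opened++
	return &conn{}
}

func (c *conn) exec(query string) string { return "ok: " + query }

// perInvocation models CGI-style functions: every call opens its own connection.
func perInvocation(db *fakeDB, queries []string) {
	for _, q := range queries {
		c := db.connect()
		c.exec(q)
	}
}

// persistentWorker models a long-lived Worker/Service: it connects once and
// handles every operation it receives from the bus on that connection.
func persistentWorker(db *fakeDB, ops <-chan string, done chan<- int) {
	c := db.connect()
	handled := 0
	for q := range ops {
		c.exec(q)
		handled++
	}
	done <- handled
}

// viaWorker publishes the queries to the bus and waits for the worker to finish.
func viaWorker(db *fakeDB, queries []string) int {
	ops := make(chan string)
	done := make(chan int)
	go persistentWorker(db, ops, done)
	for _, q := range queries {
		ops <- q
	}
	close(ops)
	return <-done
}

func main() {
	queries := []string{"insert a", "insert b", "insert c"}

	direct := &fakeDB{}
	perInvocation(direct, queries)
	fmt.Println("per-invocation connections:", direct.opened)

	shared := &fakeDB{}
	viaWorker(shared, queries)
	fmt.Println("worker connections:", shared.opened)
}
```

The worker variant trades connection setup for an extra hop through the bus, which is the communication cost the list item complains about.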

Upon careful consideration, Serverless should be viewed as a vision or an ideal state. The “FaaS” mentioned here should not be rigidly applied to unsuitable application scenarios.