
The edge computing solution proposed by AIknow is a system used for scheduling the execution of processes on production systems with unused resources.
Context
Imagine this situation: you have a fleet of one hundred gateways (industrial PCs connected to a network that run monitoring, data collection, and forwarding applications). Each of these gateways runs the processes necessary to keep the gateway itself and the set of applications for which it was installed in the field active.
In the situation described, imagine that the workload on the gateways is between 5% (minimum) and 20% (maximum) of CPU and RAM resources used (variable for each gateway). However, using less powerful computers is not an acceptable option: there are situations in which it is crucial that the resources available to the gateway can absorb sudden workloads without generating disruptions or application downtime.
We find ourselves in the situation in which there are computational resources available and unused. We could say, in economic terms, that there is a supply of computational resources.
Now imagine that you have a list of applications to run, but without the necessary computers, either because the cost of provisioning adequate resources would exceed the revenue obtained from the applications themselves, or because these applications tend to be inexpensive in terms of resources, which makes it wasteful to keep a system active just to run one of them.
This time, we find ourselves in a situation of demand for computational resources: we need resources that we do not have available.
How can this supply and demand be matched? The answer we propose is edge computing.
Edge computing solutions
The idea is essentially simple: use the resources made available by the fleet of gateways already in use to run additional applications.
However, a problem immediately arises: what can go wrong if new processes are assigned to a resource that already runs some? The workload increases, exposing the resource to the risk of crashing, which must be avoided in the situation described above: the gateways available to us must remain active. In other words, there must be no excessive impact on the computational resources (let's say processor and memory, to simplify) available to each individual computer (gateway).
The solution proposed for this last problem is to “intelligently” assign the applications to the various gateways, so as to:
- Take into account the current resource usage of each gateway
- Balance the workload as much as possible, so as to keep each gateway safe from crashes.
One possible solution: plan the assignment of applications to fleet nodes (the gateways) using Artificial Intelligence.
But what type of Artificial Intelligence?
The tool
OptaPlanner seems to provide the solution to the needs presented.
OptaPlanner is a Constraint Solver, or better, an Artificial Intelligence specialized in problems that are particularly complex from a computational point of view: when the problem scales (i.e., when the number of nodes and applications involved increases), the difficulty of finding a solution increases dramatically. Our problem is precisely of this kind.
Furthermore, OptaPlanner introduces the possibility of diversifying the type of constraints in the assignment of applications to nodes. In fact, we talk about:
- hard constraint: a constraint that must not be violated in the planning or, better, one that, if violated, makes the planning itself unusable. For example: in the assignment of applications (each of which requires a certain amount of memory) to a certain node, the memory available on the node itself must not be exceeded;
- soft constraint: a constraint that one would prefer not to violate, but whose violation does not preclude the use of the solution found. For example: it is preferable to balance the workload on the fleet nodes as much as possible, but some oscillation in this sense is acceptable.
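OptaPlanner captures this distinction directly in the score of a solution. As a minimal illustration (the values below are arbitrary), the built-in HardSoftScore type keeps the two levels separate and lets us ask whether a solution is feasible at all:

```java
import org.optaplanner.core.api.score.buildin.hardsoft.HardSoftScore;

public class ScoreExample {
    public static void main(String[] args) {
        // One hard constraint violated: the plan is unusable.
        HardSoftScore infeasible = HardSoftScore.of(-1, 0);
        // No hard violations, some soft penalty: usable, just not ideal.
        HardSoftScore acceptable = HardSoftScore.of(0, -20);

        System.out.println(infeasible.isFeasible()); // false
        System.out.println(acceptable.isFeasible()); // true
    }
}
```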
Similar use cases
A quick search of the possibilities offered by OptaPlanner leads us to discover an interesting use case: Cloud optimization. The use case requires assigning a series of processes to a series of PCs, optimizing the use of resources and minimizing costs. However, looking beyond the similarities of the model, the case of edge computing analyzed here presents significant differences, for example:
- There is no need to minimize the cost of resources since they are already available;
- It is not necessary, and is actually counterproductive, to concentrate the workload on a few processing units; on the contrary, it is advisable to distribute the applications as uniformly as possible among the various nodes.
Therefore, beyond the inspiration that the use case can provide, it is necessary to provide a re-modelling of the problem and a re-implementation of the rules envisaged for planning.
The model
Let’s describe the key points of the model used to represent the problem we are considering.
Moving forward, let's use "application" to describe a process that we must assign to one of the computational resources available to us. Likewise, "node" will be used to describe the computational resources themselves, i.e. a fleet gateway.
Each application requires a certain amount of resources to run. To simplify, we can say that an application needs:
- requiredCpuPower: the computing power required for execution (we are not interested in specifying a unit of measure here);
- requiredMemory: the amount of memory needed to run the application;
- node: the fleet node on which it must be executed (for modeling reasons related to OptaPlanner, it is more convenient to talk about the node of an application rather than the applications of a node).
Each node has:
- cpuPower: the maximum amount of computing power;
- memory: the maximum amount of memory;
- usedCpuPower: computing power already committed on the node itself, updated periodically;
- usedMemory: memory already committed on the node, also updated periodically.
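Put into code, this model can be as simple as two plain Java classes. The following is a minimal sketch; the field names follow the lists above, while the types (plain ints, in unspecified units) are our assumption:

```java
// Node.java - a fleet gateway.
public class Node {
    private String id;
    private int cpuPower;      // maximum computing power
    private int memory;        // maximum memory
    private int usedCpuPower;  // already committed; refreshed periodically by the backend
    private int usedMemory;    // already committed; refreshed periodically by the backend
    // getters and setters omitted
}

// App.java - a process to assign to a node.
public class App {
    private String id;
    private int requiredCpuPower; // computing power needed to run
    private int requiredMemory;   // memory needed to run
    private Node node;            // the node the planner assigns the app to
    // getters and setters omitted
}
```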
An initial formalization of the problem can be given as follows: assign the various applications to the fleet nodes, keeping the workload balanced between the nodes as much as possible while guaranteeing a margin of resources available for any sudden workloads on the fleet itself.
In other words, we must try to:
- Balance the processor and memory load as much as possible
- Guarantee a free quota of resources for each gateway
Note
One possibility is that applications are containerized, that is, set up to run via Docker. Sometimes the available gateways run systems complex enough to include Docker itself. In this case, each application must be considered "by itself", i.e. there are no orchestration needs between different applications. In other words, we don't need orchestration systems like Docker Swarm or Kubernetes to integrate applications together: each of them is independent.
This is why it is necessary to implement an AI planner that distributes the workload.
The project
We have implemented a PoC for the proposed edge computing solution.
The project includes various components: the first, running in the cloud, is a Spring Boot backend (with persistence on a PostgreSQL database); an Angular frontend then allows interaction with the application, to schedule planning and to launch the execution of the applications themselves on the fleet. There is also a component that implements communication between the backend and the various nodes, each of which runs an agent for exchanging information with the cloud.
For the transmission of messages between the backend and the nodes, it was decided to use the MQTT protocol: the cloud backend therefore includes a client that interfaces it with an MQTT broker, installed in the cloud, which forwards the messages to the various nodes. Each node runs an MQTT client in its agent that receives messages and implements an execution procedure for the applications assigned to it. Once the execution is finished, the agent on the node notifies the cloud backend of the outcome via MQTT.
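As an illustration of the node side, the agent's MQTT wiring could look like the sketch below, here using the Eclipse Paho client; the broker URL, topic names and message format are assumptions for the example, not the actual protocol of the PoC:

```java
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttException;
import org.eclipse.paho.client.mqttv3.MqttMessage;

public class NodeAgent {
    public static void main(String[] args) throws MqttException {
        String nodeId = "gateway-042"; // hypothetical node identifier
        MqttClient client = new MqttClient("tcp://broker.example.com:1883", nodeId);
        client.connect();

        // Receive execution requests addressed to this node by the cloud backend.
        client.subscribe("fleet/" + nodeId + "/run", (topic, message) -> {
            String appId = new String(message.getPayload());
            // ... launch the application here (e.g. start a Docker container) ...

            // Notify the backend of the outcome once execution is finished.
            client.publish("fleet/" + nodeId + "/result",
                    new MqttMessage((appId + ":done").getBytes()));
        });
    }
}
```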
The backend
The backend includes various components:
- A classic Spring Boot application that exposes REST APIs consumed by possible frontends
- A planning component using OptaPlanner
- An MQTT client for communicating with the fleet of gateways
Let’s briefly delve into how the planning component is implemented.
The planner
OptaPlanner uses heuristic algorithms to search for solutions. The problems of the class to which ours belongs have the following characteristics:
- Given a candidate solution, it is easy to check, within a reasonable time, whether it is compliant with the necessary requirements;
- However, it is very complicated to find optimal solutions in a reasonable time.
The trick then is the following: carry out assignments in an “intelligent” way, proposing solutions to the problem, to then check how compliant these are with the necessary requirements. To this end, we use the notion of a solution’s score which represents how well the solution adapts to the hard and soft constraints. The score is calculated by evaluating the solution against different rules which can be implemented in different ways with OptaPlanner. It could be the case that there are several solutions with the same score for the same problem.
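In practice, OptaPlanner packages this search behind a Solver that runs until a configured termination condition. Below is a hedged sketch of how the backend's planning component might invoke it, assuming a solution class EdgePlan (shown later) that wraps the apps and nodes, and a constraint provider (sketched in the rules sections below):

```java
import java.time.Duration;
import org.optaplanner.core.api.solver.Solver;
import org.optaplanner.core.api.solver.SolverFactory;
import org.optaplanner.core.config.solver.SolverConfig;

public class PlannerService {

    public EdgePlan plan(EdgePlan problem) {
        SolverConfig config = new SolverConfig()
                .withSolutionClass(EdgePlan.class)   // the @PlanningSolution
                .withEntityClasses(App.class)        // the @PlanningEntity
                .withConstraintProviderClass(EdgeConstraintProvider.class)
                .withTerminationSpentLimit(Duration.ofSeconds(30)); // stop searching after 30s

        Solver<EdgePlan> solver = SolverFactory.<EdgePlan>create(config).buildSolver();
        // Returns the best solution found within the time limit, with its score.
        return solver.solve(problem);
    }
}
```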
Problem definition
The planning component is based on the different entities involved in the problem.
- A fleet node is described with a Node. A node has certain quantities of cpuPower and memory available; furthermore, a node has certain quantities of usedCpuPower (computing power used) and usedMemory (memory used) that are no longer available: the latter are periodically updated by the backend through communication with the various nodes.
- An application that must be executed on the fleet of nodes is represented with an App. An App specifies the amount of CPU and memory required, represented by requiredCpuPower and requiredMemory; furthermore, an App is assigned to a Node by the planner.
Using the OptaPlanner language:
- App is the @PlanningEntity, i.e. the object that OptaPlanner modifies during planning (the object on which it makes the assignments);
- the @PlanningVariable is the node property of an App: the planner evaluates different assignments for this value and calculates the best solution among those proposed using a score.
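Concretely, this mapping amounts to adding OptaPlanner's annotations to the classes sketched earlier, plus a solution class that holds the problem data and the score; EdgePlan is our assumed name for the latter:

```java
import java.util.List;
import org.optaplanner.core.api.domain.entity.PlanningEntity;
import org.optaplanner.core.api.domain.solution.PlanningEntityCollectionProperty;
import org.optaplanner.core.api.domain.solution.PlanningScore;
import org.optaplanner.core.api.domain.solution.PlanningSolution;
import org.optaplanner.core.api.domain.solution.ProblemFactCollectionProperty;
import org.optaplanner.core.api.domain.valuerange.ValueRangeProvider;
import org.optaplanner.core.api.domain.variable.PlanningVariable;
import org.optaplanner.core.api.score.buildin.hardsoft.HardSoftScore;

// App.java - the object the planner modifies during planning.
@PlanningEntity
public class App {
    private int requiredCpuPower;
    private int requiredMemory;

    // The value the planner assigns; nullable because an app may remain
    // unassigned (see the soft rules below).
    @PlanningVariable(valueRangeProviderRefs = "nodeRange", nullable = true)
    private Node node;
    // getters and setters omitted
}

// EdgePlan.java - wraps the problem data and the computed score.
@PlanningSolution
public class EdgePlan {

    @ValueRangeProvider(id = "nodeRange")
    @ProblemFactCollectionProperty
    private List<Node> nodes; // the fleet: facts the planner does not change

    @PlanningEntityCollectionProperty
    private List<App> apps;   // the entities the planner assigns

    @PlanningScore
    private HardSoftScore score;
    // getters and setters omitted
}
```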
Let’s take a look at the main rules used for planning.
Hard rules
Remember that these rules must not be violated in the solution proposed by the planner.
- The sum of the total amount of CPU requested from a node by the applications assigned to it and the amount of CPU already used on the node itself must not exceed the total amount of CPU available to the node.
- The sum of the total amount of memory requested from a node by the applications assigned to it and the amount of memory already used on the node itself must not exceed the total amount of memory available to the node;
- A margin of 1/10 of each resource must be left free on each node.
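A sketch of how the first and third rules might be expressed with OptaPlanner's Constraint Streams API follows; class and method names are our assumptions, and the memory rule (the second one) would be analogous:

```java
import org.optaplanner.core.api.score.buildin.hardsoft.HardSoftScore;
import org.optaplanner.core.api.score.stream.Constraint;
import org.optaplanner.core.api.score.stream.ConstraintCollectors;
import org.optaplanner.core.api.score.stream.ConstraintFactory;
import org.optaplanner.core.api.score.stream.ConstraintProvider;

public class EdgeConstraintProvider implements ConstraintProvider {

    @Override
    public Constraint[] defineConstraints(ConstraintFactory factory) {
        return new Constraint[] {
                cpuCapacityWithMargin(factory),
                // a memoryCapacityWithMargin(factory) rule would be analogous
        };
    }

    // Hard rule: the CPU requested by the apps assigned to a node, plus the
    // CPU already used on it, must stay below capacity minus a 1/10 margin.
    private Constraint cpuCapacityWithMargin(ConstraintFactory factory) {
        return factory.forEach(App.class)
                .filter(app -> app.getNode() != null)
                .groupBy(App::getNode,
                        ConstraintCollectors.sum(App::getRequiredCpuPower))
                .filter((node, requestedCpu) -> node.getUsedCpuPower() + requestedCpu
                        > node.getCpuPower() - node.getCpuPower() / 10)
                .penalize(HardSoftScore.ONE_HARD,
                        (node, requestedCpu) -> node.getUsedCpuPower() + requestedCpu
                                - (node.getCpuPower() - node.getCpuPower() / 10))
                .asConstraint("cpuCapacityWithMargin");
    }
}
```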
Soft rules
- It is preferable to assign all applications from the list provided (but it is acceptable for some applications to remain unassigned if the alternative involves violating the previous rules).
- It is preferable to balance the workload between the various nodes, avoiding situations of excessive overloading on some particular nodes.
In particular, the last rule requires fine tuning of the model and of the proposed implementation, and it is the point that needs the most work, perhaps with the help of different benchmarks.
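The soft rules can be sketched in the same style. Penalizing unassigned apps is direct; for load balancing we show a deliberately naive approach, penalizing the square of each node's total CPU load, which pushes the solver toward an even spread (this is exactly the part that, as noted, needs fine tuning):

```java
// Additional rules for the EdgeConstraintProvider sketched above.

// Soft rule: prefer to assign every app; each unassigned app costs one soft point.
private Constraint assignEveryApp(ConstraintFactory factory) {
    return factory.forEach(App.class)
            .filter(app -> app.getNode() == null)
            .penalize(HardSoftScore.ONE_SOFT)
            .asConstraint("assignEveryApp");
}

// Soft rule (naive balancing): penalizing the squared CPU load of each node
// makes concentrated workloads cost more than evenly spread ones.
private Constraint balanceCpuLoad(ConstraintFactory factory) {
    return factory.forEach(App.class)
            .filter(app -> app.getNode() != null)
            .groupBy(App::getNode,
                    ConstraintCollectors.sum(App::getRequiredCpuPower))
            .penalize(HardSoftScore.ONE_SOFT, (node, requestedCpu) -> {
                int load = node.getUsedCpuPower() + requestedCpu;
                return load * load;
            })
            .asConstraint("balanceCpuLoad");
}
```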
Edge computing demo
Here, we include a GIF that presents a frontend demo of scheduling with some applications running on a fleet of gateways.
The demo shows the two phases, launched by two buttons present on the right of the screen: Launch allocation and Launch execution.
The first phase runs the scheduler, which assigns applications to the various nodes available. Depending on the data and rules introduced, this phase can take a significant amount of time.
The second phase launches the execution of the applications on the gateway fleet; once the execution of an application is finished, the cloud is notified of the outcome and removes the application from the displayed screen.
Deployment
The deployment of the edge computing solution can take place in different ways, depending on the needs and installation opportunities.
Let's first take a look at the deployment options that involve installation directly on the server system, i.e. that do not use particular deployment services. In this case, you have a server computer (whether it is in the cloud or on premises matters little here) and you proceed with configuring the system.
Direct installation
A first, obvious solution is to install the application components directly on the operating system of the server computer. In this case, you must plan to install a PostgreSQL database server, an MQTT broker (for example, Mosquitto is a lightweight and versatile solution) and a server component that can run the Spring Boot backend (for example, as an executable JAR). The frontend can be served by the same server used for the backend, or by a dedicated server such as Nginx, which can also be configured as a reverse proxy for the backend, thus managing TLS encryption as well.
Containerization
A second, more modern solution involves containerizing the individual services, which are then run by a container management service such as Docker. This solution offers advantages in terms of independence, reproducibility and scalability of the various services. It is also possible to use a more advanced tool: for example, you can configure a Docker swarm (see the Docker Swarm engine) or a Kubernetes cluster.
If you decide instead to install the solution via a managed service, you can opt for Elastic Container Service or Elastic Kubernetes Service, both provided by Amazon Web Services; other cloud providers provide similar solutions.
Conclusions
The need to run applications without incurring the costs of provisioning the necessary infrastructure can be met when unused computational resources are available, as can happen with a fleet of gateways already installed in production.
The use of Artificial Intelligence software like OptaPlanner provides a solid framework for implementing an application planning and scheduling solution, which presents many possibilities for configuration and insight.
The PoC introduced here takes a first step in the implementation of an edge computing solution, even on a large scale.