Execution Resilience

Colonies Servers automatically monitor the progress of process execution. If a process isn’t completed within the time frame specified in its function description, it’s automatically reassigned to another Executor. This fail-safe ensures that meta-processes are still executed even if an Executor crashes. This built-in resilience also opens the door for advanced testing approaches like Chaos Engineering, where Executors are deliberately terminated to validate the system’s robustness.
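
The timeout-and-reassign behaviour described above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the Colonies API: `ToyServer`, `Process`, `max_exec_time`, and `simulate_crash` are hypothetical names chosen for illustration, and the clock is passed in explicitly so the example stays deterministic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Process:
    process_id: str
    max_exec_time: float             # seconds allowed per assignment (hypothetical field)
    state: str = "waiting"           # waiting -> running -> successful
    executor: Optional[str] = None   # name of the Executor currently assigned
    deadline: Optional[float] = None
    attempts: int = 0                # how many times the process has been assigned


class ToyServer:
    """In-memory stand-in for a server that tracks process deadlines."""

    def __init__(self) -> None:
        self.processes: Dict[str, Process] = {}

    def submit(self, process: Process) -> None:
        self.processes[process.process_id] = process

    def assign(self, executor: str, now: float) -> Optional[Process]:
        """Hand the first waiting process to the executor and start its clock."""
        for p in self.processes.values():
            if p.state == "waiting":
                p.state = "running"
                p.executor = executor
                p.deadline = now + p.max_exec_time
                p.attempts += 1
                return p
        return None

    def close(self, process_id: str, executor: str) -> bool:
        """Mark a process successful; only the current assignee may close it."""
        p = self.processes[process_id]
        if p.state == "running" and p.executor == executor:
            p.state = "successful"
            return True
        return False

    def check_deadlines(self, now: float) -> List[str]:
        """Return overdue running processes to the queue and report their ids."""
        released = []
        for p in self.processes.values():
            if p.state == "running" and p.deadline is not None and now > p.deadline:
                p.state = "waiting"
                p.executor = None
                p.deadline = None
                released.append(p.process_id)
        return released


def simulate_crash() -> Process:
    """executor-a takes a process and crashes; executor-b finishes it."""
    server = ToyServer()
    server.submit(Process("p1", max_exec_time=5.0))
    server.assign("executor-a", now=0.0)   # executor-a never reports back
    server.check_deadlines(now=6.0)        # deadline passed: p1 is queued again
    p = server.assign("executor-b", now=6.0)
    server.close(p.process_id, "executor-b")
    return p
```

In this sketch a late result from the crashed Executor is rejected, because `close` only accepts the current assignee. Whether a real server behaves the same way is an assumption here, not something stated above.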

The ability to stop or destroy Executors without the risk of data loss significantly simplifies software updates and system management. For instance, if an Executor runs in a pod as part of a Kubernetes deployment, an administrator can scale the system up or down at any time, either to optimize resource usage or to roll out an upgraded version of the Executor. The built-in resilience of the system ensures that ongoing meta-processes are reassigned and completed, so these operations do not interrupt the work in progress.