ColonyFS

ColonyOS features a built-in meta-filesystem, ColonyFS, designed to make data transfer between Executors easier. Unlike regular filesystems that store data directly, ColonyFS contains only metadata describing how to access files, for example data location, credentials, or configuration settings. The data itself is stored elsewhere, for example in Amazon S3 or IPFS. Executors use the metadata to download or upload files. Alternatively, they can use the Colonies CLI, which has built-in support for S3.
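To make the idea of a metadata-only filesystem concrete, here is a minimal Python sketch. It is not the ColonyFS schema or API: the `FileMetadata` class, its field names, the helpers `describe` and `verify`, and the example bucket URL are all hypothetical and chosen only for illustration. The record says where the bytes live and how to check them, but never holds the bytes themselves.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMetadata:
    """Hypothetical metadata record; NOT the actual ColonyFS data model."""
    label: str     # logical directory in the meta-filesystem, e.g. "/data"
    name: str      # file name under that label
    backend: str   # storage backend holding the bytes, e.g. "s3" or "ipfs"
    location: str  # backend-specific address of the stored object
    checksum: str  # SHA-256 hex digest of the file content
    size: int      # content length in bytes


def describe(label: str, name: str, backend: str, location: str,
             data: bytes) -> FileMetadata:
    """Build a metadata record for content that is stored elsewhere."""
    digest = hashlib.sha256(data).hexdigest()
    return FileMetadata(label, name, backend, location, digest, len(data))


def verify(meta: FileMetadata, data: bytes) -> bool:
    """Check downloaded bytes against the size and checksum in the record."""
    return (len(data) == meta.size
            and hashlib.sha256(data).hexdigest() == meta.checksum)


if __name__ == "__main__":
    payload = b"temperature,humidity\n21.5,40\n"
    meta = describe("/data", "sensors.csv", "s3",
                    "s3://example-bucket/data/sensors.csv", payload)
    print(meta.backend, meta.location, meta.size)
    print(verify(meta, payload))         # content matches the record
    print(verify(meta, payload + b"x"))  # tampered content is rejected
```

In a sketch like this, an Executor would fetch the record first, download the object from `location` using the client for the named backend, and call `verify` before using the data.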

With ColonyFS, Executors can interact with data in a uniform manner, significantly simplifying the implementation of data-intensive workflows across platforms. Whether the platform is an HPC system or a Jupyter notebook, ColonyFS offers a standardized approach to data management. Additionally, it improves data security by offering end-to-end encryption and data integrity checks.

Another difference between a traditional filesystem and ColonyFS is that ColonyFS is immutable. In ColonyFS, files can’t be modified once they’re uploaded; any modification requires uploading a new revision of the file. This design is intended to prevent unintended side effects, such as the data associated with a meta-process being altered after the process is submitted but before it is executed. To maintain data consistency, a snapshot of a specific path in ColonyFS can also be added to a function specification. The snapshot records all files under that path along with their respective revisions. This not only ensures the integrity of data throughout its lifecycle but also provides full traceability, making it easier to debug a meta-process if it doesn’t perform as expected.
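The interplay of immutable revisions and snapshots can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the ColonyFS implementation: the `Revision` and `MetaStore` classes, their method names, and the 1-based revision counter are all hypothetical. The point it shows is that uploading a changed file appends a new revision instead of overwriting the old one, so a snapshot taken earlier keeps pointing at the revisions that existed at that time.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Revision:
    """One immutable revision of a file (hypothetical model)."""
    name: str
    number: int    # 1-based revision counter, assumed for this sketch
    checksum: str  # digest identifying the content of this revision


class MetaStore:
    """Append-only store: existing revisions are never changed or removed."""

    def __init__(self) -> None:
        self._files: Dict[Tuple[str, str], List[Revision]] = {}

    def add(self, label: str, name: str, checksum: str) -> Revision:
        """Register new content as the next revision of label/name."""
        revisions = self._files.setdefault((label, name), [])
        revision = Revision(name, len(revisions) + 1, checksum)
        revisions.append(revision)
        return revision

    def latest(self, label: str, name: str) -> Revision:
        """Return the most recent revision of label/name."""
        return self._files[(label, name)][-1]

    def snapshot(self, label: str) -> Dict[str, Revision]:
        """Pin the current revision of every file under a label."""
        return {name: revisions[-1]
                for (lbl, name), revisions in self._files.items()
                if lbl == label}


if __name__ == "__main__":
    store = MetaStore()
    store.add("/data", "input.csv", "aaa111")
    pinned = store.snapshot("/data")           # taken at submission time
    store.add("/data", "input.csv", "bbb222")  # later upload, new revision
    print(pinned["input.csv"].number)                 # 1
    print(store.latest("/data", "input.csv").number)  # 2
```

A process that carries `pinned` in its specification would read revision 1 regardless of later uploads, which is the consistency and traceability property described above.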