#(Idea) Implementing a simple batch-queueing cluster manager in Gleam


night island

Wasn't sure whether to post this in sharing or here in questions, but given this is just an idea for a project and not actually WIP, I chose here.

Background: in a previous job (Research Software Engineer) we had an in-house compute cluster with a few hundred nodes running an archaic piece of software from the early 2000s called Sun Grid Engine (SGE). A modernised fork was later released as Son of Grid Engine, which was in turn forked again as Some Grid Engine.

I would love to try to recreate something functionally similar as my first "proper" Gleam side-project, but wanted to scope out whether I am missing a huge gotcha that will haunt me, or whether this seems feasible for a Gleam/functional programming newbie (shoutout to Louis' cat).

The very brief outline is that there are one or more grid master nodes that handle job submissions, queueing and quotas, as well as communication with worker nodes. Worker nodes are largely dumb, but can be configured with CPU / RAM / storage as consumable resources that are consumed and released as jobs run and finish. To maximise efficient utilisation of compute resources, each job should ideally be submitted with the maximum amount of each resource it needs; otherwise a default value will be used.
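To make the consumable-resource idea concrete, here is a rough sketch of how the bookkeeping for a single worker might look in Gleam. All of the type and function names (`Resources`, `Job`, `Worker`, `consume`, `release`, `default_request`) and the default values are made up for illustration; none of this is how SGE models things internally, just the behaviour described above.

```gleam
// A sketch only: every type and function name here is hypothetical,
// not taken from SGE or from any Gleam library.

pub type Resources {
  Resources(cpu: Int, ram_mb: Int, storage_mb: Int)
}

pub type Job {
  Job(id: Int, request: Resources)
}

pub type Worker {
  Worker(name: String, free: Resources)
}

// Assumed default applied when a job is submitted without explicit limits.
pub const default_request = Resources(cpu: 1, ram_mb: 1024, storage_mb: 1024)

// True when every requested amount is still available on the worker.
fn fits(free: Resources, want: Resources) -> Bool {
  free.cpu >= want.cpu
  && free.ram_mb >= want.ram_mb
  && free.storage_mb >= want.storage_mb
}

// Reserve the job's resources on the worker, or return Error(Nil) if the
// job does not fit and should stay queued.
pub fn consume(worker: Worker, job: Job) -> Result(Worker, Nil) {
  let free = worker.free
  let want = job.request
  case fits(free, want) {
    True ->
      Ok(Worker(
        ..worker,
        free: Resources(
          cpu: free.cpu - want.cpu,
          ram_mb: free.ram_mb - want.ram_mb,
          storage_mb: free.storage_mb - want.storage_mb,
        ),
      ))
    False -> Error(Nil)
  }
}

// Hand the job's resources back to the worker once the job finishes.
pub fn release(worker: Worker, job: Job) -> Worker {
  let free = worker.free
  let want = job.request
  Worker(
    ..worker,
    free: Resources(
      cpu: free.cpu + want.cpu,
      ram_mb: free.ram_mb + want.ram_mb,
      storage_mb: free.storage_mb + want.storage_mb,
    ),
  )
}
```

Because the values are immutable, the master would keep the `Worker` returned by `consume` as the new state and call `release` with the same `Job` when the worker reports that the job has finished.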

From my reading and research it seems that the networking and communication between the various nodes will not be too difficult to achieve in Gleam with the stdlibs, especially since I won't be considering any kind of security (i.e. encrypted communications, authentication/authorisation) for the PoC version. @wooden seal's awesome-looking shellout library looks like it has everything else I need (I think 🤞) for starting/monitoring external process calls, so I think that should be okay as well.

Does this sound like I am biting off too much in one go as a beginner, or is this a case of get stuck in and don't fear failure?

wooden seal

Doesn't sound like a tiny undertaking, but it's likely doable.