#Network Task Distribution


frigid maple
#

I'm navigating resource allocation within a growing network of devices, distributing tasks across 14 weekly 12-hour windows, each with a set capacity. Our challenge involves modeling network growth (X) against these windows to optimally plan for new tasks, considering:

- Predicting capacity expansion from historical growth rates.
- Maximizing task coverage across windows without overreaching.
- Dynamically adjusting to real device growth and performance.

Our method assigns a fixed number of devices (Y) to tasks with known operation hours (Z) per week, ensuring precise allocation. For example, a task running 01/04/2024 to 07/04/2024 on 5 devices, with each operating Z hours weekly, is allocated 5 * Z hours, regardless of total network size. This aims for efficient planning without straining capacity, though it may require adjustments as our device network evolves.

How can we model our capacity for the next month, incorporate variability in growth and window performance, and maintain robust planning? Insights, models, or references to similar challenges are welcome.

I just want to make sure that, in any given time period, when a window receives the active tasks it needs to process, it distributes the load fairly based on the assumptions made when each task was created, not on the network's real capacity at that moment.
i.e. if I have a network capacity of 100 hours over 10 devices and a task is locked in at 5 devices, that would be 50 hours' worth of processing.
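
The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration of the locked-in allocation as described above. The function name `locked_hours` and the later growth to 200 hours are hypothetical and not part of the original setup.

```python
def locked_hours(devices_requested: int, hours_per_device: float) -> float:
    """Hours a task is entitled to, fixed when the task is created (Y * Z)."""
    return devices_requested * hours_per_device


# Assumption recorded at task creation: 10 devices delivering 100 hours in total.
assumed_devices = 10
assumed_capacity_hours = 100.0
hours_per_device = assumed_capacity_hours / assumed_devices  # 10 hours each

# A task locked in at 5 devices is entitled to 50 hours.
entitlement = locked_hours(5, hours_per_device)

# Hypothetical growth: the real network later delivers 200 hours.
# The entitlement stays at 50 hours, so its share of the real network shrinks.
actual_capacity_hours = 200.0
share_of_actual = entitlement / actual_capacity_hours

print(entitlement, share_of_actual)
```

The point of the sketch is that the entitlement depends only on what was assumed at creation, so a larger real network lowers the task's share rather than raising its hours.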

modern mural
#

First of all, can a task run on multiple devices?

#

Like, in practice, training a deep learning model on multiple GPUs (which in itself, is a headache)

#

And second of all, is the number of tasks fixed from the beginning?

#

And thirdly: are all tasks similarly long in distribution?

frigid maple
#

We have some number of devices currently in a window. We expect those devices to run for some time based on historic data.

An individual task can run on as many devices as it needs to reach its processing target for that window.
It cannot run on the same device more than once in the same window.

When creating a task, I enter the number of devices to run it on as a way of saying what % of the network you want to use, assuming our network capacity over the period of your task will be Y.

I don't know if I'm making much sense haha

#

The number of tasks is indeed fixed for each window

#

To put the problem simply: what data should I record as an assumption for some future task, so that when the task starts to run it gets an accurate percentage of processing time relative to what it asked for?

modern mural
#

I apologize, but to be fair I have no clue

#

But I can try to give you a couple of ideas

#

That I can remember off the top of my head

#

Though they may or may not be related at all

#

Do tell me if you find any correlation or think what I'm spouting is irrelevant

#

So think of a restaurant that opens at night. The restaurant is open for Z hours, and there are Y tables

#

So obviously, you can handle Y customers/group of customers at once

#

and if a customer arrives and all the tables are full, well, they've got to queue and wait until a table opens up

#

So there are two things to consider here

#
  1. The time it takes for a table to open up
  2. The arrival of customers
#

Such a theoretical framework already exists and is called queuing theory, perhaps you're also already working on that

#

So here, you suppose that arrivals follow a Poisson point process

#

(i.e. inter-arrival durations follow an exponential distribution)
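
To make that concrete, here is a minimal sketch that generates arrival times by summing exponential inter-arrival gaps, using only Python's standard `random` module. The function name `simulate_arrivals`, the rate of 2 arrivals per hour, and the 12-hour horizon are illustrative assumptions, not values from the actual network.

```python
import random


def simulate_arrivals(rate: float, horizon: float, seed: int = 0) -> list[float]:
    """Arrival times of a Poisson process on [0, horizon).

    Each gap between consecutive arrivals is drawn from an exponential
    distribution with the given rate (mean gap = 1 / rate).
    """
    rng = random.Random(seed)
    t = 0.0
    arrivals = []
    while True:
        t += rng.expovariate(rate)  # next exponential inter-arrival gap
        if t >= horizon:
            return arrivals
        arrivals.append(t)


# Hypothetical numbers: about 2 arrivals per hour over one 12-hour window.
arrivals = simulate_arrivals(rate=2.0, horizon=12.0)
print(len(arrivals), "arrivals in the window")
```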

frigid maple
#

Haha, I actually didn't think about it in terms of a restaurant, good point

modern mural
#

And depending on how you wanna go about it, the time it takes for a table to free up can also be modeled as random

#

following another exponential distribution

#

The reason I'm telling you about this is that the exponential distribution model can simplify computations by a lot

#

Why? Because the minimum of n iid exponentials is also an exponential

frigid maple
#

Could this way of thinking work in the context of booking X tables in the future, assuming Y tables will be in the restaurant at that time? Ensuring that if we start bringing in new tables faster than expected, the booking doesn't get more than X

modern mural
#

(i.e. if all tables are full, the time it takes for a table to free up also follows an exponential distribution)
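
For reference, the claim about the minimum of exponentials follows from a short computation. The symbols $n$ (number of occupied tables) and $\lambda$ (common rate of the exponentials) are introduced here only for illustration.

```latex
% X_1, ..., X_n are iid exponential random variables with rate \lambda.
\begin{aligned}
P\bigl(\min(X_1,\dots,X_n) > t\bigr)
  &= \prod_{i=1}^{n} P(X_i > t)
   = \bigl(e^{-\lambda t}\bigr)^{n}
   = e^{-n\lambda t}, \qquad t \ge 0, \\
\text{so}\quad \min(X_1,\dots,X_n) &\sim \operatorname{Exp}(n\lambda),
\qquad
E\bigl[\min(X_1,\dots,X_n)\bigr] = \frac{1}{n\lambda}.
\end{aligned}
```

The first equality uses independence, which is why the iid assumption matters.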

#

There may be other more complex models, but I am not well-versed enough in this topic to tell you about them

frigid maple
#

No problem man, thanks for your insight ❤️

modern mural
#

Well, hold up, there are a couple more things that I want to mention

#

Your goal is to ultimately adapt your network in order to be more efficient in your task handling

#

which in turn would involve guessing the distribution of the arrivals and the completion requirements for the tasks

#

in our example, it would mean knowing the distribution of customer arrival and eating times

frigid maple
#

Yes

modern mural
#

so that you can know if you need more tables

#

or hire more staff

#

And this right here, can be viewed in two ways

#

The first one is purely statistical

#

parametric statistics, i.e. estimate the parameters of the exponentials from the observations you have

#

The second one is, and I'm not very proud of saying it, machine learning

#

which essentially does the same thing

#

but it can be more flexible depending on the model you adopt

#

As I mentioned earlier, knowing your probability distributions would mean that you have a complete model of your system and can do computations

#

on whether your network is large enough to handle all the tasks

#

But in practice you only have observations

#

so guessing the distribution from observations is what statistics is about
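
As a rough sketch of the purely statistical route: if the durations really are exponential, the maximum-likelihood estimate of the rate is one over the sample mean. The function name `estimate_rate` and the sample values below are made up for illustration. Real observations would come from the network's history.

```python
from statistics import mean


def estimate_rate(durations: list[float]) -> float:
    """Maximum-likelihood estimate of an exponential rate: 1 / sample mean."""
    if not durations:
        raise ValueError("need at least one observation")
    sample_mean = mean(durations)
    if sample_mean <= 0:
        raise ValueError("durations must have a positive mean")
    return 1.0 / sample_mean


# Hypothetical observations, in hours.
inter_arrival_gaps = [0.5, 1.5, 1.0, 1.0]  # time between task arrivals
service_times = [2.0, 4.0, 3.0, 3.0]       # time a task occupies a device

arrival_rate = estimate_rate(inter_arrival_gaps)  # arrivals per hour
service_rate = estimate_rate(service_times)       # completions per hour per device

print(arrival_rate, service_rate)
```

Whether the exponential assumption holds for the real data is a separate question that this sketch does not check.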

frigid maple
#

I think I already have a good way of determining the current capacity of the network from historic data. I suppose the issue I'm really having is making sure a task only gets the processing it was originally planned for, not extra just because the network has increased in size

#

So a task doesn't have a fixed percentage of the network

#

It has a percentage of the network assuming network size X, so it needs to shift proportionately based on the actual network size

#

At that point, should I just put the data, i.e. the assumption, directly into the task?
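
One possible reading of putting the assumption directly into the task is sketched below. Each task carries the network size it assumed at creation, and its entitlement in hours never grows when the real network does. The `Task` fields, the `allocate` function, and the rule of scaling every task down by the same factor when a window is oversubscribed are all assumptions made for this sketch, not something settled in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    devices_requested: int        # Y, chosen at creation
    hours_per_device: float       # Z, assumed at creation
    assumed_network_devices: int  # network size assumed at creation

    @property
    def entitled_hours(self) -> float:
        """Hours locked in at creation, independent of later growth."""
        return self.devices_requested * self.hours_per_device

    @property
    def assumed_share(self) -> float:
        """Fraction of the network the task expected to use."""
        return self.devices_requested / self.assumed_network_devices


def allocate(tasks: list[Task], actual_capacity_hours: float) -> dict[str, float]:
    """Give each task its entitled hours, never more.

    If the window cannot cover every entitlement, scale all tasks down
    by the same factor (an assumed fairness rule).
    """
    total = sum(t.entitled_hours for t in tasks)
    if total == 0:
        return {t.name: 0.0 for t in tasks}
    scale = min(1.0, actual_capacity_hours / total)
    return {t.name: t.entitled_hours * scale for t in tasks}


tasks = [
    Task("a", devices_requested=5, hours_per_device=10.0, assumed_network_devices=10),
    Task("b", devices_requested=2, hours_per_device=10.0, assumed_network_devices=10),
]
print(allocate(tasks, actual_capacity_hours=200.0))  # network grew: no extra hours
print(allocate(tasks, actual_capacity_hours=35.0))   # oversubscribed: scaled down
```

Storing the assumption on the task keeps the allocation reproducible: the same inputs give the same hours regardless of how the network has changed since the task was created.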

modern mural
#

If a task only runs on one device at a time, and all tasks follow the same distribution, it may be possible to use this queue model. However, because you said that a task can run on multiple devices, it does challenge the model

modern mural
#

And see what happens after

#

I am sorry that I cannot offer you more helpful advice, unfortunately this kind of topic is not exactly my strong suit

#

I will still give you a couple of links you can look into regarding what I talked about earlier

#

In probability, statistics and related fields, a Poisson point process is a type of random mathematical object that consists of points randomly located on a mathematical space with the essential feature that the points occur independently of one another. The Poisson point process is also called a Poisson random measure, Poisson random point fiel...

#

Queueing theory is the mathematical study of waiting lines, or queues. A queueing model is constructed so that queue lengths and waiting time can be predicted. Queueing theory is generally considered a branch of operations research because the results are often used when making business decisions about the resources needed to provide a service.
...

frigid maple
#

No problem man cheers for your time

modern mural
#

If anyone has better insights on this topic, feel free to add to what I've mentioned

modern mural
#

To further back up my claims:
Assume that the restaurant has y tables. Suppose the arrivals N follow a Poisson point process of intensity alpha, and the eating times follow an exponential distribution of parameter beta.

When all y tables are occupied, the time it takes for a table to free up is the minimum of y iid exponential random variables, and is therefore itself exponential with parameter y*beta (https://math.stackexchange.com/questions/580279/how-to-prove-that-minimum-of-two-exponential-random-variables-is-another-exponen), whose mean is 1/(y * beta).

On the other hand, the inter-arrivals are given by an exponential distribution of parameter alpha, so the average time before another customer queues is 1/alpha.

Ideally, you want the next customer to wait as little as possible, without increasing the number of tables y too much. So if, on average, you want a table to free up before the next customer arrives, you need 1/(y * beta) < 1/alpha, i.e. y > alpha/beta

#

This is possible because you have control over y > 0, the number of tables
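
The condition 1/(y * beta) < 1/alpha rearranges to y > alpha/beta, so the smallest whole number of tables that satisfies it can be computed directly. The function name `min_tables` and the example rates are hypothetical, and the sketch inherits the exponential assumptions above.

```python
import math


def min_tables(alpha: float, beta: float) -> int:
    """Smallest integer y with 1/(y * beta) < 1/alpha, i.e. y > alpha / beta.

    alpha: arrival rate (customers per unit time)
    beta:  service rate of a single table (customers per unit time)
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("rates must be positive")
    return math.floor(alpha / beta) + 1


# Hypothetical rates: 5 arrivals per hour, each table serves 2 per hour.
y = min_tables(alpha=5.0, beta=2.0)
print(y)
```

The inequality is strict, so when alpha/beta is already a whole number the result is one more than that ratio.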