#orbit-wars


dry forum
#

hey oh! a question: how often do the agents fight? My first submission played quite a few matches in a short time, but another submission has played fewer over a much longer time, so I'm confused now. There are more participants, but I had fewer fights over a longer period of time

ripe patrol
#

Hi! Can we plz have the code of the game?
It would be really useful for optimizing agents

unique niche
ripe sedge
#

Hi all

ripe sedge
#

Has anybody submitted any agents yet ?

spring owl
#

Yes

#

What I'm quite worried about is that my agents are not performing up to standard

#

Like for the same agent I submit now I got 200 points less than the agent I submitted previously

ripe sedge
#

I got my first agent submitted 🙂 I'm assuming they make it play itself as a validation ?

storm wadi
spring owl
lucid orbit
#

Hii @everyone 🤝

#

Myself Amber Agey from India

glossy schooner
#

Hello I’m seeking teammates dm me if you’re interested

opal spear
#

Hello all!

I'm wondering if someone can answer a question for me. How might you submit a bot that uses external libraries, e.g. numpy for this competition?

ripe sedge
#

I would just test it. I'm sure numpy would be available in the environment

opal spear
#

Makes sense! I just realized the kaggle_environments install includes a ton of stuff, numpy included.

drowsy ferry
#

hello. im happy to be part of this competition. i have a question: what version of orbit wars is used in the orbit_wars competition right now?

spring owl
#

The rankings will keep on changing based on the agents launched by the participants

drowsy ferry
#

sorry, i mean the orbit wars version, because right now im using orbit wars 1.09

spring owl
#

I don't think there are versions??

abstract junco
#

Yes, the leaderboard is running version 1.0.9. There should be an update with fixes for 4p incoming on Monday

viscid canopy
#

Have there been more 4P games recently?

drowsy ferry
sand orbit
#

did the 2500 rated guy get banned??

#

💀

pulsar barn
#

no, hes on 4th place now

#

anyone have any luck training rl agents?

river hedge
#

Now I'm experimenting with evolutionary algorithms because they seem more intuitive to implement (at least for me). It's really slow, but I got a score of 700~800 with 500 generations and just self play

pulsar barn
#

im trying to train an agent with ppo, but its really slow to converge

river hedge
#

What helped me was better selecting the features for the model (fewer and informative features make the model output faster, avoiding timeouts and speeding the training)

#

So my code looks like this:

  1. Make a list of N discrete possible fractions of ships to send
  2. Make K tuples of possible combinations of (source, destination, number of ships) as possible actions
  3. Calculate features based on each action and the current state
  4. NN should return probabilities of all actions given the current state
  5. Take the most probable actions
  6. For each action, calculate the angle between source and destination numerically (I don't need the network to learn this parameter as it would overcomplicate the approach)
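
A minimal sketch of steps 1, 2 and 6 above, not river hedge's actual code. The planet layout (`id -> (x, y, ships)`), the fraction values, and the helper names `build_candidate_actions` and `launch_angle` are all assumptions made for illustration, and the angle convention (radians, measured with `atan2`) may differ from what the environment expects.

```python
import math
from itertools import product


def build_candidate_actions(my_planets, targets, fractions=(0.25, 0.5, 1.0)):
    """Enumerate (source_id, target_id, ships) tuples.

    my_planets / targets: dicts mapping planet id -> (x, y, ships).
    This layout is hypothetical, not the environment's observation format.
    """
    actions = []
    for (sid, src), (tid, _), frac in product(
        my_planets.items(), targets.items(), fractions
    ):
        if sid == tid:
            continue  # a planet cannot target itself
        ships = int(src[2] * frac)
        if ships > 0:
            actions.append((sid, tid, ships))
    return actions


def launch_angle(src_xy, dst_xy):
    """Straight-line angle from source to destination, in radians."""
    return math.atan2(dst_xy[1] - src_xy[1], dst_xy[0] - src_xy[0])
```

The network would then only score these tuples; the angle is computed afterwards for whichever tuples are picked. Note that this straight-line angle ignores target motion.
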
pulsar barn
#

i can understand using evolutionary algos for architecture discovery, but you still have to train the net, right? like are you using the algo to optimize the net?

river hedge
pulsar barn
#

sounds strange to me, but if it works time will show

river hedge
#

And it's generating good results so far

pulsar barn
#

cool cool

#

i think RL is the way to go though. just gotta tune the hyper params, make the env easier to learn etc. then just let ppo converge at optimal policy

pulsar barn
misty oyster
pulsar barn
#

what kind of reward shaping are people using here?

hollow hollow
#

Main reward is the win/lose one, and I adopted some aux rewards like ship difference and fleet hit reward (but I scaled them to keep it zero-sum). My guess is that aux rewards will not be important once the model becomes strong enough. Until then, aux rewards help it improve faster, but after that they would prevent it from learning a strong strategy
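
One way such a zero-sum auxiliary reward could look, as a rough sketch. The function name `shaped_rewards`, the ship-share formulation, and the `aux_weight` parameter are assumptions for illustration; the message above does not say how the scaling was actually done.

```python
def shaped_rewards(outcome, ship_counts, aux_weight=0.1):
    """Combine win/lose outcome with a zero-sum ship-share bonus.

    outcome: per-player terminal signal, e.g. [+1, -1] (assumed to sum to 0).
    ship_counts: per-player total ships at the current step.
    aux_weight: scale of the auxiliary term (hypothetical knob).
    """
    n = len(ship_counts)
    total = sum(ship_counts)
    mean_share = 1.0 / n
    rewards = []
    for win, ships in zip(outcome, ship_counts):
        share = ships / total if total > 0 else mean_share
        # share - mean_share sums to zero over all players
        rewards.append(win + aux_weight * (share - mean_share))
    return rewards
```

Decaying `aux_weight` toward 0 as training progresses would match the idea that the auxiliary terms should matter less once the model is strong.
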

river hedge
pulsar barn
earnest birch
#

how do i start implementing RL? im being stopped hard at ~1050 right now using rule based

pulsar barn
queen glade
#

Hello guys

queen glade
pulsar barn
#

use a framework like stable-baselines3 or similar, lots of resources in the docs also

#

the youtube series Foundations of Deep RL with Pieter Abbeel is a good place to learn the basics

ember sinew
#

hot take: I think it's imprudent to attempt ML/RL without having a heuristics bot in the top 100

pulsar barn
#

I think it's imprudent to even make a heuristic bot for a kaggle challenge.

earnest birch
earnest birch
#

and that could translate over to ML/RL since you already know what to do, training it will be a little easier (though i cant say much about it since i havent tried it yet)

pulsar barn
ember sinew
pulsar barn
#

After a week of tuning and testing I finally have an RL bot that is actually learning. Still will take days to train a competent bot with my setup

ember sinew
#

Learning better abstractions / transformations of the game that let you build better feature + action spaces seems critical. Can make the learning task half as hard if you have a better feel for the game

ember sinew
pulsar barn
#

multi-discrete masked action space

ember sinew
#

so each edge, 3 one-hots for 0/needed+1/all? and do you combine that with any search or just single inference -> play move

pulsar barn
#

launch, edge, ship fraction bin

#

i search for the best angle, too hard to learn the angle itself

#

my feature extractor builds candidate edges that the policy chooses from, together with a launch flag and a ship fraction from a bin. this way i can use MaskablePPO from sb3-contrib. then i search the angle and convert to native action
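
A rough sketch of how a (launch, edge, ship fraction bin) action might be masked and decoded. This is not pulsar barn's code: `FRACTION_BINS`, `edge_mask`, `decode_action`, and the `planet_ships` dict are hypothetical, and the angle search that follows is left out.

```python
import numpy as np

# Hypothetical ship-fraction bins; the real bin edges are not given in the chat.
FRACTION_BINS = (0.25, 0.5, 0.75, 1.0)


def edge_mask(candidate_edges, planet_ships):
    """Boolean mask: True where the edge's source planet has ships to send."""
    return np.array(
        [planet_ships.get(src, 0) > 0 for src, _ in candidate_edges], dtype=bool
    )


def decode_action(action, candidate_edges, planet_ships):
    """Turn a (launch, edge_idx, bin_idx) triple into (src, dst, ships) or None."""
    launch, edge_idx, bin_idx = (int(v) for v in action)
    if launch == 0:
        return None  # policy chose not to launch this step
    src, dst = candidate_edges[edge_idx]
    ships = int(planet_ships[src] * FRACTION_BINS[bin_idx])
    if ships <= 0:
        return None
    return (src, dst, ships)
```

The mask is what a maskable policy would consume for the edge dimension; the decoded tuple still needs an angle before it becomes a native action.
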

pulsar barn
#

what is the strongest open source heuristic bot right now? want to do some benchmarking

thorny mason
#

probably orbit wars or something similar - are you not doing some form of self play and heuristic in combination? i keep hitting a plateau and dont get anywhere beyond 10% v my heuristic

flat sage
#

Are you guys going for a 100% win rate against random bots before submitting for live gameplay, or just winging it?

misty oyster
flat sage
#

Good point!

pulsar barn
thorny mason
#

@pulsar barn how are you building your curriculum i have been trying with different ways - i wanted to try and do steps 0-50, 0-100, 0-200, 0-500 because its so sparse and i was hoping to get to like 70% v a good heuristic in each step before progressing but no chance 🙁

  • so i used ppo initially then added awr then curriculum - but i am just not getting above 65% or so on the self play - using the baseline and a lagged player he should beat and the heuristic not more than say 15% for the first set of steps

  • i am using a delta of fleet + prod * potential prod of mine - the opponent :- really not getting far

  • am more than open to ideas as i am a little stuck for how i can improve - there isnt really enough data i think hence my idea with self play as much as possible

pulsar barn
#

i also do pretraining with imitation learning to bootstrap early learning, saves me many steps

#

hope by next week i have a competent agent. self play is introduced much later because it kills my throughput significantly

#

ppo is hard to do because it only works within a sane range. i therefore made a pid controller to adjust learning rate to keep approximate kl divergence within a range
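
A minimal sketch of what a PID controller on the learning rate could look like. The class name `KLPidController`, the gains, the target KL, and the multiplicative `exp(-control)` update are all assumptions; the message only says a PID controller keeps approximate KL within a range.

```python
import math


class KLPidController:
    """Adjust the learning rate so approximate KL stays near a target (sketch)."""

    def __init__(self, target_kl=0.015, kp=0.5, ki=0.05, kd=0.0,
                 lr_min=1e-6, lr_max=1e-3):
        self.target_kl = target_kl
        self.kp, self.ki, self.kd = kp, ki, kd
        self.lr_min, self.lr_max = lr_min, lr_max
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, lr, approx_kl):
        # Relative error: positive means the last update moved the policy too far.
        error = (approx_kl - self.target_kl) / self.target_kl
        # Clamp the integral term to limit windup.
        self.integral = max(-10.0, min(10.0, self.integral + error))
        derivative = error - self.prev_error
        self.prev_error = error
        control = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Multiplicative update: KL too high shrinks lr, KL too low grows it.
        new_lr = lr * math.exp(-control)
        return min(self.lr_max, max(self.lr_min, new_lr))
```

It would be called once per PPO update with the measured approximate KL, and the returned value written into the optimizer.
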

thorny mason
#

yeah i am watching the entropy and kl - any luck so far - i mean i am assuming the reward function is gonna be the killer, i assume you are using some sort of dense function to begin with, i tried vs random and a few others and it kinda learns to optimise fleet and planets but against the heuristics they are just too good - you are probably right that i will need to introduce a load of not so good agents for it to learn to beat progressively

pulsar barn
#

i use a dense reward yes. score is a dense reward signal so you dont have to only do win/lose. but i cut back shaping throughout the curriculum

thorny mason
#

ok - sounds interesting 🙂 i may give your pid controller and the weaker to stronger opponents a go from my baseline - do you have any other tips? 🙂

pulsar barn
#

i think the main thing is the curriculum. because of the sample inefficiency of rl, trying to help the model as much as possible to have a smooth learning is crucial, unless you have unlimited compute.

#

also, rewrite the env in a high performance language like rust. it will really speed up throughput

thorny mason
#

it has dramatically improved! i have passed through plenty of gates already now onto the 200 steps against where i need to be beating my better heuristics! havent tried it on 4 player yet tho...but i have 2 very weak and 2 weak and they are being destroyed!

pulsar barn
#

nice man!

thorny mason
#

thanks for the input!
--- Eval checkpoint (ep 500, horizon=200) ---
[tier 1] vs bully: 4/4
[tier 1] vs prospector: 1/4
[tier 2] vs rage: 4/4
[tier 2] vs dual: 3/4
[tier 3] vs baseline: 0/4
[tier 4] vs shunlite: 0/4
[tier 4] vs v131_2p: 0/4
[tier 5] vs v131_denial: 0/4
[tier 5] vs v131_wave: 0/4
Eval total: 12/36 (33.3%) improving 🙂

pulsar barn
#

may i ask what framework for rl you are using? or is it a custom implementation?

thorny mason
#

custom

pulsar barn
#

not my github, but you find it in the kaggle discussion board

thorny mason
#

cool

#

they are in java and not exactly orbit wars but llm of choice can port them 🙂

#

he also has a great write up of his tactics!

pulsar barn
#

ill check it out

#

how many steps per second in the env are you getting total?

thorny mason
#

i am getting about a game per second so currently 200 steps

#

and i am running a 2p and 4p training harness at the same time

#

4p sucks tho looks like it is learning nothing

pulsar barn
#

i sample 2p or 4p games at an increasing rate through the curriculum, so during a rollout i will get some 2p and some 4p based on a percentage

thorny mason
#

ah with just one game so 1 model that will play both? from my heuristic i have also split the logic because i find the tactics are so different - i figured also for the rl

pulsar barn
#

yes, my model plays both types of games. starting really low with only 5% 4p games, up to 50% which i believe is the true objective with regards to 2p vs 4p
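
A small sketch of sampling 2p or 4p games at an increasing rate. The 5% and 50% endpoints come from the message above; the linear ramp, the `progress` argument, and the function names are assumptions.

```python
import random


def four_player_prob(progress, start=0.05, end=0.5):
    """Probability of a 4p game at a given curriculum progress in [0, 1]."""
    progress = min(1.0, max(0.0, progress))
    return start + (end - start) * progress  # linear ramp (assumed schedule)


def sample_num_players(progress, rng=random):
    """Pick 2 or 4 players for the next rollout episode."""
    return 4 if rng.random() < four_player_prob(progress) else 2
```

Calling `sample_num_players` once per episode gives each rollout a mix of 2p and 4p games that shifts as the curriculum advances.
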

thorny mason
#

interesting...

thorny mason
#

the agents are in java i believe

#

there is some lisp - but same same for a llm 🙂

#

his write up is gold tho

grizzled junco
#

@everyone so how do you train models to play this game

#

Like my basic strategy is make a function which calculates a distance from my planet to a target, and based on the number of turns, targets either rotational or static planets, high production, only settling for low production if all has been taken

#

And if I send ships, I send no more than 20 percent of the planets total quota

modest bear
#

does production change over time, for planets?

pulsar barn
pulsar barn
soft sage
#

Yes constant

grizzled junco
#

Like first you train the thing and then what? Like I coded a function the environment runs but i dont know what next

pulsar barn
#

PPO is reinforcement learning

#

you get the data from training...

#

every rollout generates data

loud lynx
short mist
#

guys is there any really good bots i can download to use as opponents for tests?

pulsar barn
#

enders fleet is pretty good, also marco dg v3.3

#

those are probably the best bots in the forums right now

humble cloak
#

Hi,
I wanted to ask something.
Each fleet is represented as [id, owner, x, y, angle, from_planet_id, ships].
If the owner here is enemy, is the angle leaked, or is it a placeholder only?

cosmic hatch
#

Further, It can help you find out the nearby enemy planets if you verify the coordinates and trajectory (mostly ships go directly towards the enemy)

humble cloak
cosmic hatch
# humble cloak But if that's provided for the enemy fleets, then shouldn't it be called leaked?...

Not leaked, think of it as modern day warfare.. Country A and B both have high tech machinery and as soon as Country B sends a missile towards A, the defence system of A detects the angle, coordinates, point of origin and what not to narrow down the threat

It's not leaked, but calculated by the defence (radar detector)

Your focus should be on to narrowing down the threat, the real game tbh.. threat detection is a work for radars now

cosmic hatch
#

Yup 🐣, welcome :)

pulsar barn
grizzled junco
cosmic hatch
# grizzled junco So when we figure out where to send ships, dont we multiply the angular velocity...

Umm, I don't quite understand what you're trying to convey or the formula you're describing

From my knowledge of kinematics and circular motion, I don't remember any formula where we multiply "angular velocity of thing with speed of any thing" to get a proper pair

I don't know about this formula, but upon doing dimensional analysis
Angular velocity(radian/second) * speed(metre/second) will give (m/s²) which is acceleration (i guess centripetal one) and upon integrating acceleration twice with respect to time we can get to position. But still I'm very unsure about your formula, can you please elaborate it or was it just an assumption?

And about finding new Cartesian co-ord, are you referring to (x = Rcos(theta' + omega.t), y = Rsin(theta' + omega.t))?

But anyways , here's my mathematical intuition:

First we determine the start(q) and destination(d) of ships:
Assume ship starts at: (x', y') and target is at (x'', y'')

For linear motion we can say
x = x(original) + vt
and the analogue for circular motion (here the planet) is:
Theta(final) = Theta(original) + omega*time
Here, theta = angular position & omega = angular velocity

And then the Cartesian co-ord would be:

```
x'' = R.cos(theta(original) + omega*t) [horizontal component]
y'' = R.sin(theta(original) + omega*t) [vertical component]
```

And we know that `distance (d) = speed.time`

```
distance to target at time t ≤ d
```

i.e.

```
sqrt[(x'' - x')² + (y'' - y')²] ≤ s.t
```

This should give a minimum `t` i guess.

**Note that I still haven't properly tested this out on large scale data, but it did work for a few tests. I might not be 100% correct.**
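
The inequality above can be checked turn by turn, as in this sketch. It is an illustration only: `first_intercept` and its parameters are hypothetical, the orbit `center` offset is an added assumption (the formulas above place the orbit at the origin), the fleet is assumed to fly at a constant `speed`, and planet radius, the sun and comets are ignored.

```python
import math


def first_intercept(src, center, radius, theta0, omega, speed, max_turns=500):
    """Return (turns, angle, target_xy) for the first turn t where the
    target's position at t is within reach (distance <= speed * t), else None."""
    sx, sy = src
    cx, cy = center
    for t in range(1, max_turns + 1):
        theta = theta0 + omega * t
        tx = cx + radius * math.cos(theta)
        ty = cy + radius * math.sin(theta)
        dist = math.hypot(tx - sx, ty - sy)
        if dist <= speed * t:
            # Aim at where the target will be, not where it is now.
            return t, math.atan2(ty - sy, tx - sx), (tx, ty)
    return None
```

Because the test is an inequality on whole turns, the fleet can reach the aim point up to roughly one turn before the planet does, so the result is approximate.
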
grizzled junco
#

New angle = initial angle + angular velocity × ship speed

#

Then x = r cos new angle

cosmic hatch
grizzled junco
#

Y = r sin angle or something

cosmic hatch
#

I get it, we have the same explanation

grizzled junco
#

But i dont know... there was this demo with this generic function returning moves where you just input into the environment variable

#

What is with these people training models and how do I implement it? 🤔

cosmic hatch
#

I don't think we can use someone else's model for this 🤔

grizzled junco
#

So you make ur own model with pytorch or something?

cosmic hatch
# grizzled junco So you make ur own model with pytorch or something?

It's like, they're giving us a bunch of data (a dataset basically) and values as mentioned in Komil's message #orbit-wars message

And then we're expected to do the math and build code to reach the highest score

Every win increases the score, and every defeat decreases it...

And yes we're training a model but not with PyTorch, we're making it accurate with maths

Check more here: https://www.kaggle.com/competitions/orbit-wars/overview

cosmic hatch
#

Ugh, I was finally able to attain a good score.

But how do I coordinate simultaneous attacks, anyone got any suggestions ;-;?

grizzled junco
grizzled junco
#

Like with parallel processing or async?

cosmic hatch
cosmic hatch
# grizzled junco Like with parallel processing or async?

Parallel processing might slow it down, I suspect. I found something else: I'll have multiple planets attack the one closest target, then decide whether they should keep attacking the same target or whether upcoming threats can be handled easily and individually

Basically pseudo group work

queen glade
queen glade
pulsar barn
#

id start by writing a gym wrapper for the kaggle env, then pick a rl method from a framework and run it

queen glade
raven vigil
#

Anyone struggling in Consistency and want to learn together.
DM me.

bold harbor
#

hi

#

i am participating today, is it too late?

proper crystal
#

also, I remember seeing that someone made a byte-accurate faster simulator for the kaggle environment
I wonder if anyone has the code for it

brazen swallow
#

It shouldn't be too hard to recreate the simulation in c/cpp then bind it to python

cosmic hatch
proper crystal
#

I see
thanks ya'll

I might just write it in JAX, vectorizing what I can, and call it a day lol
hopefully that works

#

probably won't be exact but whatever

cosmic hatch
#

jax cool lol

proper crystal
#

yeah
I'd be able to choose my accelerator without bothering with reoptimization

#

or torch.compile

cosmic hatch
#

i understand that

#

it'll be helpful

#

but im trying to find more ways to improve the performance

proper crystal
#

collision checking on GPU might be significantly faster than on CPU

#

especially with RL environments when you're running a bunch of sims at once

#

I'm currently just building a JAX wrapper for the provided env and probably will work on a JAX sim reimplementation

cosmic hatch
proper crystal
#

I vibe coded something a while ago and it had a 10^3 throughput speedup
but it made some approximations that I didn't like
I'm gonna do it myself later

proper crystal
proper crystal
#

and will transfer to Kaggle later

#

yeah it was crazy

cosmic hatch
#

thats some crazy vibe coding

#

what ai agent was used?

proper crystal
#

but I think it didn't model comets properly
among other things

#

so a realistic speedup from a GPU implementation of the sim might be less good

#

but even a 100x speedup is insane

proper crystal
#

it wrote a byte parity test and according to that the numba implementation was byte accurate, but idk how much I trust it

#

the numbers seem too good to be true

#

but I still believe that there is something of substance that can be achieved via GPU acceleration
even if numerical precision may be off

cosmic hatch
cosmic hatch
cosmic hatch
#

i dont like vibecoding, but tbh its cool at times if used properly

proper crystal
#

I found that LLMs struggle with JAX a lot

#

most of the time I end up just doing it myself lol

proper crystal
cosmic hatch
cosmic hatch
#

like refactor the env in rust

proper crystal
#

oh I see

cosmic hatch
#

yupp

proper crystal
#

I mean I know its claim is bs because I know exactly how to model comets along with planets

cosmic hatch
proper crystal
#

it’s just that I was feeling lazy

#

it claimed that a bunch of things weren’t vectorizable until I told it how

cosmic hatch
proper crystal
#

lmao fair enough

#

might try it
but honestly I’d rather have code that I can understand atp

#

so I can write better tests and such

cosmic hatch
#

well goodluck with your tests, i gtg

proper crystal
grizzled junco
proper crystal
brazen swallow
#

I've just finished hand writing the simulation into cpp + with rendering for debug

grizzled junco
#

Like any limitations here

#

In my code

#

Im thinking of making planets robust, only expending lots of ships for high production orbiting targets

#

Barbell strategy

#

Is this idea feasible or not

grizzled junco
brazen swallow
#

cpp + rendering*

#

I don't have a model yet, just rule based stuff

#

I wanted to rewrite the environment before starting models

proper crystal
# grizzled junco Is this idea feasible or not

I can’t speak for strategy but performance wise i’m personally trying to avoid loops when i can
the same approach might be able to speed things up for you

tho idk how much that would help as from what I understand you’re not exactly using an rl strategy

#

i’m currently at the environment setup phase moreso than the agent creation one
I plan on using RL with JAX so I want everything set up nicely before I start

grizzled junco
#

But i dont know how to use barbell strategy and rl

#

Like i have robust planets

#

But my attacks are always so terrible

#

Thats why I lose most times

#

And I thought rl allows for more dynamic and unconventional attacks

#

But im not sure

proper crystal
proper crystal
proper crystal
bronze pewter
#

code is very short as well imo

proper crystal
#

also your agent seems to really hate being put on the edge of the map

bold harbor
#

im thinking about using rl but confused about which algorithm to use

grizzled junco
#

So how can I attack more while meeting this conservative stance on ship accumulation

grizzled junco
#

New strategy

proper crystal
#

because it seems that your total ship production/step is pretty low

#

and gets lower relative to the opponent as the game goes on

proper crystal
grizzled junco
#

So ur saying that i produce less ships than the enemy and attacking broadly = more ships

proper crystal
#

yeah, because more planets means more total production

grizzled junco
#

But i noticed my code is really good at defense

#

But I'll figure out a way to accommodate

#

So improvements: frequency of planets

#

Better orbital computations

proper crystal
#

like the counter to stockpiling ships on one planet via that planet's production is just take many planets and just outproduce the defending planet

grizzled junco
#

Hmmmmm🤔 yet what about planets with low production

#

Those dont yield much

proper crystal
grizzled junco
#

But I cant understand why my planets have like 700 ships in each but the opposing player sometimes gets more ships

proper crystal
#

maybe not taking low production high defense

grizzled junco
#

Like i thought I had all the high production planets in the world

#

I'll revise the algorithm

#

More attacks, high production

#

But what about defense

#

Its not like you can take every planet

#

And how to replace the for loops

grizzled junco
#

I thought that meant having the most ships

proper crystal
#

isn't the win condition whoever has the most planets after 500 steps

grizzled junco
#

Then it changes everything

grizzled junco
#

Let's check the official rules

proper crystal
#

yeah im checking as well

#

Scoring and Termination

The game ends when:

Step limit reached: 500 turns.
Elimination: Only one player (or zero) remains with any planets or fleets.

Final score = total ships on owned planets + total ships in owned fleets. Highest score wins.
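
The scoring rule quoted above, sketched in code. The fleet layout `[id, owner, x, y, angle, from_planet_id, ships]` is the one mentioned earlier in this channel; the `(owner, ships)` planet pairs, the `-1` neutral id, and the function names are assumptions for illustration.

```python
def final_scores(planets, fleets, num_players):
    """Score = ships on owned planets + ships in owned fleets.

    planets: iterable of (owner, ships) pairs (hypothetical layout, -1 = neutral).
    fleets: iterable of [id, owner, x, y, angle, from_planet_id, ships].
    """
    scores = [0] * num_players
    for owner, ships in planets:
        if 0 <= owner < num_players:
            scores[owner] += ships
    for fleet in fleets:
        owner, ships = fleet[1], fleet[6]
        if 0 <= owner < num_players:
            scores[owner] += ships
    return scores


def winners(scores):
    """All players sharing the highest score (more than one means a tie)."""
    best = max(scores)
    return [i for i, s in enumerate(scores) if s == best]
```

Planet count does not enter the score directly; it only matters through the production it brings.
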

#

i think you're right

#

it is indeed ships

grizzled junco
#

Hence my current strategy of taking high production planets

#

But my defense is good

#

But offense is terrible

#

Ships fly into space and into the sun as if snipers drank too much vodka

#

Thats the problem

proper crystal
#

here I think I might just have a replication issue

grizzled junco
#

My computation functions did not account for the home planet being rotational, or some weird planet shooting ships into space

proper crystal
#

oh wait

#

I can't paste images

#

uh

#

here try opening that?

#

so I'm putting yours against the nearest planet sniper bot

#

so from what I can see, your agent's planets are the most well defended by far

#

but I'm not sure you're winning
unless I'm reading the output wrong

#

and then sometimes the nearest planet sniper just takes your planets early game

#

can you run yours against the sniper and see if you get the same thing

#

bc that's what I have on my end

proper crystal
proper crystal
# grizzled junco And how to replace the for loops

you don't need to iterate through items in a loop if one step in your loop doesn't depend on the previous ones
you can process everything in parallel via python multiprocessing
or just turn everything into vectors and do vector math via numpy or jax

proper crystal
# proper crystal you don't need to iterate through items in a loop if one step in your loop doesn...

kinda like what I'm doing with the comet code rn

original loop implementation:

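```python
import math

# Discretize the continuous ellipse into a dense array of points
dense = []
num = 5000
for i in range(num):
    # Sweep the parameter t from 0.3*pi to 1.7*pi
    t = 0.3 * math.pi + 1.4 * math.pi * i / (num - 1)
    # Point on the ellipse in its own frame, shifted by c_val along its x axis
    ex = c_val + a * math.cos(t)
    ey = b * math.sin(t)

    # Rotate by phi and translate to the board center
    x = CENTER + ex * math.cos(phi) - ey * math.sin(phi)
    y = CENTER + ex * math.sin(phi) + ey * math.cos(phi)
    dense.append((x, y))
```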

my vectorized jax implementation:

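```python
import jax.numpy as jnp

# Discretize the continuous ellipse into a dense array of points
num = 5000
# All parameter values at once instead of one per loop iteration
t = 0.3 * jnp.pi + 1.4 * jnp.pi * jnp.arange(num) / (num - 1)
ex = c_val + a * jnp.cos(t)
ey = b * jnp.sin(t)
# Rotate by phi and translate to the board center
x = CENTER + ex * jnp.cos(phi) - ey * jnp.sin(phi)
y = CENTER + ex * jnp.sin(phi) + ey * jnp.cos(phi)
dense = jnp.stack((x, y), axis=1)  # shape (num, 2)
```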

EDIT: Pasted the wrong snippet from the original code

brazen swallow
#
Matchup: neutralCostAgent vs productionWeighted:20
Matches: 1000
Start seed: 0
neutralCostAgent wins: 597
productionWeighted:20 wins: 401
Draws: 2
neutralCostAgent win rate: 59.7%
productionWeighted:20 win rate: 40.1%
Average steps: 277.351
Wall time (s): 43.7999
Average final ships: neutralCostAgent=1627.64, productionWeighted:20=2043.3
Average final planets: neutralCostAgent=15.73, productionWeighted:20=10.618
#

Finally getting to mess around with stuff

proper crystal
#

nice!

cosmic hatch
#

my rust env broke ;-;

#

gotta fix it

proper crystal
#

o7

grizzled junco
#

But i tried writefile submission.py and I couldnt get a spot on the leaderboard 😭

grizzled junco
#

Why are we using rust here

cosmic hatch
cosmic hatch
#

i saw this suggestion and it worked

fading elbow
#

hi! are your agent rule-based, or have NN and RL involved?

grizzled junco
#

Rules based

fading elbow
# grizzled junco Rules based

yeah, it seems that rule-based models have an advantage over NN and RL. My original plan was to switch to NN after a few days, but now I'm considering not switching

grizzled junco
#

Or neural net

grizzled junco
#

Cause it has a prepared good algorithm while models take forever or something

#

My algorithm has robust planets

#

But it is so bad at attacking

#

Like im cringing😭 😭 😭 😭

fading elbow
grizzled junco
fading elbow
fading elbow
grizzled junco
#

Hmmmmm🤔

#

Thats actually kind of flexible 😄

fading elbow
#

that’s the idea 😄 I’m trying to avoid using one fixed aggression value. Safe backend planets can act more like supply planets, while frontier planets should be more conservative unless they have enough advantage

bold harbor
#

i have reached

#

1390 rank

#

from 2440

#

lesss goooo

fading elbow
sweet vapor
#

Looking at the games, it seems my agent wins exclusively when the other agent shoots ships into the void and not otherwise. That's around 650 ELO, so uh, does anyone else have particularities they notice at their ELOs?

just submitted my RL agent with 2x as much training so I'm hoping it's gotten any better 🥲

cosmic hatch
proper crystal
#

also do you guys have any theories as to why RL agents aren't performing as well as rule based agents

fading elbow
#

I think this is because each match is too short and with too little data. RL is better when dealing with long-term complex data. But for short term, rule-based is better

proper crystal
#

I see

#

makes sense

grizzled junco
#

But i swear the attacks are the worst part

#

Defense is so easy

#

But how to bypass the for loops though

cosmic hatch
proper crystal
grizzled junco
#

Compute distance rotational to compute distance and angle of attack

#

Ships needed exceeds safety floor and planet ships

brazen swallow
#

How long do the bots get per step to run?

vivid tide
#

has someone made a public rust version of the environment?

grizzled junco
#

Does kaggle even allow rust code

cosmic hatch
proper crystal
#

I don’t think I’ve seen anything in the sim’s source code that has a timer

proper crystal
#

@brazen swallow from what I understand this is the config for the game
{'episodeSteps': 500, 'actTimeout': 1, 'runTimeout': 1200, 'agentTimeout': 2, 'shipSpeed': 6.0, 'cometSpeed': 4.0, 'seed': None}

#

so 1 sec / agent for the actions and 2 seconds for init?

grizzled junco
#

Do all planets have the same angular velocity

proper crystal
misty oyster
proper crystal
#

wait what does the bank do?

austere patrol
brazen swallow
#

My agent is crushing 1v1s but almost never winning the 4p

fading vortex
#

sameeee

fading vortex
brazen swallow
#

Pure rule based

#

Hovering around 900~ elo

grizzled junco
fading vortex
#

like he'd tell you 😆

brazen swallow
grizzled junco
#

How'd people compute the angle from a moving planet to static, and static to moving and even moving to moving

#

Precisely

fading vortex
# grizzled junco How'd people compute the angle from a moving planet to static, and static to mov...

I've explored a bunch of approaches. In order of complexity:

  • Just simulate hundreds of launches in different angles and see what they hit
  • Draw straight lines from future positions of the target to current position of the origin - if they have the right length for the fleet speed, calculate angle from origin
  • Model the fleet as an expanding circle from the origin, and solve for tangency of expanding circle to target circle
grizzled junco
#

Well what about early game logic

#

Like im supposed to ideally find nearest high producing planets but i feel like im not expansive enough

fading vortex
#

I use RL

compact plaza
#

Hey, I wanted to ask if bigger bot system are causing more heuristics issues? Is a smaller bot better for Orbit Wars?

grizzled junco
#

But how are people using vectorized bumpy arrays

#

NUMPY I mean

grizzled junco
#

I thought of this

#

So basically I take the parametric equations of the planets orbit

#

Then I find the new angle after a guess of 10 turns

#

Then I get the distance

#

Divide by speed to get the actual time aka amount of turns

#

Is the difference between my guess and the actual close to 0

#

Then I can conflate

#

And grab the x and y I had computed for my guess

#

Its like Epsilon delta proofs
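
The guess-and-check idea above reads like a fixed-point iteration on travel time, which might be sketched as follows. The function name, the `center` argument, the tolerance, and the constant fleet `speed` are assumptions; this is not grizzled junco's code.

```python
import math


def intercept_fixed_point(src, center, radius, theta0, omega, speed,
                          guess=10.0, tol=1e-3, max_iter=50):
    """Iterate t -> distance(target position at t) / speed until it stops changing.

    Returns (turns, target_xy) or None if it does not settle within max_iter.
    """
    sx, sy = src
    cx, cy = center
    t = guess
    for _ in range(max_iter):
        theta = theta0 + omega * t
        tx = cx + radius * math.cos(theta)
        ty = cy + radius * math.sin(theta)
        t_new = math.hypot(tx - sx, ty - sy) / speed
        if abs(t_new - t) < tol:
            # Guess and actual travel time agree: keep this x, y.
            return t_new, (tx, ty)
        t = t_new
    return None
```

The iteration tends to settle when the planet's orbital speed (`radius * abs(omega)`) is below the fleet speed; otherwise it can oscillate, which is why the `None` fallback is there.
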

proper crystal
#

or a tpu even maybe

#

or do they have to be cpu only

fading vortex
#

unless your model is weirdly big you should still be able to do at least 1 forward per turn, but search is difficult

proper crystal
#

no kernel launch overhead I guess…

brazen swallow
#

why doesnt obs expose num_agents to agents?

pulsar barn
#

because its not implemented

#

you can infer it from planets
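
One possible reading of "infer it from planets" is counting distinct non-neutral owners on the first step, as in this sketch. The helper name and the `-1` neutral id are assumptions; how owners are extracted from the observation is left to the caller because the planet layout is not given here.

```python
def infer_num_agents(planet_owners, neutral=-1):
    """Count distinct non-neutral owners among the planets.

    planet_owners: iterable of owner ids, one per planet (extracted by the caller).
    neutral: id used for unowned planets (assumed to be -1).
    """
    return len({owner for owner in planet_owners if owner != neutral})
```

This should be computed on the first observation and cached, since eliminated players no longer own planets later in the game.
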

sage mountain
#

hi, im new to agent competitions on kaggle

#

are the environment configurations fixed as default?

#

i mean this

Parameter      Default    Description
episodeSteps   500        Maximum number of turns
actTimeout     1          Seconds per turn
shipSpeed      6.0        Maximum fleet speed
sunRadius      10.0       Radius of the sun
boardSize      100.0      Board dimensions
cometSpeed     4.0        Comet speed (units/turn)
brazen swallow
#

Yes the constants are constant for the competition

sage mountain
upbeat flint
sage mountain
#

hi, in the combat section it says

When one or more fleets collide with a planet (either by flying into it or being swept by a moving planet), combat is resolved:

  • All arriving fleets are grouped by owner. Ships from the same owner are summed.
  • The largest attacking force fights the second largest. The difference in ships survives.
  • If there is a surviving attacker:
      • If the attacker is the same owner as the planet, the surviving ships are added to the garrison.
      • If the attacker is a different owner, the surviving ships fight the garrison. If the attackers exceed the garrison, the planet changes ownership and the garrison becomes the surplus.
  • If two attackers tie, all attacking ships are destroyed (no survivors).

so if there are 3 arriving fleets on the same turn (possible with 4 players), what would happen to the fleet with the least amount of ships? instantly destroyed and never be considered?

static wagon
#

Hello,

Could you please advise how learned (updated) weights are expected to be stored between episodes? Are we supposed to train the model locally first, and then, once it is uploaded, it no longer continues learning?

brazen swallow
#

So the process is:

  1. Combine arriving fleets by owner.
  2. Sort owners by total arriving ships.
  3. Compare only the largest and second-largest totals.
  4. Ignore all lower-ranked owners.
  5. If one owner survives, that survivor then reinforces or attacks the planet.
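
The process above, sketched as a function. This follows the rules as quoted in this channel rather than the environment's source; the name `resolve_combat` and the `(owner, ships)` arrival pairs are hypothetical, and the case where the surviving attackers do not exceed the garrison (garrison reduced, owner unchanged) is an assumption because the quoted rules do not spell it out.

```python
from collections import defaultdict


def resolve_combat(planet_owner, garrison, arrivals):
    """arrivals: list of (owner, ships) landing on the same turn.

    Returns (new_owner, new_garrison).
    """
    totals = defaultdict(int)
    for owner, ships in arrivals:
        totals[owner] += ships  # group by owner and sum
    if not totals:
        return planet_owner, garrison

    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    top_owner, top_ships = ranked[0]
    second_ships = ranked[1][1] if len(ranked) > 1 else 0
    survivors = top_ships - second_ships  # lower-ranked owners are ignored

    if survivors == 0:
        return planet_owner, garrison  # top two tied: no attackers survive
    if top_owner == planet_owner:
        return planet_owner, garrison + survivors  # reinforcement
    if survivors > garrison:
        return top_owner, survivors - garrison  # planet changes hands
    # Assumed: a failed attack reduces the garrison and the owner stays.
    return planet_owner, garrison - survivors
```

Under this reading, the third-largest force in a three-way arrival simply disappears without affecting the result.
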

rough bay
#

Anyone else having trouble getting scores above 900? I feel like I was able to get into the 800s with a somewhat simple agent but no changes I have made from there has managed to get much above 900.

brazen swallow
#

Are you doing heuristics or rl?

#

I was unable to push higher than 900 with rule based stuff

rough bay
#

I haven't tried RL yet, trying to see how well I can do with rule based. I thought I saw some people saying they were getting up to 1200 with rule based but that was earlier in the competition, I think scores have inflated a bit since then.

brazen swallow
#

The leaderboard does move up, but don't be discouraged, your rule based agent will be invaluable when you move onto self play ppo or something else as a good opponent to collect samples from

proper crystal
#

hey, how good are you guys' RL agents at aiming if you guys are giving them free rein on choosing angles

loud lynx
proper crystal
brazen swallow
#

If you have high throughput it'll learn how to aim over time, but you should be thinking in terms of action space

#

Planet * angle * ships is incredibly large, especially because angle is a float here

proper crystal
#

as it turns out it does

#

kinda

grizzled junco
#

How is RL better tho

#

Do you still have to compute manual functions for executing moves or allocating ships or no