#Balatro RL Agent


tough stratus
#

I trained an RL agent to play Balatro. It's not very good yet, but it's getting to ante 4/5 most of the time while I'm debugging the automation mod. I think I can get it to beat white stake soonish, but I need to implement a few things first, like using tarot cards on cards, and simulating more of the joker pool.
This isn't my best trained agent, but it's the first one I got working with the automation mod fairly well:
https://www.youtube.com/watch?v=Ne6hwy64c94

Ian

Trained via reinforcement learning on a custom environment, then integrated into the real game via a mod that exposes a websocket to control it.

limber maple
#

not sure exactly how ai works but i think this might be a lil better way to train it
starting small: it hasn't figured out that playing less hands gives it more money. it hasn't worked out interest.
spending money: seems like rn it just buys as much as it can, prioritising joker packs > jokers > planet packs > planets, instead of actually recognising what is better for it in that situation
tbh i'd just limit it to one joker at a time, replaying the game over and over from the point where the joker is meant to trigger until it's figured out how to use it (because it doesn't even try getting value out of mystic summit, droll joker, ride the bus or walkie talkie)
obviously this would take ages, but if you really want it to learn the best play then this is probably better
the fact that it manages to get to ante 4/5 rn is purely luck tbh

tough stratus
#

It is rewarded for playing fewer hands and earning interest; this is just not fully tuned. Part of the issue with other jokers like ride the bus is that the mod doesn't have a way of exporting the information about its current mult. In the simulator that information is available, so the blind agent plays around it. At runtime, though, it always thinks there's no harm done because it sees the mult as already at 0.
I do have a curriculum in the simulator, and part of that is just having it learn with one joker at a time. As for luck though, a purely random agent almost never beats round 1. A lot of the challenge is just learning to play the poker hands in the first place.
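A minimal sketch of the kind of reward shaping described above, not the actual reward used here. The interest rule ($1 per $5 held at end of round, capped at $5) is the base-game default; the function names and the weights `w_hand` and `w_interest` are hypothetical and only for illustration.

```python
def interest(dollars: int, per: int = 5, cap: int = 5) -> int:
    """Base-game interest: $1 per $5 held at end of round, capped at $5."""
    # Clamp at 0 so debt never yields negative interest.
    return min(max(dollars, 0) // per, cap)


def round_reward(beat_blind: bool, hands_left: int, dollars: int,
                 w_hand: float = 0.1, w_interest: float = 0.05) -> float:
    """Hypothetical shaped reward for one blind (weights are assumptions)."""
    if not beat_blind:
        return -1.0
    # Bonus for unused hands and for money that will earn interest.
    return 1.0 + w_hand * hands_left + w_interest * interest(dollars)
```

With weights like these, tuning mostly means balancing the bonus terms against the win term so the agent does not hoard money or hands at the cost of losing the blind.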

versed cosmos
tough stratus
#

Yeah, that requirement for modal behavior combined with the combinatorial action space presents some serious challenges. For a bunch of reasons, this is not a trivial RL problem.
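To give a rough sense of the combinatorial action space, here is a small sketch that enumerates the card-selection actions for a single decision, assuming the default 8-card hand and 5-card selection limit. The `(verb, indices)` encoding is an illustrative assumption, not how this agent represents actions.

```python
from itertools import combinations


def card_actions(hand_size: int = 8, max_cards: int = 5):
    """List every (verb, card_indices) choice for one hand decision."""
    actions = []
    for verb in ("play", "discard"):
        for k in range(1, max_cards + 1):
            for idx in combinations(range(hand_size), k):
                actions.append((verb, idx))
    return actions


print(len(card_actions()))  # 436
```

That count covers only playing and discarding; shop purchases, consumables, and card ordering add to it.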
Been playing around with various ways of decomposing the problem to make it easier to learn, e.g. having a hierarchical control setup where there are multiple agents to play different strategies, and a higher level control agent to select which to use for each blind.
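The hierarchical setup could be sketched roughly as below. This is an assumption-heavy illustration: the sub-policies are placeholders, the selector is a hand-written rule standing in for a learned higher-level agent, and all names (`HierarchicalAgent`, `select_strategy`, `suited_cards`) are hypothetical.

```python
from typing import Callable, Dict, Optional

Observation = dict
Policy = Callable[[Observation], str]


def flush_policy(obs: Observation) -> str:
    return "play:flush"  # placeholder for a trained sub-agent


def pairs_policy(obs: Observation) -> str:
    return "play:pairs"  # placeholder for a trained sub-agent


def select_strategy(blind_obs: Observation) -> str:
    # Hypothetical rule; a learned controller would replace this.
    return "flush" if blind_obs.get("suited_cards", 0) >= 5 else "pairs"


class HierarchicalAgent:
    def __init__(self, sub_policies: Dict[str, Policy],
                 selector: Callable[[Observation], str]):
        self.sub_policies = sub_policies
        self.selector = selector
        self.active: Optional[str] = None

    def start_blind(self, blind_obs: Observation) -> str:
        # The high-level choice is made once per blind.
        self.active = self.selector(blind_obs)
        return self.active

    def act(self, obs: Observation) -> str:
        if self.active is None:
            raise RuntimeError("call start_blind() before act()")
        return self.sub_policies[self.active](obs)
```

Fixing the strategy for a whole blind keeps the high-level decision sparse, which is one way to make credit assignment easier for the controller.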

tough stratus
#

Got it up to its first white stake win!
https://www.youtube.com/watch?v=tNoZEpwv93E

Ian

As far as I'm aware this is the first case of an AI winning a game of Balatro.
This run was done using a random seed on white stake. Win rate is currently about 30%.

Trained using policy gradient methods, not LLM based.
Definitely still some problems and behavioral quirks, but significant improvement from last time.

Mods enable automation an...

#

I haven't had much time to work on it the last few weeks, but I had a couple of significant breakthroughs regardless. Lots of behavioral quirks, but I can start tracing them down and finding ways to tune them out (or at least understand why it does them). I'm pretty happy with the progress so far.