#reinforcement-learning


graceful widget
#

Hello RL folks.
Seems like the space is not so active.

#

Anybody learning RL with me?

silent acorn
graceful widget
#

Sure.
I'd be glad to.

graceful widget
silent acorn
#

And you?

graceful widget
#

Reading a book for it.
I feel like the courses do not cover enough theoretical ground

silent acorn
icy tree
silent acorn
silent acorn
#

Okay thanks

icy tree
#

No problem!

graceful widget
icy tree
#

Though if you want a video course, David Silver's famous lectures are pretty good, nice balance of math

icy tree
sweet garden
#

Hi, I am Sharath. I am a beginner in the field of data science, AI, and machine learning. I want to learn RL. Does anybody have time for group study or for guiding me?

wise magnet
#

👋

graceful widget
#

@wise magnet nice to have you

sweet garden
#

Thanks 👍 looking forward to learning RL

brazen mist
#

Yes, we can discuss deep reinforcement learning here

#

I have finished a few topics and feel confident, but I have some doubts as well

#

Any working professionals here?

graceful widget
icy tree
#

What doubts do you have?

brazen mist
#

Also, I want to implement those algorithms in games such as GTA 5

icy tree
feral sequoia
#

Hi,
Has anyone worked on multi-agent algorithms in Python?

fallow vigil
#

Do I ask RL-related doubts here?

wet girder
#

How can I balance the actor and critic networks?

verbal latch
brazen mist
verbal latch
#

I assume that was said sarcastically

brazen mist
#

I specialize in model-based deep reinforcement learning

verbal latch
brazen mist
verbal latch
#

Man, it's not just about blindly using the algo. I see that you talk about deep RL and aim to do something crazy, which is absolutely great. My only concern is the basics.

Before even starting with deep RL, there's a hell of a lot of material in the learning and planning settings, whether it's standard bandits in the pure-exploration or regret-minimization setting, or MDPs in their many variants.

marble sigil
#

Hey there! I am a complete beginner to RL. I completed a beginner course on DataCamp and solved the Frozen-Lake environment using what I learnt from that course. Now I am trying to delve into Deep Q-Learning and would like to know what resources I could use. Thanks!

fiery jolt
#

Why is RL so hard to understand and learn? Am I the only one having this issue?

fiery jolt
#

If so, can anyone tell me how to start with a simple explanation video or book, or give any other suggestions?

solar geyser
# fiery jolt If so, can anyone tell me how to start with a simple explanation video or book o...

💡Enroll to gain access to the full course:
https://deeplizard.com/course/rlcpailzrd

Welcome to this series on reinforcement learning! We'll first start out by introducing the absolute basics to build a solid ground for us to run.

We'll then progress onto more advanced and sophisticated topics that integrate artificial neural networks and deep ...

#

This is one of the best series I have seen on RL, she explains it really nicely

solar geyser
fiery jolt
#

But that is a video that I have never seen before so I know it is going to solve my problems for sure

solar geyser
fiery jolt
solar geyser
#

Real

fiery jolt
# solar geyser Real

Thanks man for the amazing playlist you sent above. I finished 7 of the 15 videos on the playlist, and I feel like I know more about reinforcement learning than I could have imagined

#

And I will definitely watch the rest to make projects in RL and Deep RL

solar geyser
#

😁

fiery jolt
solar geyser
fiery jolt
verbal latch
#

Those interested in bandit literature can check out our recent work:

http://arxiv.org/abs/2408.14195

TLDR: We analyse a clustered multi-armed bandit formulation in a fixed-confidence setting, where the learning objective is to identify representative arms from each cluster.

@here

fiery jolt
# silent siren Huh can i see it too pls ?

💡Enroll to gain access to the full course:
https://deeplizard.com/course/rlcpailzrd

#

Here is the playlist

silent siren
teal folio
silent siren
silent siren
dry widget
#

Does anyone have a recommendation for the best reinforcement learning playlist that covers the maths?

tired cove
#

@dry widget Check out the UCL x DeepMind RL series on YouTube. You can also find a specialization from the University of Alberta on Coursera. You can audit it for free

native kernel
#

I don't know too much about reinforcement learning, but is it possible to make a program that takes cards from a trading card game (like Pokémon or MTG) and figures out what a very good deck is by having it battle other decks?

#

I ask if it's possible because I don't know if it would take too long to run and if it would need millions of battles to figure it out

normal siren
# native kernel i ask if it's possible because i dont know if it would take too long to run and ...

It depends on the game. Here they report it could take ~500k steps to train the agent: https://doi.org/10.48550/arXiv.1910.04376

native kernel
#

@normal siren oh wow thank you!! I'll check that out

tender seal
#

Hi, I am Abdullah. I am an ML engineer and want to join a team to participate in Kaggle competitions

spice wren
loud palm
#

Hi, please help me.
I'm going to build a search engine based on customer behavior.
Inputs: query embedding, history embedding (metadata is stored in vector format).
We use cosine similarity and train the embedding models using multi-armed bandits. (Is that possible?)
I have two questions about this.
First, how do I get the gradient of the embedding when using cosine similarity? (Can that be estimated in torch?)
Second, the search uses two steps, and we update the weights for the historical embedding and the query embedding at the same time. I think that can be noisy,
but I can't be sure. I attached a diagram. If you have any questions, feel free to ask.
https://drive.google.com/file/d/1_vWxdasnHjCL6_momviQzcDYHAAgc-1M/view?usp=sharing
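
On the first question: cosine similarity is differentiable wherever both vectors are non-zero, so an autograd framework can backpropagate through it (in PyTorch, `torch.nn.functional.cosine_similarity` is an ordinary differentiable op). Below is a minimal sketch of what that gradient looks like, in plain Python with no framework. The vectors `query` and `history` and all function names are made up for illustration and are not taken from the diagram.

```python
import math


def cosine_similarity(q, h):
    """cos(q, h) = (q . h) / (|q| * |h|); both vectors must be non-zero."""
    dot = sum(a * b for a, b in zip(q, h))
    norm_q = math.sqrt(sum(a * a for a in q))
    norm_h = math.sqrt(sum(b * b for b in h))
    return dot / (norm_q * norm_h)


def cosine_grad_wrt_q(q, h):
    """Analytic gradient: d cos / d q = h / (|q||h|) - cos(q, h) * q / |q|^2."""
    norm_q = math.sqrt(sum(a * a for a in q))
    norm_h = math.sqrt(sum(b * b for b in h))
    cos = cosine_similarity(q, h)
    return [b / (norm_q * norm_h) - cos * a / (norm_q * norm_q)
            for a, b in zip(q, h)]


def numeric_grad_wrt_q(q, h, eps=1e-6):
    """Central finite differences, used only to sanity-check the formula."""
    grad = []
    for i in range(len(q)):
        plus = list(q)
        minus = list(q)
        plus[i] += eps
        minus[i] -= eps
        diff = cosine_similarity(plus, h) - cosine_similarity(minus, h)
        grad.append(diff / (2 * eps))
    return grad


# Hypothetical embeddings, chosen only for illustration.
query = [0.2, -1.0, 0.5]
history = [1.0, 0.3, -0.4]

analytic = cosine_grad_wrt_q(query, history)
numeric = numeric_grad_wrt_q(query, history)
print(analytic)
print(numeric)
```

The gradient is always orthogonal to `query`, because rescaling a vector does not change its cosine similarity. In other words, training on cosine similarity alone only rotates an embedding and never changes its length.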

verbal otter
#

I believe this channel is for discussing reinforcement learning. Kindly refrain from posting unrelated content

merry pollen
#

Hello everyone, kindly suggest the best courses on reinforcement learning and share reviews of the Reinforcement Learning Specialization on Coursera

normal siren
# merry pollen Hello everyone, kindly suggest me best courses on reinforcement learning and sha...

Some learning resources on Reinforcement Learning in no particular order.

spice cypress
#

Anyone interested in teaming up for Lux AI?

gloomy token
#

How good is the DQN method

#

For EHRs?

fallen cargo
#

Has anyone used NotebookLM? How is it?

fallen cargo
#

Hi, I am new here. I think my earlier message was not relevant to this thread. Please tell me in which thread I can ask such questions.
Thank you

clear mortar
#

Hello, I am learning reinforcement learning and am interested in automation, robotics, and automotive technologies. I'm looking for a peer or group to learn together. Is anyone interested?

verbal otter
#

Hi everyone, I am a 2nd-year PhD student in Computer Science at the University of Maryland, Baltimore County, specializing in machine learning, reinforcement learning, and mathematical reasoning in LLMs. I am thinking of writing a review paper on the current state of mathematical reasoning in LLMs and am looking for potential collaborators. Thanks

ember blaze
wise canopy
white olive
boreal dagger
#

Hi. I'm starting with RL, namely PPO and by extension GRPO. Does anyone have prior experience?

final sphinx
#

I'm working on a project involving RL and drone delivery optimization. Could you guys share some resources that actually helped you learn RL? Thanks!

sinful turtle
#

Job Title: Part-Time Senior AI/ML Engineer (Remote)

We are seeking a skilled and experienced Senior AI/ML Engineer to join our remote team on a part-time basis. The ideal candidate will have a strong technical background, excellent communication skills, and the ability to work independently in a fast-paced environment.

Requirements:
-Minimum of 7–10 years of professional software development experience

-Proven experience working effectively in a remote environment

-Advanced English proficiency (C1 or higher); an American accent is preferred

-Availability to work 10–15 hours per week during EST or CST business hours

If you're a highly motivated engineer with a passion for building high-quality software and can commit to a flexible part-time schedule, we’d love to hear from you.
You can connect with me on WhatsApp: +1 (567) 469-5384

bold arch
#

Hi, @everybody
I have one question. I'm training ML models for a 3-class classification problem. The number of samples per class is similar, but the predictions are skewed.
The first and second classes are predicted, though with low precision, and the third class is never predicted. What's the reason? I can't find it.
Earlier, when I applied reinforcement learning with the three classes assigned to three actions, one action was never selected either.
The model predicts forex EUR/USD.

tawdry yarrow
bold arch
#

I'm looking for a US developer to collaborate with. If anybody is interested, please DM me.

gleaming gorge
#

Quick little game thing

#

If anyone has played Buckshot Roulette, this is a little sample environment I made that you can build an AI on

#

very similar to the video game

solar knoll
#

@gleaming gorge This is really cool, love that you used Mesa for the multi-agent setup, makes it super easy to swap in different agent classes. The BaseAgent abstraction is clean too.

gleaming gorge
#

I've started moving it into separate files, because after 500 lines of code Mesa can get hard to track, so moving things into a kind of variable holder helps