I am writing a gym environment for my reinforcement learning agent to train in, but it is slow as heck. I am using numpy for all my operations, and there is a single loop in my code (the while in the init to make sure the sample is within the range). I am thinking that maybe pandas is the reason why it's slow, or maybe I'm just writing slow, inefficient code. In any case, I'd appreciate a second set of eyes.
https://paste.pythondiscord.com/isepapabuw (please ping replies thanks <3)
#Is there any way to make this faster?
profiled the environment using cProfile and, as i thought, it's mostly pandas
this is the code i used to profile it
from .env import TradingSimulator, Action
import cProfile
import io
import pstats
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--out', type=str, default='./envprofile.csv', nargs='?')
out = parser.parse_args().out


def test_trading_env() -> None:
    env = TradingSimulator()
    state, _ = env.reset()
    assert state.shape == (2, env.seq_length + 2)
    action = Action(0.4, 0.3, 0.3)
    state, reward, done, _, _ = env.step(action)
    assert state.shape == (2, env.seq_length + 2)
    assert isinstance(reward, float)
    assert isinstance(done, bool)


## profiling the environment
if __name__ == '__main__':
    pr = cProfile.Profile()
    pr.enable()
    for _ in range(1000):
        test_trading_env()
    pr.disable()

    result = io.StringIO()
    pstats.Stats(pr, stream=result).print_stats()
    result = result.getvalue()
    # keep only the stats table, starting at its 'ncalls' header row
    result = 'ncalls' + result.split('ncalls')[-1]
    # turn the whitespace-separated columns into CSV (at most 6 fields per row)
    result = '\n'.join([','.join(line.rstrip().split(None, 5)) for line in result.split('\n')])
    # the with block closes the file, so no explicit f.close() is needed
    with open(out, 'w+') as f:
        f.write(result)
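For reading a profile like this, it can help to sort by cumulative time and print only the top entries instead of dumping every row. A minimal sketch of that with the standard `pstats` API; `busy_work` is a hypothetical stand-in for `test_trading_env()` so the snippet runs on its own:

```python
import cProfile
import io
import pstats


def busy_work(n: int = 10_000) -> int:
    # Hypothetical stand-in for test_trading_env(); any callable works here.
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
for _ in range(50):
    busy_work()
profiler.disable()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
# Drop directory prefixes, sort by cumulative time, and keep the top 10 rows.
stats.strip_dirs().sort_stats(pstats.SortKey.CUMULATIVE).print_stats(10)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts the functions that dominate total runtime (including everything they call) at the top, which makes it easier to see which call inside the environment is the expensive one.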
I'm still not blaming pandas completely; it could be that I am using it wrong. My _calc_state function is what spends the most time, and getting the data out of pandas in there is what slows it down. Perhaps there is a better way to use pandas there.
def _calc_state(self) -> None:
    ## state is shape (assets, assets + 288)
    ## first calculate features matrix (assets, 288)
    features = np.concatenate([
        self.btc_price_data[self.current_step - self.seq_length:self.current_step]['close'].to_numpy(),
        self.eth_price_data[self.current_step - self.seq_length:self.current_step]['close'].to_numpy(),
    ]).reshape((2, self.seq_length))
    ## price relative vector (assets, 288)
    price_relative = np.zeros((2, self.seq_length))
    open_prices = np.array([
        self.btc_price_data['open'][self.current_step - self.seq_length],
        self.eth_price_data['open'][self.current_step - self.seq_length],
    ])
    price_relative[:, 0] = features[:, 0] / open_prices
    price_relative[:, 1:] = features[:, 1:] / features[:, :-1]
    ## concatenate features and price relative covariance
    self.state = np.concatenate([features, np.cov(price_relative)], axis=1)
this is the calc state function
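One common way to cut per-step DataFrame overhead in a function like this is to convert the columns to NumPy arrays once, up front, and only slice those arrays on each step. A minimal sketch under some assumptions: both price series have the same length, and the DataFrames use the default integer index (so label-based and positional lookups agree). `PriceWindow` and its argument names are hypothetical and not part of the original environment; in the real code the arrays would come from something like `self.btc_price_data['close'].to_numpy()` in `__init__`.

```python
import numpy as np


class PriceWindow:
    """Hypothetical helper: keeps price columns as plain NumPy arrays."""

    def __init__(self, btc_close, btc_open, eth_close, eth_open, seq_length: int) -> None:
        self.seq_length = seq_length
        # One-time conversion to shape (2, n_rows); done once, not on every step.
        self.close = np.stack([np.asarray(btc_close, dtype=np.float64),
                               np.asarray(eth_close, dtype=np.float64)])
        self.open = np.stack([np.asarray(btc_open, dtype=np.float64),
                              np.asarray(eth_open, dtype=np.float64)])

    def calc_state(self, current_step: int) -> np.ndarray:
        start = current_step - self.seq_length
        # Basic slicing returns a view, so no DataFrame objects are built here.
        features = self.close[:, start:current_step]
        price_relative = np.empty_like(features)
        price_relative[:, 0] = features[:, 0] / self.open[:, start]
        price_relative[:, 1:] = features[:, 1:] / features[:, :-1]
        # Same layout as the original: (assets, seq_length + assets).
        return np.concatenate([features, np.cov(price_relative)], axis=1)


# Usage with made-up random prices, only to show the shapes involved.
rng = np.random.default_rng(0)
n_rows, seq_length = 50, 8
btc_close, btc_open = rng.uniform(1, 2, n_rows), rng.uniform(1, 2, n_rows)
eth_close, eth_open = rng.uniform(1, 2, n_rows), rng.uniform(1, 2, n_rows)
window = PriceWindow(btc_close, btc_open, eth_close, eth_open, seq_length)
state = window.calc_state(current_step=20)
print(state.shape)
```

Whether this helps enough depends on where the profile says the time actually goes, so it is worth re-profiling after a change like this.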
My first thought is that you create a lot of temporary variables which involves a lot of manipulation of the arrays. Now you have the logic sorted, can you reduce that?
not really much i can do tbh, i did make it 7x faster by switching to polars and using float32s instead of float64s though
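The float32 part of that change can be shown with NumPy alone. A small sketch with a made-up price array (not the thread's data): casting to float32 halves the memory per value, at the cost of roughly 7 significant decimal digits of precision instead of about 16. How much of the reported speedup comes from this versus the switch to polars is not something the thread states.

```python
import numpy as np

# Made-up price series, stored as float64 by default.
prices64 = np.linspace(100.0, 200.0, 1_000_000)
prices32 = prices64.astype(np.float32)

# Each float32 value takes 4 bytes instead of 8.
print(prices64.nbytes, prices32.nbytes)

# Largest relative rounding error introduced by the cast.
max_rel_err = np.max(np.abs(prices32.astype(np.float64) - prices64) / prices64)
print(max_rel_err)
```

For price ratios fed into a covariance, that loss of precision may or may not matter, so comparing a few states computed in both dtypes is a reasonable sanity check.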
Nice!
Have you tried pandas 2.0?
never heard of it tbh, but I switched to polars and that is so much faster
is it a different package than pandas? my pandas package is up to date
I haven’t tried it, but supposedly it can use Arrow instead of numpy as the backend, so it should work well with large data
It’s just the latest updates
For the longest time, the numpy backend was the reason why it was slow
i doubt that tbh
because polars also uses apache arrow
and the speedup was obvious when i switched to it
I keep seeing things about polars but have never tried it. I definitely will now.
it's not that different from pandas
in my case i just had to rewrite like 3 lines of code, but the logic remained largely the same