twilit tundra Aug 3, 2023, 4:57 PM

#

I'd go with the cross-validation selected model, it's the most likely to generalize well

celest vine Aug 3, 2023, 4:58 PM

#

Guys help.
I am running spark in pycharm. Loading data, doing some transformations. But when I try to write it in the same project folder I am getting windows error 5 Access is denied

haughty nest Aug 3, 2023, 5:00 PM

#

twilit tundra I'd go with the cross-validation selected model, it's the most likely to general...

k thanks

agile cobalt Aug 3, 2023, 5:01 PM

#

just keep in mind that if you tune the hyperparameters too much you might end up 'overfitting' them to your test set

small wedge Aug 3, 2023, 5:15 PM

#

I will try to break out the math to make it easier to understand.

C = 1/m * Σ(i=1;m) (yᵢ - aᵢ)^2
a = 1/(1+e^-z)
z = wᵀx + b

∂C/∂a = -2/m * Σ(i=1;m) (yᵢ - aᵢ)
∂a/∂z = a * (1-a)
∂z/∂w = xᵀ


to calculate the gradient of the weights ∂C/∂w we just multiply these partials together using the vector chain rule.

∂z/∂w * ∂a/∂z * ∂C/∂a

which is the same as 

-2/m Σ(i=1;m) (yᵢ-aᵢ) * a * (1-a) * xᵀᵢ

Sigmoid makes it a bit messy to write out but that would be how you calculate the gradient.  Breaking it up into partials makes it a lot easier to understand IMO.

#

I'd recommend reading this https://explained.ai/matrix-calculus/index.html as it goes a lot more in depth than I can on the exact details; although they use ReLU in the example instead of sigmoid.
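
For anyone who wants to sanity-check that last expression, here is a minimal plain-Python sketch. It reads the a and x in each term as aᵢ and xᵢ (the values for sample i) and compares the formula against a finite-difference estimate. The toy numbers and the function names (`cost`, `analytic_grad_w`, `numeric_grad_w`) are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost(w, b, X, y):
    # C = 1/m * sum_i (y_i - a_i)^2 with a_i = sigmoid(w . x_i + b)
    m = len(X)
    total = 0.0
    for x_i, y_i in zip(X, y):
        z_i = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        total += (y_i - sigmoid(z_i)) ** 2
    return total / m


def analytic_grad_w(w, b, X, y):
    # dC/dw = -2/m * sum_i (y_i - a_i) * a_i * (1 - a_i) * x_i
    m = len(X)
    grad = [0.0] * len(w)
    for x_i, y_i in zip(X, y):
        a_i = sigmoid(sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b)
        common = -2.0 / m * (y_i - a_i) * a_i * (1.0 - a_i)
        for j, x_j in enumerate(x_i):
            grad[j] += common * x_j
    return grad


def numeric_grad_w(w, b, X, y, eps=1e-6):
    # Central finite differences, one weight at a time
    grad = []
    for j in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        grad.append((cost(w_plus, b, X, y) - cost(w_minus, b, X, y)) / (2 * eps))
    return grad


# Arbitrary toy data: 3 samples, 2 features
X = [[0.5, -1.2], [1.5, 0.3], [-0.7, 0.8]]
y = [1.0, 0.0, 1.0]
w = [0.1, -0.4]
b = 0.2

analytic = analytic_grad_w(w, b, X, y)
numeric = numeric_grad_w(w, b, X, y)
print(analytic)
print(numeric)
```

If the two printed lists agree to several decimal places, the chain-rule expression matches the cost function it was derived from.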

slim bone Aug 3, 2023, 5:36 PM

#

small wedge I will try to break out the math to make it easier to understand. ``` C = 1/m *...

First off, thank you for taking the time to type all of this
I'm trying to process the part with the derivatives, I'm not entirely sure I follow though:
When you write ∂C/∂v = 2/m Σ(i=1;m) vᵢ for example, don't you need to get the partial derivative of some specific vᵢ? It looks like you took the derivative of every single vᵢ and added them up?

small wedge Aug 3, 2023, 5:39 PM

#

hmm why would you need to take a derivative with respect to a single vᵢ? The sum makes the output scalar anyway

slim bone Aug 3, 2023, 5:40 PM

#

Because there is no ‘v’ in the equation, or is there?

#

It’s just a sum of the delta between predictions and targets?

small wedge Aug 3, 2023, 5:41 PM

#

yes, v is just a term used to shorten that delta between predictions and targets when writing out the formula (and also to make it clear where the -1 is coming from)

#

you can remove v and put the (y - y_hat) there instead and the math will work the same

#

C = 1/m * Σ(i=1;m) (yᵢ - aᵢ)^2
∂C/∂a = -2/m * Σ(i=1;m) (yᵢ - aᵢ)

#

am I understanding what you're asking correctly?

tidal bough Aug 3, 2023, 5:51 PM

#

small wedge hmm why would you need to take a derivative with respect to a single `vᵢ`? The s...

That's not very valid though - what you wrote as ∂C/∂v (which should be a vector - the gradient of the scalar C) is technically ∂C/(Σvᵢ).

small wedge Aug 3, 2023, 5:55 PM

#

oops, I can change it then. I meant to make it more clear where the -1 was coming from but ig I just made it more confusing.

slim bone Aug 3, 2023, 6:05 PM

#

small wedge am I understanding what you're asking correctly?

I think so, I just had to bail for a moment. I need to ponder a little on what you wrote

#

Thanks again

tidal bough Aug 3, 2023, 6:15 PM

#

Here's what I'm getting

small wedge Aug 3, 2023, 6:17 PM

#

tidal bough Here's what I'm getting

ahhh I see what you're saying

#

you're right that was a big mistake

slim bone Aug 3, 2023, 6:18 PM

#

Wow

#

You two are special, I might finally get it
When people just say “use the chain rule” it really doesn’t mean much to a beginner but when you put it like that it’s so much nicer

#

Oh and needless to say, thank you

small wedge Aug 3, 2023, 6:23 PM

#

That's great ^^, I know the feeling of that epiphany. It's really simple if you abstract it out to partials

#

and my apologies with the incorrect math and making it more confusing lol

slim bone Aug 3, 2023, 6:25 PM

#

Oh no worries! It wasn’t such a terrible mistake and it probably made the initial reading a little clearer :)

tidal bough Aug 3, 2023, 6:29 PM

#

I think "use the chain rule" is often confusing because calculus courses don't often have examples where a function has n variables 🙂

#

It'd be like this:

#

(In the derivation above, the function in question is C, which depends on a_1, ..., a_m. Hence the sum in the result.)
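
Written out for the cost above, the multivariable chain rule would read as follows. This is a sketch in LaTeX, assuming as in the earlier messages that aᵢ = σ(zᵢ) and zᵢ = wᵀxᵢ + b, with one term per sample i:

```latex
\frac{\partial C}{\partial w}
  = \sum_{i=1}^{m}
    \frac{\partial C}{\partial a_i}\,
    \frac{\partial a_i}{\partial z_i}\,
    \frac{\partial z_i}{\partial w},
\qquad
\frac{\partial C}{\partial a_i} = -\frac{2}{m}\,(y_i - a_i),
\quad
\frac{\partial a_i}{\partial z_i} = a_i\,(1 - a_i),
\quad
\frac{\partial z_i}{\partial w} = x_i^{\top}
```

Each ∂C/∂aᵢ is a single term with no sum; the sum over i only shows up when the contributions of all m outputs are added together.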

desert oar Aug 3, 2023, 6:35 PM

#

fair, i'm guilty of advising that 😉

slim bone Aug 3, 2023, 6:35 PM

#

tidal bough I think "use the chain rule" is often confusing because calculus courses don't o...

Admittedly my calculus 2 course barely even touched multivariable calculus so this is all fairly confusing ^^;
This really helped though

slim bone Aug 3, 2023, 6:35 PM

#

desert oar fair, i'm guilty of advising that 😉

You helped a ton too!

desert oar Aug 3, 2023, 7:24 PM

#

yeah but i hope i didn't mislead along the way

#

@tidal bough do you actually need the sum property there? i think it "expands" naturally in this case because C is itself a sum over i, so the whole thing expands out into a sum of partial derivatives by linearity

#

i never learned total vs partial derivatives properly in school, it's something i should probably revisit at some point

tidal bough Aug 3, 2023, 7:34 PM

#

desert oar @tidal bough do you actually need the sum property there? i think it "e...

sure, you could think of it that way, I'm just mentioning the general case

desert oar Aug 3, 2023, 7:36 PM

#

tidal bough sure, you could think of it that way, I'm just mentioning the general case

i just wanted to make sure i wasn't missing something fundamental! i feel like i didn't learn anything in school properly and had to re-teach myself everything i know, so i'm always wondering if there's something i don't know that i don't know

iron basalt Aug 3, 2023, 7:44 PM

#

slim bone Admittedly my calculus 2 course barely even touched multivariable calculus so th...

There are a few key ideas from which you can probably guess how the rest will play out. It's important to keep in mind what the derivative actually is / represents, and how that plays together with linear algebra. For example, if I just flash this image of the Jacobian matrix, you can probably guess how a lot of other things work (but really you want a multivariate calculus book):

tidal bough Aug 3, 2023, 7:44 PM

#

ah yes, nabla-transposed 🥴

iron basalt Aug 3, 2023, 7:45 PM

#

tidal bough ah yes, nabla-transposed 🥴

Notation was a mistake.

tidal bough Aug 3, 2023, 7:45 PM

#

mathematicians try not to abuse notation challenge (impossible)

slim bone Aug 3, 2023, 7:45 PM

#

iron basalt There are a few key ideas from which you can probably guess how the rest will pl...

I actually think as far as geometric intuition goes I did nail most of it down
As far as connecting the dots between calculus and linear algebra goes though… I’m only now beginning to dip my toes into that

#

ML is probably the first time I’ve seen the two go hand in hand

iron basalt Aug 3, 2023, 7:46 PM

#

Oh, and this one image:

slim bone Aug 3, 2023, 7:47 PM

#

iron basalt There are a few key ideas from which you can probably guess how the rest will pl...

And uh, I’m genuinely not sure what you mean by “guess how other things work” ^^;

#

feel free to elaborate

slim bone Aug 3, 2023, 7:49 PM

#

iron basalt Oh, and this one image:

I think this is hitting beyond my bracket considering I’ve just learned how to use the chain rule with more than a single nested function

Nevertheless, extremely curious to note that such a connection even exists. Thank you

iron basalt Aug 3, 2023, 8:05 PM

#

slim bone And uh, I’m genuinely not sure what you mean by “guess how other things work” ^^...

With a strong foundation in the fundamentals of calculus (its purpose / what it's doing), you can predict how it will play out in combination with something like vectors / linear algebra. This is my usual approach, I derive my own stuff, then after that read more about the topic. After a certain point I only need a few cues to predict the rest / adjust my stuff to match. This approach lets me know that I actually understood the previous topic leading up to the next one (confirmation of prediction), including its purpose from which much can be predicted since I then can guess what the inventors of the topic were aiming for / would probably go for next (on the same timeline / "wavelength" / however you want to put it). The other thing I do is follow a historical approach. I try to find how/why they were inventing the math / what the context was at the time (what did people know then / what were the unsolved problems). This kind of prediction task (predict answer, then check) is usually done via the practice problems in books, but I like to take it a step further and predict the next chapter(s) too. I'm not recommending this approach, it's just what I do.

#

Basically, I like to reinvent the wheel.

serene scaffold Aug 3, 2023, 8:49 PM

#

!otn a squiggle's reinvented wheel

arctic wedgeBOT Aug 3, 2023, 8:49 PM

#

:ok_hand: Added squiggle’s-reinvented-wheel to the names list.

upper flame Aug 3, 2023, 8:57 PM

#

Hey guys do you know a lil bit of finance ? Cause i have a trading ai that i try to finish … could someone help me please 🙏. This AI has a very big potential, the people who accept to help can keep the code and run it to generate some wealth… it’s about 95% done

small wedge Aug 3, 2023, 9:03 PM

#

upper flame Hey guys do you know a lil bit of finance ? Cause i have a trading ai that i try...

you're not allowed to offer money for work here. You can instead ask your questions that you need help with and people might try to help for free 🙂

upper flame Aug 3, 2023, 9:59 PM

#

small wedge you're not allowed to offer money for work here. You can instead ask your quest...

Hello @small wedge, thank you for your advice. i didn’t offer any money, i just mentioned that the code generates wealth. If you want to help me, you’re most welcome, i’ll give you the information. Thank you

small wedge Aug 3, 2023, 10:08 PM

#

upper flame Hello @small wedge, thank you for your advice i didn’t offer any money ...

if you post your question and the relevant information here I'd be happy to try

upper flame Aug 3, 2023, 10:11 PM

#

small wedge if you post your question and the relevant information here I'd be happy to try

thank you so much for your engagement

#

Here is a snippet; these are the function calls

#

print("test 1")
bot = Bot()
print("test 2")    

# Call Market class

market = Market(symbol='EURUSD', yahoo_ticker='MSFT', currency='EUR', hist_window=365)
market.fx_price()
market.stock_price()
data=market.market_to_dataframe()

# Call the Balance class
ip_address = "127.0.0.1" 
port_id = 7495 
client_id = 1  
current_price=market.fx_price(real_time= True)
price=market.fx_price()


bot.nextorderId = None
bot.run_loop();
print("wa7el Houni");
balance = BalanceApp(ip_address,port_id,client_id)
balance.start()
balance.accountSummary(reqId=123, account="DU11643091", tag="TotalCashValue", value="12345", currency="EUR")
balance.error(reqId=123, errorCode=456, errorString="Some error message")

# Call the RiskManager class

riskmg = RiskManager(balance, stop_loss_pct=0.05)
max_take_profit_pct = riskmg.calculate_max_take_profit_pct()
print("Maximum take profit pct: ", max_take_profit_pct)
order_size=riskmg.calculate_order_size(current_price)
print("Order size:", order_size)
riskmg.calculate_risk(price, stop_loss=7.5)

# Call the NNTS class

nnts = NNTS(lookback=50, units=128, dropout=0.5, epochs=200, batch_size=64)
X, y=nnts._prepare_data(data)
model=nnts._build_model(X)
buy_signals=nnts.generate_signals(data, strategy='buy')
sell_signals=nnts.generate_signals(data, strategy='sell')

# Call the TradingProcess class

tp = TradingProcess(balance, risk_percentage=0.05)
tp.update_equity()
tp.can_open_position(price, stop_loss=0.05)
tp.can_afford_position(price)
tp.open_position(price, stop_loss=0.05)
tp.close_position(price)
tp.update_position(price)
tp.fit(X, y)
tp.predict(X)

# Call the DataProcessor class

datapp = DataProcessor(feature_collumns=["open","high", "low", "close", "volume"])
datapp.preprocess_data(data)


# Call PlaceCancelOrder class

pcorder = PlaceCancelOrder()
pcorder.place_order(buy_signals, sell_signals, symbol='EURUSD', order_type='MKT')
pcorder.cancel_order(order_id=1)

# Call Bot function
bot.execute_trade(buy_signals, sell_signals, price)

#

and the full code is here: https://github.com/CodeBYMehdi/GPT

#

@small wedge

small wedge Aug 3, 2023, 10:23 PM

#

which part do you need help with?

upper flame Aug 3, 2023, 10:28 PM

#

small wedge which part do you need help with?

principally the function calls

#

they are almost done, but there are some arguments that i couldn't figure out how to pass

tidal scroll Aug 3, 2023, 10:30 PM

#

Hello, everyone. I would like to ask about a slight problem in an RDF graph, so the elements are too close to each other, and there is no space between them. I have been working on this project for my final school assignment and have searched everywhere on Google, Graphviz documentation, Stack Overflow, and YouTube, but none of the solutions are working. Therefore, I would appreciate some assistance here if you don't mind.

This is the code

`new_rdf_file = '../../output/rdf/dummy_rdf.rdf'

g.parse(new_rdf_file, format='xml')

gv_graph = graphviz.Graph(strict=True, format='svg', engine='neato')

def get_local_name(uri):
    uri_str = str(uri)
    return uri_str.replace(nba_players, '').replace("http://", '').replace("https://", '')

for subject, predicate, obj in g:
    subject_label = get_local_name(subject)
    obj_label = get_local_name(obj)
    predicate_str = str(predicate)

    # Add nodes and edges to the Graphviz graph
    gv_graph.node(subject_label)
    gv_graph.node(obj_label)
    # gv_graph.edge(subject_label, obj_label, label=predicate_str)
    gv_graph.edge(head_name=subject_label, tail_name=obj_label, label=predicate_str)

gv_graph.attr(pad="1.0")

output_file = 'dummy_output.svg'
gv_graph.render(output_file, view=True)`

Thank you in advance

small wedge Aug 3, 2023, 10:36 PM

#

upper flame there are almost done but there are some arguments that i couldn't figure how to...

can you give some specific examples? which function(s) are giving you issues

upper flame Aug 3, 2023, 10:38 PM

#

small wedge can you give some specific examples? which function(s) are giving you issues

the first one is: ```py
bot.execute_trade()
```

#

what should i put in quantity

small wedge Aug 3, 2023, 10:49 PM

#

upper flame what should i put in quantity

well it'd be whatever units of thing that you're buying/selling with the bot, looks like this is currency exchange?

#

oh or it's just stocks

#

then I assume it'd be how many shares you want of a stock

upper flame Aug 3, 2023, 10:49 PM

#

no it's both

#

but i'll start with currency exchange

#

the thing is i don't know how the ai will buy the units that i can afford

misty flint Aug 4, 2023, 4:08 AM

#

new colab feature?

random raft Aug 4, 2023, 5:14 AM

#

hello

#

anyone?

sleek harbor Aug 4, 2023, 5:46 AM

#

upper flame the thing is i don't know how the ai will buy the units that i can afford

I can only give a small recommendation of do not use that thing in real life with your money. You don't seem to even know how it works..

weary sedge Aug 4, 2023, 6:45 AM

#

Does anyone know a solution to the issue that I have?

"(base)" does not display on the terminal, when I use bash. But when I change my shell to zsh, it displays.

Why is that?

sage obsidian Aug 4, 2023, 6:46 AM

#

Can anyone recommend a difficult python project?

unique flame Aug 4, 2023, 7:40 AM

#

Does a text summarisation model from huggingface send your text to huggingface? I downloaded the model to my device and ran it without internet, but was wondering if there are security issues when summarising personal documents.

twilit tundra Aug 4, 2023, 7:46 AM

#

unique flame Does a text summarisation model from huggingface sent your text to huggingface? ...

In theory, you're just loading pretrained models. There is no data sent to their server

unique flame Aug 4, 2023, 7:49 AM

#

Thanks wanted to be sure!

upper flame Aug 4, 2023, 8:49 AM

#

sleek harbor I can only give a small recommendation of do *not* use that thing in real life w...

of course i know because i wrote the code but there are some issues that i am struggling to fix, and even if i finish it i'll test in a simulated environment

unique current Aug 4, 2023, 8:56 AM

#

guys got question is there a way to replace \ in text using replace option?

#

i tried

message.replace("\", "")

#

but doesnt work

twilit tundra Aug 4, 2023, 9:05 AM

#

unique current guys got question is there a way to replace \ in text using replace option?

\

#

\\

unique current Aug 4, 2023, 9:31 AM

#

?

unique current Aug 4, 2023, 9:31 AM

#

twilit tundra \\

?

twilit tundra Aug 4, 2023, 9:32 AM

#

Use a double backslash, backslash is a special character
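
To spell that out with a minimal sketch (the example string is made up): inside a normal Python string literal, `"\\"` is how you write one backslash character, so replacing `"\\"` removes single backslashes from the text. The original `"\"` fails because `\"` escapes the closing quote.

```python
# The literal below contains single backslashes; "\\" in source code is ONE character.
message = "C:\\Users\\me\\file.txt"
print(message)          # shows the text with single backslashes
print(len("\\"))        # length of the backslash string

# Replace every backslash with nothing
cleaned = message.replace("\\", "")
print(cleaned)
```

So even though two backslashes are typed in the code, only the single backslash that is actually in the text gets matched.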

unique current Aug 4, 2023, 9:32 AM

#

but one is in text

#

i just need to change output of script

#

and output is text and \

sleek harbor Aug 4, 2023, 11:18 AM

#

Question to those who work. How powerful of a PC/laptop do you need? Do employers provide cloud compute, so that you could work on a weak device, or do they expect you to have a powerful PC and use your own processing power for everything you do?

Would the laptop linked below (gave 2 links in case one doesn't work) be good enough for work? No GPU, and not the best CPU. Only 8Gb RAM.. but what do you think?

https://sl.aliexpress.ru/p?key=ScdFZED
https://aliexpress.ru/item/1005001520846730.html?sku_id=12000027438880217&spm=a2g2w.productlist.search_results.1.1a364aa6fKeQyh

serene scaffold Aug 4, 2023, 11:31 AM

#

sleek harbor Question to those who work. How powerful of a PC/laptop do you need? Do employer...

if you're doing AI/ML for a company, they'll very likely provide you with a laptop. and it's very unlikely that you'd be doing model development on that laptop.

sleek harbor Aug 4, 2023, 11:32 AM

#

serene scaffold if you're doing AI/ML for a company, they'll very likely provide you with a lapt...

😢 if it's a remote job for an overseas company - they won't be able to give me a laptop..

serene scaffold Aug 4, 2023, 11:33 AM

#

sleek harbor 😢 if it's a remote job for an overseas company - they won't be able to give me ...

laptops can be delivered

twilit tundra Aug 4, 2023, 11:51 AM

#

sleek harbor 😢 if it's a remote job for an overseas company - they won't be able to give me ...

Most of the companies I've worked for had a cloud environment or equivalent, you don't need a powerful laptop

#

Even outside of tech, they provide their own laptop for security reasons

boreal gale Aug 4, 2023, 1:24 PM

#

this is pretty dang cool! i am just swamped by work atm, i might give this a look in the weekend

sleek harbor Aug 4, 2023, 3:29 PM

#

boreal gale this is pretty dang cool! i am just swamped by work atm, i might give this a loo...

Would be grateful, as I really can't seem to figure out why the table doesn't display immediately, as expected

slim bone Aug 4, 2023, 3:32 PM

#

Kind of curious regarding Pytorch's nn.Linear() function:

test_img = torch.ones(1,4,4, dtype=torch.float)
test_flatten = nn.Flatten()
test_flattened_image = test_flatten(test_img)
test_layer1 = nn.Linear(16, 4)
test_hidden1 = test_layer1(test_flattened_image)
print(test_hidden1)
-------------------
output:
random values in (-1,1)

Anyone got any idea what that's about? are the weights just initialized randomly?
document for reference: https://pytorch.org/docs/stable/generated/torch.nn.Linear.html

twilit tundra Aug 4, 2023, 3:34 PM

#

The weights are initialized randomly

slim bone Aug 4, 2023, 3:35 PM

#

twilit tundra The weights are initialized randomly

Oh, and I'm assuming there's a way to attribute a weight to every node between two layers somehow later on?

twilit tundra Aug 4, 2023, 3:35 PM

#

Screenshot_20230804_173500_io.github.forkmaintainers.iceraven.png

#

Yes you can overwrite the attributes or reload a pretrained model

slim bone Aug 4, 2023, 3:35 PM

#

Ah cool, that's kind of what confused me to begin with

slim bone Aug 4, 2023, 3:36 PM

#

twilit tundra

Which browser are you using by the way? Pytorch.org doesn't have native dark mode right?

twilit tundra Aug 4, 2023, 3:37 PM

#

I'm on iceraven on my phone, with the dark reader extension

slim bone Aug 4, 2023, 3:37 PM

#

Looks really neat 🙂 Ty for the help

dusty valve Aug 4, 2023, 4:08 PM

#

Hello humans, fastest way to render a 2d image? Im doing some computing and the output is a 2d array. Ive used mplt. Scipy or Pillow seem good too

molten hamlet Aug 4, 2023, 4:19 PM

#

depends whats loader is doing ;d

#

Can you stack arrays with different shapes somehow? I want to index different arrays

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5])
arr3 = np.array([6, 7, 8, 9])

stacked = np.stack([arr1, arr2, arr3])

scope = stacked[0,1] # 2
scope = stacked[2,2] # 8

celest vine Aug 4, 2023, 4:31 PM

#

Hey, any data engineers here?
I have 1 year experience as a data analyst and I am trying to break into data engineering.
I know python, sql, hadoop, spark and azure (adf, databricks) and also the basics of airflow.
Is this enough to land a job?

balmy idol Aug 4, 2023, 4:36 PM

#

does python have a filter function similar to R's? specifically I'm looking for a way to utilize R's circular parameter
circular: for convolution filters only. If TRUE, wrap the filter around the ends of the series, otherwise assume external values are missing (NA).
https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/filter

filter function - RDocumentation

Applies linear filtering to a univariate time series or to each series
separately of a multivariate time series.

mild dirge Aug 4, 2023, 4:36 PM

#

molten hamlet Can you stack arrays with different shapes somehow? I want to index different ar...

Numpy arrays need to have a homogeneous (rectangular) shape, so no

#

You could perhaps pad the shorter arrays if it really helps with efficiency and the lengths don't differ too much
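
A rough sketch of the padding idea in plain Python, using the values from the question. `PAD` is a hypothetical filler value chosen for illustration; it has to be something that can never be real data.

```python
# Ragged data: rows of different lengths (values taken from the question)
rows = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

PAD = -1  # hypothetical filler; pick a value that cannot occur in the real data
width = max(len(row) for row in rows)

# Extend every row to the same length so the result is rectangular
padded = [row + [PAD] * (width - len(row)) for row in rows]

print(padded)
print(padded[0][1])  # second element of the first row
print(padded[2][2])  # third element of the third row
```

Once every row has the same length, passing `padded` to `np.array` gives a regular 2D array that supports `[row, col]` indexing. A plain list of arrays, indexed as `arrays[2][2]`, also works when no padding is wanted.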

molten hamlet Aug 4, 2023, 4:38 PM

#

mild dirge You could perhaps pad the shorter arrays if it really helps with efficiency and ...

nah, I guess looping is my only solution to it

tidal bough Aug 4, 2023, 4:42 PM

#

balmy idol does python have a filter function similar to R's? specifically I'm looking for ...

The functions from scipy.signal usually have an argument for boundary conditions. Not sure which specific function this one corresponds to, though.

balmy idol Aug 4, 2023, 4:45 PM

#

tidal bough The functions from `scipy.signal` usually have an argument for boundary conditio...

thank you, im trying to accomplish this from R:

```
> x <- 1:100
> filter(x, rep(1, 3), circular = TRUE)
Time Series:
Start = 1 
End = 100 
Frequency = 1 
  [1] 103   6   9  12  15  18  21  24  27  30  33  36  39  42  45  48  51  54  57  60  63  66  69  72  75  78  81  84  87  90  93  96  99 102
 [35] 105 108 111 114 117 120 123 126 129 132 135 138 141 144 147 150 153 156 159 162 165 168 171 174 177 180 183 186 189 192 195 198 201 204
 [69] 207 210 213 216 219 222 225 228 231 234 237 240 243 246 249 252 255 258 261 264 267 270 273 276 279 282 285 288 291 294 297 200
```

in python the closest i can get is this:

```
np.convolve(x, [1,1,1], mode='valid')
array([  6,   9,  12,  15,  18,  21,  24,  27,  30,  33,  36,  39,  42,
        45,  48,  51,  54,  57,  60,  63,  66,  69,  72,  75,  78,  81,
        84,  87,  90,  93,  96,  99, 102, 105, 108, 111, 114, 117, 120,
       123, 126, 129, 132, 135, 138, 141, 144, 147, 150, 153, 156, 159,
       162, 165, 168, 171, 174, 177, 180, 183, 186, 189, 192, 195, 198,
       201, 204, 207, 210, 213, 216, 219, 222, 225, 228, 231, 234, 237,
       240, 243, 246, 249, 252, 255, 258, 261, 264, 267, 270, 273, 276,
       279, 282, 285, 288, 291, 294, 297])
```
but the 200 and the 103 drop off

tidal bough Aug 4, 2023, 4:48 PM

#

balmy idol thank you, im trying to accomplish this from R: ``` > x <- 1:100 > filter(x, re...

Huh, it looks like scipy.signal's convolve doesn't have wrapping, which is weird to me. Anyway, you can use ndimage's instead:

>>> scipy.ndimage.convolve(x, [1,1,1], mode='wrap')
array([103,   6,   9,  12,  15,  18,  21,  24,  27,  30,  33,  36,  39,
        42,  45,  48,  51,  54,  57,  60,  63,  66,  69,  72,  75,  78,
        81,  84,  87,  90,  93,  96,  99, 102, 105, 108, 111, 114, 117,
       120, 123, 126, 129, 132, 135, 138, 141, 144, 147, 150, 153, 156,
       159, 162, 165, 168, 171, 174, 177, 180, 183, 186, 189, 192, 195,
       198, 201, 204, 207, 210, 213, 216, 219, 222, 225, 228, 231, 234,
       237, 240, 243, 246, 249, 252, 255, 258, 261, 264, 267, 270, 273,
       276, 279, 282, 285, 288, 291, 294, 297, 200])
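
For anyone without scipy at hand, the wrap-around behaviour can also be sketched in plain Python. `circular_filter` is a hypothetical helper, not a library function, and it assumes an odd-length kernel centred on each point, which is how I read R's two-sided convolution filter.

```python
def circular_filter(x, weights):
    """Centred convolution where indices wrap around the ends of x."""
    n = len(x)
    k = len(weights)
    half = k // 2  # offset so the kernel is centred on position i
    out = []
    for i in range(n):
        # (i + half - j) % n wraps past either end of the series
        out.append(sum(weights[j] * x[(i + half - j) % n] for j in range(k)))
    return out


x = list(range(1, 101))           # same input as the R example: 1..100
result = circular_filter(x, [1, 1, 1])
print(result[:4], result[-2:])
```

The first and last entries are the ones that need values from the opposite end of the series, which is exactly what `mode='valid'` drops.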

balmy idol Aug 4, 2023, 4:49 PM

#

tidal bough Huh, it looks like scipy.signal's convolve doesn't have wrapping, which is weird...

oh you just made my day friend!

molten hamlet Aug 4, 2023, 6:10 PM

#

__getitem__ is used with []

lean breach Aug 4, 2023, 6:13 PM

#

is anybody really strong with using langchain/chroma with the gpt-api. Im working on a project and have some questions if anybody would hop on a discord call wit me.

molten hamlet Aug 4, 2023, 6:17 PM

#

what do you measure with time? full loop? or just getting items

boreal gale Aug 4, 2023, 7:56 PM

#

sleek harbor Would be grateful, as I really can't seem to figure out why the table doesn't di...

prefixing this answer with disclaimer: i am not a dash expert, so take what i said with a grain of salt.

here is what i understood

upon turning on debug mode, i immediately saw there is an error on start up, this would explain the issue you are seeing (or maybe rather confirming the symptoms you are seeing)
upon opening the callback DAG view with debug mode on, i see your intra_sector_corr populating function takes quite some time to run, whereas your stocks table populating function is immediately triggered. this is basically a race condition due to improper specification of which callback to run first (as to how you can do this, see the link i posted before or my attempt below)
by adding

@app.callback(Output("stocks-dropdown", "value"), Input("stocks-dropdown", "options"))
def pick_first_option_on_change(options):
    return options[0]

i can alter the callback DAG into this. i believe the callbacks run from top to bottom on initialisation, so we have successfully made the race condition go away by forcing the stocks table to wait until the first callback for intra_sector_corr population completes

twilit tundra Aug 4, 2023, 9:29 PM

#

I'm not sure I understand your question. Should the 5k photos be labelled as 0 and the 70-100 photos labelled as 1?

#

@lapis sequoia

#

Since you should have random photos vs a small number of close photos, I'd use an existing image embedder, embed a few of the class 1 photos as reference, embed the rest of the image and class them by cosine similarity with the reference

#

The technique I mentioned should work, the alternative is probably managing the imbalance by oversampling your class 1 photos

dusty valve Aug 5, 2023, 12:29 AM

#

how do i build numpy with certain cpu options? i need it to be faster so i want to build it and i followed this
https://numpy.org/devdocs/reference/simd/build-options.html#quick-start
the problem is, i dunno where the setup.py is. if i go to the site-packages/numpy/setup.py it just says that this is the incorrect setup.py to run

serene scaffold Aug 5, 2023, 1:00 AM

#

dusty valve how do i build numpy with certain cpu options? i need it to be faster so i want ...

how do you know that your CPU supports operations that aren't supported by the wheel you would get from pypi?

dusty valve Aug 5, 2023, 3:02 AM

#

serene scaffold how do you know that your CPU supports operations that aren't supported by the w...

just checked, numpy wheels on pypi already utilize max cpu instructions

balmy idol Aug 5, 2023, 4:06 AM

#

Is there a python analog to this behavior from sequence in R? :

```
 [1] 1 2 3 4 1 2 3 1 2 1
```

agile cobalt Aug 5, 2023, 4:40 AM

#

not built-in, but you can just implement it yourself

#

we have range() for normal [1, 2, 3, 4], but that behaviour you exemplified seems very weird

#

!e ```py
def sequence(n):
    for j in range(n, -1, -1):
        yield from range(j)
print(list(sequence(4)))
```

arctic wedgeBOT Aug 5, 2023, 4:42 AM

#

@agile cobalt :white_check_mark: Your 3.11 eval job has completed with return code 0.

[0, 1, 2, 3, 0, 1, 2, 0, 1, 0]

agile cobalt Aug 5, 2023, 4:44 AM

#

numpy/scipy might have it builtin somewhere, but if not you'll probably want to create arrays with np.arange then concatenate them with some other numpy function
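
If the 1-based numbering of the R output matters, a small variant could look like this. It is a sketch that assumes the R behaviour is "count from 1 up to each n in turn"; the exact R call is not shown in the question, so the `lengths` argument is a guess at its input.

```python
from itertools import chain


def sequence(lengths):
    """Concatenate 1..n for each n in lengths (1-based, like the R output)."""
    return list(chain.from_iterable(range(1, n + 1) for n in lengths))


print(sequence([4, 3, 2, 1]))
```

Taking a list of lengths rather than a single number keeps it usable for inputs that do not simply count down.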

wind shale Aug 5, 2023, 6:07 AM

#

#

i have a solution but don't know how it works, can anyone help me

timid grove Aug 5, 2023, 6:58 AM

#

I have penned an article with valuable insights. I would love to hear your feedback.
https://medium.com/@sahaniiianuj/bidirectional-english-marathi-language-translation-model-82f39b99bf98
Thank You !

Medium

Bidirectional English — Marathi Language Translation Model

Fine-tuning pretrained mbart50 model to make a en-mr bidirectional translation model using Hugging Face transformers.

trim spruce Aug 5, 2023, 8:03 AM

#

Hello!
I would like to make an application that will make me an electricity forecast for the next period based on trained models. what should I start with? Apart from the correlation coefficient/humidity/temperature/seasonality coefficient, what else can I use? Sorry if I'm posting where I shouldn't, please redirect me!

sleek harbor Aug 5, 2023, 8:05 AM

#

boreal gale prefixing this answer with disclaimer: i am not a dash expert, so take what i sa...

so basically, if I understand correctly, that's simply setting the default value for the second dropdown. And it works!!! Thank you! I still don't really quite understand why it doesn't work without it.. I would have thought that since the table updating callback has the second dropdowns "value" as an input, dash would make the connection that the second dropdown needs to have an "options" parameter to work, which is the output of the first dropdown, and dash would define the order appropriately. Turns out, it seems, it doesn't make the connection, that in order to choose a value, you first need options. Will keep that in mind for the future. I learned something new today! Thx)

vestal spruce Aug 5, 2023, 10:14 AM

#

From my past experience, as long as your data is consistent and not SVG (which represents images in a different way), everything should be fine. Some factors outside your question could still affect your model's performance, such as the size of the images: high-resolution images tend to have thousands if not millions of pixels, which affects how long your model takes to process and train on the data. I hope this answers your question and is of use. 🎩 👌

oblique quarry Aug 5, 2023, 11:56 AM

#

Good afternoon, can somebody please take a look at this?

WhatsApp_Bild_2023-08-05_um_13.54.04.jpg

slim bone Aug 5, 2023, 11:59 AM

#

Hey folks, I think there's a critical knowledge gap in my understanding of gradient descent:
Let us assume a neural network with a single input layer with 3 neurons , and an output layer with 2 neurons
So we feed the system some data, and it outputs some neuron with the highest value (prediction)
I'll ignore the activation function

To fix the weights take some loss function L:
L = loss(w1a1 + w2a2 + w3a3 + b)
calculate its gradient with respect to the weights, and update the weights - This decreases the loss of the function (Assuming we're not already close to the local minimum):
New weights: m1, m2, m3
Now we go to the next batch of data and do the same thing: (b1, b2, b3)

Problem is though - now the function has changed: The input is different, and thus the loss function is different - so the local minimum of the loss function has shifted elsewhere.
L = loss(m1b1 + m2b2 + m3b3 + b)

What am I missing here? Thanks in advance

#

(Just to clarify, this is me explaining the mental image in my head - not me trying to prove something of course)

tidal bough Aug 5, 2023, 12:14 PM

#

slim bone Hey folks, I think there's a critical knowledge gap in my understanding of gradi...

You aren't missing anything - when doing minibatching, the current weights "jerk around", moving each time towards the local minimum on the current batch.

#

But weirdly enough, that ends up working alright to optimize on the whole dataset. In fact, even weirder (to me at least), stochastic gradient descent sometimes works better than optimizing on the whole dataset at once, because this jittering helps the model to not get stuck in shallow local minima, but rather move gradually to the global one - much like how optimization algorithms like simulated annealing occasionally accept changes that increase loss in order to break out of local minima.

slim bone Aug 5, 2023, 12:40 PM

#

tidal bough You aren't missing anything - when doing minibatching, the current weights "jerk...

You're kind of blowing my mind here, let me get this straight -
The function changes between each batch - and thus the local minima we've been chasing* moves (as in, it might be in a different direction entirely than the negative gradient we've been "chasing" thus far*)

And yet, despite this local minima shift, the algorithm still works?

#

Is this because the loss decreases between batches due to chasing the minimum? So the current minimum we're chasing isn't as important as just decreasing the cost?

#

I hope this makes sense

#

More simply, maybe - we're moreso trying to simply decrease the cost as efficiently as possible ,which happens to be in the direction of some local minima, rather than trying to actually reach a local minima

tidal bough Aug 5, 2023, 12:47 PM

#

slim bone You're kind of blowing me mind here, let me get this straight - The function cha...

Well, suppose we have very large batches - we split our million-sample dataset into only ten parts. It's pretty believable that in that case, a random tenth of the dataset will have roughly the same local minima as the whole dataset, and after averaging over the 10 batches it works out to pretty much the same as training on the whole dataset.

#

Another intuition pump is that if your learning rate is very small, then it should work no matter how small the batch is - because taking a tiny step down the gradient of the entire dataset is the same as taking a very tiny step down the gradient of each sample. (I think I can mathematically formalize this one if you want)

#

And it turns out that in between it still works - if you have a not-too-big learning rate and use not-too-small batches, the weights on average end up going in the whole dataset's direction.
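The averaging intuition above can be checked on a toy problem. This is a minimal sketch, assuming a 1-D linear model with mean squared error; the data, the batch size, and the helper name `mse_gradient` are made up for illustration. At fixed weights, the mean of the gradients over equal-sized batches equals the full-dataset gradient, while any single batch's gradient is a noisy version of it.

```python
import random


def mse_gradient(w, b, xs, ys):
    """Gradient of mean squared error for the model y_hat = w * x + b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(1000)]
ys = [3 * x + 0.5 + random.gauss(0, 0.1) for x in xs]

w, b = 0.0, 0.0  # weights are held fixed: only the gradients are compared
full = mse_gradient(w, b, xs, ys)

batch_size = 100
batch_grads = [
    mse_gradient(w, b, xs[i:i + batch_size], ys[i:i + batch_size])
    for i in range(0, len(xs), batch_size)
]
avg = (
    sum(g[0] for g in batch_grads) / len(batch_grads),
    sum(g[1] for g in batch_grads) / len(batch_grads),
)

print("full-batch gradient:    ", full)
print("mean of batch gradients:", avg)
print("one batch's gradient:   ", batch_grads[0])
```

In real SGD the weights change between batches, so the equality only holds approximately; the smaller the learning rate, the closer it gets.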

slim bone Aug 5, 2023, 12:54 PM

#

@tidal bough It’s not so much that batching is possible that confuses me - more so that the local minima shifts between each batch and that the algorithm still works as intended

#

Like, it’s the fact that the function changes at any point of time

#

Although, I think I kind of get it now

slim bone Aug 5, 2023, 12:55 PM

#

slim bone More simply, maybe - we're moreso trying to simply decrease the cost as efficien...

If this is correct, that is

tidal bough Aug 5, 2023, 12:59 PM

#

slim bone More simply, maybe - we're moreso trying to simply decrease the cost as efficien...

Sure, that's about right - we don't actually want to just get into some local minimum, because any notable NN has approximately infinity of them and most of them are pretty bad. We actually want to reach as deep a minimum as we can.

#

So just perfectly going along the gradient is actually a bad idea. And it turns out that introducing minibatching, and hence a random aspect to the walk, fixes that.

slim bone Aug 5, 2023, 1:06 PM

#

tidal bough So just perfectly going along the gradient is actually a bad idea. And it turns ...

I hope I'm explaining myself properly and not misinterpreting what you're saying (Thus far, everything you've said makes perfect sense)

Maybe I should explain the root of this question: I'm watching 3b1b's videos, and when they explain gradient descent, they explain it as "We're trying to reach a function's local minima"
More specifically, they use this graph throughout all of the videos

#

So I got the impression that the loss function has a single "form" if you will, and that the local minimas never move

#

So maybe if I state what I understand in a concise manner, you could just confirm:

The loss function changes between each and every step
Thus, the local minimas* move between each step
Despite this property, the algorithm still works
Not only does it work, it's occasionally better, and helps us break out of "bad" local minimas (Typically done with batching, which is what the right side of the image is trying to illustrate)

Are all of these correct?

tidal bough Aug 5, 2023, 1:15 PM

#

Yeah, this looks right to me. Showing a graph like that is mostly a lies-for-simplicity kind of thing - a realistic one would be where there's local minima everywhere, and some are deeper than others, and just going for the valley in which you start will be a bad solution.

slim bone Aug 5, 2023, 1:15 PM

#

Yeah, this is obviously just a function with two variables

tidal bough Aug 5, 2023, 1:19 PM

#

It might also be interesting for you to look up some modifications of gradient descent other than SGD, like gradient descent with momentum, but tbh I don't myself know much about how they work (basically, you can make your gradient descent intentionally overshoot the minima it goes for, which again helps with getting into a global optimum instead).
Maybe even more illuminating would be simulated annealing. It's a metaheuristic algorithm for multidimensional optimization (I don't think people use it in NNs, mostly just for normal problems) - you have an iterative optimizer with "temperature", and for zero temperature it's basically gradient descent, whereas for infinite temperature it's just a random walk. You start with a high temperature and gradually lower it to zero over the iterations. As a result, the optimizer ends up first wandering into a relatively large and deep basin, and then finding its local minimum, and that usually produces decent results.
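A rough sketch of the simulated annealing loop described above, in plain Python. The test function `bumpy`, the geometric cooling schedule, the step size, and the starting point are all assumptions chosen for illustration. Two caveats: the temperature here decays to a small positive value rather than exactly zero, and the low-temperature limit of this version is greedy local search (it never computes a gradient), which plays the role that gradient descent plays in the description.

```python
import math
import random


def bumpy(x):
    """A 1-D test function with several local minima (hypothetical example)."""
    return x * x + 10 * math.sin(3 * x)


def simulated_annealing(f, x0, steps=5000, t_start=10.0, t_end=1e-3,
                        step_size=0.5, seed=0):
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for i in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        candidate = x + rng.uniform(-step_size, step_size)
        fc = f(candidate)
        delta = fc - fx
        # Always accept improvements; accept a worse move with
        # probability exp(-delta / t), which shrinks as t falls.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, fx = candidate, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
    return best_x, best_fx


start = 4.0
best_x, best_fx = simulated_annealing(bumpy, start)
print(f"start:      f({start}) = {bumpy(start):.3f}")
print(f"best found: f({best_x:.3f}) = {best_fx:.3f}")
```

At high temperature almost every move is accepted, so the walk roams between basins; as the temperature falls, uphill moves become rare and the walk settles into whichever basin it is in.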

slim bone Aug 5, 2023, 1:24 PM

#

That's so interesting - indeed a lot of the things in ML feel so... What's the word, deterministic? As in -
"Why should I use X over Y"
"We just tested a bunch of models and we've reached the conclusion that X is typically better"
Which is a pretty unsatisfying answer, but at the same time kind of what you want to hear as a beginner instead of being overwhelmed with even more theory

Specifically, the second approach you mentioned sounds extremely random and doesn't sound like anything you can formally explain beyond "Yeah it just sounded like something that could work and it did"

tidal bough Aug 5, 2023, 1:29 PM

#

Sure, the reason it's called that is because it's loosely based on the theory of how metals anneal. Works for nature, apparently works for numerical optimization too 😛

#

I suspect that there are in fact more convergence guarantees for all of this than I'm implying, because I don't often read ML research papers, but not sure it's much more.

tidal bough Aug 5, 2023, 1:31 PM

#

tidal bough Another intuition pump is that if your learning rate is very small, then it shou...

In the meanwhile I mathematically formalized this note (for two batches, but it generalizes).

#

So minibatching (stochastic gradient descent) is provably the same as ordinary gradient descent for small enough learning rates.

#

(Note how this means that this is a case where lowering the learning rate might hurt your model, because the lower the learning rate, the more SGD acts like ordinary GD, which means going for the closest local minimum rather than jumping around - and for training NNs, that's generally a bad idea.)
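The small-learning-rate claim can also be checked numerically. This is a toy sketch, assuming two "batches" whose losses are the 1-D quadratics L1(w) = (w - a)^2 / 2 and L2(w) = (w - b)^2 / 2; the constants and function names are made up. It compares two consecutive minibatch steps with one full-batch step on the mean loss (using step size 2 * lr so both cover one pass over the data) and shows that the gap shrinks like lr squared.

```python
def grad1(w, a=1.0):
    """Gradient of the batch-1 loss L1(w) = (w - a)^2 / 2."""
    return w - a


def grad2(w, b=-3.0):
    """Gradient of the batch-2 loss L2(w) = (w - b)^2 / 2."""
    return w - b


def two_sgd_steps(w, lr):
    """One step on batch 1, then one step on batch 2."""
    w = w - lr * grad1(w)
    w = w - lr * grad2(w)
    return w


def one_full_step(w, lr):
    """One step on the mean loss (L1 + L2) / 2 with step size 2 * lr."""
    full_grad = (grad1(w) + grad2(w)) / 2
    return w - 2 * lr * full_grad


w0 = 5.0
gaps = {}
for lr in (0.1, 0.01, 0.001):
    gaps[lr] = abs(two_sgd_steps(w0, lr) - one_full_step(w0, lr))
    print(f"lr={lr:<6} gap={gaps[lr]:.2e}  gap/lr^2={gaps[lr] / lr ** 2:.3f}")
```

For these quadratics the gap works out to lr^2 * (w0 - a), so dividing the learning rate by 10 divides the disagreement by 100.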

slim bone Aug 5, 2023, 1:42 PM

#

Extremely interesting - I’ll read what you’ve formalized in a few minutes (irl shenanigans)

Thank you for your help and the curious insights!

tidal bough Aug 5, 2023, 1:52 PM

#

Here's the same thing but slightly rewritten (including made slightly more correct by noting the next term of the taylor series, etc) and using linalg notation rather than indices everywhere.
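The screenshots themselves are not part of this log, so the following is only a sketch of how such a two-batch argument usually goes, not tidal bough's exact write-up. The notation is an assumption: L_1 and L_2 are the losses on the two batches, L = (L_1 + L_2)/2 is the full loss, \eta is the learning rate, and H_2 is the Hessian of L_2.

```latex
\begin{align*}
w_1 &= w_0 - \eta \nabla L_1(w_0) \\
w_2 &= w_1 - \eta \nabla L_2(w_1) \\
\nabla L_2(w_1) &= \nabla L_2(w_0) - \eta H_2(w_0) \nabla L_1(w_0) + O(\eta^2) \\
w_2 &= w_0 - \eta \left[ \nabla L_1(w_0) + \nabla L_2(w_0) \right]
       + \eta^2 H_2(w_0) \nabla L_1(w_0) + O(\eta^3) \\
    &= w_0 - 2\eta \nabla L(w_0) + \eta^2 H_2(w_0) \nabla L_1(w_0) + O(\eta^3)
\end{align*}
```

The first-order term is exactly one full-batch step with step size 2\eta; the difference from ordinary gradient descent only appears at order \eta^2, which is the "next term of the Taylor series".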

lime karma Aug 5, 2023, 2:03 PM

#

First time ever hustling with sorta-data-science, and I just challenged myself to build a script to find dominant colors in each frame for a video

#

with AMD Ryzen 5 4600H with Radeon Graphics (12) @ 3.000GHz processing 59450 frames takes
```bash
name           id  tid              ttot      scnt
_MainThread    0   139651551671424  73.62470  14472
ManagerThread  1   139651316668096  7.709011  12115
Thread         2   139651325060800  4.361026  8368
```

small wedge Aug 5, 2023, 4:11 PM

#

tidal bough In the meanwhile I mathematically formalized this note (for two batches, but it ...

Are you making all of this notation that you're sending? pithink

tidal bough Aug 5, 2023, 4:22 PM

#

small wedge Are you making all of this notation that you're sending? <:pithink:6522475599092...

I don't think I do? What do you mean?

small wedge Aug 5, 2023, 4:43 PM

#

tidal bough I don't think I do? What do you mean?

like all these screenshots you're sending, are you making them with latex or is this from a source somewhere?

tidal bough Aug 5, 2023, 4:43 PM

#

small wedge like all these screenshots you're sending, are you making them with latex or is ...

I wrote that just now, yeah.

small wedge Aug 5, 2023, 4:43 PM

#

woah

#

damn I was gonna ask for the source lol

tidal bough Aug 5, 2023, 4:44 PM

#

i mean, I can post the latex 😛

small wedge Aug 5, 2023, 4:44 PM

#

nah I figured it was from a book or something, disregard me XD

lean breach Aug 5, 2023, 6:30 PM

#

help

https://stackoverflow.com/questions/76842678/how-to-have-my-retrievalqa-search-though-embedding-vectors-of-cleaned-text-but

Stack Overflow

How to have my RetrievalQA search though embedding vectors of clean...

I am writing a customQA chatbot using Langchain, Chroma, and the GPT-API. Below you will see the function I use for instantiating a peristed database for my vectors.
def create_db(pdf_file):

boreal gale Aug 5, 2023, 6:52 PM

#

sleek harbor so basically, if I understand correctly, that's simply setting the default value...

that's simply setting the default value for the second dropdown.
kind of, it's setting the default value of the dropdown when the dropdown's list of possible options changes.

I still don't really quite understand why it doesn't work without it..
it's about the order in which callbacks are invoked on initialisation. and as you rightly pointed out, dash does not make that connection between "value" and "options" for you.

fresh harbor Aug 5, 2023, 7:23 PM

#

Can model conversion to fp16 take a hit on accuracy? Does it have an impact on inference time?

agile cobalt Aug 5, 2023, 7:25 PM

#

typically you'd expect for it to decrease the accuracy while either keeping the inference time constant or lowering it, but reducing the model size significantly

#

usually you'll have to train a bit after converting to a different precision iirc
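To see where an accuracy hit can come from, here is a small standard-library sketch that rounds a few made-up "weights" to half and single precision and compares the rounding error and storage size. It only illustrates the number format; whether the loss of precision matters for a given model (and whether inference gets faster) depends on the model and the hardware.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


weights = [0.1, 1.0, 3.14159265, 1e-3, 1234.5678]  # hypothetical values
for w in weights:
    half, single = to_fp16(w), to_fp32(w)
    print(f"{w:>12.7f}  fp32 err={abs(single - w):.2e}  fp16 err={abs(half - w):.2e}")

print("bytes per value: fp32 =", struct.calcsize("<f"), " fp16 =", struct.calcsize("<e"))
```

Half precision keeps roughly three significant decimal digits and cannot represent magnitudes above 65504, so very large or very small activations are the usual trouble spots.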

fresh harbor Aug 5, 2023, 7:26 PM

#

Its a completely closed source model

agile cobalt Aug 5, 2023, 7:27 PM

#

and?

fresh harbor Aug 5, 2023, 7:27 PM

#

No instructions on how to train it

#

Just an onnx lying around

agile cobalt Aug 5, 2023, 7:28 PM

#

I'd recommend not trying to convert it yourself then, but rather asking whoever gave you the model

fresh harbor Aug 5, 2023, 7:28 PM

#

Alright

errant bison Aug 5, 2023, 11:48 PM

#

hii, i want to make an automatic licence plate detection, how can i do so and what tutorial should i follow?

lapis sequoia Aug 6, 2023, 1:20 AM

#

If I wanted to make an MLP in pytorch and then move all the weights to my own library for testing, is there any better move than making an MLP nn.Module and then manually parsing its .state_dict()?

twilit tundra Aug 6, 2023, 6:05 AM

#

errant bison hii, i want to make an automatic licence plate detection, how can i do so and wh...

There are 3 steps for this kind of task: detecting the license plate, outlining the characters in the license plate, and reading those characters. This is a basic tutorial and then you can improve each step: https://pyimagesearch.com/2020/09/21/opencv-automatic-license-number-plate-recognition-anpr-with-python/

PyImageSearch

Adrian Rosebrock

OpenCV: Automatic License/Number Plate Recognition (ANPR) with Pyth...

In this tutorial, you will build a basic Automatic License/Number Plate (ANPR) recognition system using OpenCV and Python.

twilit tundra Aug 6, 2023, 6:19 AM

#

lapis sequoia If I wanted to make a mlp in pytorch and then move all the weights to my own lib...

That would probably be the easiest if all you want are the parameters
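A minimal sketch of that approach, assuming a small `nn.Sequential` MLP (the layer sizes are arbitrary) and a target library that can ingest nested Python lists:

```python
import torch
from torch import nn

# Hypothetical MLP; the sizes are arbitrary.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 2),
)

# state_dict() maps parameter names to tensors; convert each tensor
# to nested lists so no PyTorch types leak into the other library.
exported = {
    name: tensor.detach().cpu().tolist()
    for name, tensor in model.state_dict().items()
}

for name, tensor in model.state_dict().items():
    print(name, tuple(tensor.shape))
```

One thing to watch when loading these elsewhere: `nn.Linear` stores its weight with shape (out_features, in_features) and computes `x @ W.T + b`, so a library that uses the opposite layout needs a transpose.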

sleek harbor Aug 6, 2023, 7:11 AM

#

When working with pandas and you have a categorical column, do you usually convert it from the default object type to categorical (which saves a bit of memory and, I assume, makes some operations faster), or is the overhead of the type casting/conversion (whatever it does under the hood) not worth it? What's the best practice here?

twilit tundra Aug 6, 2023, 9:53 AM

#

sleek harbor When working with pandas and you have a categorical column, do you usually conve...

I use it when the values repeat a lot and I have to do multiple transformations and/or the dataframe is large enough. In my experience casting to categorical is pretty cheap anyway.
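A quick way to check the memory side of this on your own data is to compare `memory_usage(deep=True)` before and after the cast. A sketch with a made-up column of three repeated labels:

```python
import pandas as pd

# Hypothetical column: three labels repeated many times.
colors = pd.Series(["red", "green", "blue"] * 100_000, dtype=object)
as_category = colors.astype("category")

object_bytes = colors.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

print(f"object:   {object_bytes:,} bytes")
print(f"category: {category_bytes:,} bytes")
print("categories:", list(as_category.cat.categories))
```

The saving comes from storing one small integer code per row plus a single copy of each distinct label, so it shrinks as the number of distinct values approaches the number of rows.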

jaunty lion Aug 6, 2023, 1:21 PM

#

Hey, I'm trying to create an RNN. I have multiple audio dataframes for each song; every dataframe corresponds to a chunk of the song. This means that songs with varying lengths have varying numbers of dataframes. From my very limited understanding of RNNs, it's beneficial to train in batches where the batch size matches the number of dataframes for a single element. My question is whether it is a valid approach to pad the number of dataframes with dataframes containing only -1, so it's consistent.
If something i said makes no sense or is stupid, feel free to point it out.

errant bison Aug 6, 2023, 3:53 PM

#

twilit tundra There are 3 steps for this kind of task: detecting the license plate, outlining ...

i tried this but the results werent accurate because of only using opencv, thats why i used neural nets, but still i am struggling with the results, thats why was looking for other tutorials. Can u share if there are any other with good accuracy as most of them i saw uses api

mild dirge Aug 6, 2023, 3:59 PM

#

errant bison i tried this but the results werent accurate because of only using opencv, thats...

Which part of it was not accurate?

#

Dividing it up into three parts is not a bad idea, so if you only need to replace one part that is more doable

harsh bane Aug 6, 2023, 4:44 PM

#

Hoi, before i ask, which channel is appropriate for help with stable diffusion dependencies and such? Specifically on amd

serene scaffold Aug 6, 2023, 4:51 PM

#

harsh bane Hoi, before i ask, which channel is appropriate for help with stable diffuson de...

are you trying to run it locally, or through an API?

harsh bane Aug 6, 2023, 4:53 PM

#

serene scaffold are you trying to run it locally, or through an API?

Locally.

#

Could be an incompatible something that it can't read from because it's too new possibly. Honestly don't know

serene scaffold Aug 6, 2023, 4:53 PM

#

harsh bane Locally.

I will not read any screenshots of text; please copy and paste it directly

harsh bane Aug 6, 2023, 4:54 PM

#

serene scaffold I will not read any screenshots of text; please copy and paste it directly

Gotchu. Just didn't know if there was a "don't do that, it's linespam" :P

#

(134)(deck@arch ComfyUI)$ python main.py --normalvram --disable-cuda-malloc --use-split-cross-attention
Total VRAM 4096 MB, total RAM 11795 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Custom GPU 0405 : native
Using split optimization for cross attention
python: /usr/src/debug/hip-runtime-amd/clr-rocm-5.6.0/hipamd/src/hip_code_object.cpp:754: hip::FatBinaryInfo** hip::StatCO::addFatBinary(const void*, bool): Assertion `err == hipSuccess' failed.
Aborted (core dumped)

#

As it's a steam deck, i'd toy around with the "--insertcommand" to see what runs the best, but can't get it to even launch :P Gotten automatic to run in the past, but can't seem to get it to work now, so trying comfyui lol

serene scaffold Aug 6, 2023, 4:58 PM

#

harsh bane _____ As it's a steam deck, i'd toy around with the "--insertcommand" to see wha...

and you replaced the steam deck's native linux flavor with arch?

harsh bane Aug 6, 2023, 4:59 PM

#

serene scaffold and you replaced the steam deck's native linux flavor with arch?

Oops, forgot to note that.

SteamOS, but distrobox with arch. Also got distrobox for ubuntu 20.04, but it didn't work either with automatic1111.

serene scaffold Aug 6, 2023, 5:00 PM

#

I'm not sure what to suggest, unfortunately

harsh bane Aug 6, 2023, 5:01 PM

#

No worries. I'll ask around/wait for someone who could possibly know :P

iron basalt Aug 6, 2023, 6:19 PM

#

harsh bane (134)(deck@arch ComfyUI)$ python main.py --normalvram --disable-cuda-malloc --us...

HIP probably does not support that GPU. These kinds of libraries tend to only support the most recent dedicated GPUs.

harsh bane Aug 6, 2023, 6:20 PM

#

Aye. But got it somewhat working now, with python rocm, now i'm debugging with comfy's creator as there's a conflict when i try to generate. Doesn't get past clip

novel python Aug 6, 2023, 10:27 PM

#

guys, I don't think this is worth a topic on help because it's not exactly python-related. But since you guys are used to using jupyter notebooks, I'd like to know something: if you use the vscode extension, does it stop the colors out of nowhere sometimes too? It's getting annoying for me over the last day. It keeps "crashing" the colors, autocomplete, etc. The notebook itself still works. I've already looked for conflicting extensions, but nothing I could find that helped.

serene scaffold Aug 6, 2023, 10:30 PM

#

novel python guys, I don't think this is worth a topic on help because it's not exactly pytho...

Try asking in #editors-ides instead

novel python Aug 6, 2023, 10:31 PM

#

serene scaffold Try asking in #editors-ides instead

tyty!

granite atlas Aug 6, 2023, 11:30 PM

#

I had a question about Neural Networks.

Are there any tutorials which teach how to make neural networks from scratch without using any library or frameworks?

#

I wanted to learn the basics in Julialang, so the language won't matter for the most part as long as it's a sane one.

serene scaffold Aug 6, 2023, 11:37 PM

#

granite atlas I had a question about Neural Networks. Are there any tutorials which teach how...

There are, but many will probably still use numpy.

granite atlas Aug 6, 2023, 11:38 PM

#

Numpy doesn't exist in other languages

#

lemon_sentimental

serene scaffold Aug 6, 2023, 11:39 PM

#

So? Any language can have constructs that don't exist in other languages

#

And you could have a language where something like numpy is part of the language

granite atlas Aug 6, 2023, 11:40 PM

#

@serene scaffold most of the numpy's functions are covered in base Julia interpreter. so what are the suggested tutorials that you have

#

I would still avoid any frameworks related directly to ai/nneu though

serene scaffold Aug 6, 2023, 11:41 PM

#

granite atlas @serene scaffold most of the numpy's functions are covered in base Julia in...

I don't have a specific one in mind, but you'll probably get better results for "neural network in Python with numpy"

Numpy doesn't have any constructs that are intended to make machine learning any easier, so none of the important parts would be abstracted away.
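To show that the important parts really are just arithmetic, here is a sketch of a single sigmoid neuron trained by gradient descent in plain Python, with no numpy at all, so it should translate line by line to Julia or any other language. The task (learning an AND gate), the learning rate, and the step count are arbitrary choices for illustration; a real network stacks many such units and backpropagates through the layers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Output of one sigmoid neuron with two inputs."""
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


def mean_loss(w, b, inputs, targets):
    """Mean binary cross-entropy; eps guards against log(0)."""
    eps = 1e-12
    total = 0.0
    for x, t in zip(inputs, targets):
        p = predict(w, b, x)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(inputs)


def train(inputs, targets, lr=1.0, steps=5000):
    w, b = [0.0, 0.0], 0.0
    n = len(inputs)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, t in zip(inputs, targets):
            err = predict(w, b, x) - t  # dLoss/dz for sigmoid + cross-entropy
            grad_w[0] += err * x[0]
            grad_w[1] += err * x[1]
            grad_b += err
        # Full-batch gradient descent update.
        w = [w[0] - lr * grad_w[0] / n, w[1] - lr * grad_w[1] / n]
        b = b - lr * grad_b / n
    return w, b


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]  # AND gate

initial_loss = mean_loss([0.0, 0.0], 0.0, inputs, targets)
w, b = train(inputs, targets)
final_loss = mean_loss(w, b, inputs, targets)

print("loss before:", round(initial_loss, 4), "after:", round(final_loss, 4))
print("predictions:", [round(predict(w, b, x)) for x in inputs])
```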

iron basalt Aug 6, 2023, 11:42 PM

#

How about https://iamtrask.github.io/2015/07/12/basic-python-network/

A Neural Network in 11 lines of Python (Part 1) - i am trask

A machine learning craftsmanship blog.

granite atlas Aug 6, 2023, 11:43 PM

#

I was doing a dude's nneu from scratch in python but he used his own library in middle so i felt betrayed

granite atlas Aug 6, 2023, 11:48 PM

#

iron basalt How about https://iamtrask.github.io/2015/07/12/basic-python-network/

OK, that is a very awesome tutorial

#

God bless you mate

ashen seal Aug 7, 2023, 3:23 AM

#

Looking for some guidance in creating an interactive html report similar looking to the image. I have some csv/excel data and want to create a nice dashboard looking report detailing data migration progress. The objective would be to output a single html encapsulating the data and interactive visualisations. Has anyone done something similar before? Would you be able to point me in the right direction?

worn stratus Aug 7, 2023, 5:47 AM

#

ashen seal Looking for some guidance in creating an interactive html report similar looking...

for something as complicated as that, with multiple pages or a complex layout? you have no choice but to go to JS

otherwise Quarto or possibly Plotly subplots + writing out HTML are the closest things I'm aware of in python

quartz wigeon Aug 7, 2023, 6:39 AM

#

Are there any good resources for learning reinforcement learning hands-on? I've tried a few university courses on youtube but all of them are highly theoretical and don't involve code.

agile cobalt Aug 7, 2023, 6:41 AM

#

well yeah, machine learning is 95% theory 5% code

quartz wigeon Aug 7, 2023, 7:05 AM

#

If so, as a complete beginner in ML, where should I start learning reinforcement learning? Is it ok to skip stuff like supervised and unsupervised learning and delve into reinforcement learning directly?

#

In short, what are the prerequisites for reinforcement learning?

sonic meteor Aug 7, 2023, 7:08 AM

#

Can anyone provide me a roadmap or maybe some resources to get started with neuroevolution (genetic algorithms and NEAT)
I am currently doing the Hugging Face Deep RL course (if that helps)
Please ping me when you reply

sonic meteor Aug 7, 2023, 7:10 AM

#

quartz wigeon If so, as a complete beginner in ML, where should I start learning reinforcement...

Hugging Face's Deep RL course is where i started my reinforcement learning journey, and no, it's not okay to skip those topics as they form the foundation of all topics in machine learning

quartz wigeon Aug 7, 2023, 7:12 AM

#

sonic meteor Hugging Face's Deep RL course is where i started my reinforcement learning journey,...

Ah I see... I do have some basic understanding on stuff such as cost functions, gradient descent, regression etc. Is that enough to start learning RL? How far should I delve into those topics before starting to learn RL?

#

Thanks for the suggestion though, I'll check it out

sonic meteor Aug 7, 2023, 7:43 AM

#

quartz wigeon Ah I see... I do have some basic understanding on stuff such as cost functions, ...

that should be enough, although the course i mentioned is a Deep RL course, so if you are considering to do it, then i would recommend familiarising yourself with neural networks as well

worn stratus Aug 7, 2023, 8:29 AM

#

ashen seal Looking for some guidance in creating an interactive html report similar looking...

I happened to look at Quarto more this morning. Seems super powerful, and should be able to do what you want

quaint loom Aug 7, 2023, 8:32 AM

#

I may have asked this question several times already, but how would you guys turn this into LaTeX?

merry ridge Aug 7, 2023, 9:12 AM

#

I normally would not make something like that in LaTeX and instead make it elsewhere and include it as a figure later. If you really want to make it in LaTeX I would use TikZ, but that's not a very pleasant task

ashen seal Aug 7, 2023, 9:21 AM

#

worn stratus I happened to look at Quarto more this morning. Seems super powerful, and should...

Ok great thank you, i will take a look.

mild dirge Aug 7, 2023, 9:34 AM

#

merry ridge I normally would not make something like that in LaTeX and instead make it elsew...

I agree with that last part haha

ashen axle Aug 7, 2023, 2:01 PM

#

Hi all, I'm looking for some advice about tensor libraries. I'm working on a chemometrics project working with spectrum-chromatograms, 2nd order tensors, and am looking for a python library that will enable me to apply custom preprocessing algorithms to the tensor prior to modelling. In the past i have achieved this by producing pandas dataframes of dataframes, but this is both cumbersome and frankly just feels dirty. I've given a cursory glance to several libraries such as Keras, but they don't seem to fit my needs, at least not superficially. Please help!

serene scaffold Aug 7, 2023, 2:04 PM

#

ashen axle Hi all, I'm looking for some advice about tensor libraries. I'm working on a che...

what preprocessing are you trying to do? one hot encoding?

ashen axle Aug 7, 2023, 2:05 PM

#

serene scaffold what preprocessing are you trying to do? one hot encoding?

no its all numerical signals with a single label, at least initially. Looking to normalize, apply a savitzky-golay filter, PLS baseline correction, resample, and then align the signals. not necessarily in that order.

serene scaffold Aug 7, 2023, 2:06 PM

#

ashen axle no its all numerical signals with a single label, at least initially. Looking to...

if you can use pandas to produce a dataframe that's structured like the tensors you need, that is fine.

ashen axle Aug 7, 2023, 2:08 PM

#

serene scaffold if you can use pandas to produce a dataframe that's structured like the tensors ...

the previous solution was a series of dataframes, I was wondering if there was a more elegant approach, as without for example creating my own dataframe class, a series of dataframes is difficult to observe / debug

serene scaffold Aug 7, 2023, 2:08 PM

#

ashen axle the previous solution was a series of dataframes, I was wondering if there was a...

a series of dataframes, as in a pandas Series?

#

!docs pandas.Series

arctic wedgeBOT Aug 7, 2023, 2:08 PM

#

pandas.Series


```
class pandas.Series(data=None, index=None, dtype=None, name=None, copy=None, fastpath=False)
```
One-dimensional ndarray with axis labels (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as NaN).

Operations between Series (+, -, /, \*, \*\*) align values based on their associated index values– they need not be the same length. The result index will be the sorted union of the two indexes.

serene scaffold Aug 7, 2023, 2:08 PM

#

this?

ashen axle Aug 7, 2023, 2:08 PM

#

serene scaffold a series of dataframe. and pandas Series?

correct

serene scaffold Aug 7, 2023, 2:09 PM

#

you should never have a Series of DataFrames. all the DataFrames that are in it should be concatenated into one (potentially with multiple levels of indexing)

ashen axle Aug 7, 2023, 2:12 PM

#

Yeah I know its unorthodox, hence my question. I was inspired to use a Series of Dataframes by Jodie Burchell in this podcast https://open.spotify.com/episode/6iN2nYZGBTdUAdpVnWvI5W

Spotify

Using NumPy and Linear Algebra for Faster Python Code

Listen to this episode from The Real Python Podcast on Spotify. Are you still using loops and lists to process your data in Python? Have you heard of a Python library with optimized data structures and built-in operations that can speed up your data science code? This week on the show, Jodie Burchell, developer advocate for data science at JetBr...

serene scaffold Aug 7, 2023, 2:13 PM

#

ashen axle Yeah I know its unorthodox, hence my question. I was inspired to use a Series of...

having nested pandas structures doesn't mean that you need to switch away from pandas. it just means that you're using pandas wrong.

#

and keras isn't an alternative to pandas

#

pandas and polars are for the same thing
pytorch and tensorflow (and therefore keras) are for the same thing

ashen axle Aug 7, 2023, 2:17 PM

#

serene scaffold having nested pandas structures doesn't mean that you need to switch away from p...

That's fair to say, and Id rather stay within the pandas ecosystem if i can. How would you suggest structuring my 3-dimensional numerical data?

serene scaffold Aug 7, 2023, 2:18 PM

#

ashen axle That's fair to say, and Id rather stay within the pandas ecosystem if i can. How...

with multiindexing
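A sketch of what the multiindexed layout could look like for this kind of data. Everything here is hypothetical: two small samples, rows as retention times, columns as wavelengths, and the level names `sample`, `time_min`, and `wavelength_nm` are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows are retention times, columns are wavelengths.
times = [0.0, 0.5, 1.0]
wavelengths = [254, 280]
samples = {
    "sample_a": pd.DataFrame(np.arange(6.0).reshape(3, 2),
                             index=times, columns=wavelengths),
    "sample_b": pd.DataFrame(np.arange(6.0).reshape(3, 2) + 10,
                             index=times, columns=wavelengths),
}

# One long frame with a (sample, time) MultiIndex instead of a Series of DataFrames.
combined = pd.concat(samples, names=["sample", "time_min"])
combined.columns.name = "wavelength_nm"

# Select one sample's 2-D chromatogram.
one = combined.loc["sample_a"]

# Apply a preprocessing step per sample, here scaling each
# wavelength trace so that its maximum within the sample is 1.
normalized = combined / combined.groupby(level="sample").transform("max")

# If every sample shares the same grid, the values reshape into a 3-D array.
cube = combined.to_numpy().reshape(len(samples), len(times), len(wavelengths))

print(combined)
print(cube.shape)
```

Steps that operate along one axis (filters, baseline correction) then fit the `groupby(level="sample")` pattern, and the whole structure prints and debugs as a single frame. If the samples do not share a time grid, the final reshape does not apply, but the long format still works.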

ashen axle Aug 7, 2023, 2:19 PM

#

serene scaffold with multiindexing

yeah right, I can see how that would work.

rose dagger Aug 7, 2023, 3:42 PM

#

What's a good modern reference for 3D image classification? Any SOTA models and standard data processing techniques?

broken gorge Aug 7, 2023, 4:08 PM

#

Hi,
I just got my master degree in experimental psychology from a really good college.

Right now i'm doing a gap year bc I want to properly learn how to code and ML.
I'm not sure yet if I want to do a Phd mixing experimental psychology and cognitive process modeling or become a data scientist.

I just started CS50P(ython) from Harvard and like it very much.
I plan to do the regular CS50 after and follow with CS50AI.

I'm also considering using Dataquest or Datacamp on the side to reinforce/train.
Are they worth it? Are they equally good?
(I read on reddit that datacamp is too easy and consists of filling in blanks, with no actual typing. I just started the free version of the python course and it seems that's not the case in the first course.
On the other hand Dataquest seems more challenging but is lacking variety of courses and is much more expensive)

Thanks for reading this long message 🙂

PS: my only real coding experience is some C in highschool and R during college for stats, but R is quite different from other languages from my understanding.

void veldt Aug 7, 2023, 4:13 PM

#

how does one transfer arguments from minimizer_kwargs in basinhopping to a custom function being used as a method?

molten hamlet Aug 7, 2023, 5:09 PM

#

I don't know how to copy pandas,

.copy method is not working, .copy(deep=True) also does not work.
"Not working" I mean that columns are not copies, so when I modify columns in one, the other dataframe is also modified 😐

#

segment_df.columns.to_numpy() has some trickery inside not doing copy...

tidal bough Aug 7, 2023, 5:14 PM

#

molten hamlet I don't know how to copy pandas, `.copy` method is not working, `.copy(deep=Tru...

!e Can't replicate this:

import pandas as pd
df = pd.DataFrame({'a': list('aaabb'), 'b': range(5)})
df2 = df.copy()
df2["b"] *= 2
print(df)

arctic wedgeBOT Aug 7, 2023, 5:14 PM

#

@tidal bough :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 |    a  b
002 | 0  a  0
003 | 1  a  1
004 | 2  a  2
005 | 3  b  3
006 | 4  b  4

tidal bough Aug 7, 2023, 5:14 PM

#

What's the .dtypes of your dataframe? Perhaps it's something really weird that needs deep copying?

molten hamlet Aug 7, 2023, 5:18 PM

#

I solved it; the problem was with this code.

"""list_ofdf is a list of dataframes split into smaller segments to separate scope
"""
for segi, segment_df in enumerate(list_ofdf):
    # print()
    segment_df = segment_df.copy(deep=True)
    print(f"Segment {segi:>3}: {segment_df.shape}. cols: ")
    # print(segment_df.columns)

    timestamp_ind = np.argwhere(segment_df.columns == "timestamp_ns").ravel()[0]
    segment_df.iloc[:, timestamp_ind] = segment_df.iloc[:, timestamp_ind] / 1e9
    # print(f"timestamp_ind: {timestamp_ind}")

    base_features = segment_df.shape[1]

    segm_columns = segment_df.columns.to_numpy() # THIS DOES NOT WORK
    #segm_columns = np.array(segment_df.columns) # THIS WORKS
    segm_columns[timestamp_ind] = "timestamp_s"

tidal bough Aug 7, 2023, 5:20 PM

#

to_numpy tries to make a view rather than a copy if possible, pass copy=True if that's undesirable.

molten hamlet Aug 7, 2023, 5:21 PM

#

tidal bough `to_numpy` tries to make a view rather than a copy if possible, pass `copy=True`...

but it should have been copied already, so it's a bit confusing

solar carbon Aug 7, 2023, 6:06 PM

#

hello guys, has anyone had this problem?

#

i installed tensorflow, tf-gpu and keras. I want to train a zoo model and i have problems setting up 😦

tidal bough Aug 7, 2023, 6:07 PM

#

you cut off almost all of the traceback, so hard to tell

solar carbon Aug 7, 2023, 6:09 PM

#

i have looked on the internet and some say that the numpy version is deprecated; i installed another version and it still didn't work

tropic prairie Aug 7, 2023, 8:28 PM

#

hi can someone help me with a project that I have

#

I have to make a NN model for regression, and the dataset just consists of x and y values so really simple.

young granite Aug 7, 2023, 8:29 PM

#

broken gorge Hi, I just got my master degree in experimental psychology from a really good co...

learn coding by doing some projects; a good first impression can be found in the resources

#

!resources

arctic wedgeBOT Aug 7, 2023, 8:29 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

tropic prairie Aug 7, 2023, 8:30 PM

#

how shd I get started?

young granite Aug 7, 2023, 8:30 PM

#

tropic prairie I have to make a NN model for regression, and the dataset just consists of x and...

own NN? (e.g from scratch)

tropic prairie Aug 7, 2023, 8:30 PM

#

yep ig

#

I have around 1 month

young granite Aug 7, 2023, 8:31 PM

#

so from scratch

tropic prairie Aug 7, 2023, 8:31 PM

#

yep just need to acheive a high accuracy rate

young granite Aug 7, 2023, 8:31 PM

#

this is a pretty amazing video
https://www.youtube.com/watch?v=w8yWXqWQYmU&ab_channel=SamsonZhang

YouTube

Samson Zhang

Building a neural network FROM SCRATCH (no Tensorflow/Pytorch, just...

Kaggle notebook with all the code: https://www.kaggle.com/wwsalmon/simple-mnist-nn-from-scratch-numpy-no-tf-keras

Blog article with more/clearer math explanation: https://www.samsonzhang.com/2020/11/24/understanding-the-math-behind-neural-networks-by-building-one-from-scratch-no-tf-keras-just-numpy.html

tropic prairie Aug 7, 2023, 8:32 PM

#

oh no I can use tensorflow and stuff

young granite Aug 7, 2023, 8:32 PM

#

🗿

tropic prairie Aug 7, 2023, 8:32 PM

#

sorry I was a bit unclear haha cuz I'm so confused rn

#

I just don't know how to get started

young granite Aug 7, 2023, 8:32 PM

#

so did u plotted ur data (x,y)?

tropic prairie Aug 7, 2023, 8:32 PM

#

yep really linear

#

already cleaned

young granite Aug 7, 2023, 8:33 PM

#

always good to do a exploratory data analysis

#

ok

#

are u a graduate? (which complexity u want)

tropic prairie Aug 7, 2023, 8:34 PM

#

yes

#

already hv experience but not project based, education system sucks lol

young granite Aug 7, 2023, 8:35 PM

#

do u need to present and explain why u chosen certain model?

tropic prairie Aug 7, 2023, 8:36 PM

#

not really, since I only have to use neural networks, so I don't really need to explore forests or kcluster, but I do need to present how I built my model and it's kinda important that I get a good accuracy rate (which shd not be too complicated since the data is just (x,y))

young granite Aug 7, 2023, 8:39 PM

#

so for simple regression go with a sequential model

tropic prairie Aug 7, 2023, 8:40 PM

#

yea that's what I thought thanks a lot!

#

do you know any code online that mimics a project aiming for a high accuracy rate

young granite Aug 7, 2023, 8:42 PM

#

tropic prairie do you know any code online that mimics a project aiming for a high accuracy rat...

u can write a "cross-validation" that tries different architectures, batch sizes, epochs etc.

#

im pretty sure u will find packages which do that for u
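
A rough sketch of what such a hand-written cross-validation loop could look like, in plain Python. The names `k_fold_indices`, `grid`, `evaluate`, and `cross_validate` are hypothetical and not taken from any package, and `evaluate` is only a placeholder: in a real project it would train the model on the training indices and return the validation loss.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) index lists for k roughly equal folds."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


def evaluate(config, train_idx, val_idx):
    """Placeholder score (lower is better); replace with real training + validation loss."""
    return abs(config["units"] - 16) + abs(config["batch_size"] - 32) / 32


# Hypothetical search space: every combination below is one candidate model setup.
grid = {"units": [8, 16, 32], "batch_size": [16, 32], "epochs": [50, 100]}
configs = [dict(zip(grid, values)) for values in product(*grid.values())]


def cross_validate(config, n_samples=100, k=5):
    """Average the score of one configuration over all k folds."""
    scores = [evaluate(config, tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


best = min(configs, key=cross_validate)
print(best)
```

The idea is the same as hyperparameter tuning: each configuration is scored on data it was not trained on, and the best average score wins.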

tropic prairie Aug 7, 2023, 8:43 PM

#

oh never looked into it, do you know what I should search up to get started? like any package names you know?

#

is this similar to hypertuning parameters?

young granite Aug 7, 2023, 8:44 PM

#

https://stackoverflow.com/questions/48085182/cross-validation-in-keras

Stack Overflow

Cross Validation in Keras

I'm implementing a Multilayer Perceptron in Keras and using scikit-learn to perform cross-validation. For this, I was inspired by the code found in the issue Cross Validation in Keras

from sklearn.

tropic prairie Aug 7, 2023, 8:45 PM

#

ok tysm!

gaunt elbow Aug 7, 2023, 10:21 PM

#

I'm working on an accounting dataset. There's a revenue amount, say 1000, and then some number n of lines that offset that 1000. I can group the accounting data into small chunks of usually fewer than 100 lines. The problem is that the offsetting amount can be 1 line or 10 lines, and it's mixed into those 100 mostly identical-looking lines. (There isn't any other good data to filter it down further.) My natural inclination is to iterate over every combination to see if any set of lines sums to 1000. Then I can show these as proposed matches and they can nail down which ones they want. Does anyone know of a fast-ish algorithm to do this? I'm going to run it on a GL dataset filtered down to the 25 million potential offsetting lines. Thankfully, once it's done we just need to do it go-forward, which should be easier.

stable mist Aug 8, 2023, 1:43 AM

#

can anyone teach me pytorch with flask

left tartan Aug 8, 2023, 2:27 AM

#

gaunt elbow I'm working on an accounting dataset. They have a revenue amount that's like 100...

I really can't understand what you're describing. Can you provide a minimal example?

left tartan Aug 8, 2023, 2:28 AM

#

gaunt elbow I'm working on an accounting dataset. They have a revenue amount that's like 100...

the short answer is: sure, probably, as long as you can provide a formula, it can be done!

gaunt elbow Aug 8, 2023, 2:38 AM

#

@left tartan sure!
Here are some headers and fake data. This is my revenue line:
Account | cost center | location| date | amount
ABC123 | 111 | CA | 6/30/2023 | 1000

Same headers, here is my offsetting cash.
ABC123 | 111 | CA | 6/30/2023 | -250
ABC123 | 111 | CA | 6/30/2023 | -333
ABC123 | 111 | CA | 6/30/2023 | -500
ABC123 | 111 | CA | 6/30/2023 | -250

Out of the above 4 lines, 3 of them will tie to my 1000 revenue. So (for now) I have a recursive function that sums every possible combo to see if it matches the 1000 revenue amount. So for these 4 lines it should create about 4! or 24 loops to try every combo until it finds that lines 1, 3, and 4 add together to offset the revenue. I have a description column to tell me which line is the revenue line, but the descriptions on the cash offsets are either blank or not helpful in finding the offsetting amount.

left tartan Aug 8, 2023, 2:40 AM

#

Sounds very reminiscent of two sums, but with N.

#

I guess you could employ a recursive approach or DP algorithm (ie: cache the intermediate calculations so you're not re-calculating) in the abstract.

#

This might be a better question in #algos-and-data-structs

#

definitely look at the 2sum algorithms, they can be generalized: https://leetcode.com/problems/two-sum/

gaunt elbow Aug 8, 2023, 2:49 AM

#

Okay yeah! Oh nice 👍 thank you! I'll give this a shot. It gives me better words to keep researching at least

merry ridge Aug 8, 2023, 2:51 AM

#

See: https://en.wikipedia.org/wiki/Subset_sum_problem
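
For the small groups described above, a dynamic-programming take on subset sum might look like the sketch below. The function name `find_offsetting_lines` is made up for illustration, and amounts are assumed to be integers (for example cents) so that equality checks are exact. It returns only one matching subset, not every possible match.

```python
def find_offsetting_lines(amounts, target):
    """Return indices of one subset of `amounts` that sums to `target`, or None."""
    # reachable maps each achievable sum to one list of line indices producing it
    reachable = {0: []}
    for i, amount in enumerate(amounts):
        # iterate over a snapshot so each line is used at most once per subset
        for total, used in list(reachable.items()):
            new_total = total + amount
            if new_total not in reachable:
                reachable[new_total] = used + [i]
        if target in reachable:
            return reachable[target]
    return None


# The four cash lines from the example above, offsetting a revenue of 1000.
offsets = [-250, -333, -500, -250]
print(find_offsetting_lines(offsets, -1000))  # [0, 2, 3], i.e. lines 1, 3 and 4
```

Because intermediate sums are cached, the work grows with the number of distinct reachable sums rather than with the number of combinations, which is what makes this more practical than trying every combo on groups of up to 100 lines.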

ashen seal Aug 8, 2023, 3:43 AM

#

worn stratus I happened to look at Quarto more this morning. Seems super powerful, and should...

It looks like a really messy way to generate a complex selfcontained website. I have done something similar in rmarkdown but I think probably best either to simplify the project or try find something else. The embedded html approach gets quite messy with complex designs 😦

charred light Aug 8, 2023, 5:07 AM

#

How do I hide **specific** cells from rendering when exporting to HTML in Jupyter Notebook (VSCode)? Similar to how you can do it in R studio notebooks. edit: I know about cmd line nbconvert --no-input

ashen axle Aug 8, 2023, 5:07 AM

#

Question - I have created a duckdb database with a 3d instrument signal table totalling 173 million rows by 9 columns. Trying to load this into memory results in the process being killed. Is a database the best solution for this type of data, or should I be looking at another format?

twilit tundra Aug 8, 2023, 5:09 AM

#

ashen axle Question - I have created a duckdb database with a 3d instrument signal table to...

Do you need 173Mx9 in memory?

ashen axle Aug 8, 2023, 5:09 AM

#

twilit tundra Do you need 173Mx9 in memory?

There will be no use case that requires all of the signal data at once but I am trying to establish test parameters

charred light Aug 8, 2023, 5:16 AM

#

ashen axle Question - I have created a duckdb database with a 3d instrument signal table to...

See if there's an option to chunk your data set. (i.e. read in one chunk of your data at a time)

#

I'm not familiar with DuckDB so I won't be able to give much advice other than that.
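
A minimal sketch of the chunking idea, using Python's built-in `sqlite3` as a stand-in so it runs anywhere. The `signal` table, its columns, and the helper `iter_chunks` are made up for illustration. DuckDB's Python client should allow a similar `fetchmany` loop, but check its docs rather than treating this as its API.

```python
import sqlite3


def iter_chunks(cursor, chunk_size):
    """Yield lists of rows, at most chunk_size rows at a time."""
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield rows


# Tiny in-memory table standing in for the large signal table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signal (individual INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO signal VALUES (?, ?)",
    [(i % 4, float(i)) for i in range(10)],
)

cursor = conn.execute("SELECT individual, value FROM signal ORDER BY value")
total = 0.0
n_chunks = 0
for chunk in iter_chunks(cursor, chunk_size=3):
    # only one chunk of rows is held in memory at any time
    n_chunks += 1
    total += sum(value for _, value in chunk)

print(n_chunks, total)
```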

ashen axle Aug 8, 2023, 5:19 AM

#

charred light See if there's an option to chunk your data set. (i.e. read in one chunk of your...

That might work! Side note, if I wasnt going to use a database, what would be your choice of data format, considering my database table is actually 200 individuals in long form, approximately 4500 rows per individual once pivoted

charred light Aug 8, 2023, 5:23 AM

#

ashen axle That might work! Side note, if I wasnt going to use a database, what would be yo...

"Depends on your data" would be the generic answer. Probably be a better question for #databases
I only really use CSV myself. In terms of 'data format', it could be a csv, parquet, sparse matrix, or something else, but each one depends on what your data looks like. (E.g. a sparse matrix is good for datasets with a lot of 0s)

#

It's also good to point out that, at a small scale, it doesn't matter too much. Optimization only really matters when you reach like hundreds of millions of rows +

ashen axle Aug 8, 2023, 5:29 AM

#

charred light "Depends on your data" would be the generic answer. Probably be a better questio...

thanks, ill give it a go at #databases as well.

quaint loom Aug 8, 2023, 6:10 AM

#

merry ridge I normally would not make something like that in LaTeX and instead make it elsew...

Would you elaborate a little more? Make it elsewhere what? : )

hollow magnet Aug 8, 2023, 7:53 AM

#

A little off topic, is VSC ok for the use of SQL ? Or it's better to use My SQL or something else ?

raw zenith Aug 8, 2023, 9:47 AM

#

Guys, just a quick help, lets say i have size of data (1214 rows , 93 columns), if i want to remove rows based on columns ranging from 44:88 for example using pandas. I am having difficulty achieving this cuz all i can do is remove rows based on columns values, i want to remove just based on columns

#

I tried something like this df.drop(df.columns[44:89], axis=0, inplace=True)

#

but does not work as it drop columns but not rows associated with it

twilit tundra Aug 8, 2023, 9:53 AM

#

df.drop(list(range(44,89)),axis=0) should work

merry ridge Aug 8, 2023, 10:21 AM

#

quaint loom Would you elaborate a little more? Make it elsewhere what? : )

When I am making publication quality figures, if I am not using TikZ for whatever technical reason, I’ll make them by hand in Adobe Photoshop or Illustrator or pay someone to do it if it’s beyond my ability.

split drift Aug 8, 2023, 10:55 AM

#

Hey,
I'm looking for data documentation tool / package.
I want it to document the input data and the output data.

Currently each stage in the pipeline load the data from a given path.
Compute some features, and save to another given path.

Thanks

left tartan Aug 8, 2023, 11:55 AM

#

ashen axle Question - I have created a duckdb database with a 3d instrument signal table to...

I’m usually the duckdb guy around here, but I’d suggest going over to the duckdb discord and asking there, and sharing the query. The reason is: there are strategies for dealing with larger-than-memory datasets and queries, whether by chunking or writing queries that operate on subsets of data. Another strategy is to partition the source data… I use parquet a lot for this, and keep large data external.

quaint loom Aug 8, 2023, 1:07 PM

#

merry ridge When I am making publication quality figures, if I am not using TikZ for whateve...

I just want to make it into text form and thought latex would be the simplest, knowing what was connected to what

young granite Aug 8, 2023, 1:22 PM

#

split drift Hey, I'm looking for data documentation tool / package. I want it to document th...

so u generate json files ? what is ur goal

young granite Aug 8, 2023, 1:22 PM

#

twilit tundra df.drop(list(range(44,89)),axis=0) should work

we dont .drop we use .loc 🗿

upper drift Aug 8, 2023, 2:10 PM

#

Hi! I'm dealing with a slimy landlord unfortunately and may have to go to a tribunal hearing. I want to be well prepared and had the idea that I could scrape their publicly available decision cases, and then train a LLM using that. For scraping I think I could use BeautifulSoup, but does anyone have suggestions for the LLM part?

left tartan Aug 8, 2023, 2:13 PM

#

upper drift Hi! I'm dealing with a slimy landlord unfortunately and may have to go to a trib...

Having worked with many lawyers, and although IANAL, I can safely say: that sounds like a colossal waste of time to try to do (to try to build a meaningful model from a handful of cases). How many cases are you talking? A dozen? You might as well read each of them, take notes, and summarize.

upper drift Aug 8, 2023, 2:13 PM

#

They have about 12-13 years of stuff, and it looks about 25-40 cases each year

#

If I could learn some neat stuff while getting some benefit from it, I'd be happy

left tartan Aug 8, 2023, 2:17 PM

#

And, i'd be remiss if I didn't say: Get a lawyer :)... anyway, If I were starting with something like this, I'd probably look at classifying them. Check this out Comparative Study of Classifying Legal Documents with Neural Networks

upper drift Aug 8, 2023, 2:20 PM

#

Awesome, thanks! Also, yea, I have some legal aid, this is just supplementary / hobby project 🤓

zealous hollow Aug 8, 2023, 2:28 PM

#

if anyone has experience with time series models, especially ARIMA, can you kindly help me with my project? ADF and pacf, acf plots are done. just need help with p,d,q values

left tartan Aug 8, 2023, 2:36 PM

#

zealous hollow if anyone has exprience with time series models especially ARIMA, can you kindly...

I'm no expert, but have you looked at auto-arima? I've used it mostly when trying to optimize the parameters https://alkaline-ml.com/pmdarima/modules/generated/pmdarima.arima.auto_arima.html

zealous hollow Aug 8, 2023, 2:36 PM

#

no work it just gave a straight line as output

left tartan Aug 8, 2023, 2:37 PM

#

That would seem like some sort of data error, i'd guess. Can you share a minimal reproduction?

zealous hollow Aug 8, 2023, 2:37 PM

#

btw letme show you the acf and pacf plots

#

#

#

the number of lags selected is 365
bcz the data is yearly and a value does depend on its last year value

#

#

i am using gradient boosting models with input data like this

#

using temp did give it a little boost

#

it's only giving me 0.53 r2score

#

i need it to be atleast 0.75+

#

so trying arima model

#

i am open to other model suggestions as well

left tartan Aug 8, 2023, 2:47 PM

#

And what do you get from Arima?

#

Here's an example of a simple arima model against a sin+noise signal:

```py
# ARIMA Example w sin + noise (updated with correct m=20)

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

n_points = 100
x = np.linspace(0, 20 * np.pi, n_points)
noise = np.random.normal(0, 0.5, n_points)
y = 5 * np.sin(x / 2) + noise

from pmdarima import auto_arima

model = auto_arima(y, seasonal=True, m=20, trace=True, error_action='ignore', suppress_warnings=True)
forecast, conf_int = model.predict(n_periods=20, return_conf_int=True)

plt.figure(figsize=(12, 6))
plt.plot(y, label="Data")
plt.plot(np.arange(n_points, n_points + 20), forecast, color="red", label="Forecast")
plt.fill_between(np.arange(n_points, n_points + 20), conf_int[:, 0], conf_int[:, 1], color="pink", alpha=0.3)
plt.legend()
plt.show()

print(model.summary())
```

#

zealous hollow Aug 8, 2023, 3:18 PM

#

what is m?

left tartan Aug 8, 2023, 3:29 PM

#

the size of the season

#

Like, if you have quarterly data, m=4

tidal bough Aug 8, 2023, 3:31 PM

#

but... the period of the input data is 20 points, not 25. or to be precise, 19.8, I think.
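
The 19.8 figure can be checked with a few lines of arithmetic. This is just a worked calculation for the sine example above, with variable names chosen for illustration.

```python
import math

n_points = 100
x_start, x_stop = 0.0, 20 * math.pi

# np.linspace(start, stop, n) places n points with n - 1 equal gaps between them
step = (x_stop - x_start) / (n_points - 1)

# sin(x / 2) repeats every 2*pi / 0.5 = 4*pi in x
period_in_x = 2 * math.pi / 0.5

samples_per_period = period_in_x / step
print(round(samples_per_period, 2))  # 19.8
```

Since `m` has to be an integer number of observations per season, 20 is the nearest usable value here.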

zealous hollow Aug 8, 2023, 3:32 PM

#

and if i have yearly ? 🌝

left tartan Aug 8, 2023, 3:33 PM

#

tidal bough but... the period of the input data is 20 points, not 25. or to be precise, 19.8...

good point. Trig was never my strong suit 🙂 fixed

vestal widget Aug 8, 2023, 3:33 PM

#

I want to finetune a language model with my custom data, does the data have like a define format or does the format depends on the model?

left tartan Aug 8, 2023, 3:49 PM

#

zealous hollow and if i have yearly ? 🌝

The question really is what's your "season". Is there an inherent cycle to the data? If you only have annual points, you may not be looking at an arima model, unless there's some underlying cycle to the data

zealous hollow Aug 8, 2023, 3:50 PM

#

zealous hollow Aug 8, 2023, 3:51 PM

#

left tartan The question really is what's your "season". Is there an inherent cycle to the d...

nah i have daily readings

left tartan Aug 8, 2023, 3:51 PM

#

zealous hollow and if i have yearly ? 🌝

Yah, so m=365, probably

zealous hollow Aug 8, 2023, 3:52 PM

#

should i use datetime object as input
or
this type of inputs

#

left tartan Aug 8, 2023, 3:54 PM

#

iirc, pmdarima doesn't use an x axis... it expects the data to be sequential (chronological) and the observations to be uniformly spaced

zealous hollow Aug 8, 2023, 3:55 PM

#

hmm so datetime objects as index and simply 'temp' values to it should work rigt

left tartan Aug 8, 2023, 3:55 PM

#

it doesn't matter what you do for the datetimes

#

Just make sure you don't have gaps.

zealous hollow Aug 8, 2023, 3:59 PM

#

nah data is proper and completee

zealous hollow Aug 8, 2023, 4:44 PM

#

left tartan Just make sure you don't have gaps.

🌚

#

🌝

jolly ginkgo Aug 8, 2023, 5:11 PM

#

Hello guys, I made a library which generates random data matrices, would anyone like to try it?

zealous hollow Aug 8, 2023, 5:28 PM

#

jolly ginkgo Hello guys, I made a library which will genrate random data matrix, would anyone...

sure

jolly ginkgo Aug 8, 2023, 5:29 PM

#

Here
pip install rand-omata

left tartan Aug 8, 2023, 5:33 PM

#

zealous hollow 🌚

Auto arima is slow because it’s trying lots of models out

zealous hollow Aug 8, 2023, 5:37 PM

#

jolly ginkgo Here ``` pip install rand-omata```

i amma need it very soon so as soon as i use it will give you my feedback

jolly ginkgo Aug 8, 2023, 5:38 PM

#

zealous hollow i amma need it very soon so as soon as i use it will give you my feedback

Okk

zealous hollow Aug 8, 2023, 5:40 PM

#

left tartan Auto arima is slow because it’s trying lots of models out

1.5 h 😭

zealous hollow Aug 8, 2023, 6:08 PM

#

2h

zealous hollow Aug 8, 2023, 6:42 PM

#

2.5 h

small wedge Aug 8, 2023, 6:43 PM

#

lmao

left tartan Aug 8, 2023, 7:08 PM

#

zealous hollow 2.5 h

Maybe share the code you’re running?

#

And, you don’t have to use auto arima, you could try tuning the parameters yourself and measuring AIC

zealous hollow Aug 8, 2023, 7:08 PM

#

import pandas as pd

# %%
df=pd.read_csv('data.csv')

# %%
df=df[['date','Temperature']]

# %%
df.columns=['date','temp']

# %%
df['date']=pd.to_datetime(df['date'],format='%d/%m/%Y')

# %%
df = df.set_index('date')  # set_index returns a new frame, so assign it back

# %%
y_train=df['temp'].iloc[:5843]
y_test=df['temp'].iloc[5843:]

# %%
from pmdarima import auto_arima
model = auto_arima(y_train, seasonal=True, m=365, trace=True, error_action='ignore', suppress_warnings=True)
forecast, conf_int = model.predict(n_periods=len(y_test), return_conf_int=True)

#

i think it's bcz the data has nearly 7300 observations

left tartan Aug 8, 2023, 7:10 PM

#

Maybe autoarima on a subset to get the parameters, then train?

zealous hollow Aug 8, 2023, 7:10 PM

#

btw still running 💀

#

only a year?

#

btw training data has values from 2003 - 2018

#

while testing 2018-22

#

and pandemic really affected the values i am working with

#

eto so for research standards
is it plausible to explain the drop in accuracy with the pandemic as an anomaly?

zealous hollow Aug 8, 2023, 7:13 PM

#

left tartan Maybe autoarima on a subset to get the parameters, then train?

how small of a data set you suggest i should use?

left tartan Aug 8, 2023, 7:14 PM

#

No idea, seems odd it’s so slow but I don’t work with daily data much and only use arima occasionally

zealous hollow Aug 8, 2023, 7:15 PM

#

well i amma try running 2 instances
of one year data of 2 diffirent years

#

if the parameters remain same

#

it should be okay to go with

#

right?

left tartan Aug 8, 2023, 7:17 PM

#

Yah, yah, the arima parameters aren’t extremely complicated

zealous hollow Aug 8, 2023, 7:20 PM

#

btw are there any services like google colab?

#

free

mild dirge Aug 8, 2023, 7:20 PM

#

free computing power?

#

Why not collab?

zealous hollow Aug 8, 2023, 7:21 PM

#

ye

#

isnt collab like very slow?

mild dirge Aug 8, 2023, 7:21 PM

#

Right, because it's free 😛

#

There aren't that many companies happy with throwing money away for the greater good

burnt saffron Aug 8, 2023, 7:21 PM

#

Hello

mild dirge Aug 8, 2023, 7:21 PM

#

And it's not that slow I don't think

burnt saffron Aug 8, 2023, 7:21 PM

#

How are you

dire iron Aug 8, 2023, 7:21 PM

#

zealous hollow btw are there any serveices like google collab?

jupyter notebook

zealous hollow Aug 8, 2023, 7:22 PM

#

...

zealous hollow Aug 8, 2023, 7:23 PM

#

mild dirge And it's not that slow I don't think

ehh works i can use it to play games while it's being done on the cloud 🤣

dire iron Aug 8, 2023, 7:24 PM

#

zealous hollow how small of a data set you suggest i should use?

how big is your dataset

zealous hollow Aug 8, 2023, 7:24 PM

#

it's not that big, just 7304 observations

#

should take 2 years right?

dire iron Aug 8, 2023, 7:26 PM

#

how are you defining seasonal in your dataset?

zealous hollow Aug 8, 2023, 7:27 PM

#

?

#

didnt get your question>

#

how my data is seasonal?

#

well it's temperature values

#

💀 first and foremost and it's visible from the plots as well

gaunt elbow Aug 8, 2023, 7:29 PM

#

@zealous hollow I think parsimony is generally appreciated in ARIMA modeling. Tomorrow's temp is probably somewhat related to last year's temp, but more closely related to today's temp. Since there's a moving avg component, using 365 is just going to move your model toward the yearly average, which in summer or winter isn't representative at the extremes. An AR(365) is saying that every single day of the past year impacts the temperature tomorrow. I don't find either of those convincing (personally, IMHO)

zealous hollow Aug 8, 2023, 7:29 PM

#

yeah 💀

dire iron Aug 8, 2023, 7:29 PM

#

that may fix your issue too

zealous hollow Aug 8, 2023, 7:29 PM

#

so 1,2? i do have the acf and pacf plots

#

#

lag no is 76

#

i have tested arima with
0-6 for each p,d,q but the only time it even showed something other than a straight line was with 4,0,3

#

but that quickly went to straight line as well only after a few cycles

gaunt elbow Aug 8, 2023, 7:33 PM

#

Where are your confidence intervals in those graphs? To me that looks like an AR(2). You would need to run the graphs again on your residuals after fitting the model to look for remaining significance

#

How far into the future are you forecasting?

zealous hollow Aug 8, 2023, 7:34 PM

#

my data set has temps from 2003-2022 so i am atleast expecting it to go to 2030

left tartan Aug 8, 2023, 7:35 PM

#

gaunt elbow @zealous hollow I think parsimonious is generally appreciated in ARIMA mod...

M=365 is just defining the seasonality of the data, that’s the correct use of the m parameter for daily data… that’s separate from the parameters to the arima model

zealous hollow Aug 8, 2023, 7:35 PM

#

ye i amma take billy's side on this one 🌝

left tartan Aug 8, 2023, 7:36 PM

#

Second, op is running pmd autoarima to find optimal parameters. That’s the slow part for op

gaunt elbow Aug 8, 2023, 7:36 PM

#

Oh I must have misread, I thought they were using a model like (365,0,365) or something crazy.

left tartan Aug 8, 2023, 7:36 PM

#

Oh no, autoarima searches through the parameter space. Ghosty: can you paste the autoarima output?

zealous hollow Aug 8, 2023, 7:37 PM

#

still running 🤡

#

#

made these adjustments: training data cut to only 2 years

left tartan Aug 8, 2023, 7:37 PM

#

Oh, you’re not printing the output incrementally?

zealous hollow Aug 8, 2023, 7:38 PM

#

i am a total noob 🤡
through and through

#

so let me quickly look it up and do it

left tartan Aug 8, 2023, 7:38 PM

#

I dunno, my example above would print the models as they are tested

zealous hollow Aug 8, 2023, 7:39 PM

#

from pmdarima import auto_arima
model = auto_arima(y_train, seasonal=True, m=365, trace=True, error_action='ignore', suppress_warnings=True)
forecast, conf_int = model.predict(n_periods=len(y_test), return_conf_int=True)

#

it's exactly your code for me it's not printing out anything

dire iron Aug 8, 2023, 7:42 PM

#

set seasonal to false

zealous hollow Aug 8, 2023, 7:42 PM

#

zealous hollow Aug 8, 2023, 7:42 PM

#

dire iron set seasonal to false

data is seasonal though

zealous hollow Aug 8, 2023, 7:43 PM

#

zealous hollow

should i use m=7? @left tartan ||sorry for pinging||

dense crane Aug 8, 2023, 7:44 PM

#

i want to change the shape of a data frame from (400 x 1300) to (400 x 1200) with something similar to PCA, but i cannot use PCA since n_components has to be smaller than 400, any ideas?

left tartan Aug 8, 2023, 7:44 PM

#

zealous hollow should i use m=7? @left tartan ||sorry for pinging||

7 is for weekly periodicity. Like sun-sat

zealous hollow Aug 8, 2023, 7:45 PM

#

oh well i amma wait till it finishes and see the output

dire iron Aug 8, 2023, 7:45 PM

#

dense crane i want change the shape of data frame which is (400 x 1300) reduced to (400 x 12...

what type of data?

dense crane Aug 8, 2023, 7:45 PM

#

dire iron what type of data?

real numbers

zealous hollow Aug 8, 2023, 7:46 PM

#

dense crane real numbers

how about LDA

dense crane Aug 8, 2023, 7:47 PM

#

zealous hollow how about LDA

the same story

zealous hollow Aug 8, 2023, 7:47 PM

#

??

dense crane Aug 8, 2023, 7:47 PM

#

zealous hollow Aug 8, 2023, 7:48 PM

#

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Assuming X is your data and y are the corresponding class labels
lda = LinearDiscriminantAnalysis(n_components=1200)
X_lda = lda.fit_transform(X, y)

dense crane Aug 8, 2023, 7:49 PM

#

zealous hollow ``` from sklearn.discriminant_analysis import LinearDiscriminantAnalysis # Assu...

yeah but it also has to be lower than 400
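
A small sketch of where those limits come from, assuming the usual scikit-learn rules as I understand them: PCA accepts at most min(n_samples, n_features) components, and LDA at most min(n_classes - 1, n_features). The helper names and the class count of 10 are hypothetical, for illustration only.

```python
def max_pca_components(n_samples, n_features):
    """Upper bound on PCA components: the data cannot span more directions than this."""
    return min(n_samples, n_features)


def max_lda_components(n_classes, n_features):
    """Upper bound on LDA components: limited by the number of classes minus one."""
    return min(n_classes - 1, n_features)


# A (400 x 1300) frame: 400 samples, 1300 features.
print(max_pca_components(400, 1300))   # 400
# With, say, 10 class labels the LDA limit is far lower than 400.
print(max_lda_components(10, 1300))    # 9
```

So for a (400 x 1300) frame neither method can produce 1200 components; keeping 1200 of the original columns is a feature selection problem rather than a projection.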

zealous hollow Aug 8, 2023, 7:49 PM

#

and you dont want it to stay 400

#

?right

dense crane Aug 8, 2023, 7:49 PM

#

like i dont have (1300 x 400) but (400 x 1300)

#

yeah i want 1200

#

i want (400 x 1200)

zealous hollow Aug 8, 2023, 7:50 PM

#

how about rfe?

dense crane Aug 8, 2023, 7:51 PM

#

zealous hollow how about rfe?

ok let me do some research about that

gaunt elbow Aug 8, 2023, 7:51 PM

#

@zealous hollow what is the problem you're trying to solve exactly? The forecast being flat?

zealous hollow Aug 8, 2023, 7:51 PM

#

this is my data
i want to predict values of each till 2030(very least)

dense crane Aug 8, 2023, 7:52 PM

#

zealous hollow how about rfe?

seems working

#

thanks for that!

zealous hollow Aug 8, 2023, 7:53 PM

#

i already achieved 93% accuracy with gradient boosting for temp
but for the rest it's not getting above 60%

dense crane Aug 8, 2023, 7:53 PM

#

@zealous hollow but can you describe in a few words, more or less, what this does (rfe)?

zealous hollow Aug 8, 2023, 7:57 PM

#

dense crane @zealous hollow but can you in few wordk describes more or less what this ...

https://paste.pythondiscord.com/RX3A

fresh harbor Aug 8, 2023, 7:58 PM

#

why are many pytorch models not exported to onnx?

zealous hollow Aug 8, 2023, 7:58 PM

#

dire iron what type of data?

btw can you resend that research paper

fresh harbor Aug 8, 2023, 7:59 PM

#

shouldn't this help in getting rid of the torch dependency?

dire iron Aug 8, 2023, 8:02 PM

#

zealous hollow btw can you resend that research paper

hm?

zealous hollow Aug 8, 2023, 8:02 PM

#

someone sent a research paper link here just now related to my work and it got deleted by bot

serene scaffold Aug 8, 2023, 8:09 PM

#

zealous hollow someone sent a research paper link here just now related to my work and it got d...

who is "someone"? I can check why the message was removed, but I have to know the user ID for the author.

zealous hollow Aug 8, 2023, 8:10 PM

#

i don't know myself 💀

left tartan Aug 8, 2023, 8:17 PM

#

zealous hollow i dnot now myseld 💀

Ghosty, is that data set public? Was gonna run arima on it on my end. Always fun to try new data.

zealous hollow Aug 8, 2023, 8:18 PM

#

nah

#

but you can use it

#

||bro please do, it will save me a lot of time||

#

🤣

#

make sure to save your results of testing

left tartan Aug 8, 2023, 8:18 PM

#

feel free to throw it in a gist or whatever, and dm

#

got it, thx

zealous hollow Aug 8, 2023, 8:21 PM

#

np

zealous hollow Aug 8, 2023, 8:46 PM

#

left tartan Aug 8, 2023, 9:02 PM

#

wow, that's one heckuva first search

verbal venture Aug 8, 2023, 10:57 PM

#

The first layer of my fully connected layer is 2x the output of my final model's layer. What am I doing wrong here?

```py
# if accuracy is not higher, and not changing epochs or batch size
# more CNN Layers, more nodes in layers
# 'Conv2d, BatchNorm2d, and ReLU.'

import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64))

        self.classifier = nn.Sequential(
            nn.Linear(64 * 128 * 128, out_features=256),
            nn.ReLU(),
            nn.Linear(in_features=256, out_features=10)
        )

    def forward(self, x):
        x = self.model(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)

        return x
```

#

basically my first linear layer needs to be divided by 2, but I shouldn't hard code that. What am I getting wrong about the input parameters?

maiden wadi Aug 8, 2023, 11:08 PM

#

verbal venture The first layer of my fully connected layer is 2x the output of my final model's...

can you paste the error?

verbal venture Aug 8, 2023, 11:09 PM

#

mat1 and mat2 shapes cannot be multiplied (32x65536 and 1048576x256)

maiden wadi Aug 8, 2023, 11:09 PM

#

okay

#

the shapes are wrong

#

self.classifier = nn.Sequential(
        nn.Linear(65536, out_features=256),
        nn.ReLU(),
        nn.Linear(in_features=256, out_features=10)
        )

#

this should fix

verbal venture Aug 8, 2023, 11:12 PM

#

right. but I shouldn't hardcode 65536 - I should reuse the height * width of the previous layer's output. I'm wondering what the numbers are, as I thought the output of the previous layer was 64, 128, 128 (which is incorrect)

maiden wadi Aug 8, 2023, 11:13 PM

#

yep, it's more readable using the formula

#

in this case it is 64 * 32 * 32
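
One way to avoid hardcoding 65536 is to push a dummy tensor through the convolutional part once and read off the flattened size. This is only a minimal sketch, assuming a 3 x 32 x 32 input (the first Conv2d in the question expects 3 channels); `features` and `flattened_size` are names made up for the illustration.

```python
import torch
import torch.nn as nn

# Same convolutional stack as in the question above
features = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(16),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(32),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(64),
)


def flattened_size(module, input_shape):
    """Return the number of features per sample after flattening (hypothetical helper)."""
    was_training = module.training
    module.eval()  # keep BatchNorm running statistics untouched
    with torch.no_grad():
        dummy = torch.zeros(1, *input_shape)  # batch of one fake image
        size = module(dummy).flatten(1).size(1)
    module.train(was_training)  # restore the previous mode
    return size


in_features = flattened_size(features, (3, 32, 32))  # assumed input shape
classifier = nn.Sequential(
    nn.Linear(in_features, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
print(in_features)  # 65536, i.e. 64 * 32 * 32
```

PyTorch also ships `nn.LazyLinear(out_features)`, which infers `in_features` on the first forward pass, so that may be an alternative if an explicit number is not needed.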

verbal venture Aug 8, 2023, 11:15 PM

#

yeah, why is it 32 * 32?

maiden wadi Aug 8, 2023, 11:20 PM

#

verbal venture Aug 8, 2023, 11:27 PM

#

But applied to this: nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1),
nn.ReLU(),
nn.BatchNorm2d(64). Wouldn't the output be 64*64?

maiden wadi Aug 8, 2023, 11:30 PM

#

verbal venture But applied to this: nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, p...

the image size is 1x32x32?

verbal venture Aug 8, 2023, 11:30 PM

#

yes

maiden wadi Aug 8, 2023, 11:31 PM

#

okay in the formula

#

W = 32 (Image size)
F = 3 ( Kernel)
P = 1 (Padding)
S = 1 (Strides)

#

so

verbal venture Aug 8, 2023, 11:32 PM

#

so the width and height of the image do not change throughout the network?

#

just the feature maps at each convolution?

maiden wadi Aug 8, 2023, 11:34 PM

#

as soon as you have the 64

#

referring to out channels

#

64 * 32 * 32

#

of the previous layer

#

the calc is the same

verbal venture Aug 8, 2023, 11:48 PM

#

@maiden wadi so are feature maps kind of arbitrarily chosen? I was getting confused because I was using that as W to compute the feature maps at the next stage (which somehow worked)

maiden wadi Aug 8, 2023, 11:50 PM

#

verbal venture @maiden wadi so are feature maps kind of arbitrarily chosen? I was gett...

you were doing well. In this case the shape doesn't change because of the values of the kernel and padding

#

(32 - 3 + 2 * 1 ) / 1 + 1 = 32

#

but imagine

#

we have a kernel size of 4x4

#

(32 - 4 + 2 * 1 ) / 1 + 1 = 31

#

now the value decreases

#

so each layer will decrease it by 1
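
The formula used in these messages can be written as a small helper. A minimal sketch: the name `conv_output_size` and its argument names are illustrative only, following the W, F, P, S notation from the chat.

```python
def conv_output_size(w, f, p=0, s=1):
    """Spatial output size of a convolution: floor((W - F + 2P) / S) + 1."""
    return (w - f + 2 * p) // s + 1


# 3x3 kernel with padding 1 and stride 1 keeps a 32-pixel side at 32
print(conv_output_size(32, 3, p=1, s=1))  # 32

# 4x4 kernel with the same padding shrinks it by one pixel per layer
print(conv_output_size(32, 4, p=1, s=1))  # 31

# Flattened size after the last conv block with 64 output channels
side = conv_output_size(32, 3, p=1, s=1)
print(64 * side * side)  # 65536
```

The number of channels (16, 32, 64) never enters the formula; it only multiplies the flattened size at the end.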

verbal venture Aug 8, 2023, 11:52 PM

#

how are the increasing feature map values determined?

maiden wadi Aug 8, 2023, 11:54 PM

#

There is no single correct way of selecting the number of features

mild dirge Aug 8, 2023, 11:55 PM

#

There is a bit of intuition in it: you expect earlier layers to have smaller features (and a smaller receptive field), thus probably fewer of them, and later layers combine them into more complex features, of which you expect more.

#

But no set rule.

#

This translates into later layers having more channels

zealous hollow Aug 9, 2023, 1:38 AM

#

left tartan wow, that's one heckuva first search

got this error

#

i have 32 gbs of ram

#

and i just saw python take 20 gb of ram 🤣

verbal venture Aug 9, 2023, 1:54 AM

#

Does anyone know why the flattened features are 10368? Shouldn't they be 3200? My input dims are 3, 160, 160:
```py
self.model = nn.Sequential(
    nn.Conv2d(3, 32, 3, 1),
    nn.ReLU(),
    nn.MaxPool2d(),

    # 32, 80, 80
    nn.Conv2d(32, 64, 3, 1),
    nn.ReLU(),
    # 64, 40, 40
    nn.MaxPool2d(),

    # 32, 40, 40
    nn.Conv2d(64, 32, 3, 1),
    nn.ReLU(),

    # 32, 20, 20
    nn.MaxPool2d(),

    # 32, 20, 20
    nn.Conv2d(32, 32, 3, 1),
    nn.ReLU(),
    # 32, 10, 10
    nn.MaxPool2d())

self.classifier = nn.Sequential(
    nn.Linear(10368, 2048),
    nn.Linear(2048, 128),
    nn.Linear(128, n_classes))
```

zealous hollow Aug 9, 2023, 3:13 AM

#

verbal venture Does anyone know why the flattened features are 10368? Shouldn't they be 3200? M...

The discrepancy in flattened features arises from not accounting for the final Conv2d layer's output dimensions before the classifier; specifically, the dimensions after the last MaxPool2d layer are 32x8x8, leading to 8192 flattened features instead of 10368.

verbal venture Aug 9, 2023, 3:21 AM

#

zealous hollow The discrepancy in flattened features arises from not accounting for the final C...

why would that equate to 8192

#

shouldn't the flattened input be 32 * 8 * 8 = 2048?

dire iron Aug 9, 2023, 3:23 AM

#

verbal venture shouldn't the flattened input be 32 * 8 * 8 = 2048?

why would they be 32 * 8 * 8

verbal venture Aug 9, 2023, 3:24 AM

#

don't you flatten the feature maps * h * w

#

if the final layer after max pool is 32 * 8 * 8

dire iron Aug 9, 2023, 3:25 AM

#

what features do you care about? if output, then only the final layer, right?

verbal venture Aug 9, 2023, 3:27 AM

#

Idk I thought that was just the process

#

pass the input through the conv layers then get the flattened output and use as FCL input

dire iron Aug 9, 2023, 3:33 AM

#

well, if you're classifying something, it just uses 3 linear layers. if you are training it, then it is nn.Conv2d(32, 32, 3, 1),

verbal venture Aug 9, 2023, 3:38 AM

#

yeah I'm training it

#

just confused how 8192 got reached as the flattened layers

twilit tundra Aug 9, 2023, 5:26 AM

#

verbal venture just confused how 8192 got reached as the flattened layers

Your intermediate dimensions seem wrong, how did you compute them?

#

If the stride is 1, the dimensions will not be divided but reduced by dilation * (kernel size - 1) (offset by 2 * padding if there is any)
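
The intermediate sizes for the 160 x 160 network in the question can be traced layer by layer. This is a sketch under stated assumptions: no padding on the convolutions, and 2 x 2 max pooling with stride 2 (the posted `nn.MaxPool2d()` calls do not show a kernel size). The helper names `out_size` and `trace` are made up for the illustration.

```python
def out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output size along one spatial axis, as documented for Conv2d and MaxPool2d."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def trace(size, layers):
    """Return the spatial size after each (name, kernel, stride) layer in turn."""
    sizes = [size]
    for _name, kernel, stride in layers:
        size = out_size(size, kernel, stride)
        sizes.append(size)
    return sizes


# One block = 3x3 conv with stride 1, then an assumed 2x2 max pool with stride 2
block = [("conv", 3, 1), ("pool", 2, 2)]
sizes = trace(160, block * 4)
print(sizes)  # [160, 158, 79, 77, 38, 36, 18, 16, 8]

# Flattened features with 32 channels after the last pooling step
print(32 * sizes[-1] ** 2)  # 2048
```

Under these assumptions the final feature map is 32 x 8 x 8 = 2048 values, while 10368 = 32 * 18 * 18 matches the size after the third pooling step. A different real input size or pooling setting would change these numbers.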

split drift Aug 9, 2023, 5:31 AM

#

young granite so u generate json files ? what is ur goal

.parquet

My goal is to know how each feature was created and which inputs were used to create it

twilit tundra Aug 9, 2023, 5:54 AM

#

split drift .parquet My goal is to know how each feature was created, what were the inputs ...

Are you looking for something like this? https://github.com/qchenevier/pandas-pipeline-graphviz It doesn't seem well-maintained, unfortunately. It's apparently a built-in feature in dask

split drift Aug 9, 2023, 6:21 AM

#

twilit tundra Are you looking for something like this? https://github.com/qchenevier/pandas-pi...

yes, thank you

clever owl Aug 9, 2023, 7:37 AM

#

Im doing data transformations on an excel file. Now I wanna test a function that cleans the top off an excel file (the file sorta looks like this)

My company name. Some Additional Info Some Other stuff
Space
Space
Additional Stuff

Column Title 1 Column Title 2
Column Value 1 Column Value 2

Just wondering if for my pytest dataframe fixture, would it be ok to make it read from an excel file instead of making this manually in the dataframe? Or is communicating with an external source strictly bad for testing

brittle lily Aug 9, 2023, 8:28 AM

#

i have a bunch of rna sequences (and their secondary structures) and their corresponding energy values and im trying to find a way to identify features (patterns in their structures or sequences) common between samples of similar energy values. would be super super grateful if anyone could recommend me algorithms to look into using - im assuming this would be an unsupervised learning project and i only have experience with supervised stuff, but im looking into pca right now and not sure if thatd be useful. should i be looking into something else?

intuitively im imagining it as like a clustering + feature extraction problem where i have a bunch of dots, each representing an rna sample, and then the axis represents energy so the dots could be clustered in energy value similarity and then within each energy value cluster patterns/relationships could be found between the sequences and structures of the samples. but not sure if an ml algo exists to handle this and pca seems like not what im looking for because the axes would be principal components and not energy... pls help if u have any ideas!

#

again would really really appreciate any ideas for how to go about doing this or validation on whether pca is the way to go

quartz wigeon Aug 9, 2023, 9:06 AM

#

I'm having trouble understanding this ensemble learning boosting equation in the georgia tech course for machine learning. What does Zt represent in the equation?
Here is the link to the playlist, the explanation of the algorithm goes on for 4-5 videos.
https://www.youtube.com/watch?v=ooxQS5-Grgc&list=PLPhC147aCdDF_9RWFadPcZomRB2tjAhQ9&index=14

#

Really appreciate if someone can help me out. I'm a self learner and don't have access to a teacher so discord is my only way to ask questions

sleek harbor Aug 9, 2023, 9:40 AM

#

Is knowing stuff like Agile (Scrum, Kanban) and/or Jira necessary for data scientists/analysts? What tools do you use?

boreal gale Aug 9, 2023, 10:13 AM

#

ads are not allowed here unless they have been previously approved. please remove your post if you haven't obtained permission to post this.

boreal gale Aug 9, 2023, 10:15 AM

#

sleek harbor Is knowing stuff like Agile (Scrum, Kanban) and/or Jira necessary for data scien...

necessary as in to get a job?
no, those are things you pick up on the job.

necessary as in to improve your workflow?
not really, they are just ways to do project management, i feel like this is just a matter of the organisation's preference

past meteor Aug 9, 2023, 12:23 PM

#

Properly applying the principles of the agile manifesto is of benefit to nearly every job

#

But it's improperly applied more often than it's properly applied, so it's pretty toxic in reality

slim bone Aug 9, 2023, 2:10 PM

#

Hey folks, for my summer vacation I took up ML as I'd like to work in the field in the future (or something close to it, at least)

Learning about the fundamental theory of neural networks was fascinating but I find the programming a bit... uninspired? I just find myself simply copy-pasting everything which feels pretty lame, regardless of how cool the outcome is.

Basically, I'd like to know if this is a common issue, and how can I make the coding process a little more creative? I feel like I'm lacking vision regarding what's exactly out there.

Hopefully most of what I'm saying makes sense, if not @ me and I'll clarify
Thanks in advance!

small wedge Aug 9, 2023, 2:47 PM

#

slim bone Hey folks, for my summer vacation I took up ML as I'd like to work in the field ...

What kind of projects have you worked with so far?

split drift Aug 9, 2023, 3:04 PM

#

While using pandas, should I write in a chaining method, or is it a recipe for disaster?

heavy crow Aug 9, 2023, 3:55 PM

#

Try picking some project (mnist classifier) and try creating the solution from scratch. Sure look at how other people have done it, but don't look up a tutorial

#

Take the tensorflow example, and then start by implementing some steps. (2d convolution, forward pass etc)

fierce merlin Aug 9, 2023, 4:10 PM

#

Yo guys im trying to build a posture detector app, from what ive found out, ill need openCV + either tensorflow OR pytorch, what would you guys recommend? i need some kind of guidance on this

slim bone Aug 9, 2023, 4:13 PM

#

small wedge What kind of projects have you worked with so far?

I'm really sorry for the late reply.
I only did a couple of vision models, where the code was pretty much entirely just this:
https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
With a few tweaks here and there.

I didn't realize just how much of the programming would just be "behind the scenes magic". It feels like everything has been implemented for me.
What am I missing here?

small wedge Aug 9, 2023, 4:29 PM

#

slim bone I'm really sorry for the late reply. I only did a couple of vision models, where...

Yeah, with modern high level libs/apis for ML it is pretty much all implemented for you. I'd say there are two branches to go down here. Either you pick a project that doesn't have a fleshed out tutorial, maybe even something you have to collect your own dataset for, and do some experimentation applying your knowledge of theory to a sort of novel project. That will lead you to do a lot of experimentation with activation functions, architectures, optimizers, hyperparameters, maybe even push you to researching other types of models such as RNNs.

I also know you've been working on the math, another option is to go the other direction and try to implement all of the low level stuff yourself. For this I would recommend using a reputable dataset like you have been, and a task that requires a small/shallow/simple model like classifying MNIST handwritten digits.

fierce merlin Aug 9, 2023, 4:35 PM

#

fierce merlin Yo guys im trying to build a posture detector app, from what ive found out, ill ...

Hey id appreciate if anyone could help

slim bone Aug 9, 2023, 4:38 PM

#

small wedge Yeah, with modern high level libs/apis for ML it is pretty much all implemented ...

> and a task that requires a small/shallow/simple model like classifying MNIST handwritten digits.
I actually did try implementing this but I was a tad bit overwhelmed. I later implemented it with Pytorch with, like, 25 lines of code (Insanity!)

I am a lot more interested in the "under the hood stuff" than the actual implementation of things. So I suppose I'm a little more interested in the math-ier side of things and actually understanding what's going on.
On a slightly different note, I can't help but wonder what people who work in the field actually do? I can't imagine they're just using Pytorch all day long.

twilit tundra Aug 9, 2023, 4:51 PM

#

slim bone > and a task that requires a small/shallow/simple model like classifying MNIST h...

Why not?

slim bone Aug 9, 2023, 4:54 PM

#

twilit tundra Why not?

Apparently it's a highly specialized field that requires master's and PhD degrees in order to be* qualified to work in the field
It seems like PyTorch has abstracted away so many layers it's almost unreal. I'm probably missing something though

desert oar Aug 9, 2023, 4:54 PM

#

slim bone > and a task that requires a small/shallow/simple model like classifying MNIST h...

you kind of need all 3: how to work with code, understanding the math (not just linear algebra, lots of probability and stats to learn too), and getting experience-based intuition for working with data and models generally

#

how you proceed through the very long process of developing all 3 of those things is up to you and depends on your immediate interests

desert oar Aug 9, 2023, 4:56 PM

#

slim bone Apparently it's a highly specialized field that requires master's and PhD degree...

i do data science professionally and i have a fairly "light" masters in quantitative social science. i've had to go back and re-learn several parts of the math that i didn't learn well or didn't learn completely enough the first time around. but even before i did that, i knew enough to fit models with pytorch. i just didn't have a really solid grasp of things enough to develop more advanced customized solutions to problems i had.

if you're doing NN stuff you're probably using pytorch on a regular basis. but plenty of people are very productive in data-scientist-like jobs and generally don't need NNs on a regular basis. it depends a lot on what you specialize in and/or what your particular company/industry needs.

fierce merlin Aug 9, 2023, 4:57 PM

#

fierce merlin Yo guys im trying to build a posture detector app, from what ive found out, ill ...

guys please 😭

slim bone Aug 9, 2023, 5:05 PM

#

desert oar i do data science professionally and i have a fairly "light" masters in quantita...

I thought Data Science merely uses ML as a tool and is not strictly about ML? Or have I missed what you're trying to say?

slim bone Aug 9, 2023, 5:08 PM

#

desert oar you kind of need all 3: how to work with code, understanding the math (not just ...

Also, regarding this - If I had to put my current ambitions into words as accurately as possible I'd say "Whatever it's like to work as an AI(*) Engineer/Scientist/Whatever in the industry, I'd like to experience that"

Unfortunately I have no knowledge in stats nor probabilities yet, so if that's impossible I'll probably put that ambition on hold. I would like to know if that's actually the case though and if I could do something relevant without any knowledge in stats

(*) I don’t know if this is the generalization I’m looking for

sleek harbor Aug 9, 2023, 6:13 PM

#

Is knowing ORM necessary, or is SQL enough?

twilit tundra Aug 9, 2023, 6:37 PM

#

slim bone Also, regarding this - If I had to put my current ambitions into words as accura...

I'm pretty sure data/ML engineering doesn't require as much statistics, but I'm not an expert. In general, you don't need a very deep understanding of statistics but it's highly recommended you at least know the fundamentals

slim bone Aug 9, 2023, 6:44 PM

#

twilit tundra I'm pretty sure data/ML engineering doesn't require as much statistics, but I'm ...

Oh, I was told the opposite - that most of the work involved is actually statistics. Curious

twilit tundra Aug 9, 2023, 6:45 PM

#

Well, it is, but it's not that hard?

slim bone Aug 9, 2023, 6:45 PM

#

Oh, that would be cool if that’s the case 🙂

twilit tundra Aug 9, 2023, 6:50 PM

#

I wouldn't recommend it over an engineering position if you're struggling with stats and probabilities, but if it's just that you haven't studied it yet and you have a good sense of stats then there shouldn't be any issue

slim bone Aug 9, 2023, 6:55 PM

#

I don’t really believe you’re inherently “bad” at something. From my experience there’s a strong correlation between grades and interest

#

That’s beside the point though

#

I’m really just trying to understand how to experience “the real deal” of AI/ML engineering or whatever

past meteor Aug 9, 2023, 7:02 PM

#

slim bone I thought Data Science merely uses ML as a tool and is not strictly about ML? Or...

Data science doesn't really mean anything

#

Because each company defines it differently

sturdy canyon Aug 9, 2023, 7:21 PM

#

slim bone I’m really just trying to understand how to experience “the real deal” of AI/ML ...

This isn't standard across pretty much anything either. You could make an entire career out of using existing open source repos for ML without ever needing to understand how exactly it works or modifying underlying architectures. It all depends on what specifically you're working on

slim bone Aug 9, 2023, 7:23 PM

#

sturdy canyon This isn't standard across pretty much anything either. You could make an entire...

Surely there's some foundation necessary to work in the field, though

#

And some common experience all of them share

sturdy canyon Aug 9, 2023, 7:23 PM

#

I've taught middle schoolers how to implement image classification models

slim bone Aug 9, 2023, 7:23 PM

#

I don't understand what this has to do with the question at hand though

twilit tundra Aug 9, 2023, 7:24 PM

#

It means middle schoolers can work as data scientists

slim bone Aug 9, 2023, 7:24 PM

#

I... have to call BS on that, I'm sorry

sturdy canyon Aug 9, 2023, 7:24 PM

#

I believe it's because you haven't posed an answerable question. There is no "standard" foundation necessary, as different companies define the roles and the requirements for those roles differently

#

My point in mentioning teaching middle schoolers classification is if someone needed a very basic classifier to be implemented across a lot of different applications with pretty low accuracy, they would be qualified to do it

slim bone Aug 9, 2023, 7:26 PM

#

Okay, I suppose there's no "standard" for being a CEO either - but I'm not asking for an absolute standard but rather a generalisation of said standard

That generalisation appears to be higher education. I'm fairly sure there's a common derivative at the end of the tunnel

#

Also, just for the record - @ Tonabrix seemed to have an idea of what I'm asking, and indeed suggested a couple of "paths" or whatever one might call them

sleek harbor Aug 9, 2023, 7:30 PM

#

slim bone Okay, I suppose there's no "standard" for being a CEO either - but I'm not askin...

I'll just note that I don't have a job, so.. don't listen to me. But if you want to work with AI, higher mathematics and statistics definitely will not be extra. I've been regretting that I didn't try to actually remember what I studied at university, and I haven't even gone very far down the rabbit hole of DS/ML/AI

Can you be a DS/MLE without strong fundamentals of math and stats? Yes, there are enough abstractions and good libraries to get by. But you'll eventually have to learn all that stuff, probably :3

slim bone Aug 9, 2023, 7:31 PM

#

sleek harbor I'll just note that I don't have a job, so.. don't listen to me. But if you want...

Yeah this has been made clear to me, I've actually been somewhat fond of the mathematics I've learned so far so I'm looking to apply my knowledge

#

The question has sort of been missed throughout the entire conversation unfortunately, it started out fairly simple and diverged to other topics I'm afraid

twilit tundra Aug 9, 2023, 7:31 PM

#

You'll need at least linear algebra or stats, both is better

sturdy canyon Aug 9, 2023, 7:31 PM

#

I also understand what you're asking, but I don't believe you're asking the right question. It's pretty straightforward to figure out what is necessary to become an ML person "in general", and what they do. Just go to indeed, search Machine Learning Engineering and do your best to find common threads. Past that, everyone is just going to give you responses based on their personal experience

slim bone Aug 9, 2023, 7:32 PM

#

I... thank you but that really wasn't the question - I have a rough idea what mathematical background is required. I'm currently in university lol

slim bone Aug 9, 2023, 7:32 PM

#

sturdy canyon I also understand what you're asking, but I don't believe you're asking the righ...

Personal experience is perfect

sturdy canyon Aug 9, 2023, 7:32 PM

#

slim bone I’m really just trying to understand how to experience “the real deal” of AI/ML ...

What did you mean by this then?

slim bone Aug 9, 2023, 7:32 PM

#

sturdy canyon What did you mean by this then?

Sorry this was directed at Rose, I forgot to tag

slim bone Aug 9, 2023, 7:33 PM

#

sturdy canyon I also understand what you're asking, but I don't believe you're asking the righ...

I'd like to emphasize that I'm not looking for an objective answer in case that wasn't clear
If there's somebody here that can share their own experience from the job market/academia that'd be grand. That's sort of what I'm looking for

twilit tundra Aug 9, 2023, 7:50 PM

#

slim bone I'd like to emphasize that I'm not looking for an objective answer in case that ...

My main tasks as a data scientist have been (on different projects, I'm a consultant) :

- data cleaning/analysis
- feature engineering
- being up-to-date on recent models/advances
- finding ways to exploit available data
- designing models
- fine-tuning models (can be deep learning models or boosting ones)
- developing an interface for POC
- deploying models/apps (most of the time in a cloud environment)
- communication to stakeholders

sturdy canyon Aug 9, 2023, 7:57 PM

#

I now do data science for a living, and have my own business that provides ML solutions to clients. However my situation will be vastly different from yours because I started as a mechanical engineer that got interested in applying the statistics electives I took in uni after working on measurement systems. I eventually worked my way through jobs focused more and more on data/stats/ML and now I'm here. In my experience, your focus should be on what makes you want to investigate and try things. If you want a list of things you should be able to do, I know a lot of what Rose said as well, but I didn't know how to do a number of them when I got what I'd consider my first true job in this field. Additionally, a lot of what I do in my actual job involves estimating risk (and therefore stats), but if you're working on keypoint tracking to put digital butterflies on people in a museum, you may never touch the kind of risk analysis I do

sturdy canyon Aug 9, 2023, 8:13 PM

#

I also have two friends that work in AI for the defense sector and high precision optics respectively (I'm in healthcare). They focus on WILDLY different things than me once the "standard" stuff above is done or needs to be customized for a specific purpose

slim bone Aug 9, 2023, 8:15 PM

#

Thank you both for the detailed description. This does give me some insights regarding where my question lacks. If you don't mind though, I'd like to try and explain my current situation and perhaps just seek "advice", instead of asking something concrete, regarding where I should continue from here

I picked up ML not long ago, and learning about the fundamental theory of the field absolutely fascinated me. Unfortunately when I got to work with Pytorch I discovered a lot of those processes have been simplified to lines like

outputs = net(inputs)
loss = criterion(outputs, labels)
loss.backward()

Which on one hand, is incredible, but on the other hand this made coding not fun at all, as everything feels like a black box.

Essentially, I like to know what's going on but I'm not sure where to continue from here. "Read a book" seems like the obvious answer here but most of the books in my arsenal are study books which require knowledge I don't yet have.

Essentially, I'm not sure where to go, or what I'm even looking for. I'm hoping this vague description of my experience should suffice for you folks to understand where my interest is and what it is I'm trying to do
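
For a single sigmoid neuron with a mean squared error loss, those three PyTorch lines can be unrolled by hand. The following is a rough sketch in plain Python, not how PyTorch implements autograd internally; the function names `forward`, `mse`, and `backward` and the toy data are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, xs):
    # Plays the role of `outputs = net(inputs)`: a = sigmoid(w . x + b) per sample
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) for x in xs]


def mse(outputs, ys):
    # Plays the role of `loss = criterion(outputs, labels)`: mean squared error
    return sum((y - a) ** 2 for y, a in zip(ys, outputs)) / len(ys)


def backward(w, b, xs, ys):
    # Plays the role of `loss.backward()`: chain rule dC/da * da/dz * dz/dw
    m = len(xs)
    outputs = forward(w, b, xs)
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, y, a in zip(xs, ys, outputs):
        delta = -2.0 / m * (y - a) * a * (1.0 - a)  # dC/da * da/dz for this sample
        for j, xj in enumerate(x):
            grad_w[j] += delta * xj  # dz/dw_j = x_j
        grad_b += delta  # dz/db = 1
    return grad_w, grad_b


# Toy data (made up): three samples with two features each
xs = [[0.5, -1.0], [1.5, 2.0], [-0.3, 0.8]]
ys = [1.0, 0.0, 1.0]
w, b = [0.1, -0.2], 0.05

loss = mse(forward(w, b, xs), ys)
grad_w, grad_b = backward(w, b, xs, ys)
print(loss, grad_w, grad_b)
```

A gradient descent step would then subtract a small multiple of `grad_w` and `grad_b` from `w` and `b`, which is what an optimizer does after `loss.backward()`.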

#

If I haven't mentioned already, I finished my calculus and linear algebra courses - no knowledge in probability/stats as those come in the following semesters

placid cedar Aug 9, 2023, 8:21 PM

#

hi guys i need ur help. if i have a line chart with the x axis from 2020 to 2022, and the y axis being sale quantity, and i have different lines representing each store which are the legends, is that bivariate or univariate

tidal bough Aug 9, 2023, 8:22 PM

#

slim bone Thank you both for the detailed description. This does give me some insights reg...

I got a lot of my knowledge of ML from Ng's free introductory course on Coursera. That was years ago, back when that course was entirely in Octave, so I don't know how the modern version (which is in Python) compares, but the course had many assignments for implementing ML primitives like backpropagation, gradient descent, support vector machines, etc. Maybe that'll make them feel less like black boxes for you (if the modern version even has these exercises still, of course).

placid cedar Aug 9, 2023, 8:23 PM

#

placid cedar hi guys i need ur help. if i have a line chart with the x axis from 2020 to 2022...

quite a basic question but wld like to clarify for my school assignments

twilit tundra Aug 9, 2023, 8:23 PM

#

tidal bough I got a lot of my knowledge of ML from Ng's free introductory course on courser...

Ng's course was(is?) incredible

sturdy canyon Aug 9, 2023, 8:25 PM

#

slim bone Thank you both for the detailed description. This does give me some insights reg...

I suppose the question I have is why do you want to learn more about how these processes exactly work? Are you hoping to go into research and/or work independently to create the next best pytorch or the most accurate/fastest model architecture X? Or is it more related to confidence in what you're developing? Pytorch is open source, so if you don't like not knowing how something works, you should be able to find exactly how these "black boxes" work. At one point I also felt like I needed to understand absolutely every detail about everything in stats worked, but after a while I personally found it to be far more practical and enjoyable when encountering something new to learn just enough to be able to implement it while paying attention to the assumptions/uncertainties with the model. That way I could see how it worked, and then determine how much more I needed to learn in order to achieve what I was trying to do.

slim bone Aug 9, 2023, 8:32 PM

#

tidal bough I got a lot of my knowledge of ML from Ng's free introductory course on courser...

From what I've seen most of these courses (Coursera, Udemy, Google, and the like) mostly focus on the "create something" side of things rather than the theory (foundation?) of it all. Then again what you're describing sounds like what I'm looking for which is a little strange. Maybe I haven't looked hard enough?

tidal bough Aug 9, 2023, 8:32 PM

#

I haven't looked at that course nowadays but back when I did it, it had lectures on a lot of the theory, and the implementation tasks were mostly of little parts.

twilit tundra Aug 9, 2023, 8:35 PM

#

slim bone From what I've seen most of these courses (Coursera, Udemy, Google, and the like...

In my experience, coursera has very fundamental courses. You should probably sign up and try some since it's free

slim bone Aug 9, 2023, 8:36 PM

#

sturdy canyon I suppose the question I have is why do you want to learn more about how these p...

"Why?" is a bit of a hard question because I'm at the beginning of my road of course. When I envision myself working in the field I'm thinking about the "Big dreamy models", like chatGPT or an automated car or whatever.

Of course this is all a "postcard description" but I'm not sure how else to put it. The dream project would probably be an attempt at a general purpose AI? Or a model capable of automating basic tasks millions of people have as their job nowadays?

Whatever that entails, ig. Honestly a part of me just tells me to wait out and take the university courses I'll inevitably have to take anyway, but I have some time to burn and it'd be lovely to spend it on something I gravitate towards

slim bone Aug 9, 2023, 8:36 PM

#

twilit tundra In my experience, coursera has very fundamental courses. You should probably sig...

Completely unrelated but isn't Coursera an extremely expensive monthly subscription service?

twilit tundra Aug 9, 2023, 8:37 PM

#

You can enroll for free on all courses

#

You just don't get the certification

slim bone Aug 9, 2023, 8:37 PM

#

Wait, really?

twilit tundra Aug 9, 2023, 8:38 PM

#

There is a small link when you click to enroll on a course

slim bone Aug 9, 2023, 8:39 PM

#

Must've missed it, I'll check again. Thank you

tidal bough Aug 9, 2023, 8:40 PM

#

yeah, enrolling for free locks you out of some assignments but it's usually not very different

placid cedar Aug 9, 2023, 8:42 PM

#

im getting quite confused and struggling to distinguish bivariate and univariate analyses, cld anyone lend me a helping hand? 🥲

#

tried finding information online, couldn't really find any useful sources

sturdy canyon Aug 9, 2023, 8:46 PM

#

slim bone "Why?" is a bit of a hard question because I'm at the beginning of my road of co...

As someone who spent a bunch of time trying to learn fundamentals before jumping into the real world, only to realize once I got there that I actually wanted to do something totally different (mechanical engineering -> data sci + ML), I would HIGHLY suggest you spend some time trying to answer that question. It's also why I keep asking the questions I do and responding in the way I do. I would suggest figuring out what you want to work on before going in deep on the how. Some understanding is necessary to pick a direction, but not much. Whether it be the nitty gritty math, or implementing models for time series analysis, image analysis, NLP, etc. I would pick one (or many) things that interest you and try to figure out how to do it yourself, even if it doesn't work well. If you really like what you picked, and are like me, you'll find motivation in figuring out how to make it better. From there, you have the context and the reason to want to dig into the "black box" and find out what you need to learn in order to do what you want to do.

#

Or, you could go deep into a field and discover that you really love it via learning the technical details first and everything's great! Though, in my case I just hope some day I find a use for the fluid dynamics of viscous plastic extrusion that's still taking up space in my brain

slim bone Aug 9, 2023, 8:53 PM

#

sturdy canyon As someone who spent a bunch of time trying to learn fundamentals before jumping...

Admittedly, I just enjoy learning about the abstractions and the math behind things at the moment :/ So I figured whatever involves "that", is what I'd like to try for now

I completely agree with what you're saying, about trying out everything, but I have a hunch that a lot of "things" are sort of "out of my reach" in terms of knowledge (Please do correct me if I'm wrong on this, but NLP seems crazy complicated for example).

That might be kind of what I'm poking at though - how do I even go about "trying everything"?

#

I'd like to emphasize that this is supposed to be more "fun" than actually practical. Although, if I could spare myself some future headache when I'll actually have to learn about this in university that'd be grand

slim bone Aug 9, 2023, 8:58 PM

#

sturdy canyon Or, you could go deep into a field and discover that you really love it via lear...

Funny you should say that, I got into CS because I had to take a programming course as an entrance test to get accepted into a Psychology major

So maybe I won't even get to work in ML in the future and will shift to something else entirely, who knows? ^^;

twilit tundra Aug 9, 2023, 8:59 PM

#

A psych major doing ML would be good for the field

sturdy canyon Aug 9, 2023, 9:00 PM

#

slim bone Admittedly, I just enjoy learning about the abstractions and the math behind thi...

If you want the way I did it, start with something that irritates you, and see if you think it could be predicted (even partially) and/or automated. From there, look into how other people did it, and how you might do it yourself

#

I've got to go, but feel free to DM me if you still have more questions

rough nova Aug 9, 2023, 9:02 PM

#

@sturdy canyon
Ok

#

Tire means

slim bone Aug 9, 2023, 9:03 PM

#

twilit tundra A psych major doing ML would be good for the field

Maybe later down the line, for now I’m enjoying CS 🙂

slim bone Aug 9, 2023, 9:03 PM

#

sturdy canyon I've got to go, but feel free to DM me if you still have more questions

Cheers! Take care

twilit tundra Aug 9, 2023, 9:03 PM

#

I meant the ML field

slim bone Aug 9, 2023, 9:04 PM

#

Yeah I got that haha

raw compass Aug 9, 2023, 9:06 PM

#

What is the best Linux server for training a model?

#

Like what cloud

twilit tundra Aug 9, 2023, 9:07 PM

#

raw compass What is the best Linux server for training a model?

Based on what?

tidal bough Aug 9, 2023, 9:09 PM

#

I remember seeing some nice site that trained some model (ResNet?) on several cloud providers and provided a table with the costs, but I don't have a link

#

(I think AWS was ahead? not sure the results would even apply a year or two later, though)

raw compass Aug 9, 2023, 9:20 PM

#

twilit tundra Based on what?

or what is the most popular way, I don't think people train models on personal computers.

twilit tundra Aug 9, 2023, 9:21 PM

#

AWS is probably the most popular one, followed by Azure

raw compass Aug 9, 2023, 9:23 PM

#

twilit tundra AWS is probably the most popular one, followed by Azure

under what distro?

misty flint Aug 9, 2023, 9:23 PM

#

??

#

oh you mean linux distro on the cloud?

raw compass Aug 9, 2023, 9:24 PM

#

misty flint oh you mean linux distro on the cloud?

yes

misty flint Aug 9, 2023, 9:24 PM

#

amazon linux is the default for most

#

they have their own flavor of distro

#

for aws

#

its so their services can be compatible (SageMaker, ECR, Lambda, etc.)

#

dont know about azure or gcp

twilit tundra Aug 9, 2023, 9:27 PM

#

Out of curiosity, why are you interested in the distro used by cloud services?

sturdy canyon Aug 9, 2023, 9:50 PM

#

rough nova @sturdy canyon Ok

What?

dusty valve Aug 9, 2023, 10:12 PM

#

hello #data-science-and-ml, i got a bunch of different datasets, some are weekly data, some daily, some monthly. i need to group them all together, preferably like i round down the weekly rows to the monthly rows, daily rows to monthly etc... so should i just write up a script that finds the month that each week/day occurs in?

#

how would i do that with pandas?

raw compass Aug 9, 2023, 10:18 PM

#

twilit tundra Out of curiosity, why are you interested in the distro used by cloud services?

well its good to know before trying one out.

twilit tundra Aug 9, 2023, 10:23 PM

#

dusty valve hello #data-science-and-ml, i got a bunch of different datasets, some are weekl...

Convert the date column to datetime; once it's in datetime format you can directly access the month/week/etc.: https://pandas.pydata.org/docs/user_guide/timeseries.html
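
#

A minimal sketch of that idea, assuming each dataset has a date column and numeric values that make sense to sum per month. The frames and the column names (`date`, `visits`, `orders`) are made up for illustration.

```python
import pandas as pd

# Hypothetical daily and weekly datasets; the column names are assumptions.
daily = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-15", "2023-02-03"],
    "visits": [10, 20, 5],
})
weekly = pd.DataFrame({
    "date": ["2023-01-02", "2023-01-30", "2023-02-06"],
    "orders": [3, 4, 7],
})


def to_monthly(df: pd.DataFrame, date_col: str = "date") -> pd.DataFrame:
    """Round every row down to its calendar month and sum the other columns."""
    out = df.copy()
    out[date_col] = pd.to_datetime(out[date_col])
    out["month"] = out[date_col].dt.to_period("M")
    return out.drop(columns=[date_col]).groupby("month").sum()


# Align both datasets on the shared monthly index.
monthly = to_monthly(daily).join(to_monthly(weekly), how="outer")
print(monthly)
```

Note that a week straddling two months lands entirely in the month of its date stamp here; whether that is acceptable depends on the data.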

final hound Aug 9, 2023, 11:30 PM

#

hi, i know this is pretty simple but how do i take an average of a dataframe column? I tried np.average but i get errors, and i've also tried .mean() with similar errors

serene scaffold Aug 10, 2023, 12:49 AM

#

final hound hi,i know this is pretty simple but how do i take an average of a dataframe colu...

you would do df['the_column'].mean(). if you tried that and you "got an error", please copy and paste the whole error message into this chat.

#

(In general, you should always give the error message for the error you need help with. if you just say that you got an error, we have no way of knowing what it is until you tell us.)
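
#

Since the actual error was never posted, this is only a guess at one common cause: the column was read in as strings (object dtype), so the average fails. A small sketch with a made-up frame, assuming that is the situation:

```python
import pandas as pd

# Hypothetical frame: the numbers arrived as strings, with one bad value.
df = pd.DataFrame({"the_column": ["1.5", "2.5", "oops", "4.0"]})

# Coerce to numeric; values that cannot be parsed become NaN.
numeric = pd.to_numeric(df["the_column"], errors="coerce")

# Series.mean() skips NaN by default.
average = numeric.mean()
print(average)
```

Checking `df.dtypes` first shows whether this applies at all.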

slim lance Aug 10, 2023, 12:59 AM

#

What’s a good format on disk for time series data? (Basically wondering if there is something like parquet but optimized for time series?)

left tartan Aug 10, 2023, 1:09 AM

#

slim lance What’s a good format on disk for time series data? (Basically wondering if there...

Parquet is pretty darn good for time series, imo. not sure I know anything better.

shy rock Aug 10, 2023, 1:19 AM

#

Hi @hot obsidian - How to call(execute) a function with 2 or more dataframe arguments? Sample code as below.. I would like to print the result of this function.

#

import pandas as pd

def department_highest_salary(emp: pd.DataFrame, dept: pd.DataFrame):
    merged_data = pd.merge(emp, dept, left_on='DEPTNO', right_on='DEPTNO')
    grouped = merged_data.groupby('ENAME')['SAL'].max().reset_index()
    result = pd.merge(merged_data, grouped, how='inner', left_on=['DEPTNO', 'SAL'], right_on=['DEPTNO', 'SAL'])
    result = result.rename(columns={'DEPTNO': 'Department', 'ENAME': 'Employee', 'SAL': 'Salary'})
    return result[['Department', 'Employee', 'Salary']]

left tartan Aug 10, 2023, 1:22 AM

#

Not sure I understand the q. You're asking how to call a function with two arguments? That's just result=department_highest_salary(df1, df2)

shy rock Aug 10, 2023, 1:27 AM

#

left tartan Not sure I understand the q. You're asking how to call a function with two argum...

Let me try, thanks!!

#

Hi @left tartan - Thanks for response. I did pass the arguments like below. However, i am getting a key error

#

result=department_highest_salary(emp, dept)

left tartan Aug 10, 2023, 1:30 AM

#

You'd have to share the full error plz.

slim lance Aug 10, 2023, 1:38 AM

#

left tartan Parquet is pretty darn good for time series, imo. not sure I know anything bette...

I really don’t know how to wrap my head around it. I feel like time series data would best be represented in a 3 dimensional data frame. Think daily or hourly metrics from thousands of data producers. I’d want to look in aggregate over time as well as drill into individual time series.

left tartan Aug 10, 2023, 1:38 AM

#

That's more where hive comes in tho, partitioning across... say... producer

slim lance Aug 10, 2023, 1:41 AM

#

I see.. So I can’t do this just in the storage layer? (I wanted to just use python with an on disk format.)

left tartan Aug 10, 2023, 1:43 AM

#

Hive partitioning is just a directory organization of parquet files, so yah, you can do it in the storage layer

slim lance Aug 10, 2023, 1:43 AM

#

Ah ok.. I understand now..

#

🙏

left tartan Aug 10, 2023, 1:45 AM

#

slim lance Ah ok.. I understand now..

fwiw, you dont need "hive" to read hive files. pyspark, duckdb, etc can read hive partitioned files.

shy rock Aug 10, 2023, 1:49 AM

#

left tartan You'd have to share the full error plz.

Hi @edgy venturey Bobby - Just checking if you got the full error message as file?

left tartan Aug 10, 2023, 1:50 AM

#

shy rock Hi @edgy venturey Bobby - Just checking if you got the full error messag...

Nope, just paste the text in discord

#

You can't upload files

shy rock Aug 10, 2023, 1:50 AM

#

KeyError Traceback (most recent call last)

D:\Anaconda3\lib\site-packages\pandas\core\generic.py in _get_label_or_level_values(self, key, axis)
1838 values = self.axes[axis].get_level_values(key)._values
1839 else:
-> 1840 raise KeyError(key)
1841
1842 # Check for duplicates

KeyError: 'DEPTNO'

#

KeyError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_18972\3746871449.py in <module>
----> 1 result=department_highest_salary(emp, dept)

~\AppData\Local\Temp\ipykernel_18972\3858326592.py in department_highest_salary(emp, dept)
5 merged_data = pd.merge(emp, dept, left_on='DEPTNO', right_on='DEPTNO') # join two data frames based on
6 grouped = merged_data.groupby('ENAME')['SAL'].max().reset_index()
----> 7 Merge_group = pd.merge(merged_data, grouped, how='inner', left_on=['DEPTNO', 'SAL'], right_on=['DEPTNO', 'SAL'])
8 Merge_group= Merge_group.rename(columns={'DEPTNO': 'Department', 'ENAME': 'Employee', 'SAL': 'Salary'})
9 return Merge_group[['Department', 'Employee', 'Salary']]

D:\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
105 validate: str | None = None,
106 ) -> DataFrame:
--> 107 op = _MergeOperation(
108 left,
109 right,

D:\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py in __init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, copy, indicator, validate)
698 self.right_join_keys,
699 self.join_names,
--> 700 ) = self._get_merge_keys()
701
702 # validate the merge keys dtypes. We may need to coerce

D:\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py in _get_merge_keys(self)
1095 if not is_rkey(rk):
1096 if rk is not None:
-> 1097 right_keys.append(right._get_label_or_level_values(rk))
1098 else:
1099 # work-around for merge_asof(right_index=True)

timber sinew Aug 10, 2023, 2:00 AM

#

Hi everybody! I'm looking for someone who uses arcgis and langchain to provide feedback on my pull request

https://github.com/langchain-ai/langchain/pull/8873

GitHub

Create ArcGISLoader & example notebook by joshuasundance-swca · Pul...

Description: Adds the ArcGISLoader class to langchain.document_loaders
Allows users to load data from ArcGIS Online, Portal, and similar
Users can authenticate with arcgis.gis.GIS or retrieve publi...

shy rock Aug 10, 2023, 2:13 AM

#

left tartan You can't upload files

Hi @left tartan - Please ignore.. I figured out, an error in code... fixed myself...thanks for response..
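
#

For anyone hitting the same traceback: it points at the second merge, whose right-hand frame comes from `groupby('ENAME')` and so has no `DEPTNO` column. The poster did not say what the actual fix was, so the following is only one plausible version, assuming the goal is the highest salary per department. The sample rows are made up.

```python
import pandas as pd


def department_highest_salary(emp: pd.DataFrame, dept: pd.DataFrame) -> pd.DataFrame:
    merged = pd.merge(emp, dept, on="DEPTNO")
    # Group by department (not employee) so DEPTNO survives as a column.
    top = merged.groupby("DEPTNO")["SAL"].max().reset_index()
    # Keep only the rows whose salary equals their department's maximum.
    result = pd.merge(merged, top, how="inner", on=["DEPTNO", "SAL"])
    result = result.rename(
        columns={"DEPTNO": "Department", "ENAME": "Employee", "SAL": "Salary"}
    )
    return result[["Department", "Employee", "Salary"]]


# Hypothetical sample data, only for demonstration.
emp = pd.DataFrame({
    "ENAME": ["SMITH", "ALLEN", "KING"],
    "SAL": [800, 1600, 5000],
    "DEPTNO": [20, 30, 20],
})
dept = pd.DataFrame({"DEPTNO": [20, 30], "DNAME": ["RESEARCH", "SALES"]})

result = department_highest_salary(emp, dept)
print(result)
```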

charred light Aug 10, 2023, 3:49 AM

#

For Power BI, How do I create a dynamic index based on the current view? Like .reset_index() in python applied every time the table is updated.

coral field Aug 10, 2023, 4:45 AM

#

Is there any way to get more free compute credits on Google Colab after using all 100 (?) hours?

#

And does the T4 GPU even take up credits?

late shell Aug 10, 2023, 4:48 AM

#

Hello everyone, I want to use an open source LLM (like Llama 2) for a text generation task. My prompt looks something like this:

Use the given question and context to generate a detailed, 
authentic description about the machine. Make it sound as if you are a great salesman and are pitching this machine 
to a potential buyer. Use good formatting and the description should not be too long (About 200 words only). 
Try to make it as easy to read as possible. Most importantly, you absolutely must include all the information provided in the description 
that you generate. Do not make up new information. It's a pre owned machine, therefore the description should not be like the launch of a new product.

Generate a description of the machine using the information provided under the Context.
 
Context: 

categoryName: Post Press
subcategoryName: Saddle Stitcher
subsubcategoryName: Conveyor belt
manufacturerName: Monotype
Year: 0.0
MachineModelName: Boston Double Head Stitching
Location: Germany
Info: DOUBLE HEAD STITCHING MACHINE BOSTON

2 HEAD FLAT AND SADDLE STITCHING MACHINE
DOUBLE WIRE

SCENARIO: I'm currently running the ggml quantized versions of llama-2-7b and llama-2-13b locally (I can't use API-based models due to my company's data security concerns). The results this prompt generates are somewhat satisfactory as a starting point, but the problem is that it takes around 4-5 mins to generate the whole response (with 33.6 GB of RAM and an NVIDIA GeForce GTX 1080 Ti). Sometimes it just keeps running for 15 mins and doesn't generate anything.

QUESTION: I'm wondering if I can either speed up the inference somehow or downgrade my model, since the quantized llama-2-7b/13b models could be overkill for this task. I want to use a model that gives satisfactory results while using the least amount of resources, since I need to run this model on the company server. How do I go about narrowing down my model hunt for this task?

void sail Aug 10, 2023, 7:41 AM

#

late shell Hello everyone, I want to use an open source LLM (like LlaMA 2) for text generat...

Hi there! Take a look at vLLM, it will (hopefully) give you acceptable performance

#

If you want to research more on your own this issue is called: inference speed up / tokens per second

#

On an M1 Mac you can reach 20-30 tokens per second without any effort on llama 2 13b 4 bit ggml

late shell Aug 10, 2023, 7:43 AM

#

Oh okay okay. vLLM looks promising. I'll look into it. There's also exllama, do you think that could be helpful too?

covert hearth Aug 10, 2023, 8:38 AM

#

Hey All,

I am not sure this is the right chat, but it is about LLMs.
So, I am trying to get https://github.com/mosaicml/llm-foundry running.
Short: I think I am there, the only problem I seem to be facing is that composer seems not to be using my conda env and thus I am not able to run the train example on this page.

Long:
I installed all deps et cetera and I am trying out the quick start example.
When I am at the training part, you have to run:"composer train/train.py \ et cetera"
This returns ModuleNotFoundError: No module named 'llmfoundry'. Which is interesting since when I open python and import this, it works.
When I was debugging I found that composer wants to use another python executable: /sw/arch/RHEL8/EB_production/2022/software/Python/3.10.4-GCCcore-11.3.0/bin/python
Is there a way to force to use the conda env python with all the required packages et cetera?

quartz wigeon Aug 10, 2023, 9:18 AM

#

Is there a machine learning classifier algorithm that classifies points on a 2d plane using a vertical or horizontal line as a separator?

#

I'm trying to write an ensemble learning algorithm from scratch, and I need a simple classifier like this as my base learner

sleek harbor Aug 10, 2023, 9:45 AM

#

🤔 how important is it for a data scientist to know a framework, like flask or django? Or is that a mostly useless skill and not necessary at all?

mild dirge Aug 10, 2023, 9:50 AM

#

quartz wigeon Is there a machine learning classifier algorithm that classifies points on a 2d ...

This is basically just any classifier (like logistic regression) but it takes only the x coordinate (vertical line) or y coordinate (horizontal line) of the input points. @quartz wigeon

quartz wigeon Aug 10, 2023, 9:54 AM

#

mild dirge This is basically just any classifier (like logistic regression) but it takes on...

can you clarify? I'm quite new to machine learning

twilit tundra Aug 10, 2023, 10:23 AM

#

sleek harbor 🤔 how important is it for a data scientist to know a framework, like flask or d...

Not always useful. I've had to use Flask but I think I'm in the minority

mild dirge Aug 10, 2023, 10:30 AM

#

quartz wigeon can you clarify? I'm quite new to machine learning

So logistic regression predicts a line that separates the points in 2d space (with an orientation and position). If you flatten the points, i.e. take only the x coordinate, or the y coordinate, you can predict a point that separates points on either side of the point in 1d space. This is the same as predicting a horizontal/vertical line in 2d space that separates the points.

quartz wigeon Aug 10, 2023, 10:31 AM

#

mild dirge So logistic regression predicts a line that separates the points in 2d space (wi...

thanks for the tip! I'll check out logistic regression

mild dirge Aug 10, 2023, 10:31 AM

#

The important part here is that you only use the x-coordinate or y-coordinate, as this forces you to predict a horizontal line or vertical line.
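
#

If the base learner has to be written from scratch, a decision stump is a common choice for exactly this kind of axis-aligned split. To be clear, this is a different technique from the logistic regression described above: it simply tries every threshold on each coordinate and keeps the one with the fewest mistakes. A sketch, assuming labels in {-1, +1} and made-up toy data; the function names are illustrative.

```python
import numpy as np


def fit_stump(X: np.ndarray, y: np.ndarray):
    """Search for the axis-aligned threshold with the fewest errors.

    X has shape (n, 2); y holds labels in {-1, +1}.
    Returns (feature, threshold, polarity): predict +1 where
    polarity * (X[:, feature] - threshold) > 0, otherwise -1.
    """
    best = (0, 0.0, 1)
    best_errors = len(y) + 1
    for feature in range(X.shape[1]):  # 0 -> vertical line, 1 -> horizontal line
        values = np.unique(X[:, feature])
        # Candidate thresholds: midpoints between consecutive distinct values.
        thresholds = (values[:-1] + values[1:]) / 2
        for threshold in thresholds:
            for polarity in (1, -1):
                pred = np.where(polarity * (X[:, feature] - threshold) > 0, 1, -1)
                errors = int(np.sum(pred != y))
                if errors < best_errors:
                    best_errors = errors
                    best = (feature, float(threshold), polarity)
    return best


def predict_stump(stump, X: np.ndarray) -> np.ndarray:
    feature, threshold, polarity = stump
    return np.where(polarity * (X[:, feature] - threshold) > 0, 1, -1)


# Toy data that is separable by the vertical line x = 2.5.
X = np.array([[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]])
y = np.array([-1, -1, 1, 1])

stump = fit_stump(X, y)
print(stump, predict_stump(stump, X))
```

For a boosting-style ensemble, the plain error count would typically be replaced by a sum of per-sample weights over the misclassified points.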

past meteor Aug 10, 2023, 10:35 AM

#

sleek harbor 🤔 how important is it for a data scientist to know a framework, like flask or d...

I got my first job without knowing any of this

#

Nowadays I'm more interested in making data / AI products and not just making models so I learnt those by myself. It's definitely not a requirement though

sleek harbor Aug 10, 2023, 10:37 AM

#

past meteor Nowadays I'm more interested in making data / AI products and not just making mo...

just out of curiosity, what exactly did u end up learning?

past meteor Aug 10, 2023, 10:40 AM

#

I read MDN's documentation first and then learnt (some of) Django and did a project without any JS

#

Django is one of the best documented projects so it's a good place to start

sleek harbor Aug 10, 2023, 10:44 AM

#

I was thinking of just learning js (svelte or react) and doing everything there

past meteor Aug 10, 2023, 10:45 AM

#

Afterwards I progressively went towards JS, Typescript and so on

sleek harbor Aug 10, 2023, 10:45 AM

#

look at me, making plans for the distant future when I don't even have a job.. :/

harsh bane Aug 10, 2023, 11:20 AM

#

Hoi, i'm really close to getting stable diffusion to run on steam deck in ubuntu 22.04, any idea how to fix these last remaining conflicts/issues?

past meteor Aug 10, 2023, 12:55 PM

#

sleek harbor look at me, making plans for the distant future when I don't even have a job.. :...

Don't worry - it could be part of your "strategy" imo

#

In businesses, notebooks and models (purely exploratory work) don't really mean too much unless you're a bona fide statistician. You need to be able to put it into production / work. Smaller companies don't have the budget to have a data engineer, data analyst, AI engineer, frontend dev, backend dev and a devops.

#

You can be a generalist and spread yourself a bit more thin, but do end-to-end work

#

Or you can be a specialist and pick out for instance NLP, Vision, Time series, ... or a business domain e.g., finance and do that really well.

somber prism Aug 10, 2023, 3:09 PM

#

Hi, is anyone here familiar with the fiftyone module? I’m getting a ServiceListenTimeout error, meaning fiftyone is failing to bind to a port while importing the module

void sail Aug 10, 2023, 3:37 PM

#

sleek harbor look at me, making plans for the distant future when I don't even have a job.. :...

Fyi hiring a data scientist is hard

#

At my company we interviewed 15 for junior and 0 made it through

#

So if you become skilled it should be "easy" to get a job depending on location

void sail Aug 10, 2023, 3:41 PM

#

covert hearth Hey All, I am not sure this is the right chat, but it is about LLMs. So, I am t...

I suggest switching to poetry and/or pyenv

sleek harbor Aug 10, 2023, 3:53 PM

#

void sail So if you become skilled it should be "easy" to get a job depending on location

Exactly how skilled does one have to be? 👀

sleek harbor Aug 10, 2023, 3:53 PM

#

void sail At my company we interviewed 15 for junior and 0 made it through...

Subjective question, but is it because they were all that bad, or is it that the requirements and expectations are just very high for DS juniors?

#

I wonder if I would pass 🤔

void sail Aug 10, 2023, 4:11 PM

#

sleek harbor Subjective question, but is it because they were all *that bad*, or is it that t...

I dont mind sharing the whole process in DMs and where applications failed to meet expectations

#

If it helps you

soft badge Aug 10, 2023, 5:07 PM

#

Guys, I am trying to compare 2 dataframes to verify which rows were changed, created and removed. Can anyone help?

north epoch Aug 10, 2023, 5:13 PM

#

Can someone help me write data into Excel faster? For 20K records it takes around 25 seconds with pandas xlsxwriter; with pyexcelerate it takes around 12 seconds. But pyexcelerate has limitations on formatting (Accounting format).

iron folio Aug 10, 2023, 5:23 PM

#

Hi anyone here who tried ibm watson ai to create a chatbot using python?

twilit tundra Aug 10, 2023, 5:44 PM

#

soft badge GUys i am try to compare 2 dataframes verify what rows changed, created and remo...

https://pandas.pydata.org/docs/reference/api/pandas.testing.assert_frame_equal.html

soft badge Aug 10, 2023, 5:45 PM

#

twilit tundra https://pandas.pydata.org/docs/reference/api/pandas.testing.assert_frame_equal.h...

but i need to check which values are different

twilit tundra Aug 10, 2023, 5:46 PM

#

soft badge but i need verify values that are differents

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.compare.html

soft badge Aug 10, 2023, 5:47 PM

#

but the rows in the dfs are different

#

what doesn't change is the columns

twilit tundra Aug 10, 2023, 5:48 PM

#

Yes that's what this function is for

#

Gives you the rows that are different

soft badge Aug 10, 2023, 5:51 PM

#

ValueError: Can only compare identically-labeled (both index and columns) DataFrame objects

#

diff = df1.compare(df2, align_axis = 0)

visual tundra Aug 10, 2023, 6:01 PM

#

Can someone please help me with this

I tried to import 'llama_index' in Jupyter and it shows the following error:

'If you use @root_validator with pre=False (the default) you MUST specify skip_on_failure=True. Note that @root_validator is deprecated and should be replaced with @model_validator.'

Apparently Pydantic V2 has made some changes and that is what triggers this error.

twilit tundra Aug 10, 2023, 6:03 PM

#

soft badge ValueError: Can only compare identically-labeled (both index and columns) DataFr...

Makes sense, compare only works if both frames have identical index and column labels, so it can't handle added or removed rows. My intuition is to do an outer join between the two and filter the rows with NaNs
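
#

A sketch of the outer-join idea, assuming both frames share a key column that identifies a row. The `id` and `price` columns and the sample values are made up. It uses `indicator=True` to label which side each row came from instead of looking for NaNs directly, which gives the same split.

```python
import pandas as pd

# Hypothetical old and new snapshots keyed by an "id" column.
old = pd.DataFrame({"id": [1, 2, 3], "price": [10, 20, 30]})
new = pd.DataFrame({"id": [2, 3, 4], "price": [20, 35, 40]})

# Outer join on the key; the _merge column records the origin of each row.
merged = old.merge(
    new, on="id", how="outer", suffixes=("_old", "_new"), indicator=True
)

removed = merged[merged["_merge"] == "left_only"]   # only in the old frame
created = merged[merged["_merge"] == "right_only"]  # only in the new frame
both = merged[merged["_merge"] == "both"]
changed = both[both["price_old"] != both["price_new"]]

print(removed["id"].tolist(), created["id"].tolist(), changed["id"].tolist())
```

With several value columns, the last comparison would need to check each old/new pair.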

neon jay Aug 10, 2023, 6:06 PM

#

Hey guys, I'm doing a kaggle comp rn and I am using gradient boosted regressor model along with using iterative imputer to fill in missing values. My laptop apparently isnt performing this well and has been running for a few hours I think there's some problem with it. But if I gave someone the dataset and the code could you please run it for me? It would rlly help a lot with my chances of getting higher on the leaderboard

vernal dome Aug 10, 2023, 6:18 PM

#

Excited to announce the initial release of VectorFlow, written in Python! VectorFlow is an open-source, high volume vector embedding pipeline.

Our pipeline is built to embed large volumes of data quickly and reliably. While embedding a handful of documents for Q&A is straightforward, the real challenge arises when ingesting gigabytes of unstructured data to leverage the full power of LLMs on top of your data.

With just a simple API request, you can effortlessly embed raw data and store the vectors in your vector database, eliminating the need for intricate cloud infrastructure setups.

🔗 Check out our Github repo and give us a star: https://lnkd.in/en6FhfN9

For all the innovators working with vector databases, we're eager to hear your insights, feedback, and ideas for the roadmap.

Demo can be viewed here: https://www.youtube.com/watch?v=aQOlOT14DaA

And our website is here, sign up for a free consultation: https://www.getvectorflow.com/

YouTube

Garnizzle

Vectorflow Demo

VectorFlow is an open source, high throughput, fault tolerant vector embedding pipeline. With a simple API request, you can send raw data that will be embedded and stored in any vector database or returned back to you.

VectorFlow

VectorFlow: Open source, high-throughput, fault-tolerant vector embedding pipeline. Simple API endpoint that ingests large volumes of raw data, processes, and stores or returns the vectors quickly and reliably.

granite atlas Aug 10, 2023, 6:29 PM

#

@iron basalt good evening mate

#

I had a question about the nneu source material you suggested to me prior

#

When we multiply the "slopes" by the error, we are reducing the error of high confidence predictions

#

What does this statement mean

#

from https://iamtrask.github.io/2015/07/12/basic-python-network/

A Neural Network in 11 lines of Python (Part 1) - i am trask

A machine learning craftsmanship blog.

sleek harbor Aug 10, 2023, 6:48 PM

#

neon jay Hey guys, I'm doing a kaggle comp rn and I am using gradient boosted regressor m...

You can run your code on kaggle for free, any time and as much as you want (for up to 12h sessions, which are more like 11h, but that's still great)

neon jay Aug 10, 2023, 6:49 PM

#

What I didn’t even know about that

#

Thanks

sleek harbor Aug 10, 2023, 6:50 PM

#

neon jay Thanks

Google how to do it so u don't lose ur progress 👍

iron basalt Aug 10, 2023, 6:50 PM

#

granite atlas What does this statement mean

It seems to be a redundant statement, ignore it.

#

The following sentences are what they are getting at.

#

This post is mostly for the code, if you want an understanding of the mathematics there are other places to look.
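
#

For what it's worth, one way to read the quoted sentence, assuming the "slope" is the sigmoid derivative written in terms of the output, `output * (1 - output)`: the slope shrinks as the output approaches 0 or 1, so a confident prediction gets a much smaller update than an unsure one for the same error. The numbers below are made up.

```python
import numpy as np


def sigmoid_slope(output: np.ndarray) -> np.ndarray:
    """Derivative of the sigmoid, expressed via the sigmoid's own output."""
    return output * (1 - output)


# Three predictions, from unsure (0.5) to very confident (0.99).
outputs = np.array([0.5, 0.9, 0.99])
error = 0.1  # the same error for all three, for comparison

# "Multiplying the slopes by the error" scales each update.
updates = error * sigmoid_slope(outputs)
print(updates)
```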

placid cedar Aug 10, 2023, 7:09 PM

#

hey guys, i need some help clarifying a concept. is anyone available to help me at the moment? 🙂

#

i have 3 variables, date, store, sales amount. i put x axis as date, y axis as sales amount, and store as legends.

is this univariate, bivariate, multivariate, or neither of them?

void sail Aug 10, 2023, 7:33 PM

#

Multivariate if you use all three store timeseries for a singular goal

#

E.g. a timeseries with multiple features (2+) for each timestamp
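
#

To make that concrete, a small sketch with made-up numbers, assuming the data arrives with one row per (date, store). Pivoting gives one column per store, which is the "multiple features per timestamp" shape; picking a single column gives a univariate series.

```python
import pandas as pd

# Made-up long-format sales data: one row per (date, store).
sales = pd.DataFrame({
    "date": pd.to_datetime(
        ["2023-08-01", "2023-08-01", "2023-08-02", "2023-08-02"]
    ),
    "store": ["A", "B", "A", "B"],
    "sales_amount": [100, 80, 120, 90],
})

# Wide format: one row per timestamp, one column per store.
wide = sales.pivot(index="date", columns="store", values="sales_amount")
print(wide)

# Univariate view: a single store's series on its own.
store_a = wide["A"]
print(store_a)
```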

midnight pagoda Aug 10, 2023, 11:34 PM

#

Hello, does anyone know why tensorflow gets stuck on import? I've been waiting and nothing happens on the console (ignore the typo)

#

i'm using tensorflow 2 without avx, which i downloaded from here

GitHub

TensorFlow 2.8.0 No AVX, No GPU, Python 3.7, 3.8, 3.9, 3.10, Ubuntu...

Built on Ubuntu 18.04. Builds are not tested and provided as is. Example configuration for Python 3.7 and westmere: PYTHON_VERSION=python3.7 PYTHON_BIN_PATH=$(which $PYTHON_VERSION) \ PYTHON_LIB_PA...

#

heres quick spec of my system:

OS: Linux Mint 20.1 x86_64
Host: 80MH Lenovo ideapad 100-14IBY
Kernel: 5.4.0-58-generic
CPU: Intel Celeron N2840 (2) @ 2.582GHz
GPU: Intel Atom Processor Z36xxx/Z37xxx Series Graphics & Di
Memory: 1227MiB / 1869MiB

I'm sure the processor/ram is not a problem when just importing the module (right?)

sonic knoll Aug 11, 2023, 1:07 AM

#

Hello guys

#

Could someone recommend me a book to learn Python through projects?

ashen axle Aug 11, 2023, 1:52 AM

#

midnight pagoda Hello, does anyone know why tensorflow stuck on import? i've waiting and nothing...

try starting python in verbose mode with `python -vvv`, then import again and see what it says.

dawn patrol Aug 11, 2023, 1:57 AM

#

hello

#

I'm trying to convert a .py file to .exe

#

The .py file runs fine, but when I run the .exe it doesn't run.

#

help me

#

Thank youuuuu

midnight pagoda Aug 11, 2023, 2:39 AM

#

ashen axle try initializing python with verbose `python -vvv` then import again, see what i...

there's so much output and i dont know how to read it

dawn patrol Aug 11, 2023, 2:48 AM

#

the .py file contains only print('hello')

#

hiccc

ashen axle Aug 11, 2023, 2:50 AM

#

dawn patrol I run file .py and it is well run, but I run file .exe and it is not run.

what are you using to build the .exe?

ashen axle Aug 11, 2023, 2:54 AM

#

midnight pagoda there's so many output and i dont know how to read it

is it still hanging? whats the last line of the output?

dawn patrol Aug 11, 2023, 2:58 AM

#

I use PyInstaller to convert file .py to .exe

#

When I run the .exe, it doesn't run.

#

But when I run the .py file, it displays the word "hello"

#

I also used the "Auto py to exe" tool to convert, but the result is similar

midnight pagoda Aug 11, 2023, 3:24 AM

#

ashen axle is it still hanging? whats the last line of the output?

this

# code object from '/home/alfarizi/Documents/machine_learning_flask/venv/lib/python3.8/site-packages/tensorflow/lite/experimental/microfrontend/ops/__pycache__/gen_audio_microfrontend_op.cpython-38.pyc'
import 'tensorflow.lite.experimental.microfrontend.ops.gen_audio_microfrontend_op' # <_frozen_importlib_external.SourceFileLoader object at 0x7fa0632c0a30>

timber nexus Aug 11, 2023, 9:43 AM

#

dawn patrol But I run file .py, it display the word "hello"

here: pyinstaller --onefile [--windowed] yourfile.py

desert oar Aug 11, 2023, 2:53 PM

#

slim bone I thought Data Science merely uses ML as a tool and is not strictly about ML? Or...

it's a broad enough field that you can get through junior level (or could in the past, before it got super competitive) with expertise or deep knowledge only in 1 or 2 areas + being willing and able to learn on the job.

desert oar Aug 11, 2023, 2:54 PM

#

slim bone Also, regarding this - If I had to put my current ambitions into words as accura...

as for this, you really should take stats and probability classes while you're in school. it's harder to self-study that material than to learn it in an organized controlled environment

desert oar Aug 11, 2023, 2:54 PM

#

sleek harbor Is knowing ORM necessary, or is SQL enough?

sql is enough, you will likely not need or even want to use an ORM for most data science work

desert oar Aug 11, 2023, 2:55 PM

#

slim bone Oh, I was told the opposite - that most of the work involved is actually statist...

a lot of data science work requires you to have a good understanding of undergrad-level statistics, but you aren't usually "doing" statistics in the sense of running t-tests all day

slim bone Aug 11, 2023, 2:56 PM

#

desert oar as for this, you really _should_ take stats and probability classes while you're...

Don't worry, I couldn't skip them even if I wanted to 🙂
As a matter of fact, I'll probably have to take a couple of advanced classes related to statistics so I'll probably have that section covered

slim bone Aug 11, 2023, 2:57 PM

#

desert oar a lot of data science work requires you to have a good understanding of undergra...

Err, I'm not sure what you're trying to say
Basically, you need to know the theory, but you won't be practicing the statistics you'll learn at university?

past meteor Aug 11, 2023, 2:58 PM

#

It depends on what statistics.

#

Machine learning is technically a subset of statistics that was pioneered by computer scientists

desert oar Aug 11, 2023, 2:58 PM

#

slim bone Don't worry, I couldn't skip them even if I wanted to 🙂 As a matter of fact, I...

good. you basically need all of this:

* calculus (pre-req for probability mostly), ideally also multivariate
* linear algebra
* probability
* statistics

that's a lot of new material to learn and intuition/understanding to develop. if you think you understand the basics after a couple of lectures, you didn't spend long enough time pondering them. take the advanced classes, but don't lose sight of the fact that it all builds sequentially, and you can't really apply any of the advanced material without really understanding the fundamentals.

tough radish Aug 11, 2023, 2:59 PM

#

We are not a job recruitment board. Please do not post job ads in the future.

past meteor Aug 11, 2023, 2:59 PM

#

From that perspective it makes no sense to limit yourself to just "machine learning", techniques from "traditional" statistics are valuable

desert oar Aug 11, 2023, 2:59 PM

#

slim bone Err, I'm not sure what you're trying to say Basically, you need to know the theo...

in a lot of statistics classes, students are taught certain procedures or recipes, which are often not directly applicable in the real world. however the principles underlying them are very useful and sometimes necessary to do your job well.

past meteor Aug 11, 2023, 3:00 PM

#

Take as many of those classes as possible because they'll cover techniques you may (or may not) use in the future

slim bone Aug 11, 2023, 3:00 PM

#

past meteor It depends on *what* statistics.

Well yeah, I understand that it's an entire sub-field (Like "Calculus") in math. I'm assuming there is some broad idea of what "Intro to statistics" teaches though?

past meteor Aug 11, 2023, 3:01 PM

#

Intro to stats typically teaches probability theory (make no mistake, this is NOT part of statistics, it's a prereq), descriptive stats and inferential stats

#

People hate probability theory and get turned off statistics as a whole

desert oar Aug 11, 2023, 3:02 PM

#

it's unfortunate because probability theory is more useful than traditional statistics in some fields, e.g. reasoning about rare events and uncertain outcomes even when you don't need to "fit a model"

slim bone Aug 11, 2023, 3:02 PM

#

desert oar good. you basically need all of this: * calculus (pre-req for probability mostl...

Yeah I do have a solid foundation with Calculus (Real Analysis, apparently) and Linear Algebra
Understanding the fundamental theory is, of course, ideal - but not always possible under the extremely hectic curriculum of university

past meteor Aug 11, 2023, 3:02 PM

#

Descriptive statistics (summarizing data) is very important for machine learning as it relates closely to experimental data analysis

desert oar Aug 11, 2023, 3:02 PM

#

slim bone Yeah I do have a solid foundation with Calculus (Real Analysis, apparently) and ...

indeed, you're almost certainly going to keep studying things and learning things for years. however if you have the ability to choose priorities at all, hopefully this helps you decide what to focus on.

slim bone Aug 11, 2023, 3:02 PM

#

desert oar in a lot of statistics classes, students are taught certain procedures or recipe...

Oh of course. I don't think I'll ever be doing Delta-Epsilon proofs again but the idea behind them is rather crucial to understanding more complicated things like Gradient Descent I'd reckon

desert oar Aug 11, 2023, 3:03 PM

#

heh, if there's ever a class that i don't think has been even remotely useful for me, it's real analysis

slim bone Aug 11, 2023, 3:03 PM

#

past meteor Take as many of those classes as possible because they'll cover techniques you ...

Yeah but... my grades, y'know?

past meteor Aug 11, 2023, 3:03 PM

#

Inferential statistics is at the heart of machine learning. I see too many practitioners (even people at work!) focussing on getting a very low MSE / high accuracy when the point is actually getting unbiased estimates of performance

#

The whole notion of unbiased estimates is very rarely covered in ML classes (We only briefly spoke about it in my entire AI masters) but it's a big part of statistics

desert oar Aug 11, 2023, 3:03 PM

#

(however if you get into numerical computing then yes real analysis i believe becomes very important)

slim bone Aug 11, 2023, 3:04 PM

#

past meteor Intro to stats typically teaches probability theory (make no mistake, this is NO...

I have intro to probability as a prerequisite class to Intro to statistics, so that shouldn't be a problem

slim bone Aug 11, 2023, 3:05 PM

#

desert oar heh, if there's ever a class that i don't think has been even remotely useful fo...

Really? It's my absolute favourite actually
Almost entirely useless in practice, but it really teaches you to "think" if that makes sense

desert oar Aug 11, 2023, 3:05 PM

#

this is also why people tend to need a masters degree to even get into this field. you're usually packing a ton of things into your 4 years at school (as you should!) and you need a year or two to reset and focus a little more heavily on a smaller set of core ideas + spend more dedicated time on a thesis or capstone project

slim bone Aug 11, 2023, 3:05 PM

#

past meteor The whole notion of unbiased estimates is very rarely covered in ML classes (We o...

That's interesting, what should one do about this?

desert oar Aug 11, 2023, 3:06 PM

#

slim bone Really? It's my absolute favourite actually Almost entirely useless in practice,...

yes definitely. i shouldn't discourage people from spending their undergrad time just learning how to think. that's arguably even more foundational than any particular math concept. as i just mentioned above, part of the point of a masters is so that you can have more focused learning time on your chosen subject after spending your time on a broad range in undergrad.

past meteor Aug 11, 2023, 3:06 PM

#

Doing Kaggle implicitly helps explain this discrepancy

#

Cause in Kaggle you actually have 2 jobs:

getting a good model
Finding a way to robustly evaluate your models

#

If you don't succeed at both you're bust

slim bone Aug 11, 2023, 3:07 PM

#

desert oar this is also why people tend to need a masters degree to even get into this fiel...

Makes perfect sense honestly
I generally believe a masters degree is the way to go if you want to "hone your craft" in most cases
Then again, I'm too much of an academic newbie to have this opinion

past meteor Aug 11, 2023, 3:07 PM

#

School teaches you point 1. Unless you go to production and your model fails, you won't learn point 2 at work either

desert oar Aug 11, 2023, 3:07 PM

#

school ought to teach 2 as well

#

some curricula cover it. traditional stats does to some extent

past meteor Aug 11, 2023, 3:08 PM

#

They ought to, but they don't cover it well enough

#

Explaining what a ROC curve and cross validation are isn't enough

desert oar Aug 11, 2023, 3:08 PM

#

CV at least is a valid and useful technique

#

i remember we learned about cv, bootstrap, leave-one-out, AIC, etc

past meteor Aug 11, 2023, 3:09 PM

#

It is, but if you do like the people at work and you CV endlessly

#

It defeats the purpose of CV

desert oar Aug 11, 2023, 3:09 PM

#

it wasn't a particularly well-informed introduction and it didn't instill deep understanding, but at least i'd seen it before

slim bone Aug 11, 2023, 3:09 PM

#

Admittedly I don't look at school as a practical tool for the job market
I don't mind learning "useless things". To me school is just a foundation to obtain the ability to learn whatever's necessary

desert oar Aug 11, 2023, 3:09 PM

#

btw there's some pushback now against "unbiased" estimation in statistics as well. the machine learning concept of bias-variance has a lot of overlap with the use of priors in bayesian statistics.
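
A small worked example of why "unbiased" is not automatically "best", as a sketch only. It compares the plain sample mean with a version shrunk toward 0, which acts loosely like a prior centred at 0. The numbers (`mu`, `sigma`, `n`, the shrink factor `c`) are made up for illustration and are not from the discussion.

```python
def mse_of_shrunk_mean(mu, sigma, n, c):
    """Exact MSE of the estimator c * sample_mean for i.i.d. data.

    The data have true mean `mu` and standard deviation `sigma`.
    bias     = (c - 1) * mu
    variance = c**2 * sigma**2 / n
    MSE      = bias**2 + variance
    """
    bias = (c - 1.0) * mu
    variance = c ** 2 * sigma ** 2 / n
    return bias ** 2 + variance


# Illustrative values only: a small, noisy sample with a true mean near 0.
mu, sigma, n = 0.5, 2.0, 5
unbiased = mse_of_shrunk_mean(mu, sigma, n, c=1.0)  # plain sample mean
shrunk = mse_of_shrunk_mean(mu, sigma, n, c=0.8)    # pulled 20 % toward 0

print(f"MSE of unbiased mean: {unbiased:.3f}")
print(f"MSE of shrunk mean:   {shrunk:.3f}")
```

Here the shrunk estimator is biased but has the lower mean squared error (about 0.52 against 0.80), because the variance it saves outweighs the squared bias it adds. That only holds when the true mean really is close to the shrink target; with a large `mu` the comparison flips.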

past meteor Aug 11, 2023, 3:10 PM

#

People learn CV as a tool and idt the reasons behind the tool (and how you can still abuse it) are covered adequately

desert oar Aug 11, 2023, 3:10 PM

#

slim bone Admittedly I don't look at school as a practical tool for the job market I don't...

this is probably the right way to think. i just don't know if the economy is in good enough shape to allow people to think this way 😬 but i'm pretty out of touch with the job market for juniors. i hear it's rough right now.

desert oar Aug 11, 2023, 3:10 PM

#

past meteor People learn CV as a tool and idt the reasons behind the tool (and how you can s...

i'm curious what abuses of CV you've seen in industry

#

i've been lucky to work with very few knuckleheads and mostly people who are very conscientious about their work

slim bone Aug 11, 2023, 3:11 PM

#

desert oar this is probably the right way to think. i just don't know if the economy is in ...

Thankfully It'll be many years until I finish my masters, so maybe things will change for the better by then 😅

past meteor Aug 11, 2023, 3:11 PM

#

Well, someone at work is working with a medical dataset. Not a lot of subjects. They're making a model. They use their entire dataset as validation

#

Because test train splitting with a small dataset isn't great either

desert oar Aug 11, 2023, 3:11 PM

#

ah. do they not know about bootstrapping and cv?

past meteor Aug 11, 2023, 3:11 PM

#

But they've iterated so much on their dataset that they're overfitting implicitly now

#

They're using cross validation

#

Cross validation does not save you here

#

After 1000 rounds of CV you're essentially making new features to raise the validation score

#

Each evaluation on your test / validation set increases the bias on your score
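
A toy simulation of the effect described above, as a rough sketch rather than anything taken from the dataset in question. The labels are pure coin flips, each "model" is just a set of random guesses, and the sample size and number of rounds are arbitrary assumptions. Picking the best of many tries by validation score still produces a score that looks good.

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def simulate(n_samples=50, n_rounds=1000, seed=0):
    """Select the best of `n_rounds` random 'models' by validation accuracy.

    Labels are coin flips, so no model can truly beat 50 % accuracy.
    Returns (validation accuracy of the selected model,
             accuracy of that same model on fresh data).
    """
    rng = random.Random(seed)
    val_labels = [rng.randint(0, 1) for _ in range(n_samples)]
    fresh_labels = [rng.randint(0, 1) for _ in range(n_samples)]

    best_val, best_fresh = -1.0, None
    for _ in range(n_rounds):
        # A "model" here is nothing but random guesses on both sets.
        val_preds = [rng.randint(0, 1) for _ in range(n_samples)]
        fresh_preds = [rng.randint(0, 1) for _ in range(n_samples)]
        val_acc = accuracy(val_preds, val_labels)
        if val_acc > best_val:
            # Keep whichever model looks best on the validation set.
            best_val = val_acc
            best_fresh = accuracy(fresh_preds, fresh_labels)
    return best_val, best_fresh


best_val, best_fresh = simulate()
print(f"selected model, validation accuracy: {best_val:.2f}")
print(f"same model, fresh-data accuracy:     {best_fresh:.2f}")
```

The selected "model" typically shows validation accuracy well above 0.5 even though nothing can beat chance here, while its accuracy on data that played no part in the selection stays near 0.5. The gap is the optimistic bias that builds up from reusing the same validation data for every decision.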

desert oar Aug 11, 2023, 3:14 PM

#

oh so they'll do CV, make a change, do CV again, etc?

past meteor Aug 11, 2023, 3:14 PM

#

yup

#

Imo you can do that, but not too much

desert oar Aug 11, 2023, 3:14 PM

#

yeah that's always a tough one. in theory you're not supposed to do it at all, but how else are you supposed to iterate?

#

i've definitely fudged it with problematic datasets where we did things like CV simultaneously for hyperparameter selection and performance eval 😆 but we 1) knew we were overestimating performance and undersold our results to the business, 2) knew we would be able to get new out-of-sample data soon that we could use to evaluate the model properly, and 3) had good business reasons to believe that our data was "representative enough" (part of it was synthetically constructed anyway)

past meteor Aug 11, 2023, 3:15 PM

#

You can't iterate without doing it, but doing it too much means you're overfitting so the answer is doing it "a little"

desert oar Aug 11, 2023, 3:15 PM

#

yep. that seems like something you could maybe study with an information theoretic approach (how much is too much) but i haven't seen any papers on it

past meteor Aug 11, 2023, 3:15 PM

#

There are but they're tedious haha

desert oar Aug 11, 2023, 3:16 PM

#

i'd be curious what the literature says on it

past meteor Aug 11, 2023, 3:16 PM

#

This problem has a name, iirc it's "adaptive overfitting"

past meteor Aug 11, 2023, 3:16 PM

#

desert oar i've definitely fudged it with problematic datasets where we did things like CV ...

I think business reasons make you exempt tbf

#

If I can sell it to myself that it's OK it's OK

desert oar Aug 11, 2023, 3:16 PM

#

hah yep

past meteor Aug 11, 2023, 3:17 PM

#

The problem is not knowing and having crazy inflated scores as a result

desert oar Aug 11, 2023, 3:17 PM

#

but that's why we need all this foundational knowledge: can you sell it to yourself in a way that's legit?

#

like you're saying, you have to know what's wrong with doing Bad Thing in order to ever coherently justify doing Bad Thing

#

also i didn't know the term "adaptive overfitting", i've heard about it before in cases like everyone training on the same reference dataset but not with a nice name

past meteor Aug 11, 2023, 3:18 PM

#

I don't think you even need to justify it? If you know it's bad and you can attach a "performance may be inflated" disclaimer you're fine

#

Why? Let's say you cut some corners and the performance is 3 % higher than the baseline. I'm picking the baseline

#

If it's 30 % and the corners that I cut aren't too severe, sure I'm still picking my approach

desert oar Aug 11, 2023, 3:19 PM

#

fair

#

ideally you can get a numerical estimate though

#

that's not always easy. simulation studies can be hard to design

past meteor Aug 11, 2023, 3:19 PM

#

And to do that we'd have to look at our cousins from statistics

desert oar Aug 11, 2023, 3:20 PM

#

this was a good read and analysis of the adaptive overfitting problem https://gregpark.io/blog/Kaggle-Psychopathy-Postmortem/

The dangers of overfitting: a Kaggle postmortem

How I dropped 50 spots in one minute by overfitting in a Kaggle contest

past meteor Aug 11, 2023, 3:20 PM

#

This is multiple testing

#

It's exactly the same problem as multiple testing. 100 %.
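
To put rough numbers on the multiple-testing analogy, here is a minimal sketch. It assumes the tests are independent, which repeated evaluations on the same data are not, so treat it as intuition rather than a correction you could apply directly. The function names are made up for this example.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive across m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha, m):
    """Per-test level that keeps the family-wise rate at or below alpha."""
    return alpha / m


for m in (1, 10, 100):
    rate = familywise_error_rate(0.05, m)
    print(f"{m:>3} tests at alpha=0.05 -> P(at least one false positive) = {rate:.3f}")
```

At a 5 % level, ten independent looks already give roughly a 40 % chance of at least one spurious "win", and a hundred looks make one almost certain. `bonferroni_alpha` shows the classic (and conservative) fix of dividing the level by the number of tests.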

desert oar Aug 11, 2023, 3:20 PM

#

fwiw i think multiple comparisons correction is controversial even in stats

#

you need a pretty well-formed decision criterion to do any kind of "testing" properly

past meteor Aug 11, 2023, 3:21 PM

#

The thing is, at least they know it's a problem

#

If I were serious about tackling it I know stats has been grappling with this for ages and I know what to read

granite atlas Aug 11, 2023, 3:25 PM

#

iron basalt It seems to be a redundant statement, ignore it.

Oh... thank you mate

desert oar Aug 11, 2023, 3:52 PM

#

past meteor If I were serious about tackling it I know stats has been grappling with this fo...

what reading do you have in mind? i've read a bunch of the older papers but i haven't seen any recent work on it

sleek harbor Aug 11, 2023, 4:10 PM

#

yesterday my plot was plotting.. I don't remember touching it.. today it's not plotting anymore.. :?

serene scaffold Aug 11, 2023, 4:23 PM

#

sleek harbor yesterday my plot was plotting.. I don't remember touching it.. today it's not p...

Can you show the code? This isn't nearly enough information to start diagnosing the problem

lapis sequoia Aug 11, 2023, 4:38 PM

#

can anyone provide me a good website focusing on ai ml dl data etc. stuff?

unique star Aug 11, 2023, 4:39 PM

#

lapis sequoia can anyone provide me a good website focusing on ai ml dl data etc. stuff?

+1 me too..

twilit tundra Aug 11, 2023, 4:41 PM

#

lapis sequoia can anyone provide me a good website focusing on ai ml dl data etc. stuff?

Websites can do a lot of stuff. Without further information, Kaggle? Kdnuggets maybe?

lapis sequoia Aug 11, 2023, 4:48 PM

#

twilit tundra Websites can do a lot of stuff. Without further information, Kaggle? Kdnuggets m...

sites that would teach me this stuff, would they do?

twilit tundra Aug 11, 2023, 4:49 PM

#

Kaggle has a pretty good introduction so definitely yeah

sleek harbor Aug 11, 2023, 5:20 PM

#

serene scaffold Can you show the code? This isn't nearly enough information to start diagnosing ...

fixed it 🖤 I guess I did touch the plot after all, and somehow forgot. I don't understand why, but adding transition_duration to my plotly figures layout somehow made it so that some traces weren't rendered.. 🤔 🤷‍♀️

sleek harbor Aug 11, 2023, 5:21 PM

#

twilit tundra Websites can do a lot of stuff. Without further information, Kaggle? Kdnuggets m...

Kaggle is a great site, but their tutorials..? not so much, in my opinion at least

twilit tundra Aug 11, 2023, 5:23 PM

#

sleek harbor Kaggle is a great site, but their tutorials..? not so much, in my opinion at lea...

It has its flaws but it's solid for beginners since it gives you a playground to test things out

#

I haven't really compared beginners tutorials though, feel free to add your own recommendations

sleek harbor Aug 11, 2023, 5:27 PM

#

twilit tundra It has its flaws but it's solid for beginners since it gives you a playground to...

it also has errors, or "too simple/short" examples with leakage and stuff like no normalization for models that need it, which can lead to confusion, especially for beginners. It's not bad, but it's definitely not great. I'd say udemy level (and I generally avoid udemy)

odd meteor Aug 11, 2023, 5:28 PM

#

lapis sequoia can anyone provide me a good website focusing on ai ml dl data etc. stuff?

https://Kaggle.com/learn
https://aiplanet.com/courses
https://e2eml.school/blog.html
https://uvadlc-notebooks.readthedocs.io/en/latest/
https://d2l.ai/chapter_preliminaries/ndarray.html
Check the pinned post for more resources

Learn Python, Data Viz, Pandas & More | Tutorials | Kaggle

Practical data skills you can apply immediately: that's what you'll learn in these no-cost courses. They're the fastest (and most fun) way to become a data scientist or improve your current skills.

AI Planet (formerly DPhi)

Free Data Science Courses from experts | AI Planet (formerly DPhi)

Learn Data Science for free through application oriented courses. Utilize our expert-curated resources as per your interest and pace.

Table of Contents

Brandon Rohrer post library

sleek harbor Aug 11, 2023, 5:29 PM

#

lapis sequoia can anyone provide me a good website focusing on ai ml dl data etc. stuff?

StatQuest for simple explanations on Youtube (stats and ML)

lapis sequoia Aug 11, 2023, 5:52 PM

#

can anyone help me with installing keras, I have installed python 3.12 and when I install tensorflow, it gives me these errors,

young granite Aug 11, 2023, 6:06 PM

#

lapis sequoia can anyone help me with installing keras, I have installed python 3.12 and when...

u using correct env?

odd meteor Aug 11, 2023, 6:08 PM

#

lapis sequoia can anyone help me with installing keras, I have installed python 3.12 and when...

You might wanna confirm that the python version you're using in your IDE is in the same environment where Tensorflow was installed.

lapis sequoia Aug 11, 2023, 6:43 PM

#

its alr im using pytorch now

civic elm Aug 11, 2023, 8:01 PM

#

Is it confirmed that Keras will support pytorch this year?

serene scaffold Aug 11, 2023, 8:11 PM

#

civic elm Is it confirmed that Keras will support pytorch this year?

that sounds odd to me considering that keras is a wrapper for tensorflow

lapis sequoia Aug 12, 2023, 2:42 AM

#

odd meteor 1. https://Kaggle.com/learn 2. https://aiplanet.com/courses 3. https://e2eml.sch...

ah, thanks! that was what i was looking for...

lapis sequoia Aug 12, 2023, 2:42 AM

#

sleek harbor StatQuest for simple explanations on Youtube (stats and ML)

ooo thanks to you too...

topaz tree Aug 12, 2023, 2:53 AM

#

odd meteor 1. https://Kaggle.com/learn 2. https://aiplanet.com/courses 3. https://e2eml.sch...

Thanks man

desert oar Aug 12, 2023, 5:59 AM

#

civic elm Is it confirmed that Keras will support pytorch this year?

yes https://keras.io/keras_core/

Keras documentation: Keras Core: Keras for TensorFlow, JAX, and PyT...

desert oar Aug 12, 2023, 6:01 AM

#

serene scaffold that sounds odd to me considering that keras is a wrapper for tensorflow

keras started life as a higher-level framework over tensorflow. tensorflow then kind of ate keras and made their "keras api", but keras itself continued to exist. now keras is branching out again to actually support other frameworks as backends.

civic elm Aug 12, 2023, 8:43 AM

#

Great because I only need to learn one framework, hopefully

odd meteor Aug 12, 2023, 9:05 AM

#

civic elm Great because I only need to learn one framework, hopefully

Hehehe this was also what I said before; that I'm gonna learn Tensorflow. You certainly need to start with your most favourite framework but it'll be nice to be framework agnostic. Started with Tensorflow, but then I recently moved into ML Research, and here I am still learning PyTorch. It's nice to know at least two in my opinion. Tensorflow / PyTorch / JAX

odd meteor Aug 12, 2023, 9:13 AM

#

desert oar yes https://keras.io/keras_core/

Just read this. It looks like what Ivy is also trying to do https://unify.ai/

Ivy - Accelerate Your AI With One Line of Code

Access all models, libraries, frameworks, infra and hardware. Integrate them directly into your projects in seconds, and run your models faster than ever.

fresh harbor Aug 12, 2023, 1:27 PM

#

Is there a ((detailed)) pretrained model for audio samples used in music? AudioSet isn't that detailed and its the highest benchmark currently available for audio classifiers

odd meteor Aug 12, 2023, 3:32 PM

#

HuggingFace usually has one or two gems for almost everything ML. Maybe try checking there.

narrow flare Aug 12, 2023, 5:22 PM

#

Can someone tell me if I need Cuda 11 for tensorflow to work with my GPU? I currently have CUDA 12 and tensorflow is not detecting my GPU

unique flame Aug 12, 2023, 5:22 PM

#

lapis sequoia can anyone provide me a good website focusing on ai ml dl data etc. stuff?

O’Reilly 🙂

lapis sequoia Aug 12, 2023, 5:22 PM

#

unique flame O’Reilly 🙂

?

unique flame Aug 12, 2023, 5:23 PM

#

You asked for sites that would teach AI

unique flame Aug 12, 2023, 5:25 PM

#

narrow flare Can someone tell me if I need Cuda 11 for tensorflow to work with my GPU? I curr...

I had the same problem. I then tried Pytorch and that worked. So I just use Pytorch now, and most research uses Pytorch, so yea..

narrow flare Aug 12, 2023, 5:28 PM

#

I wanna stick to tensorflow for now. Seems like this is really annoying for a lot of people lol

#

There's some docker thing that makes it easy apparently so im gonna look into that ig

steady basalt Aug 12, 2023, 5:58 PM

#

Landed my next Data scientist job! its been such a long journey I feel like sam on mount doom

orchid sky Aug 12, 2023, 5:59 PM

#

steady basalt Landed my next Data scientist job! its been such a long journey I feel like sam ...

Congrats

steady basalt Aug 12, 2023, 5:59 PM

#

🥳 ty

#

had a really weird interview question though about prior likelihood vs probability, I think they got the wording mixed up

#

gotta say the job markets so bad at the moment, was a real grind

dense crane Aug 12, 2023, 7:29 PM

#

i applied
```py
model = nn.DataParallel(model)
```

#

i am using kaggle gpu t4 x2

past meteor Aug 12, 2023, 8:32 PM

#

desert oar keras started life as a higher-level framework over tensorflow. tensorflow then ...

Didn't Keras start as a multi backend thing?

#

Also my hot take is that once you know one framework you can be productive by googling "How do I do X in Pytorch / Tensorflow"

unique ether Aug 12, 2023, 9:30 PM

#

Hello everyone!

#

I see this chat is a lot less populated than Python General which I have been frequenting recently

#

Quantity is no indicator of quality though!

#

Does anyone here have any insight into the best paid positions within the AI and ML field?

#

I've seen two commonly recurring job titles are "AI engineer" and "ML engineer" and the internet seems quite divided on who has the higher salary

serene scaffold Aug 12, 2023, 11:37 PM

#

unique ether I've seen two commonly recurring job titles are "AI engineer" and "ML engineer" ...

there is no consistency in what different AI/ML/DS job titles actually mean.

#

I've met "artificial intelligence engineers" who just flat out do not write code.

#

which means they are not "engineers" in the programming sense.

#

so even if it turns out that people who have the title "ML engineer" on average make more than people who have the title "AI engineer", that doesn't really tell us anything.

fallow frost Aug 13, 2023, 1:12 AM

#

whats the appropriate plot for displaying the min, max, and avg execution time of a function (the X-axis will display the amount of time VS the input size)

#

I was thinking of using three lines with the same color (but different shadow) and fill the area between the min & max with a color

#

and I want to display the benchmark for 4 functions, so 12 data points in total

iron basalt Aug 13, 2023, 1:15 AM

#

fallow frost whats the appropriate plot for displaying the min, max, and avg execution time o...

Often multiple bar plots.

fallow frost Aug 13, 2023, 1:16 AM

#

iron basalt Often multiple bar plots.

thats definitely easier than the classical multi-line chart, but I would rather the former

#

I was taught that I shouldnt use bar plots for continuous data

iron basalt Aug 13, 2023, 1:18 AM

#

fallow frost I was taught that I shouldnt use bar plots for continuous data

It's often discrete, you run with a couple of input sizes, and each input size has a group of bars.
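
One way to collect the numbers for such a chart, sketched with the standard library only. The `benchmark` helper, the choice of `sorted` as the function under test, and the input sizes are all placeholders for illustration, not anything from the original question.

```python
import statistics
import time


def benchmark(func, make_input, sizes, repeats=5):
    """Return {size: (min, mean, max)} of wall-clock seconds over `repeats` runs."""
    summary = {}
    for size in sizes:
        timings = []
        for _ in range(repeats):
            data = make_input(size)  # fresh input so runs don't affect each other
            start = time.perf_counter()
            func(data)
            timings.append(time.perf_counter() - start)
        summary[size] = (min(timings), statistics.mean(timings), max(timings))
    return summary


# Hypothetical workload: sorting a reversed list of each size.
results = benchmark(
    sorted,
    lambda n: list(range(n, 0, -1)),
    sizes=[1_000, 10_000, 100_000],
)
for size, (lo, avg, hi) in results.items():
    print(f"n={size:>7}: min={lo:.6f}s  mean={avg:.6f}s  max={hi:.6f}s")
```

Each `(min, mean, max)` triple maps onto one group of bars per input size, one point with an error bar, or one slice of a shaded min-to-max band around the mean line (for example with matplotlib's `fill_between`, if that is the plotting library in use). Running it once per function gives the four series to compare.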

desert oar Aug 13, 2023, 1:19 AM

#

odd meteor Just read this. It looks like what Ivy is also trying to do https://unify.ai/

i remember seeing ivy posted a while ago, i haven't kept up with it but it's nice to see it's still alive

wooden venture Aug 13, 2023, 1:19 AM

#

for a chatbot, say just for fun, normal conversation, which type of ml model should i be aiming for?

desert oar Aug 13, 2023, 1:20 AM

#

past meteor Didn't Keras start as a multi backend thing?

yes but afaik tensorflow was the only backend that is both still extant and was supported at the time. i think maybe it also supported theano but i might have also made that up

desert oar Aug 13, 2023, 1:21 AM

#

unique ether Does anyone here have any insight into the best paid positions within the AI and...

the best paying in AI/ML as with most tech-related fields are phd-level high-ranking individual contributor (as in, you're an actual known researcher being hired to solve hard problems) or senior management (you're finding/hiring/managing the people solving the hard problems)

iron basalt Aug 13, 2023, 1:21 AM

#

desert oar yes but afaik tensorflow was the only backend that is both still extant and was ...

Either way, Theano is basically dead no?

left tartan Aug 13, 2023, 1:22 AM

#

fallow frost I was thinking of using three lines with the same color (but different shadow) a...

we use candlestick charts for this type of stuff, when we want to show avg/hi/low plus stddev

desert oar Aug 13, 2023, 1:23 AM

#

iron basalt Either way, Theano is basically dead no?

yeah it was put gently to bed by its original developers, although i remember seeing something about the community continuing to fix bugs and keep the project alive, even if not advancing

#

or maybe that was pymc3 which was based on theano? idk

fallow frost Aug 13, 2023, 1:23 AM

#

left tartan we use candlestick charts for this type of stuff, when we want to show avg/hi/lo...

oh I completely forgot about that

iron basalt Aug 13, 2023, 1:23 AM

#

Even the about in the github page for it says "Theano was a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently."

#

Keyword, "was."

left tartan Aug 13, 2023, 1:24 AM

#

fallow frost oh I completely forgot about that

Or box and whiskers

fallow frost Aug 13, 2023, 1:24 AM

#

left tartan we use candlestick charts for this type of stuff, when we want to show avg/hi/lo...

but multiple candlesticks on the x axis, how would that work?

left tartan Aug 13, 2023, 1:24 AM

#

fallow frost but multiple candlesticks on the x axis, how would that work?

https://plotly.com/python/box-plots/

Box

Over 19 examples of Box Plots including changing color, size, log axes, and more in Python.

iron basalt Aug 13, 2023, 1:25 AM

#

Ah, it was continued as "Aesara," which was forked to "PyTensor."

desert oar Aug 13, 2023, 1:25 AM

#

@unique ether but more broadly, the best-paid positions are high-value positions in industries and at companies that have the capital to pay a lot of money for that high-value work. that would typically be ML/AI/data engineering supporting advanced research teams and/or critical production systems, or being an advanced researcher yourself. for positions that are realistically obtainable for normal people, "data engineering" and "data science" are still the two primary tracks. pay depends on seniority/expertise, region, and choice of industry