#data-science-and-ml


unique summit
#

I don't get it. What is turning into what?

#

So if I am understanding this right, basically shrinking a drawing/image to a 32x30 pixel board but you want to make sure it keeps the important details

vocal tartan
#

how should I start learning Python? I want to go forward with AI and Machine Learning with it.

rugged helm
#

and there will be no color (first image was just for visibility)

#

it can either be black or white (different in hardware, but basically binary values true or false)

unique summit
#

but it should still display it in a way that it is recognizable for the human eye?

#

try looking into 2d convolution

#

it may not help due to how small u want it

#

but it might work

#

google/medium idk just search it up and you'll figure it out in a month or two

unique summit
#

I think if you really dive into convolution people will point out the flaws and how recent studies improved it

keen trench
#

How big should an image classification dataset be if I'm using no pretrained models.

#

Like for a simple 2 class problem

vagrant root
#

you can always retrain if the accuracy is low

keen trench
vagrant root
odd meteor
odd meteor
final kiln
#

The single real number part is important, without it you can't do gradient descent because the algo relies on the "gradient" operation, which only acts on real valued functions. At each point the gradient tells you how fast the function is increasing and in which direction that increase is largest, so you use this information to find your way downhill.
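
A minimal sketch of that idea, assuming a toy two-parameter loss chosen only for illustration (the loss, learning rate, and step count are assumptions, not something from this conversation). The loss returns a single real number, so its gradient is defined, and each step moves against it:

```python
def loss(w):
    # Toy real-valued loss with its minimum at w = (3, -1).
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2

def grad(w):
    # Analytic gradient: points in the direction of steepest increase.
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]

def gradient_descent(w, lr=0.1, steps=200):
    for _ in range(steps):
        g = grad(w)
        # Step against the gradient to walk downhill.
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

w_final = gradient_descent([0.0, 0.0])
print(w_final, loss(w_final))
```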

final kiln
#

I don't think it's linear though, even if the error is linear it would still require the network to be linear, if "f" is the network,

loss(w1+w2) = error(f(w1 + w2))

you'd need to impose f(w1+w2)=f(w1)+f(w2)

I don't know if that's generally possible

#

In case of MLPs,

activation((W1 + W2)x + B)

activation is purposely not linear
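
A quick numeric check of that point, as a sketch with a one-neuron "network" and a tanh activation (the specific weights and input are made up for illustration):

```python
import math

def f(w, x=1.0, b=0.0):
    # One-neuron "network": a tanh activation applied to w * x + b.
    return math.tanh(w * x + b)

w1, w2 = 0.5, 1.5
lhs = f(w1 + w2)       # f(w1 + w2)
rhs = f(w1) + f(w2)    # f(w1) + f(w2)
print(lhs, rhs)

# If f were additive in the weights these two values would match.
additive = math.isclose(lhs, rhs)
```

With a nonlinear activation the two sides differ, so the loss cannot be linear in the weights either.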

long canopy
spring field
#

wonderful, thanks

hollow mortar
long canopy
#
  • zooming on grid when I want to search some specific area
final kiln
#

The exact method is possible, but counter intuitively, not what we want.

hollow mortar
long canopy
#

done after it looks like there's a place worth investigating on the total hyperparam space

hollow mortar
#

ah ok and that searching is automatic?

final kiln
long canopy
#

the random grid search is automatic, not the zooming

#

you zoom when you feel the force is strong with an area

#

i.e. it's not at all a rigorous thing
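
A rough sketch of that workflow, assuming a made-up `validation_score` function standing in for "train a model and score it" and hypothetical search ranges. The random search is automatic; the narrowed ranges in the second call are typed in by hand after looking at the first results:

```python
import math
import random

def validation_score(lr, dropout):
    # Hypothetical stand-in for training a model and scoring it.
    # It peaks near lr = 1e-3 and dropout = 0.3.
    return -((math.log10(lr) + 3.0) ** 2) - (dropout - 0.3) ** 2

def random_search(lr_exp_range, dropout_range, n_trials, rng):
    trials = []
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(*lr_exp_range)    # sample lr on a log scale
        dropout = rng.uniform(*dropout_range)
        trials.append((validation_score(lr, dropout), lr, dropout))
    return max(trials)  # best (score, lr, dropout)

rng = random.Random(0)
coarse = random_search((-6.0, 0.0), (0.0, 0.9), n_trials=30, rng=rng)
# The "zoom" is a manual judgment call: narrow the ranges around the
# region that looked promising and search again.
fine = random_search((-4.0, -2.0), (0.1, 0.5), n_trials=30, rng=rng)
best = max(coarse, fine)
print(best)
```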

final kiln
hollow mortar
#

a grid search sounds neat, havent touched: tuning weight num... etc

long canopy
hollow mortar
hollow mortar
final kiln
final kiln
hollow mortar
final kiln
#

Tho it does depend on the complexity of the model, but in most cases you'll be choosing a model that is more complex than what your data requires, and then you do regularization to prevent overfit

hollow mortar
#

if we talk of the error over the entire data set then it should generalise between that data.
even a local minimum over the entire data set could encode noise

long canopy
#

do people use pytorch and tensorflow in conjunction?

#

if they do, why?

final kiln
long canopy
#

identity function has 0 loss

hollow mortar
#

like what school exams require 😎

final kiln
final kiln
#

Oh wait I see

long canopy
#

it's perfect overfitting

final kiln
#

Yeah

long canopy
#

but useless

hollow mortar
#

perhaps some separate func could be designed where the weights producing the global minima of the func ensure generalisation, thats how id continue this exactness foray

long canopy
final kiln
#

You gotta be careful tho, because any information you give it about generalization will likely be a data leakage

hollow mortar
#

they are grad descent no? or similar

long canopy
#

yeah

final kiln
#

Stochastic grad descent with momenta in the mix

hollow mortar
#

ye so theyre a local solution

final kiln
#

Uhm not necessarily

long canopy
final kiln
#

Because a local minimum can be unstable due to the stochastic part

hollow mortar
hollow mortar
long canopy
#

the minimum of the validation loss is the point of maximal generalization

#

after that, you're overfitting

#

training loss keeps going down but validation loss goes up
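
That stopping rule can be sketched as follows, assuming made-up loss curves and a hypothetical `patience` setting (how many non-improving epochs to tolerate before stopping):

```python
def best_epoch(val_losses, patience=2):
    # Return the index of the lowest validation loss seen so far, stopping
    # once the loss has failed to improve for `patience` epochs in a row.
    best_idx, best_val, waited = 0, float("inf"), 0
    for epoch, val in enumerate(val_losses):
        if val < best_val:
            best_idx, best_val, waited = epoch, val, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_idx

# Made-up curves: training loss keeps falling, validation loss turns around.
train_losses = [1.00, 0.70, 0.50, 0.35, 0.25, 0.18, 0.12]
val_losses = [1.05, 0.80, 0.65, 0.60, 0.63, 0.70, 0.82]
stop_at = best_epoch(val_losses)
print(stop_at)
```

In a real training loop you would keep the weights saved at that epoch rather than the final ones.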

final kiln
#

Usually you even keep 3 dataset splits, train for training, test for testing and eval to help you choose hyper parameters

The reason why you have eval is that choosing hyper parameters constitutes information flow from the eval split into the training process, so it's a subtle form of data leakage
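
A minimal sketch of such a three-way split, assuming a 70/15/15 ratio and a fixed seed (both arbitrary choices for illustration). The test split is only touched once, at the very end:

```python
import random

def three_way_split(items, eval_frac=0.15, test_frac=0.15, seed=0):
    # Shuffle once so the three splits are chosen at random, then slice.
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_frac)
    n_eval = int(len(items) * eval_frac)
    test = items[:n_test]
    eval_ = items[n_test:n_test + n_eval]
    train = items[n_test + n_eval:]
    return train, eval_, test

train, eval_, test = three_way_split(range(100))
print(len(train), len(eval_), len(test))
```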

hollow mortar
#

ok thanks!

#

ok but is that a problem, im assuming the eval dataset would be constructed to be conducive to successful training

final kiln
#

Ideally they are chosen at random

long canopy
#

check out the statistical learning stanford series, they give indepth explanation on exactly this

final kiln
#

If you used only one dataset, there would be no way for you to know how it would perform IRL

#

If the eval dataset is coupled to the training process, you can't use it for the same reason

#

Like, you chose it so it works well for that dataset. And to test it you gotta test it against something it hasn't seen yet

final kiln
#

I find most things rather intuitive

long canopy
long canopy
#

allows you to search specific things about the ML workflow, etc.

final kiln
#

Ah instead of hitting my head against the wall til I find the best way

long canopy
#

details everything we've been talking about today

final kiln
#

Awesome, will check it out

long canopy
#

100% worth the time

#

to take a break from practice to get that course done, it's the bare minimum

final kiln
#

I hated stats in school 😭

long canopy
#

you will hate not knowing that stuff worse lol

final kiln
#

ah I know my stats, ig I just learned it in a more applied setting like statistical mechanics or quantum mechanics

#

Tho it's always different ig, more engaging

#

You also get it in thermodynamics I think

hollow mortar
long canopy
#

@odd meteor the lightning profiler is amazing, thanks a lot for the recommendation extremely useful stuff

final kiln
#

yo

#

Gemini is free

#

The API

#

I'm gonna try it out, if it's even close to gpt 4 I'm canceling my open ai subscription

long canopy
#

NICE

final kiln
#

I'm tryna see if it's good, gonna hook it up to my discord bot in a minute

long canopy
#

from personal experience

#

did not know the API was free

final kiln
#

Not available in Europe, I'm dead

long canopy
final kiln
#

Aaaah I'm either gonna be paying a VPN or open ai

#

Ig VPN is worth it regardless

#

But for now I'm gonna try to spin up an AWS machine

#

Which like, makes me wonder y restrict the regions in the first place, that's never gonna be a thing on the internet

long canopy
#

probably works for a significant part of non-computer-savvy people

#

me atm: bouta start making some synthetic datasets with the free gemini API

final kiln
long canopy
#

hm yeah

#

gemini API refused to respond to the query "Write a story about a magic backpack." because there was a high probability of sexually explicit content

#

wtf

#

it's probably what they're using to do load balancing

midnight harbor
#

guys i wanna know about vector databases, i mean which ones give good and fast results. i have heard about pgvector and its indexing

i know there are other vector dbs too
but wanna know why some people still side with pgvector, what does it do better

i have also read that its recall is good but not its accuracy

kindly tag me if anyone gives me the answer

agile cobalt
agile cobalt
midnight harbor
#

but idk why, on the top vector database blogs, pg_vector still remains underrated and i wonder why

agile cobalt
odd meteor
midnight harbor
#

im just here to get advice from anyone who has already used pg_vector and is happy with it

agile cobalt
midnight harbor
agile cobalt
midnight harbor
#

the thing i like about postgres: first, as a developer i already have a little bit of experience with it, 2nd i dont have to use this kind of multi database arch
and can use prisma ORM easily with it, but just wanna make sure before choosing it

midnight harbor
final kiln
agile cobalt
final kiln
#

It's still Gemini pro 1.0, but it's paid

midnight harbor
#

Free tier Gemini gets 500 internal errors sometimes (i checked this today)

final kiln
#

That's wild actually

agile cobalt
final kiln
#

Well it's different insofar that I have to pay it right

#

I'm gonna try to confirm the data policy

agile cobalt
#

To help with quality and improve our products, human reviewers may read, annotate, and process your API input and output

final kiln
#

Right that's the Gemini API, I'm finding it hard to find the policy for the Gemini that vertex AI offers

#

Barely matters though, won't trust it regardless

long canopy
#

synthetic datasets

#

that's all that matters with this API

final kiln
long canopy
#

of course, it was just a hypothetical thought of things that could only happen in my wildest dreams

#

at no point am I actually going to use it for creating synthetic datasets

final kiln
#

I got my discord bot connected to Gemini pro

#

It's uncool that they make Europeans pay, but ig it makes sense since we regulate them

slow panther
#

Hello

#

Does anybody know a good open source AI photo and video generator? Preferably with no limits etc

bronze prism
#
data.csv (journeyTime is in ms, length is in mm):
roadNo  time                 journeyTime  length  speed
1544    2024-01-01 00:00:00  1832         34439   18,7986
1582    2024-01-01 00:00:00  1524         28660   18,8058
1585    2024-01-01 00:00:00  1789         33634   18,8004
2063    2024-01-01 00:00:00  1987         38666   19,4595
2064    2024-01-01 00:00:00  1987         38666   19,4595
1544    2024-01-01 00:05:00  1830         34439   18,8191
1582    2024-01-01 00:05:00  1522         28660   18,8305
1585    2024-01-01 00:05:00  1788         33634   18,811
2063    2024-01-01 00:05:00  1984         38666   19,4889
2064    2024-01-01 00:05:00  1987         38666   19,4595
1544    2024-01-01 00:10:00  1833         34439   18,7883
1582    2024-01-01 00:10:00  1523         28660   18,8181
1585    2024-01-01 00:10:00  1789         33634   18,8004
2063    2024-01-01 00:10:00  1987         38666   19,4595
2064    2024-01-01 00:10:00  1989         38666   19,4399

There are 300 different roads in total and I have 288 five-minute timestamps. I want to convert this data to per-minute and per-meter resolution, for example the speed at the 400th meter at 01:17. I tried using interpolation and regression to fill in the missing data but couldn't get it to work.
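
One possible way to get per-minute values, as a sketch that assumes plain linear interpolation in time between consecutive 5-minute readings of one road is acceptable. It uses the three road 1544 speeds from the sample (decimal commas converted to points); `speed_at` is a hypothetical helper name. The sample has one speed per road per timestamp, so a position inside a road (the "400th meter") would need an extra assumption that is not covered here:

```python
from datetime import datetime, timedelta

# Speeds for road 1544 from the sample, decimal commas converted to points.
readings = [
    (datetime(2024, 1, 1, 0, 0), 18.7986),
    (datetime(2024, 1, 1, 0, 5), 18.8191),
    (datetime(2024, 1, 1, 0, 10), 18.7883),
]

def speed_at(when, readings):
    # Linearly interpolate between the two readings that bracket `when`.
    for (t0, v0), (t1, v1) in zip(readings, readings[1:]):
        if t0 <= when <= t1:
            frac = (when - t0) / (t1 - t0)  # timedelta / timedelta -> float
            return v0 + frac * (v1 - v0)
    raise ValueError("timestamp outside the measured range")

# One value per minute between the first and last reading.
start, end = readings[0][0], readings[-1][0]
n_minutes = int((end - start) / timedelta(minutes=1))
per_minute = [
    (start + timedelta(minutes=i), speed_at(start + timedelta(minutes=i), readings))
    for i in range(n_minutes + 1)
]
for when, speed in per_minute:
    print(when, round(speed, 4))
```

The same loop could be repeated per `roadNo` to cover all roads.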

earnest shell
#

'''
Python
# Print the type of engaging conversation
print(type(prediction["engaging_conversation"]))
print(type(dialog))

# Print the engaging conversation
print(prediction["engaging_conversation"].dialog)

<class 'str'>
<class 'list'>

AttributeError Traceback (most recent call last)
<ipython-input-40-85ca52f38205> in <cell line: 70>()
68 # Print the engaging conversation
69 #print(prediction["engaging_conversation"].dialog)
---> 70 for user1_message, engaging_response in prediction["engaging_conversation"].dialog:
71 print(f"User 1: {user1_message}")
72 print(f"User 2: {engaging_response}")

AttributeError: 'str' object has no attribute 'dialog'
'''

serene scaffold
earnest shell
#

list

dusty valve
#

list

serene scaffold
#

keep in mind that I'm asking about prediction["engaging_conversation"]. not whatever prediction["engaging_conversation"].dialog is intended to be.

earnest shell
#

ok

#

Make the conversation more engaging by improving the responses of User 2. Keep each turn short and crisp.

Dialog:
[ "Diana , do you like the perfume I gave you ? ", " It ’ s good . But to tell you the truth , I don ’ t wear perfume . ", " I ’ m sorry . I didn ’ t know that . ", " That ’ s all right . Thank you all the same . " ]

Response Format:
Same format as the dialog - a comma separated array of strings with chats from User 2 improved in-place.
i wanted to do it like this
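
The traceback above says `prediction["engaging_conversation"]` is a plain `str`, so it has no `.dialog` attribute. A minimal sketch of one way forward, assuming the model really does return a JSON array of strings as the prompt asks (the `raw` value below is made up for illustration): parse the string into a list first, then pair up the turns.

```python
import json

# Hypothetical model output: a JSON array of strings, as the prompt requests.
raw = '["Do you like the perfume?", "I love it, thank you!", "Glad to hear it.", "You are too kind."]'

turns = json.loads(raw)  # str -> list of str
if not isinstance(turns, list):
    raise TypeError("expected a JSON array of strings")

# Even positions are User 1, odd positions are User 2.
pairs = list(zip(turns[0::2], turns[1::2]))
for user1_message, engaging_response in pairs:
    print(f"User 1: {user1_message}")
    print(f"User 2: {engaging_response}")
```

If the model wraps the array in extra text, `json.loads` will raise, which is at least a clearer error than the `AttributeError`.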

crystal fjord
#

can someone help me with a project of mine: I want a yolov5 model to capture the live feed of my web app camera and put the labels on objects in the web app itself

p.s. I've already made the yolov5 model, I only need a way to connect it to the web app

meager ridge
#

hey how do I keep my GPU from running out of memory?

stupid vague question I know -- I'm extracting data from a pdf one page at a time and it crashes after like 4 pages

I can just save the data and [insert something like clear GPU memory here], but I don't know what that step is

(working in jupyter on a google compute VM running debian 12)
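
The message does not say which framework is in use, so the following is only a sketch that assumes PyTorch. `extract_page` is a hypothetical function that runs the model on one page (ideally under `torch.no_grad()`) and returns plain Python data rather than GPU tensors. The "clear GPU memory" step is then: drop references, run the garbage collector, and call `torch.cuda.empty_cache()`, which only releases cached blocks that no live tensor still uses.

```python
import gc

try:
    import torch  # assumption: the extraction model is a PyTorch model
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None

def release_gpu_memory():
    # Collect unreachable Python objects first so their tensors are freed,
    # then hand PyTorch's cached, unused GPU blocks back to the driver.
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()

def process_pages(pages, extract_page):
    # `extract_page` is hypothetical: it should return CPU-side data only,
    # so nothing in `results` keeps GPU memory alive between pages.
    results = []
    for page in pages:
        results.append(extract_page(page))
        release_gpu_memory()
    return results
```

In Jupyter, also watch for variables from earlier cells (including the `_` output history) that still hold GPU tensors, since those keep the memory allocated.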

long canopy
#

what's the best option for file compression of checkpoint files?

#

these things get big really quickly lol

final kiln
# final kiln I'm gonna try it out, if it's even close to gpt 4 I'm canceling my open ai subsc...

Unfortunately even Gemini 1.0 ultra is not even close to gpt 4 https://news.ycombinator.com/item?id=39395020

needlesslygrim

Personally, I've given up on Gemini, as it seems to have been censored to the point of uselessness. I asked it yesterday [0] about C++ 20 Concepts, and it refused to give actual code because I'm under 18 (I'm 17, and AFAIK that's what the age on my Google account is set to). I just checked again, and it gave a similar answer [1]. When I tried Ch...

#

According to Gemini, c++ is too unsafe for underage kids

#

So funny

long canopy
#

yeah i think it's doing load balancing with this

#

so many extremely dumb examples of these safety precautions from gemini, but sometimes it performs REALLY well

#

so it has to be load balancing

final kiln
#

I sometimes think chat gpt is doing something similar, because at times it just acts dumber than it is, missing context cues a la gpt 3.5

#

But lying to your user about which model is being used is a recipe for disaster

#

So I think the variations come from the tweaks they likely make from time to time

#

And in the case of Gemini, well they probably just went overboard with their "safety" fine tuning

#

Like

#

A lot of bad stuff is possible with language models, so might as well have these dumb things happen than to say, have the model make someone's mental health crisis worse.

rugged comet
#

If you wanted to view the relationship between crime rate and temperature, you'd want to factor out the overall trend of the crime rate. Decomposition of a time series results in a trend, seasonal, and residual component.
Would you find the correlation between temperature and crime rate seasonal, or temperature and crime rate residual?

long canopy
#

@odd meteor i love lightning you are a blessing unto this world

rugged comet
#

Here is an example of decomposed crime rate.

odd meteor
rugged comet
#

Comparing temp resid to crime resid gives a correlation of 0.36. Comparing temp seasonal to crime seasonal gives a correlation of 0.93. I'm very suspicious of such a high value for comparing the seasonal components. So much so that I'm wondering if I'm illogically comparing them when I'm trying to find something else.
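
A small synthetic check of why the seasonal comparison can come out so high, as a sketch with made-up data (not the actual crime or temperature series) and a crude hand-rolled decomposition rather than any particular library's. Both series are given the same annual cycle but independent noise:

```python
import numpy as np

rng = np.random.default_rng(0)
months = np.arange(120)                      # 10 years of monthly data
cycle = np.cos(2 * np.pi * months / 12)      # shared annual cycle

# Synthetic series: both follow the annual cycle, the noise is independent.
temp = 15 + 10 * cycle + rng.normal(0, 2, months.size)
crime = 100 + 0.5 * months + 8 * cycle + rng.normal(0, 3, months.size)

def decompose(series):
    # Crude additive decomposition: linear trend, monthly means, remainder.
    trend = np.polyval(np.polyfit(months, series, 1), months)
    detrended = series - trend
    seasonal_profile = detrended.reshape(-1, 12).mean(axis=0)
    seasonal = np.tile(seasonal_profile, months.size // 12)
    return trend, seasonal, detrended - seasonal

_, temp_seasonal, temp_resid = decompose(temp)
_, crime_seasonal, crime_resid = decompose(crime)

seasonal_corr = np.corrcoef(temp_seasonal, crime_seasonal)[0, 1]
resid_corr = np.corrcoef(temp_resid, crime_resid)[0, 1]
print(seasonal_corr, resid_corr)
```

Any two series with a yearly rhythm will have strongly correlated seasonal components, so that number mostly says "both repeat every 12 months". The residual correlation is the one that speaks to whether unusual temperatures go with unusual crime.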

strong wasp
#

It's hard, but you can make an automated chat, just not an AI one

supple inlet
#

Not sure if this is the best place to ask but are there any upcoming data science online hackathons? Would love to collab with others in the space to build a project. Or if not a hackathon, where would be the best place to find others to build a project with?

final kiln
#

If it is there, it actually doesnt look very pronounced, it's easy to see this in Europe, Portugal vs Sweden for example

#

Or Australia vs Russia. The stronger correlation seems to be poverty levels and perhaps even historical reasons.

Maybe the most reasonable hypothesis is that temperature levels exacerbate the crime that is already present, or people don't go outside as much so there's less opportunity for crime to occur

long canopy
#

what's up with jax

desert oar
desert oar
desert oar
# long canopy what's up with jax

differentiable array programming framework. like what pytorch does, but without all the neural network helper stuff on top of it. also a lot like what the old theano framework did.

long canopy
desert oar
#

i don't think it's significantly faster than torch. i think the idea is that it's easier to use for advanced custom things, so it's popular among researchers and ML engineers. i've certainly never needed to use it.

#

i guess there are some higher-level frameworks built on jax now as well

#

jax itself i think also glues together existing C++ libraries (XLA) instead of building it all from scratch, so that's nice too

long canopy
#

looks like the main selling point is good TPU support

tepid tartan
#

Question, do I need to do discrete math for data analytics

long canopy
#

both are usually covered in discrete math

rugged comet
tepid tartan
burnt coral
#

hi, does anyone have any tips to help with overfitting for an lstm binary classifier? we've implemented several drop out layers but the best val accuracy we can get to is about 85%

mint shard
agile owl
#

can someone help me understand why the dimensionality changes and the second assert raises here

    def _get_reward(self) -> np.ndarray:
        """
        Calculate the reward for the trader.

        Returns:
            float: The calculated reward.
        """
        ret = self._get_return()
        rf_ret = self.rate / 252
        ret_vol = self.return_volatility or 0.05
        reward = yeojohnson((ret - rf_ret) / ret_vol, self.risk_aversion)
        assert isinstance(reward, np.ndarray)
        normalized_reward = (1 / (1 + np.exp(-reward)))
        assert isinstance(normalized_reward, np.ndarray), f"reward: {reward}, normalized_reward: {normalized_reward}"
        return normalized_reward
untold bloom
#
In [16]: k = np.array(2)

In [17]: type(k)
Out[17]: numpy.ndarray

In [18]: type(np.exp(k))
Out[18]: numpy.float64
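
Applied to the function above, and assuming `reward` is a 0-d array in the failing case (which matches the session shown here), the sigmoid line turns it into a NumPy scalar. A minimal sketch of the effect and of one possible fix, re-wrapping with `np.asarray`:

```python
import numpy as np

reward = np.array(2.0)                 # 0-d array, like the one in _get_reward
squashed = 1 / (1 + np.exp(-reward))   # ufuncs on 0-d arrays return scalars
was_array = isinstance(squashed, np.ndarray)
print(type(squashed), was_array)

# Re-wrapping the result restores an ndarray (still 0-d).
normalized = np.asarray(squashed)
print(type(normalized), normalized.ndim)
```

Alternatively the assert could accept `np.floating` as well, if a scalar reward is fine for the caller.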
final kiln
odd meteor
# burnt coral hi, does anyone have any tips to help with overfitting for an lstm binary classi...

When it comes to combating overfitting, collecting more data yields the best bang for the buck.

If collecting more data in your case is feasible and not so expensive, then start from there.

If it's not possible to collect new data, just try other techniques like batch normalization, a learning rate scheduler, changing the optimization algorithm or your activation function, or a combination of all of these, etc

Or better still, check online to see what configuration worked best for tasks similar to the one you're trying to solve.

final kiln
final kiln
left tartan
agile owl
#

why yes of course

#

why WOULDN'T you do this

lapis sequoia
#

I certainly want to know

#

is there any field or anything like that where Data science , ai and ml work together ?

left tartan
lapis sequoia
# left tartan ? All those terms are tightly linked.

yeah but there are certain jobs, like someone who works w data science will work in a data related role, someone who learned or invested in ml will get the role of ml eng, and the same goes for ai. like the roles are separated, is there any role that combines these 3 together?

left tartan
lapis sequoia
mint shard
#

can someone guide me to start machine learning python? idk where to start.

final kiln
final kiln
#

I learned that one on the last Veritasium video, really good

agile owl
#

people who think the market is efficient are ignorant

#

I was about to say something much meaner

#

for over a decade the GSCI commodity index exhibited predictable rebalancing behavior every month end

#

many such cases

final kiln
#

I figure it's approximately efficient given how much it looks like a random walk

agile owl
final kiln
agile owl
#

this is just one very obvious example

#

it is being exploited every day

#

some people win trying to exploit it some people lose

agile owl
#

the people who pay are the ones who are price insensitive

#

usually

#

so hedgers pay a risk premium to speculators

#

because they are willing to take a worse price to offload risk

#

and the speculators are EV maximizing

final kiln
#

Doesn't look very predictable to me tho

agile owl
#

it is very predictable that is an anomaly

#

if you were sitting with cash ready to invest and you saw that happening

#

you would be an idiot not to buy into it

#

because it was so anomalous

final kiln
#

In April 2015, Navinder Singh Sarao, an autistic[64][65] London-based point-and-click trader,[66] was arrested for his alleged role in the flash crash.

agile owl
#

another example, not to get political, but the equities market crashed 6% overnight when Trump was beating Clinton and it rallied back to unch by morning

final kiln
#

He got a sentence too

agile owl
#

stuff like that happens not that infrequently

#

I just point those out because those are the most irrefutable examples of inefficiency

final kiln
#

It was possible to map the ups and downs to what was happening

agile owl
#

well the market is very momentum based there's a lot of insurance companies that need to hedge liabilities

#

and their hedging behavior looks very stupid to everyone else

#

because they will just keep selling into selloffs

#

that's the type of stuff that's exploitable

final kiln
#

I think that belief that it's not efficient is what keeps it as efficient as possible, cuz you guys are always trying to exploit it so any deviation quickly gets corrected.

agile owl
#

the core concept and the reason why EMH can't be true is that it pretends there is one class of actors

#

or that actors all share the same risk preference

#

or that risk preferences are static

#

none of those things can be true

final kiln
#

Wait, but isnt the hypothesis saying that an efficient market generates a random walk ?

#

And not that the market itself is efficient

#

Ig not, that would be this right

#

Wait no, I'm still digging

agile owl
#

the EMH is actually distinct from a martingale

final kiln
#

Omg this is all super confusing stuff

agile owl
#

the risk neutral probability measure is a chimera

#

it doesn't exist in reality

#

people have their own risk preferences in reality

final kiln
#

Yeah it sounds very complex to model using equations alone, maybe with simulations and such

#

Never did enjoy this application of math, idk why so many mathematicians like it

agile owl
#

don't forget that actors are sampled from a finite human population

#

not these demigod "rational actors"

#

there's a lot of stupid people with a lot of money

#

and a lot of smart people with no money

#

anyway the first thing I learned at the hedge fund I worked was CAPM was wrong and they made a pretty strong case I haven't really troubled myself with purely academic perspectives on the issue since

#

this is a hot take but I consider Crypto to be a refutation of the EMH as well

#

I think the theory lacks method too

#

so it's kind of... a dead letter in any case

left tartan
agile owl
#
Peter Lynch, a mutual fund manager at Fidelity Investments who consistently more than doubled market averages while managing the Magellan Fund, has argued that the EMH is contradictory to the random walk hypothesis, though both concepts are widely taught in business schools without seeming awareness of a contradiction. If asset prices are rational and based on all available data as the efficient market hypothesis proposes, then fluctuations in asset price are not random. But if the random walk hypothesis is valid, then asset prices are not rational.
#

there's so many ways to demolish the EMH I think it's intellectual malpractice to teach it

left tartan
agile owl
#

here's the thing both sides of this argument are talking their book

#

but who actually needs to prove it

#

the asset manager, ironically

#

if markets are efficient then asset managers wouldn't exist

#

the whole thing is simply ridiculous

#

academics need this to be true so their math has merit

#

asset managers need this to not be true so they can continue to sell a service

#

if you look at how the rest of society responds I'd say the asset managers have the thumping majority of support

#

which again on its face invalidates the EMH

#

it's kind of scary to think of what academics get away with when there isn't an industry to disprove them

#

and that is why there is a place for AI in markets

#

people are really really dumb

#

it's quite the opposite of EMH

#

which supposes people are really really smart

#

the returns and volatilities of assets are not on a capital market line either

#

the capital market line is a big joke

#

so yeah the market adjusts to events quickly, but does the "risk neutral probability" have any relation to the actual probability?

#

the starting price I mean

final kiln
# agile owl people are really really dumb

Playing the devil's advocate for a bit, I think the argument is that people who don't know what they are doing get discouraged from participating due to their losses, or maybe even get removed by losing everything.

So there's this enormous selection bias, the people participating in the market are super rational, and if they are not, they get eaten alive by those who are.

past meteor
#

I feel like I reinvented the wheel for a common use case, basically bridging Torch and sklearn (having a .fit and .predict method on neural nets).

What do you guys do for this?

agile owl
#

that's not true because again, not every class of actor is driven by value maximization in their market activities

#

in fact, their market activities might just be facilitating their main business

final kiln
#

Idk what that means

agile owl
#

there is a difference between hedgers, investors and speculators

final kiln
#

You have to use layman terms

agile owl
#

hedgers would be like, an oil company needs to trade oil so they know what their profit will be ahead of time

#

an insurance company needs to buy bonds because it sold annuities

#

speculators would be people who are trying to maximize value by taking risk

#

the market is a risk exchange mechanism

#

hedgers pay a risk premium to speculators on average

#

some speculators win, some lose

#

the ones who win take value from the losing speculators and the hedgers

#

but in theory everyone could just be taking value from the hedgers

past meteor
agile owl
#

the reason it's different is the hedgers are relatively price insensitive, they just want to get rid of their risk

#

anyway pretty OT

past meteor
#

But that beats having to maintain 2 codebases: one for neural stuff and one for non-neural things

final kiln
#

I'm not sure how that addresses my argument tho

agile owl
#

there is no "selection bias" that makes everyone one type of actor

#

it is perfectly sustainable going on with hedgers and speculators acting in different ways forever

final kiln
#

I certainly feel discouraged, and I keep seeing people I know perfectly well don't understand this stuff but keep losing their money cuz they still bet on it

agile owl
#

the problem is that humans have a hard time being rational with certain things

#

the more important things are, the harder time people have being rational about them

#

that's why it's ripe for exploitation by algorithms

final kiln
#

Even so, they can keep trying, but they'll just keep losing their money

#

Thus their impact on the market is minimal, cuz that approach only scales to 0

#

The people who understand the market they're investing in are able to predict more or less where it's going.

#

That's what I'm saying

agile owl
#

people are rarely right more than 60% of the time when they are consistently taking views on markets

#

it's about risk management and consistency

#

repeated experiments to reduce the variance

#

if you are right 60% of the time trading 10y treasury yields, let's say on a daily timeframe, then you are a god

#

but in aggregate there has to be value transfer from those who are price insensitive to those who are price sensitive

#

that's why markets work, that's what they are for

potent sky
#

Any other reason?

past meteor
#

Like test train splitting that make sense based on the domain, metrics that make sense based on the domain etc

final kiln
past meteor
#

So I want a common interface so I only need to implement these once

potent sky
#

Fair enough

#

Why are you having to move more data from cpu to GPU?

left tartan
#

Forget which, I guess pmdarima is close but not exactly sklearn-y

desert oar
final kiln
#

I've fully automated all my infra into the prefect UI, feels good, feels very good

#

I click a button and that increases the aws scaling group capacity to 1, which automatically boots me up the most cost effective GPU spot

desert oar
#

@rugged comet if this isn't a time series model, you have discovered the known phenomenon of higher crime in summer. if you replace "temperature" with "deviation from historical average temperature" you should see the relationship mostly disappear, although historical average temperatures are rising so you will still inadvertently capture historical crime trend (because both temp trend and crime trend are correlated with time)

desert oar
final kiln
#

I've also setup my laptop as a self hosted runner for GitHub actions, now I don't need the runner to rebuild any env each time (torch images tend to be large)

#

And I also created a workflow that deploys a dev env remotely using vscode tunnels

desert oar
final kiln
#

It's not fully done yet, but I'll be able to develop on any AWS machine

final kiln
#

The laptop is always on anyway, might as well ig

past meteor
final kiln
#

Tho the irony of deploying a remote dev env through GitHub actions on my laptop and then using that to code - is not lost on me

#

But ofc the point is that I'll be able to develop using spot instances

#

Which is more cost effective than buying a GPU

#

I don't need one most of the time since I have the pipelines

#

So I just need a gpu to make final touches, making sure all fits together when I change device to CUDA

final kiln
#

And it also gets me an isolated environment like with dev containers, and cleans after itself so it saves me some space locally

#

And I also can code anywhere via the browser and a GitHub session

past meteor
#

Right now I have a singular ModelWrapper that takes in a specific interface and wraps it as such that it can accept pandas DF and outputs Numpy arrays

#

If it proves to be too slow I'll just move all the data to GPU once but I doubt it'll work properly

final kiln
desert oar
final kiln
desert oar
#

is it? you have to set up all this github actions runner stuff instead of just kicking off a background process

final kiln
#

GitHub Actions encapsulates everything already and has pre-prepared workflow steps in the actions marketplace

#

I didn't even have to code the rust installation for example, I just used an actions thing

#

Setting up the self hosted actions was actually trivial, so that's what's making it better ig

#

They give you a short list of commands, and that's it, no extra setup

desert oar
#

there is probably a self-hostable offline runner you could use, but it makes sense that GHA is just right there

#

i just don't really like GHA itself, it's very clunky and the docs are not great

#

we've been using Teamcity at work which is far better than i expected any enterprisey Jetbrains product to be

final kiln
#

Yeah it could be better for sure, being able to run them locally for example would be good, like locally locally not this proxying type stuff I'm doing

#

I also like that the logs stay there, with the makefile it would be something that I run and then lose, so I'd have to do more setup to get logging going

#

It also integrates with the GitHub image registry, that I'm using so I can reuse environments, there's my laptop, the AWS machines and the prefect runners.

#

Prefect is where I'm gonna keep all the automations to control the EC2 machines from now on, and also ofc the training loops with hyper parameter search strategy. Whereas GitHub actions is mostly there to build the images and deploy the code to prefect

desert oar
#

yeah great points on all counts

#

you're doing all your ML on your laptop? isn't that stupidly slow for CNNs?

#

i tried fitting a siamese network a while ago on the LFW dataset and it was processing like 2 images per second on the M1 Mac w/ Torch

#

meanwhile my 1060 (which is literally dying and randomly crashes the PC) was doing like 10 images per second 😆

final kiln
desert oar
#

ohh you're using the EC2 spot for GPU

#

that makes more sense. are you using a framework for that? or just saving checkpoints and monitoring + restarting if they get interrupted?

#

i think i asked you this before but forgot the answer

final kiln
#

I've changed my approach several times, am trying to zone in on the best way to do it

#

Everything is now automated on prefect.

There's this thing called work pool, which you can configure to contain all the info needed to run a task and in my case is tied to a deployment cluster on ECS

When I turn it on, it triggers a workflow on prefect that increases the cluster capacity to 1, it will then find me the most cost effective GPU spot instance available from the list I gave it.

At that point, I can trigger pipelines on that machine. I can literally input the hyper parameters and it does the thing.

Then I can turn the pool off and that decreases capacity to 0.

The turning on and off part is important because setting up the env each time is a nightmare.

So the answer is that it's managed by prefect and ECS. But I'm missing a bit of code for fault tolerance. Like the cluster automatically brings the number of instances up when it loses one.

desert prism
#

Recently working on comparative sentiment analysis of harry potter seven books and Robert Langdon five books

desert oar
desert oar
#

how does that work with the interruptibility of spot instances though?

final kiln
final kiln
#

Actually, I'm quite sure I managed to run one for three days

desert prism
desert oar
final kiln
#

It's all tied together with MLFlow

desert oar
#

ahh i see

#

and you're using MLFlow, also makes sense

final kiln
#

That's likely what's gonna enable fault tolerance in the end. Get the 2min warning, save state to MLFlow, restore everything in the next run
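A minimal sketch of that checkpoint-and-resume idea, assuming a plain JSON file stands in for the MLFlow store and a hypothetical `interrupted` callback stands in for whatever detects the two-minute spot warning. None of these names come from the project discussed here.

```python
import json
import os


def train(total_steps, checkpoint_path, interrupted=lambda step: False):
    """Run a toy training loop that can resume from a saved checkpoint.

    `interrupted` is a hypothetical callback; in a real setup it would
    check for the spot interruption notice.
    """
    state = {"step": 0, "loss": 1.0}
    if os.path.exists(checkpoint_path):
        # Resume from where the previous (interrupted) run stopped.
        with open(checkpoint_path) as f:
            state = json.load(f)

    while state["step"] < total_steps:
        if interrupted(state["step"]):
            # Save state before the instance is reclaimed, then exit.
            with open(checkpoint_path, "w") as f:
                json.dump(state, f)
            return state, False
        state["step"] += 1
        state["loss"] *= 0.9  # placeholder for a real optimisation step

    return state, True
```

The function returns the final state plus a flag saying whether training finished, so the next run can simply call it again with the same checkpoint path.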

But I'm not gonna dedicate time to it yet, it's not an urgent feature at all. It'll be relevant once I'm training a model for a week or so

I swear, I really want to get this MLOps stuff out of the way, it's so much work ._.

pine pagoda
#

hey i'm making an app that combines decision trees, ai, neuropsychology, and a feedback loop system. does anyone have experience with decision trees and feedback loops / ai?

long canopy
#

you should check skypilot for automating spot management, apparently it's good for that

long canopy
final kiln
final kiln
long canopy
#

yeah heard people train multiple days in a row with only spot instances using it

final kiln
#

In fact, looks like I'm coding that thing lool

long canopy
#

lol, the classic

final kiln
#

why didn't you show me this earlier 😭

long canopy
#

heheh

#

btw do you use anything special for distributed inference/training?

#

am trying to figure out if this is conditional on the specific models i use

final kiln
#

Do they have a managed solution

final kiln
final kiln
#

Omg we were thinking different things, but also, the same

#

I'm so tired

long canopy
final kiln
#

I think I can use it to provision the infra, there's nuance in this for sure

#

Prefect is both more general and more flexible, I'm unsure I'd be able to get the same level of observability with my rust/python split while using sky

#

While it's true that what I'm doing is a bit unorthodox

long canopy
#

oh were you building your project on prefect?

final kiln
#

It does highlight that there's gonna be stuff that will not be possible

final kiln
#

It can do two tasks in an asynchronous manner in which one task is baby sitting a shell process while the other is pre processing data

#

Which is actually a pattern I wanna keep, rust split or not
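The pattern can be sketched with plain asyncio, independent of Prefect: one coroutine babysits a child process while another preprocesses data. This is an illustration only; the child command and the `preprocess` body are placeholders, not the actual pipeline.

```python
import asyncio
import sys


async def babysit(cmd):
    """Start a child process, wait for it to exit, and return its output."""
    proc = await asyncio.create_subprocess_exec(
        *cmd, stdout=asyncio.subprocess.PIPE
    )
    stdout, _ = await proc.communicate()
    return proc.returncode, stdout.decode().strip()


async def preprocess(rows):
    """Placeholder preprocessing that yields control between rows."""
    cleaned = []
    for row in rows:
        cleaned.append(row.strip().lower())
        await asyncio.sleep(0)  # let the other task make progress
    return cleaned


async def main():
    # The child command is a stand-in for the real training binary.
    cmd = [sys.executable, "-c", "print('training done')"]
    return await asyncio.gather(babysit(cmd), preprocess([" A ", "B "]))
```

Running `asyncio.run(main())` returns both results once the slower of the two tasks has finished.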

#

So what I'm imagining is that I'll still use sky, but for turning on and off the work pool

long canopy
#

you should prob make a git repo this stuff is likely to be useful for general mlops

final kiln
#

Yeah I have one, this is all an open source project funded by my non profit org

long canopy
#

oh nice what's your github profile

final kiln
#

Ah, can't give you cuz it would dox me

long canopy
#

oh np

final kiln
#

Tho it is a bit counter productive to my cause to not be able to share it

desert oar
#

i gave up and assumed i'm easy enough to dox anyway. i don't post my real name very often, but if you're motivated you could figure out who i am

neon crystal
#

has anyone done interpolation with LSTM?

desert oar
#

i went through a phase where i thought i wanted to be one of those people who used their Real Name On The Internet, so it's my github username & some of my email addresses, but i kind of regret that

#

cat's out of the bag now at any rate.

final kiln
#

I think I'm doxed in a couple servers, but they're much smaller communities so it's less risk

past meteor
#

I might switch my Discord name as well eventually

flint stirrup
#

why doesn't calling this function work?

#

i took a break from py for a long time and i think i did a coding error somewhere

#

can someone help me please?

final kiln
#

I think "choice" is a string, so none of your if statements evaluate to true

#

Try to cast it to an integer, or use "1" instead of 1 in the if statements
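The original code was not posted as text, so this is only a sketch of the suggested fix with a hypothetical `pick` function: `input()` always returns a string, and `"1" == 1` is False, so the value is converted before comparing.

```python
def pick(choice):
    """Dispatch on a menu choice that arrives as text, as input() returns."""
    try:
        number = int(choice)  # "1" -> 1; raises ValueError on non-numeric text
    except ValueError:
        return "invalid"
    if number == 1:
        return "start"
    if number == 2:
        return "quit"
    return "invalid"
```

Calling it as `pick(input("choice: "))` keeps the conversion in one place instead of in every if statement.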

flint stirrup
#

oh yea fax

#

thanks

final kiln
#

This is why I miss having a compiler around when the project gets big, this kind of stuff just keeps happening but in more subtle ways

flint stirrup
#

yea 😅

final kiln
left tartan
desert oar
final kiln
#

I agree, gpt4 also does meta cognition. They just didn't fine tune Claude 3 to deny that stuff about qualia and etc

late ruin
#

any one got good resources for text mining/analytics , how to start and where? i have a txt file to work with

hollow sentinel
sour socket
#

hello i want to make name detector or something like this because in my school the teacher keep saying the names of who is absent and who is not when in a meeting so i want it to work like this

i take a screensht of everyone in meeting and then the program says me who is NOT here in the names

final kiln
#

That's somewhat dystopian imo

#

I think classes are mostly counter productive, mass education is a necessary evil cuz the opposite is total mayhem. That's my take and I stand by it.

uncut crescent
#

is anyone here familiar with the constrained MST problem, and how can I go about using RL and GNNs to find approximate solutions? ty

hallow hatch
#

Can anyone recommend some sub-orgs to apply for in GSoC under the "Python software foundation" banner? My interests are in ML and DS.

spring field
#

can ml be thought of as a pretty complex curve fitting machinery?

long canopy
final kiln
#

Dealing with Nvidia drivers on Linux+docker has been a bad experience overall

wooden sail
#

more of "function estimation", since that is more general than curve fitting

buoyant vine
#

Nvidia drivers are just a bad experience

final kiln
#

I gave up and just went to the pytorch's dockerfile and copied every line that mentioned Nvidia

#

Idk if it will work

#

It'll work I think, the AMI I'm using comes with the Nvidia docker thing set up. So all it needs is the label and the env variables

I assume

#

Sigh

#

So this is the pattern I'm gonna be using to develop and deploy ML in a cost effective manner:

  • setup production images, ideally this would be just one
  • a GitHub workflow that:
    • lets you select which machine you wanna use (local or any AWS machine)
    • uses the selected production image as the base image
    • installs the development dependencies on top
    • sets up any extra dev config
    • finally deploys a vscode tunnel that you can open anywhere with a browser and GitHub session
  • a GitHub actions workflow that deploys the pipelines to prefect
  • and a final workflow that actually builds the images and publishes them
#

so it's basically a substitute for gitpod that lets me use gpu

#

Eventually I'll use the sky thing to extend this to other providers, but it's not needed at the moment.

agile owl
#

I realized I had an error in my ETL so my dataset was getting messed up and I spent all this time trying to figure out why my model wasn't working well anymore

#

🥴

past meteor
#

What do your ratios of RAM to VRAM look like? I want to get an idea of what others are running here

#

Because mine is likely quite dumb, I'll have to ask IT to reallocate

left tartan
#

Most of my systems are 1:1, just out of laziness.

past meteor
#

Yeah, we're at 7gb ram and 48gb vram

#

Sometimes IT is mindboggling

agile owl
#

but yes RAM being smaller than VRAM sounds really bad

past meteor
#

IT or whoever provisions machines has a skill issue honestly

#

I think I should sit down next to them whenever they're doing it

#

Their lead times are too large to afford to get stuff wrong and it always seems like a back and forth to get a half decent machine lol. The most annoying part about this is that it's unutilised on premise compute anyway

final kiln
#

I'm on 16gb ram to 16gb vram

frigid anchor
#

This may seem random but does anyone know why Euler's number shows up so often in ML equations

agile owl
#

if you're fully using your VRAM then I think RAM should be at least a bit bigger than VRAM

#

everything that goes into VRAM has to go through RAM right

final kiln
#

My whole setup might be a bit more efficient than usual

agile owl
#

and you need RAM for other things too

final kiln
#

So I actually could get by with less ram

past meteor
agile owl
#

isn't that enough to kill the GPU vectorization benefit

#

reminds me of blocking in async

final kiln
#

I load model, then I load data

past meteor
#

IT giving me toasters instead of real machines is what taught me a lot about programming efficiently fwiw

final kiln
#

So the available GPU memory is less than the total GPU memory, and the difference is the ram that is needed

past meteor
#

Ironically, after fighting with them to get resources for machines that aren't being used I got it when I had perfected my ETL to run with a fraction of the resources

agile owl
#

do IT people actually understand how GPU computation works

#

can you even expect them to

past meteor
#

Doing this was a nice experience but honestly, it was a waste of time

left tartan
agile owl
#

why don't you just tell them what to give you

#

instead of having them decide what to give you

#

having IT decide sounds ass backwards because how can they know

past meteor
#

It's their territory, you can't decide for them

agile owl
#

oh hell

past meteor
#

That's what IT does

agile owl
#

is IT trained on how data feeds into the GPU

#

🤔

past meteor
#

No

agile owl
#

then how can it be their job

past meteor
#

They don't understand it's a special case

#

So they say, "it's our job to provision hardware"

left tartan
#

I think most of our systems are 2-3x. We just take whatever lambda or ec2 provides. My workstation is more like 8:1… I maxed out my memory

agile owl
#

I always feel like exposing the irrationality of the org structure in situations like this but I usually just end up contributing less effort to punish my employer for putting me in an impossible situation

left tartan
#

Ram is basically free, relative to my salary

past meteor
#

Aside from that, a lot went south like them giving me a machine that has a tiny amount of disk space allocated to /

final kiln
#

I'm actually not sure why you need more than 1:1 or 2:1 since you can swap with disk

past meteor
#

I have another physical drive mounted but it's not on /. Whenever I want to install large packages like Torch I need to do all sorts of magic like symlinking the cache etc

#

Total waste of time

left tartan
past meteor
#

Basically yes

#

For my interns I just got API keys and a budget for LLMs

#

We can run them on premise

#

Why? The price to get them SSH access is just more than paying for the compute in the cloud

final kiln
#

You can also deploy vscode tunnels, even less time

#

I got all this stuff down, but I fear the next org I join will likely have its own less efficient ways of doing things

past meteor
#

It depends, we also have people that are brilliant internally on the IT side.

#

They just don't work for our team

final kiln
#

For the longest time I thought we were all IT

past meteor
#

Depends on the language

#

In Dutch we call everything that has to do with tech IT, so even SWEs

final kiln
#

I have no idea how it is in PT actually, but I think there is a distinction yeah, for people who haven't gone through college and don't code but know some bash and know their way around a computer

left tartan
desert oar
#

oh and of course, it's part of the gaussian distribution

#

not to mention the exponential distribution as well

#

it tends to be important mathematically because of its relationship to logarithms, and logarithms show up a lot in probability and statistics as well

final kiln
#

From what I recall, It's the purest exponential

desert oar
final kiln
#

It solves f(x) = f'(x)
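A quick numerical check of that property, as a sketch: the exponential equals its own derivative, and the same fact gives the sigmoid the tidy derivative s(x)(1 - s(x)), which is one reason e is convenient in ML formulas. The helper names are illustrative.

```python
import math


def numeric_derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))
```

Comparing `numeric_derivative(math.exp, x)` with `math.exp(x)` at a few points shows the two agree up to the approximation error.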

desert oar
past meteor
#

I was talking about a sklearn interface over Torch recently. I found out skorch does it but it looks sus

agile owl
desert oar
past meteor
#

That's what I did. I was curious how others did it

#

I just use torch as normal and have a very thin (20-30 loc) wrapper that can take any model

agile owl
#

this fly vision video is really interesting

#

talking about the organic cognates to data structures

final kiln
#

good days work, with all my infra finally done

#

ah, I do need to hookup MLFlow, shouldn't be too hard tho

#

after that I'm gonna do the hyper parameter search thing, and prepare the rest of the datasets

#

they just need to obey a certain data schema and it all works out

past meteor
#

I went caveman this project and essentially rolled my own system using streamlit

final kiln
#

I just need to pass an http url to a parquet file and it knows what to do with it as long as the schema is proper

past meteor
#

Sometimes you hedge your bets on being able to write a subset from scratch faster than reading the docs

final kiln
#

this is how it looks on prefect

tall panther
#

Hey all, what is the most cost effective way to save training data for a model? I'm talking about 1-2 TB of data. I'm the only one who will use it, so I would be ok with non-cloud storage options.

agile cobalt
#

depends on where you will train the model

tall panther
#

If it's possible I prefer to train it locally, but the ssd in my laptop is not big enough.

agile cobalt
#

you could just purchase an external hard drive then

tall panther
#

yeah, that was my original plan. I was just checking in case there is a better alternative

final kiln
#

Uhm, depends on how long you'll need to store the data, S3 might cost less in total

tall panther
#

S3 is 25 USD/month for my requirements. If I need the data for more than 4 months it would be better to purchase a separate drive. Since this is my first project I think it's likely I will work with the data for more than 5 months.
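The break-even reasoning can be written out as a small calculation. The drive price of about 100 USD is an inference from the "more than 4 months" remark, not a quoted price, and the function name is illustrative.

```python
import math


def break_even_months(drive_price, cloud_price_per_month):
    """First whole month in which cumulative cloud cost exceeds the drive price."""
    return math.floor(drive_price / cloud_price_per_month) + 1
```

With an assumed 100 USD drive and 25 USD/month for S3, `break_even_months(100, 25)` gives 5: the costs are equal after month 4 and the drive is cheaper from month 5 on.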

final kiln
#

Yeah then ssd seems to be the way

agile cobalt
#

you can also use the drive for backing up normal things after you are done with the project, and/or just permanently keep that data backed up

agile owl
#

it's a lot harder to do development cost effectively on cloud

#

until you get to a certain state anyway

#

if it's your first project then definitely buy hardware imo

#

worst case scenario you just reuse it instead of getting cloud in the future

final kiln
#

In theory cloud is cheaper until a certain threshold of usage

#

But there's also initial setup cost

#

And you need to know what you are doing

agile owl
#

yeah but there's also the aspect that your sunk cost on the hardware now means it's free later

#

whereas if you need storage again you have to pay s3 again

final kiln
#

I think storage in general is cheap so it's pretty clear yeah

#

But stuff like GPU is a lot harder to see imo

tall panther
#

Yeah, the case for an external ssd seems very clear. Thanks everyone.

agile owl
#

the real question is whether you can expect to amortize the cost of hardware over the lifetime of the hardware vs S3 not whether for one project it's cheaper to use S3

#

so it's all about how long+how much you use the drive, not how long you use S3

buoyant vine
#

There is also the argument of performance and durability

#

You can pull a lot of data from S3 at effectively any dataset size, while also not needing to worry about drive failures etc

agile owl
#

right but drive failures are kind of a problem that cloud makes for themselves

#

to a large extent

#

a single user using a single SSD drive, you don't have to even think about it for years

final kiln
#

There's also the risk of losing the data

#

There's no fault tolerance locally, no deduplication, etc

agile owl
#

it's also bro's first project

final kiln
#

trial by fire

agile owl
#

and I think in principle you can worry about moving your stuff to cloud once you've developed it locally

#

if the data is valuable you should have an independent backup copy anyway

#

but it sounds like the data can be obtained at will it's just storing it

final kiln
#

No I think SSD is better in this case

tall panther
#

I think so too. I'm sure other options have their advantages, but in my situation I'm really prioritizing cheaper solutions.

agile owl
#

I just tested my stuff on cloud and tore it down but that's because I also have a local copy

#

I can very precisely control my cloud usage as long as I tear it down every time I'm done

#

and store everything locally

#

basically spending like $10 for 5 hrs

#

but with s3 you gotta pay per month to store it

#

and then you have to deal with s3

final kiln
#

With interruptible instances like Spot you can get some pretty good prices, hard to argue for buying a GPU

#

But then again, it depends on usage and on scale

#

Rn if I average 4h per day of GPU usage, it leads to about 200 dollars at the end of the year
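That estimate implies an hourly price, which is easy to back out. A small sketch with an illustrative function name:

```python
def implied_hourly_rate(yearly_cost, hours_per_day, days_per_year=365):
    """Hourly price implied by a yearly total and an average daily usage."""
    return yearly_cost / (hours_per_day * days_per_year)
```

For 200 dollars a year at 4 hours a day, that is 200 / 1460, or roughly 14 cents per GPU hour.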

agile owl
#

where are you getting decent gpu server for 14 cents an hour?

final kiln
#

Some of the Asia regions also have good prices

#

I've gotten as little as 6cents an hour, but it went up

#

On vast.ai you're able to get even better prices with interruptibles

#

I've seen 4 cents an hour at times

#

I'm not using vast cuz I have credit on AWS

agile owl
#

how often do they get interrupted

final kiln
#

AWS, not very often, only got mine interrupted once after like 24h or more

#

But I think it might depend on the instance type

agile owl
#

does it give you a grace period

final kiln
#

Yeah

#

And there's like a thing you can use to handle the fault tolerance, haven't tried it yet tho

#

It's sky something

#

I'm still not fully convinced tho, I think it looks excellent for provisioning cloud infrastructure, but, it's not gonna be as flexible as something like prefect

#

But I haven't tried it yet, so I can't speak too much on it

desert oar
final kiln
copper zodiac
#

I made a GUI frontend to CoquiTTS using Tkinter in a few hours because I was bored lel

#

Oh wait it's already been made :(

#

Well it was fun to make anyways

zealous spear
#

Hi, can anyone tell me why this import fails: from llama_index.core.readers import PDFReader ?

I got this error: ImportError: cannot import name 'PDFReader' from 'llama_index.core.readers' (C:\Users\barte\Documents\GitHub\Python\ai\Lib\site-packages\llama_index\core\readers\__init__.py)

#

Or can anyone help me changing this import that this code can work?

import os
from llama_index.core import StorageContext, VectorStoreIndex, load_index_from_storage
from llama_index.core.readers import PDFReader

def get_index(data, index_name):
    index = None
    if not os.path.exists(index_name):
        print("building index", index_name)
        index = VectorStoreIndex.from_documents(data, show_progress=True)
        index.storage_context.persist(persist_dir=index_name)
    else:
        index = load_index_from_storage(
            StorageContext.from_defaults(persist_dir=index_name)
        )

    return index


pdf_path = os.path.join("data", "Canada.pdf")
canada_pdf = PDFReader().load_data(file=pdf_path)
canada_index = get_index(canada_pdf, "canada")
canada_engine = canada_index.as_query_engine()
buoyant vine
#

A google seems to suggest it is PyMuPDFReader not PDFReader

#

Never mind

#

I see they have both types

#

Their doc search is terrible...

zealous spear
buoyant vine
#

Have you installed llama-index-readers-file

zealous spear
#

nope, I thought the llama_index install included it

#

but I am installing it now

buoyant vine
#

I guess worth a try

#

Other than that, I am not sure

open raven
#

CPython 3.12.2: pip can't build the statsmodels wheel while installing a bdist package (streamad).

On the other hand, it was possible to install statsmodels into the same environment explicitly with pip install statsmodels.

streamad's meta.yaml pins a single version number for its setuptools requirement. I will try to relax it, though I have no idea whether that matters for this problem; I'm just shooting blindly.

Any other ideas? I appreciate any input.

buoyant vine
#

Are you using any sort of package manager like pipenv or poetry which do their own resolving?

zealous spear
buoyant vine
#

If you are maybe it is resolving a different (wrong) version

buoyant vine
zealous spear
#

neither of these options works

#

the install should look like this: pip install llama-index-readers-file ?

zealous spear
rugged comet
# desert oar you posted a screenshot of what looked like a time series decomposition, that's ...

Yes, it is a time series decomposition using statsmodels.tsa.seasonal.seasonal_decompose. Do people also call that a "time series model"?
It was my assumption that we would want to compare temperature to crime rate residual. It didn't make sense to me to include the trend because the overall decrease in crime is likely not influenced by day-to-day temperature. Where I'm confused is the seasonal part of crime. I want to believe that there's also a seasonal part of the crime rate that we want to factor out. Such as "more crime on the weekends". My instructor seems to think we should still include the "ups and downs" of both temperature and crime rate found in the seasonal parts of both.
Sorry about the random stream of thoughts. This is my first exposure to time series, so I could be making assumptions that I shouldn't be.

buoyant vine
zealous spear
# buoyant vine Pretty sure under the hood it uses pypdf

Replaced it, and got this code

import os
from llama_index.core import StorageContext, VectorStoreIndex, load_index_from_storage
from pypdf import PdfReader


def extract_text_from_pdf(pdf_path):
    with open(pdf_path, 'rb') as f:
        pdf_reader = PdfReader(f)
        text = ''
        for page_num in range(pdf_reader._get_num_pages()):
            text += pdf_reader._get_page(page_num).extract_text()
        return text

def get_index(data, index_name):
    index = None
    if not os.path.exists(index_name):
        print("building index", index_name)
        index = VectorStoreIndex.from_documents(data, show_progress=True)
        index.storage_context.persist(persist_dir=index_name)
    else:
        index = load_index_from_storage(
            StorageContext.from_defaults(persist_dir=index_name)
        )

    return index


pdf_path = os.path.join("data", "Canada.pdf")
canada_text = extract_text_from_pdf(pdf_path)
canada_index = get_index(canada_text, "canada")
canada_engine = canada_index.as_query_engine()

and this kind of error:
PS C:\Users\barte\Documents\GitHub\Python> python main.py
building index canada
Traceback (most recent call last):
File "C:\Users\barte\Documents\GitHub\Python\main.py", line 10, in <module>
from pdf_reader import canada_engine
File "C:\Users\barte\Documents\GitHub\Python\pdf_reader.py", line 30, in <module>
canada_index = get_index(canada_text, "canada")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\barte\Documents\GitHub\Python\pdf_reader.py", line 18, in get_index
index = VectorStoreIndex.from_documents(data, show_progress=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\barte\Documents\GitHub\Python\ai\Lib\site-packages\llama_index\core\indices\base.py", line 136, in from_documents
docstore.set_document_hash(doc.get_doc_id(), doc.hash)

AttributeError: 'str' object has no attribute 'get_doc_id'

buoyant vine
#

I am guessing what you give to from_documents is not the right type

#

is it supposed to be a string, or does llama_index expect an object with different attributes
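The traceback is consistent with that guess: from_documents calls get_doc_id() on each item it receives, and a plain str has no such method. A minimal sketch of one way around it, assuming llama_index's Document class (which takes a text= argument) is the intended input type. The helper name `wrap_text_as_documents` is hypothetical, and the class is passed in as a parameter so the sketch runs without llama_index installed.

```python
def wrap_text_as_documents(text, document_cls):
    """Wrap raw text in a one-element list of document objects.

    `document_cls` is expected to be llama_index's Document class; it is
    passed in so the helper itself has no hard dependency.
    """
    return [document_cls(text=text)]


# Usage with llama_index installed (not executed here):
#   from llama_index.core import Document
#   canada_docs = wrap_text_as_documents(canada_text, Document)
#   canada_index = get_index(canada_docs, "canada")
```

Separately, in recent llama_index releases the PDF reader appears to ship in the llama-index-readers-file package under llama_index.readers.file; treat that import path as an assumption to check against the installed version.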

desert oar
#

what's the actual goal here?

rugged comet
# desert oar what's the actual goal here?

The goal was very broad: "Does weather affect the rate of crime in Chicago? If so, how much?"
We were given the following data sources:
Crime - https://data.cityofchicago.org/ (To export, click "Public Safety", then click "Crimes - 2001 to Present")
Weather - https://climexp.knmi.nl/gdcntave.cgi?id=someone@somewhere&WMO=USW00094846&STATION=CHICAGO_OHARE_INTL_AP,_IL&extraargs= (To export, click "raw data" just above the graph of the temperatures)
Each row in crimes is one crime. Each row in weather is one temperature measurement for a day.
I started by inner joining the crime and weather data on the date. I added a "Crimes Per Day" column by getting the value counts for each date in the dataframe. Calculating correlation between temperature and crimes per day gave a very bad result. The instructor said that we should shoot for an r^2 value of around 0.48. I was getting an r value of about 0.19.
He did a sort of manual decomposition of the time series by subtracting the rolling average from each day's crime rate. He then turned that into a percentage. I assume he means percent increase.
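A rough sketch of that manual detrending, in plain Python. It assumes a trailing rolling window and reads "percentage" as percent deviation from the rolling mean; both are guesses about what the instructor did. Note that r and r^2 are on different scales: an r of 0.19 corresponds to an r^2 of about 0.036, while an r^2 of 0.48 needs |r| of about 0.69.

```python
import math


def rolling_mean(values, window):
    """Trailing rolling mean; the first window-1 entries are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


def percent_deviation(values, window):
    """Percent difference of each value from its rolling mean."""
    means = rolling_mean(values, window)
    return [
        None if m is None else 100.0 * (v - m) / m
        for v, m in zip(values, means)
    ]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

To compare against temperature, the None entries at the start would be dropped from both series before calling `pearson_r`.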

desert oar
#

i understand why your professor wants you to keep the original un-decomposed temperature series. if anything, the natural changes in weather throughout the year allow you to make interesting within-year comparisons

#

it sounds like this is intended as more of a general data analysis exercise, than an exercise in causal modeling, is that right?

#

if so, it looks like "affects" is used loosely to mean "is related to / correlated with", rather than "causes"

#

in which case it doesn't matter so much whether you've accounted for confounding variables etc. correlation is what it is. you just need to resist the temptation to interpret it as causality.

#

it's a bit weird that you're being expected to manipulate the data until you get a high correlation

#

is that meant to prove some kind of point about data processing affecting results?

#

i think your original instinct is still correct: weather alone might not cause more crimes. but that might be the point of the exercise, to demonstrate that just because two things in the data are statistically related doesn't mean they are causally related

rugged comet
#

@desert oar

> it sounds like this is intended as more of a general data analysis exercise, than an exercise in causal modeling, is that right?
He didn't mention anything about causal modeling. I think you're correct in that it's intended to be more of a general data analysis exercise.
> if so, it looks like "affects" is used loosely to mean "is related to / correlated with", rather than "causes"
I would agree with this.
> it's a bit weird that you're being expected to manipulate the data until you get a high correlation
I think by that he meant that if you're not getting anywhere close to that, you may not be considering something that you are supposed to consider. Say I got an r^2 value of 0.25, but I didn't know that the relationship between temperature and crime is actually stronger than that. If I worked hard, and everything I did to reach that value made sense to me, I might not have any reason to try other things such as time series decomposition.
> is that meant to prove some kind of point about data processing affecting results?
He didn't say that explicitly. It seemed to me like the point of this project was to learn how to remove an overall trend from data so that the underlying relationship is clearer.
> i think your original instinct is still correct: weather alone might not cause more crimes. but that might be the point of the exercise, to demonstrate that just because two things in the data are statistically related doesn't mean they are causally related
Of course weather alone does not cause more crimes. It likely isn't the only factor even affecting the rate of crime. If he wanted to demonstrate that statistically related variables aren't necessarily causally related, he certainly didn't spell that out or even make any allusion to whether that was his intent.
In short, "affects" was likely used to refer to correlation. What I still don't know is whether it even makes sense to correlate seasonal temperature with seasonal crime.

mint mist
#

Got this for HW and i have no clue how to do it

In Exercise 1, we will group the dataframe by birdname and then find the average speed_2d for each bird. pandas makes it easy to perform basic operations on groups within a dataframe without needing to loop through each value in the dataframe.

Instructions
Fill in the code to find the mean altitudes of each bird using the pre-loaded birddata dataframe.

Here is the code:

# First, use groupby() to group the data by "bird_name".
grouped_birds =

# Now calculate the mean of speed_2d using the mean() function.
mean_speeds =

# Find the mean altitude for each bird.
mean_altitudes =

# What is the mean speed for Sanne?
mint mist
#

its solved now nvm
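For reference, a sketch of what the exercise asks for. The pure-Python part shows the split-then-average step that groupby performs, and the pandas calls that would do the same are in the trailing comments. The rows are made up for illustration and the altitude column name is an assumption; neither comes from the real birddata.

```python
from collections import defaultdict


def group_mean(rows, key, value):
    """Mean of `value` per distinct `key`, skipping missing (None) values."""
    groups = defaultdict(list)
    for row in rows:
        if row[value] is not None:
            groups[row[key]].append(row[value])
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}


# Made-up rows for illustration; not the real birddata values.
rows = [
    {"bird_name": "Eric", "speed_2d": 2.0, "altitude": 60},
    {"bird_name": "Eric", "speed_2d": 4.0, "altitude": 80},
    {"bird_name": "Sanne", "speed_2d": 1.0, "altitude": 30},
    {"bird_name": "Sanne", "speed_2d": None, "altitude": 50},
]
mean_speeds = group_mean(rows, "bird_name", "speed_2d")
mean_altitudes = group_mean(rows, "bird_name", "altitude")

# With pandas, the same results would come from:
#   grouped_birds = birddata.groupby("bird_name")
#   mean_speeds = grouped_birds["speed_2d"].mean()
#   mean_altitudes = grouped_birds["altitude"].mean()
```

Like the pandas mean, the helper ignores missing values instead of letting them poison the average.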

orchid geyser
#

Hello friends, can anyone introduce me to a course so that I can learn mathematics related to artificial intelligence and data science or machine learning and it will be enough, thank you.

errant bison
#

What are some required skills for ai developer job

serene scaffold
desert oar
# rugged comet <@389497659087650836> > it sounds like this is intended as more of a general da...

thanks for the clarifications. i guess all i can offer is that sure, why not? you can correlate anything with anything else. in this case you're talking about the correlation between temperature residual from decomposition and crime residual from decomposition? then you're specifically looking at a correlation between deviations from baseline levels, because that's basically all a residual is. does that make sense? yeah sure, why wouldn't it?

final kiln
#

Had to implement REST requests to MLFlow in rust, not happy about needing to have a server up for the purpose, but I can just get a long running free tier as I had b4

#

I'm also unsure why rust complicates strings

#

It's quite unlikely that I'll reuse rust for this though. It's not bad, but having to worry about an integration across two languages is an unnecessary hurdle.

#

I have to communicate the MLFlow session to this second process, when I could just be passing a python object around

errant bison
final kiln
#

I'm actually yet to see one which doesn't require masters

#

Some ask PhD

desert oar
final kiln
final kiln
# final kiln Some ask PhD

I think in some cases it's not a hard requirement though, I still send my resume and get responses. I not only don't have a PhD but also have an incomplete masters.

#

I think I do a good job of showing continued education and practical experience. I also don't plan for the MSc to remain incomplete.

#

But the point still stands

buoyant vine
#

We do all our training in Python then export to onnx

#

Then the onnx file can be compiled to rust and handled from there

#

Or using onnx runtime directly depending on needs

buoyant vine
# final kiln Yeah I guess that makes sense. Still, why not just have one safe type

Because rust must function on a variety of setups, in particular the language must be able to function without an allocator for things like embedded systems.

So you end up with one type to represent the "string of UTF8" pointer type which can be backed by any memory i.e. stack or heap

And then you have String for "if you have an allocator" which provides the allocing of strings on the heap

pseudo pasture
#

hello guys,
need help

#

Hello Devs,
I have Containerized MY ML Project ( FLASK BASED) and I want to deploy it on AZURE but locally on My pc it is working fine as i hit the endpoint but When I containerized the whole project the Docker Logs Show Following Warning( Note: I'm on Windows):

2024-03-08 09:53:03 2024-03-08 04:53:03.627033: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.

2024-03-08 09:53:03 2024-03-08 04:53:03.629253: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.

2024-03-08 09:53:03 2024-03-08 04:53:03.671447: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered

2024-03-08 09:53:03 2024-03-08 04:53:03.671542: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered.

2024-03-08 09:53:03 2024-03-08 04:53:03.671585: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered.

#

2024-03-08 09:53:03 2024-03-08 04:53:03.679435: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.

2024-03-08 09:53:03 2024-03-08 04:53:03.679764: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.

2024-03-08 09:53:03 To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.

2024-03-08 09:53:04 2024-03-08 04:53:04.893043: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

desert oar
#

make sure your linkedin page and portfolio emphasize your recent project work, programming experience, and strong math background. your work experience and education details are less important by comparison

odd meteor
#

You'll get to meet people and ask questions that could help you put a lot of things in perspective.

untold dove
#

not sure if it is a program issue or a data issue

#

new to all this ai stuff and teaching myself someone told me to message in this channel and that this may be of assistance to me

long canopy
#

what do you guys use to navigate huge profiler outputs

#

custom parsing scripts?

#

trying to debug why training is going slower than expected

pulsar fjord
#

What knowledge i have to gain to create a AI from scratch?

#

Like Jarvis from IRON MAN

left tartan
#

(just like tony stark)

pulsar fjord
left tartan
left tartan
pulsar fjord
#

What languages or modules should ik?

#

If i want to learn from python

untold dove
#

i wish u luck

#

cause its borderline impossible for a single man to achieve

devout python
#

Hey guys, my postgres queries through SQLAlchemy and psycopg2 to Supabase have randomly started to fail with the error "server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request." - I have not really changed anything materially in my code but all of a sudden it fails about 1 in 5 times I query the database. As far as I can tell Supabase is fully up and running, does anyone have any idea what is wrong?

left tartan
pulsar fjord
#

Yah some basic knowledge gain from youtube and whitehatjr (144 classes 🥲😅)

left tartan
#

!kin

arctic wedgeBOT
#
Kindling Projects

The Kindling projects page on Ned Batchelder's website contains a list of projects and ideas programmers can tackle to build their skills and knowledge.

pulsar fjord
#

Okk thanks a lot

left tartan
pulsar fjord
#

😲🤩 thanks bro @left tartan

long canopy
#

native torch running at 3.5 it/s, while torch lightning running at 0.05 it/s

#

any suggestions?

agile owl
#

polars dateparsing is pissing me off

#

try_parse_dates works but then it can't write to a database because pandas can't work with arrow extension dates

#

if I don't use try_parse_dates when I try to cast it to a date after I read it to the database it fails for some weird reason even though I have the right parse string

left tartan
#

You may need to use a timestamp or something, I use dates with pandas/polars, but I know I had to do some thing to make it work. 🤷🏻 can't remember the specifics

rugged comet
# desert oar thanks for the clarifications. i guess all i can offer is that sure, why not? yo...

I know we can correlate anything with anything. And I think I have a good grasp on what comparing the residuals means. However, my instructor and I seem to disagree on whether we should also be including the seasonal component of either or both temperature and crime rate. He is of the opinion that we want to include the seasonal components for temperature and crime rate because it contains the "ups and downs" in the data; the "reason why crime rate varies". He goes on to say that he thinks the residual components are more akin to noise and "why would those be correlated".
I think that we don't want to include the seasonal components because it's my understanding that it holds the cyclical behaviors of each of the variables. I would think that if temperature rises and falls throughout the year every year, we don't care about that part. I'm not really basing my opinions on anything besides gut feeling/common sense.
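
One way to see what each choice measures is to try it on made-up data. The sketch below is only an illustration: the two series are synthetic, they share a yearly cycle but have independent noise, and subtracting each calendar month's mean is a crude stand-in for whatever decomposition method the course uses.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def remove_monthly_means(series):
    """Crude seasonal adjustment: subtract each calendar month's mean."""
    means = [sum(series[m::12]) / len(series[m::12]) for m in range(12)]
    return [v - means[i % 12] for i, v in enumerate(series)]


rng = random.Random(0)
months = 12 * 20  # 20 synthetic years of monthly data

# Both series follow the same yearly cycle; their noise terms are independent.
seasonal = [math.sin(2 * math.pi * (i % 12) / 12) for i in range(months)]
temperature = [15 + 10 * s + rng.gauss(0, 1) for s in seasonal]
crime = [100 + 10 * s + rng.gauss(0, 1) for s in seasonal]

raw_corr = pearson(temperature, crime)
resid_corr = pearson(remove_monthly_means(temperature), remove_monthly_means(crime))
print(f"with seasonal component: {raw_corr:.2f}")
print(f"residuals only:          {resid_corr:.2f}")
```

With the seasonal component left in, the correlation mostly reflects that both series move with the calendar. With it removed, the correlation only reflects whether unusually warm months line up with unusually high crime, which here they do not by construction.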

left tartan
untold dove
#

dont use spacy it isnt that good

serene scaffold
#

but in all seriousness, what do you dislike about it?

#

(forgive my immature retort--I use spacy a lot and have contributed to it.)

desert oar
#

yeah isn't spacy basically industry standard for quick and easy NLP work?

#

i've used it a bunch for things like lemmatization

left tartan
unreal flicker
#

hey guys looking for a text-to-speech python package, similar to elevenlabs. Lightweight so it can run on just a cpu. I tried setting up the elevenlabs package but I ran out of API tokens. The program just runs after a user asks a question and replies to it by looking up the closest result

odd meteor
final kiln
final kiln
# buoyant vine I would probs not worry about using rust until you want to deploy

Actually, my hypothesis is that a compiled language is better for ML, because the compilation step can be used for catching errors in the calculations without having to instantiate a model

My experience with rust has been: love -> hate -> hate -> love -> hate, I get frustrated when it doesn't let me do something but then I look back at the resulting code and see that it is better than what I was trying to do. I suspect things will become different as I start to become fluent in the language tho

mint shard
#

can someone guide me to start machine learning

final kiln
# desert oar make sure your linkedin page and portfolio emphasize your recent project work, p...

That's an interesting take, I had considered just removing my education section so I can have more space for my practical experience.

Presenting an incomplete masters in a resume is actually kinda hard, I try to balance it out with the achievements I've made during that time, and also there's the fact that I just have one presentation pending. Which is actually not the first one I've done.

Tho, I would welcome not having to think about it and just place my projects and achievements instead

final kiln
#

Which would be totally fine, but rust is very sensitive to the &

dusty forge
#

Quick question, how do I get my Google Colab to display this information?

#

Because mine only shows this

tidal bough
#

you could print it instead of just letting it show

#

Or look in the settings - what you're looking for is to make that cell's output be interpreted as text instead of a fancy widget.

dusty forge
tidal bough
#

No real idea, I don't use google collab. I'd try rightclicking at the cell itself, maybe; in vscode's jupyter support it's in a menu near the output

dusty forge
#

Alright, I'll keep searching thanks

tidal bough
dusty forge
tidal bough
#

Have you tried printing it, or perhaps printing its repr?

dusty forge
tidal bough
#

had to look into the code to find it, but it's another sklearn config that skips default values of estimators in their repr.

#

so sklearn.set_config(print_changed_only=False, display="text")

#

(the second one can be omitted if you like those fancy diagrams)
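
A small sketch of the difference, assuming scikit-learn is installed; `LogisticRegression` and `C=0.5` are just picked for the illustration:

```python
import sklearn
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(C=0.5)

# Default behaviour: only parameters that differ from their defaults are shown.
sklearn.set_config(print_changed_only=True)
short_repr = repr(model)
print(short_repr)

# Show every parameter, and render estimators as plain text in notebooks.
sklearn.set_config(print_changed_only=False, display="text")
full_repr = repr(model)
print(full_repr)
```

Note that `set_config` changes a global setting, so it affects every estimator printed afterwards in the same session.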

dusty forge
# tidal bough aha, I got it

🥳 that was it! Ahhhh that cheeky instructor must have done this in a stealth way as he shows opening the notebook without having this line of code lool ... but yes, now I can see all of it, weird that it's on (or True) by default

#

Thanks a lot, found it in the documentation, bookmarked 🙂

mighty thicket
#

Does anyone know how to build trading algos using ict concepts?

long canopy
#

anyone got stats/papers on performance wrt. degree of quantization

lapis sequoia
#

Hiho, i am currently reading Think Bayes 2, and have some questions about the python code in the book. In Chapter 19 (https://allendowney.github.io/ThinkBayes2/chap19.html), there is a function call like this.

from scipy.stats import gamma
from scipy.stats import poisson

alpha = 1.4
sample_prior = gamma(alpha).rvs(10)
sample_prior_pred = poisson.rvs(sample_prior)

gamma(alpha).rvs(10) gives me 10 random samples from a gamma distribution with alpha set and the poisson.rvs uses the values in the array for lambda and then random samples the distribution?
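
A runnable version of that reading, assuming SciPy and NumPy are installed. The `random_state` seeds are an addition here, only so the sketch is repeatable:

```python
from scipy.stats import gamma, poisson

alpha = 1.4

# Ten draws from Gamma(alpha) with the default scale of 1: ten candidate rates.
sample_prior = gamma(alpha).rvs(10, random_state=0)

# The array is used as mu (lambda) elementwise: one Poisson draw per rate.
sample_prior_pred = poisson.rvs(sample_prior, random_state=1)

print(sample_prior.shape)       # one rate per draw
print(sample_prior_pred.shape)  # one count per rate
print(sample_prior_pred)
```

Because `mu` is broadcast, the output has the same shape as `sample_prior`, and entry `i` is a count drawn from a Poisson distribution whose rate is `sample_prior[i]`.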

agile owl
#

so my kernel death on GPU-> CPU dump came back on my threadripper system but it doesn't happen on cloud. I wonder if it's because of ECC memory or something like that

#

i think it's a low level bug because I can't really narrow it down to one situation where it happens

long canopy
lapis sequoia
#

I did and i don't get it really, i tried to make sense of it by just extracting the code above but this is just what i assume now.

tidal bough
lapis sequoia
#

Ok i was not sure if the values in the array are used as mu (lambda) for the poisson distribution. Thank you very much 🙂

final kiln
#

Im uninstalling MLFlow and writing my own client to their rest interface

#

They do have a client class, but I swear, it does everything except log params and I don't understand why

#

The context manager that starts a run defines a global context somewhere, and then you use a globally defined log_params function to log parameters

#

It works fine but once you get into multiprocessing you needlessly need to reconstruct their global context without interfering with the state of the run

#

It's just a rest API

#

It's a barebones rest API

#

Why not just a class with a couple methods and some fields encapsulating a session

#

So I can pass it down to my subprocesses, it's just so simple, I must be missing something
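
A rough sketch of what such a class could look like, using only the standard library. `TrackingSession` and its method names are hypothetical, and the endpoint paths and payload fields (`runs/log-parameter`, `runs/log-metric`) are assumed from the MLflow REST API docs, so check them against the server version in use:

```python
import json
import time
import urllib.request
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingSession:
    """Plain handle on one run: two strings, easy to hand to subprocesses."""

    base_url: str  # e.g. "http://localhost:5000" (hypothetical server address)
    run_id: str    # ID of a run that the parent process already created

    def _request(self, endpoint: str, payload: dict) -> urllib.request.Request:
        # Build (but do not send) a JSON POST for the tracking server.
        return urllib.request.Request(
            url=f"{self.base_url}/api/2.0/mlflow/{endpoint}",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )

    def param_request(self, key: str, value) -> urllib.request.Request:
        payload = {"run_id": self.run_id, "key": key, "value": str(value)}
        return self._request("runs/log-parameter", payload)

    def metric_request(self, key: str, value: float, step: int = 0) -> urllib.request.Request:
        payload = {
            "run_id": self.run_id,
            "key": key,
            "value": value,
            "timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "step": step,
        }
        return self._request("runs/log-metric", payload)

    def send(self, request: urllib.request.Request) -> dict:
        # Network call: needs a reachable tracking server.
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read() or b"{}")


session = TrackingSession("http://localhost:5000", "abc123")
req = session.param_request("lr", 1e-3)
print(req.get_method(), req.full_url)
print(req.data)
```

Building the request separately from sending it keeps the object free of global state, so a child process only needs the two fields to log against the same run.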

odd meteor
final kiln
#

I had to define an env variable so that the other processes pick up on the run that the parent created

final kiln
arctic wedgeBOT
#

mlflow/tracking/client.py line 727

def log_metric(
final kiln
#

I'm taking a break, I am not sure how I missed that, but it was frustrating

dusty forge
grand breach
#

how good is rtx 2050 mobile for inferencing AI models

serene scaffold
#

when looking at a CUDA-enabled GPU, the main question is "how much memory does it have?". It needs to have enough for the model you want to run on it.
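
As a back-of-the-envelope check, the weights alone need roughly (number of parameters) x (bytes per parameter). The sketch below assumes common storage widths and ignores activations, KV cache and framework overhead, so treat the numbers as a lower bound; the model sizes are illustrative:

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GB (10**9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Storage widths: fp16/bf16 = 2 bytes, int8 = 1 byte, 4-bit = 0.5 byte.
widths = [("fp16", 2), ("int8", 1), ("4-bit", 0.5)]

for name, n_params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
    sizes = {label: weight_memory_gb(n_params, width) for label, width in widths}
    print(name, sizes)
```

Compare the result with the GPU's memory: if the weights do not fit, the model needs quantization, offloading, or a bigger card.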

long canopy
#

what are people calling "agents"?

serene scaffold
# long canopy what are people calling "agents"?

an agent is a system that is part of and interacts with some environment. which can be the real world, or a simulation. so a self-driving car is an "agent". or an automated opponent in a multi-player game.

long canopy
#

makes sense, but at a python-level, what is it? just a fine-tuned model?

serene scaffold
#

an agent doesn't have to be created from machine learning.

long canopy
#

hmm I see

#

does it have a specific meaning in the context of ML?

serene scaffold
#

it could also be a model that isn't a fine-tuned model. do you know what sets a fine-tuned model apart from one that isn't?

serene scaffold
#

that something is an agent is a description of its external behavior. not a statement about its implementation.
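
To make that concrete at the Python level, here is a toy sketch with no ML in it at all. `Agent`, `ThermostatAgent` and `run` are made-up names for the illustration:

```python
from typing import Protocol


class Agent(Protocol):
    """Anything that maps an observation of its environment to an action."""

    def act(self, observation: float) -> str: ...


class ThermostatAgent:
    """Hand-written rules, no machine learning involved."""

    def __init__(self, target: float) -> None:
        self.target = target

    def act(self, observation: float) -> str:
        if observation < self.target - 1:
            return "heat"
        if observation > self.target + 1:
            return "cool"
        return "idle"


def run(agent: Agent, temperature: float, steps: int) -> list[str]:
    """Toy environment loop: the agent's actions change what it observes next."""
    effect = {"heat": 1.0, "cool": -1.0, "idle": 0.0}
    actions = []
    for _ in range(steps):
        action = agent.act(temperature)
        temperature += effect[action]
        actions.append(action)
    return actions


print(run(ThermostatAgent(target=20.0), temperature=17.0, steps=5))
```

Swapping `ThermostatAgent` for a class whose `act` calls a trained model would not change the loop, which is the sense in which "agent" describes behavior and not implementation.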

long canopy
#

lora, classification layers, etc.

serene scaffold
#

hmm, that isn't quite right

long canopy
final kiln
long canopy
#

from what I've read up, fine-tuning is making a model more adept for a task. there's a couple of techniques out there, but i think more training is also fine-tuning no?

serene scaffold
#

when you create a model initially, you start with random weights.

fine-tuning is when you take a model that's already been trained (so the weights are no longer random) and continue to adjust them. either for the same task, or a different task.
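
A tiny sketch of that distinction, using a one-variable linear model and plain gradient descent so it runs without any ML library. The two "tasks" are made-up lines, and this illustrates the idea only, not how large models are fine-tuned in practice:

```python
import random


def train(w, b, data, lr=0.05, epochs=200):
    """Plain gradient descent on mean squared error for y = w*x + b."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


def mse(w, b, data):
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
xs = [i / 10 for i in range(-20, 21)]
pretrain_data = [(x, 3.0 * x + 1.0) for x in xs]  # original task
finetune_data = [(x, 3.2 * x + 0.8) for x in xs]  # related task

# Creating a model initially: start from random weights and train.
w0, b0 = rng.uniform(-1, 1), rng.uniform(-1, 1)
w_pre, b_pre = train(w0, b0, pretrain_data)

# Fine-tuning: start from the trained weights and keep adjusting them.
few_steps = 5
w_ft, b_ft = train(w_pre, b_pre, finetune_data, epochs=few_steps)
w_scratch, b_scratch = train(w0, b0, finetune_data, epochs=few_steps)

print("fine-tuned  :", mse(w_ft, b_ft, finetune_data))
print("from scratch:", mse(w_scratch, b_scratch, finetune_data))
```

With the same small training budget, the run that starts from the already-trained weights ends up much closer to the new task than the run that starts from random weights.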

long canopy
#

lora stuff, finetuning layers, prefix tuning, etc.

long canopy
#

most of the newer ones don't even use the original weights

#

or even touch them

#

(lora)

serene scaffold
lapis sequoia
#

Hi does anyone knows how I can choose an ml algorithm for my dataset and any resources to build an end to end project

agile cobalt
# lapis sequoia Hi does anyone knows how I can choose an ml algorithm for my dataset and any res...

the very first thing you have to think about is which problem you are solving (regression, classification, forecasting etc)

after understanding the problem, look up models for it on popular libraries like sklearn or pytorch (if it's something simple you may want to start with sklearn ; if it requires freeform text or images, disregard most of the rest of this message)

after that, you may have to do some transformations, e.g. transform categorical string data into ordinal or one-hot encodings, as well as consider some feature engineering (adding new columns based on others such as calculating area given width and length)

after that, throw it in a model from the library you picked, try cross-validation, start playing with hyper parameters, test different models etc.

As far as resources goes, sklearn's documentation and their INRIA Mooc Course are a good place to start (assuming you're going to be working with sklearn, no images/free form text/alike)
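
The steps above can be sketched roughly as follows, assuming scikit-learn, pandas and NumPy are installed. The toy columns (`width`, `length`, `material`) and the "large item" target are invented for the example:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 200

# Hypothetical tabular data: two numeric columns and one categorical column.
df = pd.DataFrame({
    "width": rng.uniform(1, 10, n),
    "length": rng.uniform(1, 10, n),
    "material": rng.choice(["wood", "steel", "glass"], n),
})

# Feature engineering: derive a new column from existing ones.
df["area"] = df["width"] * df["length"]

# Made-up classification target: a noisy "is this item large?" label.
y = (df["area"] + rng.normal(0, 5, n) > 30).astype(int)

# Transformations: scale numeric columns, one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["width", "length", "area"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["material"]),
])

# Model plus 5-fold cross-validation; swap in other estimators to compare.
model = make_pipeline(preprocess, LogisticRegression(max_iter=1000))
scores = cross_val_score(model, df, y, cv=5)
print(scores.mean())
```

Keeping the preprocessing inside the pipeline means each cross-validation fold fits the scaler and encoder on its own training part only.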

lapis sequoia
#

Thank youu so much

lapis sequoia
agile cobalt
#

you can browse Kaggle or collect data yourself

final kiln
#

so this is how I deploy a server, while getting free observability

I set up a self hosted runner on the remote machine, and I get a workflow that runs the program, the logs display here and I even get an email if it goes down, dont have to pay no cloudwatch no anything

#

ofc, secret sharing is breeze too

lapis sequoia
iron basalt
long canopy
#

worth keeping an eye out for the new galore technique

grand breach
serene scaffold
#

Llama2 models range from 13 to 139 GB

solemn hedge
#

hey guys, i'm unable to install packages in the virtual environment that i created for a project. i was trying to download tensorflow but it throws the error unable to create process "python.exe" and "pip.exe"

grand breach
#

not sure about the correctness

final kiln
unkempt dew
#

any sources (books/courses or any other source) to learn statistical inference with python ? I'm an undergraduate BS Business data analytics student and I took an introductory course for python last semester. The title of my current course is Statistical inference for business analytics. What I know is that it involves working with data frames/datasets - using libraries like numpy pandas etc.

unkempt dew
versed pilot
#

but what is your course involving in a practical way, is your course using Python or using other tools? It makes sense to focus on your course first before looking at other books that might be going too far for you

final kiln
# final kiln So this is the pattern I'm gonna be using to develop and deploy ML in a cost eff...

Some adjustments for this.

Instead of deploying the prefect flows to prefect cloud, I'm gonna use GitHub actions to provision cloud and deploy a prefect worker there. Same effect but the cloud management is moved back from prefect. The objective is to use skypilot to automate the provisioning instead of me having to go through the console to configure stuff

So how does this look like.

Two critical points:

  • secrets are all centralized on the GitHub repository
  • once stored, they never get moved by a human again

This means that not even prefect is gonna see any secrets. I'm reducing the attack surface to GitHub and to the human developing stuff, which can be further minimized by proper segregation of secrets (dev vs stage vs prod).

All workflows will follow the same logic, that is, the development deployment (which is a vscode tunnel) is very similar to staging which is very similar to prod, cuz well, they all build on top of the same base image, the one that's meant to be used for production.

For training:

  1. Click on GitHub actions workflow dispatch trigger (manual trigger)
  2. A GitHub hosted runner starts and uses skypilot to provision infrastructure, it then sets up a self hosted runner on the new machine and yields control to that new runner
  3. Runner will just do minor setup, stuff like logging in to prefect cloud, since this is technically the production phase of the project, it's using the production image
  4. A prefect worker is deployed from the pipeline, this appears in the prefect cloud UI, where I can trigger the pipeline how many times I want and with the hyper parameters I want
  5. When I'm done, I turn it off, control is yielded back to GitHub, which starts a new runner and de provisions cloud
  6. Fault tolerance might be tricky, but skypilot handles the transfer of checkpoints when an instance is removed. So all I have to do is to code check pointing through files. - this is not critical though, the instances last long enough for my purposes.
#

This looks like I'm giving myself a lot of work, but actually, most of it is already coded, I'll just have to take my current development deployment workflow and remove steps from it. Well, the skypilot thing will take work, but really, I don't need it until I have a need for fault tolerance. My quota for GPU doesn't really give me much choice on the machine type anyway and I don't really want to increase my GPU burn rate if I can help it

bronze prism
native bear
#

Could someone guide me on creating machine learning algorithms? I want to create a letter recognition software as my first project.

dusty forge
#

I'm following a tutorial on XGBoost, and apparently np.int was removed in the newer version of numpy, how can I solve this? Is there a line of code I can run that converts all np.int to int? I don't use/see np.int in my notebook, so I guess it's somewhere hidden.

tidal bough
#

Is there a line of code I can run that converts all np.int to int
Use find-and-replace in your IDE? :p

#

I don't use/see np.int in my notebook, so I guess it's somewhere hidden.
If it's in a library, then it's not really up to you.

#

by "was removed", you mean the library fails with an error when trying to use it? if so, probably the best solution is to downgrade numpy.

#

(the other solution is to, well, modify the library you're using to be compatible with modern numpy, by replacing np.int with int everywhere)

#

(or, hmm, technically speaking there's a cursed solution - you could try doing np.int = int. Just make sure you do it before importing those libraries that use it.)
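
A sketch of that last workaround, assuming NumPy is installed. `np.int` was only ever an alias for the builtin `int`; it was deprecated in NumPy 1.20 and removed in 1.24:

```python
import numpy as np

# Restore the removed alias if it is missing. This must run before importing
# the library that still refers to np.int.
if not hasattr(np, "int"):
    np.int = int

# Old-style code such as this works again.
print(np.int("42") + 1)
```

This patches the `numpy` module for the whole process, so it is best kept to a notebook or a throwaway script; pinning an older NumPy or upgrading the offending library is the cleaner fix.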

dusty forge
unkempt dew
# versed pilot but what is your course involving in a practical way, is your course using Pytho...

Here are the course contents.
My instructor mentioned that we will be using R as well but for now it's just python. One more thing - I don't trust my instructor as it's been four weeks since the start of my semester and I have sensed that she knows absolutely nothing about programming (about python, to be more specific). Her background is all statistics but she knows nothing about programming. Last week we restarted from scratch and this time she started with the theory from the book "STATISTICS UNLOCKING THE POWER OF DATA" (written by Robin H. Lock and three more ppl)
Thank you for the response. Are there any video tutorials/courses I can use with this book to enhance my learning ?

final kiln
#

so this is a runner on the aws machine

#

and this is a runner on my laptop

#

why does the aws machine take a full minute to init a container ?

hollow sentinel
#

i'm trying to think of a story i can tell with this data

final kiln
#

it's a list of awards ?

hollow sentinel
final kiln
#

would be cool to show if it is distributed equitably,

yeah no further ideas yet

#

but a density map could be cool, since you have location

hollow sentinel
hollow sentinel
#

i can maybe steal a few

#

but i'd have to contextualize them in terms of this dataset

versed pilot
# unkempt dew Here are the course contents. My instructor mentioned that we will be using R as...

ok your course is fairly introductory, the book I linked goes way beyond, I don't think it's appropriate for you at this point. Focus on understanding what is in your course, these are fundamental concepts on which to build on later. Your instructor will know statistics, this is the point of the course. Try and get what you can get from the course, and if you have extra time, try and find the python way of doing what you are learning in the course. Python being python, there are multiple ways, https://docs.scipy.org/doc/scipy/reference/stats.html is one of them.

final kiln
final kiln
#

it's cool that it is interactive, but like, I can select the nodes, but then nothing changes, so why let me select

desert oar
desert oar
hollow sentinel
#

tableau is huge

desert oar
#

the idea is that someone with some engineering knowledge sets it all up, and then non-engineer business analysts can do fairly powerful data analysis with it

final kiln
final kiln
final kiln
desert oar
#

include your thesis in your project list of course and indicate that it was a thesis project

past meteor
#

Tableau and PowerBI are important tools

desert oar
#

@hollow sentinel if you want an in-python data viz suggestion for geospatial data, check out Geoviews

past meteor
#

I've seen multiple work projects squander months on something that could've been done in a week

final kiln
past meteor
#

This predates me but someone even set up a system where $customer could make their own graphs in the frontend because he was sick and tired of dealing with new questions/modifications

past meteor
#

The downside of PowerBI is the licensing fees

#

Enterprise starts at €3995 per month last time I checked

desert oar
past meteor
#

Honestly, there's probably no better way because running superset or other FOSS BI tools will cost you half an FTE or more and that'll potentially be more than just paying for PBI enterprise

final kiln
odd meteor
# unkempt dew Here are the course contents. My instructor mentioned that we will be using R as...

😀 She probably knows enough to be your prof. More so, if her background is indeed in Stats, then she most definitely will be very good in R.

Bottom line: Just keep an open mind. Whether you do most of your class in R or Python, you'll surely enjoy your class - - so long as your prof. background is in Stats, and she can explain / breakdown seemingly hard stuff very well, and above all she's pretty solid in either R/Python.

junior stone
#

This is the best hybrid Multimodal Search I've seen yet

What do yall think?

https://skm.ai/

long canopy
#

am wondering if I can make opening python files with emacs trigger cloud instances in a cost-efficient way, and close them when i'm done

final kiln
past meteor
arctic wedgeBOT
#

6. Do not post unapproved advertising.

long canopy
#

dang

final kiln
# long canopy 1 minute is the shortest?

I'm not entirely sure why, I think it might have something to do with whatever technology uses the AMI's, I suspect it's docker based but I haven't looked into it

#

On my laptop it's 2 sec or so

#

Don't matter. I just have a bug to fix and I can finally start doing some training. It's all pieced together rn.

#

I do have to implement the other attention mechanisms, like the scaled dot product.

#

Then mine is gonna be implemented directly in cuda kernels, which I'll have to figure out how to bind into cpp torch, and then into rust torch

#

I might also try to find the best implementation of scaled dot product, so I can compete with it.

sinful surge
#

!pip install "tensorflow-text>=2.11"

ERROR: Could not find a version that satisfies the requirement tensorflow-text==2.11.* (from versions: none)
ERROR: No matching distribution found for tensorflow-text==2.11.*

why

tidal bough
#

my guess would be "you tried to install it on a python version it doesn't support, like 3.12".

feral kernel
#

Hi!

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb_config,
    device_map='auto',
    trust_remote_code=True,
    use_auth_token=True,
)
able_parameters()
print(model)

ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes

I keep getting this error even after I updated bitsandbytes

sinful surge
tidal bough
sinful surge
tidal bough
#

For this version of tensorflow-text, python 3.10 would do.

sinful surge
sinful surge
#

Anyone has idea about this ?

final kiln
#

funny how it doesn't overfit, I wonder if I have data leakage

past meteor
#

or nightmare, now you have to dig for leakage

#

Y'know what I do? Aggressively split data and have a final final final holdout/test set

#

If I leaked along the way it's "OK" because the "official" metrics are those on the final test set

spring field
final kiln
#

they are getting close to 100%

past meteor
#

Ah nice, then it's legit

final kiln
#

this is a very small mataformer too, 1 layer, 2 attention heads

past meteor
final kiln
#

the eval is just a small subset of the test dataset, which I'd then remove from the test dataset ofc

#

the original doesn't split into eval

#

so I chunked it from the test split

past meteor
#

What do you mean with eval?

final kiln
#

the split you use to do hyper parameter tuning

past meteor
#

All I mean is that I split off say 20 % off of my full dataset and never touch it to the very end

final kiln
#

my understanding is that those 20% are your test split

past meteor
#

The rest of the 80 % is then split off into train val and test

final kiln
#

yes

#

that's what I did effectively

past meteor
#

The key is to touch the 20 % you split off first just once or twice max

#

With a small set of models

final kiln
#

yeah it's not even coded yet, I'm not touching it at the moment

past meteor
#

If you leaked within the 80 % it's totally ok, then the results will be representative on the 20 %

#

Unless you mega leaked into that 😂
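
The splitting scheme described above can be sketched with the standard library alone. The 70/15/15 split of the working set and the integer "dataset" are arbitrary choices for the illustration:

```python
import random


def split(items, fraction, rng):
    """Shuffle a copy and split it into (first part, rest) by fraction."""
    items = list(items)
    rng.shuffle(items)
    cut = round(len(items) * fraction)
    return items[:cut], items[cut:]


rng = random.Random(0)
dataset = list(range(1000))  # stand-in for example IDs

# 1. Lock away 20 % first and leave it alone until the very end.
final_holdout, working = split(dataset, 0.20, rng)

# 2. Split the remaining 80 % into train / val / test for day-to-day work.
train, rest = split(working, 0.70, rng)
val, test = split(rest, 0.50, rng)

print(len(final_holdout), len(train), len(val), len(test))
```

Any leakage between `train`, `val` and `test` then only distorts the day-to-day numbers; the metrics reported from `final_holdout` stay honest as long as it is evaluated once or twice with a small set of models.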

final kiln
#

I'll still check for leakage, just for peace of mind

#

but I'm quite happy with this

#

I'm gonna have to upgrade the MLFlow machine, the aws free tier can't handle more than one request at a time

past meteor
#

This is the nice thing about on premise

#

But the rest sux

final kiln
#

I can deploy a service locally in the runner

#

but it starts to overload the pipeline runner

past meteor
#

I wanted MLFlow? Set up Caddy on a machine behind my VPN in <15 mins, reverse proxy MLFlow and it's done