#data-science-and-ml

1 messages · Page 308 of 1

serene scaffold
#

it's really hard to read

lapis sequoia
#

or both

serene scaffold
#
with open((('ep'+(str(int((list(set(list(f['episode_id'])))).index(id)) + 1)))+'.txt'), 'w') as episode:

this could be

with open(f'ep{id_}.txt', 'w') as f:
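
A minimal sketch of that simplification, assuming the episode number is just the 1-based position of each distinct `episode_id`. The `episode_ids` list, the `episode_number` mapping and `episode_filename` are illustrative names that do not appear in the chat. First-seen order is used here, whereas the original `list(set(...))` gives no guaranteed order.

```python
# Hypothetical stand-in for f['episode_id']; the real column is not shown in the chat.
episode_ids = ["s01e03", "s01e01", "s01e03", "s01e02"]

# Number each distinct id once, in first-seen order (1-based).
# dict.fromkeys keeps insertion order and drops duplicates.
episode_number = {
    ep_id: n for n, ep_id in enumerate(dict.fromkeys(episode_ids), start=1)
}


def episode_filename(ep_id):
    """Return the output file name for one episode id."""
    return f"ep{episode_number[ep_id]}.txt"


print(episode_filename("s01e01"))  # ep2.txt
```

Building the mapping once also avoids rebuilding the set and searching the list for every episode.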
lapis sequoia
#

my system is too weak

#

can anybody run the code on their system

#

and send me a tar or zip of all the txt files generated

#

if the file is too large for discord then upload it to github or google drive and send me a link

#

I'd be forever grateful if anybody did this for me

whole vortex
#

@lapis sequoia

#

What does that do

#

Also, can someone help me with my problem

#

I'm trying to create an invoice generator based on a spreadsheet with the columns "name", "product", "quantity" and "date".
Each product has a price and each row represents a customer's purchase

#

I want to set the price locally such that the invoices generated reflect this

#

I was thinking about listing each product once in a csv file with the corresponding prices

#

Is this the best way of tackling this problem?

#

I would have a separate column but it's just more time
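
One way the price-list idea could look, as a rough sketch only: a separate price table with one row per product, joined to the purchases by product name. The inline CSV text, the product names and the prices are made up for illustration. Only the column names "name", "product", "quantity" and "date" come from the question.

```python
import csv
import io
from collections import defaultdict

# Hypothetical price list: one row per product (stand-in for a prices CSV file).
PRICES_CSV = """product,price
widget,2.50
gadget,10.00
"""

# Hypothetical purchases sheet with the columns named in the question.
PURCHASES_CSV = """name,product,quantity,date
Ana,widget,4,2021-03-01
Ana,gadget,1,2021-03-02
Ben,widget,2,2021-03-02
"""


def load_prices(text):
    """Map product name -> unit price."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["product"]: float(row["price"]) for row in reader}


def invoice_totals(purchases_text, prices):
    """Sum quantity * unit price per customer."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(purchases_text)):
        totals[row["name"]] += int(row["quantity"]) * prices[row["product"]]
    return dict(totals)


prices = load_prices(PRICES_CSV)
totals = invoice_totals(PURCHASES_CSV, prices)
print(totals)
```

Keeping prices in their own table means a price change is made in one place instead of in every purchase row.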

flint mason
#

how can I return graph inline on a webpage from my local python function
I am getting a pop out instead
I tried using matplotlib inline

#

anyone

indigo skiff
#

Hello. I am trying to find projects (dissertations which involve Autonomous systems) so i can pick a specific topic for research. Here are some quick details about projects involving autonomous systems. For persistently autonomous systems, planning to achieve goals and executing plans is not enough. A persistently autonomous AI also requires the means to formulate, select, or reject goals over time. In addition, goals may be suggested by human operators, or other collaborating agents. To perform predictive decision-making online, autonomous agents must also be able to parse what data they can sense into a coherent model of the world. For example, by parsing a 2-dimensional map as a topological semantic map, or historical navigation data to estimates of navigation durations. Projects in this area investigate the tools required to make persistently autonomous systems.
Example Project: Automated Modelling for Autonomous Systems
The goal of this project is to develop tools to automatically build models from data, to be used with automated systems that use search-based AI to operate in the real world. The project could employ a variety of techniques such as clustering, recommender systems, neural networks. Application areas include (but are not restricted to) robotic inspection, disaster recovery, and operating in ocean and space domains.

mint palm
#

is there a way to avoid this loop over all parameters......cuz in reality there are too many parameters .......wont it be slow to apply gradient check

slate hollow
#

but shouldn't that be enough? all you have to do is keep driving, and you'll eventually reach the destination
but knowing the gradients doesn't really help. say we were right at the foot of a deep slope like so: / ---/ <-- we're here. the gradient would tell us to go really far, even though the minimum is only 1 or 2 units away

ripe forge
#

each jump has to be a teleport, because you dont know "how far" to drive. driving is a bunch of spontaneous instantaneous movements yeah.

#

where we could always look out the window

#

now imagine if suppose the windows got tinted and darkened. and you had to completely stop the car for them to un-tint

#

so you had to set a timer for yourself... i'll look out every 10km. or something like that

#

i dont even know. anyways. yes. jumps. not drives. you're not rolling down a pre-defined landscape, because if you knew the landscape you'd already know the destination

slate hollow
#

but still, all you have to know is the general direction (the sign of the gradient) and move that way right?

ripe forge
#

"move that way" doesn't exist.

slate hollow
#

oh yeah

ripe forge
#

how do we move if we only have to teleport

slate hollow
#

just adjust the weight a certain way

#

add or subtract it

ripe forge
#

yep. which are discrete shifts. not a smooth movement

#

you could "mimic" a smooth movement by making the learning rate super small

#

and then it just takes a few years

#

but you'll get there 😛

slate hollow
#

but that still doesn't explain why we multiply it by the gradient as well

ripe forge
#

well the gradient has the sign

quasi sparrow
#

Does anybody know an alternative to tensorflow "tf.io.gfile.GFile"?

ripe forge
#

and the steepness has the inclination of like the ...hot cold game.

slate hollow
#

true

ripe forge
#

you're steep when you're very far.

quasi sparrow
#

I can't use the API to open the checkpoints file

ripe forge
#

(or so we hope. it's not actually *always* true. local minima exist)

#

but it's still generally more useful to take the gradient than disregard it, empirically

slate hollow
#

ok cool
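
The point about multiplying by the gradient can be sketched on a toy problem. The function f(x) = x**2, the starting point and the learning rate below are assumptions chosen for illustration, not anything from the chat. The sketch compares fixed-size jumps that use only the sign with jumps scaled by the gradient, which are large when the slope is steep and shrink near the minimum.

```python
def grad(x):
    """Gradient of f(x) = x**2, whose minimum is at x = 0."""
    return 2.0 * x


def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def descend(x, steps, lr, use_magnitude):
    """Take `steps` discrete jumps downhill and return the final position."""
    for _ in range(steps):
        g = grad(x)
        # Either scale the jump by the gradient, or use only its direction.
        step = lr * g if use_magnitude else lr * sign(g)
        x -= step
    return x


start = 10.0
by_gradient = descend(start, steps=50, lr=0.1, use_magnitude=True)
by_sign = descend(start, steps=50, lr=0.1, use_magnitude=False)
print(abs(by_gradient), abs(by_sign))
```

With these settings the gradient-scaled run ends very close to 0, while the sign-only run has only covered half the distance after the same number of jumps.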

small dew
#

is it possible to train GANs network on Google Colab?

spiral peak
#

Hi! I need opinions on a thing. Do y'all have a preference between PyTorch and Tensorflow? My use case is specifically for processing image data.

heady tide
#

Hey guys, I am trying to build a LSTM RNN for speech recognition, however the labels are not evenly distributed, basically there are a lot of "Silent" sounds in the train/test set
Train:

heady tide
#

Validation

spiral peak
heady tide
#

I was thinking about splitting the network into two parts, one for the silent label and the other for others

small dew
spiral peak
small dew
#

np

dapper halo
#

Was curious about anyone's thoughts on the prediction values reaching some limit, like what is shown in the rightmost plot. Anyone have any ideas on what may cause this? Perhaps just bad data, or could it be a certain activation function or some other part of the structure of the network?

Dashed line are the true values, colormapped scatter are the predicted values

glad raft
#

those are pretty graphics

#

I'm having issues trying to run abs(complex128) inside of a numba.jit decorator.
any suggestions?

#

how does the numba.jit or numba.cuda.jit decorator work, does it precompile something? is the issue that c doesn't have a complex128 datatype?

#

@dapper halo I'm just learning about all this now. If this is meant to be a plot of accuracy after training, isn't the point where accuracy trends off like that usually indicative of some sort of overtraining?

exotic maple
#

@glad raft I'm sure you understand no one is being paid to "help" in here, right? This isn't a paid tech support service...

People help when / if they know.

dapper halo
# glad raft @dapper halo I'm just learning about all this now. If this is meant t...

This is not a plot of the loss function, it is just an x,y generated from the predicted vs true values...so accuracy,yes but not what you're thinking of. So it is concerning that the predicting range doesn't encompass the full range. Also its only trained on 5 epochs so I dont think theres a shot in hell it's overfitting haha.

Its probably just due to the degeneracies in my data though.

dapper halo
glad raft
#

@exotic maple did I suggest that something included a fee? I didn't mean to if I did

exotic maple
glad raft
#

Oh no I wasn't upset just babbling to myself because I was tired. Sorry for the confusion

exotic maple
#

ha, no prob. Maybe I'm upset myself because of this stupid plotly chart -grumbles-

glad raft
#

I haven't even started with plotly yet, I'm still trying to figure out speedup and gpu parallelization. I think I'll always be jealous of the people that make pretty graphics

mint palm
#

Aren't both the 2nd and 3rd statements correct?

#

The answer given is just statement 2

#

Why is statement 3 wrong

glad raft
#

I did make a 1000 level Mandelbrot pyplot contourf plot though

mint palm
#

Can someone plz clear it.....thanks

grave frost
#

A lot of models and techniques are supported out-of-the-box in TF, so it might be easier to use that. but for research-level models - pytorch all the way

glad raft
#

Isn't one iteration more than one epoch? Does one iteration run through all epochs?

#

What's wrong with statement one?

glad raft
#

An epoch has to do with dropouts right?

dapper halo
lapis sequoia
#

yo does anyone know an entry level tensorflow course which doesnt use a premade dataset

#

specifically: image recognition with keras

desert oar
#

im not sure you can teach machine learning without a premade dataset

tropic tendon
#

hey i am really new to machine learning and neural networks with python.. could anyone recommend an explanatory course?(free oc)

copper ridge
tropic tendon
#

thx

copper ridge
#

np

tropic tendon
#

a lot

neon marsh
#

Hey guys, I really want to do something with HAR (Human Activity Recognition). I saw something about a camera being able to identify different yoga poses someone does. Does anyone know what I have to learn/know to be able to do this or where to start?

thorn delta
#

Hello everyone, I am planning to do some project in DL( specifically image classification) and I am looking into depth perception(in an image), zero-shot recognition, optimal transport theory, something related to model compression etc. I am new to this field and a prof I discussed with gave me some of the above topics to look into. If anyone is familiar with the above topics or knows something related, can you tell me any related topics that are interesting to work on? (Also the project is research-based and I am an undergrad)

lapis sequoia
tidal bough
#

@lapis sequoia Have you tried Google Collab?

grave frost
# lapis sequoia yes

then something is prob wrong with your code. check out for memory leaks and incorrect data chunking/loading

lapis sequoia
grave frost
mint palm
#

Will cost optimisation be slower when using a high value of beta (0.9) rather than a small one (0.5) when using gradient descent with momentum?

grave frost
#
The site at https://discuss.pytorch.org/t/how-to-convert-audio-e-g-wav-to-tensor-and-back/18345 has experienced a network protocol violation that cannot be repaired.

The page you are trying to view cannot be shown because an error in the data transmission was detected.

    Please contact the website owners to inform them of this problem.

And then people ask me why I don't use torch. Figures would happen to me especially when I am in a hurry

obtuse iron
#

Hey guys I really want to start with machine learning for my own do you have like a good documentation for pytorch or tensorflow ?
(I learned the theoretical part of machine learning and deep learning but I can't find anything good about the 2 libraries)
I would appreciate if you could help me : )

tidal bough
# obtuse iron Hey guys I really want to start with machine learning for my own do you have lik...

I have this primer for some basic DS/ML libraries: https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/coursera/week1_intro/primer/recap_ml.ipynb
And these ones for TF and Pytorch:
https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/coursera/week1_intro/primer/recap_tensorflow.ipynb
https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/coursera/week1_intro/primer/recap_pytorch.ipynb

All of these come from the Practical RL course on coursera - it assumes the participants are already somewhat familiar with this stuff, so provides primers in case they aren't.

#

Though if you want "documentation" - well, both of these libraries have extensive docs, with guides and examples.

obtuse iron
#

Thank you I searched for this but I just wanted to learn how I use the libraries and I think my question is done now thx

grave frost
#

I can't open any PyTorch docs on my side 😞 (protocol violation error)
Can anyone tell me how to crop/pad a tensor?

#

basically, I have multiple tensors which I want to stack - but they have different shapes

tidal bough
#

uhh, can't you just use indexing?

grave frost
#
torch.Size([2, 1321967]), [2, 1323119] ....etc

so it's on the horizontal axis. any idea how I can do that?

grave frost
tidal bough
#

like,

min_len = min(t.shape[1] for t in tensor_list)
shape = (slice(None), slice(None,min_len)) # this is a tuple of slices equivalent to [:,:min_len]
cropped_tensors = [t[shape] for t in tensor_list]
grave frost
#

ooh, so each index in the shape var corresponds to the axis eh?

tidal bough
#

Yup

#

It's just normal slicing, equivalent to t[:,:min_len]

#

actually, uhh

#

why didn't I just do that lol

#

you only need to use slice directly when, say, there's a variable number of axes too

#

here, you can just do t[:,:min_len]
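
For the case mentioned above where the number of axes varies, a tuple of `slice` objects can be built on the fly. This is only a sketch: it uses NumPy arrays as a stand-in for the tensors, on the assumption that indexing with a tuple of slices behaves the same way there, and `crop_last_axis` is a hypothetical helper name.

```python
import numpy as np


def crop_last_axis(arrays):
    """Crop every array to the shortest length along its last axis."""
    min_len = min(a.shape[-1] for a in arrays)
    cropped = []
    for a in arrays:
        # One full slice per leading axis, then a bounded slice for the last axis.
        index = (slice(None),) * (a.ndim - 1) + (slice(None, min_len),)
        cropped.append(a[index])
    return cropped


# Hypothetical stand-ins for the two-channel tensors in the chat.
arrays = [np.zeros((2, 7)), np.ones((2, 5))]
stacked = np.stack(crop_last_axis(arrays))
print(stacked.shape)  # (2, 2, 5)
```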

grave frost
whole vortex
#

Hey guys, I'm going through some past paper questions and am struggling to make sense of it

stoic fable
#

Hey yall was just wondering if anyone is good with the Altair visualization library?

tidal bough
#

you can also just allocate a tensor of the target size and copy the original tensor into it using slice assignment:

res = torch.zeros(target_shape,dtype=orig.dtype)
copy_tup = tuple(slice(None,l) for l in orig.shape) # this copies to the left-top corner, change the positions if necessary 
res[copy_tup] = orig
thorny kite
#

any career advice here in this field ?

lapis sequoia
#

Hey , I just joined this server! Seems like a really great place for learning and exchanging. I got a question, I don't know if this is the right place but I was asking myself what could be the most fundamental, let's say book, that I could read on Artificial Intelligence. Like good fundamental information to start the whole process of learning something about AI, Coding, Computer Science... thank you in advance.

polar dock
lapis sequoia
polar dock
#

Hi yall!
Does anyone have any tips on how to optimize the below operation?

def calc_wm(W,X):
    N = len(W)
    sum_W = np.sum(W)
    sum_WX = np.sum(W*X)
    if N>0 and sum_W>0:
        return sum_WX / sum_W
    return np.nan

def calc_wmse(W,X):
    N = len(W)
    sum_W = np.sum(W)
    m_W = np.mean(W)
    wm_X = calc_wm(W,X)    
    if N>1 and wm_X!=np.nan:
        wmse = np.sum((W*X - m_W*wm_X)**2)
        wmse -= 2*wm_X * np.sum((W-m_W)*(W*X - m_W*wm_X))
        wmse += wm_X**2 * np.sum((W-m_W)**2)
        wmse *= N / ((N-1)*sum_W**2)
        return wmse
    return np.nan

wmse = (
    df.groupby(['grp'])
      .apply(lambda x: calc_wmse(x['weight'], x['val']))
      .reset_index()
)

It calculates the standard error of the weighted mean. Takes 60+ sec to run on a test df with 10M rows. The full set has 1.4B rows.

The weighted mean average formula is taken from this stats stackoverflow thread (https://bit.ly/3t2hBch).

Things I've tried:

  • haven't gotten a vectorized solution working 😦
  • numba.jit made the operation faster on a single group, but passed into groupby.apply was even slower

Things I haven't tried:

  • using pd.eval to handle the calculations directly in df memory
  • any attempts to use Cython to optimize the function

any advice would slap hard! 🙂 thanks so much

polar dock
# lapis sequoia I got no experience

okay okay, ill give you my background and see if it helps clarify sources of questions

I have a BS in Math and CS and work in a proto-data-engineering role.

I don't do too much data science directly, and instead work with the data scientists to get their pipelines working. I get to have a more dev-ops type of role instead of an analytics one!

late shell
#

Hello, I'm a beginner at ML and I can't seem to grasp a few things about feature scaling. I have googled a lot but I can't seem to be satisfied with the explanation. I understand WHY feature scaling is important. But why do we have to scale the test set using the scaling parameters of the training set? It just doesn't make sense to me. Thanks

desert oar
#

more importantly @polar dock , how big is each group, and how many groups do you have?

#

if you have lots of small groups, you can leave each within-group computation alone and you will want to parallelize across many groups. if you have a few very large groups, you will want to make each within-group computation as efficient as possible using something like numba

lapis sequoia
polar dock
#

For this test segment there are 14,200 groups of 750 rows in each group.

This is already running inside a multiprocessing Pool. I have some 180,000 segments to be processed. Hence, I don't think I can parallelize the operations for any single segment.

desert oar
#

oh... you mean this is already a task being done inside a parallel "worker"?

polar dock
#

yeah lol

#

If I had more time I'd just try to learn Dask

but IT leadership dumped this on me pretty last minute

nova widget
#

@polar dock possible to drop the ifs in some way?

desert oar
#

@polar dock i don't know if returning np.nan is a good idea, or if you should use != to compare anything to nan

#

i think nan != nan always
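
A quick sketch of why the `!=` comparison cannot detect NaN, using only the standard library. `is_missing` is a hypothetical helper name.

```python
import math

nan = float("nan")

# NaN compares unequal to everything, including itself,
# so `value != nan` is True for every value and cannot detect NaN.
print(nan != nan)  # True
print(1.0 != nan)  # True


def is_missing(value):
    """Detect NaN explicitly instead of comparing with != or ==."""
    return math.isnan(value)


print(is_missing(nan))  # True
print(is_missing(1.0))  # False
```

For NumPy values, `np.isnan` plays the same role as `math.isnan`.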

#

i agree that doing something different with these 'ifs' can help

#

maybe inlining these two functions to avoid having to use nan to 'signal'

#

not sure if numba supports python None in nopython mode

#
import numba
import numpy as np

@numba.jit(nopython=True)
def calc_wmse(W,X):
    N = W.size
    sum_W = np.sum(W)
    sum_WX = np.sum(W*X)

    # also guard N <= 1: the (N-1) term below would divide by zero
    if N <= 1 or sum_W <= 0.0:
        result = np.nan
    else:
        m_W = sum_W / N        # mean of the weights, reusing sum_W
        wm_X = sum_WX / sum_W  # weighted mean, inlined from calc_wm
        calc1 = W*X - m_W*wm_X
        calc2 = W - m_W
        wmse = np.sum(calc1**2)
        wmse -= 2*wm_X * np.sum(calc2*calc1)
        wmse += wm_X**2 * np.sum(calc2**2)
        wmse *= N / ((N-1) * sum_W**2)
        result = wmse
    return result

def calc_wmse_grp(grp):
    return calc_wmse(
        grp['weight'].to_numpy(),
        grp['val'].to_numpy()
    )

wmse = (
    df.groupby(['grp'])
      .apply(calc_wmse_grp)
      .reset_index()
)

this is a bit cleaner imo, might be faster but you'd have to benchmark

glad raft
#

what kind of structure would you use to store 1000-1000000s of particle locations, velocities, rotational velocities, and physical properties so that you could make use of numpy's ufuncs?

desert oar
#

it saves a few passes over the data and caches a few calculations

glad raft
#

just an array?

desert oar
#

you can use "raw" numpy arrays too but pandas has some nice features that can make certain operations easier (e.g. groupby as shown above)

glad raft
#

do any of those come complete with kd tree nearest neighbor search?

desert oar
#

nope, none of them @glad raft

glad raft
#

wishful thinking i guess lol

#

do you think it would be a good idea to define a particle state class in which the primary object is a (number of particles) x (number of properties) dataframe or "raw" array, or to define a particle class and build a dataframe of particle objects?

desert oar
#

if you have a dataframe, you can use the column names and datatypes as a "schema" that describes a particle state

#

it depends on your code though

#

if you're doing lots of complicated single-particle operations, then yes it would be a nice idea to have a class that represents a particle, and you can write results to the sqlite database

#

if you are doing a big vectorized operation across many particles at once, the single particle class won't help much

glad raft
#

thank you so much. That's what I thought. I suppose I was thinking of using a particle class and a state class, in which the state class holds indices and is responsible for the nearest neighbor searches and time stepping, and the particle class holds information about the state of each particle. But I didn't know if there would be any overhead expense, or quite how things would be inherited

desert oar
#

for a nearest neighbor search, use a proper kd or ball tree implementation

#

like in scikit-learn

#

yes there will be significant overhead in using a big list of class instances compared to a numpy array

#

you can reduce that overhead somewhat by defining a class with __slots__ - but it might be more helpful if you gave more detail about the kind of computations you are doing
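
A small sketch of what `__slots__` changes, with two illustrative classes (`PlainParticle` and `SlottedParticle` are made-up names). It only shows that the per-instance `__dict__` goes away and does not measure the actual savings.

```python
class PlainParticle:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedParticle:
    # Declaring __slots__ removes the per-instance __dict__,
    # which saves memory for every object that is created.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PlainParticle(0.0, 1.0)
slotted = SlottedParticle(0.0, 1.0)

print(hasattr(plain, "__dict__"))    # True
print(hasattr(slotted, "__dict__"))  # False
```

The trade-off is that a slotted instance cannot grow new attributes beyond the declared names.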

merry ridge
#

A friend of mine was going over my CV and suggested replacing the word "scrape" (in the context of scraping data sources on the web using beautiful soup etc.) with "ingest" or "ETL". I just wanted a sanity check that the terminology is common place and would be a reasonable hit on a keyword scan. Ingest in particular is new to me

desert oar
#

context?

#

web scraping is a known thing too

merry ridge
#

Same context

#

At least to my ear, scraping is the best word

desert oar
#

its a generally unpleasant sounding word

#

ingesting data is more abstract

#

theres nothing wrong with "webscraping" or "scraping PDFs" etc

merry ridge
#

Okay, thanks for the feedback

glad raft
#

@desert oar I'm trying to build a particle dynamics simulator. At first I'm going to be using a single particle that will bounce around in a cube. Then I will try to build on so that there is a non-interactive discrete element method where the number of particles will increase to 100s, 1000s, and probably no further on my laptop, but the premise will be maintained in that the particles will bounce around in the cube. If i can get that working i will start the nearest neighbor searches to identify which particles are colliding if any and then update their total force values and continue. The goal is to develop a fundamental understanding of class structures in python and molecular and discrete element methods at the same time. tangentially I've been brushing up on mpi4py, cuda, numba, and cython, but i've only got a few days so i won't take this experiment too far right away.

olive orbit
#

Hi y'all, I'm building an application to learn pandas better. I have a solution, and the code works, but I think it is a more Pythonic solution rather than a pandas one. I'm trying to utilize the pandas logic instead of the python logic. Any thoughts are helpful! Here is the code I have:

results = {}
grouped = electdf.groupby(["year", "state"])                            
for key, group in grouped:
    year, state = key
    group['vote_remaining'] = group['electoral_votes'] - group['vote_int'].sum()
    remaining = group['vote_remaining'].iloc[0]
    top_fracs = group['vote_frac'].nlargest(remaining)
    group['total'] = (group['vote_frac'].isin(top_fracs)).astype(int) + group['vote_int'] 
    if year not in results:
        results[year] = {}                                     
    for candidate, evotes in zip(group['candidate'], group['total']):
        if candidate not in results[year] and evotes:
            results[year][candidate] = 0
        if evotes:
            results[year][candidate] += evotes
serene scaffold
olive orbit
serene scaffold
serene scaffold
#

what is the shape of the data? like what columns are there and what data types are in them?

#

the best way to answer that question is to provide a few lines of the CSV

olive orbit
# serene scaffold the best way to answer that question is to provide a few lines of the CSV
year  state     candidate        percvotes  electoral_votes  perc_evotes   vote_frac      vote_int
1976  ALABAMA   CARTER, JIMMY    55.727269       9            5.015454     0.015454         5
1976  ALABAMA   FORD, GERALD     42.614871       9            3.835338     0.835338         3
1976  ALABAMA   MADDOX, LESTER   0.777613        9            0.069985     0.069985         0
1976  ALABAMA  BUBAR, BENJAMIN   0.563808        9            0.050743     0.050743         0
1976  ALABAMA   HALL, GUS        0.165194        9            0.014867     0.014867         0
olive orbit
serene scaffold
#

I'm just trying to think of a formula for proportional integer division

olive orbit
# serene scaffold That's alright.

there are some other columns that calculate the proportions to get to this data shown here. It starts with the total votes and votes per candidate

serene scaffold
#

I'll take a look at this again in a bit

olive orbit
olive orbit
polar dock
polar dock
# polar dock Hi yall! Does anyone have any tips on how to optimize the below operation? ```...

Okay okay, so I got a vectorized solution working!

def vectorized_weighted_mean(df):
    df['WX'] = df.eval("W * X")
    res = df.groupby('sid', as_index=False).agg({
        'W': ['count', 'sum', 'mean'],
        'X': 'sum',
        'WX': 'sum'
    })
    res.columns = ['sid', 'count', 'sum_W', 'mean_W','sum_X', 'sum_WX']
    res["X_wm"] = res.query("count > 0 & sum_W > 0").eval("sum_WX / sum_W")
    return pd.merge(
        df[['sid', 'W', 'WX']],
        res[['sid', 'count', 'mean_W', 'sum_W', 'X_wm']], 
        on='sid'
    )

def vectorized_weighted_mean_standard_error(df):
    df = vectorized_weighted_mean(df)
    df['A'] = df.eval('(WX - mean_W*X_wm)**2')
    df['B'] = df.eval("(W - mean_W) * (WX - mean_W*X_wm)")
    df['C'] = df.eval("(W - mean_W)**2")
    res = df.groupby('sid', as_index=False).agg({
        'A': 'sum',
        'B': 'sum',
        'C': 'sum',
    })
    df = pd.merge(df[['sid', 'count', 'sum_W', 'X_wm']], res, on='sid')
    df['coefficient'] = df.eval("count / ((count - 1)*sum_W**2)")
    df['X_wmse'] = df.eval("(A - 2*X_wm*B + (X_wm**2)*C)*coefficient")
    return df[['sid', 'X_wm', 'X_wmse']]

posting it here in case anyone was curious

serene scaffold
#

@olive orbit look into the apply method of GroupBy objects.

velvet thorn
serene scaffold
#

you would know

velvet thorn
desert oar
velvet thorn
#

vs agg and filter

desert oar
#

i totally forgot about numexpr before

#

there's also numpy einsum which can be ridiculously fast for certain operations

glad raft
#

@desert oar that sounds like a good plan saving just a portion of the particle properties instead of all of them will save me a lot of space too

velvet thorn
#

vectorising stuff is quite fun

desert oar
#

@glad raft i recommend using attrs for the particles

here's a very basic sketch of one of many many ways to write code like this

import math

import attr  # pip install attrs


# using slots=True will save memory and reduce runtime overhead
@attr.s(slots=True)
class Particle:
    """ A simulated particle """
    xpos = attr.ib()
    ypos = attr.ib()
    velocity_angle = attr.ib()
    velocity_magnitude = attr.ib()
    mass = attr.ib()

    def step_forward(self, seconds=1):
        """ Update x and y based on velocity """
        ...

# hypothetical function to generate a list of particles
particles = generate_particles(n=1000)

# hypothetical code to "evolve" each particle state
for _ in range(10000):
    for particle in particles:
        particle.step_forward()
#

obviously im leaving out a lot of things and you probably can do this more efficiently by pre-filling a giant numpy array with 0s then running through the simulation row-wise but that would be way uglier
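
The array-based alternative mentioned above might look roughly like this. It is a sketch under assumptions: velocities are stored as Cartesian components rather than the angle and magnitude used in the class, there are no walls or collisions, and the random initial velocities are illustrative only.

```python
import numpy as np


def step_forward(pos, vel, seconds=1.0):
    """Advance every particle at once; pos and vel have shape (n_particles, 2)."""
    return pos + vel * seconds


n = 1000
rng = np.random.default_rng(0)        # fixed seed, illustrative only
pos = np.zeros((n, 2))                # all particles start at the origin
vel = rng.uniform(-1.0, 1.0, (n, 2))  # hypothetical initial velocities

# One vectorized update per time step instead of a Python loop over particles.
for _ in range(100):
    pos = step_forward(pos, vel)

print(pos.shape)  # (1000, 2)
```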

olive orbit
velvet thorn
#

so they are assigned like

#

uh.

#

how do leftover votes even come about

glad raft
#

@desert oar I'm going to look into attrs and slotting. thank you

olive orbit
# velvet thorn wait what

The whole numbers are portioned out based on the proportion, but because there are lots of candidates, several of them end up with less than a full point. So it becomes a ranked choice over the fractions to give out the remaining votes up to the full total

#

If that makes sense

exotic maple
velvet thorn
#

basically you just want to increment the n top entries (sorted by percentage vote) by 1
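
That rule can be sketched in plain Python using the 1976 Alabama rows posted earlier. Here the ranking uses the fractional part (the `vote_frac` column), as in the original loop. `allocate` is a hypothetical helper name, and ties between equal fractions are not handled specially.

```python
def allocate(electoral_votes, shares):
    """Give each candidate the whole part, then +1 to the largest fractions.

    shares maps candidate -> fractional number of electoral votes earned
    (the perc_evotes column in the sample rows).
    """
    whole = {name: int(share) for name, share in shares.items()}
    remaining = electoral_votes - sum(whole.values())
    # Rank candidates by fractional part, largest first.
    by_fraction = sorted(
        shares, key=lambda name: shares[name] - whole[name], reverse=True
    )
    for name in by_fraction[:remaining]:
        whole[name] += 1
    return whole


# Values copied from the 1976 Alabama sample rows.
alabama = {
    "CARTER, JIMMY": 5.015454,
    "FORD, GERALD": 3.835338,
    "MADDOX, LESTER": 0.069985,
    "BUBAR, BENJAMIN": 0.050743,
    "HALL, GUS": 0.014867,
}
result = allocate(9, alabama)
print(result["CARTER, JIMMY"], result["FORD, GERALD"])  # 5 4
```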

olive orbit
olive orbit
desert oar
#

i wish it had an object-oriented dsl instead of magic strings though

#

but one could easily create the former to emit the latter

misty flint
#

never have i ever seen so much bioinformatics before

#

time to try to do this analysis in R i guess

#

🥴

desert oar
#

use julia

#

yolo

misty flint
#

but im doing the model building in python

#

Julia doesnt have enough genomics packages

#

🥴

velvet thorn
#

would you like to see

desert oar
#
c = df.Num('count')
w = df.Num('sum_W')
df['coefficient'] = df.eval(c / ((c - 1) * w**2))
#

i like syntax checking, syntax highlighting, and generally being able to use language-level tooling

#

same reason i like "first class" regex objects, makes it easier to build tooling around it

exotic maple
desert oar
#

yeah this is hypothetical of course

#

internally it'd be something like

c = df.Num('count')
w = df.Num('sum_W')
expr = c / ((c - 1) * w**2)
df['coefficient'] = df.eval(expr.compile())
#

there is plenty of prior art for this kind of API, see e.g. pyspark
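
A toy version of such an expression builder, to make the idea concrete. Everything here is hypothetical: `Expr` and the free function `Num` are made-up names (the chat imagines `df.Num`), and the output is simply a string in the form that `DataFrame.eval` accepts.

```python
class Expr:
    """Tiny expression node that renders itself as a string for eval."""

    def __init__(self, text):
        self.text = text

    @staticmethod
    def _wrap(value):
        # Plain Python numbers become literal text; Expr nodes pass through.
        return value if isinstance(value, Expr) else Expr(repr(value))

    def _binary(self, op, other):
        return Expr(f"({self.text} {op} {Expr._wrap(other).text})")

    def __add__(self, other):
        return self._binary("+", other)

    def __sub__(self, other):
        return self._binary("-", other)

    def __mul__(self, other):
        return self._binary("*", other)

    def __truediv__(self, other):
        return self._binary("/", other)

    def __pow__(self, other):
        return self._binary("**", other)

    def compile(self):
        """Return the string form that would be handed to DataFrame.eval."""
        return self.text


def Num(column):
    """Hypothetical helper: reference a numeric column by name."""
    return Expr(column)


c = Num("count")
w = Num("sum_W")
expr = c / ((c - 1) * w**2)
print(expr.compile())  # (count / ((count - 1) * (sum_W ** 2)))
```

Because the expression is built from ordinary Python operators, a typo in it surfaces as a Python error instead of a failure inside a string.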

#

will have to put it on the todo list

#

another example of this kind of api https://pypi.org/project/glom/ (specifically glom.T)

exotic maple
desert oar
#

it's the kind of thing that gets easier the more you learn

#

it's like music or a language

lusty iron
#

I like sparksql's dataframe api when working on implementing business logic, but prefer all the shortcuts that the pandas api has for the quick iterations that statistical/data work requires. I am hoping Wes McKinney turns apache arrow into a lazy dataframe with a sparksql-like api (and the dask guys build a distributed version of it).....

#

if it's cool to push your project on here, I wrote an implementation of the stepwise feature selection algorithm that can scale with dask. Honestly, Mlxtend's is better, but it was removed. Scikit-learn will be getting a stepwise feature selection algorithm soon, but they only parallelized the cross validation, and not the search itself. https://github.com/pr38/dask_backward_feature_selection

hushed sail
#

Hi. I have a question about computer vision. I need to get the roi(region of interest) of image in coordinates for computing intersection over union. I wrote some code that computes iou, and it works fine on my examples. But how should I get the coordinates? I have the RoI for every image in my dataset. How to use it for something like roi-finding training?

vast thunder
#

Guys so I wonder, if I have to get an image input and return a string input, would I just use a Conv2d in keras for the input and a list of (Dense) numbers , representing the ascii codes of the output?

#

I can't assign ID to each image, they all are different texts

lapis sequoia
#

Anyone that could help me with a data science problem for uni? I have amazon review data and have to predict the score that the review gave using a number of variables, one of which is the text of the review. In training I'm able to get a very high accuracy with CV, but when I apply it on hackerrank I score very poorly.

languid nebula
#

Is there an equivalent of ml-agent in python? By this I mean a library which can perform reinforcement learning if you set up the environment and the rewards system?

#

I just want to do a very simple game with pygame where the agent can move on a board and try some learning stuff on it

#

So as in ML-agents I'd like to be able to choose the information sent to the agent, the amount of output etc

#

If you know a good library which can do that let me know

pulsar karma
#

Is the Sum() function syntax only usable during an iteration? Or can it be used in multiple ways outside a iteration(For loop)?

dusk gate
#

X_train[..., np.newaxis]
what does the ... mean?
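
For reference, a small sketch of what `...` (Ellipsis) does in NumPy indexing. The array shape is a made-up example, not the asker's data.

```python
import numpy as np

X_train = np.zeros((60, 28, 28))  # hypothetical shape, e.g. 60 grayscale images

# `...` (Ellipsis) stands for "as many full slices as needed",
# so this is the same as X_train[:, :, :, np.newaxis].
expanded = X_train[..., np.newaxis]

print(expanded.shape)  # (60, 28, 28, 1)
```

The Ellipsis form keeps working if the array later has a different number of leading axes.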

tidal bough
pulsar karma
#

But, it doesn't accept ANY iterable. It must be converted into a float or integer.

#

But, thanks for the answer. :)

tidal bough
pulsar karma
#

Ah, ok. Gotcha!

stark ember
#

Could someone help me figure out what's going wrong with the WolframAlpha.py module?

#

I'm making a query of how are you, and here's the data I get:

#
{'success': False, 'error': False, 'numpods': 0, 'datatypes': '', 'timedout': '', 'timedoutpods': '', 'timing': 0.932, 'parsetiming': 0.301, 'parsetimedout': False, 'recalculate': '', 'id': '', 'parseidserver': '41', 'host': 'https://www4b.wolframalpha.com', 'server': '41', 'related': '', 'version': '2.6', 'inputstring': 'how+are+you', 'tips': [{'text': 'Avoid concatenation in math expressions'}, {'text': 'Use r*x rather than rx, and q*x^2 rather than qx2'}]}
#

However, when using the W|A API explorer, I get different results

#

I'm getting a proper result there

tough pecan
#

Hi! Does someone know where to start learning to make an AI? I know the basics of Python

vast thunder
#

First of all I would go for the math

#

Derivatives, Sigmoid, whatever

#

Then learn about Neural Networks

#

And after that learn a library

#

Like tensorflow

tough pecan
#

Do you know a website or a video where I could start?

vast thunder
#

I don't know any websites/videos for the math, just search for them. But for neural networks, search for Luis Serrano's neural network tutorials on YouTube

#

I suggest you watch the neural network videos first, they don't contain much math

#

After that, go learn about libraries, and when you encounter a math thing, either ignore it and just let it be in your code, or go learn it

tough pecan
#

ok, thanks!

hoary wigeon
#

Hello

#

I'm searching for a cloud platform where I can work with an SQLite dataset, as my computer is not capable of handling the operation.

grave breach
#

Linode offers VPSs so you can access SQLite

#

Alternatively, if you're willing to use a different language rather than python, you can use Wolfram Cloud @hoary wigeon

#

They're both not too expensive

grave breach
hoary wigeon
grave breach
#

Oh, sorry, didn't get it

hoary wigeon
grave breach
#

Wait a moment

ornate hemlock
#

Hi

hoary wigeon
#

on mine it takes more than 10 min

ornate hemlock
#

can someone help me in a numpy issue?

grave breach
#

@hoary wigeon You can use google cloud

hoary wigeon
#

does it support sqlite ?

grave breach
#

Just have to convert your SQLite database to a MySQL one

#

There are scripts that do that

hoary wigeon
#

how ?

#

where ?

#

it is still running

#

and my CPU's buzzer is on now

uncut barn
#

is there a chat where I could ask data mining qs?

serene scaffold
uncut barn
#

ok thanks, im wondering what are the data mining tasks, do they include operations like count and sum etc. or just prediction and classification?

#

i.e. ML algorithms

glass cedar
#

Question about pandas, is one way faster than the other?

df = df.assign(col3=lambda x: (x['col1'] + x['col2']))
df['col3'] = df['col1'] + df['col2']
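
A small sketch to compare the two forms, on a made-up frame. Both run the same vectorized addition; the difference I'm fairly sure of is that `assign` returns a new DataFrame while the bracket form writes into the existing one, so any speed gap is best measured with `timeit` on your own data rather than assumed.

```python
import pandas as pd

# Hypothetical data with the column names from the question.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [10, 20, 30]})

# assign() leaves df untouched and returns a new DataFrame with col3 added.
via_assign = df.assign(col3=lambda x: x["col1"] + x["col2"])

# Bracket assignment adds the column to the frame it is called on.
in_place = df.copy()
in_place["col3"] = in_place["col1"] + in_place["col2"]

print(via_assign.equals(in_place))
```
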
molten hamlet
#

Can I discuss the Gini index and information gain with someone? And why is Gini slower in my implementation? I think it should be faster

serene scaffold
slate hollow
#

is there any difference between normal dropout and alpha dropout? (ping 2 reply thx)

misty flint
#

@paper lake unironically have to do a bioinformatics data analysis this weekend

#

when you want to use python but forced to use R

rocky copper
#

i want to compare two pdfs for similarity
i did it using tf-idf and cosine similarity
how do i do it with jaccard? do i have to use tf-idf with it?
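
A minimal sketch of Jaccard similarity, assuming the text has already been extracted from the two PDFs (the sample strings and the tokenizer are placeholders). Jaccard works on sets of tokens, intersection size over union size, so TF-IDF weights are not needed for it.

```python
import re

def tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(text_a, text_b):
    """Jaccard similarity: |A intersect B| / |A union B| on the token sets."""
    a, b = tokens(text_a), tokens(text_b)
    if not a and not b:
        return 1.0          # two empty documents are treated as identical
    return len(a & b) / len(a | b)

# Placeholder documents standing in for the extracted PDF text.
doc_a = "the cat sat on the mat"
doc_b = "the cat lay on the rug"
score = jaccard(doc_a, doc_b)
print(score)
```

Because it ignores word counts, Jaccard tends to behave differently from cosine similarity on long documents, so it is worth comparing both on a few known pairs.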

stuck socket
#

hi all

#

does someone know the guy behind pandas_ta?

dusty turret
#

Any reference to python code for nlp based model for reading totals from restaurant invoice in pdf form?

lapis sequoia
#

Hey anyone knows how to easily make positive/negative skewed distribution? (Using it as an example)
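
One possible way, as a sketch: draw from a gamma distribution for a positively skewed sample and mirror it for a negatively skewed one. The shape, scale and sample size are arbitrary choices, and the skewness helper is a plain moment-based estimate written here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

positive_skew = rng.gamma(shape=2.0, scale=1.0, size=10_000)   # long right tail
negative_skew = -positive_skew                                  # mirrored: long left tail

def sample_skewness(x):
    """Third standardized moment; > 0 means right-skewed, < 0 left-skewed."""
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

print(sample_skewness(positive_skew), sample_skewness(negative_skew))
```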

grave breach
kindred radish
#

Just a bit confused about what sklearn means by "flat" and "non-flat" geometry

#

It says that it's the "metric-used" in the column header

#

but how can this be flat or not flat?

golden pawn
#
```
firstdate = pd.to_datetime(['01.01.2004'], format='%d.%m.%Y')
seconddate = pd.to_datetime(['01.01.2005'], format='%d.%m.%Y')
print(type(firstdate))
print(f"the sum of orders for 2005, for sellers from Poland\n"
      f" {df2[(df2.Kraj == 'Polska') & (not firstdate > df2['Data zamowienia'] < seconddate)][['Utarg']].sum()}")
```
Anyone know how to format this date to make this print work? In the Excel sheet column I have just the dates written in this way (I mean separated by dots). PYTHON [PANDAS]
#
```
 raise ValueError("Lengths must match")
ValueError: Lengths must match
```
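
A hedged sketch of one way around that error. `pd.to_datetime([...])` with a list returns a one-element `DatetimeIndex`, and comparing that against a whole column is consistent with "Lengths must match"; using scalar `pd.Timestamp` bounds and combining two comparisons with `&` avoids it. The small `df2` below is invented for illustration; only the column names come from the snippet.

```python
import pandas as pd

# Hypothetical stand-in for df2, with dates written as dd.mm.yyyy strings.
df2 = pd.DataFrame({
    "Kraj": ["Polska", "Polska", "Niemcy", "Polska"],
    "Data zamowienia": ["15.03.2004", "02.07.2005", "20.05.2004", "31.12.2004"],
    "Utarg": [100.0, 250.0, 75.0, 40.0],
})
df2["Data zamowienia"] = pd.to_datetime(df2["Data zamowienia"], format="%d.%m.%Y")

# Scalar bounds instead of one-element DatetimeIndex objects.
first_date = pd.Timestamp("2004-01-01")
second_date = pd.Timestamp("2005-01-01")

# Element-wise comparisons joined with &; chained `a > x < b` does not work on a Series.
in_range = (df2["Data zamowienia"] >= first_date) & (df2["Data zamowienia"] < second_date)
total = df2.loc[(df2["Kraj"] == "Polska") & in_range, "Utarg"].sum()
print(f"Sum of orders in the chosen range, for sellers from Poland: {total}")
```

Swap the bounds or negate `in_range` with `~` if the intended range is a different one.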
kindred radish
soft salmon
lapis sequoia
#

Does the Flatten layer of a CNN turn the entire tensor into a mx1 vector?
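
As far as I know, a Keras-style Flatten layer flattens each example separately and keeps the batch axis, so the result is (batch, features) rather than a single m x 1 vector. A NumPy analogue as a sketch (this is a reshape standing in for the layer, with made-up sizes):

```python
import numpy as np

# Hypothetical feature maps: (batch, height, width, channels).
batch = np.zeros((32, 7, 7, 64))

# Keep axis 0 (the batch) and collapse everything else into one axis.
flat = batch.reshape(batch.shape[0], -1)

print(flat.shape)
```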

balmy junco
#

hey guys i am trying to create a dataframe in parallel from random indices of an existing dataframe. i know how to use a dataframe in parallel but creating one that way is different. any thoughts?

desert oar
#

Normally i use pd.concat to combine dataframes

desert oar
desert oar
#

I think the non laziness of pandas is better by default

#

Laziness is only valuable for really big datasets and/or really expensive operations, in which case maybe dask is better anyway

#

It would also be nice to have more stuff in the form of portable libraries so you can use the same arrow-backed data frames in python r julia lua rust c++ whatever

#

Imagine, monadic dataframe operations in haskell backed by arrow!

grave frost
weary summit
#

Hi
I am working in an offline environment.
I currently want to work with geopandas (using conda and cp36)
After installing the dependencies (fiona, shapely etc), I tried to import the module but came up with the following error:
" from fiona.ogrext import Iterator, ItemsIterator, KeysIterator
ImportError: DLL load failed: The specified module could not be found."

I am currently using python 3.6.5
With packages:
fiona: 1.8.6
gdal: 3.0.1
geopandas: 0.3.0 (have tried using 0.4.0 and 0.8.0 as well)

I have looked here as well (https://github.com/Toblerity/Fiona/issues/402), but couldn't solve the problem.

Does anyone here know what the problem is and how it can be solved?

Thanks in advance

ornate hemlock
#

Hello

soft salmon
true plover
#

what libraries should i learn for data science? help#

kindred radish
#

@true plover Check out Sklearn, it's pretty beginner friendly since a lot of things work out of the box

#

then Pandas for data manipulation and numpy for general usefulness

#

I've been reading this paper( DOI: 10.1002/widm.30) which talks about density-based clustering algorithms in terms of probability density functions like this:

#

Bit confused though, as when I've been looking at DBSCAN it doesn't really seem to be doing anything like this?

#

I think i remember talking to @grave frost about this algorithm ages ago?

grave breach
heady tide
#

Does anyone know about pathwise coordinate descent for optimization? I am struggling trying to understand it

paper lake
#

reminds me of a meme we posted in julia about R and Python data scientists

#

never been active here or in this server really

#

im busy usually in julia server

misty flint
#

an empty data frame

#

p.s. - it wasnt supposed to be empty

#

i can just dm you next time. probs easier

dire frost
#

Hey!
Can someone tell where is a good place for a beginner in AI

misty flint
#

andrew ng's ai for everyone

#

you can start there

#

as a super beginner

slate hollow
#

hey, is there a way to see what's being passed from 1 layer to another as a model is being evaluated?

#

i just want to see what's actually happening in there

dire frost
worldly robin
#

I'm a wannabe Data visualization and tableau coder

slate hollow
#

cool

grave breach
glad raft
#

I'm going to give the textbook elements of statistical learning a try. Anyone have any experience with it?

#

It seems like Stanford hosts a free copy so it's worth a go maybe

grave breach
#

@glad raft Don't have much experience with statistics, but if you want to study it, Mathematica is a good software for it

grave frost
glad raft
#

I'm going to try to read the first 100 pages today. I'll let you all know what I think

grave breach
#

Great, thank you

late shell
#

Is it possible for 2 features to have high correlation coefficient but low VIF or vice-versa?
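
A sketch of one direction of this, on synthetic data: with three or more features, every pairwise correlation can stay moderate while the VIF is very high, because VIF measures how well a feature is predicted by all the others together. With exactly two features in the model, VIF is just 1 / (1 - r^2), so the two move together. The VIFs below are read off the diagonal of the inverse correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + rng.normal(scale=0.1, size=n)   # almost a linear combination of x1 and x2

X = np.column_stack([x1, x2, x3])
corr = np.corrcoef(X, rowvar=False)

# VIF_j equals the j-th diagonal entry of the inverse correlation matrix.
vif = np.diag(np.linalg.inv(corr))

max_pairwise = np.abs(corr - np.eye(3)).max()   # largest off-diagonal correlation
print(max_pairwise, vif)
```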

harsh shadow
#

Does anybody have some tips/resources on anomaly detection in high-dimensional time series data?

grave breach
#

Use an auto encoder

#

there should exist a sequence one

red hound
#

Anyone has an idea, why for the generator no gradients are calculated?


@tf.function
def train_step(real_samples):
    z = tf.random.normal(shape=(batch_size, z_dim))
    gen_samples = generator(z)
    combined_samples = tf.concat([gen_samples, real_samples], axis=0)
    labels = tf.concat([tf.ones((batch_size, 1)), tf.zeros((real_samples.shape[0],1))], axis=0)

    with tf.GradientTape() as tape:
        predictions = discriminator(combined_samples)
        d_loss = loss_fn(labels, predictions)
    grads = tape.gradient(d_loss, discriminator.trainable_weights)
    d_optimizer.apply_gradients(zip(grads, discriminator.trainable_weights))

    z = tf.random.normal(shape=(batch_size, z_dim))
    label_zeros = tf.zeros((batch_size, 1))  # shape must be a tuple; a second positional argument is the dtype

    with tf.GradientTape() as tape:
        predictions = discriminator(generator(z))
        g_loss = loss_fn(label_zeros, predictions)

    # tape.gradient returns None for all trainable variables
    grads = tape.gradient(g_loss, generator.trainable_weights)
    print(predictions)
    print(g_loss)
    print(grads)
    # the error occurs when this line is executed
    g_optimizer.apply_gradients(zip(grads, generator.trainable_weights))
    return d_loss, g_loss, gen_samples

the lines, where the gradients should be calculated as well as the line where the actual error occurs are commented. The error is: " ValueError: No gradients provided for any variable: ['gan_generator/lstm/lstm_cell/kernel:0', 'gan_generator/lstm/lstm_cell/recurrent_kernel:0', 'gan_generator/lstm/lstm_cell/bias:0', 'gan_generator/dense/kernel:0', 'gan_generator/dense/bias:0']."

Outputs for the print statements are:

Tensor("gan_discriminator/dense_1/Sigmoid_1:0", shape=(128, 1), dtype=float32)
Tensor("binary_crossentropy_1/weighted_loss/value:0", shape=(), dtype=float32)
[None, None, None, None, None]
#

Dtypes and shapes are okay, everything used is a Tensor-object. No np.arrays used.

grave frost
#

I just realized that Elon Musk posts random jargon shite:-

A major part of real-world AI has to be solved to make unsupervised, generalized full self-driving work, as the entire road system is designed for biological neural nets with optical imagers
👀

#

Anyways, QQ: Does TF model not work with pre-batched tf.dataset? it's the first time I ever had this problem

red hound
#

I'm actually doing this

```
train_data = tf.convert_to_tensor(train_data)
dataset = tf.data.Dataset.from_tensor_slices(train_data)
dataset = dataset.shuffle(buffer_size=1024).batch(batch_size)
```
grave frost
#

Apparently, I can either buffer or pre-batch

glad raft
#

What are hyperparameters? Is it the number of layers and the number of neurons per layer?

daring kiln
red hound
#

But its not working

glad raft
#

@daring kiln to the best of my knowledge hyper parameter optimization happens as either a guess and check approach or classically looking at the posterior with bayesian inference. But I'm not sure what hyper parameters are so optimizing them is beyond me. I think Elisse Jennings from llnl gave a presentation on it that you can see on Youtube

daring kiln
#

@glad raft thanks a bunch for the reply. I'm gonna go check it out.

red hound
#

I can give you examples, maybe it helps. In the context of neural networks there are the trainable parameters, also called weights. And then there are the kinda defining parameters, which tell the model how to behave, which are mostly untrainable. Some of them are the no. of epochs, batch_size, learning_rate and so on

glad raft
#

I hope it's helpful.

#

Are these parameters the hyper parameters?

red hound
#

The trainable parameters get adjusted by the optimizer through backpropagation. The hyper parameters are the ones you have to optimize.
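
To make the split concrete, a small sketch on a toy linear model (data, learning rates and epoch count are all made up): `w` and `b` are the trainable parameters that gradient descent adjusts, while `learning_rate` and `epochs` are hyperparameters picked from outside, here by simply trying a few values.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 3.0 * x + 1.0 + rng.normal(scale=0.1, size=200)

def train(learning_rate, epochs):
    """Fit y = w*x + b by gradient descent; returns (w, b, final mean squared error)."""
    w, b = 0.0, 0.0                              # trainable parameters
    for _ in range(epochs):
        error = w * x + b - y
        w -= learning_rate * 2 * (error * x).mean()
        b -= learning_rate * 2 * error.mean()
    loss = ((w * x + b - y) ** 2).mean()
    return w, b, loss

# Hyperparameter search by guess and check: same model, different learning rates.
results = {lr: train(lr, epochs=50) for lr in (0.001, 0.01, 0.1)}
best_lr = min(results, key=lambda lr: results[lr][2])
print(best_lr, results[best_lr])
```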

red hound
glad raft
#

That's super helpful thank you

daring kiln
#

Both you guys helped me understand a concept been struggling to understand for a while. Thanks!

oak jungle
#

So, I tried to import gym for reinforcement learning and it completely doesn't work no matter if I'm in a virtual environment or not or anything else

slate hollow
slate hollow
grave breach
#

Ah, sorry

glad raft
#

So far i have multiple words that possibly mean the same thing. First is there any recognizable difference between statistical learning, machine learning, and deep learning? And what's the primary difference between machine learning and ai?

#

Oh and neural networks

tidal bough
#

AI: doing tasks that are normally done by humans
ML: programs that "learn" from the data, rather than being rigid algorithms

they have a ton of overlap nowadays, but there are AI algorithms that aren't at all ML, like the first ever chatbot programs that were hardcoded, or, for that matter, any kind of manually coded decision tree.

glad raft
#

Are there quality ais that aren't ml based? Would they be self modifying code based?

iron basalt
#

AI can also imply an algorithm that does not learn though, like a chess "AI".

glad raft
#

Ai chess as a predictive model that creates a state tree and then minimizes move risk?
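
A sketch of that kind of non-learning game AI, on a toy game instead of chess: players alternately take 1 to 3 stones and whoever takes the last stone wins. The program searches the whole state tree and picks a move that leaves the opponent in a losing position; nothing is learned from data. The game and function names are chosen here for illustration.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def can_win(stones):
    """True if the player to move can force a win (taking the last stone wins)."""
    # A position is winning if some move leaves the opponent in a losing position.
    return any(not can_win(stones - take) for take in (1, 2, 3) if take <= stones)

def best_move(stones):
    """Return how many stones to take, found by exhaustive search."""
    for take in (1, 2, 3):
        if take <= stones and not can_win(stones - take):
            return take
    return 1   # every move loses against perfect play; take one stone anyway

print([best_move(n) for n in range(1, 9)])
```

Real chess engines cannot search to the end of the game, so they cut the tree off at some depth and score positions with an evaluation function instead.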

#

Sorry if these questions are really rudimentary but I think developing a language is important to learning a topic

iron basalt
#

Deep learning is a buzzword that had no meaning, but then everyone sort of agreed on a meaning later. The term was first used in psychology, but its first use in algorithms is something like the early 80s and it was used to refer to the depth of a search algorithm.

red hound
#

In the early days, AI pretty much meant everything described by @tidal bough's first point, for example simple rule-based systems.
Today when someone speaks of AI, it's almost always at minimum an ML thing they are talking about, if not Deep Learning. That's because most modern AI stuff is pretty much Deep Learning / Reinforcement Learning based.

Look at the Venn diagram above, it makes this much clearer

iron basalt
#

(Automatic theorem prover)

glad raft
#

Does reinforcement imply initially supervised followed by ongoing unsupervised?

red hound
#

Its a bit like conditioning a dog

tidal bough
#

reinforcement learning means that you make the program learn by giving it a reward signal and letting it explore, rather than by providing examples of what to do. The algorithms used in RL are very different from what you see in the other fields
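
A minimal sketch of that idea using tabular Q-learning, one common RL algorithm, on an invented five-cell corridor: the agent is never told which way to go, it only gets a reward of 1 on reaching the last cell and has to discover the policy by exploring. All constants are arbitrary.

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)                      # index 0 = move left, index 1 = move right
rng = random.Random(0)
q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount, exploration rate

def choose(state):
    """Epsilon-greedy action choice, with random tie-breaking."""
    if rng.random() < epsilon or q[state][0] == q[state][1]:
        return rng.randrange(2)
    return 0 if q[state][0] > q[state][1] else 1

for _ in range(500):                    # episodes
    state = 0
    while state != GOAL:
        action = choose(state)
        nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
        reward = 1.0 if nxt == GOAL else 0.0
        # Q-learning update: move the estimate toward reward + discounted best future value.
        target = reward + gamma * max(q[nxt])
        q[state][action] += alpha * (target - q[state][action])
        state = nxt

policy = ["left" if q[s][0] > q[s][1] else "right" for s in range(GOAL)]
print(policy)
```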

glad raft
#

I probably won't be getting to unsupervised or rl for a few weeks at least

iron basalt
#

Reinforcement learning is arguably the most AI part of ML, it tackles head on the issue of creating an automaton (making it very hard / unsolved and also not very immediately applicable like unsupervised and supervised learning).

tidal bough
iron basalt
#

(It's also part of other fields outside of ML)

red hound
#

I have been collecting experience with ml for quite a while now and I think rl is another thing in itself. Its quite different

glad raft
#

@tidal bough I'll read that as soon as I can

iron basalt
#

ML does not imply AI, nor does AI imply ML. ML is really just anything that learns (probably approximately).

glad raft
#

@iron basalt that's the aesthetic of uncertainty quantification right? To shed light on the probably part

iron basalt
#

Technically any program that stores anything and uses it later is "ML".

#

But if someone says ML they most likely mean that its focus is on making the most of that data.

#

(pretty much always induction)

glad raft
#

Are most ml algorithms neural networks, or do neural networks comprise of a distinctly small subset of ml algorithms?

iron basalt
#

They are a small subset of ML.

#

And fall under the smaller subset of "biologically inspired"

tidal bough
#

Neural Networks are only a subset of ML, but just like with ML being only a subset of AI, it's so disproportionately researched that it's almost implied that if you're doing ML in this day and age, it'd be quite a surprise if it's not a NN

iron basalt
red hound
#

Couldnt say it better

glad raft
#

It is my understanding that in neural networks the number of hidden layers affects the possible "geometry" of the solution space. If you had a simple binary classification problem, would you be able to get a circular dividing layer with any number of hidden layers using a squared error loss function?

iron basalt
#

An ML researcher probably does more than just NNs, but rather all kinds of strange ideas (both new and old ideas / often some of the best things are a revival of an old idea but with modern compute power).

glad raft
#

@iron basalt I may be one of those in the foreseeable future, but my first area of expertise is applied mathematics

iron basalt
#

(e.g. Tsetlin automata, which have been shown to outperform NNs and also be much more computationally efficient, but the idea is from 60s Soviet Union technology).

red hound
iron basalt
red hound
#

yeah, sure

iron basalt
#

(people have been doing ANNs for a long time)

red hound
#

That's crazy. I wonder what these guys would say if they knew what impact their inventions have today

iron basalt
#

A lot of the best stuff is from the 60s probably due to space race and all that cold war stuff that drove people to make some crazy stuff.

glad raft
#

Do you think the current exascale and quantum computing races affect the current drive for robust ml algorithms at all?

iron basalt
#

Quantum computers are not needed for AGI IMO.

grave frost
iron basalt
#

Much more useful and impressive would be large scale reservoir computers (not too hard to make and would allow for some ridiculously massive NNs, etc): https://en.wikipedia.org/wiki/Reservoir_computing

Reservoir computing is a framework for computation derived from recurrent neural network theory that maps input signals into higher dimensional computational spaces through the dynamics of a fixed, non-linear system called a reservoir. After the input signal is fed into the reservoir, which is treated as a "black box," a simple readout mechanism...

grave frost
#

SNNs aren't very compute efficient, and advances in HTM seem... slow (but breakthroughs nonetheless)

iron basalt
#

I don't think a cat has a quantum computer in their head.

#

we already know that the brain is a reservoir computer.

grave frost
#

I like to lean towards a Biological/HTM hybrid than it running on computers/QC IMO

iron basalt
#

Reservoir computers are as efficient as it gets. They happen naturally too.

#

Paddles on water is a reservoir computer.

glad raft
#

If you could use aurora next year, how might you adapt your ml codes to best utilize computational resources? Would you want to use it at all?

iron basalt
#

One neat way to make a reservoir computer is to simply stack a bunch of glass panes (optical reservoir computer), it's so efficient because it has no power source. The input (the light) gives it all the energy it needs.

grave frost
#

I don't get them 😞

iron basalt
#

Anyhow this is starting to get tangential to the topic so I will stop here.

red hound
glad raft
#

I think in a stochastic gradient descent method they just guess a point and go with it

#

I read somewhere that sometimes people will stop describing them as descent methods because they don't necessarily pick something that results in a decrease in the loss function. But with large data sets you can see a speedup by doing it

#

.... just a guess..... @red hound
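
A rough sketch of mini-batch stochastic gradient descent on a synthetic least-squares problem, to show both halves of that guess: each step uses the gradient from a small random batch only, the full loss does go up on some steps, and it still ends far below where it started. Batch size, learning rate and data are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=1000)

def full_loss(w):
    """Mean squared error over the whole data set."""
    return ((X @ w - y) ** 2).mean()

w = np.zeros(3)
learning_rate, batch_size = 0.05, 8
losses = [full_loss(w)]
for step in range(500):
    idx = rng.integers(0, len(X), size=batch_size)            # random mini-batch
    grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / batch_size  # gradient on the batch only
    w -= learning_rate * grad
    losses.append(full_loss(w))

# Count the steps where the full loss actually went up.
increases = sum(b > a for a, b in zip(losses, losses[1:]))
print(losses[0], losses[-1], increases)
```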

lapis sequoia
#

Hey anyone knows how to plot on 2 subplots with seaborn?

```
fig, (ax1, ax2) = plt.subplots(1,2, figsize=(10, 4))
sns.displot(ax=ax1, data=r, bins=20)
sns.displot(ax=ax2, data=i, bins=20)
```
main moat
#

this is probably the better place to ask this question.

So I have been working on a typing test app that allows me to record the key up and down events for my Machine Learning class.
The goal is to determine users by how they type.

The original prompt I made was WAY too long/difficult
See it here.
#python-discussion message

#

The new prompt is much easier, 3 posts down.

#

My question is, how do I get a lot of people to take the test?

lapis sequoia
#

Does the Flatten layer of a CNN turn the entire tensor into a mx1 vector?

red hound
marsh berry
#

Anyone know of a good course on how to choose best types of visualizations for particular data. For example if my dataset involves Products (toys), what would be the best way to visualize all the types of manufacturers in that data? Are pie charts the way to go?

glad raft
#

What like multiple toy items with many manufacturers producing the same item?

exotic maple
#

In general, there's no straightforward "best way" to visualize something. It all depends on your use case, the story you want to tell and your audience

#

It's better if you ask people in your industry / field for tips on visualization

slate hollow
#

so this is what happens when i try to send a file rn

#

the thing is, the "pictures", "videos", and stuff aren't there as i use onedrive for my files & stuff

#

so they aren't the default directories

#

anyone know how to add this stuff to the file picker?

exotic marsh
#

Hi, would this be the right place to ask about how to get started with machine learning python? My goal is quite specific is to find a combination of numbers for the best result

ripe forge
#

Yep!

#

Could you elaborate the goal a bit? What does best result mean

exotic marsh
#

so i am trying to implement machine learning to my indicator in tradingview

#

best result is Sharpe Ratio > 1.2

#

so I'm trying to optimise my parameters for different markets and getting data fed into it

#

apologies, I am very new to ML and still trying to learn the ropes

#

thank you again

glacial sparrow
#

how can I go about finding a research project as my final project for uni? There are no tutor-suggested topics and I'm 90% sure I'm incapable of finding one

hard hound
#

Hey, I was getting an assertion error in one of my Jupyter blocks even though I had already asserted that float in my earlier blocks without any error

#

Could someone help?

grave frost
#

that assert is in the function you are using, not in the lines below

hard hound
#

It's also in the type of b. It didn't fit in the screenshot

grave frost
hard hound
#

AssertionError Traceback (most recent call last)
<ipython-input-29-4f77d2aa60f4> in <module>
1 dim = 2
----> 2 w, b = initialize_with_zeros(dim)
3
4 assert type(b) == float
5 print ("w = " + str(w))

<ipython-input-28-ddd6819e9a1a> in initialize_with_zeros(dim)
20 b = 0
21 # YOUR CODE ENDS HERE
---> 22 assert(isinstance(b,float))
23 assert(w.shape == (dim,1))
24 return w, b

AssertionError:

grave frost
#

nope, it's in the function.
<ipython-input-28-ddd6819e9a1a> in initialize_with_zeros(dim)

hard hound
#

what should i do then?

grave frost
#

---> 22 assert(isinstance(b,float))
This is the assert you are failing. try to figure out what and why that argument doesn't serve your needs

hard hound
#

I just added .0 to the number and it worked
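
Why adding `.0` works, as a sketch: `0` is an `int` and `0.0` is a `float`, and `isinstance(b, float)` is false for an `int`. The function body below is a guess reconstructed from the traceback (the `np.zeros` line in particular is an assumption), not the actual assignment code.

```python
import numpy as np

def initialize_with_zeros(dim):
    w = np.zeros((dim, 1))       # assumed: a (dim, 1) column of zeros
    b = 0.0                      # 0.0 is a float; a bare 0 would be an int
    assert isinstance(b, float)
    assert w.shape == (dim, 1)
    return w, b

w, b = initialize_with_zeros(2)
print(isinstance(0, float), isinstance(0.0, float))
```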

#

thanks @grave frost

true crag
#

any idea how to use spacy

serene scaffold
true crag
#

I want to create a bot like chat bot in cmd

serene scaffold
#

what do you want to be able to talk about with the chat bot

true crag
#

I need like to be about to have a convo pretty basic

#

and the main thing

#

to train it so it can help me with like stock trading thoughts

#

or something like that

#

but for starters just basic convo

serene scaffold
#

so you want a chat bot that can have basic conversations, and help you trade stocks?

true crag
#

yes

#

stocks will come after

serene scaffold
#

I would make the stock thing a separate project

true crag
#

ye ye sounds good

#

but I dont even know how to use spacy to do the convo

#

I tried to use chatterbot

#

but corpus wasnt working for me

serene scaffold
#

you probably wouldn't use spacy for this, no

true crag
#

and how am i going to do it then

#

what is the use of spacy?

serene scaffold
true crag
#

so i can use it for a more advanced like chatbot

serene scaffold
#

you can also use it to get embedding representations of words and sentences, which gives you an entry point to do deep learning with language.

#

I'm not sure that making a chat bot is a great first project, unless you have a really specific set of topics in mind.

true crag
#

I need to make a chatbot like its a project

serene scaffold
#

you know what would be fun? an ngram sentence generator.

true crag
#

A basic chatbot, and then I can train it to somehow help with the trading aspect

#

ngram sentence?

tidal bough
#

markov chain sentence generation is fun and pretty quick to code

serene scaffold
#

ngrams are contiguous sequences of n tokens

#

[(ngrams, are), (are, contiguous), (contiguous, sequences), (sequences, of), (of, n), (n, tokens)]

#

these are bigrams.
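
The same list can be produced with a few lines of code. A small sketch, with whitespace splitting standing in for a real tokenizer:

```python
def ngrams(tokens, n):
    """Return all contiguous runs of n tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "ngrams are contiguous sequences of n tokens"
bigrams = ngrams(sentence.split(), 2)
trigrams = ngrams(sentence.split(), 3)
print(bigrams)
```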

true crag
#

oh ye i googled it

tidal bough
#
i should have thought , quite clear . he was heading back to gryffindor tower , and harry , somehow struck anew by how tall krum was , elaborated . were friends . shes not my girlfriend and she never has been . its just that dobbys plans arent always that safe . dont you remember , moody told us to be careful what we put in writing . we just cant guarantee owls arent being intercepted anymore . all right , all right . flint nearly kills the gryffindor seeker , which could happen to anyone , im sure 

here's what I generated the last time I tried my hand at them; trained on the first HP book
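
A minimal sketch of that kind of generator, a first-order Markov chain over words. This is not the code that produced the text above; the tiny corpus, the seed and the function names are placeholders.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length, seed=0):
    """Walk the chain from `start`, picking a random recorded follower each step."""
    rng = random.Random(seed)
    word, output = start, [start]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:          # dead end: the word was only seen at the very end
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

corpus = "the cat sat on the mat and the dog sat on the rug"
chain = build_chain(corpus)
sentence = generate(chain, start="the", length=8)
print(sentence)
```

Training it on a whole book instead of one sentence is what gives output like the sample above; using pairs of words as the state usually makes it read more coherently.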

true crag
#

So any idea guys on how I can start doing the chatbot

#

like any good library

#

or like a way to fix this error

#
```
  File "c:\python38\lib\site-packages\chatterbot\trainers.py", line 135, in train
    for corpus, categories, file_path in load_corpus(*data_file_paths):
  File "c:\python38\lib\site-packages\chatterbot\corpus.py", line 84, in load_corpus
    corpus_data = read_corpus(file_path)
  File "c:\python38\lib\site-packages\chatterbot\corpus.py", line 58, in read_corpus
    with io.open(file_name, encoding='utf-8') as data_file:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\Kyriakos\\chatterbot_corpus\\data\\english'
```
kindred radish
#

Geometry seems to be a "metric", but how is it non-flat?

tidal bough
#

Suppose all your points lie along a sheet of paper (a 2d manifold in 3d space). Now suppose that instead of being flat, that sheet is rolled into a tube, the ends almost touching. In 3d space, the distance between the ends is almost zero, but if you only count the distance along the sheet (i.e. the path needs to lie inside the paper) then these points are very far from each other, on the opposite ends of the sheet.

#

To capture the manifold properties in cases like this, you need to use a metric that uses the points, rather than just normal 3d distance.

kindred radish
#

Ok wait so:

#

like that?

tidal bough
#

yup, that's exactly what I'm talking about

kindred radish
#

ok awesome, thank you. So when an algorithm uses "non-flat geometry" it's talking about this cylinder?

#

When it's flat geometry then it's like a normal plane

tidal bough
#

Basically the non-flat metrics it's talking about are those that don't use euclidean distance

#

Like measuring the distance only along shortest links between known points.

#

These metrics will try to capture the red distance instead of the blue one, which is more correct - after all, if the dataset is actually distributed along the plane, the points at the two ends are going to be not similar at all, despite being close in 3d space (blue distance)
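
A small numeric sketch of the rolled sheet, seen in cross-section as an almost closed circle (the 200 points and the 1.9π arc are made-up numbers). The straight-line distance between the two ends is tiny, while the distance obtained by hopping along neighbouring points is close to the full length of the sheet.

```python
import numpy as np

# Points along a sheet rolled almost into a full circle (cross-section view).
angles = np.linspace(0.0, 1.9 * np.pi, 200)
points = np.column_stack([np.cos(angles), np.sin(angles)])

# Straight through space, ignoring the sheet.
euclidean = np.linalg.norm(points[0] - points[-1])

# Along the sheet: sum of the short links between consecutive points.
along_sheet = np.linalg.norm(np.diff(points, axis=0), axis=1).sum()

print(euclidean, along_sheet)
```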

kindred radish
#

Thank you this explanation is really lovely

#

How could you tell if a situation required this non-flat metric then?

#

Would you have to do some plotting first to figure that out?

tidal bough
#

That's basically what manifold learning is - it's a branch of unsupervised learning that tries to find a not necessarily linear hypersurface the data fits on. I'm not sure how to detect whether the results you get from it are good - presumably there are some scores you can get. Plotting to see in general isn't possible if you have more than 3 features (it's not exactly easy to plot 4d space).

kindred radish
#

Have each point in the fourth dimension represent a frame in an animation maybe?

#

but anyways, thank you for the explanation!

untold oriole
#

how can i plot my pd df in a tabular form? and also save it as an image. i use plt.table but i have this:

desert oar
#

@untold oriole what do you mean by "plot", what output would you want?

desert oar
tepid rapids
#

i have a dataset that contains job ads that are fraudulent and non-fraudulent. what kind of algorithm could i use to make a model that predicts whether the ad is fraudulent?

#

i tried using KNN but the lack of binary categories made it near impossible (at least for me)

desert oar
#

@tepid rapids logistic regression

tepid rapids
#

I'm unable to use logistic regression as it's for a group assignment and my other group member is using it x)

desert oar
#

that doesn't make sense, what's the project?

#

there are lots of different models that are similar to logistic regression in that you have input features, predict a yes/no (binary) target, and minimize a loss function to find optimal parameters

tepid rapids
#

we have an idea and we have to use 3 different algorithms to create it

desert oar
#

doesn't necessarily have to be linear

#

do you know what logistic regression actually is? how it works?

#

maybe you can use traditional logistic regression, xgboost, and a basic neural network with one hidden layer

#

they're all different "algorithms" but the models still have the same basic shape

#

but of course this is the content of your project - evaluate different models, try them, see what works and explain why
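
To show that basic shape, a from-scratch sketch of logistic regression on synthetic two-feature data: numeric input features, a yes/no target, and parameters found by minimizing a loss (log loss, via plain gradient descent). The job-ad features would first need to be turned into numbers; the data, step size and iteration count here are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
# Two made-up classes: label 0 centred at (-1, -1), label 1 centred at (+1, +1).
X = np.vstack([rng.normal(-1.0, 1.0, size=(n // 2, 2)),
               rng.normal(+1.0, 1.0, size=(n // 2, 2))])
y = np.concatenate([np.zeros(n // 2), np.ones(n // 2)])

w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probability of class 1
    w -= 0.1 * X.T @ (p - y) / n             # gradient of the mean log loss w.r.t. w
    b -= 0.1 * (p - y).mean()                # and w.r.t. b

accuracy = ((X @ w + b > 0) == (y == 1)).mean()
print(accuracy)
```

Swapping the model while keeping the same features, target and evaluation is what makes the comparison between algorithms fair.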

tepid rapids
#

i will try that thank you 🙂

crisp wing
#

Anyone know of a good way to deal with unique nan-values in a 2D data array? I wanna do SVD on the data, but since some rows have different nan values it complicates things.
I was thinking of simply setting them to a constant value, but I am unsure if this causes the outcome of the SVD to change

glass cedar
#

someone posted an interesting iterative method to approach this problem on stackoverflow: https://stackoverflow.com/questions/35577553/how-to-fill-nan-values-in-numeric-array-to-apply-svd
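
A sketch of one iterative scheme of this general kind (not necessarily the one in the linked answer): start by filling NaNs with column means, then repeatedly take a low-rank SVD approximation and overwrite only the missing cells with it. The rank and iteration count are assumptions you would have to choose for your data.

```python
import numpy as np

def svd_impute(data, rank, n_iter=200):
    """Fill NaNs by iterating a rank-`rank` SVD reconstruction on the missing cells."""
    missing = np.isnan(data)
    filled = np.where(missing, np.nanmean(data, axis=0), data)   # start from column means
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        filled[missing] = low_rank[missing]      # observed cells are never changed
    return filled

# Toy check: a rank-1 matrix with one entry removed.
truth = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
data = truth.copy()
data[0, 0] = np.nan
completed = svd_impute(data, rank=1)
print(completed[0, 0])
```

Filling with a constant does change the SVD, since the filled values are treated as real observations, which is why the iteration keeps refining them.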

crisp wing
spare vortex
#

hello

inner sparrow
#

hi

#

how do i use chatbot API

spare vortex
#

so about chatbot api's

#

you need to find it

#

or better you can create your own chatbot api itself

#

do you have an idea of how chatbots work?

#

and what kind of chatbot's there are?

inner sparrow
#

no not much

spare vortex
inner sparrow
#

!pypi chatbot

arctic wedgeBOT
spare vortex
#

Lol not this

inner sparrow
#

found it

#

oh

spare vortex
#

So there are I guess 5 kinds of chatbot

inner sparrow
spare vortex
#

it's very easy lol

#

the api part

#

only hard part is the chatbot itself

inner sparrow
#

isn't using an existing API easier

spare vortex
#

So chatbots are of many kinds
Question based chatbot
Rule based chatbot
Contextual based chatbot
NLP chatbots

spare vortex
#

and the good ones are paid

#

or not for public use

#

!pypi chatterbot

arctic wedgeBOT
spare vortex
#

this is a chatbot engine which uses NLP and it's self-learning

#

!pypi chatterbot-corpus

arctic wedgeBOT
spare vortex
#

noe

#

nope

inner sparrow
#

then i shall use it

spare vortex
#

you can use to create your own Chatbot

#

do you know what NLP is?

inner sparrow
#

but how do I get started with this API

inner sparrow
spare vortex
#

NLP stands for natural language processing

inner sparrow
#

oh

spare vortex
#

It is a combination of Rule based and Contextual chatbot

#

Now what is Rule based chatbot you may ask

#

Rule-based chatbots are chatbots that have rules and a database of prewritten answers

So a chatbot engine will take the message input and then use its algorithm to pick the best response from its database
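
A toy sketch of that rule-based idea: a small hand-written database of keyword sets and prewritten answers, and a matcher that picks the answer whose keywords overlap most with the message. The rules, the fallback text and the scoring are all invented for illustration and are not how any particular library does it.

```python
import re

# Hypothetical rule database: keywords -> prewritten answer.
RULES = [
    ({"hello", "hi", "hey"}, "Hello! How can I help?"),
    ({"price", "cost", "much"}, "Prices are listed on the products page."),
    ({"bye", "goodbye"}, "Goodbye!"),
]
FALLBACK = "Sorry, I don't have an answer for that."

def respond(message):
    """Return the prewritten answer whose keywords best match the message."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    keywords, answer = max(RULES, key=lambda rule: len(rule[0] & words))
    return answer if keywords & words else FALLBACK

print(respond("Hi there"))
print(respond("how much does it cost?"))
print(respond("tell me a joke"))
```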

#

And it can even learn stuff

inner sparrow
#

Yes, i need this

spare vortex
#

But when it comes to large and complex chatbots, Rule based chatbots dont work properly

inner sparrow
#

i want anything that works

spare vortex
#

for a rule based chatbot

#

!pypi python-aiml

arctic wedgeBOT
spare vortex
#

this is a chatbot ai and uses AIML

#

AIML is a chatbot language you can say

#

like Html

#

you can create tags of responses

#

and make it learn responses too

#

and the best thing about this is that
the author and the creator has a prewritten large database of conversations

#

and responses

#

you can get easily

#

for eg

inner sparrow
#

i want something purely ai that uses the API

spare vortex
#

wait not this

#

I will send you the link

inner sparrow
#

wait why is this so complex

spare vortex
#

chatbots are complex

inner sparrow
#

why can't i use a simple Api and get started

spare vortex
#

There is Kuki_ai
Gpt3

#

the best chatbots you can get your hands on

#

Gpt3 is so good that it can generate and read code

#

poems

#

etc etc

#

But Gpt3 is not opens

#

source

#

and not available for public

#

Gpt3 has an api service that you can enroll for

#

but you need to wait months to get it

#

and it's not a 100% guarantee you get

#

it

#

now you want a simple free api

#

there are a lot of apis

#

some random ml chatbot

#

bruhapi xyz

#

but they are bad

#

Even cleverbot

#

ehy

inner sparrow
#

Yes i need anything that's simple and that can work

spare vortex
#

they give out random responses

#

you want a chatbot that gives out random responses half the time

inner sparrow
#

bruh

spare vortex
#

then you can use it

#

that's how it is

inner sparrow
#

that's not ai and that doesn't work

spare vortex
#

^

#

you answered yourself

inner sparrow
#

i want something that works good

spare vortex
#

then you need make it yourself

#

or pay

#

for a good one

inner sparrow
#

this is tough

#

i quit

spare vortex
#

Ofc it is tough and complicated

#

but chatbots at the end, when they work

#

the efforts are worth it

spare vortex
#

of how to make a simple chatbot

#

let me get it wait

inner sparrow
#

it's okay now, i have to go sleep

#

thank you for explaining me everything

#

I'll ping you tomorrow if i plan to work on this

spare vortex
#

Python Chat Bot Tutorial - Chatbot with Deep Learning (Part 1)

#

bruh

inner sparrow
#

I'll check that out too, thanks

#

oh tech with Tim

spare vortex
#

this guy made a very good chatbot

inner sparrow
#

nice

#

yeah

spare vortex
#

and you can create your own response model

inner sparrow
#

hmmm

#

ok tyvm

spare vortex
#

I didnt explain you much today

#

but get some slee0

#

sleep

#

We discuss this tomorrow

inner sparrow
#

okie

oak violet
#

hello, I'm doing a course about data science in the university. i need to do a data science project: i need to use a crawler/api (or both) to create a database, and then run machine learning models on the data i crawl. i've been looking for a subject for 2 days and i didn't find something with enough information to crawl and to use machine learning on and that wont be a pain in the a@@ (sorry)... i would love to hear from you boys and girls if you have any subject for me and maybe some data websites 🙂 🙂

#

i would appreciate it if you dm me so i wont miss your answer 🙂

desert oar
#

@oak violet don't worry too much about the topic, spend more time worrying about doing the work. just use the wikipedia API, you can "crawl" it by following links in the articles

oak violet
#

@desert oar where can i find this wikipedia API?

desert oar
#

wait, thats the wrong one i think

#

maybe i mis-remembered. either they have a read-only content API, or they are just willing to let you scrape the html site if you don't abuse their servers

#

it's somewhere in their ToS i think
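
For reference, a hedged sketch of what "crawling by following links" could look like against the MediaWiki Action API, using only the standard library. The helper names and the User-Agent string are made up for the example. It only fetches the first batch of links per page and ignores continuation, so treat it as a starting point. Check Wikipedia's usage policy before running anything at scale.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"

def links_url(title, limit=50):
    """Build a query URL asking for article-namespace links on one page."""
    params = {
        "action": "query",
        "prop": "links",
        "titles": title,
        "plnamespace": 0,   # main/article namespace only
        "pllimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"

def parse_links(payload):
    """Pull the linked page titles out of a decoded JSON response."""
    pages = payload.get("query", {}).get("pages", {})
    return [
        link["title"]
        for page in pages.values()
        for link in page.get("links", [])
    ]

def fetch_links(title):
    """Fetch one page's outgoing links (performs a network request)."""
    request = Request(
        links_url(title),
        headers={"User-Agent": "student-crawler-example/0.1"},  # hypothetical
    )
    with urlopen(request) as response:
        return parse_links(json.load(response))
```

Calling `fetch_links("Data science")` and then repeating it on some of the returned titles gives a simple breadth-first crawl. Adding a delay between requests keeps the load on their servers low.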

oak violet
#

alright thanks a lot 🙂 @desert oar

topaz garnet
#

Anybody know of a python library that can extract MGDF from audio?

grave frost
idle summit
#

Hello help me

exotic robin
#

Hi all. College freshman interested in NLP and second language acquisition. Don't have any relevant experience with it. Ideally I'd like to learn how to develop/improve existing segmentation techniques for languages without spaces and detecting multi word expressions. I hear the Jurafsky book, Stanford cs224, Coursera machine learning, deep learning thrown around a lot but no idea where to begin. Ideally looking for resources that will put me in the right spot as soon as possible without too much detour. Anyone have any experience in this/suggestions? Again, I have no relevant experience in this. Thank you very much

serene scaffold
#

@exotic robin so you're interested in how to identify word, sentence, and morpheme boundaries in languages that don't separate words with whitespace?

exotic robin
#

precisely

serene scaffold
#

@exotic robin let me think on that and get back to you.

#

What language btw? Hindi? Arabic?

exotic robin
#

Japanese, chinese primarily for now, but Arabic Hindi would be a cool task later on

serene scaffold
#

Look to see if spacy has models for either of those

exotic robin
#

Yes for JApanese Chinese

#

No to Arabic Hindi

#

I've tried working with spacy before and it doesn't really do what I want

#

At least for japanese it uses a 3rd party morphological analyser

#

I think I have to reinvent or improve the existing technologies at a fundamental level

#

I'm also not just interested in segmentation but also semantics, recognizing multi word expressions and saying, hey, this part of the sentence is a common colloquialism, but doesn't actually have a grammatical breakdown

#

Japanese dependency stuff isn't very high accuracy at all, it fails even more in spoken language. Arabic especially with dialects is poor accuracy

serene scaffold
#

Aren't the writing conventions for arabic dialects not really standardized?

exotic robin
#

That's a problem, yes

serene scaffold
#

That was what I was told when I studied arabic

exotic robin
#

I suppose that's a way of putting it, although I haven't looked too much into the specific implications

serene scaffold
#

So your overarching goal is to identify idioms?

exotic robin
#

My overarching goal is to develop a second language acquisition system that can intelligently, with the aid of NLP, AI etc, recommend a user content that is only slightly more difficult than what they can understand, with an internal grading system that keeps track of understood and known words with user feedback. In the case where known words or grammar structures are homophonic/spelled the same, the system keeps track of dependency and use types to see what scenarios the user has seen the words in before

I want to be able to extract and annotate information in this manner:
"Yeah, bro. Whatever floats your boat."
is comprised of
"Yeah (Interjection), bro (noun). Whatever (determiner) floats your boat (expression) ."
which can be further broken down into
Floats your boat -> comprised of Float (verb) your (pronoun) boat (noun)
So if the user knows the words (float, your, boat), the algorithm will suggest this sentence, to allow the user to develop another meaning of the words within an expression.
If an expression can not be grammatically broken down (which is a specific type of multi word expression (I can link a paper about this stuff in Japanese)), then the system knows that.

The system will ideally need to understand, perhaps with lots of training data and human-built categorization, how to separate between different grammatical contexts when the word is spelled/sounded the same way.

serene scaffold
#

This sounds like a really cool project, actually

exotic robin
#

There are many details but I condensed for brevity

#

An example in english
"I want to be a clown for Halloween"
"When will supper be ready"

#

Are two different uses of the word be

potent nymph
#
from sklearn.svm import SVC
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.metrics import plot_roc_curve
import matplotlib.pyplot as plt

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
clf2 = SVC(kernel="linear", random_state=0)
clf2.fit(X_train, y_train)

plot_roc_curve(clf2, X_test, y_test)
plt.show()

I keep on getting the error ValueError: SVC should be a binary classifier in line plot_roc_curve(clf2, X_test, y_test), what's the problem here?

exotic maple
#

look at its arguments in sklearn

#

i dont remember it taking the classifier as an argument

potent nymph
#

i did already

#
import matplotlib.pyplot as plt  
from sklearn import datasets, metrics, model_selection, svm
X, y = datasets.make_classification(random_state=0)
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, random_state=0)
clf = svm.SVC(random_state=0)
clf.fit(X_train, y_train)

metrics.plot_roc_curve(clf, X_test, y_test)  
plt.show()          
#

is the official example (i literally copy pasted)

exotic maple
#

how many targets do you have?

potent nymph
#

target?

#

i just loaded the sklearn wine dataset

#

sorry im rly new to ml i have no idea what target means

exotic maple
#

In a classifying problem your target is what you're trying to classify

potent nymph
#

i also tried the iris set but same error

exotic maple
#

classification problem

#

yes because those are not binary datasets

#

ok think it like this

#

a binary target is

#

"WINE" "NOT WINE" the algorithm can only have 2 outcomes.

#

that's why it's binary

#

a multiclass (non binary) problem is when you have many targets. for example the iris dataset (setosa, versicolor, etc)

#

if the plot only accepts binary, yeah iris wont work
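
One way around the binary-only restriction is a one-vs-rest ROC: treat each class in turn as "this class" vs "everything else" and compute one curve per class. A sketch on the same wine data, assuming scikit-learn is installed. It uses `roc_curve` and `label_binarize` rather than the plotting helper from the question, and the variable names are only illustrative.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

clf = SVC(kernel="linear", random_state=0).fit(X_train, y_train)

# One score column per class (default one-vs-rest decision function shape).
scores = clf.decision_function(X_test)
# One 0/1 indicator column per class, in the same class order.
y_onehot = label_binarize(y_test, classes=clf.classes_)

roc_auc = {}
for i, cls in enumerate(clf.classes_):
    fpr, tpr, _ = roc_curve(y_onehot[:, i], scores[:, i])
    roc_auc[cls] = auc(fpr, tpr)

print(roc_auc)
```

Each `fpr`/`tpr` pair can be drawn with `plt.plot(fpr, tpr)` to get one curve per wine class on the same axes.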

potent nymph
#

ohhhh

#

i see i see

#

so what i was trying to do fundamentally doesnt make sense

willow quarry
#

hey guys i am getting a weird thing here

#

yesterday i made a keras model and now i am following the same path

#

but in the output i am getting 1 outputs instead of 1

#

this way i am being unable to train with my dataset

arctic wedgeBOT
#

Hey @sharp turret!

It looks like you tried to attach file type(s) that we do not allow (.svg). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a.

Feel free to ask in #community-meta if you think this is a mistake.

fringe tide
#

just a dumb question, opencv is related to this channel?

livid oar
vestal pilot
#

do y'all recommend the google coursera courses for data analytics / cleanup / manipulation for a relatively new beginner?

coarse loom
#

do i post help question here for pandas?

#

I have a data frame in Pandas that I would like to add a dynamic variable to every row of one of the columns. Please, could someone advise how this can be achieved?

abstract swift
true crag
#

Hey, I am trying to make a discord AI chatbot for general convo

#

Is there any good already made

#

or like an open source I can download to help me

#

If not, what is the best wai to make one?

wind panther
#

Is there any kind of solution out there which I could tweak a little bit according to my needs?

heavy cargo
#

We're looking for a Lead Consultant - Data Scientist to join our Capgemini Invent team in Melbourne. This role will see you developing insightful models for various projects with a focus initially on risk modelling using Python and R.

You'll have 2+ years experience in Data Science with a history of successful data science implementations, ideally in a consulting environment.

dm me for more information, feel free to share!

inner estuary
#

how can i convert all my columns which are float64 type to strings with pandas?

#

i have 867 columns, doing it one by one would be an impossible task

desert oar
inner estuary
#

As i have this many columns, it doesnt display all the column types for me, so i want a command that will find all the columns which are float64 type and then use .astype() to convert them to strings
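
A short sketch of that with pandas: `select_dtypes` picks the float64 columns by type, so nothing has to be listed by hand. The small DataFrame is made-up example data standing in for the 867-column one.

```python
import pandas as pd

# Stand-in data: two float64 columns, one int column, one text column.
df = pd.DataFrame(
    {"a": [1.5, 2.0], "b": [1, 2], "c": [0.1, 0.2], "name": ["x", "y"]}
)

# Names of every column whose dtype is float64.
float_cols = df.select_dtypes(include="float64").columns

# Convert only those columns to strings; the rest keep their dtype.
df = df.astype({col: str for col in float_cols})

print(list(float_cols))
print(df.dtypes)
```

Printing `df.dtypes.value_counts()` beforehand is a quick way to see how many columns of each type there are when the full list is too long to display.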

paper niche
inner estuary
devout sierra
#

How do I learn Python?

versed sleet
maiden saddle
#

ima learn through books...

serene scaffold
serene scaffold
arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

versed sleet
#

Just visited it. Lots of changes since I've been there. I can't believe Python 2 is still up to learn for free. smh

lone birch
#

I'm working on a discord bot and I'd like to embed charts from pychartjs. Is anyone aware of a way to export chart objects straight to a .png or something? I've used matplotlib for it in the past but would like to try out pychartjs for this project. The documentation doesn't seem to mention it at all.

Nevermind. I found out about quickchart.io 🙂

misty flint
#

eww python 2

grave frost
#

Information retrieval/NER?

grave frost
#

but it's pretty fast on CPU too in my experience

grave frost
uncut barn
#

what are the reasons that we use log-likelihood rather than likelihood apart from underflow?

true crag
grave frost
true crag
#

I need to make a chatbot as I said, any tutorial or useful article, i need to make a chatbot for discord

#

like a pretty basic one

grave frost
#

or you could hard-code the responses too

#

anything more "Ai-ish" would be fine-tuning or using pre-trained models

true crag
#

I need like a bot that will keep a conversation with the user

true crag
grave frost
true crag
grave frost
true crag
#

OK where do I get that?

grave frost
serene scaffold
desert oar
#

for context: making a chatbot that can have a "normal" conversation is still an open area of research and not something we actually know how to do, even with the huge amounts of research time and computing power available to the top researchers at tech companies

#

there are limited situations in which you can make kind-of functional chat bots (like SmarterChild from back in the AIM days...) using traditional AI techniques, as well as bring some deep learning to bear. but i agree, it's kind of a rabbit hole as far as hobby projects go.

abstract swift
true crag
#

And ofc I know its not ez thats y I am asking ;p

wind panther
# grave frost Information retrieval/NER?

Thanks! will have a look into this? Are there any python libraries you would recommend that deal with that? Or is there some kind of library? I just need some buzzwords to get me started :))

desert oar
wind panther
icy walrus
#

Hi

#

How can I use pyplot but add a title, a xlabel and ylabel, and the actual plot data all in one line?

#
```py
plt.title('Volume en fonction de n')
plt.xlabel('dimension')
plt.ylabel('Volume')

plt.plot((D := np.arange(0,20,1)),[(2**i) * sum(np.sqrt(sum([val**2 for val in p])) < 1 for p in np.random.rand((10000*int(np.exp(i/5))),i))/(10000*int(np.exp(i/5))) for i in D])

plt.show()
```
in one line, if possible
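
One hedged suggestion: the title and both labels can go into a single `Axes.set(...)` call, which leaves one line for the data and one line for the labelling. The `x`/`y` arrays below are placeholder data, not the volume estimate from the snippet, and the Agg backend is only selected so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 20)
y = x ** 2  # placeholder data for the example

fig, ax = plt.subplots()
ax.plot(x, y)
# Title and both axis labels in one call.
ax.set(title="Volume en fonction de n", xlabel="dimension", ylabel="Volume")
```

Squeezing the plot call and the labels into one literal statement is possible by chaining, but it tends to be harder to read than these two lines.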
exotic maple
slate tree
#

Please help me solve this error..

```py
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from xgboost import XGBClassifier
from sklearn.model_selection import KFold, cross_val_score
from sklearn.metrics import accuracy_score
from xgboost import plot_tree

df = pd.read_csv('D:/Python/pima-indians-diabetes.csv')

x = df.drop(columns='Class variable (0 or 1)')
y = df['Class variable (0 or 1)']

model = XGBClassifier()
model.fit(x, y)
plot_tree(model, num_trees=5)
```
slate tree
mint palm
#

if we do binary classification on animals(say cats) do we know which node of NN is responsible for which learned part.......
i mean if we train on black cat classifier and then apply to white cat it wont work...........so theres gotta be some node that are responsible for the color part............or is it that the probability on testing is collectively contribution of all nodes for all feature collectively like shape color eye etc etc?

#

seems like a great research topic if we know which node it contributing to what if each node has different things to access in image.

ornate hemlock
#

Hi

#

Can you help me in matrix?

#

I need some help

n = 6
mtx1 = []
print("Input elements as per rows: ")

for i in range(n):
   y = []
   for j in range(n):
      y.append(int(input("Input elements: ")))
   mtx1.append(y)

for i in range(n):
   for j in range(n):
      print(mtx1[i][j], end=" ")
   print()

def check():
    if (mtx1[i][j]==1)%2==0:
        print("even")
check()

User must input only 0s and 1s every row
It should display the matrix and print whether every row and every column have an even number of 1s or not.
I'm blank. completely. I have only wrote the code of making the matrix. It displays it but I can't display whether a row or a column has even 1s or not
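
A possible sketch for the checking part, in plain Python: since the cells are only 0 or 1, the number of 1s in a row is just its sum, and `zip(*matrix)` walks the columns. The function name `parity_report` and the small 3x3 matrix are made up for the example. The same function works on the 6x6 `mtx1` built above.

```python
def parity_report(matrix):
    """Return two lists of booleans: row has even 1s, column has even 1s."""
    rows_even = [sum(row) % 2 == 0 for row in matrix]
    # zip(*matrix) yields the columns of a list-of-lists matrix.
    cols_even = [sum(col) % 2 == 0 for col in zip(*matrix)]
    return rows_even, cols_even

matrix = [
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 1],
]

rows_even, cols_even = parity_report(matrix)
for i, even in enumerate(rows_even):
    print(f"row {i}: {'even' if even else 'odd'} number of 1s")
for j, even in enumerate(cols_even):
    print(f"column {j}: {'even' if even else 'odd'} number of 1s")
```

To reject anything other than 0 or 1 at input time, one option is to loop on `input()` until the value is in `("0", "1")` before converting it with `int`.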

exotic maple
misty flint
#

SHAP is a library thats good for ML explainability

lapis sequoia
#

Hi, could anyone help me with how to test my NLP model for Sentiment Analysis on newly added data? I have trained my model on labeled data with 3 categories and i have downloaded test data from twitter where i would like to predict a class. The test data set does not have any labels. Both datasets were treated equally. I preprocessed them and trained a word2vec model on each (BUT SEPARATELY), and i used the embeddings of the train dataset in the embedding layer of my classifier. I was considering transfer learning for this, but i do not know what i should do. I was thinking about replacing the weights of the pre-trained classifier model with the test embedding weights, because when i predict on my actual sequences from a sentence there is always an error with indices. or am I doing something wrong?

mint palm
grave frost
#

I wouldn't go too deep, but if you look up "visualized attention maps" for CNNs, you can also see which part the NN 'focuses' on

wicked mantle
#
label = label.to(device)
AttributeError: 'tuple' object has no attribute 'to'

do i need to convert label to a tensor to fix this error?

serene scaffold
#

I assume you're trying to do GPU computation?

serene scaffold
#

what library?

wicked mantle
#

pytorch

serene scaffold
uncut barn
#

why is the final clustering arrangement better than the first one for KMeans?

serene scaffold
uncut barn
#

yh this

serene scaffold
#

So the question is "why is the second one better"?

wicked mantle
uncut barn
#

yh

serene scaffold
#

Do you have any feelings about why the second one might be better so far?

uncut barn
#

is it because the intra cluster sample scatter smaller?

serene scaffold
serene scaffold
uncut barn
#

i.e. this

wicked mantle
serene scaffold
#

This is a pretty obvious divide in the samples that the first image doesn't account for.

serene scaffold
uncut barn
#

also what are the reasons that we use log-likelihood rather than likelihood apart from underflow?

serene scaffold
uncut barn
#

These are not from an assignment but from a lab

#

but I think that -ve log likelihood has derivs that are easier to calculate but what is the advantage of using it as a quality metric rather than using the likelihood?
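
A small numeric sketch of two of those points, using only the standard library. The probabilities and the Bernoulli example are invented for illustration. First, the raw likelihood of many observations is a product that collapses to 0.0, while the log turns it into a sum that stays representable. Second, because log is monotonic, the parameter that maximises the log-likelihood is the same one that maximises the likelihood, so nothing is lost by comparing models on the log scale.

```python
import math

# 400 independent observations, each with likelihood 0.01 (made-up numbers).
probs = [0.01] * 400

likelihood = math.prod(probs)                      # product underflows
log_likelihood = sum(math.log(p) for p in probs)   # sum stays finite

print(likelihood)
print(log_likelihood)

def bernoulli_log_likelihood(p, successes, trials):
    """Log-likelihood of success probability p: a sum of per-sample terms."""
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)

# Grid search over p for 30 successes in 100 trials.
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: bernoulli_log_likelihood(p, 30, 100))
print(best_p)
```

The sum form is also why the derivatives are easier: the gradient of a sum is the sum of per-sample gradients, whereas differentiating the raw product needs the product rule across every factor.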

lapis sequoia
#

hi

lilac raven
#

why is it helpful to use transpose on matrix things such as x = np.random.normal(0, 1, 500) y = np.random.normal(0, 1 ,500) X = np.vstack((x,y)).T

#
```py
# covariance of two 1-D samples
def cov(x, y):
    xbar, ybar = x.mean(), y.mean()
    return np.sum((x - xbar)*(y - ybar))/(len(x) - 1)

# covariance matrix
def cov_mat(X):
    return np.array([[cov(X[0],X[0]),cov(X[0],X[1])], \
                     [cov(X[1],X[0]),cov(X[1],X[1])]])
# calculate covariance matrix
cov_mat(X.T) # (or with np.cov(X.T))
```
just wondering why when you don't transpose it becomes fucked
wild dome
#

does plt.show() dispose figures and axes? I have nested loops, there's a scatter plot before the inner most loop, and inside it there are more plots and the call to show(), but at every iteration only the inner most plots are still visible, the scatter outside the loop is not, why?

flint mason
#

can we put a title on a 3-d scatter plot ?

red hound
#

Is there a way to have multiple tensorflow versions installed (and ready to choose) on one system?
I have several conda envs and the only thing reliable is the pip version as conda keeps making weird things. The pip version installed is global (activating a conda env with say 2.3 still results in running on pips 2.4.1)

I need multiple versions as tf produces weird bugs depending on code and version. Some things run in 2.3 which crash in 2.4.1. Some run in 2.3 but not in 2.1 or 2.4.1 and so on

desert oar
wild dome
#

is there a way to make like layered plots?

velvet thorn
wild dome
velvet thorn
#

that depends on what you're trying to do.

#

with vstack each source array becomes a row (axis 0) in the result

#

by convention, covariance is calculated over axis 1

#

and an array of shape (2, 500) means you have 2 samples and 500 features

wild dome
velvet thorn
#

transposing this means you have 500 samples and 2 features
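
A quick shape check that may make this concrete, assuming NumPy. `np.cov` treats each row as a variable by default (`rowvar=True`), so the (2, 500) layout gives the 2x2 matrix directly, while the transposed (500, 2) layout needs `rowvar=False`. The variable names are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 500)
y = rng.normal(0, 1, 500)

stacked = np.vstack((x, y))   # shape (2, 500): one row per variable
X = stacked.T                 # shape (500, 2): one row per sample

cov_rows = np.cov(stacked)            # rows are variables -> 2x2
cov_cols = np.cov(X, rowvar=False)    # columns are variables -> same 2x2
wrong = np.cov(X)                     # 500 rows read as 500 variables

print(cov_rows.shape, cov_cols.shape, wrong.shape)
```

Passing the (500, 2) array without transposing or setting `rowvar=False` does not raise an error. It silently returns a 500x500 matrix, which is usually the "it breaks" symptom.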

velvet thorn
velvet thorn
#

between the object-oriented approach (using Figure and Axes objects directly) and the state-based approach (everything through plt)

#

the latter of which I believe mimics the MATLAB style?

#

also, what environment are you running your code in?

velvet thorn
#

wait

#

Jupyter LAB

#

or Jupyter NOTEBOOK?

desert oar
#

you can overlay multiple things on the same Axes

#

as well as plot multiple axes on a grid

wild dome
desert oar
#

what "wasn't working" about it?

#
fig, ax = plt.subplots()
ax.scatter(x, y)
ax.scatter(u, v)
wild dome
#
for loop:
  plt.scatter(...) # this kept disappearing
  for loop:
    plt.scatter(...)
    plt.show()
desert oar
#
fig, axes = plt.subplots(2, 2)

axes[0, 0].scatter(x, y)
axes[0, 0].scatter(u, v)

axes[1, 1].plot(a, b)
axes[1, 1].scatter(c, d)

etc.

desert oar
#

can you share your actual code?

wild dome
desert oar
#

ok then. in general: to plot on the same set of axes, use the underlying Axes object and its associated methods, instead of plt. Use plt.subplots to create a new Axes, or multiple Axes in a grid

wild dome
desert oar
#

if you do show() inside the loop it will create a new plot at each iteration of the loop
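
A sketch of one way to keep the outer scatter visible in that nested-loop setup: compute the outer data once per outer iteration, but draw it again on every new figure created in the inner loop. The random data and variable names are placeholders, and the Agg backend is only selected so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
figures = []

for outer in range(2):
    base = rng.random((20, 2))  # data for the "outer" scatter
    for inner in range(3):
        fig, ax = plt.subplots()  # fresh figure for this inner step
        # Redraw the outer layer on every new figure.
        ax.scatter(base[:, 0], base[:, 1], label="outer")
        extra = rng.random((5, 2))
        ax.scatter(extra[:, 0], extra[:, 1], label="inner")
        ax.legend()
        figures.append(fig)
        # In an interactive session, plt.show() would go here.
```

The cost is re-plotting the base layer each time, which is usually negligible for scatter plots of this size.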

#

are you trying to create an animation?

desert oar
#

in that case, you need to re-draw it every time

wild dome
velvet thorn
wild dome
desert oar
#

im surprised that jupyter doesnt support matplotlib animation

#

at least try it

velvet thorn
#

it does

wild dome
velvet thorn
#

you could

#

use Jupyter widgets

#

to interactively control the "flow of time" of your plot

#

that's probz what I would do

#

but you have to be comfortable working with that kind of stuff

wild dome
velvet thorn
#

added complexity

#

anyway the point is

#

in interactive mode

#

you don't call plt.show

#

or fig.show

#

you just make changes to the Figure/Axes

#

and things happen

#

but you need to modify the backend

#

MPL uses

#

%matplotlib notebook

#

it's been like a year+ since I worked with this though

wild dome
#

okok gotta look into that, thanks

#

there should be a channel for plotting related stuff

velvet thorn
wild dome
velvet thorn
#

this is the best channel for MPL, plotly, etc.

#

also a bit niche

#

a channel just for plotting stuff...

wild dome
desert oar
#

people also post in help channels, but you will see more relevant activity here

#

(and have more relevant help)

#

the "how do i chat bot" questions dont usually go anywhere anyway

willow quarry
#

guys

#

i done something

#

some of you may remember about me talking i was going to make a reinforcement learning

#

i had built my own environment to work with retroarch

#

it still has many rough edges

#

but it is kinda working

#

unfortunately i am really bad at making agents and models so it is taking a while to be able to show you my results

#

and it needs a little set up so its not just post here and you are ready to try yourself

#

but i am working on a way for you to easily build models for it with simplified parameters and working examples

#

but i am more focused on modular score reading, where with some keywords it will be easy to mount it for other games

#

i will soon be launching it on git hub

#

and i count with your help to expand it for more score methods and more agents so we can make tournaments on twitch of ai competition

exotic maple
astral path
#

would fuzzywuzzy be on-topic for this channel?

#

as data manipulation

willow quarry
#

fuzz wuzzy?

desert oar
#

@astral path close enough

astral path
#

nvm i solved it already

desert oar
arctic wedgeBOT
#

fuzzywuzzy/process.py line 81

logging.warning(u"Applied processor reduces input query to empty string, "```
exotic maple
desert oar
#

they use other libraries for the actual edit distance implementation anyway

latent blaze
#

How to fix this.
UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure. plt.show()
I have python-tk installed
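
That warning means pyplot ended up on the non-interactive Agg backend, which can render to files but cannot open a window. Whether a GUI backend such as TkAgg can be selected depends on the environment (for example, whether the interpreter running the script can actually import tkinter), so that part is left as something to check. A workaround that works under Agg is to save the figure instead of showing it. The output path below is just an example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # explicit here; this is the backend from the warning
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Agg cannot open a window, but it can write image files.
out_path = os.path.join(tempfile.gettempdir(), "figure_example.png")
fig.savefig(out_path)

print(matplotlib.get_backend())
print(out_path)
```

Running `python -c "import tkinter"` with the same interpreter is a quick check of whether Tk is visible to it. If that import fails, installing the package for a different Python version would explain the warning.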