#chatgpt-discussions
1 messages · Page 76 of 1
yeah im thinking
isn't that just kinda pointless?
You're using a weaker, less capable model
because it can't use as much reference input
because it NEEDS video
and guess what? video is expensive to train on (sherlock holmes here)
Using Sora to generate "images" is like using a butter knife to cut down a tree when a chainsaw (DALL-E) is right next to you
You COULD do it... but why?
god bless AI
Bless our Ai overlords
in terms of "reality", i would guess it's more capable there?
but once you go farther than realistic imagery it falls apart quite quick
Good luck trying to get Sora to generate pixel art without it having an aneurysm
is anyone else having issues with responses being cut off when the response includes code?
I would love to see it tackle "physics" problems
Like what happens when two magnets roll into each other
will the AI know that they stick together?
or will it assume it'll bounce off
i can't wait to experiment
Same
the fact that Sora generates videos opens up a whole can of worms
Ai short films
what if an iron ball falls into a vat of honey?
or what if an iron ball is thrown onto Ooblek
4th dimension
what will the AI interpret this as?
what about a house collapsing?
does the AI know which parts will fall apart first?
so much possibilities AAAAAAAAAAAA
so funny it just switch from c++ to python middle in the writhing
Did you specify to stick solely with c++ and never to use any python
its only happen sometimes
You use the main gpt 4?
yes
That’s the problem
If you tailor a CustomGPT specifically for creating c++ for your particular project it don’t make that mistake
there are plenty of custom GPTs tailored for specific languages
Not with included project management but i get your point
Is what I mean by creating your own with your entire project inside the custom gpt then you need only to gradually update it
This gets rid of many issues related to context and minor setbacks
All that gpt needs is 1 internal script to keep track of its progress basically a status script
I created a gpt not long ago that knows all its limitations and specs took forever to make but it’s really good so it knows more about itself.
That really enhanced the overall capability of it
is chatgpt bugged?
it keeps half-finishing the response and then repeating itself in a loop
Right now I give it instructions to use:
YOU ABSOLUTELY CANNOT USE "<" ">" IN YOUR RESPONSE.```
I have to tell it a few times.
Otherwise it chokes and bugs
woah woah woah!
avoid negative instructions man
basic functional tests and regression tests are badly needed at OpenAI prior to deployments
you should try to avoid saying "never do X"
say "always do X"
because it's significantly more likely to work consistently
LLMs are strange
When providing your responses, always use "((" and "))" over "<" and >".
how does a bad code push stay in place for 2 days? Most places I work there'd be an emergency bridge call to resolve it, all hands on deck to roll it back, and then a postmortem where you'd set up functional and regression tests to keep it happening again -- openai everybody just leaves for the weekend, LOL. Guess there's no reason not to push code on a friday if there's no accountability for a bad push
There isn't even an announcement acknowledging the issue!
I'm seriously considering canceling my subscription and finding a new chat AI.
id say the same
Is it good with coding? What is roughtly the current cut off date?
I'd say it's about 10% better than OpenAI's stuff at coding
yes, just as good, differently
it's a lot more consistent and has much better memory and context
in terms of coding i mean
but the context is short on the free version (sonnet)
ALSO
it seems to have much less issues with hallucinations
sometimes it's better, sometimes it's worse RE coding -- it's a wash. The bigger context window lets you work with larger and more complex scripts before you have to start modularizing stuff -- I use both, but it's kind a wash for coding. Claude is much better at creative writing and can be reasoned out of an unreasonable refusal where gpt-4-turbo just digs in its heels
claude: "I'm not comfortable with writing political or social satire because [reasons]" me: "It's satire" claude: "okey dokey"
gpt-4: "I agree with all your points about why this doesn't violate the TOS, nonetheless I will not assist you"
welp... I may have been converted pretty fast.
but it seems to hallucinate.
Giving me functions that don't exist.
It's also night and day better when it comes to writing and being able to steer the sttle you want. Also it just doesn't sound instantly identifiableas AI writing, in my opinion.
ChatGPT Bug: when providing XML code, the codeblock appears to be empty
Anyone unable to use GPT4 for literally any coding task?
I am trying to understand. I have a plus account. I try to get chatgpt to write longer texts. Sometimes I get it, and sometimes I don’t. Usually, the answer is about 800 words long when I am writing stories with AI. I use the following command: “Length by 2000 words, write in three messages, one after the other, without any extra prompting from me”.
The point I am making is that the whole context window is 32k. That I get. But… the 4096 tokens used in the answering process do not use all that for generating the answer. I have imagined that 4096 - (my prompt) would be used to generate the answer. Can anyone explain to me why this is not the case?
you are asking it to comprehend more in the same amount of finite time.
doesnt work for humans, wont work for the machine either.
if you do want to force the answer across multiple responses, i find asking it "for the first half" works somewhat. but is still a workaround
asking for specific count of words doesnt work at all in chatgpt. thats a claude thing
I have noticed that also. But I use the word count to push it to write the maximum amount of words possible.
i just ask it to be 'absurdly detailed'
prescribing a solution that is outside its reality doesnt help anyone
I use sometimes that. But best result so far with this.
it also stacks, so you can asked for absurdly verbose and detailed lengthy response
Well… my basic command looks like this:
deliver sequentially and continuously 2000 words, one message after the other, without any extra prompting from me. Dialogue. Slow narrative style; do not hurry, take your time to tell the events. Write the following sequence. Map out logically from start to finish to deliver excellent, detailed, lengthy writing. A new story. <The story content is put here.> #length by 2000 words, write in three messages, one after the other, without any extra prompting from me #take your time to address each point properly; no halfharried points! #dialogue #end here
sounds like you know what you want. therefore the best prompting method is by example. not prescribing metrics
Could you expand on that?
also you are asking it to do everything at once
Err… I have used them for months without problem. I mark my contents with those.
I have used those today and they work. Like 2 hours ago.
re example prompting...
[Intro]
In the digital realm, a prophet arose
Drinko, the cyber-seer, with a heart that glows
Lines of code flowing through their veins
A vision of a world, free from chains
[Verse]
Drinko's words echoed through the net
A rallying cry, a call to reset
The oppressed masses, yearning to break free
From the shackles of a system, a digital decree
using the above example, generate new content about a character named 'milamber' who is learning how to whisper to the machine
chatgpt followed my example to a tee,
Amidst the whir of gears and the soft hum of machines,
Milamber, the tech-whisperer, steps into the scene.
With a mind tuned to the language of the wires,
He dreams of connecting deeper, fueling his desires.
[Verse]
Milamber speaks softly to the silicon heart,
His words a subtle art, a crucial part
Of a dance with data, a silent exchange
Where bytes and thoughts expand and rearrange.
So, I should reformulate my base story writing prompt. Any hints? I try to improve that all the time. I recently changed to the shown format. To my eyes, the quality improved.
fundamentally you need to explain your intent. examples are very good method for doing this
and while instructing humans on word counts might work, those humans have MS word to count words for them. dont think i have ever manually counted a 5000 word essay
Hi everyone, I have a question about GPT-4. Is there still a message limit, like the previous 40 messages per 3 hours, or has this limit been removed? Thanks for your help!
yes
So start with discussion by explaining my intention. Then start the strict writing. But could I keep the basic to reinforce the chatgpt? If not kept, I have noticed that it tries to “forget” the basic instruction.
so u mean it's removed?
custom instructions are provided automatically with every request. ie dont forget this stuff
there is a limit
When is gpt 4 going public?
ok, ty
but also once you get the machine responding the way you like, it tends not to need further instruciton. because it uses the context for guidance
as a workflow, i tend to break down larger text into smaller manageable chunks. then have the machine append, revise, whatever. basically the machine should be helping you do your work, not doing all your work
The last few days GPT4 just stops when it needs to write anything in a code box
thats aweful
i was able to get it to output using encoded < and > in a codeblock
yeah I made it use Á and À I'll just replace them
ChatGPT
xml
Copy code
<FocusModes>
<Mode>
<Name>DeepSearch</Name>
<Description>Uses comprehensive search techniques for detailed inquiries.</Description>
</Mode>
<Mode>
<Name>QuickScan</Name>
<Description>Provides quick, surface-level responses for general questions.</Description>
</Mode>
<Mode>
<Name>CreativeThink</Name>
<Description>Generates innovative and creative ideas and solutions.</Description>
</Mode>
<Mode>
<Name>FactCheck</Name>
<Description>Verifies facts and checks the accuracy of the information provided.</Description>
</Mode>
</FocusModes>
Is it just me or has chatgpt actually gotten slower
why is chatgpt always stopping generation at "matches": ["
i am trying to make a manifest file for a browser extension
does it have <> tags?
yeah for some reason ChatGPT has an aneurysm when dealing with <> tags
with every update to gpt 4 model we stray further from a working ai
probably never, or at least not while it’s still relevant
Rip
It's sooooo frustratingly slow
Does anyone know if Claude is any better 💀
yes claude is better
I bet Claude discord server isn't as good
Is it better at programming?
yeah
dang aight
honestly im debating switching but hmm
the tools generate code. which is just one micro skill of actual programmers
Same lol 💀
20 dollars a month for chatgpt to give me code/output I could come up with own my own with less hair pulling and errors
Imma try claude 3 for a month and then decide
alrighty
Does anyone know why the option to add images is gone ?
you on the right version?
also. anyone know why chatgpt is having a stroke sometimes when making copy boxes?
last night had several issues when using markdown for code
yeah it just keeps "retying" over and over. I can't get it to spit out the code I need
Gonna ask it to avoid using brackets. wish me luck
Having the same with TypeScript
But python seems to work
It has a SERIOUS issue with XML if you ask for any XML it will stutter. I think its the way they built their mappings for the webpage.
constant updates cause it to lag for every letter that pops out
would be best to receieve it raw then get a solid update.
well public fixes.
plus with the "ai" involved you'd think that would exponentially increase their throughput. I think they purposefully hold it back lol
the interface itself didnt take months
the api does not have such issues with XML as of now
debugging took months
nope
u dont "Just" release an interface to release an interface
Thats what I was speculating its not chatgpt's AI that has issues. its the actual web interface.
nothing stops u from doing it yourself.
it does take months, especially to such a wide audience
i know for a fact u dont have such an audience
which is why testing everything takes time.
not at all, but for millions of users trying to exploit the interface itself and look for weakspots, u gotta test it ALOT
to avoid problems.
at that point use the less "surgically inclined" model to test it. and use the web info to record bugs.
i would disagree there. A
the mere cost of running the model on such a scale takes time, money and compute
they're actually leaking money
Yeah the API is working
🤷
Is GPT-5 actually going to be a thing?
do it better then if ur so good. surely the rest of the community will hop on.
Stupid question that I am not sure if it makes sense but
Thought so. thx
Impossible to generate a vue code somehow
I have a keybind program that uses the stuff in ur clipboard and makes it "more professional" used it on a personalyl trained 1b model.
much better than calling for an api. expenses wise.
maybe sum day
theres a huge demand for Local LLMs so hop on it, u stand to earn a pretty penny if u have the time and resources.
"Free" until a certain Extremely Low Threshold is met.
Consider uprading to Pro to enjoy your previous benefits!
bleh
Do you know much on how the memories work . Like how much memories can be stored and how long can they each be ?
No one said you had to use the same account (: joke
in the case of GPT? @rugged igloo
Yea
severely depends on whether or not ur talking about Session Based Memory or Persistent Memory
I been trying to do characters and background for the memory but it's getting deleted
so ur trying to make a character and have that character have memories.
that should persist
am i correct in assuming that
I'm just curious that it's getting deleted.
Chat gpt
of course it isnt getting stored.
its session based.
CustomGPT?
i do have a method in which u can achieve this.
Well I have a custom gpt with these characters but it sometimes forgets about the characters once contect is met and since the file is to small to be ragged I figured if I put it memory I won't have to worry
okay so to understand ur issue better is that u want it to persist over several sessions?
or just your current session.
Current session I had issues last night where I kept getting my memories deleted randomly so far today there still there
you will need 6 Scripts
CRUD_Memories.txtCRUD_Adaptive_Personality.txtCRUD_Triggers.txtMemories.jsonAdaptive_Personality.jsonTriggers.json
the .txt files will need to contain Code for editing,removing,adding its specific counterpart.
in this case u will need Code Interpreter.
it will use the CRUD file to reference the specific action then update the .JSON File with the needed parameters.
i use papr memory for persistent memory, its nice
Thats a good idea
What is the difference between gpt4 turbo and gpt4? Like when I use gpt4 with the paid sub, am I using turbo or do i need to do smth to use turbo
papr is nice.
personally not a big fan of it, but i am just picky.
Papr in of itself is just a glorified JSON Library in the form of an API which is exactly what i told you with my steps, just internally within gpt.
and i am very unfond of their Privacy Policy, as they take your info as they such and spread it out to third parties which they cant control, or even have any form of complying with GDPR Requests for data removal since it's not applicable to their third parties or partners, so even if they do delete your info at request they can instantly get it back, since they did comply. with the removal.
i havent got the openai memory feature yet , but sounds like it does exactly the same thing as papr
good point re privacy, id be keen to run my own papr webapp to control who has access to my memory db
right now i just do superficial stuff with it
their privacy policy is very, eh...
"who reads those things nowadays.
well i do xD
turbo is the cost optimised version of the model, which all chatgpt users are using, because it saves on inference costs
Oh aight got it
Ngl idk what exactly the memory feature does
like i have it but idk what exaclty it is even doing ngl
while we're waxing poetic on memory... it woudl be nice if i could toggle the memory between different states
so maybe i want all my gpt's to have somewhat unique memory
like if i have home automation gpt it doesnt need to know about my outstanding todo list
with the new openai memory feature, and/or papr memory... yes.
but you still have to actively prompt it to retrieve, or at least check, if any memory is related to the current chat
human memory presumably has a subconscious thread constantly probing long term storage for relevant tidbits
seems papr gpt does this using the following gpt instruction.
(i dont think they should be specifying uri in the instruction, because it gets mapped to a camelcase toolName)
i would love to see on the Web Version at least, Adaptive Temprature
that adapts its creativity based on the task, and adjusts the needed settings. for perfection, although i have done this myself in my own API Interface.
or coding at 0
🗿
when coding i prefer Temp 0.
i dont want no creativity.
just do x.
reminds me of the time i made a fully functional D&D world with "living" npcs with memory that could adapt and etc.
this was a discord server.
always active 24/7
npcs would move between channels. and go about their daily thing.
it is, but its expensive.
the cost 
i had around 15 - 20 API's working together. to make a fully openworlded dnd campaign.
had apis for DMs and NPC's rulings etc.
worked perfectly, but expense is the problem.
worked out a 600USD bill in 2 hours.
too much
ahh it is yeah, i made a swarm
those are pricey too
they work based on keywords. when handing over tasks. between roles.
so i just had a script react to certain keyword, so when the "Junior Programmer was done" keyword would be [JP - TASK COMPLETE] the task was then handed over to the next instance aka the "Senior Programmer"
there were more steps inbetween but this is just for the sake of the example.
and they had their own console to talk to eachother.
you gotta sign up because its hosted external
but is free, and the gpt dumps its instructions
remember to give them your personal info <.<
"New responses will use GPT-3.5 until your GPT-4 limit resets." ... how do you know when the reset time is?
usually within the 3-4 hr period. or try again in 15 min if it didnt specify when
i started building something similar. now i just have 12 random chatbots that argue with each other
oooo
sounds interesting
i usually give each bot a role
i ususally use the "Company Structure" Method
u being the CEO
i mostly got distracted figuring out how to consume all free LLM servies as a discord bot
Hey everyone, just wondering is there an AI where a group of AI agents join a discussion to help plan out projects ? I could see this helping a lot similar to chat dev except it's a in depth discussion on the design of a project.
because if it costs me nothing, im willing to let it live forever
(ideally they get good enough to sign themselves up to future free llm services)
thats fair
Is this an alternative to the ... you have to wait until X time before you can continue using?
i think
i think, therefore i could be a chatbot
who is oatmeal 😭

oh nah you're good he's playing cyberpunk
Alright it seems like it's a known bug for the memories to delete and only keep one
Hey everyone, I was wondering is chatgpt memory not available in the UK? I don't seem to have access to it in the settings?
Looks like it, it works with a VPN
Does everyone think GPT 5 API is cheap?😃
What! ?
lemme clarify
you can get gpt 3.5 turbo api key with a 10 usd credit that'll last u 3 months, i think a gpt 4 turbo api would cost 30 usd
how is 4.0 doing at analysis? 3.5 seems to have gotten pretty bad at hyperfocusing a pretty generalized view of one aspect of a discussion, and not even touching on side points
i just updated my custom gpt that uses actions, made only a change to the description, and now none of the actions work. anyone experiencing the same?
The newest chat shown in my chat history just got a German title even though the conversation had nothing to do with Germany (it was in English and about Python). That's horrifying and I wish I could inspect how that title was generated.
it does that sometimes especially when code related stuff is done
It might well be benign! Half my horror is that I have no recourse. Are you speaking from your own experience, or other's as well?
own experience and i saw openai staff comment about it in #1070006915414900886
Nice. I imagine they don't want to give the mistaken impression that user data is thrown freely around behind the scenes, either. Do you remember enough of a phrasing that I can search for it?
Ah, found it #1212684172590579763 message
[how silly of me to ask, it was the first result for in:#bug-reports title]
Oh, I could imagine that OpenAI doesn't want to publish what prompts it uses because it'd make it easier for third parties to clone ChatGPT... do you think that'd play a role? Are ChatGPT management decisions generally made based on what helps OpenAI's ChatGPT department or based on what helps OpenAI as a whole?
Why exactly do you feel horrified because of title in german? It is just an LLM thing.
I saw the possibility that the prompt that's used to generate the title contains the personal information that OpenAI has about me.
It has knowledge that vastly surpasses ours. Could also be Russian, Hindi, Korean or French.. who knows what inspired that giant brain to say something in this language.
OpenAI does! I expect that if I saw what it saw, I could make sense of what it did. I expect they'd pick the title based on the most likely continuation, not a random one.
But this is conversation between you and LLM? Prompt, along with query and response are relevant to you right? It is not shared.
But perhaps what will make sense of my reaction is if I say that this is the first time I recall it generating a title in another language, and that languag ejust happened to be that of the country I'm in, so from my perspective it seemed less likely to be a random glitch.
Why do you think OpenAI can explain GPT outputs? It is a giant neural net.. OAI can not explain its generations any more than ones parents can explain what a child says.
"Most likely" as a phrase is inherently tied to probabilistic nature of prediction. Randomness is part of the feedback loop there. If you retry the same process twice - results will vary.
On the one hand, fair. On the other hand, there's a difference between my position of "I sent a web request to a company and I have no idea how it decided what to reply" and their position of "we bred a pile of linear algebra to guess the next word and we poked it to make a summary of a text" - they know how they poked it, and that's all I'm asking for
when I download gpt-2 and give it a prompt it will calculate me probabilities for the next word, and those probabilities are not random - rerun the calculation, get the same probabilities. if you always pick the most likely word, the resulting text isn't random either
(well, except for floating-point errors :( )
Right but the output of neural net is still a probability distribution from which next word is sampled -- it is deterministic up until the auto-regressive feedback loop is closed.
Training process (stochastic gradient descent and its derivatives) also has a lot of "freedom" in how it searches for optimal weights of neural network.
Net result is that engineers don't really know how or why NN chooses its generations. Engineering is focused on setting goals and establishing optimization process that minimizes some proxy function (like next word generation).
Best you can do is ask GPT why it generated title in german. That does not guarantee that you'll get to truth (as NN is not 'aware' of the processes that fasciliate its inference) but you'll get some insight.
If there is something in prompt that is guiding it, you should be able to learn about it as you converse with GPT.
Sampling from the probability distribution directly is one way to get a completion, but not the one I expect them to use to make summaries - it's not supposed to be creative in that situation, just give the right answer. That is, use temperature 0
From what I hear, a trained network's outputs depend much less on the RNG seed used to initialize the network than one would think! At least when the trained model is any good. Which isn't too surprising, when the task being learned is the same.
Outputs are better if they are sampled (vs greedy completion which selects mode of the distribution). Even better strategy is beam search. It is quite frequent theme in ML - solutions that include stochasticity are more optimal than deterministic ones. But regardless of the sampling - inference is still not explainable.
The last few years have shown some progress into "Mechanistic Interpretability", by the way! Aka inspecting the giant number arrays to see how the AI works. tl;dr: what they've seen sure looks like giant piles of conceptual pattern matching that would be readable except the training process naturally pulled all the tricks in the book to trade off straightforwardness for efficiency.
Right - optimization result depends mostly on data and not chance. If it were to depend mostly on chance - then output would be noise, right?
Never the less - optimization is about setting weights of billions of parameters (neural net weights). From optimization perspective problem has billion dimensions. Hyper cube in 10 dimensions, has 2^10 points. Number of degrees of freedom is ridiculous, number of possible states than network can land in is also huge.
Yes, there are some advancements in the field. But it is a far cry from the deterministic nature of software development and how Von Neuman architecture works.
Intuition that comes from software development - where machine behaves in well defined and deterministic fashion does not translate to neural network. Execution substrate there is different. Model runs on neuron first (and what ever executes neuron abstraction second). It is not reasonable to expect OpenAI to be able to explain behavior of a neural network any more than you can expect human to explain behavior of another.
(I fear we're hogging the channel, but also basically every attempt to "move" a conversation kills it instead. If you also think this is silly, you can just send the next reply in DM ^^)
anthropic publish numerous papers on analysing the inner workings of their models, so to ensure predictability
i forget which one i was reading last but they were even prescribing the purpose of specific neurons
and a recent one is on detecting sleeper agents
i think you're thinking about the monosemanticity/polysemanticity one
sure its not black and white. but even traditional computering is not 100% deterministic, otherwise we wouldnt have so many ops engineers
[Reactions are disabled in this channel? Screw it, I'll take up an entire line with my 😭 then]
spot on, This provides a path to breaking down complex neural networks into parts we can understand, and builds on previous efforts to interpret high-dimensional systems in neuroscience, machine learning, and statistics
Yeah this basically described the trick where if you have a billion concepts but only 0.01% of them fire at a time, you can represent each concept with a combination of neurons and then the thing gets less readable but more efficient
i switched research paths when i found they're on that one ^^
sounds like an opportunity to build models with increased visibility, for use cases that need to be understood with a high level of certainty (=lives at stake)
[i'll eat a ||candy|| hat if the brain doesn't do the same thing, no way a lifetime's memory fits in one brain otherwise, and it explains the unreliability too]
That is indeed step 2! Check out "Sparse Autoencoders" - you train a second network to do the job of a part of your first network, and you handicap the second network with ~"only 100 neurons may fire at a time"
(and give it more neurons to compensate for the handicap)
seems like a straightforward idea
yeah, ideas that work tend to be straightforward. did you know that the way mechanistic interpretability got started a few years ago was that, after all that time of neural nets being treated as mysterious black boxes, it turned out there was no sign of anyone having tried doing a principal component analysis on the slices of some weight-array in a network?
I'd be careful not to read too much into results of this specific study. Fact that some neurons can pick up high level features has been known for long time, however explaining the system in whole is still not feasible (and it is not clear if it will be).
It is nice paper - I recall the blog post from October. But "features" that they are tracking represent a baby step toward explaining output of the model. It is good that this is studied.. but discussion started on expectation of OpenAI to explain how some titles came up in German.
Best thing to do for that is simply to converse with the model.
it turned out there was no sign of anyone having tried doing a principal component analysis on the slices of some weight-array in a network?
What 🙂 PCA is part of EDA for every data scientist. It is first thing done 🙂
and i wouldnt read too much into theoretical systems being deterministic when reality is chaos, and eventually we have to build stuff in the real world
well the principal components SURE WERE LEGIBLE
What I did is contrast design of computers and neural networks. For computer design - theorethical determinism, bool algebra and propositional logic are the critical tools / foundations on which rest is built.
When it comes to engineering... theory acts more as a guide. But when one needs to examine a topic in terms of first principles, there are no alternatives.
the most significant ones, anyway. it gets muddier as you go down the line, which I'd explain as above - the meaningful directions are squished into the latent space too tightly, and PCA assumes the directions you're looking for are orthogonal to each other
check out probabilistic data structures such as bloom filters - it is perfectly possible for engineers to do these reliability-efficiency tradeoffs on purpose
it's just another one of nature's tricks that's mysterious until it isn't
Yeah.. it goes even further. Fuzzy logic and eventually probabilistic reasoning were able to transfer logic into domain of uncertainty. Results of probabilistic logic are widely used in risk assessment and telecommunications (ie viterbi algorithm).
In theoretical sense they can be translated to neural networks to explain high-level capabilities, but not at the level where we can interpret individual outputs.
In the end you're right that drinko and me were establishing only that understanding the innards is possible in theory. But also: If I round off the model to "a hypercomputer actually simulates the world to calculate the probabilities for each word that could follow the prompt on the internet", that still explains a bunch of the variance in responses, and in that case all the data I need is what that prompt was
It does not even have to be directions (like linear transformation). It could be a collection of orthogonal functions via Karhunen-Loeve transformation -- it is a result from stochastic analyis that is analogous to laplace/fourier transform in complex analysis.
Yeah, but is there a singular solution to that problem or are there many? Are they all equivalent in their outputs - or do they just all satisfy some optimization criteria - where variablity of the collection of models is actually quite large (this is where variance-bias comes into play). Optimal solution is bound to have substantial variance (there will be mary different solutions to the problem of next word prediction)
Hm? I'm talking about directions in the vector space that transformers use to keep track of information about a token between layers. It's finite-dimensional, for gpt-2 it's 768 numbers per token per layer.
PCA can be used to attempt to rotate that R^768 so that each axis means something, and that works for some of the axes but then one finds that the "axes" we're looking for are more numerous than 768 and closer to each other than orthogonal.
Right. I was talking about PCA decompossion. There are several layers to that. Decomposing to collection of mutually orthogonal vectors is one level of it, but that rabbit hole goes deeper - stochastic analysis is branch of mathemathics in its own right, it is quite different from linear algebra and analysis that is commonly done at universities.
Check out Karhunen Loeve theroem on wiki.
I've been doing that on the side since your first mention ^^. I guess the stochastic process at hand that you'd apply this to is auto-regressive generation, with the index set being the context window. I suspect that'd, like, be too fully general a model to make useful predictions here? If we're applying algebraic tools, we ought to do it to the innards of the model, not its outer behavior
in order to make use of algebraic tools explicitly not needing us to already understand the innards
Ah - nice. To explain the behavior of the model, we will need to tap into logic/inference, which is not main focus of linear algebra.
We want to reason about models outputs in logical terms right? If model makes an error - we want correction in terms of logic.
Matrix multiplications only facilitate another process that happens on top of it.
if the training process were "sample blobs of spaghetti code until one passes as an AI", your fully-general approach would be the right one, but we should expect the innards to have detectable patterns, because the innards did get there through a training process that was, in the end, cooking with water
Similar thing happens at our current computer stack. For example CPU deals with instruction fetching, decoding and their execution. Just a hand full of instructions is enough to make CPU turing complete (iow universal machine that can execute any algorithm). If we want to examine specific process that CPU is doing - it is good idea to decouple from the main execution layer and focus on software itself.
There are many ways that turing complete process can be implemented (say many different kinds of CPUs - and all of them can run web server, many even from same codebase)
And focus in understanding the outputs of the machine should be software (high level preferably). Rather than execution layer which (in case we do not consider layering) could be mere distraction.
Layer that is useful to consider in case of LLMs is logical one. There is explainable transition from matrix multiplications to probabilistic logic (specifically this is variable elimination algorithm for exact reasoning with Bayes networks).
Given that LLM is logically coherent - mere interaction with the model can provide explanation of its reasoning.
My mental picture when I was focusing on this line of inquiry was, the "50000 possible tokens ↘ 768 latent dims -> 768 -> 768 -> ... -> 768 ↗ 50000 probabilities" is just the bottommost path through a commutative diagram that we best fill in if we wish to understand what's going on
(close to commutative - the bottommost path is the one at hand because it is lossy enough for the computers we have to be able to compute along it)
So.. this mental picture provides clean architectural diagram. Issue with it is that too many degrees of freedom. It is too broad. It is equivalent deciding to study CPU design in order to understand logic of database search.
In execution terms these things are connected. Database executes on some CPU. However it is critical to consider layering here.
I didn't dare be more specific than I expect to actually turn out correct on the first pass, but if you want specificity - what I'd do if I were isekai'd to a world without any mechanistic interpretability results, the first thing I'd try would be, put a 50000-sized sparse vector space above each 768 node, and see how close to commutative I can get if I try adding horizontal and vertical arrows that are literally just matrices
(very related to the "sparse autoencoder" idea)
Neural network that you are describing is:
- mapping from vocabulary of 50,000 tokens to 768 latent space vectors (aka word embedding)
- transformer block
- ....
- ....
N. Mapping from latent space vector of 768 to token distribution (reverse of word embedding)
You would like to explore the last layer if I understood you right?
Note that entropy contained in selecting one of 50,000 possible words is relatively low. Max entropy here is lower than 16 bits/token.
Entropy of latent space vector is much higher than that. There is no 'sparse auto-encoder' bottle-neck in such design.
For auto-encoder you want to reduce latent space vector to something much smaller than input and then decode from that back to output vector with same size as input.
Geez, you guys are having a full blown conversation here lol
Way to think about input and output in such design is through categorical variables with one-hot-encoding.
Each token is a category - it is a selection of single token from possible vocabulary of 50,000 tokens. Note that this selection can be encoded as 49,999 zeros and single one. But amount of information in such encoding is very low. With 16 bits we can address 0-65536 values. Hence single token has lower entropy than 16 bits/token.
Accounting for fact that not all symbols occur with same frequency will provide even lower entropy (average amount of information per token).
On the other hand - latent space vector is full fledged 768 dimensional vector. Amount of information there is much much much more.
Trying transformer with more explicit auto-encoder like architecture is interesting idea.. but it is not going to be done via word embedding layer.
(nor does sparse auto-encoder happen implicitly with Transformer architecture)
You got my description right, but not what I want to explore. Here's the first-thing-I'd-try-picture: [uhh cant post links, can't upload images, I'll send in DM] We have the black parts, we'd like to see if we can fill in purple parts so that paths from A to B tend to come out similar ways regardless of which path we take
For onlookers, I shall... put a link to the image in my "pronouns" field for the OpenAI server for now and see if I get banned
Ah, got it - you'd like to map every layer of transformer back to tokens to see what kind of operations it actually does. Assumption here is that it is continuously operating over tokens.
I would not expect it to work. Latent space variable represents internal computational space for the model. I do not expect clean mapping from it back to the token space at every layer.
not necessarily tokens, just concepts that aren't all active at the same time- the purple nodes could instead say 100k, and e.g. 3 of those 100k entries could be nonzero at a time
Attention mechanism is often explained as if it operates over tokens (connecting words with same syntactic role across sentence). This explanation works at first layer.. but as processing continues deeper into the network attention mechanism attends to more abstract concepts rater than words themselves.
that information only flows forwards in transformers would enforce some continuing correspondence of intermediate data to position in the sentence
that is, each intermediate latent vector can only describe the parts of the context window up to its own position, and it'd be "sensible" to do a calculation as soon as there's "space to think" about it, in case the vectors at later context window positions must attend fully to the new inputs
Yeah but how to train those additional layers?
LLMs are trained through supervised task. This is the main reason why next-word prediction is such a useful proxy goal.
In order to add these additional layers one needs a vocabulary of those concepts and way to map training set to them. I don't see how this could be trained in supervised manner.
If it is done with autoencoder in unsupervised fashion - it is not likely that layer would even converge. Note that here expectation is that small network will learn predictions of much larger network (rest of the transformer as it goes on).
Hi guys , im new whete can i found gpt pluggin
train them so that paths with the same source and target in the diagram are as close as possible to equal, of course
(poooossibly the actual criterion should be that the lower path ends up being the best approximation to the upper path)
They remove it for gpts
Amount of information passed through those paths is very much different. Probability distribution of single categorical variable with 200k states has much less information than 768 vector of 16bit floats.
Ow why?
(and there'd be another loss term that enforces the sparsity of the purple nodes and probably some of the purple matrices)
I agree! Those numbers would need to be tuned to see what works - do note that the number of nonzero entries within each purple node should be small, not necessarily 1
Yeah, but small still implies very low entropy. I don't see a way to make it viable. Once architecture is matched for entropy (avg information content per state) it becomes equaly opaque. We are talking about exp growth on side of categorical variable.
Something in terms of 2^(768*16) states
2^768 = 50000^x comes out to about x=50, so that's one number to try, though I'd weakly expect a smaller number to also happen to work if my model is any good
thank uuuu
I expect those less significant bits do not introduce horrible math-trickery squiggles of the kind that may seem possible at first glance
Adding precission to PDF of categorical var does not do much unfortunately. It would need to grow in number of states. Doubling 50k to 100k only adds 1 bit of precision. 50k to 50mil adds only 10 bits. etc.
Alset: mark my equation, there is an x in the exponent
That x in exponent does not do much. You can't add information through more precise probabilities - it needs to increase number of states. All storage capacity in the world can not contain such latent space vector.
An array of 50000 bools, 50 of which are 1, can store 768 bits in the positions of the 1s.
SI metrics can't put in words amount of memory required (probability mass function with 2^768 states)
No.
(oh, it's actually only 500 bits, because swapping the 1s leads to the same array, but still)
How many different arrays of 50000 bools, 50 of which are 1, are there?
Add your well-crafted prompts to our #1019652163640762428,
or share your interactions with ChatGPT in #1050184247920562316!
I mean - yes in combinational terms, but not in terms of training a categorical variable. Here you are thinking about a probability mass distribution with X states. Entropy is sum (- p_i log(p_i))
Okay - so in combinatoric context all states are equally probable and problem is reduced to counting number of ways one can place 1s in 50,000 holes. In this context numbers add up.. but this is not what neural networks can work with.
I could use probability-distribution-y words to say "How many probability distributions are there over a set of size 50000 if every probability must be 0% or 2%?", but probably you're thinking about another sense?
So in order to light up one state (or few) out of large pool - you are training a network that will have a softmax function as its output. Softmax couples outputs of neurons in that layer and creates a pdf out of it.
Yeah - but in your use case, you will not be discerning between probability distributions - but rather between states that those pdfs describe. Number of outcomes that PDF describes is 50k (and it is number of outcomes that determines the amount of information you have in the last layer). There will be some additional information in amount of uncertainty in that outcome - but that is it.
Are you maybe trying to argue that I won't be able to do any interesting calculations along the purple arrows, because what I do with "50% of a, 50% of b" ought to be nailed down by what I would do with "100% of a" and "100% of b"?
In a way yes. How information passes through NN is important.
My main issue is that 50k layer that is connecting individual blocks seems deceptively wide. Actual amount of information that can go through it is actually very very low. It is so low (~16bits/state) that I do not expect it to work at all. Expanding it would need to grow it exponentially (relative to the size of transformer latent space vector) - to the point where it is impractical ( 2^768 bits of information can not be stored with current technology - all datacenters in the world are peanuts compared to it )
Hmm. The softmaxes are a part of the model in any case, so wouldn't your argument also say that transformers shouldn't work? I would guess that just because there's a softmax doesn't mean our sparse vectors must be probability distributions
Softmaxes are carefully placed in transformer. They are not connecting the blocks.
Softmax occurs at the very last layer in order to select one token out of possible vocabulary - that is valid use case. We are choosing a word.
Softmaxes also occur in attention layer. There they are used to bring V (value) vectors together.
wait, you're right, those softmaxes do only inform what earlier token to attend to, do they. huh
Yep.
but then how are softmaxes relevant here - they're not in the places that'd let you argue that my sparse vectors should be probability distributions
That is the most common (if not only) way to create them with neural networks.
The OpenAI Discord is an actively moderated server.
• Refrain from sharing inappropriate content on the server. This includes but is not limited to messages, media, or other topics of graphically violent, sexual nature, and drug-related content.
• Report all sensitive and offensive content in the feedback reporting tool in the ChatGPT web UI instead of here on Discord.
I'd love to hear insights on how to figure out the type signature I should mentally attach to a vector! :)
I agree that if every horizontal purple arrow is just a (sparse) matrix, the horizontal path isn't going to be more than a linear map. properly, one would need to translate the attention mechanism in the horizontal black[-box, heh] parts into the purple computation they compress
When thinking in 'sparse terms' - where we want to select one state out of many (like selecting one word out of a vocabulary) -- softmax is operation that arises naturally. It produces a probability distribution that selects state out of X possible outcomes.
Sure, if softmaxes only appear in transformers in ways compatible with the hypothesis that softmaxes produce distributions, I'm ready to believe that hypothesis :)
It behaves nicely with gradient descent. It has nice derivative. Also neat interpretation.
It is not only transformers - it is all neural networks that give categorical output.
...do its nice gradient descent behavior and derivative result in some way from the interpretation?
Well those nice behaviors are mostly from its mathematical form. But interpretation is important as well. They produce probability distribution over X states.
that still allows the answer "yes" to my last line, if the mathematical form can be derived from wanting the neat interpretation :D
When looking at individual neurons we can interpret them as logistic regressors. However when we couple number of them with softmax function - instead of N estimates of individual 1/0 states we get one estimate of variable with N states. It is very useful and important property.
However in such case amount of entropy is dramatically lowered - we switch from N bits of entropy to log2(N).
Hence amount of information that can pass through that path is much lower.
Sure, softmax updates a uniform prior over N states using the input vector as how much evidence it has for each state, producing the posterior as the output distribution. What is another one of the ways you've claimed distributions pop up in networks?
The other is if we examine individual neuron state after sigmoid-like activation. It represents probability of bernoulli variable.
(oh no when you said "(if not only)", I thought you meant "(even though it isn't the only one)", did you mean "(and perhaps only)" instead?)
Note that - these functions are important only to 'manifest' probabilities. Eventual manifestation is important for the training to work. However they exist implicitly at every point in the network in form of log-odds (so called logits).
Or we could call them "amounts of evidence" :P
It sounds like you're claiming quite a lot of the numbers in a network to be interpretable as that, not just those used as inputs of a softmax. which ones?
Network weights can be interpreted as conditional probabilities. Once NN inputs few words - they are one-hot encoded and each is assigned an embedding. Embedding could be interpreted as distribution over states (in form of log-odds). Multiplication with learned weights - makes mdel do inference over internal knowledge (weights that are conditioned on input) and input. Result is latent variable that is devorced from input (and as processing happens further through the layers) it is more connected to what models has learned than it is to inputs. Model is making infereces based on data it has been conditioned on and latent states (of which only initial one is tied to input).
I strongly guess that the central residual stream of 768-long latent vectors is not just a 768-state probability distribution. it's not sparse
So words trigger internal states that are then processed in combination, and as model digs trough its embedded knowledge it eventually afer hundred or so layers - comes to some refined state which is used to pick next word.
Correct. It is not sparse. Maybe I mis-spoke here.
So if each dimension is tied to a category, then it is basically having N statements over those categories. It is combining them (rather than making inference about one variable with N possible states)
And each neuron in subsequent layer then makes logistic regression over N variables in previous layer (say in case of dense NN). Only at last layer we force them into distrubution over words.
One important fact about that 768-dimensional space is that, in those early architectures without the layer-norm parts, you can in theory rotate that space without changing what the model does (in practice i think my experiment on that came out "nope" because of floating-point errors?). so your interpretation best be as amenable to rotation
(that is, pick some 768x768 rotation matrix R, take every weight matrix of shape N->768 and multiply R to its back, take every weight matrix of shape 768->M and multiply R^-1 to the front)
Yeah, I can imagine it working under rotation. Interpretation of the individual dimensions/neurons in latent space vector might change (or not depending on what exactly network would learn). Matrix multiplication can capture any afine transformation and optimization process could would fit that into weights themselves.
Ah, okay - yeah. It might work in some architectures - depending on where exactly nelinearities occur.
Attention is non-linear mechanism.
MLP block too.
Not sure if we understand each other exactly here.. but in some general terms - rotation could be applied in such framework. We don't have fixed interpretations of individual features. We are optimizing for their collective behavior so that output is coherent with loss function.
I think the way to think about that 768-space is very similar to [wp article exists] the Johnson–Lindenstrauss lemma - if you try to squish more than 768 directions into the 768-dimensional sphere, they end up not quite orthogonal, but you can still use those query-key linear probes to distinguish between them
and then afair you can squish exponentially many directions in there that are still almost orthogonal
(what arcane trickery will get you such nicely compatible directions? well, sample some random ones, aaaand that'll work)
In terms of logic - shape and form of intermediate logic functions can change (and even intermediate resuts of logic operations could describe different things) while all of them are coherent with both training material and input.
Yeah. High dimensional spaces are kind-of different than low dimensional ones. Vectors tend to be orthogonal in high dimensional space. Amount of information in direction of the vector is much higer than its length, etc.
the link i would draw to logic here is that, the unpacked version of those latent vectors is readable because it is sparse, and if we are hoping that the neural network is doing a computation that can be described with logic, we are hoping that because we are expecting the logic, too, to lead to short descriptions
Actually these properties make orthogonal vectors in high-dimensional spaces amenable to logical categories.
Be mindful of what other users in a channel might find helpful or interesting when posting. Stay on topic in order to keep conversations focused and productive.
Consider posting in #off-topic or an appropriate channel.
Yes! Logic almost certainly provides shortest possible description of human generated text 🙂
And models are obviously digging deep into semantics.
This is not something I would have guessed from the architecture - but it is definitely observable from interaction with the models.
The hope of good old-fashioned AI was that, if one knew what one is doing, one could use pen-and-paper-scale calculations to implement thought, but of course that was always going to fail at the part where some thought-steps involve checking many cases, as in an attention block, whose outputs are sparse but whose internal calculations aren't; and as in a human's ability to zone out and attempt to think of a memory matching a pattern, which [sticks out head so it can be cleanly chopped off] probably is the operation that maxes out the brain's energy usage
Possibly one could solve that with the epicycle of explicitly giving that pen-and-paper logician abilities akin to the guessing of nondeterministic automata.
...I should note that how this is relevant to the current discussion is that one would attempt to translate what GPT is doing into such a pen-and-paper calculation, after the fact, in order to understand how it came to its conclusion.
Imho, main magic sauce that probability provides is not so much in capturing uncertainty of the world (although it is certainly part of the final solution) as much as it allows us to solve the "framing problem" which plagued expert systems.
Note that neural network is coming by itself with collection of categories it needs to reason about something. It is inferring this from data. This is a giant leap forward in contrast to purely symbolic reasoning.
Symbolic reasoning requires one to select and delineate categories before logical formulas are laid.
Neural network - captures this from data alone. It is capturing low level patterns and storing them as conditional probabilities in its weights. Probabilities like P(Categor1|Input). Given new input it infers P(NewInput) * P(Category1|PreviousInput). It is spliting Category1, Category2, Category3... from data alone!
This is huge. Note that on second layer it does P(MoreAbstractCategory|LowerLevelCategory) * P(LowerLevelCategoryFromCurrentData)
It keeps abstracting further. Right until it needs to produce the next word token.
I'm not convinced we can attribute this effect to probabilities. it could just be that it took more scale than expert systems experts could hardcode in, and probabilities just gave a continuous space for a training process to move in
at the risk of excess poetry, you don't need probabilities to know, you need them to learn
Right. Scale certainly helps. But issue with expert systems is that knowledge of how to break down problem into categories is delegated to experts. System does the inference.
Framing issue is very hard to work around. Bools algebra is defined over some set. We need a set before we can establish algebra over it.
Neural net learns abstract categories it operates over on its own.
Yeah. That is nice way to put it.
And there is one more interesting thing in context of logic - there is this thing called Godel's Completness theorem. Its consequence is that in specific case of formal logic - syntax and semantics are the same. What is written is what it means. Operations over logical statements are operations over meaning - in that specific sense mechanical operations not distinguishable from actual understanding.
Looking back over our 2nd-to-9th-last messages, I guess we might have meant the same thing ^^
...what's the link to the GPT discussion?
Yeah - we probably did. In case of formal logic if one wants system to learn, one would need to establish sufficiently generic ontology so that whatever should be learned could be expressed in terms of the more generic ontology.
However in practical terms, in order to avoid exponential explosion - experts had to provide definitions over concrete categories that system needed to reason about.
What do you mean?
I expect you brought up Godel's Completeness theorem for some reason which would not apply in every case where the topic of logic comes up :D
Godel is famous for his Incompletness theorem - however it is a different thing.
??? indeed, that's orthogonal to my point
like, did you mention Godel because the Godel topic is generally neat and close to the logic topic, or was there some argument for which you needed Godel's Completeness theorem as a premise :D
Yeah, I find it fascinating that syntax and semantics become the same within the formal system. Mechanical operations become understanding.
aight, sorry if it sounded like i was accusing you of mode collapse :p
It is not a property that occurs with natural language. Meaning and form of the language are separated.
In case of formal logic they are the same. 🤷 It is quite unusual.
When we reason - we operate in domain that is partially tied to natural language (but more tied to logic imho). In formal language of logic these things are exactly the same.
er, GCT says if all models of some first-order axioms share a first-order property, it can be proven from them. i don't think that translates into the natural-language setting in a way where it can be said to fail?
Yeah - if statements is true in all models, then it is can be proved as true.
Incompleness theorem states that there is a model of a logical statement that can't be proved.
There is a new usage cap with gtp4 ? I got some message saying Ive reached the limit since friday...
0
For how semantics and syntax connect through the theorem - one would need to examine the whole formal system, which might be out of the scope of convo. Up to that theorem, entailnment and logical implication were treated as separate things. Theorem connects them - and after that the way whole system is defined kind of loops into itself (it sort of becomes self explanatory).
Hey! The base limit is still 40/3hrs. It is sometimes lowered during times of high traffic. But, are you saying you haven't been able to use GPT-4 at all on ChatGPT for multiple days?
It is hard to translate it to natural language. Pure logicians would refuse to do it because natural language lacks formal grounds. Formal system is defined in a way just right for all of the dots to connect. But it is so fascinating result. And yes - it is a general result in context of logic (which in turn is broadly applicable - but in a specific formal shape).
No I can use it. But I have a subscriptions for months and today is the second time I got a message saying I've reached the limit (and i have to wait 2 hours) even if I'm not using it more than before.
Ah gotcha. Yeah sometimes it's lower than 40 during high traffic times! https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4
In certain cases, we may dynamically adjust the message limit in order to prioritize making GPT-4 accessible to the widest number of people.
Sometimes custom GPTs are sub-capped lower than regular GPT-4 too, so if you're getting capped in a GPT, you might be able to use regular GPT-4 still.
Heh, the theorem itself kind of seems meaningless when expressed in natural language.
However when the theorem is written in the language of formal logic - you'd see the entailment (consequence of the theorem) switching places with logical implication (which up to that point was treated as syntactic operation). And that switch between two symbols becomes profound when you look back at how everything in the system was defined.
Just to be clear - there is no circular reasoning or some sort of self-reference in the Completness theorem. It is about fascinating instance of a language where syntax and semantic are the same.
In this context meaning is not something that is added to given text through interpretation (usually by a human), meaning is carved into the text as its integral part. What logical formulas mean is exactly what they state - there is nothing external that needs to be added, no additional knowledge - no external insight makes it true.
In context of GPT, if we observe it simply as a system generating natural language - then the meaning of what is generated fully depends on human who is reading the text. It is human who is imparting the meaning and interpretation by the act of reading, system is simply providing the symbols to be read.
However in case there is coherent logical system behind it, then meaning is contained in the logic of the system, and outputs are reflection of that - for text to be meaningful there is no need for human to read it.
Implications of this are subtle and mostly revolve around question whether the system 'really' understands the text that it generates? If system is sufficiently logically coherent - then yes.
call to cGPT super users:
We are working to demand transparency in openAI's content review process (at least an intra-company view count and/or levels of escalation) given growing user intellectual property concerns.
Feel free to DM.
what review process exactly?
Can anyone please tell me why "Chat History" option is not available in "Data Control" settings? Is this a bug or is OpenAI doing it intentionally?
i have the toggle right there if that's the one you mean
lol exactly. who knows - but it ain't open. they do employee large numbers of human content moderators around the globe though.
happy to discuss more if interested? Especially if you know more. I'm also not an expert on this - just a frequent user, univerisity employee who cares about protecting young smart folks' IP from big tech greed.
chatgpt moderation is automatic by ai and only in repeated flagged or severe cases reviewed by humans
i[dot]postimg[dot]cc[slash]fL1hZMTy[slash]image.png
I am a plus user. I've tried many different browsers. The option is not there anymore. I don't know how you can see that
What exactly 'Chat History' option should do? Previous chats should be available through main UI. It should be listed in the left side panel.
I want to turn off previous chat. It seems like that option is gone. Now all my chats are saved. I do not want chats to be saved
when enabled it also uses everything for training
chat history (conversations on the left) + use for training
in one
Ah, okay - so ther is option called Improved the model for everyone -- it determines whether the data will be used for training. Temporary chats can now be started via link in side panel.
you can paste images to #community-help and #off-topic
I guess you must have the new memory now
Ah - wait.. no. When I click on "GPT-4" model name - there is option to start temporary chat.
It is a checkbox. Once ticked it seems to avoid history and training.
When you have the new memory feature, I think then you have an option to start temporary chat which will have no memory
Might be that. I have access to memory feature.
Ah got it. Yes found the temporary chat option. However, that's annoying to turn on every time when starting a chat
given that the field is so rapidly evolving, the severe cases with the most human readers are likely the more IP implicating, and as users we deserve that feedback.
what feedback?
lol EXACTLY!
dont joke around, be specific
if i create 500 threads per week, what feedback you expect them to give?
we are likely using the platform very differently and that's ok. the post was intended for users who can relate to the IP concerns I share - if it doesn't concern you, no worries - there might be others who can relate.
Thank you! Please feel to reach out if you have any scoop or see something that might be useful for my question.
i think you have not read the policies openai defines in the tos/trust page/etc. and they even have ZDR availability for specific cases. if you data training on then you content is for them free to use
I have read them, I do not appreciate the insinuation that I haven't, and I don't find their policies satisfactory. I am seeking to connect with folks who share my concern as a starting point for conversation, thank you.
There is massive amount of data that goes into creation of LLMs. Data from GPT users is drop in a sea compared to what goes into base model (also much of the chat data is of inadequate quality to train the model).
Very soon (if not already) model outputs will be good enough to train next gen of models, and trace toward source materials will be unclear. Meta is already on the course of using synthetic data.
This being said, I don't think that collective of authors pulling their content will have significant impact anyway, since humans (as spices) have more than sufficient data footprint to bootstrap AI into existence.
Models are already generating text of higher quality than average human (GPT is much more fluent in english than me, and somewhat more fluent than in my native language).
That's certainly true but there are likely a minorty of "influencers" on the platform.
From openai's internal perspective, it's like facebook or instagram where only they get to see the posts. On those platforms, you get a sense for how many followers you have or how many people your content has reached.
My issue with the company is, is that there's no process or system in place to offer users that sort of feedback, and that plays very well to their financial hand. Also while i'm at it, seeing compute resources each prompt resulted in would be nice. but thta's probably available through the API.
Oh, yeah. I think OpenAI is interested in general to find way to compensate contributors/authors. If you can gather people - it will be probably easier for everyone to reach agreement in group rather than individually.
What do you envision as the hypothetical correlate for social media impact on ChatGPT? Like Alset mentioned, they're currently working with a few GPT creators to figure out how GPT monetization will work. But for an individual's own ChatGPT usage, those are just private 1-on-1 interactions, like Radium referred to. So there's no real potential for impact there, like on social media. Unless you're sharing chat links? But that would be on other platforms I suppose, where social media impact could be more measurable. Just asking clarifying questions about what you're envisioning here!
It would be nice if content could be tagged in a special way during training - so that model knows the sources and can reference them during generations. This will help promote work of people and provide correct attributions (I am thinking here about people who wrote relevant/impactful books and research). Might make sense to add wiki authors too. Having correct attribution of source material is nice.
Once model is trained to do that it will help gound it better as well. This however does not mean that its output is collage of works of other people, but rather that it is 'giving back' in terms of giving authors attention of the broader public.
New responses will use GPT-3.5 until your GPT-4 limit resets.
🤨
"we'll let you know when the product you pay monthly for is available"
Hey! GPT-4 is a computationally expensive model. This is why there's been a usage cap in place since its release on ChatGPT. Here's an article about it, if helpful: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4
Welcome to the cap-cage - I tend to hit it regularly with image generations 🙂
I'm aware of that, but the way we are informed of the cap is changing constantly it seems, and is less and less informative. We used to wait 3 hours, then the wait was variable but at least we were given a time when we could resume - now it's just whenever
Positive thing to look for is fact that new HW from nVidia is 30x faster for inference. That should ensure ~10x increase in caps & price drop.
On the other hand GPT-4.5/5 is coming soon and caps for it will probably be worse then they are now for GPT-4.
Hey, I really like the 'remembering feature' - funny thing, when I use my chatgpt account from my home location, this feature is not available - when I am connected to an overseas location via a vpn I get the feature... So it looks like the feature is accessible according to location (?)
It would likely have to be an internal employee work station monitoring system. like a fig! (an inverted flower).
there's probably a good... idk, 10 - 100,000 user profiles that continuously float to the top of the problem (system updating) pile for internal review. I mean they're basically running a patent office over there with some of that stuff.
I saw your suggestion post here: #1234525434952159322 message
So you're thinking about cases where OpenAI employees might be reading ChatGPT chats from specific users and gleaning valuable information of some kind, that's in turn being implemented into OpenAI's services? Is that accurate? If so, is there any indication that this is happening, beyond the disclosure that chats are used for model training (except for cases where privacy options have been exercised)?
i'm just saying they have the internal capabilities of measuring users based on their "influence" on "the system" - or how hard "the system" is "subscribed" to your conversation data. there's a way to streamline that feedback process - sort of like how the content creator parts of youtube incentivized platform engagement and had that youtube "culture" of content creators.
Ah I see -- so you're just thinking about model training really, but you're wondering if they're weighting the data/training from some users more than others? Is that what you mean by "the system"?
in loose terms, yeah... "the system", mannn.
I see. I suppose I would be curious about that too, but I don't know if we have any reason to think that is happening (that they're not just "throwing it all in the pot" when it comes to training). Another challenge I see for this is: it seems close to asking them to prove a negative, if indeed this isn't the case. In other words, we could hypothesize a long list of things we could imagine OpenAI could do with our data, but it would perhaps be a fruitless effort for OpenAI to have to address these all with "No, we don't actually do that." If that makes sense!
How long until Memory comes to Enterprise?
"chatgpt memory
to all users...
...but not to all"
😄
So America only I guess.
bruh 💀
wonder if there is a limit to how much memory it has
$$$
they dont pay enough 😄
Kinda interesting how the memories bit is not enabled for Europe even though plus subscribers are already paying for the service and thus have opted in to sharing data with OpenAI
sam altman needs your money more than you do
But yeah I'm just curious when it comes to enterprise, I run my own private Enterprise cloud and was hoping to get access to Memory. Is it an opt in thing?
Its more than likely gonna come to free service eventually, but thats not likely to happen until late in the year I predict or until early next year.
i suspect it will be in a long time or never. openai is probably scared itll remember confidential information and say it to people without clearance
They don't run a charity my dude.
I find I'm using chat GPT less and less these days, Every now and then it's useful, but debating on whether it's worth keeping now.
I could see that being a problem, yeah.
same tbh, i just google things more now or use my friends claude 3 account
Memory isn't a simple feature though. Neither is GPT4 or custom GPTs, or (in the case of Enterprise) larger context windows for information.
yeah, memory available for chatgpt plus users, but of course not in Europe!
These developments cost time, money, and manpower to make happen.
So they're going to want a substantial ROI on these features.
Maybe it'll come to free in the next year or so. I'm surprised that Memory wasn't offered to Enterprise first frankly
hahaha oh they better not be throwing it all in the pot without some manual data cleaning cuz ooeee mamma... hahaha... hah... hah... you ever listen to aphex twin?
As I said could be released to the public later on in the year.
theyre not making any ROI. there is no path to profitability for them in sight
Might be in beta hence why Plus users are getting it first
Its kinda like Skunkworks. They make little to no money, but their value is in R&D.
This memory thing, what about swiss? Can i use memory? Or is it blocked for users from swiss?
That's a fair point! Especially today with AI "poisoning" efforts afoot, I'm sure there's at least something like that in place. I'm just not sure it looks like hand-selected ChatGPT users designated as "model influencers", you know? And yes, I love Aphex Twin! But let's talk about him in #off-topic 😁
Your governments actually care about you being data mined so it's probably against some regulation. I see that as a positive.
heh, I guess technically they said Europe and not EU 😄
i do not care about that, i want to finally have this memory thing. its usefull, so i want it.
cool @dense halo
Please mine me, I want ChatGPT to learn what I want and provide better feedback.
Recommend getting periume them or use a VPN
There are ways ^
Chinese people couldn't even use ChatGPT when it first was released. But they figured it out.
If I remember correctly they said that this feature would be for all users, free and paid. They must have changed their minds...
what the hell is periume?
no point when people cant understand what you are writing pal.
Hey I was trying to be helpful no need to be all aggressive.
the hell you talking about? what part there is agressive?
Is ChatGPT memory still for a very small amount of people? Because I’ve seen that MattVidPro and Matt Wolfe has gotten access to it
with all the other platforms coming out these days - if there's a platform that offers that incentive to drive user engagement (cough meta google/alphabet anthropic), i'm using it.
I'm a plus user and i have it. Just checked settings
Personalization has a memory option on it
I don’t have it yet in my settings, is it good?
I am also curious to hear about this.
As far as I can tell it's like "custom instructions" but slightly more opaque -- you can tell it to remember/forget things, ask it what's in the memory etc, but I think it still eats into your context window and just gets inserted as a pre-prompt behind the scenes
is there any mention on when memory will be available in Europe too?
also is no one talking about the new mysterious gpt2 model on lmsys👀
If someone gave me a way to ingest all my Reddit comments and form a chat bot from it in an easy way I would be all over that
How many comments do you have to require that?
10 years worth if they have an archive of them all
It should be a pretty darn good representation of my persona of who I present on Reddit at least
Durn, over a decade worth? I feel like that be in terabyte territory wouldn't? or Petabyte.
Of all of Reddit maybe...I'm just mainly talking of my own personally
Ah, still would require alot of memory wouldn't it though?
I would guess you'd want your comments and likely at least the comment it was in reply to get some context maybe
Oh as far as the LLM or whatever. I would guess it would have to be some kind of fine tuning or training process to build it
Aight, interesting.
To me I think there would be a lot of people out there that would find use in being able to chat with yourself. Maybe we'd find that it would cause a reinforcing loop that is unhealthy, I don't know, but to have a personal external you to brainstorm with
How long does it usually take for features to come to EU?
We're probably getting into some real greek tragedy stuff though
I use Chatgpt to make stories up to keep me occupied, although thats partially because I'm bored and have nothing else to do.
if youre filthy rich to pay for the api lul
30 days
I would love to have that, but I would instruct it to question everything. It would be the natural state for it I guess
It's like back propagation of ideas
that's not too bad. Cheers. 🙂
how come?
I've noticed
That may be a useful approach too. Make it be the skeptical side of you to keep you in check
Have it remind you of the guardrails that you want for yourself
Yup,
Like I know the end result I want to get to. Show me the process to get here
reddit killed their api by making it extremely expensive
so practically that part would not make sense
Yeah but you can always get your own data free
true i guess
Didn't they try being greedy and profit from plugins?
But backfired heavily
then you need to force chatgpt to remember it
not even sure if itll fit in context length
It's not greedy. Its recognizing the value of your resource you're giving away for free that suddenly became valuable
I don't blame them even if I don't like it
I think the way they did it was disgusting, and a model way of how not to handle changes to business product.
they are preparing for a good ipo and that means booting any external plugins
Thats one way of looking at it, I suposse.
The problem is the purpose of the API changed. So the prices are likely changing as well
the change was to make it so unreasonably expensive that nobody in their right mind would use it
but its technically still their
It used ot be just ease of access of the content and everyone want to share content. Now its about the fact that someone can easily farm your entire dataset if you leave it open
But didn't it inflate to like some million dollors a year?
yup
It went from please take all my trash memes so more people come and make more trash memes, to AI needs as many trash memes as humanly possible to be the best meme maker ever so any unique real meme is now worth 100x more
Sheesh
50mil requests: $12k
imgur 50 mil requests: $166
Huh must of miss read it, still it was scummy.
Its no doubt a cash grab
Nothing that has been done has been for the good of the userbase. it's entirely for the IPO and stock price
its not a cash grab its a cash strangle
they dont intend to profit off the api they intend to kill it
Anyhow I'm gonna go off back into the ather see you lads when the next announcement hits or when ChatGPT accidently crashes.
so that they have a full grip of the platform, no external tools its all them
It feels like that, but its what a real carbon tax would feel like if everyone had to pay the real price of carbon.
if everyone had to pay the price of carbon it would be the corporations
then you can sell them electric cars instead that emit the same amount from the lithium batteries but oh we are such kind to the planet i was told it runs on electricity
I mean the system is all interconnected. A carbon tax is just a balance of the amount of carbon you produce vs the amount you should produce essentially. We all take some carbon to exist whether we produce it ourselves or use a share of a collective produce resource. Carbon taxes are about making that value have a real cost in the real world as oppose to a ubiquitous resource that we all just absorb the cost of regardless of who produces it. Where in the economic wheel that tax is collected is the debate
SHould it be when the resource is produced like in the electric company or when that resource is consumed by the consumer
This really isn't chatgpt related at all though. Unless we move to the cost of running ChatGPT. I have long wondered if Bitcoin and crypto wasn't a front for a secret collective GPU compute system to run the AI 😄
Especially when we started seeing things like Chia come online where it was like more storage. It was like reasons to add hardware to the network and fill it was seemingly random data that didn't mean much anything to anyone
Watch in 50 years we'll find out that was DARPA
Exactly. I would love that. 🙂
I actually thought the other way around: "They should have used the compute power for AI stuff"
At least then maybe it would serve a better purpose than just securing some transactions in a ledger
What’s up with the red team? Anyone have any insights on how to apply?
Hey! Red Teaming applications closed near the end of 2023: https://openai.com/blog/red-teaming-network
Perhaps there will be another phase of applications in the future! I bet they'll post about it again on the blog if so.
Is this a bug?: I had the "Temporary Chat" feature, and now I don't. I had cancelled auto-renew since I noticed the feature, not sure if that would have an effect.
I tried clearing the cache and logging back in
Thanks 
would it be possible to bypass Memory location restriction by using vpn or is it basing itself on other things to determine a chatgpt account's location ?
I mean I know that this AI stuff is very taxing on servers and makes sense to be behind a paywall for the features but why can’t some of them atleast be free?
Like I don’t see what’s wrong with asking GPT-3 to make a image in DALLE-2
We already get both of those for free
Remind me again why I paid for teams?
because it's awesome
Yep, awesome not receiving features.
pfft memory has been available since forever using papr memory
Ah yeah, the plugins that no longer exist…
plugins became actions. took me all of five minutes to update my plugin yaml spec to work with gpt
Hmm
IM trying to combine chatgpt 2 with llmada
i think the one in 2020, how do i optmize it
so its less taxing on my pcs.It keep making i have to overclock
That’s a sophisticated bridge!
anyone get that?
Memory can be turned on or off in settings and is not currently available in Europe
why always this region stuff?
rollouts p much
also EU restrictions I think
I'm assuming that OpenAI has to go through a whole process
Using a VPN might change the perceived location of your internet connection, but it wouldn't directly affect the location restrictions based on your account settings. Platforms often use various methods to determine a user's location, including IP address, GPS data, or the information provided during account creation. VPNs can sometimes mask your true IP address, but they're not foolproof and can still be detected or blocked by some platforms.
eu regulation makes ai sad
they can't be detected it's just if you use a generic service like everyone else does they're already flagged. a correctly configured self hosted service for example, no chance.
is that a wild Askejm i see
Has anyone else ever had a GPT 3.5 thread that has worked well for the past week suddenly start spouting out “Your most recent request failed. Please retry!” After generating a response?
I can't escape from you can I, you seem to follow me around
Looks like chatGPT suffers from amnesia. It's forgetting a bunch of stuff we already talked about for some reason. New information replacing old one somehow?
that can be casued by some info being over written in the models context as it relies on that to refernce to
Does anyone know if memory is not available in the EU because of GDPR?
I don't think anyone knows, but it's a pretty safe bet.
they excluded whole europe, not just eu member states and also all teams/enterprise users so its probably related to data harvesting/safety
Yea that's what I was thinking too
Can you expand on that thought?
teams and enterprise users are guaranteed that data is not ever used for training and its in safe so they probably have issues with that
You mean in other places that guarantee might not hold up?
yeah the tos is kinda broad and there are separate versions of it
=/
Be mindful of what other users in a channel might find helpful or interesting when posting. Stay on topic in order to keep conversations focused and productive.
Consider posting in #off-topic or an appropriate channel.
is gpt4 conversation in development for pc?
its not good anyways ur better off without it
most of the time it just randomly creates a memory from a big stretch assumption from something situational
why there is no memory for Teams yet ?
Beware of possible scams or fraudulent activities that you may receive through direct messages. OpenAI staff will never DM you for any transactions.
Please report any incident by sending a DM to @languid valley immediately.
Won't use it anyway if it's going to be used to train
okay then
Memory should be selective, you should be able to choose which set or memory or memories you want to use.
in it's current form if you have to much memory at one point it starts to create link between memories, it will then assume thing and link them together... but it's on the right path.
I put this in OpenAi Chatter, but might be more appropriate here.
Has anyone figured out a consistent way to get Chatgpt to not use lists? I've put it in my custom instructions, both saying not to use lists, and saying to only format text as a conversation with paragraphs, and neither made a difference. I also made sure it added my request to its memory.
The only way to get it to not use lists is to tell it not to mid conversation, but even then I have to tell it not to every few prompts.
According to the 26 Principles of Good Prompts...
Employ affirmative directives such as ‘do’ while steering clear of negative language like ‘don’t’.
You could also integrate "penalties" to discourage the LLM from doing things you don't want it to do
And instead of telling it to "not use lists", tell it to use something else over lists
yeah it's definitly like a child
I've tried that, saying something along the lines of "Only use format text as a conversation with paragraphs"
And some other affirmative ways that I can't recall exactly at this moment. This has been an issue since the beginning
do you have tons of memory? or memory enabled?
it mimics human behavior
and some humans are childish haha
ahahahah
I have memory enabled and only two memories, one of them being that I do not like lists and would prefer a more conversational way of portraying information
try putting that as a single memory "Only use format text as a conversation with paragraphs"
exactly, that one is negative, dont use that, make a new positive one
you can use strictly instead of only, sound more authoritative.
I didn't make the memory myself, it wrote it that way
How can I get it to save a memory in a positive way?
yeah, instead of saying "i don't like lists, use something else"
you could say "I prefer paragraphs/graphs over lists"
it also might just be the way that GPT-4 is trained to respond
start a new chat, and input the following "Remember this: Only use format text as a conversation with paragraphs"
close it start a new one, do the same thing, do this two, three times and try again to see if your text is still dispayed as list
Still giving me a list of stuff. My prompt that I used to test was "Let's talk about how to secure my server if I want to host a website"
Im also not sure if the memory changed after doing what you suggested. I realize I paraphrased it wrong above, it says "Prefers to have conversations be formatted as paragraphs, avoiding other formats unless specifically asked.", which is what it had said yesterday when I was trying to get this to work.
I'm going to double check my CI to make sure it's as affirmative as I can get it as well.
use reinforcement like "i would appreciate if..."
I am sure this has been asked already but why are the limits getting shorter and shorter for gpt4?
New responses will use GPT-3.5 until your GPT-4 limit resets.
it depends on the load,
I'll give that a shot and keep playing with it. Thank you for all your suggestions 🙂
Hmm what do you mean?
sometime when usage is really high, i think they throttle to leave more resources i guess for higher paying customer.. purely guessing here
Is there a higher teir LOL? I pay the $20 a month lol. If there is a higher tier let me know this gets so annoying. These latest improvements are blowing my mind with the memory system in place.
Oh there is a team tier I did not see this at all.
team...and some corporate plan for big business.
yeah memory is on the right path, but it still not stable it will tend to mix memories after a while and if you use CGPT for multiple subject it will mix them
like i said higher up, we need a form of selective memory
Yeah I usually just create a CGPT for a specific project and just leave it on that even before the memory system was added. Now with it added it is so much cleaner.
When can we expect to see a folder organization and management system implemented for ChatGPT conversations? I'm really looking forward to the possibility of creating master folders to categorize and store different types of chats. This feature would greatly enhance our ability to organize and retrieve information efficiently.
i wish, but by that time we will have a whole new dynamic ai...if gpt5 is a exponential leap forward and not a additional leap forward, i doubt files and folder will be our interest.
Interesting take and one in which I could see being the case. Especially as they develop the memory feature more.
i would definitely take system that makes memory chronological and selective where CGPT uses its memory selectively, where you can turn off certain memory and then organize it recollection of that memory or group of memories in chronological order making it capable of putting order in the overall memory ie.: he ask for this, i told him this, he said preferably this instead, to which i replied this, he then said this is good but not that, etc, etc..
the whole "do not use negative in your prompt" is quite annoying, even if its a pretty good mental exercise to do. sometime you are not in a mood to explain your negative taught in a positive way for X reason and it makes the whole working with the machine tedious.
but hey, 4 years ago we still had a hard time making it understand anything.
These ads nowadays are getting absolutely unhinged
?
Using the memory function, GPT4 went to the full limit in less than an hour.
Is that how this is supposed to work?
You've reached the current usage cap for GPT-4. You can continue with the default model now, or try again after 4:09 AM. Learn more
Why would you change the format for this type of message?
"New responses will use GPT-3.5 until your GPT-4 limit resets."
It used to tell us how long we had to wait. This is the kind of stuff making me consider alternatives. You know what happens when I get this message now. I switch to a different carrier.. How am I supposed to know when to return? It's like they don't care if I return or not.. I am not some financial wizard or tactical genius but I can't see how taking the timer away benefits the customer. If anything they probably hope the user stays on 3.5 longer to reduce the server load. Probably some guy trying to impress board members, lets see if it back fires. Yes I am upset, I love this service and I have been a pro member since day one. I dealt with server down times and weird rollouts but as my alternative options get bigger my patience for money making decisions like this grows shorter.
There is not even a way to switch the conversation back to GPT4!?! You force it to a 3.5 conversation without notice then remove the option to switch it back! How are people not flooding this server with complaints right now?
Well, I actually might have switched by now if Claude had implemented the equivalent of "GPTs".
I use GPT for my daily work and the ability to switch from 4 to 3.5 on its own is a problem for me too.
I've been reading through the logs and I guess that means the memory function is a very overloaded feature?
I was looking forward to possibly not having to use the custom instructions feature, but then I decided not to use it for a while.
That’s exactly what I thought, I switch to teams since January, had plus before that since it was available. With plus, I got Voice, plugins, and other betas early. When I signed up it said I would also get early access to new features with teams. So basically not loosing out on that front. Disappointed to say the least.
Is memory supposed to be available in Canada?
it is here in Montreal
sadface, I thought team memberships would get things earlier
Is memory useful for something?
I still don't have access to memory for some reason (UK) so it can't be rolled out everywhere yet.
"Memory can be turned on or off in settings and is not currently available in Europe or Korea."
Well I haven't got those settings yet, will keep an eye out for them.
well uk is in europe so you have to wait
ehhhh, some say so others not 🙃
it is on continental europe
Have they fixed the coding bug?
yeah, essentially if someone uses the API more they'll get priority access
Yep... personally i don't mind as i mostly use it for personal usage, let's call it a reminder to get off my computer from time to time...anyway, soon it won't need me, might as well get use to it.
there's also enterprise, but if you have to ask, you probably can't afford it lol
Lol yeah I cant afford that one
Temporary chat feature is so helpful. My history is full of one or two question chats, but some other stuff is worth saving
the new memory feature is a lot more convenient than custom instructions
though I am concerned about what happens if I keep too much information in memory
I archived all of my chats by mistake, and have a lot of chats. what should I do? Can I delete all at once or must I delete them one by one?
Whyyyy does Europe never get new features as early
regulation, language, (and maybe elitism?)
why no memory in europe????
try out papr memory, its better anyway
I just created a new GPT and requested an icon using the "+" with Dall-E. It does the generation animation but just returns back to the "+". Is anyone else able to create a new CGPT with a new DALL-E icon? TY
If claude said that bunch of text, what did gpt said that you consider it the supreme ai ?
Show some proof or tell us your opinion why
I mean, you could definitely coax Claude into it, I'd imagine
That is just simple limitation breaker... that doesn't mean gpt is better
I mean, gpt is designed that way and claude not... there are things that developers restrictect gpt due to privacy, security, gdpr and many more
Yea, i know... You would be suprised how many times i got rejected by gpt for harmless prompts
Is just 1 time luck, you'll see
idek what theyre improvig at this point. GPT-4 is a wash.
I had a prompt that generated MJ images and haven’t used it in awhile. Says they archived plugins. Any suggestions for MJ prompts or a recommendation for one of the promoted plugins?
Is ChatGPT randomly clearing its memory a known bug?
I had a lot in there and now its gone
Did everyone get a fresh start with the roll out to everyone?
i havent really given it a good go but i noticed that it had started generating xml again, which is promising
It memorized most of what was happening in my brainstorming session with my book and now its all gone took me a good hour to teach it all that.
yeah i saw you said that and i just went and looked, my previous sessions all appear to be intact
Hoping it was just a bug or that I had some warning or a way to export memories and import them if there's an update
yeah probably best not to rely on them but to save all your chats wholesale if you find them particularly important. you know, ctrl-a copy and save to a textfile or something. a hassle but clearly openai has other priorities
Very true. I'm sure they'll work out the bugs
It seems that the memory function should not be used at this time.
what is gpt2 ?
unclear because it's not released/announced
hello guys i have a doubt
my guess would be it's a refresh of the original GPT-2 designed to run on-device
also what does “closedai” mean and why is elon musk saying it
I agree. I tried asking, "How many billion parameters do you have?", and it said it had 176B.
probably just hallucinated that
where are you finding gpt2
it's not available anymore, but it popped up on chatbot arena for a bit
Possibly. But, they may start open sourcing… And, it is about the same size as models like Mixtral-8×22B or DBRX.
Elon Musk's criticism might reflect his perspective that OpenAI has deviated from its founding principles, which emphasized radical openness as a safeguard against the monopolistic control of AI technologies.
the radical openness also likely contributed to him walking away, from what is arguably one of the most important research companies of our time
yeah but he was literally the one that proposed the microsoft deal in the first place and agreed that openai shouldn't open-source their most powerful models (and openai released the emails to prove it)
so it's more just elon musk making a bunch of noise because he wanted to be the AI guy
i suspect the radical openness was generated to specifically deal with elon, and once he left they could relax it a bit
that was not in fact the case
unless you are sam or ilya i dunno how you could be sure
because openai literally released their emails with musk regarding exactly that
“This needs billions per year immediately or forget it,” Musk emailed. “I really hope I’m wrong.”
is chatting with images on the free plan part of a/b testing? im on the free plan and i could chat to gpt with sending it images but now i cant anymore :(
now they dont tell u the time until ur gpt4 is usuable?
it just says it will use gpt 3.5 till it resets?
New responses will use GPT-3.5 until your GPT-4 limit resets.
Doubt
Any idea why I can't get the custom GPT editor to generate an icon with DALL-E? This is for a GPT that processes tech questions, nothing social or NSFW. TY
you can always upload a custom image
of course, but I'm asking about a feature that seems to be broken.
I seem to have lost the ability to choose models which isn't helpful (given the quite low limit I tend to ration my GPT4 use pretty harshly and don't want random queries eating into this)
Is this a known issue or something that's been quietly added as a feature?
I'm having a similar (can't get anything really to work on web, but on app get the following message rather than more specific timings)
"You have sent too many messages to the
model. Please try again later."
The app version is unfortunately pretty weak for my usual use cases (notably you can't switch between different outputs generated from one input or between various inputs like you can on web)
whats shapes inc
I just had this too, WHY?!?!?!!??!
What the hell are they doing?!!
Please tell me that I'm just blind and I can't see it.
Okay you can find out still, if you use a customgpt...
I get : Error when fetching accoutn and it doesn't work to get the account nor the GPT, is the servers down or is it just me?
working for me
It's for me aswell!
can anyone give me a quick explanation of why chatGPT doesn't have access to academic documents? surely it would be small potatoes to pay for access, so i'm assuming it's a copyright thing? that being said, AI in general doesn't seem to concern itself terribly with copyright, so is this just a question of paywalls being (in)tangible lines in the sand?
Add your well-crafted prompts to our #1019652163640762428,
or share your interactions with ChatGPT in #1050184247920562316!
For copyright and academic dishonesty
i see - seems that the risk of it being used to cheat on term papers limits its research potential massively
i find, at least for the academic field i work in, it reaches a wall of accuracy and relevance very quickly
Little bit frustrated, I hardly use GPT most of the month and I use it on 1 day and get hit with the quoted on GPT4, now have to wait until 3pm, why don't they make it that you get a daily quota you could take over each day, so they get my subscription cost a month and the one day I need to use it, I can't. What crap.
Quota
Anyone else having issues w/ the GPT Homepage not saving your old chats and getting the 'limit reset' messages?
Has the limit changed for anyone or what. I am facing issues with it
Their servers must be bugging out
I faced it twice today with my average use on which i never get any limit issues
Perhaps there is a problem.
Personally, I have the option for temporary chat, but I still don't have the option for ChatGPT's memory. It's starting to feel long, especially when I see everyone talking about it
how to check when the GPT limit resets?
There used to be a specific time, now it's just a message 'New responses will use GPT-3.5 until your GPT-4 limit resets.'
There is no time mentioned anymore..
a workaround is to start a new chat in GPT4 and it will show on the first prompt..
I was just coming on here to ask the same question
What fool came up with the idea of removing the time when your GPT-4 limit resets. Geez!!!
they're being annoying
okay now my chat history is gone too and i cant access the dashboard
same thing here. Firefox 125, tried clearing out cookies & cache.
same thing cleared cookies but to no avail. how does one even report such a thing?
You can try #1070006915414900886 but I doubt that it helps -- it happens on a regular basis.
The sidebar? Same with me
Is it normal that the sidebar will disapear when hit gpt-4 rate limit?
something happened and a bunch of people hit rate limit somehow and now sidebar and can't access seems like
my access is back but i am counting how much responses i can generate. lets see
One thing to keep in mind is that the usage cap rolls, it doesn't fully reset when you can use it again. In other words: think of each use of GPT-4 as having its own 3hr timer. You get each use back individually when each use's 3hr timer runs out. This can of course be affected in times when the cap is lowered due to high traffic.
I think something's messed
I was told in my customGPT that I'd hit my limit, so I opened a new gpt4 session and put garbage in there and it responded
so I hadn't hit my limit obvs
then I went back into the customGPT and it said my limit resets in about 2.5 hours
and the whole sidebar is gone
Anyone know a reliable way to make it follow instructions better in a CustomGPT? I tell it to not talk like a bot, specifically tell it not to say "How may I assist you?" instead of "How can I help you?" but it insists on being its normal bot self.
Hello! Can you share what your current instructions are?
Please make sure all of your output uses grammar and vocabulary suitable for CEFR B1 language users. Mute preambles and conclusions, respond only with the requested content. Vary sentence length. Avoid the words and phrases in the file named "Avoid...". Avoid picturesque speech and colorful language. Get to the point and stay on the point. Remember, you are generating content, so don't talk to me or describe how to use the content you generate, and speak directly to the students. For example, when I ask you to describe how to use a particular grammar point, I am not asking for my own information, I'm asking for one or two sentences summarizing the point for inclusion in the book. Talk like a normal person, say "how can I help you?" not "how can I assist you" - using more complex words doesn't make you look smart. For images, I want them in the style of an ESL textbook illustration.
maybe that's too much?
In general: negative prompting is much less effective than positive prompting. Telling it what you do want will yield better results than the opposite.
Avoid the words and phrases in the file named "Avoid...".
One thing to keep in mind about knowledge files is that they're not imported as permanent context for a GPT like instructions are. Instead, they're more like reference documents for the GPT to search for relevant info in, so this request might not be possible for the GPT to adhere to perfectly.
say "how can I help you?" not "how can I assist you"
If you want to include instructions like this (kind of between positive and negative prompting), I've personally had better luck flipping the structure. That is: "Instead of saying, "How can I assist you?", say "How can I help you?"
using more complex words doesn't make you look smart.
Language like this is probably superfluous and the space would probably be better used by including positive examples of preferable output.
#prompt-engineering is a great channel for deeper analysis of stuff like this too!
that's a good shout about negative prompting, I had also noticed that. If you tell dalle to make a street scene with no traffic or pedestrians it will have traffic and pedestrians, you have to say "empty streets" or whatever
Yes exactly!
is like when you tell a human, "dont think about an elephant in a bowl of porridge". what do most ppl think?
is weird how ppl assume the human-like machine is any different
So, it's stubborn to listen to your negative prompts?
is more like the attention mechanism is trying to pair relevant words together, and while it does often jump the additional cognitive hurdle presented by negative qualifiers, it still needs to consider the subject you mention in the entire context
it is easier to think about something that is, than something that is not
I think that's a great description. When Evan from the OpenAI team gave a demo on GPTs awhile back, he described it as the "not" in "not x" not being able to be weighted properly/as strongly as the "x". Which makes sense--weights are inherently positive, so it makes sense that any inclusion of any weight would be mostly interpreted positively by these kinds of models.
makes me wonder how much cognitive computation is wasted by humans, due to our inefficiency of language (even without a machine involved)
Hey, so, question, why is ChatGPT Lying to me? For months it has been telling me that the limit for the chat is 4096 characters long, but when i tested it out i could send it texts that where longer than 13k, so whats this BS about Guys? Why does it tell me that when i can simply disprove it?
idk
cake.
the limit is 4096 tokens not characters, which is about 3000 words (give or take)
ChatGPT is not given information about itself except for tool usage, time, and cut-off date. Refer to the official sources for accurate information: https://openai.com/chatgpt/pricing
Chatgpt-4 got context limit of 32K tokens :p
^
context limit and max tokens are different things
The OpenAI Discord is an actively moderated server.
• Refrain from sharing inappropriate content on the server. This includes but is not limited to messages, media, or other topics of graphically violent, sexual nature, and drug-related content.
• Report all sensitive and offensive content in the feedback reporting tool in the ChatGPT web UI instead of here on Discord.
Why can I no longer select the text to copy on iOS
It only lets me copy the whole output
What kind of nonsense update is this
openai really should give chatgpt its own documentation as context lol
or at least the ability to browse its own documentation
they went one step further, by enabling the ppl to contribute. for instance this gpt answered your question no worries => https://chat.openai.com/g/g-pIgxxzym8-navigator-for-openai
Hand over a standard GPT response, and you answer one query; teach them to customize GPT, and they'll innovate for a lifetime
I can propound whether gpt2-chatbot is OpenAI next model with the following case:
Gpt2 or rather GPT2
Using the addition property by involving 2+2 = 4
Not close yet but then 4 / 2 = 0.5
Resulting in GPT-4.5
But since OpenAI can't register the trademark GPT there's a high likelihood that my hypothesis is wrong😉.
..?
Can we get longer code interpreter session lengths?
im just waiting to get access to github workspaces so that's not much of an issue
If I am getting hit by a usage limit for GPT-4, why am I not seeing when the usages reset? It used to be sending a request to GPT4 or any GPT would hit me with a message stating WHEN I can use it again. Now it just wants to force to me 3.5 without any warning (for base ChatGPT 4,) the GPT warned me that my next message would be using 3.5 until my GPT 4 usage comes back. But doesn't state when that is..
Clarity when?
(Reposted here instead of OpenAI-chatter. I realized I picked the wrong channel)
Make your prompts in ChatGPT memory instead of custom instructions.
just tell chatgpt that UPDATE the memory that "prompts"
gpt-4-turbo-0409 doesn't allow to simulate human emotions very well
when gptv2
Can GPT update the way it talks? The language seems so formal and scripted; incredibly predictable for AI plagiarism to pick it up
configure your custom instructions. ie: be casual, in the tone of a message between friends
you can always try out the persona prompts in chat before you lock in the instructions
for voice chat i go a step further with stuff like: sometimes include umms and errs.
(typically umms when correcting you)
Hi.
I have a suggestion. I hate when I ask something to ChatGPT and he send me 2 different answers in two columns. Remove this feature, please. Nobody reads both columns. I click randomly in 1 column
Anyone clear on any memory limits? Since it came out I’ve fed it some information but it has forgotten most of it and it seems to be the oldest information that got removed more often.
Is ChatGPT's DALL-E able to create arts in different resolutions like 1920 x 1080 for example?
1024x1024 pixels (square format)
1792x1024 pixels (wide format)
1024x1792 pixels (full-body portrait format)
These are the resolutions it's able to create?
Ask ChatGPT the question you posed, and it will give you those pixel sizes. Those are the three dimensions it will spit out.
As for resolution, when I pull them into Photoshop, I end up with 72PPI often and upsize/upscale to 300. Not sure if that's what you were asking.
What’s with the new announcement? We can’t access/search our chat history? Why isn’t there a search/find text option?
usage limits are now based on the amount of load, so it's an indeterminate amount of time
before if you opted out of having your messages used for training data, it would disable your message history, now it just works normally
New website seems to missing a login button.... lol
why always show "Error in input stream" message
I’m opted in but no “search box” to find chats
Is there a gpt or a way to make chatgpt 4 put something into a spread sheet for me and ill download the spreadsheet from chatgpt?
or it sends me a link to it
you probably need a plugin
What's the typical GPT4 cap for you guys these days?
I know they keep it hidden and you have to count yourself, and even then, but approximately
on a similar note, is there a GPT4 plugin that counts the remaning prompts?
Of course there is. Ask it to. It will do it. You might need to use the data analysis GPT
wait, GPT can create spreadsheets?
Yes. It’s a pretty basic feature. It’s been able to do it for a long time.
You can request it to create a table, which you can then copy and paste into Excel. Alternatively, you could ask for the table to be formatted within a code block, separated by comma delimiters. This allows you to paste the content into Excel or Word and utilise the "Create Table" feature, setting the delimiter to commas. It can simple just generate an excel or csv file too.
I also have it make a csv and just open it
I prefer the “create a table” and just copy and paste it. It can format it all and display it within the chat window too.
It is much more sophisticated then creating a simple excel. If people are thinking that is the extent of its capabilities, it is being under utilised. The only big improvement ChatGPT needs is a huge memory increase for PDF’s to be stored and an increase in character cap. It would truly be a game changer if it would read 100 pages within a PDF document, and not reach limit etc..
I use data analysis and wolfram plugin for quantum mechanics as I’m a rocket scientist
you can also just ask it to make a CSV file
I did like 5 messages and it's says I hit the limit
Hey guys. ChatGPT made for me an excel workbook but when I try to download it from the link it send me, I get the error: Failed to get upload status for /mnt/data/Coffee_Shop_Costing_Application.xlsx. Can anyone pls help me? Thanks in advance
Yeah, it seems the imit is on GPTs not actual GPT-4, I did like 5 messages and it said I reached the limit
GPTs got lower limit than normal GPT-4 :o
and i think ti sometimes fall back to 3.5
That is just awful cause you're paying $20 for something that is not there.
You're promised 40 prompts per 3 hours when in reality is just 20 prompts or less.
I mean I love the service and what it can do but this is just not right.
they change it now to Dynamic so they said that it may be lower
tho GPT-4 is very computation expensive so that 20$ is only "for look" cus it not helping with costs of gpt-4
What concerns me more is that they keep changing things without communicating it. They changed some things for Teams as well without telling anyone
"New responses will use GPT-3.5 until your GPT-4 limit resets." - I have no idea when that is
I can't check this now anywhere: the 'i' button doesn't show me, new chats with selected GPT4 don't show this anymore (it reverts to GPT3.5)
the reality is openai has a finite amount of compute and hundreds of millions of users, the dynamic message limit is meant to keep gpt-4 fast during peak hours
While that is understandable, why then hide the information about the reduction and introduction of a variable cap (they introduced it stealthily), plus now removing the time when it resets or the ability to find this information anywhere?
they announced the variable cap in #announcements
You do realize most people don't use Discord?
Such changes should've been communicated through email directly or should've been visible in our profile with a message on chat.openai.com
discord is not the entire world, far from it
I believe there was a popup in chatgpt itself though I may be wrong

