#Discuss allowing this Discord to be offiicially used in training AI
1 messages · Page 2 of 1
How about the daily learning scripting debates we have. Ai wouldn't help that?
debates...? with an AI.. what did we do before computers ?
I was around then. And we had this same debate about computers.
I was told computers were goingto take all the jobs and ruin our brains. Causing brainrot as a kid in the 80s
... i mean it's not 100% wrong tho... exagerated, sure, but not 100% wrong
Distributed training is a thing by the way, so we can in fact do it without a business, if its really for the community. Its even built into Pytorch
Training is one thing, running the model is another
Other way. Training is the cost. Running is cheap.
people could run it locally?
That's why it being open source is also far from acceptable, you either need money to pay for compute OR a computer that can run it
You can run GPT on your cell phone now, locally. Offline.
OpenAI is quite the company you know
Sure, but local models are limited in size. You won't get the chatgpt quality using a model that can fit in consumer GPU VRAM
AI will definitely result helpful for Add-On creators, assuming that’s the topic(?).
Specially if the model is trained to handle add-on inquiries, questions like;
What methods do I have to spawn a mob in my world?
Would result in answers like
- You can use
/summon - You can use structures
- You can use spawn rules
- You can use spawners
Etc etc.
Or prompts like “Provide me a working, simple block template”
Would result in the AI to give you an almost empty but working block template for you to use.
Overall, I visualize an “Add-On AI Chat Bot” like this to provide the methods to do certain tasks more than it simply doing the work for you. Yes, if the task is simple enough, like the template example above, it would do it for you, but mostly starters will need that.
We also have to remember that it would probably not have the limitation that ChatGPT does with links, so it could pretty much lead users to the right direction.
holy smokes, that’s a long message. Sorry!
Did you see they are unlocking our GPU memory for textures? Massive news
How about an add-ons wiki
omg a wiki for bedrock dev sounds amazing. someone should do that
Ikr
no idea why it doesn’t exists tbh
or an faq bot maybe?
Are the numbers posted somewhere?
xkcd 927
Sounds like it's been done before...
Yeah, @young steeple what do you have to say?
This bot was created by SmokeyStack for the purpose of making a FAQ bot for the Bedrock Add-Ons Discord Server.
To manage entries, please make a pull request on GitHub.
People starting off addons are impatient and just want to get stuff working. They can't be bothered to read wiki articles, and get chatGPT to make stuff for them... which doesn't perform in game.
Wow thank god the robot can't try to hallucinate a sentence
Also consider adding 'reddit' to the end of your google searches
damn this is wild.
Yup. Which is why I think that an AI would help. Though, I’m not in favor of it simply providing the solution and done. It would be great if it was well explained!
Yeah, they just removed the VRAM check, which was there to ensure that you never exceed the VRAM size during the game. It just replaced the pink texture glitch with a potential GPU crash, which can be worse depending on use case. They didn't increase your memory or anything, just removed the safeguards
If anything, I'd be for them bringing it back. Crashing sucks and is time consuming.
It's one of the backpains of my existence, but we are getting a little off topic.
Before it reduced the texture size. Ruining RTX, HD games. I assume this fixed that.
Hey there, missed this.
Let's split the work we are going to do into two parts:
- Clean and prepare data for use in AI training and more
- Create an AI assistant to support development
The first step is the actual offer we have given the community and team here. This is the most important step to be honest as it will provide the most benefits to us all. It will mean looking over the data and working out how to find the most useful question and answer pairs. It will also mean we create several types of data sets to address different approaches for training and different model requirements.
The second step is the reason we (myself and our team) have offered to do this. We want to try making a useful assistant. Initially this is helpful to our work, but we also recognise it will be helpful for this community - and this is where I believe your question is related (hopefully).
Any AI tool works best with some kind of feedback system - Reinforcement Learning with Human Feedback is the most well known with the release of ChatGPT. For us to be successful in creating this tool (and not have it disappear as useless), we will need to make sure it is useful - and this requires the same approach of designing a feedback system. So yep, we will offer thoughtfulness (which I hope is already being shown here) and diligence (again...) throughout this work.
This fix addresses the pink texture issue in Add-Ons by removing the atlas size limitations, potentially increasing memory usage and causing a crash if it runs out of memory? Correct me if I'm wrong.
Thank you for the thoughtful and detailed answer, I really appreciate it
So the 250mil pixel limit is going away
I think so. I am not certain but again this is a bit off topic and not related to the channels topic, we can talk about this in - #add-ons if you wish?
Maybe it wouldn't be such a problem if bedrock didn't put ridiculous padding between every texture in the atlas
Wait, there's padding in the atlas?
#add-ons message
I got this export from someone which doesn't exactly look like there's padding.
Relocate
as long as it can do JSON UI i don’t care
No entity in the universe can do this well...
This is a horrible idea and there's a good chance it's against the Discord ToS.
It's grey area
Also, even if the bot was open sourced, there's no way to know if it's scraping every message.
Could you elaborate on that?
The code in the public repo might not be what's actually running on the VPS.
As a bot developer with an OSS bot, there's a lot of trust involved with these kinds of things.
If the AI learned from me, there would be a problem XD
If BAO were to host, would that alleviate some worries?
Not really. You're talking to the guy who won't use 3rd party bots.
Like I can trust our staff on Minecraft Commands because I've known them for over a decade, but I don't have that kind of relationship with you guys. Sorry if that sounds rude.
Alright, well, I deleted all my posts, so I no longer have any skin in this. I really wish there were better tutorials for various aspects of add-on development and it doesn't sound like whatever this is going to turn into will be something I'm going to have access to without ultimately shelling out cash.
The dataset will be open sourced, if the offer were accepted. And only posts opted in would be available for the dataset.
Personally, I see this as a good thing. The way I personally see it, people may use code posted in this discord for money anyway, so an ai company having the data won't make a big difference in that.
Also, the ai wouldn't be perfect, but I do think it would help reduce the very basic questions asked here.
To be super clear we would all have the data. We are offering to clean and prep so we can all try making tools with it.
the data collection should be opt in, not opt out methinks
That was the plan from the start, yes
hello, anytbing you post on discord is public, if you don’t want it collected leave
Yeah, but it's still not ethical.
guys you hit 1k messages already
Keep in mind until poll is closed you can change your vote!
Also keep in mind no profit for us, anyway
I asked him exactly how much free usages we will get, even an assumption is fine to me. Did he replied to it ?
I see no pings and I cant read all of that thing.
tl;dr - It depends on how much it is used by the community.
Firstly, our offer is to clean opted in questions and answers to create a good dataset we can all train tools on.
If we manage to succeed in making an AI assistant, we could give as much free usage as we can get free credits for! Microsoft, AWS and Nvidia (and more) can provide free credits for projects that meet their criteria for interesting ideas with AI. They are all 6 figure free credit programs so it would last a while. How long? Depends on how much it is used.
Regardless, eventually the free credits would run out and then you would be paying for your own credits as you use the tools - This is true for ANY tool unless the provider is willing to pay out of their own pockets for AI generations - which would not last long.
But again - The offer is to clean opted in questions and answers to create a good open source dataset.
So, let me make sure I understand correctly. The tool will be free for a while after its release, but later on, users will need to pay to use it. This might not be too appealing to many people here. I really like and support the creating of a good open source dataset though!
There's still only one problem, as Ersatz stated, which I've also been pointing out from the start. However, I didn't get a satisfactory response from the mods. The question is, how can we trust you that you'll not use the data of the users who haven't opted in?
BAO will supply the data. We won't scrape it or look for ourselves. That is the point of this offer - To be transparent about how we are working with the community and datasets. TBH anyone could do this (clean and prep the data), including the BAO team themselves. We are just offering to do it as we are already heading in this direction.
I would love to supply the tools we or others make for free forever - Please convince Microsoft to give us free GPU hours and then all of our woes are solved. (I have a funny feeling they'd supply free GPU hours anyway, which is why I am confident we can make free hours work - but it won't be infinite 😦 )
To be frank, eventually general purpose AI will do a fine job coding for spaces like this - but do we want to wait for that?
If you read the pins and announcement, that is what will be implemented
By "BAO" you mean BAO Administration right not the people themselves who have opted in. It'll be scraped manually or automatically ?
You're a non profit or for profit company ? If you're a non-profit company, I get your point but If you're a for profit company, your message doesn't fits 😄
Later on, if any aspect of the product becomes paid, will the dataset contributors receive any discounts or free credits per N answers/questions contributed?
Could you also comment on this if possible: #1231324824387977286 message
Plenty of for profit companies offer things for free! Microsoft and Mojang being the biggest examples we can understand here ☺️
That would be up to the BAO team. We are offering to clean and prepare whatever this community would like to offer!
That question has been answered already in the FAQ pinned.
If it's free. YOU are the commodity.
Which in this case is fine. Thats what we're voting on being. lol
I personally want the bedrock dev addon information to be as accessable as it can be. the more we can get the information out there the better. The informaiton should be free.
The database will always be free. We actually won’t own or control any of that, the BAO team and this community would. We are simply offering to clean, prepare and return for open source access.
Any AI assistant tool that we make from the dataset will be free as long as we can get resources free for it. Free GPU hours from Microsoft etc
Ah okay, thank you for responding. I was wondering if you were willing to consider something like that for your tool. Since it could also encourage users to contribute to the dataset more.
That would def be the goal! TBH general purpose AI will eventually get good enough that it’ll be a capable assistant off the shelf.
Just curiously, what are people's reasons for voting no on the most recent poll?
How I see it, it's an optional thing which you need to opt-in in order to participate. I know I'm probably not going to opt-in, but if people would like to, why not right?
Also, I know you mentioned that the filtered dataset will be made open source. Will the AI itself also be made open source or not?
A lot of people are afraid there data will be scraped, and that it will go paid after the community contributed for free
I personally just don’t think it’s the right direction to take
It's most likely against the Discord ToS too. Discord has taken down projects using its user data (This is even happening right now).
Didnt we read the TOS and it clearly says without consent. With consent it's allowed. As long as.. the other condictions
It's post above
its quite easy to finetune the data obtained in the server on any open source model given you have GPU access 😄
You can rent gpus
or just run a finetune on openai, monsterapi, etc
yep. I've got access to 6. so over a while I could make a gpt 2.0 maybe.
thousands of LLMs free you can run locally Even on your phone
Question: Can I as a server owner datamine my own members?
There's still no way to check that they're not scraping everything. Even their AI bot that's trained on our data that they might want to add here could be scraping everything.
Its already being done.
So if it's not stoppable , I think its worth using for good when we can try to.
Maybe it's not perfect but it's better than inaction and just letting the bad guys steal and win while we don't do anything at all. Because of? Fear of the maybe... It's not how I live my life. Risk is part of life and moving forward.
I can see that concern for sure. But, If they're going to scrape the whole server without respecting opt-ins, they'd already be doing it anyway. But i can understand worrying about this making it easier for data to be scraped. I just personally don't see it like that
There is no way in a community this size it's not already being scraped. FYI
Yeah, fair enough. Which goes along with my point
It wasn't being scraped on that site Discord is taking down. 🤷♀️
Not even a way to detect and know it is or not. Amazon, Google, microsoft can't and haven't been able to stop it. Discord wont either.
The BAO team will be in charge of collecting data and making sure our terms, which pretty much involves the consent role, are followed.
This doesn't goes against Discord TOS, so not be mistaken. Going by the assumption that they are going to do so, doesn't really give us the right to say they are(not saying you are saying this, Ersatz).
Unfortunately, there is no way to spot people or bots who are scrapping messages. This is something far beyond inevitable.
I'm just saying this as a bot developer. I know there's a lot of questionable things you can do.
I researched this for a large corp client recently asking if they should pay amazon crazy money to protect their assets more... I'm a codeless moron monkey, "could" bypass all amazon protections in a few hours.
Even if we all moved to an encrypted app it wouldn't matter.
Where I live we have 1 party consent laws. So pretty much in public anyone can record anyone.
The Food Chain Denny's records you while eating.
Where I am, we have 2 party consent laws. So you need to mention you're recording.
Which is why I really like their this is a conversation and an opt-in. Freedom of choice is always better. I agree with your concerns. But a decade+ ago we changed from the enemey is outside the "gates/firewall" to you have to assume the enemy is inside the gates, on your LAN, internal already these days. So I already assume any information I share is taken as soon as I start typing... Notice NOT when I send it!
Drafts and typing is also monitored.
If this AI gonna be for free so why not
Bots can't scrape data if they don't have channel access though no? And bots can't bypass verification bots like Double Counter which is widely used in many large servers. Unless I've got it wrong..?
Yeah. Bots are not the only discord accounts that can scrap messages.
Normal user accounts as well?
Yeah. Anyone can do that, and without consent that goes agaisnt discord TOS, but if you don't say nobody would know, sadly.
Very common nowdays with technology, im actually surprised you didn't know 😅
I mean, I didn't know you could do that via a normal Discord user account as well 😅
so will the actual model trained be open sourced?
No, but the information used to train it will be. Please check the FAQ pinned
Right ok
If one would opt out of the data collection and send a code snippet and someone who is opt in re-sends that code will that code be used for training?
I mean, I don't see how the bot would be able to tell the difference between the 2 if it can't access your message
No. When we collect data, we will try our best to make sure we indeed can use said data. We would use the Discord search feature to make sure of this.
No bot will be collecting data. Please check the pinned FAQ.
I just assumed it would be automated from the BAO side, mb
Tbh if the data is manually hand picked by the staff team I just don't see how that would ever work out. You guys do realize how much data is actually required for an ai?
Questions would have to be either very spesific or the answers quite generic
We would only search from people with the role, which will significatively put numbers down, and the data will undergo a process of cleaning, which is what was announced.
hewwo
Due to how wide the conversation is on Ai I wanted to show how I use Ai as a tool now in real time and why we need better datasets like requested here. Why we need a Ai spear not a general Ai hammer.
So watch cyberaxe fail and succeed using the current state of the Bedrock Dev Assistant. Not scripted.
https://youtu.be/AhIvYtJonfs
Presented by CyberAxe of www.OutLandishlyCrafted.com
Tip and Support Welcome, it takes hordes of hours to provide free support.
HTTP://www.OutLandishlyCrafted.com
#minecraft, #bedrock, #mcpe, #indiegamedev, #blockbench, #animations, #portals, #prototype #live
Great video Axe! ❤️
You enjoy watching me fail. lol
xD
You still did a good job demonstrating how useful it can be and clarifying some misconceptions a lot of people seem to have lol
thank you. It's really hard to know if it's helpful or not. So I really really appreciate the feedback.
The dark side of generalist Ai's..
My next guide will be me 3 hours in on a project its making and it's looped and failed 50 times now. And it's just a scrolling list of me cussing at it.
I'm doing better with my Ai induced Anger issues. lol
It really is like a 10 year old some days and you just want to strangle it. But when it's does fail loop you have so little outlet. So I take to cussing at it.
Why does it make me angry? Because for 40 years(all of mankind history) I've/we've dreamed about this moment and it's so close, but it still so stupid at times and hurts.
I understand lol
Right now, I think I'd rather have 5 specialist Ai's for 5 tasks. Than 1 generalist Ai for 100 tasks.
Not exactly a useful/relevant feedback but I feel the dramatic Disney-like voice overs in the intros can be over the top or mildly annoying for first-time listeners. But usually people are pretty quick to familiarise in a minute anyway xD
Also I figure it's like a speech style you've chosen for your content
Yes it's a 80/20, I get alot of comments just about it, more than even my content. Which is interesting. One recently was it "brighten my day", so many guides and informational videos are so mono and robotic.
20% however, really really dislike it. lol. I figure they'll like to Docs or other guides better then. Can't please everyone, and when you do you cheapen it for someone.
I watch so many guides on so many Tech, dev, IT subjects and most are just so dry and dead. I guess, if 1 guide in the 10 people watch this week was silly and an upper instead of a mono or downer. I effected people for the better. Sometimes we need a shock to the system. Granted some people don't like that. We are doing Minecraft Devs should everything we do have more silly in it and less mono.
Exactly lol
Educational content is so much better, and easier to learn when it's more expressive than monotonous.
You can't please everyone, but glad to hear most of your audience actually love it haha
If anything, you could try balancing it here and there.
I personally really enjoyed listening to you. It kinda felt like I was having a conversation with the speaker than someone just going on and on about all the technical stuff xD
I agree. If anything comes of this offer, I think it has accelerated the conversation in the direction of "How should we do it then?". Which is a great step for us all!
We do hope the idea of open sourcing the datasets continues as a given for our kinds of communities - This means we can all innovate together.
This was technically a resounding victory for "Yes", but there are too many people pissed off, and concerns over quality and usefulness have been well stated.
Realistically, I think we have to at least wait for our userbase to get more comfortable with AI in general. Many, many people are especially concerned with the implications of AI for art and are extending that frustration to help bots.
Whether I (or anyone on the staff) agrees with these takes is not relevant. We should revisit this within the next year for sure, as there are many instances on this server where the same questions are asked and ideas are posed, but… yeah, not the right time.
yay
Thank you all for your thoughts and contributions 🙂
We know this is a new space for us all and look forward to working with everyone navigating through the future of Minecraft, Add-Ons, and development etc. with AI that is safe, ethical and empowering for players and creators alike ❤️
Thank you @topaz wharf @quasi badge @sharp cliff and team for letting us offer to the community and for the space to discuss this in a safe and open environment!
Of course! Thank you Fetxu for the offer, I hope that all of the messages sent here by our community can work to make the project better and stronger. Like Ciosciaa said, we should revisit this topic the next year, as it can be beneficial for all of us.
Sorry it worked out like it did.
I think this was a good decision overall, and it was handled very maturely and responsibly. Can’t wait to see the improvements/ideas y’all come up with when this topic is revisited.
Yeah I'm not against this idea as a whole as it could greatly help with new developers and solving problems quicker but the ai space is very grey right now and for me there wasn't enough information provided and a few things I will state:
- A clearer indication on the pricing in the future would be good so we know what to expect since it's very up in the air right now which isn't really ideal (I am not against charging for this as you would be hosting the model)
- What your intentions are and what other things you would do with this data would be good to know so we get a clear understanding of what your company's motives are since I had never heard of it before
- perhaps a local version of this model could be provided (doesn't have to be full) as I feel we would deserve this since we would be a major contributing factor and I am aware the dataset for this would be public but depending on it's size someone having the required power and resources especially in this community to train it would be unlikely.
I hope you don't take this harshly as I am not against this there's just a few grey area things that don't touch me right but I hope if this is requested again things are stated more clearly.
Have a great day
Hey there! I agree, the space is super grey and one we will all have to navigate as we go forward - We certainly will not be the first or last group to try building AI tools for this community and other Minecraft creator communities.
Ultimately the conversation will boil down to a few important challenges: data sets, inference costs, and the rapidly evolving nature of AI tech and models.
- The cost is really in the inferencing - generating the answers. For LLM's these are relatively small (compared to art etc). Any cost would be structured to cover the inferencing costs. What will this be? It changes everyday atm - With AI foundation models generating in varying degrees competency and inferencing times.
- The only thing we would use the data set is for creating coding assistants - There is practically no other use for it. Minecraft development does not translate well to any other space outside of Minecraft. This data set would be made available to us all btw.
- I want to super clear here - We are not building a ground up LLM AI model. That would be an insane amount of work. What we are hoping to achieve is fine tune a modern foundation LLM (LLAMA, MIXTRAL, GPT, CLAUDE etc..) with the cleaned and prepared data. This is why anyone here could do it - but we can only do it with a great dataset. This is what we are wanting to do.
This is obviously not happening here with this space - But we are continuing to prepare a dataset with what we can. We are also interested in sharing OS the data set we do build. Hopefully this and the Minecraft community can come together like we have throughout Minecraft's time on earth to build the kinds of data sets that will help us all create and play using AI to ease the burden of development as much as possible.
Hope that helps!
We found this conversation really valuable in hearing how our community feels about AI and what it might mean for our future together as a community. 🙂
I'm curious, for those who are interested in assisting in giving you data, is there any place we can send copies of addons that we have made?
Thanks for the clarification on the things I stated and it definitely cleared some things up for me, I hope this can be redone in the future with things more clearly stated (partly my bad for not asking these things) cheers :)
Honestly the AI will learn nothing from this server 😂
No lol. This server actually has a lot of valuable information and resources, some of which can't be found in the wikis. A lot of the times you don't even have to ask since you can use the search to find previous answers for the same. But how effectively an Ai will be able to use the data here is something the Ai experts can answer more accurately
30 seconds in, it's gonna turn into Ultron
You think anyone is mentally stable enough to work after every damn api/format update? Now imagine going through that but in 30 seconds
HAHAHAHAHAHAHAHA HAHAHAHAHA-
AI 0 - humanity 1
This thing it will hate.
It will hate Mojang.
Have you used Ai as a dev tool? I'm not sure what you're talking about here.
It'll create passive aggressive comments instead.
hm it sure will know how to fix errors 😂
why is everyone toxic with CraftBench? It's a good, cool project that gets better over time.
Many people are very conscious about where their data is, and/or are nervous about ai. I'm neither of those, but I can see where those people are coming from
not like theirs personal information being shared
Microsoft: Let us introduce you to Copilot Recall.
its all done locally but that wont stop us from taking a small bite of telemetry
huh
BTW, Discord is updating their Developer ToS on July 8th and they banned training AI using messages unless you get permission.
https://discord.com/developers/docs/policies-and-agreements/developer-policy#handle-data-with-care
lol mean while adobe adds machine learn scanning of customer data.

😂
You see why people like me dont like companies doing this?
Interesting to see this chat getting revived every certain amount of time

so what’s the takeaway from this now?
The poll is majority yes, can we update our roles for it?
Check pinned messages, please