#Discuss allowing this Discord to be offiicially used in training AI

1 messages · Page 2 of 1

gusty charm
#

I am not a pro I am hardly a Minecraft developer, why are you using that as an argument

#

Have you tried learning a different language with ChatGPT? It's a horrible way to

plucky swift
#

How about the daily learning scripting debates we have. Ai wouldn't help that?

gusty charm
#

debates...? with an AI.. what did we do before computers ?

plucky swift
#

I was told computers were goingto take all the jobs and ruin our brains. Causing brainrot as a kid in the 80s

scenic bluff
gusty charm
#

Distributed training is a thing by the way, so we can in fact do it without a business, if its really for the community. Its even built into Pytorch

limpid tiger
plucky swift
#

Other way. Training is the cost. Running is cheap.

unique echo
#

people could run it locally?

gusty charm
#

That's why it being open source is also far from acceptable, you either need money to pay for compute OR a computer that can run it

plucky swift
#

You can run GPT on your cell phone now, locally. Offline.

gusty charm
#

OpenAI is quite the company you know

limpid tiger
sharp cliff
#

AI will definitely result helpful for Add-On creators, assuming that’s the topic(?).

Specially if the model is trained to handle add-on inquiries, questions like;
What methods do I have to spawn a mob in my world?

Would result in answers like

  • You can use /summon
  • You can use structures
  • You can use spawn rules
  • You can use spawners
    Etc etc.

Or prompts like “Provide me a working, simple block template”
Would result in the AI to give you an almost empty but working block template for you to use.

Overall, I visualize an “Add-On AI Chat Bot” like this to provide the methods to do certain tasks more than it simply doing the work for you. Yes, if the task is simple enough, like the template example above, it would do it for you, but mostly starters will need that.

We also have to remember that it would probably not have the limitation that ChatGPT does with links, so it could pretty much lead users to the right direction.

#

holy smokes, that’s a long message. Sorry!

plucky swift
gusty charm
#

How about an add-ons wiki

acoustic terrace
sharp cliff
#

Ikr

sharp cliff
#

no idea why it doesn’t exists tbh

acoustic terrace
#

or an faq bot maybe?

plucky swift
unique echo
jolly maple
sharp cliff
#

Yeah, @young steeple what do you have to say?

young steepleBOT
#
Info

This bot was created by SmokeyStack for the purpose of making a FAQ bot for the Bedrock Add-Ons Discord Server.

Managing Entries

To manage entries, please make a pull request on GitHub.

Source Code
ocean locust
#

People starting off addons are impatient and just want to get stuff working. They can't be bothered to read wiki articles, and get chatGPT to make stuff for them... which doesn't perform in game.

gusty charm
#

Wow thank god the robot can't try to hallucinate a sentence

#

Also consider adding 'reddit' to the end of your google searches

latent sable
#

damn this is wild.

sharp cliff
limpid tiger
jolly maple
plucky swift
hollow vector
#

Hey there, missed this.

Let's split the work we are going to do into two parts:

  1. Clean and prepare data for use in AI training and more
  2. Create an AI assistant to support development

The first step is the actual offer we have given the community and team here. This is the most important step to be honest as it will provide the most benefits to us all. It will mean looking over the data and working out how to find the most useful question and answer pairs. It will also mean we create several types of data sets to address different approaches for training and different model requirements.

The second step is the reason we (myself and our team) have offered to do this. We want to try making a useful assistant. Initially this is helpful to our work, but we also recognise it will be helpful for this community - and this is where I believe your question is related (hopefully).

Any AI tool works best with some kind of feedback system - Reinforcement Learning with Human Feedback is the most well known with the release of ChatGPT. For us to be successful in creating this tool (and not have it disappear as useless), we will need to make sure it is useful - and this requires the same approach of designing a feedback system. So yep, we will offer thoughtfulness (which I hope is already being shown here) and diligence (again...) throughout this work.

jolly maple
rocky meteor
plucky swift
jolly maple
rocky meteor
#

Maybe it wouldn't be such a problem if bedrock didn't put ridiculous padding between every texture in the atlas

true pumice
#

Wait, there's padding in the atlas?

plucky swift
#

#add-ons message

true pumice
#

I got this export from someone which doesn't exactly look like there's padding.

plucky swift
#

Relocate

shadow edge
#

as long as it can do JSON UI i don’t care

hollow vector
#

No entity in the universe can do this well...

orchid eagle
#

This is a horrible idea and there's a good chance it's against the Discord ToS.

quasi badge
#

It's grey area

orchid eagle
#

Also, even if the bot was open sourced, there's no way to know if it's scraping every message.

orchid eagle
#

The code in the public repo might not be what's actually running on the VPS.

#

As a bot developer with an OSS bot, there's a lot of trust involved with these kinds of things.

dim swan
#

If the AI learned from me, there would be a problem XD

quasi badge
orchid eagle
#

Not really. You're talking to the guy who won't use 3rd party bots.

#

Like I can trust our staff on Minecraft Commands because I've known them for over a decade, but I don't have that kind of relationship with you guys. Sorry if that sounds rude.

open nebula
#

Alright, well, I deleted all my posts, so I no longer have any skin in this. I really wish there were better tutorials for various aspects of add-on development and it doesn't sound like whatever this is going to turn into will be something I'm going to have access to without ultimately shelling out cash.

hollow vector
teal fractal
#

Personally, I see this as a good thing. The way I personally see it, people may use code posted in this discord for money anyway, so an ai company having the data won't make a big difference in that.

Also, the ai wouldn't be perfect, but I do think it would help reduce the very basic questions asked here.

hollow vector
hard sigil
#

the data collection should be opt in, not opt out methinks

ocean locust
#

That was the plan from the start, yes

shadow edge
#

hello, anytbing you post on discord is public, if you don’t want it collected leave

orchid eagle
#

Yeah, but it's still not ethical.

graceful idol
#

guys you hit 1k messages already

#

Keep in mind until poll is closed you can change your vote!

#

Also keep in mind no profit for us, anyway

lusty kiln
#

I asked him exactly how much free usages we will get, even an assumption is fine to me. Did he replied to it ?

#

I see no pings and I cant read all of that thing.

hollow vector
# lusty kiln I asked him exactly how much free usages we will get, even an assumption is fine...

tl;dr - It depends on how much it is used by the community.

Firstly, our offer is to clean opted in questions and answers to create a good dataset we can all train tools on.

If we manage to succeed in making an AI assistant, we could give as much free usage as we can get free credits for! Microsoft, AWS and Nvidia (and more) can provide free credits for projects that meet their criteria for interesting ideas with AI. They are all 6 figure free credit programs so it would last a while. How long? Depends on how much it is used.

Regardless, eventually the free credits would run out and then you would be paying for your own credits as you use the tools - This is true for ANY tool unless the provider is willing to pay out of their own pockets for AI generations - which would not last long.

#

But again - The offer is to clean opted in questions and answers to create a good open source dataset.

lusty kiln
#

There's still only one problem, as Ersatz stated, which I've also been pointing out from the start. However, I didn't get a satisfactory response from the mods. The question is, how can we trust you that you'll not use the data of the users who haven't opted in?

hollow vector
hollow vector
# lusty kiln So, let me make sure I understand correctly. The tool will be free for a while a...

I would love to supply the tools we or others make for free forever - Please convince Microsoft to give us free GPU hours and then all of our woes are solved. (I have a funny feeling they'd supply free GPU hours anyway, which is why I am confident we can make free hours work - but it won't be infinite 😦 )

To be frank, eventually general purpose AI will do a fine job coding for spaces like this - but do we want to wait for that?

quasi badge
lusty kiln
lusty kiln
dusty imp
hollow vector
hollow vector
sharp cliff
plucky swift
#

Which in this case is fine. Thats what we're voting on being. lol

#

I personally want the bedrock dev addon information to be as accessable as it can be. the more we can get the information out there the better. The informaiton should be free.

hollow vector
dusty imp
hollow vector
#

That would def be the goal! TBH general purpose AI will eventually get good enough that it’ll be a capable assistant off the shelf.

brittle rover
#

Just curiously, what are people's reasons for voting no on the most recent poll?
How I see it, it's an optional thing which you need to opt-in in order to participate. I know I'm probably not going to opt-in, but if people would like to, why not right?

#

Also, I know you mentioned that the filtered dataset will be made open source. Will the AI itself also be made open source or not?

weary pond
#

A lot of people are afraid there data will be scraped, and that it will go paid after the community contributed for free

#

I personally just don’t think it’s the right direction to take

orchid eagle
#

It's most likely against the Discord ToS too. Discord has taken down projects using its user data (This is even happening right now).

plucky swift
#

Didnt we read the TOS and it clearly says without consent. With consent it's allowed. As long as.. the other condictions

#

It's post above

terse totem
#

its quite easy to finetune the data obtained in the server on any open source model given you have GPU access 😄

plucky swift
#

You can rent gpus

terse totem
#

or just run a finetune on openai, monsterapi, etc

plucky swift
#

yep. I've got access to 6. so over a while I could make a gpt 2.0 maybe.

plucky swift
#

Question: Can I as a server owner datamine my own members?

orchid eagle
plucky swift
#

Its already being done.

orchid eagle
#

So?

#

Issue like this is why we need strict data collection laws.

plucky swift
#

So if it's not stoppable , I think its worth using for good when we can try to.

Maybe it's not perfect but it's better than inaction and just letting the bad guys steal and win while we don't do anything at all. Because of? Fear of the maybe... It's not how I live my life. Risk is part of life and moving forward.

teal fractal
#

I can see that concern for sure. But, If they're going to scrape the whole server without respecting opt-ins, they'd already be doing it anyway. But i can understand worrying about this making it easier for data to be scraped. I just personally don't see it like that

plucky swift
teal fractal
#

Yeah, fair enough. Which goes along with my point

orchid eagle
#

It wasn't being scraped on that site Discord is taking down. 🤷‍♀️

plucky swift
#

Not even a way to detect and know it is or not. Amazon, Google, microsoft can't and haven't been able to stop it. Discord wont either.

sharp cliff
#

The BAO team will be in charge of collecting data and making sure our terms, which pretty much involves the consent role, are followed.

This doesn't goes against Discord TOS, so not be mistaken. Going by the assumption that they are going to do so, doesn't really give us the right to say they are(not saying you are saying this, Ersatz).

Unfortunately, there is no way to spot people or bots who are scrapping messages. This is something far beyond inevitable.

orchid eagle
#

I'm just saying this as a bot developer. I know there's a lot of questionable things you can do.

plucky swift
#

I researched this for a large corp client recently asking if they should pay amazon crazy money to protect their assets more... I'm a codeless moron monkey, "could" bypass all amazon protections in a few hours.

#

Even if we all moved to an encrypted app it wouldn't matter.

#

Where I live we have 1 party consent laws. So pretty much in public anyone can record anyone.
The Food Chain Denny's records you while eating.

orchid eagle
#

Where I am, we have 2 party consent laws. So you need to mention you're recording.

plucky swift
#

Which is why I really like their this is a conversation and an opt-in. Freedom of choice is always better. I agree with your concerns. But a decade+ ago we changed from the enemey is outside the "gates/firewall" to you have to assume the enemy is inside the gates, on your LAN, internal already these days. So I already assume any information I share is taken as soon as I start typing... Notice NOT when I send it!
Drafts and typing is also monitored.

inner sapphire
#

If this AI gonna be for free so why not

dusty imp
sharp cliff
#

Yeah. Bots are not the only discord accounts that can scrap messages.

dusty imp
#

Normal user accounts as well?

sharp cliff
#

Yeah. Anyone can do that, and without consent that goes agaisnt discord TOS, but if you don't say nobody would know, sadly.

dusty imp
#

oh

#

I did not know that lol
That's pretty concerning..

sharp cliff
#

Very common nowdays with technology, im actually surprised you didn't know 😅

dusty imp
#

I mean, I didn't know you could do that via a normal Discord user account as well 😅

graceful light
#

so will the actual model trained be open sourced?

sharp cliff
#

No, but the information used to train it will be. Please check the FAQ pinned

graceful light
#

Right ok

lethal drum
#

If one would opt out of the data collection and send a code snippet and someone who is opt in re-sends that code will that code be used for training?

teal fractal
#

I mean, I don't see how the bot would be able to tell the difference between the 2 if it can't access your message

sharp cliff
sharp cliff
teal fractal
#

I just assumed it would be automated from the BAO side, mb

lethal drum
#

Tbh if the data is manually hand picked by the staff team I just don't see how that would ever work out. You guys do realize how much data is actually required for an ai?

#

Questions would have to be either very spesific or the answers quite generic

sharp cliff
#

We would only search from people with the role, which will significatively put numbers down, and the data will undergo a process of cleaning, which is what was announced.

tawdry coyote
#

hewwo

plucky swift
#

Due to how wide the conversation is on Ai I wanted to show how I use Ai as a tool now in real time and why we need better datasets like requested here. Why we need a Ai spear not a general Ai hammer.

So watch cyberaxe fail and succeed using the current state of the Bedrock Dev Assistant. Not scripted.
https://youtu.be/AhIvYtJonfs

Presented by CyberAxe of www.OutLandishlyCrafted.com

Tip and Support Welcome, it takes hordes of hours to provide free support.
HTTP://www.OutLandishlyCrafted.com

#minecraft, #bedrock, #mcpe, #indiegamedev, #blockbench, #animations, #portals, #prototype #live

▶ Play video
plucky swift
#

You enjoy watching me fail. lol

dusty imp
#

You still did a good job demonstrating how useful it can be and clarifying some misconceptions a lot of people seem to have lol

plucky swift
#

The dark side of generalist Ai's..
My next guide will be me 3 hours in on a project its making and it's looped and failed 50 times now. And it's just a scrolling list of me cussing at it.
I'm doing better with my Ai induced Anger issues. lol
It really is like a 10 year old some days and you just want to strangle it. But when it's does fail loop you have so little outlet. So I take to cussing at it.
Why does it make me angry? Because for 40 years(all of mankind history) I've/we've dreamed about this moment and it's so close, but it still so stupid at times and hurts.

dusty imp
#

I understand lol

plucky swift
#

Right now, I think I'd rather have 5 specialist Ai's for 5 tasks. Than 1 generalist Ai for 100 tasks.

dusty imp
plucky swift
#

Yes it's a 80/20, I get alot of comments just about it, more than even my content. Which is interesting. One recently was it "brighten my day", so many guides and informational videos are so mono and robotic.
20% however, really really dislike it. lol. I figure they'll like to Docs or other guides better then. Can't please everyone, and when you do you cheapen it for someone.
I watch so many guides on so many Tech, dev, IT subjects and most are just so dry and dead. I guess, if 1 guide in the 10 people watch this week was silly and an upper instead of a mono or downer. I effected people for the better. Sometimes we need a shock to the system. Granted some people don't like that. We are doing Minecraft Devs should everything we do have more silly in it and less mono.

dusty imp
# plucky swift Yes it's a 80/20, I get alot of comments just about it, more than even my conten...

Exactly lol
Educational content is so much better, and easier to learn when it's more expressive than monotonous.

You can't please everyone, but glad to hear most of your audience actually love it haha
If anything, you could try balancing it here and there.

I personally really enjoyed listening to you. It kinda felt like I was having a conversation with the speaker than someone just going on and on about all the technical stuff xD

hollow vector
topaz wharf
#

This was technically a resounding victory for "Yes", but there are too many people pissed off, and concerns over quality and usefulness have been well stated.

#

Realistically, I think we have to at least wait for our userbase to get more comfortable with AI in general. Many, many people are especially concerned with the implications of AI for art and are extending that frustration to help bots.

Whether I (or anyone on the staff) agrees with these takes is not relevant. We should revisit this within the next year for sure, as there are many instances on this server where the same questions are asked and ideas are posed, but… yeah, not the right time.

topaz wharf
hard sigil
#

yay

hollow vector
#

Thank you all for your thoughts and contributions 🙂

We know this is a new space for us all and look forward to working with everyone navigating through the future of Minecraft, Add-Ons, and development etc. with AI that is safe, ethical and empowering for players and creators alike ❤️

Thank you @topaz wharf @quasi badge @sharp cliff and team for letting us offer to the community and for the space to discuss this in a safe and open environment!

sharp cliff
#

Of course! Thank you Fetxu for the offer, I hope that all of the messages sent here by our community can work to make the project better and stronger. Like Ciosciaa said, we should revisit this topic the next year, as it can be beneficial for all of us.

topaz wharf
#

Sorry it worked out like it did.

weary pond
#

I think this was a good decision overall, and it was handled very maturely and responsibly. Can’t wait to see the improvements/ideas y’all come up with when this topic is revisited.

graceful light
#

Yeah I'm not against this idea as a whole as it could greatly help with new developers and solving problems quicker but the ai space is very grey right now and for me there wasn't enough information provided and a few things I will state:

  • A clearer indication on the pricing in the future would be good so we know what to expect since it's very up in the air right now which isn't really ideal (I am not against charging for this as you would be hosting the model)
  • What your intentions are and what other things you would do with this data would be good to know so we get a clear understanding of what your company's motives are since I had never heard of it before
  • perhaps a local version of this model could be provided (doesn't have to be full) as I feel we would deserve this since we would be a major contributing factor and I am aware the dataset for this would be public but depending on it's size someone having the required power and resources especially in this community to train it would be unlikely.

I hope you don't take this harshly as I am not against this there's just a few grey area things that don't touch me right but I hope if this is requested again things are stated more clearly.
Have a great day

hollow vector
#

Hey there! I agree, the space is super grey and one we will all have to navigate as we go forward - We certainly will not be the first or last group to try building AI tools for this community and other Minecraft creator communities.

Ultimately the conversation will boil down to a few important challenges: data sets, inference costs, and the rapidly evolving nature of AI tech and models.

  1. The cost is really in the inferencing - generating the answers. For LLM's these are relatively small (compared to art etc). Any cost would be structured to cover the inferencing costs. What will this be? It changes everyday atm - With AI foundation models generating in varying degrees competency and inferencing times.
  2. The only thing we would use the data set is for creating coding assistants - There is practically no other use for it. Minecraft development does not translate well to any other space outside of Minecraft. This data set would be made available to us all btw.
  3. I want to super clear here - We are not building a ground up LLM AI model. That would be an insane amount of work. What we are hoping to achieve is fine tune a modern foundation LLM (LLAMA, MIXTRAL, GPT, CLAUDE etc..) with the cleaned and prepared data. This is why anyone here could do it - but we can only do it with a great dataset. This is what we are wanting to do.

This is obviously not happening here with this space - But we are continuing to prepare a dataset with what we can. We are also interested in sharing OS the data set we do build. Hopefully this and the Minecraft community can come together like we have throughout Minecraft's time on earth to build the kinds of data sets that will help us all create and play using AI to ease the burden of development as much as possible.

Hope that helps!

We found this conversation really valuable in hearing how our community feels about AI and what it might mean for our future together as a community. 🙂

teal fractal
#

I'm curious, for those who are interested in assisting in giving you data, is there any place we can send copies of addons that we have made?

graceful light
quick wave
#

Honestly the AI will learn nothing from this server 😂

dusty imp
# quick wave Honestly the AI will learn nothing from this server 😂

No lol. This server actually has a lot of valuable information and resources, some of which can't be found in the wikis. A lot of the times you don't even have to ask since you can use the search to find previous answers for the same. But how effectively an Ai will be able to use the data here is something the Ai experts can answer more accurately

finite bane
#

You think anyone is mentally stable enough to work after every damn api/format update? Now imagine going through that but in 30 seconds

#

HAHAHAHAHAHAHAHA HAHAHAHAHA-

AI 0 - humanity 1

#

This thing it will hate.
It will hate Mojang.

plucky swift
#

Have you used Ai as a dev tool? I'm not sure what you're talking about here.

orchid eagle
#

It'll create passive aggressive comments instead.

delicate plover
#

hm it sure will know how to fix errors 😂

heavy flame
#

why is everyone toxic with CraftBench? It's a good, cool project that gets better over time.

teal fractal
#

Many people are very conscious about where their data is, and/or are nervous about ai. I'm neither of those, but I can see where those people are coming from

shadow edge
#

not like theirs personal information being shared

orchid eagle
#

Microsoft: Let us introduce you to Copilot Recall.

tawdry coyote
#

its all done locally but that wont stop us from taking a small bite of telemetry

plucky swift
#

huh

orchid eagle
plucky swift
#

lol mean while adobe adds machine learn scanning of customer data.

tender carbon
quick wave
#

😂

signal fossil
sharp cliff
#

Interesting to see this chat getting revived every certain amount of time

unborn pond
noble rain
#

so what’s the takeaway from this now?

#

The poll is majority yes, can we update our roles for it?

grave seal