#GitHub Copilot Agent


versed glade
#

@mild roost do you mean a bad idea for GitHub to launch this? Or a bad idea for Dagger contributors to use it?

mild roost
#

I think it's legally compromised.

versed glade
#

Oh I see

mild roost
#

the whole situation with the lawsuits over model theft of content from all over the internet is one part.

#

lawyers who use Office 365 are suing because it's exfiltrating legal documents that were made in confidence. even if there are guardrails to keep those documents from leaking out of the practice, there are none to keep them from leaking between clients in the same practice, which destroys the protections that maintain confidentiality.

#

two things you don't mess with: confidentiality and clients' money. more lawyers are disbarred over those two issues than any other.

#

Like I don't mind rules-based code generators but I've seen these tools just lift things from code with the wrong license and I'm really not a fan of the exposure.

#

(by wrong license, I mean any of the so-called "viral" licenses)

slate pawn
#

not to get lawyery, but copyright infringement does require intent to copy, and stealing GPL code through an AI introduces a ton of doubt into an already-very-difficult-to-prove-intent situation

#

put another way, we'll see if M$'s massive army of IP lawyers get more work to do because of this, but I'm kinda doubtful users will have issues bigger than unenforceable C&Ds

median karma
#

@mild roost @slate pawn @versed glade
I may be wrong about the context here, but if you don't want to send your data, you can turn that off in the GitHub Copilot settings...

mild roost
#

At our company we have a policy against using new systems until we know the controls (in the sense of ensuring compliance) and they pass IQ/OQ validation proving compliance. We have documentation requirements tied to 21 CFR Part 11 (FDA).

#

There's a whole video on the Lawful Masses YouTube channel about the lawyer stuff I mentioned above. In their profession, the intent doesn't matter so much as the result in the situation above. Also, the video is about the project manager for 365 not understanding that, and Microsoft moving fast and breaking things in the process, all while apparently not asking lawyers if this use case could cause conflicts.

#

Microsoft has turned Copilot Generative AI features on for everyone in the product formally known as Office. Does Microsoft 365 with Copilot violate lawyers' duty to protect client confidential information? Ben Schorr from Microsoft responds - poorly.

#

we have a similar confidentiality rule where someone working on a project for $redacted_a cannot take findings from unpublished work there over to $redacted_b

median karma
#

@mild roost I am talking about using such AI in open source projects, where someone is building 100% open source technology and has no confidential code. In that case I think it can help...
And this is GitHub Copilot and not Microsoft Copilot...JFYI

mild roost
#

Don't ignore licenses. Or better put: ignore licenses at the cost of exposure

median karma
mild roost
#

I don't think any of it is fine from this perspective. They take in (appropriate) content they are not licensed to use.

#

there are many legal battles being fought over this on multiple fronts.

#

also, GoLand has some "ai" parts, and they recently broke the inline tool by trying to integrate it into their "ai" stack. or take Google's use of Gemini v1 as an Assistant replacement: it couldn't do the one thing that I do with Assistant, which is set a timer.

median karma
slate pawn
# mild roost At our company we have a policy against using new systems until we know the cont...

copilot 365 is 100% a compliance/confidentiality/trade secrets nightmare, I'm just doubtful about the GPL-bleed thing that's always the first thing software engineers jump to.... copyright law vs bar ethics vs trade secrets are all relatively orthogonal here. the craziest part of it in trade secrets law is that trade secrets are only trade secrets when you make "reasonable efforts" to protect confidentiality, and there is a very strong argument to be made that if you feed your trade secrets to a 3rd party LLM you're not making a reasonable effort to keep them secret... then bam, your 150 year old top-secret Coca-Cola recipe is public domain. the bar has an even stricter standard: "reasonable" does not apply, it's just don't break confidentiality, period.

median karma
#

Anyway, it's now your call, or the Dagger team's, whether to evaluate and use it for faster development. Thanks for your time and knowledge.
@mild roost

ancient crest
#

With most of the large LLM vendors, you can buy plans where they do not train on your data. For example, this is part of paid Google Workspace accounts and Vertex AI

median karma
# ancient crest With most of the large LLM vendors, you can buy plans where they do not train on...

I propose we explore leveraging advanced Large Language Models (LLMs) like Gemini and Claude Sonnet to accelerate Dagger's development. Both offer potential advantages, and evaluating them is crucial for maximizing value, especially in the open-source context where rapid delivery is key. It's also important to consider the data handling practices of each provider.

I urge the Dagger core team to assess the potential of these LLMs for integration, considering factors like performance, cost, data handling practices, and any other relevant implications. Thank you.

ancient crest
#

If AI could understand the BuildKit code base, I would be impressed. Even I had a hard time learning it because it has near zero comments and is rather complicated overall

median karma
ancient crest
#

AI comments tend to just explain the steps of the code, not the conceptual logic or why things are done as they are

#

These types of comments are noise and don't help people

median karma
#

We can have two types of comments in code: one explains the code and its details, and the other explains the why, i.e. the decision behind the code.
We can prioritize such work and get started.
But we should also start adopting it for open source projects, where everything is open. GenAI can give so much value to such projects.

ancient crest
#

They still hallucinate too much, they are still very elementary in their explanations, it's why projects don't adopt AI more

median karma
ancient crest
#

I think figuring that out will help you understand why I wouldn't want to work on that codebase

#

You're going to have to understand the code in order to know if the AI is doing a good job or not

median karma
ancient crest
#

Just one example that shows how these AI have no idea what they are talking about

#

Most of the advice in there is very generic, which is expected because transformers are giant averaging machines that predict the most likely next token(s).
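
To make "predict the most likely next token" concrete, here is a toy sketch of greedy decoding. The vocabulary and scores are made up for illustration; a real model computes its scores from the whole preceding context, and nothing here comes from Gemini or any other real system.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_next_token(vocab, logits):
    """Pick the single most probable token (greedy decoding)."""
    probs = softmax(logits)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best], probs[best]


# Hypothetical vocabulary and scores, invented for this sketch.
vocab = ["the", "build", "cache", "banana"]
logits = [2.0, 1.0, 0.5, -1.0]

token, prob = greedy_next_token(vocab, logits)
print(token, round(prob, 3))
```

The highest-scoring continuation always wins under greedy decoding, which is one reason generic-sounding output is unsurprising.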

median karma
ancient crest
#

Based on my usage, my perspective is that they are wrong more often than right. 99% is a big overestimate; I don't recall anything I would consider an insight in the video you shared.

#

On top of that, it made multiple errors and incorrect interpretations of the code. This only reinforces my opinion these things are not ready for prime time.

I'm with Yann LeCun when he says we need to move beyond transformers to live up to the hype

median karma
# ancient crest Most of the advice in there is very generic, which is expected because transform...

I just want to learn, so I am asking...
As you mentioned, these are generic comments, so I wonder:
in the future, should we consider upgrading the Dagger code as suggested by Gemini in this video, even for a few of the suggestions?
If the answer is yes then no discussion is required, and if the answer is no, should we rethink?

What are your thoughts on adopting Devin (https://dagger.io/blog/new-ai-developer-devin)?
Check out SWE from GitHub here: https://www.youtube.com/watch?v=VWvV2-XwBMM&embeds_referring_euri=https%3A%2F%2Fgithub.blog%2F&source_ve_path=MjM4NTE

ancient crest
#

A lot of companies have invested a lot of money into AI, so they are going to overpromise and try to sell whatever they can to make that money back. Many others are on the AI marketing train, because that is what is in vogue right now. You can show cool examples, but when tested outside of constrained examples, they all fail to live up to the promise.

re: the Dagger blog post and PR, I wonder if @versed glade noticed that they committed a 42 MB binary to git...?
https://github.com/dagger/dagger/pull/9130/files#r1881196667

#

The main thing I've learned from this AI hype cycle is that people really really want to outsource their thinking and the wizard behind the curtain has too many people believing these tools are ready for it, which they are not

mild roost
#

Reinforcing your point but you mentioned it not even being close to 99%

#

Also, I do not want comments about the what in code unless it's something like a weird hand-rolled loop. Even then, I don't want a redundant retelling of the steps; that's what the code is for. I want to know why. Why are you ANDing this value with that? What does a variable represent in a function? Where did this constant come from? What is this technique called?
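
A minimal sketch of that difference. The constant and both function names are hypothetical, invented for illustration, and not taken from any codebase discussed here. The two functions are identical; only the comments differ.

```python
# Hypothetical example: PAGE_SIZE and both function names are made up.
PAGE_SIZE = 4096  # assumed page size in bytes; must be a power of two


def page_offset_what(addr: int) -> int:
    # "What" comment: AND addr with PAGE_SIZE minus one and return it.
    # This only retells the line below.
    return addr & (PAGE_SIZE - 1)


def page_offset_why(addr: int) -> int:
    # "Why" comment: PAGE_SIZE is a power of two, so PAGE_SIZE - 1 has
    # all of the low bits set. Masking with it keeps only the offset
    # inside the page, which equals addr % PAGE_SIZE for non-negative
    # addr. The technique is called bit masking.
    return addr & (PAGE_SIZE - 1)


print(page_offset_why(4097))
```

The second comment answers the questions above: why the AND, where the constant comes from, and what the technique is called.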

#

Like, I hope everyone understands what the "reasoning" steps in these new models are for. They're to prestidigitate on the "impressive" thinking beyond the wrong answer. Just ask any of the explanatory mode models to give you a random number and explain it.

#

It goes on about lucky numbers and how 42 is used in popular culture and how a number less than ten doesn't look random enough and other word salads.

mild roost
versed glade