Updates to GitHub Copilot interaction data usage policy

[−] stefankuehnel 51d ago

If you scroll down to "Allow GitHub to use my data for AI model training" in GitHub settings, you can enable or disable it. However, what really gets me is how they pitch it like it’s some kind of user-facing feature:

Enabled = You will have access to the feature

Disabled = You won't have access to the feature

As if handing over your data for free is a perk. Kinda hilarious.

[−] data-ottawa 51d ago

It’s not so bad, there’s no double negative and it’s not a confusing “switch” that is always ambiguous as to whether it’s enabled or not.

In contrast, when you create a GCS bucket it uses a checkmark for enabling “public access prevention”. Who designed that modal? It takes me a solid minute to figure out if I’m publishing private data or not.

[−] mentalgear 51d ago

> On April 24 we'll start using GitHub Copilot interaction data for AI model training unless you opt out. Review this update and manage your preferences in your GitHub account settings.

Now "Allow GitHub to use my data for AI model training" is enabled by default.

Turn it off here: https://github.com/settings/copilot/features

Do they have this set on business accounts also by default? If so, this is really shady.

[−] QuadrupleA 51d ago

Fun fact: Copilot gives you no way to ignore sensitive files with API keys, passwords, DB credentials, etc.: https://github.com/orgs/community/discussions/11254#discussi...

So by default you send all this to Microsoft by opening your IDE.

[−] section_me 51d ago

If I'm paying, which I am, I want to have to opt in, not opt out. Mario Rodriguez / @mariorod needs to give his head a wobble.

What on earth are they thinking...

[−] pred_ 51d ago

What is the legal basis of this in the EU? Ignoring the fact they could end up stealing IP, it seems like the collected information could easily contain PII, and consent would have to be

> freely given, specific, informed and unambiguous. In order to obtain freely given consent, it must be given on a voluntary basis.

[−] sph 51d ago

Thanks to Github and the AI apocalypse, all my software is now stored on a private git repository on my server.

Why would I even spend time choosing a copyleft license if any bot will use my code as training data to be used in commercial applications? I'm not planning on creating any more opensource code, and what projects of mine still have users will be left on GH for posterity.

If you're still serious about opensource, time to move to Codeberg.

[−] diath 51d ago

> This approach aligns with established industry practices

"others are doing it too so it's ok"

[−] Deukhoofd 51d ago

So basically they want to retain everyone's full codebases?

> The data used in this program may be shared with GitHub affiliates, which are companies in our corporate family including Microsoft

So every Microsoft owned company will have access to all data Copilot wants to store?

[−] hoten 51d ago

Why is there no "cancel Copilot subscription" option here? Docs say there should be...

Mobile

https://github.com/settings/billing/licensing

EDIT:

https://docs.github.com/en/copilot/how-tos/manage-your-accou...

> If you have been granted a free access to Copilot as a verified student, teacher, or maintainer of a popular open source project, you won’t be able to cancel your plan.

Oh. jeez.

[−] hmate9 51d ago

For what it's worth they're not trying to hide this change at all and are very upfront about it and made it quite simple to opt out.

[−] badthingfactory 51d ago

I appreciated the notification at the top of the screen because it prompted me to disable every single copilot feature I possibly could from my account. I also appreciated Microsoft for making Windows 11 horrible so I could fall back in love with Linux again.

[−] _pdp_ 51d ago

Microsoft doing dumb things once again.

Who in their right mind will opt into sharing their code for training? Absolutely nobody. This is just a dark pattern.

Btw, even if disabled, I have zero confidence they are not already training on our data.

I would also recommend sprinkling copyright notices all over the place and changing the license of every file, in case they have some sanity checks before your data gets consumed - just to be sure.

[−] TZubiri 51d ago

If this doesn't sound bad enough, it's possible that Copilot is already enabled. As we know, these kinds of features are pushed to users instead of being asked for.

Maybe it's already active in our accounts and we don't realize it, so our code will be used to train the AI.

Now we can't be sure if this will happen or not, but a company like GitHub should be staying miles away from this kind of policy. I personally wouldn't use GitHub for private corporate repositories. Only as a public web interface for public repos.

[−] TZubiri 51d ago

Two issues with this:

1- Vulnerabilities: secrets can be leaked to other users.

2- Intellectual property can also be leaked to other users.

Most smart clients won't opt out; they will just cut usage entirely.

[−] stefanos82 51d ago

Serious question: let's say I host proprietary code for my various clients on this platform. Who can guarantee that AI won't replicate it for competitors who decide to create something similar to my product?

[−] OtherShrezzing 51d ago

It’s not clear to me how GitHub would enforce the “we don’t use enterprise repos” stuff alongside “we will use free tier copilot for training”.

A user can be a contributor to a private repository, but not have that repository owner organisation’s license to use copilot. They can still use their personal free tier copilot on that repository.

How can enterprises be confident that their IP isn’t being absorbed into the GH models in that scenario?

[−] pizzafeelsright 51d ago

I am not certain this is that big of a deal outside of "making AI better".

At this point, is there any magic in software development?

If you have super-secret content, is a third party the best location?

[−] rectang 51d ago

I just checked my Github settings, and found that sharing my data was "enabled".

This setting does not represent my wishes and I definitely would not have set it that way on purpose. It was either defaulted that way, or when the option was presented to me I configured it the opposite of how I intended.

Fortunately, none of the work I do these days with Copilot enabled is sensitive (if it was I would have been much more paranoid).

I'm in the USA and pay for Copilot as an individual.

Shit like this is why I pay for duck.ai where the main selling point is that the product is private by default.

[−] david_allison 51d ago

I have GitHub Copilot Pro. I don't believe I signed up for it. I neither use it nor want it.

1. A lot of settings are 'Enabled' with no option to opt out. What can I do?

2. How do I opt out of data collection? I see the message informing me to opt out, but 'Allow GitHub to use my data for AI model training' is already disabled for my account.

[−] OtherShrezzing 51d ago

So, how does this work with source-available code, that’s still licensed as proprietary - or released under a license which requires attribution?

If someone takes that code and pokes around on it with a free tier copilot account, GitHub will just absorb it into their model - even if it’s explicitly against that code’s license to do so?

[−] liquid_thyme 51d ago

They use data from the poor student tier, but arguably, large corporates and businesses hiring talented devs are going to create higher quality training data. Just looking at it logically, not that I like any of this...

[−] cebert 51d ago

I wish GitHub would focus on making their service reliable instead of Copilot and opting folks into their data being stolen for training.

[−] kevcampb 48d ago

This is terrifying. GitHub was the one provider I did not expect to take such an action. We're now playing whack-a-mole with vendors to try and ensure that our company IP doesn't end up being used to train a model.

[−] etothet 51d ago

The fact that this is on by default, especially for paid accounts and even more especially for organizations, where certain types of privacy are sometimes mandated by the industry your business is in, is ridiculous.

There should also be a much easier one-click to opt out without having to scroll way down on the settings page.

[−] thesmart 51d ago

I'm ready to abandon GitHub. Enshittification of the world's source infrastructure is just a matter of time.

[−] robeym 49d ago

There are several settings in my account relating to Copilot that are locked/enabled with a shield and key icon next to them. Any idea how to disable these settings? They're on the same settings/copilot/features page.

[−] jmhammond 51d ago

Mine was defaulted to disabled. I’m on the Education pro plan (academic), so maybe that’s different than personal?

[−] ncr100 51d ago

On my Android phone I was able to change the setting using Firefox by logging into GitHub and not allowing it to launch the GitHub app.

I was unable to change the setting when I used the GitHub app to open up the web page in a container; button clicks weren't working. Quite frustrating.

[−] greatgib 50d ago

And something important, that is leaking from the phrasing of their blog post, is that it is not really "Github" that wants to suck all your data "prompts, code, context, documents", ... but it is "Microsoft"!

[−] phendrenad2 51d ago

So I do all the work of thinking about how to do something, and as soon as I tell Copilot about it, now it's in the training data and anyone can ask the LLM and it'll tell them the solution I came up with? Great. I'm going to cancel.

[−] sbinnee 51d ago

Bold move. Who uses Copilot these days? Unless they have free credit I mean.

[−] rvz 51d ago

> From April 24 onward, interaction data—specifically inputs, outputs, code snippets, and associated context—from Copilot Free, Pro, and Pro+ users will be used to train and improve our AI models unless they opt out.

Now is the time to run off of GitHub and consider Codeberg or self hosting like I said before. [0]

[0] https://news.ycombinator.com/item?id=22867803

[−] Heliodex 51d ago

Finally. The option for me to enable Copilot data sharing has been locked as disabled for some time, so until now I couldn't even enable it if I wanted to.

[−] indigodaddy 51d ago

Checked and mine was already on disabled. Don't remember if I previously toggled it or not..

[−] dartf 49d ago

I don't see an option to opt out? Is it a US-only thing?

[−] djmashko2 51d ago

> Content from your issues, discussions, or private repositories at rest. We use the phrase “at rest” deliberately because Copilot does process code from private repositories when you are actively using Copilot. This interaction data is required to run the service and could be used for model training unless you opt out.

Sounds like it's even likely to train on content from private repositories. This feels like a bit of an overstep to me.

[−] mt42or 51d ago

Is it legal? Surely not in any EU country.

[−] marak830 51d ago

As it's enabled by default, does that mean everything has already been siphoned off and now I'm just closing the gate behind the animals escaping?

Shit like this shouldn't be allowed.

[−] explodes 51d ago

We all knew Microsoft was going to destroy GitHub eventually when it was first bought.

How much longer do you want to tolerate the enshittification? How much longer CAN you tolerate it?

[−] tuananh 51d ago

Opting users in by default is a very shady choice, GitHub.

[−] semiinfinitely 51d ago

I'll be moving off GitHub now.

[−] baobabKoodaa 51d ago

(oops)

Updates to GitHub Copilot interaction data usage policy (github.blog)

166 comments