49

I just got one of those GDPR mails from gitlab.com, which pointed me to a Web page where I had to accept some terms and conditions. The same as everywhere, except this passage:

(For GitLab Contributors Only) As part of my voluntary contribution to any GitLab project, I acknowledge and agree that my name and email address will become embedded and part of the code, which may be publicly available. I understand the removal of this information would be impermissibly destructive to the project and the interests of all those who contribute, utilize, and benefit from it. Therefore, in consideration of my participation in any project, I hereby waive any right to request any erasure, removal, or rectification of this information under any applicable privacy or other law and acknowledge and understand that providing this information is a requirement under the agreement to contribute to the GitLab project.

As far as I understand GDPR, this passage is just plain bollocks and they're trying to get away with arguably the most difficult bit of GDPR, especially if you consider their argument. I can feel their pain, but it also just doesn't feel like this is made possible by GDPR and if GitLab would deny or not completely fulfil such a deletion request, it would be liable to litigation. Am I correct in this?

Note: I'm not trying to put GitLab in a bad spot here, they're just the first (only?) ones that included this kind of passage in their agreement.

sleske
rubenvb
    GitLab is probably referring to how git includes your name and email in every commit you make. To remove them would mean modifying the history of every change that was made to all projects since your first commit in each. That's indeed very destructive. This would mean that the next time someone syncs their copy of the project with GitLab, it would potentially show all the history of the project since your first commit as diverging, and one would be left to figure out what part of their local history to move onto GitLab's new history, and how. I expect GitHub and others to have the same problem. – JoL May 25 '18 at 21:40

2 Answers

72

Yes, their waiver has no legal basis and is invalid under the GDPR. They should have hired a better lawyer.

GDPR rights cannot be waived (mrllp.com).

The last bit should have been:

Therefore, in consideration of my participation in any project, I understand that retaining my name and email address, as described above, does not require my consent and that the right of erasure, as spelled out in the GDPR Article 17 (1) b does not apply. The legal basis for our lawful processing of this personal data is Article 6 (1) f ("processing is necessary for the purposes of the legitimate interests pursued by the controller").

I.e. there is nothing in the GDPR that compels GitLab to erase this information, but their waiver is bogus.

Keeping track of individual contributions in a software project is necessary for a number of reasons, including security (if somebody contributes code that jeopardizes security, you want to audit everything that person has contributed).

Free Radical
  • 5
    OK, so it is made possible by GDPR to keep these people's names in history. I wondered what kind of impact this would have had to open source software, but now it seems limited. Thanks for this bit of information! – rubenvb May 25 '18 at 06:14
  • 21
    People saw the "consent" basis and went bananas, ignoring all the other legitimate bases for data processing. – pjc50 May 25 '18 at 10:20
  • 2
    Hum, I would say they have to remove the personal data, or at least anonymize it if they cannot remove it from the commit history. Therefore, it would obviously require a rewrite of the full commit history. – TheCodeKiller May 25 '18 at 11:37
  • 2
    One thing to note is that the right of erasure should apply to historical data. E.g. you get a GDPR notice and say "ok I don't want to accept and want to roll back my personal data", but with thousands of old commits done in the past. Just to comment out... Looks like GDPR is destructive to Git and Blockchain technologies – usr-local-ΕΨΗΕΛΩΝ May 25 '18 at 15:10
  • 1
    Good answer, especially since it helps dispel the myth that GDPR always requires user consent. However, I think things may not be quite so simple: One could reasonably argue that GitLab should at least anonymise commits on request (i.e. anonymise name & email address), since that information is not critical for a software project (and you could anonymise such that you can still tell which commits had the same committer info). I realize this is technically difficult with Git (changing hashes and all that), but I'm not sure this would be a sufficient defense. – sleske May 25 '18 at 20:19
  • 4
    I was about to go off on a rant about how this is an example of why the whole "right to deletion" thing is fundamentally unethical and this being a perfect example of why, but apparently the law stopped short of breaking all legitimate reasons for indefinite historical record keeping, sooo... I still stand by my thesis but the rant is postponed. +1 for citing example legalese and bringing attention to this reasonable provision for exceptions in the law. – mtraceur May 25 '18 at 21:06
  • 2
    @mtraceur Oh, don't be so sure some court won't rule that a disliked foreign company doesn't really need not to rewrite a Git history and should be fined. – chrylis -cautiouslyoptimistic- May 26 '18 at 00:00
  • 4
    @chrylis: That would be really fun with Mercurial's append-only history modification (TL;DR: Instead of throwing away rewritten commits after the GC period expires, Mercurial marks them obsolete, hides them from view, and persists them forever in clones that knew they existed. This is useful for a number of reasons but also means full obliteration has to be done locally in each clone -- there's no git push --force equivalent. So you can't automate it without local shell access everywhere.) – Kevin May 26 '18 at 03:43
  • 14
    Fast work. For the curious, GitLab has already integrated this change (not worded precisely as in this answer but close). – Wildcard May 26 '18 at 04:33
  • 2
    I would argue that for any decently sized project a force push is basically impossible. I think where it would get interesting is that there's probably no requirement in their ToS that the name and email address in the commit be real, just that they're basically valid. – Kaithar May 27 '18 at 01:53
  • @Kaithar That applies to the name and email associated with individual commits in general. The "proper", so to speak, way of ensuring accurate attribution in commits is cryptographic signatures. https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work – JAB May 27 '18 at 18:55
  • 1
    @Kevin: That doesn't seem much different from Git, to be honest. Even if the server accepts git push --force with your rewritten version, there is no way to automate receiving it into everyone else's repositories. (If they try git pull, it'll either fail outright or accidentally merge both old and new.) – user1686 May 28 '18 at 08:48
  • 1
    @grawity: You don't care about end users, just your own local replicas, backups, etc. You can't reach into an end user's computer and delete data off of it anyway. – Kevin May 28 '18 at 15:31
  • @Kevin I think GitLab and individual project distributors/owners very much do care about end users. Easing distribution of software is the very point of their role, so I can't see them not caring about causing such a mess on their users' systems. – JoL May 29 '18 at 19:38
  • 1
    @JoL: I mean from a legal perspective. It's beyond any provider's reasonable ability to delete the personal data of user A off of the computer of user B. – Kevin May 29 '18 at 19:49
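
One of the comments above suggests anonymising committer info in a way that still shows which commits share a committer. A minimal sketch of that idea, assuming a keyed hash (HMAC-SHA256) and a secret held by the project; the function name `pseudonym`, the `contributor-` label format, and the example identities are hypothetical, and whether such a scheme would satisfy an erasure request is a legal question this sketch does not answer.

```python
import hashlib
import hmac


def pseudonym(identity: str, secret_key: bytes) -> str:
    """Map a committer identity to a stable, opaque label.

    The same identity and key always give the same label, so commits by one
    person can still be grouped, but the label does not contain the name or
    email itself.
    """
    normalized = identity.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
    return f"contributor-{digest[:12]}"


key = b"project-secret-kept-off-the-repo"  # assumption: stored outside the repository
alice = pseudonym("Alice <alice@example.org>", key)
bob = pseudonym("Bob <bob@example.org>", key)

# Same committer maps to the same label; different committers get different labels.
print(alice == pseudonym("Alice <alice@example.org>", key), alice != bob)
```

Applying such labels to existing commits would still change every affected commit ID, so the history rewrite discussed in the comments is not avoided.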
5

(Please note that I am a random guy on the internet, not a lawyer)

Although the GDPR seems rather ill-conceived, they managed to cover this part OK:

(3) Paragraphs 1 and 2 shall not apply to the extent that processing is necessary:

...

e) legal claims.

It's already established that contributions to software are an act of writing, making you the author, with author rights and copyrights that have been established in law over the course of the past 500 years. The copyright can be granted or sold (automatically if someone is paying you for it), but the author rights cannot be waived. (The idea being someone cannot sue an author and legally take their authorship away, even if the author owes them money).

Thus the record of who-wrote-what is a legal claim and cannot be removed.

I can feel their pain, but it also just doesn't feel like this is made possible by GDPR and if GitLab would deny or not completely fulfil such a deletion request, it would be liable to litigation. Am I correct in this?

GitLab does not have the right to edit a contributor list to a codebase they do not own. It would be illegal for them to fulfill a deletion request. GDPR does not apply to the contributor list.

IKM
    +1. One nuance to note is that "copyright being granted/sold" is an American-esque construct, and many countries consider copyright inalienable, and in those countries, a similar effect is achieved with authorship-for-hire by unlimited and perpetual licensing. Also I don't know if there really is a legal conflict between authorship claims being special in the case of the author themselves wanting such claims deleted, but if there is then a version control system's commit history is also effectively a list of authorship claims, to my non-lawyer intuition. – mtraceur May 25 '18 at 21:10
  • 12
    "Although the GDPR seems rather ill-conceived" Maybe stick to the facts, a lot of people - including me - wholeheartedly disagree on this. – Polygnome May 25 '18 at 22:02
  • 5
    "GitLab does not have the right to edit a contributor list to a codebase they do not own" -- although a note to anyone who is hoping to evade GDPR by simply writing everyone's personal data into a document and selling the copyright in that document to a third party so that "we don't own this data and therefore cannot remove your personal data from it": that won't work, you'll still get fined. If the court thought that's what was going on then, as a last resort, GitLab could be directed to delete the whole document, at which point they may or may not discover that they can edit it after all ;-) – Steve Jessop May 26 '18 at 13:16
  • 6
    @mtraceur "many countries consider copyright inalienable": you're thinking of moral rights; that does not apply to other aspects of copyright protection, which can be granted or sold anywhere. Furthermore, moral rights include a right not to be publicly identified as the author of a work, so the idea that moral rights forbid a data processor from removing identifying information from a work of authorship is ill founded. See https://en.wikipedia.org/wiki/Moral_rights. The "record of who wrote what" must be removed if the author requests it, for copyright purposes and for GDPR. – phoog May 26 '18 at 13:22
  • @Polygnome The rest of my post was matter-of-fact other than that one disclaimer, but it needs to be there because I don't want anyone to mistake me for a supporter of this law. I do not want any blame directed toward me if this law has unintended consequences a year or two from now. – IKM May 27 '18 at 12:04
  • @IKM Its still a matter of opinion and imho not suited for an answer on a Q&A site. Facts are facts, and opinions are opinions. We close opinion-based questions for a good reason. – Polygnome May 27 '18 at 12:12
  • 2
    @Steve Jessop - The personal data wouldn't be part of a legal claim in that case. So the first request for removal should go to the owner of the codebase, the second request to GitLab, similar to how it would work today if someone decided to upload a Harry Potter book onto GitHub. They can lock/delete the whole repo, but they can't just start editing the code inside files. – IKM May 27 '18 at 12:13
  • 2
    @Polygnome - "Its still a matter of opinion and imho not suited..." So IKM should take heed of your (imho) opinion but should not express their own? This is not an "opinion based" answer. It is a "factual" answer that happens to contain a casual, innocuous opinion within it. Oddly enough, the fact that the answer would be factually unchanged, with or without the "opinion" statement, *proves* it is not an opinion based answer so the opinion statement should be allowed to remain. I guess I should add that this is just my opinion. – Kevin Fegan May 28 '18 at 12:45
  • @KevinFegan No, he should just keep the opinion completely out of the answer and keep it neutral. "It is a "factual" answer that happens to contain a casual, innocuous opinion within it." => Yes, and the opinionated part could be left out, we are a fact-oriented Q&A platform, after all. I never said the whole answer was opinion-based, far from it, I always referred to that one sentence. – Polygnome May 28 '18 at 12:49