GDPR deficiencies - changes needed before May 25, 2018
Since pretty much anything with a web address falls under the purview of the EU's General Data Protection Regulation, we need to make some changes here. Upon a cursory review I find the following issues at minimum:
- First sentence: "Participation on the Doom Wiki at DoomWiki.org, by design, rarely involves considerations of privacy" - we cannot really say this any more; the EU now defines usernames and IP addresses as personal information and considers them subject to privacy laws, so they're now a full concern.
- "Users that do register are identified by their chosen username." Since I believe MediaWiki 1.24, real name is an option for input on Special:CreateAccount and through user preferences and can be used instead of the username. This is also protected information and should be mentioned explicitly.
- "the IP address used is publicly and permanently credited as the author of the edit." Under GDPR, if so requested, we will have to suppress IP address contribution by-lines using revision information suppression (aka oversight). So "permanently" is not necessarily the best choice of wording.
- "Once created, user accounts will not be removed. It may be possible for a username to be changed, but there is no guarantee that a username will be changed on request" - GDPR makes removal of accounts a legal right of EU citizens, so it would be necessary to merge a user's contributions into an anonymous account if a request for user deletion occurs - this is no longer optional for us.
- "there is no expectation of any permanent deletion occurring." - Again, this is also enumerated as a user right, and it is technologically possible, however painful it may be under MediaWiki to have to run bespoke scripts to pull out and extinguish such information.
- "When a visitor requests or reads a page, no more information is collected than is typically collected by web sites. MancuNET may retain raw logs of such transactions, including the originating IP addresses, but these will not be published nor used to track legitimate users." - We need to fully enumerate the information collected by Apache when a web hit occurs. The Wikimedia Foundation now has more complete wording in its policy that we can borrow.
- "This information is automatically deleted after a set period." - We need to figure out the actual retention lengths anywhere this wording is used, or, as an alternative, use the language "for the minimum length of time technically required."
- "Information transmitted to these third-party sites is limited to the minimum possible and is anonymized to the greatest extent possible." - This needs to be made more explicit, in the same manner as explaining what information is automatically collected by our own server.
- Due to lack of participation and interest, however, the Doom Wiki has not constructed formalized procedures to handle such issues. Further, per the introductory paragraph, we have little or no empirical experience to inform best practices. The above text represents an educated guess as to what constitutes reasonable behavior by administrators and contributors. It has not been "ratified" by anyone, and individual administrators or database maintainers may disagree with some provisions. As in all cases of policy enforcement on DoomWiki.org, administrators are expected to use their common sense and good judgement to derive a workable solution.
- None of this flies any longer unfortunately:
- This needs to become a full-force official policy.
- Users must be assured that staff members will comply by legal necessity.
- Additional language and best practices can be borrowed from other organizations, such as WMF
--Quasar (talk) 08:38, 22 May 2018 (CDT)
- Most of this is simply inexact language, which arises from condensing the WMF version. The WMF version includes pages of detail about the applicable ethical principles and the technical remedies to be followed. I know almost nothing on the server side, and everyone who does was busy with debugging and configuration. I certainly did know that admins would disregard any enforcement procedure they disagreed with. Therefore, I summarized the detailed statements into general ones. Those can be improved, as you say, now that we're better informed.
- The last part gets into a broader issue of what obligations go along with the admin position. To my knowledge this is the first time a specific "legal necessity" has been stated. If you're serious about this, then I assume (a) all admins must explicitly agree or be demoted, (b) the shell users' collective skill set must be known to include all of the corrective activities proposed, and (c) off-wiki contact information must be provided in case harm would be caused by a public request.
- I agree this document should be changed first since it is visible immediately on the transition date, but I expect it will need tweaking as the above described inventory and tools become available. Ryan W (living fossil) 17:56, 22 May 2018 (CDT)
- I envision that bureaucrats should be the only ones burdened with any compliance activities (aside from helping get this document updated) - frontend admins should rarely have to deal with it by doing anything more than referring somebody to this policy and to my contact info (I'll be the "designated" information officer or whatever it is they're calling it, since nobody else really has the full amount of access required and is ever actually around here to deal with stuff). --Quasar (talk) 02:53, 23 May 2018 (CDT)
Highest priority improvements
Numbering these in advance in case of RL interruptions. I will fill in any diffs I create.
As I said on IRC, each paragraph imported from the WMF version is not ipso facto incrementally better compliance, as tempting as that is to believe, because the WMF has not yet audited its policies for compliance. Some points might well be steps backward, at least regarding the spirit of the regulation.
- Permitted uses section: The current policy takes great pains to confine these to specific reactive steps that protect the project (malicious editing, downtime, imminent legal proceedings). Are we now broadly allowing proactive research with no identified risks? "help you share your knowledge with the world; create new features; learn more about how the Project Site is used" - that could mean anything. If this is to encompass ongoing development activity like extension maintenance or SEO, I suppose that's unavoidable, but mention them as narrow exceptions.
- Scope: Given we just added a big wad of description on local storage (from the WMF version), I think this is now an important example. I could be wrong. Even if I am right, I hope the webmaster will be kind enough to correct any technical misstatements.
- Access to hidden information: Oops, this was a placeholder for using the term "staff" in this section, already done in #12. The link I removed seemed to contradict the text, which is almost entirely about staff, or things that non-staff can do less well but must sometimes attempt in emergencies (e.g. guessing range blocks from recent changes).
- Disclaimer: Seems obvious, if long-winded. Did Quasar or Manc indeed have to sign a document for Google to provide all that sitemap information and stuff?
- Publicly Visible Information: Per the above post regarding "merge a user's contributions into an anonymous account . . . this is no longer optional for us." I also tried to make it consistent with precedents, whereas we didn't usually evaluate our legal obligations point by point before proceeding! Shell users should review this part carefully, because they will have to do all the actual work. Once they have a procedure in mind, they might even consider spelling out how a username or IP can indeed be "erased" while still complying with the CC attribution requirement; I have some idea because Wikipedia does it regularly, but a random contributor might not, and we don't want to imply that we'll hide behind the letter of CC-BY to make their erasure more difficult.
- IP addresses: Per the above post regarding "This information is automatically deleted after a set period." The existing language also seemed inconsistent with "How Long Do We Keep Your Data?" down below. N.B. The last time we discussed this on IRC (in the context of checkuser functionality), it was found to be patently false. That should be addressed before this goes live.
- Will MancuNET have its own public document? If so, that should be enumerated in the warning text about offsite privacy policies applying concomitantly. If not, we should avoid reference to actions that only Manc has permissions to do, unless Manc indeed agrees to do them. (EDIT: per Quasar's post below, don't even bother mentioning MancuNET by name; it will just confuse people.)
- For Legal Reasons: Aren't there enough project-wide risks without adding this one? Unlike the WMF, we do not have a legal team and $100 million in the bank. I always thought our main goal was to preserve our knowledgebase beyond the lifetime of the current global economy, if not beyond the species itself. Please no epic quests wherein our contributors' tens of thousands of hours of labor, regardless of political affiliation, could become collateral damage.
- To Protect You, Ourselves and Others: It is hardly tenable to instantly elevate all policies to privacy-superseding status! Our historical practice is for one contributor to write a document with no discussion and very little research. If we wish a public commitment to be legally bound by a particular policy, which is probably a sign of the project's maturing, then we must actually review it first (as we did with the ToU and are doing here). This is in contrast to the WMF process which not only involves a legal team, but a place where they can establish global policies which override community-authored pages in case of inconsistency (the Meta-Wiki). Therefore, incomplete or outdated pages on a specific project are not a legal risk in the same way. "to protect our organization, employees, contractors, users, or the public" - as with #1 above, this is an invitation to fishing expeditions. Investigatory laws in the US are already incredibly broad, so what is the motivation for proactively going beyond them? "assess and address" - at face value this seems a bit excessive also, but if shell users find it an accurate description of their day-to-day tasks, so be it. "imminent and serious bodily harm or death" - IIRC this was added in response to the Michelle Carter case, where it was found that protection of one person superseded constitutional rights in an entire state. From that wording you can probably guess my opinion, but I am also against risking our project's existence in breaching experiments, so I left it in. All that said, if the GDPR specifically mentions such situations as NOT being exceptions, then this sentence doesn't belong.
- Disclaimer: I've changed my mind and removed the general statement of principles. In the context of a policy with actual legal force, IMO, it is partly redundant with the lead section, and partly just a motherhood-and-apple-pie plea that our admins and staff are conducting themselves in good faith and doing their best to comply with the law in the total absence of legal counsel. I think I have seen boilerplate like that in government contracts — a sort of diplomatic offering to the public service culture at large — but if an actual legal proceeding were to arise, that would be the least of our concerns. I hope this makes some sense; I was very proud of that section in 2012 as an actual intellectual contribution over and above parroting the WMF over and over.
- Project Staff: Per the above post regarding "bureaucrats should be the only ones burdened with any compliance activities" and related IRC discussion, I've attempted to make this less ambiguous. I'm not sure how much it would help with an actual legal filing or police investigation, since we would have zero control over who was investigated, but at least the language should be clear enough now for straightforward cases. A reader with a request should go to the staff, not to a random user with any level of enhanced permissions, who may or may not even be active... The biggest hole was the COPPA wording, which I've never liked, but I could make the argument that (per the disclaimer section) I had no additional legal obligations beyond what I had the instant I became an admin. Now I can't assume that. The wording might still be incomplete; how do large social media outlets or forums apologize for not being able to moderate all traffic in real time? Do we need "report abuse" links everywhere like Doomworld has? Does the COPPA statute itself specify response times? A GDPR request made on behalf of a child would be especially important to get right the first time, because no good faith on our part would be assumed whatsoever. :P More immediately, the GDPR seems to include some detail about how requests should be handled slightly differently for children, which might suggest wording tweaks here.
Ryan W (living fossil) 14:24, 26 May 2018 (CDT)
For number one, those were precisely some of the use cases I had in mind by adopting those statements, and ones I feel have already occurred in the past, i.e., with the roll-out of Google Analytics in order to understand our site usage metrics, we're effectively using some of the protected data in this way just to make the site work. These are almost unavoidable in the course of running a website.
I do not follow on your point about MancuNET, so I need further explanation. "MancuNET" is just a group of servers for which the maintenance accounts are owned by Mike Lightner. If you believe inclusion of the name in this document is overcomplicating matters, then I would suggest it be removed entirely.
--Quasar (talk) 14:30, 26 May 2018 (CDT)
Ah, OK. I thought it was more centralized than that. You have mentioned in the past that you didn't have enough access to perform some maintenance task, and everything seems to stop short until manc can be found. If such would be a sticking point in responding to a request, I would think the staff should discuss that. Assuming you do, though, IMO it's fine to just keep saying "the Project" instead. Ryan W (living fossil) 15:48, 26 May 2018 (CDT)
I have no problem with the second round of changes, other than one I already fixed (regarding "DoomWiki.org" without a following defined term). Regarding the COPPA wording, I adopted that into the policy after we moved here, more or less verbatim from language I found on another site. I am not entirely satisfied with the wording as you've put it because it seems to imply the site is going to lazily allow a child's information to stay on the site until somebody raises an issue about it. That is not the case - given I review virtually every change made, someone posting their name/address and an age under 13 would see me expunging that information and limiting the activities of that user appropriately, proactively. AFAIK, the law in question requires proactive, not reactive, action. Once some authority points it out, you're probably already in trouble over it. So, I feel this could be improved versus what is now there. --Quasar (talk) 19:43, 26 May 2018 (CDT)
The only way to be 100% proactive would be like a porn site, disabling IP contributions and asking for an age range on registration. The user could lie of course, but the project would be protected because you would have data showing the user checking that box at that timestamp. But I assume you consider that a scorched-earth option.
Maybe I'm missing something obvious because I've never had to deal with an incident (luckily for the kid!). What you say makes some sense, but the current wording has the opposite problem, implying less laziness than exists. Legally, the more we claim to be monitoring something, the more liability we take on. Therefore, no offense, but I'm concerned about a generic statement like "will be removed by $group" when that could immediately become untrue if something happened IRL and you were less active. Again, does Ling read every post? Does the NY Times read every op-ed comment? (Maybe I should just read some of the background material but this is already taking forever.) Ryan W (living fossil) 21:05, 26 May 2018 (CDT)
I did some research on that at the time and what I found suggested that the common age gate is just an implementation of some "due diligence" and is not prescribed by law, especially since we state we're not aimed at minors. I definitely don't want one here either way ;) What about, "Any Personal Information posted by such a user will be removed when the Project Staff have actual knowledge of its posting"? Actual knowledge being a term derived from the law itself in this case. --Quasar (talk) 03:26, 27 May 2018 (CDT)
Reading some more, there are two other ways to be exempt: actually read everything before it goes public, or register as a charitable nonprofit while continuing to meet the two other conditions already in the policy (the publication is for a general audience, and age isn't collected as structured data). In case of a nuisance lawsuit you might also have to establish that we didn't allow third parties to collect anything from which age might be inferred. I assume the first would have unacceptable collateral damage, even if resources could be found to implement it, and the second would be a large annual expense.
This is a helpful site incidentally. The links under Guidance were written by the people who would be sending the marshals. :P
As to your suggested wording, it may be the best we can do, though I'm not convinced it's bulletproof because the terms "operator" (from the statute) and "responsible person" (from the other primary documents) seem broader than our staff. Seen by an outsider, would admins obviously be viewed as subordinate to the staff, compelled to enforce whatever policy you felt was binding? For one thing it isn't true :> and even organizations that pay middle managers highly to enforce rules have trouble getting uniform results. For another, any admin is empowered to act immediately to remove outing info from public view, regardless of how few may even know that, let alone know that you expect prompt action. More broadly, if a legal proceeding occurred, people would compulsively assume that admins on all wikis are like WMF admins, i.e. Wild West sheriffs who have total authority within a localized area. By the time that were straightened out, we might be pretty close to the cliff edge, assuming the staff could even pay what it takes to argue our side.
I'll try to post about the missing topics later today — I too would like to move this along. Ryan W (living fossil) 12:52, 27 May 2018 (CDT)
On IRC, Quasar said: "I think my only remaining concern is if we should have our own local copy of the glossary of terms instead of using WMF's"
Misunderstanding here. This was referring to the WMF's separate technical glossary document, which we could have mirrored just in case, but haven't. Ryan W (living fossil) 19:45, 30 May 2018 (CDT)
Given the list of bullet points above is complete and I have no additional changes to make at this point, I'd like to move ahead with the go-live process. What period of time are we going to allow? --Quasar (talk) 05:50, 29 May 2018 (CDT)
- Have we even mopped up your initial post? My list wasn't necessarily a superset, owing to backend considerations.
- Ideally I meant to at least check for 2012 gaucheries (typos, sarcastic links, bald self-contradictions), and rearrange the intro because the new and old text now seem mismatched.
- Rightly or wrongly, we are going on zero responses to the banner in 4.4 days, other than people you approached individually on IRC. 30 days does seem unnecessary in that light. Ryan W (living fossil) 16:30, 29 May 2018 (CDT)
- My intuition (sometimes bass-ackwards, of course) is that we should wait out the weekend. The yellow box popped up just before a major US holiday, and only a few days after many people had presumably tuned out notifications (following the net neutrality vote). Ryan W (living fossil) 19:50, 30 May 2018 (CDT)