[pmwiki-devel] GDPR Compliance Issues

Petko Yotov 5ko at 5ko.fr
Wed Jun 27 15:44:25 PDT 2018


Copyright is one thing, personal privacy rights are another, especially
if your crowdsourced content has been clearly published under one of the
open-source/free-as-in-freedom licenses like the GFDL, most of the
Creative Commons licenses and most of the open source software licenses.
Once (co)authors have published a version of a page or a derivative
version, they can no longer retract the published work, even if they get
all other co-authors of the work to agree: no backsies, read the
licenses. A Facebook or a Twitter post has a single author, it is not a
collective work. I don't want to go there, let's stay on the GDPR topic.

If the crowdsourced content itself contains personal information about 
living people that you want to remove or anonymize, please do. BTW, 
health-related personal information is considered "very sensitive".

If you want, you can restrict or @lock the "diff" action so that the
page history is not publicly available, or you can clear the page
history, see ExpireDiff.

About cookies: you do not need consent before sending cookies "for the 
sole purpose of carrying out or facilitating the transmission of a 
communication over an electronic communications network":

   
https://web.archive.org/web/20110224183417/http://www.ico.gov.uk/for_organisations/privacy_and_electronic_communications/the_guide/cookies.aspx

The default PmWiki cookies PHPSESSID, author and imstime do precisely
that, as do the cookies of some recipes (e.g. whether the TOC is open or
closed), unless you have custom functions that do something else, like
usage stats or a total counter.

OTOH the GDPR states that a session ID cookie is personal information --
I don't know. See, the directive is imprecise: sometimes the session
data contains no personally identifiable information at all and the
session ID is just a random string. As for IP addresses, router/NAT/proxy
IPs cannot be used to identify an actual person.

You can disable session cookies in PHP and have the session ID in the 
URLs, like:

   example.com/pmwiki.php?n=Main.HomePage&sid=1234567890ABCDEF

See: 
http://php.net/manual/en/session.configuration.php#ini.session.use-cookies

This is possible but not trivial to implement in PmWiki, and it is much
less secure: someone may post this URL somewhere, and another visitor
who follows it may immediately get the permissions of the logged-in
user, see https://en.wikipedia.org/wiki/Session_fixation .
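
To illustrate the leak risk (this is not PmWiki code), here is a minimal
Python sketch that strips the session ID from a URL before it gets posted
anywhere. The function name strip_session_id is hypothetical, and the
parameter name "sid" is simply taken from the example URL above; a real
setup may use a different name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_session_id(url, param="sid"):
    """Return the URL without the session-ID query parameter.

    The parameter name "sid" is an assumption taken from the example URL.
    """
    parts = urlsplit(url)
    # Keep every query pair except the one carrying the session ID.
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != param]
    return urlunsplit(parts._replace(query=urlencode(kept)))


shared = strip_session_id(
    "https://example.com/pmwiki.php?n=Main.HomePage&sid=1234567890ABCDEF")
print(shared)  # https://example.com/pmwiki.php?n=Main.HomePage
```

This only helps if everyone remembers to clean the URL before sharing
it, which is the weak point of carrying the session ID in the URL.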

The Lynx browser by default stops the processing for every cookie and
asks if you accept it or not. If all browsers implemented this by
default, our problem would be solved. Most importantly, it is several
orders of magnitude easier to implement it in a limited number of
browsers than on billions of websites. BTW very quickly, most users will
check "never again ask me this question" then press "always accept all
cookies". :-)

About "explicit consent is not required if your company has a legitimate
interest", here is recital 40 of the GDPR:

   In order for processing to be lawful, personal data should be
   processed on the basis of the consent of the data subject concerned
   or some other legitimate basis, laid down by law, either in this
   Regulation or in other Union or Member State law as referred to in
   this Regulation, including the necessity for compliance with the
   legal obligation to which the controller is subject or the necessity
   for the performance of a contract to which the data subject is party
   or in order to take steps at the request of the data subject prior to
   entering into a contract.

See the passages that follow it about what counts as a legitimate
interest or a contract, which may be written or not. I wrote, and I agree
with you, that you should explain this in plain and simple language, but
you may not need 13 checkboxes, one for each individual purpose of your
processing.

About IP addresses in the server access logs, I am not saying it is
difficult for admins and devs to implement this, I am saying it is
impossible to satisfy both the GDPR and the law that requires storing
all access logs for 1 year (it was requested by large content
distributors to track and imprison people sharing cultural works, but
they also pitched terrorism in; I suspect you too have something like
this):
   
https://www.legifrance.gouv.fr/affichTexte.do?cidTexte=JORFTEXT000023646013&dateTexte=20180627

It is impossible for the visitor to consent to the storing of her IP
address before she opens your site: at that moment her IP address is
already stored![1] Most site owners, especially on shared or container
hosting plans, don't have root access and cannot remove entries from the
access logs, and even if they could, doing so would be illegal.

As an admin I have had to deal multiple times with DDoS or vandalism
attacks that brought the server to its knees. A quick scan of the access
logs, a few lines added to .htaccess, and the problem is solved. This
IMHO is one seriously legitimate interest to store the IP addresses.
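
The "quick scan" can be as simple as counting requests per client
address. A minimal Python sketch, assuming the usual Apache log layout
where the client IP is the first whitespace-separated field; the function
name top_client_ips and the sample log lines are made up for
illustration:

```python
from collections import Counter


def top_client_ips(log_lines, limit=5):
    """Count requests per client IP and return the busiest ones.

    Assumes the client IP is the first whitespace-separated field of
    each line, as in the Apache common/combined log formats.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split(None, 1)
        if fields:  # skip empty lines
            counts[fields[0]] += 1
    return counts.most_common(limit)


# Made-up sample entries using documentation-only IP ranges.
sample = [
    '192.0.2.1 - - [27/Jun/2018:15:00:01 +0200] "GET /pmwiki.php HTTP/1.1" 200 5120',
    '192.0.2.1 - - [27/Jun/2018:15:00:01 +0200] "GET /pmwiki.php HTTP/1.1" 200 5120',
    '198.51.100.7 - - [27/Jun/2018:15:00:02 +0200] "GET /pmwiki.php HTTP/1.1" 200 5120',
    '192.0.2.1 - - [27/Jun/2018:15:00:02 +0200] "GET /pmwiki.php HTTP/1.1" 200 5120',
]
print(top_client_ips(sample))  # [('192.0.2.1', 3), ('198.51.100.7', 1)]
```

The addresses at the top of the list are the candidates for a deny rule
in .htaccess; the exact directive depends on the Apache version.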

Petko

[1] Effective demonstration of the observer effect in quantum physics. 
;-)

P.S. I don't intend to fight to facilitate ads, analytics, tracking and
big data, I despise those probably more than you do. My browsing is very
tightly controlled, with a hosts blacklist, Multi-Account Containers for
Firefox, uBlock Origin with all 3rd party scripts and frames blocked by
default as well as some 1st-party scripts. :-)

On 27/06/2018 17:31, Criss Ittermann wrote:
> Note: I'm not any type of lawyer.  I'm not an intellectual property
> lawyer. I'm not an EU law lawyer.
> 
> I'm just a (former?) designer who has to be informed about
> intellectual property issues and read a lot of information on the GDPR
> from other websites and lawyers.  And I may misunderstand any or all
> of it.  But I'm trying to help my clients (some on PmWiki, thankfully
> most US and only dealing with local customers) understand this stuff,
> and deal with my own business information and needs at the same time.
> 
> 
> Not everyone using PmWiki needs be concerned about the GDPR, but many
> of us do.  I run a life coaching website with a PmWiki website, a
> large mental health resource, and a private consulting business — in
> the States.  But my clients disclose a great deal of personal
> identifying information that could be stigmatizing/grounds for
> discrimination to me.  From as simple as gender or gender-identity
> issues and sexual orientation to mental health diagnosis and physical
> health information.  Or if I'm editing or formatting a book manuscript
> from someone in the EU — that counts.
> 
>> On Jun 27, 2018, at 10:10 AM, Petko Yotov <5ko at 5ko.fr> wrote:
>> 
>> On 22/06/2018 00:40, Criss Ittermann wrote:
>>> What I see as material problems are:
>>> Removing people from Diffs — mentioned in a thread on the PmWiki 
>>> Users
>>> list — if they request their data to be completely removed from the
>>> site.  That can be tricky — there's a difference between being an
>>> author (of an original article or section thereof, thus possessing
>>> copyright to the creation) vs. editor.  Removing a diff in the middle
>>> of a chain of diffs can materially change a wiki page in ways that
>>> don't work.  If someone fixed a typo, it's now a typo again — and 
>>> that
>>> would be OK I suppose.  But if someone added a paragraph that was
>>> later edited & added-to — now the context for further changes is
>>> missing.
>> 
>> You don't need to remove their edits (the diffs), their edits are not 
>> personal information. Personal information in page  history are only 
>> their name and IP address.
>> 
>> We need to write a recipe that takes an author identifier (username or 
>> e-mail) and possibly an IP address (although some IP addresses may 
>> forward thousands of users), then reads all pages with full history 
>> and pseudonymizes or anonymizes these bits: just rewrites the "author" 
>> and "host" page attributes with some string like user20180627T1322.
>> 
>> As long as it is impossible to guess or recover the personal 
>> information from the files on your server by other users, or in case 
>> of a breach, it may be enough.
> 
> The GDPR appears to require people be able to be "Disappeared" and if
> their contributions are their copyrighted information, then their
> contributions need to be removed.
> https://thegdprguy.com/right-to-erasure/
> 
> From what I understand they also need to be able to have a download of
> all their contributions.  They can back up their contributions or all
> their contributed copyrighted work — or both download/back-up and have
> it all deleted.
> 
> So, for example: I have a large PmWiki website (kinhost.org) in which
> people may have shared personal information on a live wiki page.  They
> may have disclosed their sexual orientation, different names they go
> by online, etc.
> 
> The contributions itself, signed with their name or Internet handle,
> are personal identifying information and disclosures, and they have
> the right to erase their contributions.
> 
> Yes, they volunteered the information just like one volunteers a
> Facebook post or Tweet.  That doesn't block their right to have it
> backed up &/or removed entirely.
> 
> And if I want to be erased, I have the right to have all of my
> information removed from the website — perhaps because it may affect
> future employability.  Like if I felt someone might hold my
> contributions to PmWiki against me (yes, I'm making something up), I
> have the right to have all my information on the PmWiki.org site
> removed.  One could argue whether that would include my plug-ins and
> the documentation thereof.  Is it still my copyright?  Is my name in
> the document?  It's a rough consideration — but it's something
> required to think about especially with PmWiki having authors from
> around the world.
> 
> It's the nature of my wiki (being a mental health resource) that these
> disclosures are more likely, and much more likely to be stigmatizing
> or the basis for discrimination — but the fact that this type of info
> is still in a Diff would be a problem.
> 
>>> Making sure all email & comment forms have a required checkbox (not
>>> checked already) asking permission to share/email/store personally
>>> identifying information.  Though that's pretty easy if you know how 
>>> to
>>> use PmForm.
>> 
>> 
>> If you use "explicit consent" as sole legal basis for collection and 
>> processing of personal information you need to explain each and every 
>> different purpose for this collection and processing, with individual 
>> checkboxes, where people may select some or all checkboxes.
> 
> Yes.
> 
> In most cases, it would simply be for email storage (unless I was
> specifically tracking people in a CRM or putting them on a newsletter
> email list etc.).  And someone could request I go into my email and
> delete their emails from my personal files or my IMAP or wherever it's
> stored.
> 
> The basic PmWiki contact email should have an example of this.  I know
> I wrote the honeypot recipe for Pmform for example.... so I will have
> to (both for my own sites' sake and for PmWiki in general) see about
> contributing a fix for this issue.  It's the need to track/log
> permissions that's a real thorn.
> 
> Actually, in the case of emails, it could be tracked in the email that
> comes to the addressee really. They're the one storing the email
> anyhow....  so the wording of the consent and what's checked.
> 
> The checkbox being a required field stops the email from being sent
> without consent, so it's a self-limiting issue in the first place.  At
> least that's how I set up a Mailchimp form — 1 checkbox, and it's
> required.  Without explicit consent, the email form doesn't work.
> 
>> Note that besides "explicit consent" there are 6 other cases for legal 
>> basis for this -- if you are in at least one of these cases, you don't 
>> require explicit consent.
> 
> As a not-a-lawyer, I can't defend these 6 cases as easily as I can
> write a disclosure/privacy policy that explains why I have someone's
> information, how it will be used, and where it's kept/security info.
> I don't think I fall under these cases except perhaps IP address info
> on my server, but it's easier for me to tell people the web logs have
> their IP address and it's on a rotation to be deleted automatically
> than it is for me to figure out if I'm an exception to the privacy
> policy rules or defend an exception in court.
> 
> I've already figured out that in several cases I'm NOT an exception to
> their under-the-radar under 250 employee rules. I don't need to hire
> someone to be my GDPR expert, but I do need to follow just about
> everything else.  Both for my life coaching business AND for my
> consulting business because it handles business information & other
> people's copyrighted materials or trade secrets.  And my community
> resource site also — it's about mental health info. Bingo. Hits the
> mark for the GDPR.
> 
>> One of these cases is "legitimate interest of your company or a third 
>> party" (for example usage statistics, software troubleshooting), 
>> another one is "legal obligations" (for example it is required by law 
>> to store the server access logs for 2 years, and they contain the IP 
>> address which is considered personal information by the GDPR), and yet 
>> another one is "fulfill contractual obligations with person", and 
>> "perform tasks at person's request" (for example they request the 
>> creation of an account, or request notifications, or request password 
>> recovery).
> 
> I believe the safest thing to do would be to explain that these are
> your reasons for having their information in a privacy policy, rather
> than assume that one can let it ride because one is covered with legit
> reasons.
> 
>> That means, if you have some "terms of use" which may be considered a 
>> contract, one single checkbox may be enough.
> 
> A flag with a link to the policy, an "Accept" button and a log of who
> has accepted with a server timestamp.... "Who" may be IP address, or
> Author name if entered.  But it must be explicitly collected, making
> this tricky. It can't just be stored in their cookie.
> 
> Mailchimp is now saving an electronic "copy" of a person clicking that
> they want to receive "email" from you.  If you have several checkboxes
> for GDPR acceptance, it tracks how many one has checked.
> 
> Record keeping is part of the GDPR....
> 
> It can be a text log I would imagine, or it could be stored in
> SiteAdmin on the PmWiki back-end.  But the text of the
> form/notification agreed to probably should be tracked along with the
> checkbox checked and time & some way to identify "who" like IP/Author.
> 
> I wish it weren't complicated — but it is. And it's necessary.
> 
>> At any rate, you need a simple, plain text summary of your use of 
>> personal information.
>> 
>>> Getting explicit permissions before setting ANY cookies (not "if you
>>> use this site you agree to cookies....") which should be in a pop-up
>>> with a checkbox, and the permission has to be tracked though I have 
>>> no
>>> idea how you'd trace it (just on IP?).
>> 
>> For a PmWiki cookie, only a session ID, and probably the "Author" 
>> cookie are considered personal information, you can send other cookies 
>> without the need for consent.
> 
> It's still a cookie, and from what I can tell you're not allowed to
> set cookies without express consent.  So the cookie can't be set until
> they agree to accept it.  Even if it's "only" a session ID.
> 
>> If you have a legitimate interest (usage information, editor 
>> accountability, security, troubleshooting), you don't need explicit 
>> consent.
> 
> Where does it say this?  Everything I've read makes it our
> responsibility to get that consent before ALLOWING the cookie to be
> set.  So for our security on both ends we need explicit consent before
> allowing comments, emails to us, or edits/authoring on our wikis.
> From people in the EU at least, but as more countries decide they like
> this policy, I expect this to catch on for other countries.
> 
> (Especially when they see the type of money the EU makes on 
> compliance.)
> 
>> BTW the IP address is also personal information, it is crazy that by 
>> law we have to store the server access logs with the IP address, and 
>> people need to consent before. This is a Catch 22 abomination, when 
>> someone opens the site, the server immediately stores the log entry, 
>> and if they do not consent the server stores another log entry.
> 
> Yes, the IP address is personal identifying information.  It says so
> directly in the GDPR.
> 
>> I believe the people who wrote the parts about cookies and IP 
>> addresses were somewhat ignorant about how the internet works, and 
>> they did not get help, which was stupid.
> 
> Actually, I think they meant it no matter how difficult it is for
> admins and software authors.  And we can blame it on Big Data, but yes
> their tracking is completely obnoxious, and after retargeting pixels
> and Google's gmail using targeted advertising based on email content —
> this is 100% necessary for the protection of individuals.  So even
> though it's inconvenient, and even though I'm in the US, I actually
> think they're doing the right thing and it's our responsibility to our
> users and admins to make this more secure and explicit for them to be
> able to use this software.
> 
> It may or may not come from ignorance — but it really doesn't matter
> now that it's law.  Fight to change it if you're in the EU, and I'd
> disagree with you as a person who doesn't like their kids being a
> bunch of statistics in some advertisers' databases — but it's there
> now.  Hopefully the EU wouldn't come after someone as small as me, but
> they can.  And maybe they will lest they be accused of only going
> after big corps with big bankrolls.
> 
> In the meantime, I'm scrambling to cover my butt.
> 
> 
>>> And you can't say "using this site constitutes you agree to our
>>> privacy policy or terms of service" — you need a material checkbox
>>> agreeing to it, with a link, and that checkbox use has to be tracked
>>> somehow (just like email form & comment form permission, and just 
>>> like
>>> the cookie-setting issue — everything has to be tracked).
>> 
>> If the software is written in a way that it refuses to go forward 
>> unless the checkbox is checked, wouldn't this be enough?
> 
> If it actually doesn't allow use of the site, yes, but implied consent
> is NOT allowed. So if a cookie is set without explicit consent you're
> in violation.  And if they continue to use the site without explicit
> consent, it does NOT count as consent, even if you claim it does
> (according to the GDPR).
> 
> Is there a way to disable cookies unless logged in as an author or
> admin?  That would be a quick way to put a stop to at least part of
> the problem.
> 
> Then the PmWiki login could have a GDPR-compliance checkbox.
> 
> As most of my sites are only edited by me, it would help because I'm
> the only one it needs a cookie for — why put a cookie on visitors?  I
> don't need that.
> 
> I took Google Analytics off my sites (before GDPR stuff) because I
> don't want to be responsible for their handling my traffic and knowing
> which IPs go to which pages of my sites — people are logged in in
> Gmail and Google or YouTube — Google knows exactly who they are and
> doesn't need to know they have mental health issues of any type.  I'm
> scrambling to remove a call to Google Fonts off one of my sites so the
> fonts are embedded rather than pulled from Google thus identifying my
> web visitors.
> 
> This is pretty serious.
> 
>>> A neat thing WordPress did is they have plug-ins supply "Suggested
>>> wording" for privacy policies to cover that they're in use on the
>>> site.  When the user is on the back-end there's help documents for
>>> creating a privacy policy, and for example Akismet suggests some
>>> wording for your privacy policy.  WordPress overall gives suggested
>>> wording (which covers general cookies, and mentions that you have to
>>> put your analytics etc. into the document).
>> 
>> Indeed, you probably need to mention that you outsource analytics to 
>> external companies and embed content from other platforms like videos 
>> or maps.
> 
> Yes.  Done.
> 
>> There is a JS program that can be useful, Tarteaucitron ("Lemon pie" 
>> in French):
>> 
>>  https://github.com/AmauriC/tarteaucitron.js
>> 
>> It can be configured to delay the loading of external resources like 
>> analytics and videos until the visitor accepts these individually and 
>> explicitly, and the visitor can see and delete individual cookies.
> 
> Very nice.  Sexy. lol — wouldn't work for people with JS blocked....
> and since embedded content isn't always done with JS, it would
> probably not work for me.  I'd have to require JS to use this
> plug-in....which kinda defeats the purpose of the good reasons to
> block JS. :/
> 
> Crisses
> _______________________________________________
> pmwiki-devel mailing list
> pmwiki-devel at pmichaud.com
> http://www.pmichaud.com/mailman/listinfo/pmwiki-devel


