Raw Record Source

{
  "$type": "site.standard.document",
  "canonicalUrl": "https://numergent.com/2019-06/Identity-Privacy-and-the-Edge.html",
  "path": "/2019-06/Identity-Privacy-and-the-Edge.html",
  "publishedAt": "2019-06-14T09:22:20.000Z",
  "site": "at://did:plc:cf6futaebyc2k4wgzsr4v42k/site.standard.publication/3mp2ewx43js2g",
  "tags": [
    "identity",
    "privacy",
    "humanism",
    "edge computing",
    "talks",
    "composability"
  ],
  "textContent": "Introduction\n\nMy goals for this talk are to:\n\n- Give a small taxonomy going into categories and labels that I think are useful when talking about identity;\n- Walk you through what a layered conceptual model for identity could be like;\n- Talk about the privacy implications for how we go about implementing things;\n- Hopefully convince you that the closer to the edge we process things, the better it is for the user, but that the edge does not guarantee privacy (no matter what the Blue Behemoth whose name starts with an F would like people to believe).\n\n<div class=\"space-around\">\n<iframe width=\"640\" height=\"360\" src=\"https://www.youtube.com/embed/a8o0HWwhHb0\" frameborder=\"0\" allowfullscreen></iframe>\n</div>\n\nSlides at Speakerdeck.\n\n\n\nBefore we get too far...\n\nLet me clarify one key point here, before we get too deeply into the weeds.\n\nThis is both a work in progress _and_ a collaborative effort. There are a few people involved on some areas that we're discussing today, both from Samsung NEXT and outside, including a bunch of us who met at the Internet Identity Workshop. I expect more people to get involved as we progress.\n\n<div class=\"space-around\" style=\"text-align: center\">\n\n</div>\n\nThe reason why I'm making this remark early is not just to give credit to these people.\n\nI also mean to drive home the point that this is an on-going topic. We are all still figuring things out. There is no set-in-stone approach I can suggest or a good textbook I can refer you to. \n\nThis is not master class, it's an attempt to spark conversation.\n\nDefinitions\n\nWhat identity is\n\nIdentity is everything that defines you.\n\nIdentity is how you identify yourself not only to a random site, but to other people. \n\nIt is how you perceive yourself and how others perceive you. \n\nIt is the core of your life. 
\n\nDigital identity is what emerges from the sum total of everything you do online.\n\nThat hardly narrows things down, though.\n\nLet's get down to specifics and start naming things.\n\nWe have two major components of online identity: facts and characteristics.\n\nFacts\n\nFacts are specific atomic details we know about an individual. These are elements that will generally fall under personally identifiable information.\n\nThey run the gamut from the name you go by, to username/password pairs, to shipping addresses, to phone and passport numbers.\n\nThis aspect of identity generally relates to verification and authentication.\n\nThese can be self-asserted in some cases, like username/password pairs or phone numbers. You don't need a government to confirm what my phone number or e-mail address might be - you can just verify it yourself.\n\nWe are calling these self-asserted elements _identifiers_.\n\nOther elements will only be considered valid if they are backed by a trusted third party, or are presented in a medium that's hard enough to fake that the person verifying it can trust it.\n\nIf there is any sort of external validation, we call them _credentials_.\n\nNotice that some values can fall into both categories, depending on use.\n\nAmazon trusts me to enter the right address for my apartment, because it's in my best interest to actually receive the stuff I paid for.\n\nYou trust my name is Ricardo, because... what do you have to lose if it's not?\n\nA bank, however, would not trust either of those elements unless they are able to validate them as _credentials_. Their entire business would suffer if they took as a credential something that they couldn't eventually substantiate.\n\nCharacteristics\n\nBut on the other hand... 
while my name _is_ Ricardo, that is not _who I am_.\n\nThe _who I am_ part of my identity is a lot less structured, and a lot more mercurial.\n\nA lens on who I am is the technologist working on investments who writes talks that combine movies and the tech industry.\n\nAnother is the anime fan with an eclectic music taste that skews towards prog rock and synthwave.\n\nYet another is the very private person who's made incredibly uncomfortable by even stating those characteristics to an audience.\n\nSome aspects of this identity have been there for decades, while others have changed over the years.\n\nWhen we are talking about digital identity, these characteristics are what emerges from your daily activities online, what can be scried from your data exhaust.\n\nCharacteristics can have a lot of value without identifiers, as we'll see later.\n\nIf you know I have a strong interest in anime, and have a general idea of how that preference skews, you don't need to know what my name is or where I live to sell me the right t-shirt.\n\nThis means that where identifiers and credentials deal with authentication, this aspect has a much larger impact on areas ranging from behavioral patterns to monetization.\n\nWhenever I talk about _identity_ in the aggregate, you can expect I'm talking about this aspect.\n\nGranularity\n\nBut then, you're all builders. I imagine you're thinking \"yeah, that's nice, but it's still not granular enough\", so let's drill down a bit further.\n\nAn OSI-like model for identity\n\nYou are probably all familiar with the Open Systems Interconnection model, but let me do a quick intro just in case.\n\nIt is a conceptual model meant to organize the functions of a telecommunications system into a series of abstraction layers.\n\nThere are some basic expectations. 
Every layer:\n\n- Serves the layer above it,\n- Is supported by the layer below it,\n- Communicates with layers at the same level.\n\nThis is a great model for figuring out _where a new system or component fits_.\n\nI've seen a couple of stabs at designing an equivalent for identity. A bunch of us tried as well and came up with our own version, which we feel is more general.\n\nI'd like to reiterate that this is very much a work-in-progress conceptual model. This is not a standard, this is the result of six or seven of us mucking about with concepts for a few hours.\n\nRemember that the OSI model is not just a layered stack - it's also about how systems interact, or which layers we could swap for a different but compatible implementation.\n\nLet's go over a couple of these, because unfortunately we don't have that much time.\n\nThe first layer down here, _Storage_, is pretty much all identifiers and raw characteristics. We need to have this data stored _somewhere_. 
\n\nIn theory we don't care very much where, but if you ask me (and I know you're just dying to ask me) I'd say this needs to be in the user's control.\n\nWe'll talk more about that when we get into the privacy topic.\n\nBut chances are identifiers need to be validated for them to be useful, turning them into _credentials_, which is why we have the second layer: _Validation_, or the trust system associated with the identity.\n\nThis validity will sometimes be self-asserted: for instance, if I give you my phone number, you can trust me or call me.\n\nIn other cases it will be built-in: if I'm sending a username and password pair to a site, the site itself is the trust system because it has the (hopefully) hashed and salted equivalent.\n\nAt the other extreme, you have the case I mentioned earlier, where the trust system is provided by an external party like a government institution.\n\nNow, those are just storage and expectations. Obviously, before we can talk about or validate any of these things, we need to be able to _Reference_ them.\n\nAnd this was an interesting one. My first impulse was seeing _foo@bar.com_ as the basic unit, but as someone else pointed out, both _foo@bar.com_ and _did:foo:bar_ can be references to the same underlying identity.\n\nThe reference can change, even if the two layers below it remain the same.\n\nWe can talk offline about what we see fitting into the other layers. You'll probably see that the further we go up these layers, the less defined these things are.\n\nThat is because identity is a relatively recent topic - as an industry, we are still figuring things out.\n\nI expect other people will come up with different models, and we'll adjust things as the industry evolves.\n\nEven so, we created this because we need to have a vocabulary when talking about these things. 
Having a common vocabulary simplifies communication, and communication is fundamental for getting things right.\n\nWhy we need to get this right\n\nAnd we do need to get this right, as an industry, because the design of your online identity has fundamental implications on privacy.\n\nThere's currently a mad scramble by Gargantuan, Ginormous, Gigantic Goliaths to redefine privacy as\n\n> \"We'll take all your data and just not sell it directly\"\n\n(Paraphrasing)\n\nUnder that definition, you will have privacy because it's only _them_ watching your every move. It's not like _that_ other company, who shared your data with others. Those other people do evil.\n\nOn the other hand, the Blue Behemoth is looking to pivot into a \"private\" social network.\n\nStop me if you've heard this one before.\n\nEverything will be encrypted, they say, and any processing that needs to be done will be done on device (that's the edge!) so Facebook will have no idea what you are talking about or what you're into.\n\nWell...\n\n\"We'll data mine you but not share it with anyone\" is adorable.\n\nIt is, at best, what an acquaintance calls \"pinky-swear privacy\".\n\nIf that's what you're into, and if you trust that:\n\n- The other side will keep their pinky-swear promise of not being evil,\n- They will properly implement controls over it so that no employee can abuse their power,\n- They're such infallible engineers that the data is never going to leak (not like, say, people who keep passwords in cleartext for well over a decade)...\n\nThen that's fine, I guess.\n\nJust be aware that there is a not insignificant level of trust involved.\n\nAnd about your data being encrypted so Facebook can't read it... well... I wouldn't put too much stock in it until they're no longer an ad network.\n\nEdge processing can be privacy-enhancing, but it is not privacy-guaranteeing. 
\n\nMetadata can be more revealing than data.\n\nLet me give you just five data points. Suppose you’re processing all my data on device, yet you know…\n\n- That I am usually online in the Berlin time zone,\n- Which IP addresses my connections come from,\n- That I got served ads that skew towards movies and anime,\n- That I click on ads about cat food every 3-4 weeks,\n- That I never click on ads about nearby KFCs.\n\nI expect we're all seeing a profile emerge. You do not need to have a full log of someone's conversations to build a profile, and get a picture of how to sell them a president (or sell them on not voting).\n\n(You know, hypothetically)\n\nIn this scenario, you wouldn't even need to get the identity facts about someone, as long as you can learn everything about their identity characteristics.\n\nThis is, I believe, something for which we need technological solutions.\n\nMake it better\n\nIf there are any lawyers in the audience, I can imagine them rolling their eyes and going _\"bloody programmers and technology being a solution for everything...\"_, but frankly, I don't think we are going to be able to legislate or regulate our way out of this.\n\nWe already have regulation. We even use it.\n\nIt has had no effect as a deterrent.\n\nOn April 24th, Facebook announced that they expected to be fined between 3 and 5 billion dollars by the Federal Trade Commission over privacy issues.\n\nThat's a record-breaking fine.\n\nAnd it had worse than zero effect. 
\n\nNot only did it not have a negative effect, but their stock price shot up, increasing Facebook's capitalization by $40 billion.\n\nThe FTC may as well have given them a license for privacy invasion.\n\nSo we can add regulation to the list of things that aren't going to help.\n\n- Regulation and fines aren't going to get us out of this mess;\n- People won't leave because of scandals or screw-ups (or they'd have done it already); and more importantly\n- People won't switch because your solution is more ethical - we already have those, and people don't use them.\n\nKeep that in mind if you are working on identity. For users to switch, not only does your solution need to be better, it also needs to give users a good reason: it needs to enable them to do something that they couldn't do before.\n\nBecause identity is sticky. This stickiness manifests in the fear of losing their friends, losing their connections, losing the small trickle of updates about who is dating whom or how the grandchildren are doing.\n\nIt is also a fear of impermanence. People have become so used to handing all this information out to third parties, to this being easier than keeping it safe themselves, that they worry about what might happen if they choose a more ethical but smaller cloud service, and then this service goes away.\n\nSelf-sovereignty\n\nIn his book _Who Owns the Future?_, Jaron Lanier warns against your identity being locked into places like Facebook and Google.\n\nHis concern is not only that these places are privacy invaders which use your life as a raw material, but also the fact that if Facebook ever went into decline, billions of people would lose access to their contacts, photos, and the rest of the online life they have invested there.\n\nHis suggestion is instead...\n\n> \"Government must come to be the place where the most basic online identity will be grounded in the long term.\"\n>\n> Jaron Lanier, _Who Owns the Future?_\n\nI like and can recommend the book. 
Lanier makes some excellent points, even if I don't always agree with him.\n\nBut this is... this is one where I vehemently disagree.\n\nThe root identity residing with _anyone_ other than the user themselves makes them inevitably subordinate to whoever controls this root.\n\nThis might be unavoidable right now when we are talking about credentials like a passport, because the institutions that consume them (like banks) want them rooted in a government institution.\n\nBut that is not the case for online identity.\n\nOnline identity must be in the control of the individual. It must be self-sovereign.\n\nSelf-sovereign identity is a talk in and of itself. I'd suggest everyone read Christopher Allen's article _The Path to Self-Sovereign Identity_, where he outlines the principles that he feels are necessary for any identity system to not only enable trust in a provider but preserve a user's privacy.\n\nEarlier I said that for users to switch you're going to need to enable some new interaction, to allow them to do something they couldn't do before.\n\nEven leaving aside the fact that I think Allen's self-sovereign identity principles are a moral necessity, we can just be practical: self-sovereign identity is more likely to enable new interactions just because nobody else is fully doing it yet, so we haven’t started exploring it.\n\nYou will see that a lot of those principles boil down to the user being able to have a say in what happens and when.\n\nFor us to be able to absolutely ensure that, users should preferably hold the data themselves, or at the very least hold the keys to data that is stored elsewhere.\n\nWe need to move things closer to the edge.\n\nThe edge needs a humanistic focus\n\nWe usually limit ourselves to the baseline definition of edge that comes from _edge computing_: how close to where a data emission happened you process it.\n\nThat's useful, from a purely technical standpoint. 
But we need to start looking at our implementations and architectures with a more humanistic focus.\n\nWe should think not only about _where_ we process the data, but about _who_ controls that node and its output.\n\nFacebook doing edge computing probably saves them a lot of aggregate data center processing power, but users have no control over the profile that ends up being uploaded for ad delivery.\n\nFor identity data, we are not talking about just edges in a network.\n\nWe are talking about people.\n\nAnd if we want to empower them, we should architect our solutions so that control over the data, so that sovereignty itself, lies with these very same people.\n\nThis is what I think we should be building towards. So let's talk.",
  "title": "Identity, Privacy, and the Edge"
}