Privacy technologies course roundup: Wiki, student projects, HotPETs

In earlier posts about the privacy technologies course I taught at Princeton during Fall 2012, I described how I refuted privacy myths, and presented an annotated syllabus. In this concluding post I will offer some additional tidbits about the course.

Wiki. I referred to a Wiki a few times in my earlier post, and you might wonder what that was about. The course included an online Wiki discussion component, and this was in fact the centerpiece. Students were required to participate in the online discussion of the day’s readings before coming to class. The in-class discussion would use the Wiki discussion as a starting point.

The advantages of this approach are: 1. it gives the instructor a great degree of control in shaping the discussion of each paper, 2. the instructor can more closely monitor individual students’ progress 3. class discussion can focus on particularly tricky and/or contentious points, instead of rehashing the obvious.

Student projects. Students picked a variety of final projects for the class, and on the whole exceeded my expectations. Here are two very different projects, in brief.

Nora Taranto, a History of Science major, wrote a policy paper about the privacy implications of the shift to electronic medical records. Nora writes:

I wrote a paper about the privacy implications of patient-care institutions (in the United States) using electronic medical record (EMR) systems more and more frequently.  This topic had particular relevance given the huge number of privacy breaches that occurred in 2012 alone.  Meanwhile, there is a simultaneous criticism coming from care providers about the usability of such EMR systems.  As such, many different communities—in the information privacy sphere, in the medical community, in the general public, etc.—have many different things to say.  But, given the several privacy breaches that occurred within a couple of weeks in April 2012 and together implicated over a million individuals, concerns have been raised in particular about how secure EMR systems are.  These concerns are especially worrisome given the federal government’s push for their adoption nationwide beginning in 2009 when the American Recovery and Reinvestment Act granting funds to hospitals explicitly for the purpose of EMR implementation.

So I looked into the benefits and costs of such systems, with a particular slant towards the privacy benefits/costs.  Overall, these systems do have a number of protective mechanisms at their disposal, some preventative and others reactive.  While these protective barriers are all necessary, they are not sufficient to guarantee the patient his or her privacy rights in the modern day.  These protective mechanisms—authentication schemes, encryption, and data logs/anomaly-detection—need to be expanded and further developed to provide an adequate amount of protection for personal health information.  While the government is, at the moment, encouraging the adoption of EMR systems for maximal penetration, medical institutions ought to use caution in considering which systems to implement and ought to hold themselves to a higher standard.  Moreover, greater regulatory oversight of EMR systems on the market would help institutions maintain this cautious approach.

Abu Saparov, Ajay Roopakalu, and Rafi Shamim, also undergraduates, designed an implemented an alternative to centralized key distribution. They write:

Our project for the course was to create and implement a decentralized public key distribution protocol and show how it could be used. One of the initial goals of our project was to experience first-hand some of the things that made the design of a usable and useful privacy application so hard. Early on in the process, we decided to try to build some type of application that used cryptography to enhance the privacy of communication with friends. Some of the reasons that we chose this general topic were the fact that all of us had experience with network programming and that we thought some of the things that cryptography can achieve are uniquely cool. We were also somewhat motivated by the prospect of using our application to talk with each other and our other friends after we graduate. We eventually gravitated towards two ideas: (1) a completely peer-to-peer chat system that is encrypted from end-to-end, and (2) a “dumb” social network that allows users to share posts that only their friends (and not the server) can see. During the semester, our focus shifted to designing and implementing the underlying key distribution mechanism upon which these two systems could be built.

When we began to flesh out the designs for our two ideas, we realized that the act of retrieving a friend’s public cryptographic keys was the first challenge to solve. Certificate authorities are the most common way to obtain public keys, but require a large degree of trust to be placed in a small number of authorities. Web of Trust is another option, and is completely decentralized, but often proves difficult in practice because of the need for manual key signing. We decided to make our own decentralized protocol that exposes an easily usable API for clients to use in order to obtain public keys. Our protocol defines an overlay network that features regular nodes, as well as supernodes that are able to prove their trustworthiness, although the details of this are controllable through a policy delegate. The idea is for supernodes to share the task of remembering and verifying public keys through a majority vote of neighboring supernodes. Users running other nodes can ask the supernodes for a friend’s public key. In order to trick someone, an adversary would have to control over half of the supernodes from which a user requested a key. Our decision to go with an overlay network created a variety of issues such as synchronizing information between supernodes, being able to detect and report malicious supernodes, and getting new nodes incorporated into the network. These and the countless other design problems we faced definitely allowed us to appreciate the difficulty of writing a privacy application, but unfortunately, we were not fully able to test every element of our protocol and its implementation. After creating the protocol, we implemented small, bare-bones applications for our initial ideas of peer-to-peer chat and an encrypted social network.

Master’s students Chris Eubank, Marcela Melara, and Diego Perez-Botero did a project on mobile web tracking which, with some further work, turned into a research paper that Chris will speak about at W2SP tomorrow.

Finally, I’m happy to say that I will be discussing the syllabus and my experiences teaching this class at HotPETs this year, in Bloomington, IN, in July.

To stay on top of future posts, subscribe to the RSS feed or follow me on Twitter or Google+.

May 23, 2013 at 2:14 pm 1 comment

What Happened to the Crypto Dream? Now in a new and improved paper form!

Last October I gave a talk titled “What Happened to the Crypto Dream?” where I looked at why crypto seems to have done little for personal privacy. The reaction from the audience (physical and online) was quite encouraging — not that everyone agreed, but they seemed to find it thought provoking — and several people asked me if I’d turn it into a paper. So when Prof. Alessandro Acquisti invited me to contribute an essay to the “On the Horizon” column in IEEE S&P magazine, I jumped at the chance, and suggested this topic.

Thanks to some fantastic feedback from colleagues and many improvements to the prose by the editors, I’m happy with how the essay has turned out. Here it is in two parts: Part 1, Part 2.

While I’m not saying anything earth shaking, I do make a somewhat nuanced argument — I distinguish between “crypto for security” and “crypto for privacy,” and further subdivide the latter into a spectrum between what I call “Cypherpunk Crypto” and “Pragmatic Crypto.” I identify different practical impediments that apply to those two flavors (in the latter case, a complex of related factors), and lay out a few avenues for action that can help privacy-enhancing crypto move in a direction more relevant to practice.

I’m aware that this is a contentious topic, especially since some people feel that the time is ripe for a resurgence of the cypherpunk vision. I’m happy to hear your reactions.

To stay on top of future posts, subscribe to the RSS feed or follow me on Twitter or Google+.

April 29, 2013 at 12:06 pm Leave a comment

Privacy technologies: An annotated syllabus

Last semester I taught a course on privacy technologies here at Princeton. Earlier I discussed how I refuted privacy myths that students brought into class. In this post I’d like to discuss the contents of the course. I hope it will be useful to other instructors who are interested in teaching this topic as well as for students undertaking self-study of privacy technologies. Beware: this post is quite long.

What should be taught in a class on privacy technologies? Before we answer that, let’s take a step back and ask, how does one go about figuring out what should be taught in any class?

I’ve seen two approaches. The traditional, default, overwhelmingly common approach is to think of it in terms of “covering content” without much consideration to what students are getting out of it. The content that’s deemed relevant is often determined by what the fashionable research areas happen to be, or historical accident, or some combination thereof.

A contrasting approach, promoted by authors like Bain, applies a laser focus on skills that students will acquire and how they will apply them later in life. On teaching orientation day at Princeton, our instructor, who clearly subscribed to this approach, had each professor describe what students would do in the class they are teaching, then wrote down only the verbs from these descriptions. The point was that our thinking had to be centered around skills that students would take home.

I prefer a middle ground. It should be apparent from my description of the traditional approach above that I’m not a fan. On the other hand, I have to wonder what skills our teaching coach would have suggested for a course on cosmology — avoiding falling into black holes? Alright, I’m exaggerating to make a point. The verbs in question are words like “synthesize” and “evaluate,” so there would be no particular difficulty in applying them to cosmology. But my point is that in a cosmology course, I’m not sure the instructor should start from these verbs.

Sometimes we want students to be exposed to knowledge primarily because it is beautiful, and being able to perceive that beauty inspires us, instills us with a love of further learning, and I dare say satisfies a fundamental need. To me a lot of the crypto “magic” that goes into privacy technologies falls into that category (not that it doesn’t have practical applications).

With that caveat, however, I agree with the emphasis on skills and life impact. I thought of my students primarily as developers of privacy technologies (and more generally, of technological systems that incorporate privacy considerations), but also as users and scholars of privacy technologies.

I organized the course into sections, a short introductory section followed by five sections that alternated in the level of math/technical depth. Every time we studied a technology, we also discussed its social/economic/political aspects. I had a great deal of discretion in guiding where the conversation around the papers went by giving them questions/prompts on the class Wiki. Let us now jump in. The italicized text is from the course page, the rest is my annotation.

0. Intro

Goals of this section: Why are we here? Who cares about privacy? What might the future look like?

In addition to helping flesh out the foundational assumptions of this course that I discussed in the previous post, pairing these opposing views with each other helped make the point that there are few absolutes in this class, that privacy scholars may disagree with each other, and that the instructor doesn’t necessarily agree with the viewpoints in the assigned reading, much less expects students to.

1. Cryptography: power and limitations

Goals. Travel back in time to the 80s and early 90s, understand the often-euphoric vision that many crypto pioneers and hobbyists had for the impact it would have. Understand how cryptographic building blocks were thought to be able to support this restructuring of society. Reason about why it didn’t happen.

Understand the motivations and mathematical underpinnings of the modern research on privacy-preserving computations. Experiment with various encryption tools, discover usability problems and other limitations of crypto.

I think the Chaum paper is a phenomenal and underutilized resource for teaching. My goal was to really immerse students in an alternate reality where the legal underpinnings of commerce were replaced by cryptography, much as Chaum envisioned (and even going beyond that). I created a couple of e-commerce scenarios for Wiki discussion and had them reason about how various functions would be accomplished.

My own views on this topic are set forth in this talk (now a paper; coming soon). In general I aimed to shield students from my viewpoints, and saw my role as helping them discover (and be able to defend) their own. At least in this instance I succeeded. Some students took the position that the cypherpunk dream is just around the corner.

  • The ‘Garbled Circuit Protocol’ (Yao’s theorem on secure two-party computation) and its implications (lecture)

This is one of the topics that sadly suffers from a lack of good expository material, so I instead lectured on it.

One of the exercises here was to install and use various crypto tools and rediscover the usability problems. The difficulties were even worse than I’d anticipated.

2. Data collection and data mining, economics of personal data, behavioral economics of privacy

Goals. Jump forward in time to the present day and immerse ourselves in the world of ubiquitous data collection and surveillance. Discover what kinds of data collection and data mining are going on, and why. Discuss how and why the conversation has shifted from Government surveillance to data collection by private companies in the last 20 years.

Theme: first-party data collection.

Theme: third-party data collection.

Theme: why companies act the way they do.

Theme: why people act the way they do.

This section is rather self-explanatory. After the math-y flavor of the first section, this one has a good amount of economics, behavioral economics, and policy. One of the thought exercises was to project current trends into the future and imagine what ubiquitous tracking might lead to in five or ten years.

3. Anonymity and De-anonymization

Important note: communications anonymity (e.g., Tor) and data anonymity/de-anonymization (e.g., identifying people in digital databases) are technically very different, but we will discuss them together because they raise some of the same ethical questions. Also, Bitcoin lies somewhere in between the two.

Tor and Bitcoin (especially the latter) were the hardest but also the most rewarding parts of the class, both for them and for me. Together they took up 4 classes. Bitcoin is extremely challenging to teach because it is technically intricate, the ecosystem is rapidly changing, and a lot of the information is in random blog/forum posts.

In a way, I was betting on Bitcoin by deciding to teach it — if it had died with a whimper, their knowledge of it would be much less relevant. In general I think instructors should choose to make these such bets more often; most curricula are very conservative. I’m glad I did.

It was a challenge to figure out which deanonymization paper to assign. I went with the DNA one because I wanted them to see that deanonymization isn’t a fact about data, but a fact about the world. Another thing I liked about this paper is that they’d have to extract the not-too-complex statistical methodology in this paper from the bioinformatics discussion in which it is embedded. This didn’t go as well as I’d hoped.

I’ve co-authored a few deanonymization papers, but they’re not very well written and/or are poorly suited for pedagogical purposes. The Kaggle paper is one exception, which I made optional.

This is another pair of papers with opposing views. Since the latter paper is optional, knowing that most of them wouldn’t have read it, I used the Wiki prompts to raise many of the issues that the author raises.

4. Lightweight Privacy Technologies and New Approaches to Information Privacy

While cryptography is the mechanism of choice for cypherpunk privacy and anonymity tools like Tor, it is too heavy a weapon in other contexts like social networking. In the latter context, it’s not so much users deploying privacy tools to protect themselves against all-powerful adversaries but rather a service provider attempting to cater to a more nuanced understanding of privacy that users bring to the system. The goal of this section is to consider a diverse spectrum of ideas applicable to this latter scenario that have been proposed in recent years in the fields of CS, HCI, law, and more. The technologies here are “lightweight” in comparison to cryptographic tools like Tor.

5. Purely technological approaches revisited

This final section doesn’t have a coherent theme (and I admitted as much in class). My goal with the first two papers was to contrast a privacy problem which seems amenable to a purely or primarily technological formulation and solution (statistical queries over databases of sensitive personal information) with one where such attempts have been less successful (the decentralized, own-your-data approach to social networking and e-commerce).

Differential privacy is another topic that is sorely lacking in expository material, especially from the point of view of students who’ve never done crypto before. So this was again a lecture.

These two essays aren’t directly related to privacy. One of the recurring threads in this course is the debate between purely technological and legal or other approaches to privacy; the theme here is to generalize it to a context broader than privacy. The Barlow essay asserts the exceptionalism of Cyberspace as an unregulable medium, whereas the Grimmelmann paper provides a much more nuanced view of the relationship between the law and new technological frontiers.

I’m making available the entire set of Wiki discussion prompts for the class (HTML/PDF). I consider this integral to the syllabus, for it shapes the discussion very significantly. I really hope other instructors and students find this useful as a teaching/study guide. For reference, each set of prompts (one set per class) took me about three hours to write on average.

There are many more things I want to share about this class: the major take-home ideas, the rationale for the Wiki discussion format, the feedback I got from students, a description of a couple of student projects, some thoughts on the sociology of different communities studying privacy and how that impacted the class, and finally, links to similar courses that are being taught elsewhere. I’ll probably close this series with a round-up post including as many of the above topics as I can.

To stay on top of future posts, subscribe to the RSS feed or follow me on Twitter or Google+.

April 16, 2013 at 4:02 am 2 comments

How I utilized “expectation failure” to refute privacy myths

Last semester I taught a course on privacy technologies. Since it was a seminar, the class was a small, self-selected group of very motivated students. Based on the feedback, it seems to have been a success; it was certainly quite personally gratifying for me. This is the first in a series of posts on what I learnt from teaching this course. In this post I will discuss some major misconceptions about privacy, how to refute them, and why it is important to do this right at the beginning of the course.

Privacy’s primary pitfalls

Instructors are often confronted with breaking down faulty mental models that students bring into class before actual learning can happen. This is especially true of the topic at hand. Luckily, misconceptions about privacy are so pervasive in the media and among the general public that it wasn’t too hard to identify the most common ones before the start of the course. And it didn’t take much class discussion to confirm that my students weren’t somehow exempt from these beliefs.

One cluster of myths is about the supposed lack of importance of privacy. 1. “There is no privacy in the digital age.” This is the most common and perhaps the most grotesquely fallacious of the misconceptions; more on this below. 2. “No one cares about privacy any more” (variant: young people don’t care about privacy.) 3. “If you haven’t done anything wrong you have nothing to hide.”

A second cluster of fallacious beliefs is very common among computer scientists and comes from the tendency to reduce everything to a black-and-white technical problem. In this view, privacy maps directly to access control and cryptography is the main technical mechanism for achieving privacy. It’s a view in which the world is full of adversaries and there is no room for obscurity or nontechnical ways of improving privacy.

The first step in learning is to unlearn

Why is it important to spend time confronting faulty mental models? Why not simply teach the “right” ones? In my case, there was a particularly acute reason — to the extent that students believe that privacy is dead and that learning about privacy technologies is unimportant, they are not going to be invested in the class, which would be really bad. But even in the case of misconceptions that don’t lead to students doubting the fundamental premise of the class, there is a surprising reason why unlearning is important.

A famous experiment in the ’80s (I really really recommend reading the linked text) demonstrated what we now know about the ineffectiveness of the “information transmission” model of teaching. The researchers interviewed students after any of four introductory physics courses, and determined that they hadn’t actually learned what had been taught, such as Newton’s laws of motion; instead they just learned to pass the tests. When the researchers sat down with students to find out why, here’s what they found: 

What they heard astonished them: many of the students still refused to give up their mistaken ideas about motion. Instead, they argued that the experiment they had just witnessed did not exactly apply to the law of motion in question; it was a special case, or it didn’t quite fit the mistaken theory or law that they held as true.

A special case! Ha. What’s going on here? Well, learning new facts is easy. On the other hand, updating mental models is so cognitively expensive that we go to absurd lengths to avoid doing so. The societal-scale analog of this extreme reluctance is well-illustrated by the history of science — we patched the Ptolemaic model of the Universe, with the Earth at the center, for over a millennium before we were forced to accept that the Copernican system fit observations better.

The instructor’s arsenal 

The good news is that the instructor can utilize many effective strategies that fall under the umbrella of active learning. Ken Bain’s excellent book (which the preceding text describing the experiment is from) lays out a pattern in which the instructor creates an expectation failure, a situation in which existing mental models of reality will lead to faulty expectations. One of the prerequisites for this to work, according to the book, is to get students to care.

Bain argues that expectation failure, done right, can be so powerful that students might need emotional support to cope. Fortunately, this wasn’t necessary in my class, but I have no doubt of it based on my personal experiences. For instance, back when I was in high school, learning how the Internet actually worked and realizing that my intuitions about the network had to be discarded entirely was such a disturbing experience that I remember my feelings to this day. 

Let’s look at an example of expectation failure in my privacy class. To refute the “privacy is dying” myth, I found it useful to talk about Fifty Shades of Grey — specifically, why it succeeded even though publishers initially passed on it. One answer seems to be that since it was first self-published as an e-book, it allowed readers to be discreet and avoid the stigma associated with the genre. (But following its runaway success in that form, the stigma disappeared, and it was released in paper form and flew off the shelves.)

The relative privacy of e-books from prying strangers is one of the many ways in which digital technology affords more privacy for specific activities. Confronting students with an observed phenomenon whose explanation involves a fact that seems starkly contrary to the popular narrative creates an expectation failure. Telling personal stories about how technology has either improved or eroded privacy, and eliciting such stories from students, gets them to care. Once this has been accomplished, it’s productive to get into a nuanced discussion of how to reconcile the two views with each other, different meanings of privacy (e.g., tracking of reading habits), how the Internet has affected each, and how society is adjusting to the changing technological landscape.

I’m quite new to teaching — this is only my second semester at Princeton — but it’s been exciting to internalize the fact that learning is something that can be studied scientifically and teaching is an activity that can vary dramatically in effectiveness. I’m looking forward to getting better at it and experimenting with different methods. In the next post I will share some thoughts on the content of my course and what I tried to get students to take home from it.

Thanks to Josh Hug for reviewing a draft.

To stay on top of future posts, subscribe to the RSS feed or follow me on Twitter or Google+.

April 11, 2013 at 4:50 am Leave a comment

Unlikely Outcomes? A Distributed Discussion on Decentralized Personal Data Architectures

In recent years there has been a mushrooming of decentralized social networks, personal data stores and other such alternatives to the current paradigm of centralized services. In the academic paper A Critical Look at Decentralized Personal Data Architectures last year, my coauthors and I challenged the feasibility and desirability of these alternatives (I also gave a talk about this work). Based on the feedback, we realized it would be useful to explicate some of our assumptions and implicit viewpoints, add context to our work, clarify some points that were unclear, and engage with our critics on some of the more contentious claims.

We found the perfect opportunity to do this via an invitation from Unlike Us Reader, produced by the Institute of Network Cultures — it’s a magazine run by a humanities-oriented group of people, with a long-standing interest in digital culture, but they also attract some politically oriented developers. The Unlike Us conference, from which this edited volume stems, is also very interesting. [1]

Three of the five original authors — Solon, Vincent and I — teamed up with the inimitable Seda Gürses for an interview-style conversation (PDF). Seda is unique among privacy researchers — one of her interests is to understand and reconcile the often maddeningly divergent viewpoints of the different communities that study privacy, so she was the ideal person to play the role of interlocutor. Seda solicited feedback from about two dozen people in the hobbyist, activist and academic communities, and synthesized the responses into major themes. Then the three of us took turns responding to the prompts, which Solon, with Seda’s help, organized into a coherent whole. A majority of the commenters consented to making their feedback public, and Seda has collected the discussion into an online appendix.

This was an unusual opportunity, and I’m grateful to everyone who made it happen, particularly Seda and Solon who put in an immense amount of work. My participation was very enjoyable. Research proceeds at such a pace that we rarely have the opportunity to look back and cogitate about the process; when we do, we’re often surprised by what we find. For example, here’s something I noted with amusement in one of my responses:

My interest in decentralized social networking apparently dates to 2009, as I just discovered by digging through my archives. I’d signed up to give a talk on pitfalls of social networking privacy at a Stanford workshop, and while preparing for it I discovered the rich academic literature and the various hobbyist efforts in the decentralized model. My slides from that talk seem to anticipate several of the points we made about decentralized social networking in the paper (albeit in bullet-point form), along with the conclusion that they were “unlikely to disrupt walled gardens.” Funnily enough, I’d completely forgotten about having given this talk when we were writing the paper.

I would recommend reading this text as a companion to our original paper. Read it for extra context and clarifications, a discussion of controversial points, and as a way of stimulating thinking about the future prospects of alternative architectures. It may also be an interesting read as an example of how people writing an article together can have different views, and as a bit of a behind-the-scenes look at the research process.

[1] In particular, the latest edition of the conference that just concluded had a panel titled “Are you distributed? The Federated Web Show” moderated by Seda, with Vincent as one of the participants. It touched upon many of the same themes as our work.

To stay on top of future posts, subscribe to the RSS feed or follow me on Twitter or Google+.

March 27, 2013 at 7:44 am 1 comment

The job talk is a performance

This is the second in a series of posts with advice for computer science academic job candidates.

One shot, one opportunity

The philosopher Marshall Mathers once asked rhetorically, “Look, if you had one shot, or one opportunity / To seize everything you ever wanted in one moment / Would you capture it or just let it slip?”

He added, “Yo.” [1]

I don’t mean to imply that an academic position is everything you ever wanted, but it’s a pretty good life (although not for these reasons). Like it or not, it’s set up so that your career up until this point comes down to one moment. After years of hard work, your ability as a researcher will be judged primarily based on how you sell yourself in the fleeting span of an hour. Of course, you’ll (hopefully) give your talk at many places, but it’s going to be the same talk!

There’s a reason I’m saying this, and it’s not to stress you out even more. Rather, if at any point the level of preparedness that I suggest seems excessive or disproportionate, remember the wise words quoted above.

Public speaking is a performance

My first piece of advice is to read the book Confessions of a Public Speaker.[2] As in, don’t even think about giving your job talk without having read it. You can read it in a sitting; putting it into practice will of course take longer. I cannot overstate the impact this book had on my talk (and my public speaking in general). There are probably other books that capture much of the wisdom in Confessions, and I’d love to hear other recommendations, but if you’re going to read one book it would have to be this one.

There are numerous very useful little details in the book, but it has one central idea that can be boiled down to the phrase “public speaking is a performance.” Job talks are are even more of a performance than public speaking in general, since the audience is specifically there to judge you.[3] This is a generative metaphor — it allows you predict things about your job talk based on what you know about performing. Fully appreciating the metaphor will require reading the book, but here are two such predictions that might otherwise be surprising.

Your first priority is to entertain

Certainly you must both entertain and inform, but the point is that you don’t really have a shot at the latter if you fail at the former. Sitting in a lecture, as everyone who remembers their student days is surely aware, can be excruciating; it’s an extremely unnatural situation from an evolutionary perspective (again, read Confessions to appreciate why.) The chart below from the book What’s the Use of Lectures? shows students’ heart rate over time as they sat in a lecture. It’s only a drop of a few beats per minute, but it translates to an enormous difference in alertness.heartrateIf you don’t do anything different in your talk and simply present your material, your audience’s attention level will be greatly diminished by the half-hour mark, and by the end of your talk people will basically be comatose. Anything you can do to break the routine, linear, hyper-boring pattern of a lecture will help jolt the audience out of their stupor. (That includes asking questions — I usually asked two or three in my talk.) Otherwise they won’t be excited about you nor remember much after the talk.

Rehearse, rehearse, rehearse

You may have heard “practice, practice, practice.” I’d rather cast it in the language of performance, as there are some subtle differences. For example, when people tell you to practice, they tell you not to overdo it because you’d lose your spontaneity. I disagree. In a rehearsal, everything is practiced down to the last detail. In fact, the apparently spontaneous things that I said my talk were the most well-rehearsed parts.

Rehearsal should include videotaping yourself and watching it. Yes, it’s painful and majorly cringe-inducing, but it’s absolutely, absolutely essential. In addition to all the obvious facets of good presentation style that I won’t repeat, one of the subtle but important things you should watch for is nervous tics or other repetitive behaviors — almost everyone has one or more of those, and they can almost derail your talk by distracting your audience.

The reason rehearsal makes such a huge difference is that when you’re delivering a rehearsed talk, your every word and gesture is subconscious, freeing up your mental bandwidth for observing and reacting in real-time to the facial expressions of your audience. There are never more than 40-50 people in these talks, a small enough number that you can instantly notice if someone looks confused, skeptical, or bored. But this won’t be possible if you have to think through your slides instead. The reduction in cognitive load also minimizes the chance of “hitting the wall,” a phenomenon of sudden mental fatigue that’s a serious danger in long-ish talks and can leave you helpless.

Let me close with an example of a little theatrical thing I did that shows the value of rehearsal and the performance metaphor. One of the goals in my location privacy project is to minimize smartphone power consumption. When I got to that part, I’d say, “those of you with Android phones know how bad the battery life is. In fact I usually carry a spare battery around… actually, I think I have it on me.” Then I’d pull a smartphone battery out of my jacket pocket with a bit of a dramatic touch. Somehow the use of a physical prop seemed to reframe their thinking from “yet another academic paper” to “solving a real problem.” It would also usually elicit a laugh and elevate their attention level.

There is so much more to say about job talks, not to mention other aspects of the job interview. I might do follow up posts on a mathematical model of audience behavior and/or an explanation of why slide transitions are (by far) the most important part of your slides.

[1] This post was written while listening to Lose Yourself in a loop.

[2] If it needs to be said, I have no stake in the book, financial or otherwise.

[3] I hasten to add that teaching is very different from public speaking and is emphatically not a performance.

To stay on top of future posts, subscribe to the RSS feed or follow me on Twitter or Google+.

February 9, 2013 at 8:43 am 4 comments

Price Discrimination and the Illusion of Fairness

In my previous article I pointed out that online price discrimination is suspiciously absent in directly observable form, even though covert price discrimination is everywhere. Now let’s talk about why that might be.

By “covert” I don’t mean that the firm is trying to keep price discrimination a secret. Rather, I mean that the differential treatment isn’t made explicit — e.g., by not basing it directly on a customer attribute — and thereby avoiding triggering the perception of unfairness or discrimination. A common example is selective distribution of coupons instead of listing different prices. Such discounting may be publicized, but it is still covert.

The perception of fairness

The perception of fairness or unfairness, then, is at the heart of what’s going on. Going back to the WSJ piece, I found it interesting to see the reaction of the customer to whom Staples quoted $1.50 more for a stapler based on her ZIP code: “How can they get away with that?” she asks. To which my initial reaction was, “Get away with what, exactly? Supply and demand? Econ 101?”

Even though some of us might not feel the same outrage, I think all of us share at least a vague sense of unease about overt price discrimination. So I decided to dig deeper into the literature in psychology, marketing, and behavioral economics on the topic of price fairness and understand where this perception comes from. What I found surprised me.

First, the fairness heuristic is quite elaborate and complex. In a vast literature spanning several decades, early work such as the “principle of dual entitlement” by Kahneman and coauthors established some basics. Quoting Anderson and Simester: “This theory argues that customers’ have perceived fairness levels for both firm profits and retail prices. Although firms are entitled to earn a fair profit, customers are also entitled to a fair price. Deviations from a fair price can be justified only by the firm’s need to maintain a fair profit. According to this argument, it is fair for retailers to raise the price of snow shovels if the wholesale price increases, but it is not fair to do so if a snowstorm leads to excess demand.”

Much later work has added to and refined that model. A particularly impressive and highly cited 2004 paper reviews the literature and proposes an elaborate framework with four different classes inputs to explain how people decide if pricing is fair or unfair in various situations. Some of the findings are quite surprising. For example: in case of differential pricing to the buyer’s disadvantage, “trust in the seller has a U-shaped effect on price fairness perceptions.”

The illusion of fairness

Sounds like we have a well-honed and sophisticated decision procedure, then? Quite the opposite, actually. The fairness heuristic seems to be rather fragile, even if complex.

Let’s start with an example. Andrew Odlyzko, in his brilliant essay on price discrimination — all the more for the fact that it was published back in 2003 [1] — has this to say about Coca Cola’s ill-fated plans for price-adjusting vending machines: “In retrospect, Coca Cola’s main problem was that news coverage always referred to its work as leading to vending machines that would raise prices in warm weather. Had it managed to control publicity and present its work as leading to machines that would lower prices in cold weather, it might have avoided the entire controversy.”

We know how to explain the public’s reaction to the Coca Cola announcement using behavioral economics — the way it was presented (or framed), customers take the lower price as the “reference price,” and the price increase seems unfair, whereas the Odlyzko’s suggested framing would anchor the higher price as the reference price. Of course, just because we can explain how the fairness heuristic works doesn’t make it logical or consistent, let alone properly grounded in social justice.

More generally, every aspect of our mental price fairness assessment heuristic seems similarly vulnerable to hijacking by tweaking the presentation of the transaction without changing the essence of price discrimination. Companies have of course gotten wise to this; there’s even academic literature on it. One of the techniques proposed in this paper is “reference group signaling” — getting a customer to change the set of other customers to whom they mentally compare themselves. [2]

The perception of fairness, then, can be more properly called the illusion of fairness.

The fragility of the fairness heuristic becomes less surprising considering that we apparently share it with other primates. This hilarious clip from a TED talk shows a capuchin monkey reacting poorly, to put it mildly, to differential treatment in a monkey-commerce setting (although the jury may still be out on the significance of this experiment). If our reaction to pricing schemes is partly or largely due to brain circuitry that evolved millions of years ago, we shouldn’t expect it to fare well when faced with the complexities of modern business.


Given that the prime impediment to pervasive online price discrimination is a moral principle that is fickle and easily circumventable, one can expect that companies to do exactly that, since they can reap most of the benefits of price discrimination without the negative PR. Indeed, it is my belief that more covert price discrimination is going on than is generally recognized, and that it is accelerating due to some technological developments.

This is a problem because price discrimination does raise ethical concerns, and these concerns are every bit as significant when it is covert. [3] However, since it is much less transparent, there’s less of an opportunity for public debate.

There are two directions in which I want to take this series of articles from this point: first a look at how new technology is enabling powerful forms of tailoring and covert price discrimination, and second, a discussion of what can be done to make price discrimination more transparent and how to have an informed policy discussion about its benefits and dangers.

[1] I had the pleasure of sitting next to Professor Odlyzko at a conference dinner once, and I  expressed my admiration of the prescience of his article. He replied that he’d worked it all out in his head circa 1996 but took a few years to put it down on paper. I could only stare at him wordlessly.

[2] I’m struck by the similarities between price fairness perceptions and privacy perceptions. The aforementioned 2004 price fairness framework can be seen as serving a roughly analogous function to contextual integrity, which is (in part) a theory of consumer privacy expectations. Both these theories are the result of “reverse engineering,” if you will, of the complex mental models in their respective domains using empirical behavioral evidence. Continuing the analogy, privacy expectations are also fragile, highly susceptible to framing, and liable to be exploited by companies. Acquisti and Grossklags, among others, have done some excellent empirical work on this.

[3] In fact, crude ways of making customers reveal their price sensitivity lead to a much higher social cost than overt price discrimination. I will take this up in more detail in a future post.

Thanks to Alejandro Molnar, Joseph Bonneau, Solon Barocas, and many others for insightful conversations on this topic.

To stay on top of future posts, subscribe to the RSS feed or follow me on Twitter or Google+.

January 22, 2013 at 10:24 am 10 comments

Older Posts Newer Posts


I’m an associate professor of computer science at Princeton. I research (and teach) information privacy and security, and moonlight in technology policy.

This is a blog about my research on breaking data anonymization, and more broadly about information privacy, law and policy.

For an explanation of the blog title and more info, see the About page.

Me, elsewhere

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Join 260 other followers