It's hard to measure the value of data (transfers).
Data’s fundamental characteristics, including its non-rivalrous nature and its ability to capture key learnings about the past and present – and even, as we see clearly through today’s explosion of large language models, predict the future – allow for an incredible range of valuable use cases, including through the combination of data across sources and across users. But some of the most valuable data is personal data, in which people have privacy rights and (in many countries) robust legal protections. Protecting user safety while unlocking the full value of user data is not always straightforward; this difficulty limits the experimentation and value creation that data transfers make possible, a necessary side effect of efforts to limit the harm that such sharing can also produce. At the Data Transfer Initiative, we’re working to unlock some of that value by empowering people to choose to transfer their personal data, thus setting them up to be the principal beneficiaries of the subsequent value creation. And a couple of our recent efforts have brought these data value questions to the surface.
In late May, a team of maintainers of the Data Transfer Project, engineers and product leaders from DTI’s partners Google and Meta, organized an event at MyData 2023 in Helsinki, Finland. (Sadly, I couldn’t join them, but I’m hoping to next year!) MyData is both a conference and a community, building tools to help people have more direct individual control over their data and to unlock value opportunities downstream. I engaged with some companies working in this space, often associated with the term Personal Information Management Systems (PIMS), many years ago through the Mobile Ecosystem Forum. The early stalwarts have been joined in recent years by startups like Inrupt, which is implementing the Solid protocol in service of a similar vision. We’re looking forward to continuing to engage with these organizations as we expand the horizons of our data transfer technologies.
Scholarly work on the value of data often highlights the difficulty of quantification. One of the best pieces I’ve seen is the Value of Data report, from the Open Data Institute and the Bennett Institute for Public Policy at the University of Cambridge. In many contexts where the question of the value of data is at hand, the focus isn’t on the commercial sector, but rather on government data. The obstacles to making government data available for sharing and re-use are quite different from those facing commercial and personal data, but sharing still costs money and time, including to build and maintain infrastructure to host the data, along with potentially substantial work to clean the data to make it usable. We have a few anecdotes, at this point in history, of the incredible positive externalities that can come from sharing government-generated information, such as weather data. But it can take time and nontrivial taxpayer money to show the public benefits of such sharing – resources that short-attention-span political cycles rarely allow.
The difficulty of systematically measuring the value of data sharing is why anecdotes are so important. Which brings up another Data Transfer Initiative effort – our new collaboration with AcademyHealth and the Gordon and Betty Moore Foundation, among others including DTI’s partner Google. This effort builds on work done by medical researchers such as Microsoft Chief Scientific Officer Eric Horvitz, who has published extensively on the relationship between online data and health. Long before someone walks into a doctor’s office, they turn to easy, cost-free internet sources of inquiry with their first symptoms, including search engines, social media portals, and other places where humans may have written about circumstances much like theirs. Modern technology excels at spotting patterns in language reflective of human behavior; even a small data set can be powerful. From DTI’s perspective, a key goal of this effort is to facilitate voluntary sharing by patients of their online activity as a means of building such data sets and making possible downstream medical research on effective diagnosis. For us, it’s not only a way to build tools that deliver positive impact; it’s also a way to help illustrate that data transfers are incredibly valuable, above and beyond the consumer choice benefits of enabling a person to multihome and to transfer their data from one service to another with similar features and offerings.
Data delivers value, but making data available so that value can be derived costs money, and there can be complicated roadblocks to clear. Yet Article 6(9) of the Digital Markets Act requires platforms to meet the data portability obligation “free of charge.” At a small scale, platforms can absorb these costs as part of the cost of doing business. But the DMA opens up this access to “third parties authorised by an end user,” creating at least the possibility of access requests at such a scale that the cost of production becomes massive; at some point, allowing the platform to recoup some of that cost is reasonable, even necessary. How to do so, at what thresholds, and which frameworks might be useful for the EU in thinking about these questions will be the subject of my next piece – stay tuned!
What We’re Reading