Centralized Vs. Decentralized Data Systems—Which Choice Is Best?

David Weldon
VentureBeat
September 12, 2022

Healthcare and insurance payers spend nearly $496 billion each year on billing and insurance-related costs, noted Bruce Schneier, chief of security architecture at Inrupt—a company created by the father of the modern web, Tim Berners-Lee. As the amount of data continues to grow, it is becoming more difficult for healthcare providers to access necessary information when treating patients.

Providers typically turn to centralized means such as healthcare information exchanges, but these present a laundry list of potential problems, Schneier argued.

“Centralized systems face the risk of security breaches, as well as ethics and confidentiality issues,” Schneier told VentureBeat. “Decentralized data systems can provide healthcare providers better access to important data and information and enable citizens to control what data is being shared and to what provider. For example, one person can have their own data profile and give their doctor access to relevant information needed for their appointment, leading to better care.”

The question of whether to use centralized or decentralized data in the healthcare sector is just one industry example. To help CIOs—regardless of industry—better understand the benefits and shortfalls of each approach, VentureBeat asked Schneier to detail the pros and cons. A slightly edited version of the conversation follows:

VentureBeat: Would you please describe in as much detail as possible what you consider to be the key factors in each of the following scenarios:

Pros of centralized data

Bruce Schneier: Centralized can mean a lot of different things, depending on the context. When we at Inrupt talk about centralized data architectures, we mean applications or online services that are tightly coupled to their databases. It’s the way most things are built today. All data from all users of an app or service is stored in the same place, virtually, if not physically. The benefit is that it’s easier for the makers and operators of the service to optimize its performance.

Cons of centralized data

Schneier: You can think about this at multiple levels. Inside an organization, when services are tightly coupled to their databases, it leads to silos of data. Every big organization interacts with its users via more than one online channel or app. But with centralized architectures, it becomes very difficult to share data between systems. It’s hard to reuse the same piece of data for multiple purposes. Integrations introduce complexity and risk of insecurity. So, user data ends up decaying in silos, frustrating users and holding back the organization.

At a higher level, centralization leads to monopolies. A single individual, group of people or corporation holds power over the functionality of a centralized data system, making it prone to risks. There is also a lack of privacy, as some centralized data systems share the user data with third parties. Centralized data systems are also big targets for hackers, making them more vulnerable to breaches and data theft.

Pros of decentralized data

Schneier: The important change is not about where the data physically lives, it’s about the decoupling of applications and data. This has many benefits. Inside an organization, systems are naturally interoperable. Data can be reused for a new purpose without being copied somewhere else, and all data about a user can be organized around the user, instead of being tied to the application that it first came from. When this kind of reorganization is done with the users’ cooperation, it improves both trust and customer experience.

At a higher level, distributed data puts people back in control of their own data. Your data is distributed in the sense that no one organization has control of it, but in a way, it’s also “centralized” around you. You can share it, or not, with whomever you want.
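As an editorial illustration (not part of the interview, and not Inrupt's Solid API), the idea of data organized around the user and shared selectively can be sketched in a few lines of Python. The `Pod` class, its method names, and the sample readers and keys are all hypothetical, chosen only to show the owner granting one reader access to one piece of data.

```python
class Pod:
    """Hypothetical per-user data store. Illustration only, not the Solid API."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self._records: dict[str, str] = {}
        # Maps a reader's name to the set of keys that reader may see.
        self._grants: dict[str, set[str]] = {}

    def write(self, key: str, value: str) -> None:
        self._records[key] = value

    def grant(self, reader: str, keys: set[str]) -> None:
        """The owner shares specific keys with a reader."""
        self._grants.setdefault(reader, set()).update(keys)

    def revoke(self, reader: str) -> None:
        """The owner withdraws everything previously shared with a reader."""
        self._grants.pop(reader, None)

    def read(self, reader: str, key: str) -> str:
        allowed = reader == self.owner or key in self._grants.get(reader, set())
        if not allowed:
            raise PermissionError(f"{reader} may not read {key!r}")
        return self._records[key]


pod = Pod(owner="alice")
pod.write("allergies", "penicillin")
pod.write("billing_address", "1 Example Street")

# Alice shares only what her doctor needs for the appointment.
pod.grant("dr_lee", {"allergies"})
print(pod.read("dr_lee", "allergies"))  # prints "penicillin"
```

In this sketch a request by `dr_lee` for `billing_address` raises `PermissionError`, and calling `pod.revoke("dr_lee")` removes the doctor's access without touching the data itself.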

So often when we are online, we enter the same data over and over again on different websites, usually forgetting who we’re sharing our data with in the process. In addition to being a more private and secure model, the interoperability of Solid, a technology by Inrupt, makes systems generative. New ideas come from linking things together, but we can’t link our data from different parts of our lives together today because it is stuck in centralized systems.

Cons of decentralized data

Schneier: Companies managing single-purpose, internal-only datasets that don’t contain personal information on users may not see the benefits of a distributed data system like Solid.

VB: What are the top challenges to data management in a centralized system versus a decentralized one?

Schneier: In addition to challenges [in] complying with privacy regulations, centralized data systems create high-value attack targets—a single database of 10 million credit card numbers, for example—that attract hackers with a lot of resources. Distributed data centered on the user completely changes the incentives of hackers, and therefore makes threats of breach more manageable for most organizations that aren’t in the core business of cybersecurity.
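As an editorial aside (not from the interview), the attack-target argument can be made concrete with a toy model. The function name `records_exposed` and the scaled-down figure of 10,000 cards are assumptions for illustration. The model counts how many records a single compromised store gives up and ignores real-world factors such as reused credentials or shared infrastructure.

```python
def records_exposed(stores: list[list[str]], compromised: set[int]) -> int:
    """Count the records held in the stores an attacker has broken into."""
    return sum(len(stores[i]) for i in compromised)


# Scaled-down stand-in for the "10 million credit card numbers" example.
cards = [f"card-{n}" for n in range(10_000)]

centralized = [cards]               # one database holds every record
per_user = [[c] for c in cards]     # one store per user, one record each

print(records_exposed(centralized, {0}))  # prints 10000
print(records_exposed(per_user, {0}))     # prints 1
```

Under these assumptions, one successful break-in yields the whole dataset in the centralized layout but a single record in the per-user layout, which is the shift in attacker payoff the answer describes.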

VB: Which route does your company generally advise clients to go, and under what circumstances?

Schneier: Inrupt’s business is built around the distributed data approach of Solid. Our approach at Inrupt is to help companies and governments see the benefits of storing their user data in a “pod”—i.e., a data store centered on the individual user. The key differentiators here are interoperability, data quality, and compliance at scale.
