Challenges at the Edge: Distributed Data

As specialized edge databases become a necessity for distributed edge computing systems, so will connection to local, interconnected colocation data centers.

It can be tempting to fall into the trap of thinking that edge computing will easily remedy all of the cloud’s shortcomings—limited bandwidth, latency issues, network dependencies, and so on. But as advantageous as ultra-low latency and on-site data processing can be, the edge is not a magic bullet that lets you simply pick up where the cloud left off. In order for your edge computing environment to operate successfully, you need to solve the problem of distributed data. 

Stateless vs. stateful data

Edge computing is relatively straightforward when the data is stateless, that is, when an application doesn't require the server to keep a record of previous interactions with other devices, programs, or users. Each request can be completed based entirely on the information that comes with it. Examples of stateless protocols include the hypertext transfer protocol (HTTP), the domain name system (DNS), and the user datagram protocol (UDP).

However, most edge computing applications require processing stateful data. This means the outcome of a request depends on context accumulated from previous events and interactions with other devices, programs, and users. Applications at the edge typically need to share this contextual information in a timely and coordinated manner. IoT devices, mobile apps, gaming platforms, virtual and augmented reality, and other edge computing use cases must be able to synchronize stateful data with guaranteed consistency.
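
To make the distinction concrete, here is a minimal Python sketch (illustrative only, and not tied to any particular edge platform): the stateless handler can run on any node in isolation, while the stateful handler accumulates context that would have to be synchronized across nodes.

```python
# Illustrative contrast between stateless and stateful request handling.
# Names and structure here are hypothetical, not taken from any specific product.

def handle_stateless(request: dict) -> dict:
    # Everything needed to answer is inside the request itself,
    # so any edge node can serve it without consulting other nodes.
    return {"echo": request["payload"].upper()}

class StatefulHandler:
    """Keeps running context across requests; replicas of this state
    would need to stay consistent across edge nodes."""

    def __init__(self) -> None:
        self.history: list[dict] = []

    def handle(self, request: dict) -> dict:
        # The answer depends on what was seen before.
        self.history.append(request)
        return {"requests_seen": len(self.history)}

if __name__ == "__main__":
    print(handle_stateless({"payload": "ping"}))   # {'echo': 'PING'}
    h = StatefulHandler()
    print(h.handle({"payload": "ping"}))           # {'requests_seen': 1}
    print(h.handle({"payload": "pong"}))           # {'requests_seen': 2}
```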

Edge systems are highly distributed

Edge computing systems are by nature highly distributed. Multiple devices are processing information across a wide geographic area—even an entire smart city. In edge computing architectures, every edge device needs to be able to work independently to perform its own unique function, but these devices also need to share—and synchronize—stateful data with other devices and nodes. Coordinating multiple edge devices while still enabling them to work independently has been a problem for designers of distributed systems for some time now. 

Database limitations 

Conventional databases rely on the centralized coordination of stateful data, which makes sense when your data is being processed in one place, such as a traditional data center. When you are processing vast amounts of data across multiple locations, however, you can’t rely on centralized coordination. Why? Because such centralization can create the very problem edge computing was supposed to solve: latency. 

In order to guarantee consistency, conventional databases have certain coordination protocols in place. Databases at different points of presence (PoPs) across an edge computing system may go idle, waiting until another edge node sends a signal that it has successfully completed a task. Rather than being able to perform its own function while simultaneously processing incoming stateful data, the device is forced to wait until the stateful data is processed and coordinated by another node before it can perform its function. In other words, edge devices aren’t always able to walk and chew gum at the same time. As a result, latency creeps back into the very architecture meant to fight it.  
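
To illustrate the cost of that waiting, here is a toy Python sketch. It is purely illustrative: the coordinator call and the 80 ms round trip are assumed stand-ins for a wide-area network hop, not measurements from any real database. A write that must be acknowledged by a remote coordinator stalls for a full round trip, while a purely local write does not.

```python
import time

# Toy model of coordination overhead. The coordinator call and the 80 ms
# round-trip delay are assumptions for illustration, not real measurements.

WAN_ROUND_TRIP_SECONDS = 0.080

def coordinator_acknowledge(write: dict) -> bool:
    # Stand-in for a network call to a central coordinator at another site.
    time.sleep(WAN_ROUND_TRIP_SECONDS)
    return True

def coordinated_write(store: dict, key: str, value: str) -> float:
    start = time.perf_counter()
    if coordinator_acknowledge({key: value}):  # the node sits idle here
        store[key] = value
    return time.perf_counter() - start

def local_only_write(store: dict, key: str, value: str) -> float:
    start = time.perf_counter()
    store[key] = value                         # no waiting on anyone
    return time.perf_counter() - start

if __name__ == "__main__":
    store: dict = {}
    print(f"coordinated write: {coordinated_write(store, 'k', 'v') * 1000:.1f} ms")
    print(f"local-only write:  {local_only_write(store, 'k', 'v') * 1000:.3f} ms")
```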

Edge databases

Conventional databases can scale up to meet stateful data coordination challenges within a data center. But due to their outdated design, they can't effectively scale out across large, dispersed geographic areas. The coordination of stateful data, therefore, becomes a constraining factor in how many devices can perform complex tasks in an edge computing system.

To support edge computing at increasing scale, edge devices need to be able to work together in a way that minimizes or, in some cases, eliminates the need for coordination among all devices. And, as it turns out, there are many operations these devices can perform without centralized coordination. Unfortunately, they are forced to rely on centralization because they are working with outdated databases. With next-generation edge databases, the overhead of these old-fashioned coordination protocols can be significantly reduced, and stateful data can be consistently processed in a distributed fashion.

Edge databases are geo-distributed, multi-master data platforms that can support multiple edge locations using a coordination-free approach. They guarantee consistency without requiring centralized forms of consensus and can still arrive at a shared version of the truth in real time. These databases also don't require re-architecting cloud applications to scale, nor do they require developers with specialized expertise.
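
One well-known family of techniques behind coordination-free designs is conflict-free replicated data types (CRDTs). The sketch below shows a grow-only counter, a textbook CRDT, as a generic illustration of how replicas can converge without a central coordinator; it is not the internal mechanism of any specific edge database.

```python
# Minimal sketch of a coordination-free replicated counter (a G-Counter CRDT).
# Each node increments only its own slot; replicas merge by taking element-wise
# maximums. Merges are commutative, associative, and idempotent, so replicas
# converge to the same total no matter the order in which they sync.
# Generic illustration only, not the internals of any particular product.

class GCounter:
    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())

if __name__ == "__main__":
    chicago, houston = GCounter("chicago"), GCounter("houston")
    chicago.increment(3)    # local writes, no coordination needed
    houston.increment(5)
    chicago.merge(houston)  # replicas exchange state whenever convenient
    houston.merge(chicago)
    print(chicago.value(), houston.value())  # 8 8 -- both replicas agree
```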

The edge is very much here, and edge systems are proliferating across the globe. Edge databases will help overcome the data processing limitations discussed above and unleash the power and promise of the edge, IoT, and 5G. As specialized edge databases become a necessity for distributed edge computing systems, so will connection to local, interconnected colocation data centers. Netrality's interconnected colocation facilities, located at key edge locations in Chicago, Houston, Kansas City, Philadelphia, and St. Louis, enable edge computing systems to seamlessly scale and evolve to meet tomorrow's distributed data needs.

Contact us for more information.