Joining BSON Data with XML Data and Aggregating in JSON -- Making it Easy and Natural

Aug 5, 2014 | Patrick Flickinger

json xml bson amps join

Data aggregation as easy as it is in the movies: Find out how!

We’ve all seen television’s version of middleware: real-time streams of data, arriving from all over the world, effortlessly joined and available in an instant, where combining new information takes just a few keystrokes, even in the middle of the night, from an underpowered laptop, while the clock counts down to a major disaster.

In reality, today’s systems look nothing like the ones on television, where data feeds are ubiquitous and easily aggregated in “the cloud.” Most likely, you rely on systems of varying ages that communicate via disparate message types. These message types, whether XML, JSON, BSON, FIX, or even CSV, need to be converted to a common format before they can be consumed.

Where does this conversion occur? Unfortunately, conversion typically happens just before consumption in aggregation systems, which tends to limit the usefulness of the data. Furthermore, this conversion is a separate step, one that is either hand-rolled or part of a different process inside an aggregator. Such a disjointed operation is both clumsy and inefficient. It’s expensive in both development time and processing time. No one wants to do things this way, but it’s seen as a necessary bottleneck in a data stream.

At 60East, we’re breaking the rules. Rather than converting back and forth between message types, we JOIN the messages directly. This allows us to do some unusual things, such as cross-typed joins into entirely different message types, in real time! It’s like Jack Bauer meets the Matrix. But let’s not just stand around while CTU gets overtaken by sentinels. It’s time to pick up that phone and dive in for an example.

Suppose we have a live feed of XML GPS check-in data for all the taxicabs in NYC. Each taxi transmits vital information every second.

Taxi Data
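To make the shape of such a feed concrete, here is a minimal Python sketch of what one check-in might look like and how it could be read. The element names (`cab_id`, `garage_id`, `speed_mph`, and so on) are assumptions for illustration; the post does not specify the actual message layout, and this is plain Python, not AMPS configuration.

```python
import xml.etree.ElementTree as ET

# Hypothetical check-in message; every element name here is an assumption.
SAMPLE_CHECKIN = """
<checkin>
  <cab_id>7Q42</cab_id>
  <garage_id>G1</garage_id>
  <lat>40.7580</lat>
  <lon>-73.9855</lon>
  <speed_mph>31.5</speed_mph>
  <fare>12.50</fare>
</checkin>
"""

def parse_checkin(xml_text):
    """Turn one XML check-in into a flat dict with typed values."""
    root = ET.fromstring(xml_text.strip())
    return {
        "cab_id": root.findtext("cab_id"),
        "garage_id": root.findtext("garage_id"),
        "lat": float(root.findtext("lat")),
        "lon": float(root.findtext("lon")),
        "speed_mph": float(root.findtext("speed_mph")),
        "fare": float(root.findtext("fare")),
    }

record = parse_checkin(SAMPLE_CHECKIN)
print(record["cab_id"], record["speed_mph"])
```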

Using the real-time aggregation capabilities in AMPS, we’re able to determine where each cab is, whether the driver is speeding, the total trip time for each pickup, the cab fare compounded with the gallons of fuel used, etc.

Taxi Data Topic Definition
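The kind of per-cab aggregation described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how AMPS defines a topic: the field names, the speed limit, and the assumption that the fare field is a cumulative meter reading are all hypothetical.

```python
from collections import defaultdict

SPEED_LIMIT_MPH = 25.0  # assumed limit, for illustration only

def aggregate_by_cab(checkins):
    """Fold a stream of check-in dicts into one running summary per cab."""
    summary = defaultdict(
        lambda: {"seconds_seen": 0, "fare": 0.0, "speeding": False}
    )
    for checkin in checkins:
        cab = summary[checkin["cab_id"]]
        cab["seconds_seen"] += 1       # one check-in per second
        cab["fare"] = checkin["fare"]  # assumed cumulative meter reading
        cab["speeding"] = checkin["speed_mph"] > SPEED_LIMIT_MPH
    return dict(summary)

# Hypothetical check-ins, already parsed from XML into dicts.
feed = [
    {"cab_id": "7Q42", "speed_mph": 22.0, "fare": 3.00},
    {"cab_id": "7Q42", "speed_mph": 31.5, "fare": 3.50},
    {"cab_id": "2B17", "speed_mph": 18.0, "fare": 9.25},
]

by_cab = aggregate_by_cab(feed)
print(by_cab["7Q42"])
```

Each new check-in updates the summary for its cab in place, so the latest state per cab is always available without rescanning the feed.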

Now suppose that we own multiple taxicab garages in the city and want to determine when to shut each down for the night to maximize profits. A NoSQL database contains the garage fleet information, but the feed is in the BSON message type.


An additional caveat is that our front-end system wants the data in JSON format. Since AMPS supports the cross joining of discrete message types (and projects into any other message type), we simply need to define the query.

Combined Topic Definition
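For readers who want to see the shape of the join itself, the following Python sketch mimics the idea by hand: taxi records (originally XML) are matched to garage records (originally BSON) on a shared key, and the result is emitted as JSON. It is not AMPS syntax. The garage records are plain dicts standing in for decoded BSON documents, and the `garage_id`, `name`, and `nightly_cost` fields are assumptions.

```python
import json
from collections import defaultdict

# Stand-ins for decoded BSON garage documents (hypothetical fields).
garages = [
    {"garage_id": "G1", "name": "Midtown", "nightly_cost": 40.0},
    {"garage_id": "G2", "name": "Harlem", "nightly_cost": 25.0},
]

# Stand-ins for per-cab fare totals derived from the XML feed.
cab_totals = [
    {"cab_id": "7Q42", "garage_id": "G1", "fare": 12.5},
    {"cab_id": "2B17", "garage_id": "G1", "fare": 30.0},
    {"cab_id": "9K03", "garage_id": "G2", "fare": 10.0},
]

def join_fares_to_garages(cabs, garage_docs):
    """Sum fares per garage, then attach them to each garage record."""
    fares = defaultdict(float)
    for cab in cabs:
        fares[cab["garage_id"]] += cab["fare"]
    return [
        {
            "garage_id": g["garage_id"],
            "name": g["name"],
            "total_fares": fares[g["garage_id"]],
            "net": fares[g["garage_id"]] - g["nightly_cost"],
        }
        for g in garage_docs
    ]

combined = join_fares_to_garages(cab_totals, garages)
as_json = json.dumps(combined)  # JSON projection for the front end
print(as_json)
```

The hand-rolled version has to decode both inputs into a common in-memory form before it can join them, which is exactly the separate conversion step the post argues against.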

And there you have it! As garage owners, we are able to see which garages are the most profitable as the taxis are accumulating fares. This allows us to react in real time and close any garages that are underutilized. Using our unique JOIN technology, we’ve demonstrated how easy it is for you to join disparate entities, which you can use to save your company countless hours of development time.

But, as always, don’t take our word for it. Try it yourself! This capability is new in our upcoming 4.0 release, and as this posting goes to the web, our customers are already trying this out. If you’d like a preview of 4.0, contact 60East Technologies.

Get the preview. Try it out. See how easy it is to work with multiple types in AMPS. Then try to do the same thing in your current system – and make that solution work in real time.
