Not Using Content Filtering in Your Messaging Application? You're Doing it WRONG.

  Nov 4, 2013   |      Eric Mericle

Tags: content filtering, topic filtering, low latency, regular expressions

Naive messaging systems broadcast all messages to subscribers. This style of message delivery can cause resource over-utilization on subscribers who are only interested in a subset of the entire message flow. Even worse, such a message delivery system can quickly bog down or even oversaturate a network. When this happens, the messaging system can no longer scale, at least not without costly upgrades to infrastructure.

We’ve seen first-hand how broadcast publish-subscribe systems can place the heavy burden of message filtering on trading GUI applications. In one case in particular, a subscribing application was consuming 80% of its CPU, and the vast majority of that time was spent discarding unwanted messages. This made for unhappy users, who complained that “their” data was being delayed and that the overall GUI experience was “sluggish”.

[Image: this is bad CPU utilization]

With AMPS topic and content filtering, the throughput requirements of the network can be reduced by delivering to subscribers only the messages they need. Additionally, end-to-end message delivery latency can be improved by reducing the time spent discarding messages that the subscriber has no interest in processing.

[Image: your message pipeline with broadcast]

The same system was reimplemented using AMPS, leveraging topic and content filtering to move the heavy burden of message filtering from the client application to an AMPS instance. This had the net effect of reducing the client’s CPU utilization from 80% to nearly 8%, all while improving the latency profile of message delivery and, most importantly, the users’ overall satisfaction.

The one-two punch of network capacity improvements and reduced client resource consumption enables applications to run more efficiently. If your messaging system expects your end-user application to filter the results that are sent to it, a significant burden could be falling on every one of your end-user desktops. It could also mean that your messaging solution is doing it wrong.

Implementing topic filtering and content filtering in AMPS can improve your application from end-to-end and make certain that you continue to handle messaging the right way! Below we’d like to show you how simple it is to use one of the most powerful features of AMPS.

PCRE Primer

Before we can dive into how AMPS filtering works, we need to give a quick primer on a select group of Perl Compatible Regular Expression (PCRE) constructs supported by AMPS.

If you are unfamiliar with PCRE, fear not: the AMPS User Guide has a chapter that covers supported characters and their definitions.

To get started quickly, below are a couple of examples to illustrate PCRE and how we can use it in AMPS topic filtering.

Regular Expression    Definition
"^Hello"              Matches all strings that begin with the string ‘Hello’.
"World$"              Matches all strings that end with the string ‘World’.

AMPS contains support for these and other regular expression symbols, but for the sake of brevity, we will only discuss these few.
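
To see the two anchors from the table in action, here is a minimal sketch using Python's built-in `re` module. This is an assumption made for illustration: `re` is not the PCRE engine AMPS uses, but `^` and `$` behave the same way for patterns this simple. The sample strings are made up.

```python
import re

# Python's re module stands in for PCRE here; for these simple anchored
# patterns the two engines agree.
begins_with_hello = re.compile(r"^Hello")
ends_with_world = re.compile(r"World$")

# Made-up sample strings for illustration.
samples = ["Hello World", "Hello there", "Brave New World", "Say Hello"]

starts = [s for s in samples if begins_with_hello.search(s)]
ends = [s for s in samples if ends_with_world.search(s)]

print(starts)  # ['Hello World', 'Hello there']
print(ends)    # ['Hello World', 'Brave New World']
```

Note that "Say Hello" matches neither pattern: it contains ‘Hello’, but not at the beginning, and it does not end with ‘World’.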

Topic filtering in AMPS

With AMPS, a client can use a regular expression to subscribe to topics that match the given pattern. This feature can be used in two different ways:

  • subscribe to topics without knowing the topic names in advance. With regular expressions, you can subscribe to all topics ( Topic='.*' ), or ad-hoc topics that match a pattern, like all topics that start with the letter ‘B’ ( Topic='B.*' ). This allows a subscriber to subscribe to topics with little or no understanding of existing topics in the message stream.

  • subscribe to topics that only match a very selective pattern. An example of this type of subscription is the search for a specific token in the topic, for example, subscribing to all topics that end with the string “@Foo” ( Topic='.*@Foo$' ).

Subscription topics are interpreted as regular expressions if they include special regular expression characters. Otherwise, they must be an exact match. Some examples of regular expressions within topics are included in the table below.

Topic          Behavior
trade          matches only “trade”.
^client.*      matches “client”, “clients”, “client001”, etc.
.*trade.*      matches “NYSEtrades”, “ICEtrade”, etc.

AMPS topic filtering allows subscribers to receive messages only from the topics they are interested in, eliminating the need to filter out unwanted messages.
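
The matching rule described above (exact match for plain names, regular expression otherwise) can be approximated in a few lines of Python. This is only a sketch and not how AMPS is implemented: the helper `topic_matches`, the set of characters treated as "special", and the choice to require the pattern to cover the whole topic name are all assumptions made for illustration.

```python
import re

# Assumption: these are the characters that turn a topic into a pattern.
REGEX_CHARS = set(".^$*+?()[]{}|\\")


def topic_matches(subscription: str, topic: str) -> bool:
    """Hypothetical helper: exact match for plain names, regex otherwise."""
    if not any(ch in REGEX_CHARS for ch in subscription):
        return subscription == topic
    # Assumption: the pattern must cover the whole topic name.
    return re.fullmatch(subscription, topic) is not None


# Made-up topic names, based on the examples in the table.
topics = ["trade", "trades", "client", "client001", "NYSEtrades", "ICEtrade"]


def matching(subscription: str) -> list:
    """Return the known topics that a subscription string would select."""
    return [t for t in topics if topic_matches(subscription, t)]


print(matching("trade"))      # ['trade']
print(matching("^client.*"))  # ['client', 'client001']
print(matching(".*trade.*"))  # ['trade', 'trades', 'NYSEtrades', 'ICEtrade']
```

The point of the sketch is the split in behavior: the plain subscription `trade` does not pick up `trades`, while the pattern `.*trade.*` does.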

Content filtering in AMPS

We have demonstrated how topic filtering is a powerful feature in AMPS, but now we’re going to go a step further and examine the most powerful feature of AMPS - content filtering.

Content filters are written in a syntax similar to SQL-92, which adds query power and provides a greater level of selectivity than topic filters alone.

Content filtering is used in a similar manner to a WHERE clause in a SQL SELECT statement. It enables filtering on the message body, and uses an XPath syntax to support queries of nested items in message types that support them.

Much like the SQL WHERE clause, an AMPS content filter supports logical operators (AND, OR), arithmetic operators (+, -, *, /), comparison operators (<=, =, BETWEEN, IN), and the conditional operator IF. Additionally, the LIKE operator is supported to search for a pattern within the data.

For the following examples, we’re going to use messages that are composed of name-value pairs. Our message stream will consist of the following messages:

Topic=NYSE.Technology;Symbol=MSFT,Price=34
Topic=NYSE.Technology;Symbol=TIBX,Price=14
Topic=NYSE.Technology;Symbol=IBM,Price=180
Topic=NYSE.Utilities;Symbol=XOM,Price=87
Topic=NYSE.Utilities;Symbol=XOM,Price=88
Topic=NYSE.Technology;Symbol=TIBX,Price=15
Topic=NYSE.Technology;Symbol=HP,Price=24
Topic=NYSE.Technology;Symbol=MSFT,Price=31
Topic=NYSE.Technology;Symbol=TIBX,Price=17
Topic=NYSE.Utilities;Symbol=XOM,Price=90
Topic=NYSE.Technology;Symbol=MSFT,Price=34
Topic=NYSE.Technology;Symbol=TIBX,Price=16
Topic=NYSE.Utilities;Symbol=XOM,Price=86
Topic=NYSE.Technology;Symbol=IBM,Price=185
Topic=NYSE.Utilities;Symbol=XOM,Price=87
Topic=NYSE.Utilities;Symbol=XOM,Price=88

For the first example, let’s create a client that is only interested in MSFT symbols. That client’s subscription would look like:

Topic=NYSE.Technology;Filter="/Symbol='MSFT'"

This subscription returns only messages whose ‘Symbol’ field matches ‘MSFT’. Given the stream above, the subscriber would receive only the following messages:

Topic=NYSE.Technology;Symbol=MSFT,Price=34
Topic=NYSE.Technology;Symbol=MSFT,Price=31
Topic=NYSE.Technology;Symbol=MSFT,Price=34
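
To make the selection concrete, the sketch below replays the sample stream through a small Python function that stands in for the server. Nothing here is AMPS API: `parse` and `select` are hypothetical helpers, and the lambda merely plays the role of the filter `/Symbol='MSFT'`.

```python
# The sample stream from this post, one message per line.
STREAM = """\
Topic=NYSE.Technology;Symbol=MSFT,Price=34
Topic=NYSE.Technology;Symbol=TIBX,Price=14
Topic=NYSE.Technology;Symbol=IBM,Price=180
Topic=NYSE.Utilities;Symbol=XOM,Price=87
Topic=NYSE.Utilities;Symbol=XOM,Price=88
Topic=NYSE.Technology;Symbol=TIBX,Price=15
Topic=NYSE.Technology;Symbol=HP,Price=24
Topic=NYSE.Technology;Symbol=MSFT,Price=31
Topic=NYSE.Technology;Symbol=TIBX,Price=17
Topic=NYSE.Utilities;Symbol=XOM,Price=90
Topic=NYSE.Technology;Symbol=MSFT,Price=34
Topic=NYSE.Technology;Symbol=TIBX,Price=16
Topic=NYSE.Utilities;Symbol=XOM,Price=86
Topic=NYSE.Technology;Symbol=IBM,Price=185
Topic=NYSE.Utilities;Symbol=XOM,Price=87
Topic=NYSE.Utilities;Symbol=XOM,Price=88
""".splitlines()


def parse(line):
    """Split one sample line into (topic, {field: value})."""
    topic_part, body = line.split(";", 1)
    topic = topic_part.split("=", 1)[1]
    fields = dict(pair.split("=", 1) for pair in body.split(","))
    return topic, fields


def select(stream, topic, predicate):
    """Hypothetical stand-in for the server: keep the lines published on
    `topic` whose fields satisfy `predicate`."""
    kept = []
    for line in stream:
        msg_topic, fields = parse(line)
        if msg_topic == topic and predicate(fields):
            kept.append(line)
    return kept


# The lambda plays the role of Filter="/Symbol='MSFT'".
msft_only = select(STREAM, "NYSE.Technology", lambda f: f["Symbol"] == "MSFT")
for line in msft_only:
    print(line)
```

Of the 16 messages in the stream, only 3 reach the subscriber. The other 13 never cross the network, which is the work the client in the earlier story was spending its CPU on.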

In another example, a client is only interested in TIBX trades priced between $15 and $20 per share. This subscription would look like:

Topic=NYSE.Technology;Filter="/Symbol='TIBX' AND /Price BETWEEN 15 AND 20"

The range used in the `BETWEEN` operator is inclusive of both operands, meaning the expression `/A BETWEEN 0 AND 100` is equivalent to `/A >= 0 AND /A <= 100`.

This subscription would return the following messages to the subscriber:

Topic=NYSE.Technology;Symbol=TIBX,Price=15
Topic=NYSE.Technology;Symbol=TIBX,Price=17
Topic=NYSE.Technology;Symbol=TIBX,Price=16
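
The inclusive behavior of `BETWEEN` is easy to check with a short Python sketch. The `between` helper is hypothetical and simply mirrors the semantics described above; the (symbol, price) pairs are the NYSE.Technology messages from the sample stream, in order.

```python
def between(value, low, high):
    """Inclusive range test, mirroring the BETWEEN semantics described above."""
    return low <= value <= high


# (symbol, price) pairs from the NYSE.Technology messages in the sample stream.
technology = [
    ("MSFT", 34), ("TIBX", 14), ("IBM", 180), ("TIBX", 15), ("HP", 24),
    ("MSFT", 31), ("TIBX", 17), ("MSFT", 34), ("TIBX", 16), ("IBM", 185),
]

# Plays the role of /Symbol='TIBX' AND /Price BETWEEN 15 AND 20.
prices = [p for s, p in technology if s == "TIBX" and between(p, 15, 20)]
print(prices)  # [15, 17, 16]

# Both endpoints are included; a value just below the range is not.
print(between(15, 15, 20), between(20, 15, 20), between(14, 15, 20))
# True True False
```

The TIBX trade at 14 is dropped, while the trade at exactly 15 is kept because the lower bound is inclusive.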

Like topic filtering, content filtering reduces the number of superfluous messages sent to a subscriber, and it does so at a finer grain. The net result is a further reduction in demand on the network and in the overall resource requirements on the client.

Taking things a step further, topic filtering and content filtering can both be applied to a message stream to provide message filtering so reliable, you’ll never need to build another client-side filter again. Using topic and content filtering will make your network happy, your users happy and your developers happy.

Conclusion

Implementing topic and content filtering with AMPS is a simple and highly effective way to improve network performance, relieve clients of the CPU and memory burden of filtering messages from a broadcast stream, and reduce latency in your messaging platform.

In the next blog post, we’ll look at using topic-only filtering to implement a request / response system using AMPS. We’ll also provide an example implementation of a heartbeat monitor and agent that uses this system. Stay tuned!

