Hot, Fresh, and Expressive: New AMPS Functions!

  Apr 26, 2017   |      Ray Imber

functions date and time

Microphone. Image by drestwn -- CC BY 2.0

AMPS 5.2 has dropped and, like a new Beyoncé album, it is so awesome it will probably break the internet. AMPS 5.2 comes with a mind-bending amount of new functionality, but I would like to focus on a few key new functions that have been made available to your AMPS expressions.

Functions are like the backup singers of the AMPS world: they may not be what you hear first, but they sweeten the mix and you’d miss them if they weren’t there.

First, a quick review. A key piece of the AMPS workflow is a full-featured expression language that is based on XPath and SQL-92. This language is used for:

  • Content filtering
    • for client subscriptions
    • for server configuration settings such as actions
    • for filtered (content-aware) entitlements
  • Creating projected fields for views
  • Constructing fields for message enrichment (another exciting new feature of AMPS 5.2)

If you have used AMPS, you have probably used the AMPS expression language. The AMPS expression language exposes a range of functions that allow you to do computations on your message data right inside the system, with very high performance.

These include familiar string query functions such as:

  • SUBSTR()
  • INSTR()

as well as the numeric aggregation functions:

  • AVG()
  • COUNT()
  • MIN()
  • MAX()
  • SUM()

These functions provide a lot of utility, but we weren’t satisfied! AMPS 5.2 has greatly expanded the number of built-in functions available to you. There are now 13 new functions available for use in your AMPS expressions.

Let me break them down for you:

Numeric Operations:

Let’s start with the most straightforward additions. These are numeric functions that will make it just a little bit easier to tailor AMPS expressions to your use case without extra effort.

Function    Description
ROUND()     Rounds a value to the nearest integer, or to the specified decimal place if one is given.
ABS()       Returns the absolute value of a number.

String Operations:

String processing is an important part of a good messaging system, so we created quite a few new string functions. I’m going to break them down by sub-category to more clearly show their utility:

Search and Replace

Function            Description
REPLACE()           Searches for and replaces all occurrences of a string, and returns the resulting string.
REGEXP_REPLACE()    Searches for and replaces all occurrences of a regular expression, and returns the resulting string.
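
If the difference between the two is not obvious, here is a rough Python analogy (this is not AMPS code). It assumes REPLACE() matches literal text and REGEXP_REPLACE() matches a pattern, as described above. Python's re module stands in for the AMPS regular expression engine, whose syntax may differ in places, and the sample strings are made up for illustration.

```python
import re

# Hypothetical sample values, standing in for message fields.
date_text = "2017-04-26"
messy_text = "too   many    spaces"

# Literal search and replace: every "-" becomes "/".
literal_result = date_text.replace("-", "/")

# Pattern search and replace: every run of whitespace becomes one space.
pattern_result = re.sub(r"\s+", " ", messy_text)

print(literal_result)
print(pattern_result)
```

The literal form is the one to reach for when the text to replace is fixed; the pattern form earns its keep when the text varies.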

Search and Case Sensitivity

Previous versions of AMPS required you to use regular expressions if you wanted case-insensitive matching. Regular expressions are very powerful but come with a large amount of complexity. AMPS 5.2 gives you new tools for case-insensitive matching without resorting to the “big guns” of regular expression matching.

Starting with:

Function        Description
INSTR_I()       Case-insensitive. Returns the position at which the second string starts within the first string, or 0 if the second string does not occur within the first string.
STREQUAL_I()    Case-insensitive. Returns true if, when both strings are transformed to the same case, the string to be compared is identical to the string to compare.

These are exactly like their case-sensitive counterparts from previous versions of AMPS, with the very useful distinction that they ignore case. The addition of these functions is theoretically enough to handle most case-related string issues, but the AMPS team did not stop there! We wanted to give you maximum flexibility.
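
To make the return values concrete, here is a small Python model of the two functions (again, a sketch, not AMPS code). The names instr_i, strequal_i, and ascii_lower are hypothetical helpers written for this post. The model assumes positions are 1-based, which follows from 0 meaning "not found", and that only ASCII letters are case-folded.

```python
import string

# Fold only A-Z to a-z; every other character is left alone (assumption).
_TO_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(text: str) -> str:
    return text.translate(_TO_LOWER)


def instr_i(haystack: str, needle: str) -> int:
    """1-based position of needle in haystack ignoring ASCII case, or 0 if absent."""
    # str.find returns -1 when the needle is absent, so adding 1 yields 0.
    return ascii_lower(haystack).find(ascii_lower(needle)) + 1


def strequal_i(left: str, right: str) -> bool:
    """True when the two strings match once both are folded to the same case."""
    return ascii_lower(left) == ascii_lower(right)


print(instr_i("Hello World", "WORLD"))
print(instr_i("Hello World", "xyz"))
print(strequal_i("AMPS", "amps"))
```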

In some cases, particularly when using strings with the IN clause, it is more efficient to simply convert the string to a known case. This is why we introduced:

Function    Description
UPPER()     Converts an ASCII string to its upper case equivalent.
LOWER()     Converts an ASCII string to its lower case equivalent.

One final note about these string functions: it’s important to point out that they are NOT Unicode-aware.
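
Here is a Python sketch of what "not Unicode-aware" can mean in practice, and of the convert-then-compare pattern mentioned above for IN. The helper ascii_upper and the ALLOWED set are hypothetical, and the sketch assumes that non-ASCII characters pass through unchanged; check the AMPS documentation for the exact behavior.

```python
import string

# Map only a-z to A-Z; anything outside ASCII is left untouched (assumption).
_TO_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def ascii_upper(text: str) -> str:
    return text.translate(_TO_UPPER)


# Hypothetical list of values, stored in a single known case.
ALLOWED = {"IBM", "MSFT"}


def symbol_allowed(symbol: str) -> bool:
    """Convert once to a known case, then do a plain membership test."""
    return ascii_upper(symbol) in ALLOWED


print(ascii_upper("straße"))   # ASCII-only conversion leaves the sharp s alone
print("straße".upper())        # Python's Unicode-aware conversion, for contrast
print(symbol_allowed("ibm"))
```

If your data contains non-ASCII text, a case-converted comparison may not match the way a Unicode-aware one would.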

Concatenation

I would like to take a moment to talk about CONCAT, because it is particularly flexible and powerful. The function accepts both XPath identifiers and literal values and will return a string composed of all of them. CONCAT also works with non-string values, and will attempt to convert them to strings using the normal AMPS string coercion rules.

Function    Description
CONCAT()    Concatenates the string representations of one or more fields into a single string value. CONCAT may be called with any number of arguments.

CONCAT() can be very useful for debugging or status message generation, but it can also be used in more clever ways. For example, you can create a unique record id for a view that is composed of two different underlying topics:

<ViewDefinition>
  <Topic>MechaGodzilla</Topic>
  <UnderlyingTopic>
    <Join>/GiantRobots/body_part = /GiantMonsters/body_part</Join>
  </UnderlyingTopic>
  <MessageType>json</MessageType>
  <Projection>
    <Field>CONCAT(/GiantRobots/part_id, /GiantMonsters/monster_id)
           AS /cyborg_id</Field>
    <Field>CONCAT("Mecha", /GiantMonsters/monster_name)
           AS /monster_name</Field>
  </Projection>
  <Grouping>
    <Field>/GiantRobots/part_id</Field>
    <Field>/GiantMonsters/monster_id</Field>
  </Grouping>
</ViewDefinition>
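
One thing worth checking when you build ids this way is whether two different pairs of values can produce the same string. The Python sketch below models CONCAT as plain string joining; Python's str() is only a stand-in for the AMPS coercion rules, and the id values are made up. Whether this matters for you depends on what your ids look like.

```python
def concat(*values) -> str:
    """Join the string form of every argument, in order."""
    return "".join(str(value) for value in values)


# Two different (part_id, monster_id) pairs that collapse to the same key.
key_a = concat(1, 23)
key_b = concat(12, 3)

# Adding a literal separator between the parts keeps the keys distinct.
safe_a = concat(1, "-", 23)
safe_b = concat(12, "-", 3)

print(key_a, key_b)
print(safe_a, safe_b)
```

Because CONCAT accepts literal values alongside field identifiers, a separator like this can be added as one more argument.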

Filtering

AMPS 5.2 has added one very important filtering function:

Function      Description
COALESCE()    Returns the first non-NULL argument. COALESCE may be called with any number of arguments.

COALESCE() takes a list of field identifiers and literal values, and returns the value of the first one that is not NULL. This can be used in several powerful ways. By using COALESCE() in your views, you can create very fine-grained aggregated fields. For example:

COALESCE(/userCategory,
         /employeeCategory,
         /vendorCategory,
         'restricted')

COALESCE() can be used as a cleaner alternative to IF chaining. For example, you may want to determine a total for an order, but it may have several possible values for a price. In older versions of AMPS you would write the expression like this:

/Order/Qty * IF(/Order/NormalPrice IS NOT NULL,
                /Order/NormalPrice,
                IF(/Order/SpecialPrice IS NOT NULL,
                   /Order/SpecialPrice,
                   IF(/Order/SuperFriendsDiscount IS NOT NULL,
                      /Order/SuperFriendsDiscount, 0)
                  )
               )

With COALESCE() in AMPS 5.2, you can simply write:

/Order/Qty * COALESCE(/Order/NormalPrice,
                      /Order/SpecialPrice,
                      /Order/SuperFriendsDiscount,
                      0)

Here are a few tips for working with COALESCE(). First, notice that, to make the intent of the filter clear, this example provides a constant value for AMPS to return from the COALESCE if all of the field values are NULL.

Second, note that COALESCE() is a scalar function. It takes a list of scalar values or scalar-valued fields. This means that arrays will get converted to “scalar context”! In the simplest terms, this means that COALESCE() will use the first value of the array as the value, and ignore the rest of the array.
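
Both tips can be seen in a short Python model of the behavior described above. This is a sketch rather than AMPS code: None stands in for NULL, a Python list stands in for an array field, the helper names are hypothetical, and treating an empty array as NULL is my assumption.

```python
def to_scalar(value):
    """Scalar context: an array collapses to its first element."""
    if isinstance(value, list):
        # Assumption: an empty array is treated like NULL.
        return value[0] if value else None
    return value


def coalesce(*args):
    """Return the first argument that is not NULL (None), or None if all are."""
    for arg in args:
        scalar = to_scalar(arg)
        if scalar is not None:
            return scalar
    return None


# Hypothetical order: no normal price, and an array of special prices.
order = {"Qty": 3, "NormalPrice": None, "SpecialPrice": [9.5, 8.0]}

total = order["Qty"] * coalesce(
    order.get("NormalPrice"),
    order.get("SpecialPrice"),          # array: only 9.5 is considered
    order.get("SuperFriendsDiscount"),  # field is missing entirely
    0,                                  # constant fallback
)
print(total)
```

Note that the test is "not NULL", not "not zero": a price of 0 would be returned as soon as it is reached.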

Aggregation

For all of you data scientists out there, AMPS 5.2 provides a few new functions that you are going to love:

Function            Description
COUNT_DISTINCT()    Takes a single argument and returns the number of distinct values within the aggregate group.
STDDEV_POP()        Returns the population standard deviation.
STDDEV_SAMP()       Returns the sample standard deviation.

These functions are specifically for use with View fields. They return a single value for each distinct group of messages, as identified by distinct combinations of values in the Grouping clause. This class of functions existed in previous versions of AMPS, with functions such as AVG(), COUNT(), MIN(), MAX(), and SUM(). The addition of these new aggregation functions allows you to perform more advanced statistical analysis on your data right inside AMPS, so you get the analysis you need with lower latency and less complexity.
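
If you have not worked with the two flavors of standard deviation, the Python sketch below shows what each function computes per group. It uses the standard library's statistics module as a stand-in for the AMPS implementation, and the messages and field names are hypothetical. The population form divides by n, while the sample form divides by n - 1.

```python
from collections import defaultdict
from statistics import pstdev, stdev

# Hypothetical messages, grouped by /symbol as a Grouping clause would.
messages = [
    {"symbol": "IBM", "price": 10.0},
    {"symbol": "IBM", "price": 12.0},
    {"symbol": "IBM", "price": 12.0},
    {"symbol": "MSFT", "price": 20.0},
    {"symbol": "MSFT", "price": 24.0},
]

groups = defaultdict(list)
for message in messages:
    groups[message["symbol"]].append(message["price"])

# One result row per group, like one view record per distinct grouping key.
summary = {
    symbol: {
        "count_distinct": len(set(prices)),  # number of distinct values
        "stddev_pop": pstdev(prices),        # divides by n
        "stddev_samp": stdev(prices),        # divides by n - 1
    }
    for symbol, prices in groups.items()
}

for symbol, row in summary.items():
    print(symbol, row)
```

For the MSFT group the population value is 2.0 and the sample value is the square root of 8, so the sample form is always the larger of the two for the same data.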

Conclusion

The AMPS expression language was always very powerful. It’s one of the key pillars of the flexibility of AMPS. These new functions have expanded that flexibility, giving you even more options to tailor AMPS perfectly to your use case.

AMPS 5.2 has brought a huge number of new features, too many for a single blog post! Look for the rest of this series of blog posts as we begin to highlight more new AMPS 5.2 features. AMPS Speed!

