The What, How, and Why of OR Filters

Hello Keen community! Back in September, we announced the release of “or” filters. In this post, we’ll share what we learned in a deep dive on the what, how, and why.

Why, Part 1: Foundation for a Richer Query Language

Let’s start with the big question: why build “or” filters? One simple answer is that it has been among our most-requested features for a long time and that in and of itself was enough to justify building it. On our quest to make Keen ever more flexible, this capability allows our customers to quickly query their data in ways that up until now were either complicated and expensive, or outright impossible.

But the scenarios unlocked by the new architecture on which “or” filters were built are even more exciting. We’ll discuss the new model in detail below, but just as a teaser here are some of the enticing scenarios that it will enable us to build:

  • Filtering on functions of properties, e.g. (player.level mod 5) eq 0
  • Filtering on relationships between properties, e.g.
    player.level < (1.2 * monster.level)
  • Using expressions in the target_property of a query, e.g.
    sum (order.count * order.unit_price)
  • Using expressions in the group_by of a query, e.g.
    group_by (user.point_program exists true) or
    group_by bin(customer.age, [0, 20, 40, 60, 80, 100])

Adding these capabilities will open up whole new paradigms of querying in Keen (if you’ve got ideas, we’d love to hear them), and bring us one step closer to parity/compatibility with SQL and other mature query languages. But before we go any deeper let’s go back and cover the basics.

What, Part 1: A Simple Example

To satisfy those of you who arrived here just looking for some sample code, here’s what an “or” filter looks like:

(You can see a similar example and read more in our API docs.)
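
As a rough sketch of what such a request body can look like: the snippet below builds a count-style query whose single filter is an “or” of two conditions. It assumes the “or” filter is written as an `operator` of `"or"` with a list of `operands`; the collection name `clickthroughs`, the timeframe, and the property names `customer.lifetime_revenue` and `customer.tier` are illustrative placeholders, so please treat the API docs as the authority on exact syntax.

```python
import json


def simple_filter(property_name, operator, property_value):
    """A single-condition filter on one property."""
    return {
        "property_name": property_name,
        "operator": operator,
        "property_value": property_value,
    }


def or_filter(*operands):
    """Wrap two or more filters in an assumed {"operator": "or", "operands": [...]} shape."""
    if len(operands) < 2:
        raise ValueError("an 'or' filter needs at least two operands")
    return {"operator": "or", "operands": list(operands)}


# Hypothetical collection, timeframe, and property names (not taken from the API docs).
query = {
    "event_collection": "clickthroughs",
    "timeframe": "this_7_days",
    "filters": [
        or_filter(
            simple_filter("customer.lifetime_revenue", "gt", 100),
            simple_filter("customer.tier", "eq", "premium"),
        )
    ],
}

print(json.dumps(query, indent=2))
```

An event matches this filter if either operand matches, so a premium-tier customer with low lifetime revenue is still counted.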

How, Part 1: A Fundamentally New Concept

Looking at the sample request above, it may seem like “or” filters are a simple feature - and in many respects they are. They allow you to query against events that match any one of a set of conditions, rather than matching all conditions. But under the hood, the implementation was actually quite complex, and it’s worth going into why.

Prior to the introduction of “or” filters, all supported filter types conformed to the same pattern: a 3-tuple of `(property_name, operator, property_value)`. This was reflected in our implementation, which explicitly defined a filter as a POJO (Plain Old Java Object) with those properties:

This allowed for simple and efficient code, but it lacked the flexibility necessary to introduce filters with fundamentally different structures. We could have hacked in support for “or” filters by just glomming onto that existing `Filter` class, but it would have degraded overall code quality and would have left us in even worse shape when we inevitably add the next filter operator with yet another structure. It seemed like there must be a better way…

How, Part 2: What is a Filter?

Good software architecture often starts with asking simple questions. What, really, is a filter? Logically speaking it’s just a predicate on an event, nothing more and nothing less. We could have modeled it that way and it would have worked, but there was an even broader generalization to make: an expression is a function that takes an event as input and produces some value (in the Keen type system) as output, and a filter is just an expression that always produces a boolean (i.e. true or false) output.

An expression can be a constant, or a reference to a property, or some function with one or more expressions as its operands. Since functions are themselves expressions, they can appear as operands to other functions and form a so-called Abstract Syntax Tree (AST). For now we have defined functions for the existing filter types (such as “eq”, “gte”, and now “or”), but the logic allows for expressions using any mathematical operations you can think of: addition, subtraction, multiplication, division, modulus, logarithms, exponentiation, or even binning.
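
To make the model concrete, here is a minimal sketch of the idea in Python. It is not our actual (Java) implementation: the class names `Constant`, `Property`, and `Function`, the helper functions, and the sample event are all hypothetical, and details such as missing properties and the Keen type system are ignored.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

Event = Dict[str, Any]


class Expression:
    """Anything that maps an event to a value."""

    def evaluate(self, event: Event) -> Any:
        raise NotImplementedError


@dataclass(frozen=True)
class Constant(Expression):
    value: Any

    def evaluate(self, event: Event) -> Any:
        return self.value


@dataclass(frozen=True)
class Property(Expression):
    name: str  # dotted path, e.g. "player.level"

    def evaluate(self, event: Event) -> Any:
        value: Any = event
        for key in self.name.split("."):
            value = value[key]  # raises KeyError if the property is absent
        return value


@dataclass(frozen=True)
class Function(Expression):
    op: Callable[..., Any]
    operands: Tuple[Expression, ...]

    def evaluate(self, event: Event) -> Any:
        # Operands are expressions too, so this recursion walks the AST.
        return self.op(*(operand.evaluate(event) for operand in self.operands))


def eq(a, b): return Function(operator.eq, (a, b))
def gt(a, b): return Function(operator.gt, (a, b))
def mod(a, b): return Function(operator.mod, (a, b))


def or_(*operands):
    # Note: every operand is evaluated; this sketch does not short-circuit.
    return Function(lambda *values: any(values), operands)


# A filter is just an expression whose result is a boolean.
high_value = or_(
    gt(Property("customer.lifetime_revenue"), Constant(100)),
    eq(Property("customer.tier"), Constant("premium")),
)
every_fifth_level = eq(mod(Property("player.level"), Constant(5)), Constant(0))

event = {"customer": {"lifetime_revenue": 42, "tier": "premium"}, "player": {"level": 10}}
print(high_value.evaluate(event))         # True
print(every_fifth_level.evaluate(event))  # True
```

Because `or_`, `eq`, and `mod` all produce the same kind of node, adding a new operator in this sketch means adding one small function rather than reshaping a fixed 3-tuple.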

What, Part 2: Using “or” Filters

The example from “What, Part 1” illustrates the mechanics of running an “or” filter query, but what problem is that query actually solving? Suppose that you want to add a graph to your embedded customer-facing analytics dashboard showing your customers how many of their clickthrough events came from their “high-value” customers, which you define to be customers who either (a) have a lifetime revenue over $100 or (b) explicitly subscribe to a premium tier. The example query above is solving this problem.

Without “or” filters that would be much trickier to accomplish. You could query the two parts separately and sum them, i.e.:

Chart A: How many clickthroughs from high-value customers?

Double-counted sum of all customers with LTR > $100 and 2,135 premium tier customers

But this will end up double-counting customers who are subscribed to the premium tier and have a lifetime revenue of over $100. You could correct for this (using the inclusion-exclusion principle) by subtracting out this double-counted amount:

Chart B: How many clickthroughs from high-value customers?

7,863 total minus 1,601 (premium tier accounts with LTR > $100) = 6,262

This works, but now you are running three queries to get the result you want - which means three times the compute usage, plus extra load time on your dashboard. So even this simple case illustrates the value of native “or” filters, and in more involved cases (such as an “or” of three or more conditions) the savings in time, cost, and complexity can be great.
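
The arithmetic behind that workaround is the inclusion-exclusion identity |A or B| = |A| + |B| - |A and B|. The sketch below checks it against a handful of made-up in-memory events (the data and helper names are purely illustrative, and no Keen API is called), comparing the three-query workaround with a single “or” condition.

```python
# Made-up events; each one stands for a clickthrough with customer attributes.
events = [
    {"lifetime_revenue": 250, "tier": "premium"},
    {"lifetime_revenue": 250, "tier": "free"},
    {"lifetime_revenue": 30, "tier": "premium"},
    {"lifetime_revenue": 30, "tier": "free"},
    {"lifetime_revenue": 120, "tier": "premium"},
]


def count(predicate):
    """Stand-in for one count query with the given filter."""
    return sum(1 for event in events if predicate(event))


def is_high_ltr(event):
    return event["lifetime_revenue"] > 100


def is_premium(event):
    return event["tier"] == "premium"


# Workaround: three separate "queries".
high_ltr = count(is_high_ltr)
premium = count(is_premium)
both = count(lambda e: is_high_ltr(e) and is_premium(e))

naive_sum = high_ltr + premium   # double-counts events matching both
corrected = naive_sum - both     # inclusion-exclusion

# Native "or": a single "query".
direct = count(lambda e: is_high_ltr(e) or is_premium(e))

print(naive_sum, corrected, direct)  # 6 4 4
```

With more conditions the workaround grows quickly: an “or” of three conditions needs seven separate counts under inclusion-exclusion, versus still just one query with a native “or” filter.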

Why, Part 2: Making Keen a One-Stop Shop

Our mission is to make it as easy as possible for you to turn your data into a valuable resource for your users. Keen already provides a lightweight and low-friction way to do simple analyses, but there are many more scenarios that can be enabled by increased expressivity. The more questions that Keen can answer for your users, and the more efficiently those questions can be structured, the higher the value it can provide. “Or” filters are just one such feature that we’ve recently implemented, and we’ll share many more in the future.

Until Keen can efficiently solve all your analytics needs, we’ll always have our work cut out for us - but we’re making great progress, and we’re happy to have you along for the ride. Drop us a note with any feedback about “or” filters, expressions, or anything else you’d like to see in the product.

Kevin Litwack
Chief Platform Architect

Keen and the EU General Data Protection Regulation (GDPR)

Update on Keen and GDPR Compliance

Keen is deeply committed to doing our part to ensure that personal data is adequately protected. As such, we are actively reviewing the requirements of EU Regulation 2016/679 (more commonly referred to as “GDPR”) and how they affect us and our customers. In this blog post we’ll try to provide as much information and guidance as possible for you to remain in GDPR compliance with Keen.

Our Data Protection Philosophy

Keen stores two different classes of data: (a) the account information of our direct customers, as provided to us via accounts on the website and/or through support channels such as e-mail or chat; and (b) data about our customers’ customers in the form of events submitted to our streams API.

We have designed our system to be resistant to attack against either class of data, but the second category (Keen’s customers’ event data) is more complicated due to the fact that we allow highly flexible content and cannot directly control what information is included or how personally identifiable or sensitive the information or data might be. For this reason we always recommend against the storage of any Personally Identifiable Information (PII) or otherwise sensitive data in event properties.

We believe that most use cases for Keen do not inherently rely on personal data and such data can be anonymized, pseudonymized, or omitted entirely without losing value. As such it is more valuable for our customer base as a whole for us to focus our engineering effort on other aspects of the product, rather than building high-assurance security protections that most customers do not need.

That said, we strive to be as secure as possible, and will continue to improve our security posture. We also recognize that some customers do have legitimate use cases for storing some amount of low-sensitivity PII (such as e-mail or IP addresses, for example), and those require a somewhat more rigorous data protection strategy than what we have in place now. So over the coming months we are making investments to move in that direction.
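
For customers who want to keep raw identifiers out of event properties, one possible approach is to pseudonymize them before the event is ever sent. The sketch below is only an illustration under stated assumptions: the function names, the `_pseudonym` suffix, and the sample event are hypothetical, it uses a keyed hash (HMAC-SHA256) from the Python standard library, and whether this is sufficient for your own compliance obligations is a question for your own legal and security review.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a stable keyed hash of the value (HMAC-SHA256, hex-encoded)."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


def scrub_event(event: dict, pii_fields: set, secret_key: bytes) -> dict:
    """Copy the event, replacing each listed PII field with a pseudonym."""
    scrubbed = {}
    for name, value in event.items():
        if name in pii_fields:
            scrubbed[name + "_pseudonym"] = pseudonymize(str(value), secret_key)
        else:
            scrubbed[name] = value
    return scrubbed


# Hypothetical key and event; in practice the key must be kept secret
# and stored outside of the event data.
key = b"replace-with-a-long-random-secret"
raw_event = {"email": "Ada@Example.com", "plan": "premium", "clicks": 3}

safe_event = scrub_event(raw_event, {"email"}, key)
print(sorted(safe_event))  # ['clicks', 'email_pseudonym', 'plan']
```

Because the same input and key always produce the same pseudonym, events from one user can still be grouped and counted together without the original e-mail address appearing in the stored data.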

How Keen Secures Data Today

Our data protection strategy spans several dimensions: technology, people, and processes.


Technology

The most direct way that we protect data is by limiting access to it using standard industry best practices. All data is stored on hardware in Amazon’s AWS cloud, using a VPC to isolate all servers from the outside internet. These systems can only be accessed via a set of bastion hosts which are regularly updated with the latest security patches, and which can only be connected to using SSH channels secured by a select group of Keen employees’ cryptographic access keys. We’ve also adopted strict requirements around access to the AWS environment itself, including mandatory Multi-Factor Authentication (MFA) and complex passwords.

This structure makes direct access to our internal systems quite difficult for an unauthorized person, but it cannot protect the public-facing endpoints such as our website or our API. We secure these via the access keys available in each Keen Project or Organization, which adhere to cryptographic best practices.

(Please note that we currently do not encrypt traffic between various internal services within our VPC, nor do we encrypt data at rest. Up to this point we have not felt that there was much value in doing so, since the only practical exploit of this would require direct physical access to Amazon infrastructure. However, we do plan to enable basic data-at-rest encryption soon; see roadmap below.)

People

The Keen web UI includes a mechanism by which authorized Keen employees can view customer data directly. This is used to help investigate and address any issues or questions reported to us by customers, as well as occasionally by our operational engineering team to diagnose and mitigate degradation of service. The mechanism is password-protected and limited to those who require it to provide customer support or to fulfill other responsibilities.

We also adhere to a policy of only using this access when it is necessary, and will seek permission before viewing customers’ raw event data. (In rare circumstances where the need is urgent, such as a system-wide outage, we may skip this step, but only as a last resort.)

Currently this “root” access is all or nothing and we rely on our hiring and training processes to mitigate the risk of unnecessary access by a Keen employee. The build-out of a granular access control system is on our roadmap (see below).

Processes

We adhere to the following processes to help ensure that data is kept safe:

  • Access management: when a Keen employee leaves the company, we follow a checklist to ensure that all of their permissions are revoked.
  • Design and code reviews: all changes to the system are reviewed carefully by senior engineers, as well as tested in an isolated staging environment prior to deployment to production.
  • Threat modeling: periodically we review the threat model and try to identify gaps, assess risk, and determine what mitigations (if any) should be prioritized.
  • Automated backups: all data is automatically backed up to Amazon S3 to allow us to recover in the event of a catastrophic loss, whether due to malicious attack or other unexpected events. These backups age out over time, so any data which is removed from the source will eventually no longer appear in the backups. (We currently can’t offer any guarantees about how long this will take for any specific piece of data.)
  • Data retention: Keen stores data for as long as it is necessary to provide services to our customers and for an indefinite period after a customer stops using Keen. In most cases, data associated with a customer account will be kept until a customer requests deletion. (There is also a self-service delete API which is suitable for removing small amounts of data.)

Our Security and Privacy Roadmap

We will be making improvements to all of the above according to the following roadmap.

What we are intending to deliver by the GDPR deadline

GDPR goes into effect on May 25, 2018. Prior to that time Keen intends to:

  • Appoint a Data Protection Officer and a data protection working team
  • Build a formal data map
  • Perform internal threat modeling and gap analysis (and set up a recurring schedule)
  • Adopt and/or formalize written policies around core areas, including (but not necessarily limited to): data protection, data backup, data retention, access management, and breach management and reporting
  • Institute formal data protection training for all Keen employees
  • Encrypt data at rest
  • Schedule an annual security audit with a third-party auditor (although the audit may not be completed until later in 2018)

We also intend to do the necessary legal paperwork to be able to confirm that our Data Sub-processors (primarily Amazon) are GDPR-compliant, and to be able to offer a Data Sub-processor Addendum to the contracts of customers who request it.

What we hope to improve over time

The following are examples of additional security enhancements that will not be addressed by the May 25 deadline:

  • More granular access controls, allowing Keen employees to be granted access according to the Principle of Least Privilege
  • Full data access audit history
  • Lockdown of Keen employee devices, and/or limiting access to customer data to certain approved devices
  • Integration with an intrusion detection system/service
  • Industry certifications

In addition, we expect that threat modeling and gap analysis (both our own and those done by a 3rd party auditor) will identify opportunities to further harden the system and provide redundant layers of risk mitigation. Those will be prioritized and incorporated into our roadmap as appropriate.

Next Steps

Ultimately our goal is to make Keen as valuable as possible to all of our customers. We appreciate your understanding, and also greatly value your input. If you have questions, concerns, or feedback about our approach or how it will affect your own GDPR compliance efforts, please reach out to us!


Order and Limit Results of Grouped Queries (Hooray!)

Greetings Keen community! I’d like to make a quick feature announcement that will (hopefully) make many of you happy 😊

At Keen IO we’ve created a platform for collecting and analyzing data. In addition to the ability to count the individuals who performed a particular action, the API includes the ability to group results by one or more properties of the events (similar to the GROUP BY clause in SQL). For example: count the number of individuals who made a purchase and group by the country they live in. This makes it possible to see who made purchases in the United States versus Australia or elsewhere.

This grouping functionality can be very powerful, but there’s one annoying drawback: if there are many different values for your group_by property then the results can get quite large. (In the example above note all of the tiny slivers representing countries with only a handful of purchases.) What if I’m only interested in the top 5 or 10? Until now the only option was to post-process the response on the client (e.g. using Python or JavaScript) to sort and then discard the unwanted groups.
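
For reference, that client-side clean-up looks something like the following in Python. This is an illustration only: it assumes the grouped response has already been parsed into a list of dictionaries, each holding the group_by value plus a `result` count, and the `top_groups` helper and the sample data are hypothetical.

```python
def top_groups(groups, n, key="result"):
    """Sort groups by their result, largest first, and keep the top n."""
    return sorted(groups, key=lambda group: group[key], reverse=True)[:n]


# Made-up grouped result: purchases counted per country.
purchases_by_country = [
    {"country": "France", "result": 3},
    {"country": "United States", "result": 120},
    {"country": "Iceland", "result": 1},
    {"country": "Australia", "result": 45},
]

for group in top_groups(purchases_by_country, 2):
    print(group["country"], group["result"])
# United States 120
# Australia 45
```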

Today I’m excited to announce that, by popular demand, we’ve made this much easier! We recently added a feature called order_by that allows you to rank and return only the results that you’re most interested in. (To those familiar with SQL: this works very much like the ORDER BY clause, as you might expect.)

The order_by parameter orders results returned by a group_by query. The feature includes the ability to specify ascending (ASC) or descending (DESC) ordering, and allows you to order by multiple properties and/or by the result of the analysis.

Most importantly the new order_by feature includes the ability to limit the number of groups that are returned (again, mirroring the SQL LIMIT clause). This type of analysis can help answer important questions such as:

  • Who are the top 100 game players in the US?
  • What are the top 10 most popular article titles from last week?
  • Which 5 authors submitted the most articles last week?
  • What are the top 3 grossing states based on the sum of purchases during Black Friday?

order_by can be used with any Keen query that has a group_by, which in turn can be used with most Keen analysis types. (limit can be used with any order_by query.) For more details on the exact API syntax please check out the order_by API docs.
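
As a sketch of how the pieces fit together, the snippet below builds a grouped query with `order_by` and `limit`, then emulates the ordering locally on a made-up result. The `property_name`/`direction` shape of each ordering clause, the default of ascending order, the collection and property names, and the `apply_order_by` helper are all assumptions made for illustration; the order_by API docs remain the authority on exact syntax.

```python
# Hypothetical query: top 3 countries by number of purchases.
query = {
    "event_collection": "purchases",
    "timeframe": "this_7_days",
    "group_by": "country",
    "order_by": [
        {"property_name": "result", "direction": "DESC"},
        {"property_name": "country", "direction": "ASC"},  # tie-breaker
    ],
    "limit": 3,
}


def apply_order_by(groups, order_by, limit=None):
    """Local emulation: sort the full result first, then truncate it."""
    ordered = list(groups)
    # Python's sort is stable, so apply the least significant clause first.
    for clause in reversed(order_by):
        descending = clause.get("direction", "ASC") == "DESC"
        ordered.sort(key=lambda g: g[clause["property_name"]], reverse=descending)
    return ordered if limit is None else ordered[:limit]


# Made-up grouped result, as it might look before ordering.
groups = [
    {"country": "NZ", "result": 45},
    {"country": "FR", "result": 3},
    {"country": "US", "result": 120},
    {"country": "AU", "result": 45},
]

top = apply_order_by(groups, query["order_by"], query["limit"])
print([g["country"] for g in top])  # ['US', 'AU', 'NZ']
```

Note that the emulation sorts every group before discarding any of them, which mirrors why ordering and limiting does not reduce the work needed to compute the result.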

There is one important caveat to call out: using order_by and limit in and of itself won’t make your queries faster or cheaper, because Keen still has to compute the full result in order to be able to sort and truncate it. But being able to have the API take care of this clean-up for you can be a real time saver; during our brief internal beta I’ve already come to rely on it as a key part of my Keen analysis toolbox.

I’d like to extend a huge thanks to our developer community for all the honest, constructive feedback they’ve given us over the years (on this issue and many others). You’re all critical in helping us understand where we can focus our engineering efforts to provide the most value. On that note: we have many more product enhancements on the radar for 2018, so if you want to place your votes we’re all ears! Feedback (both positive and negative) on the order_by feature is also welcome, of course. Please reach out to us at any time 🚀

Kevin Litwack | Platform Engineer

Don't Get a Job -- Find a Quest

I’ve been in the software industry for nearly a decade. I’ve worked at all sorts of places. Some were pretty good, and some were less good, but none of them felt truly significant. Sure, they paid the bills, with maybe a few nice perks here and there, and I learned some neat stuff. But why settle for just that?


Five months ago, I decided to join the Keen IO team as a platform engineer. Why? To change the entire world. No, but seriously.


They may take our lives, but they will never take our Keendom!

See, I didn’t want a job. I wanted a quest. A chance to do something huge and real. The possibility to change the world, or at least the right to honestly say I tried. I wanted Good vs. Evil, right against wrong, dramatic music with a voiceover saying, “And the world will never be the same again…”

Find that, I thought, and a job wouldn’t feel like a job — it’d feel like an epic battle for the forces of righteousness.


Basically, I wanted to be Harry Potter.

And, while there are tons of cool, exciting, and successful startups out there, this is the area where Keen really stood out for me.

Our quest is a momentous one. And no, I’m not talking about building a world-class analytics platform (although that’s pretty cool, too).

The real quest Keen is pursuing is this:

To restore sanity to the way that we work for each other.

And by “we,” I mean “all of humanity.”

(I said it was momentous, didn’t I?)


(In this image, Gollum represents capitalism.)

Here at Keen, we’re trying to prove that it’s possible for you and your team to be genuinely, life-affirmingly happy at your work, and at the same time provide exceptional experiences for your customers. (In fact, we’re trying to prove that it’s not only possible to do this, but that this is the best way to do things, for everyone!)

We’re trying to build a culture where you work hard because your team, your partners, and your users are real live human beings, and you care about them, and you want to do right by them.

We’re trying to drive a shift in how we all think about work, and why we do it. Working to make money is fine, but working to make the world a better place makes it a calling, a way of life, a quest.

The thing is, ambitious goals are great, but we also need to actually do things to make a difference. Obviously, we haven’t revolutionized global employment structures (yet), but here are just a few things we already do that I think are pretty amazing:

  • We start from the assumption that everyone on the team wants to add as much value as they can. The goal of any and all process is simply to enable us to do so.
  • We recognize that people do their best work when they’re happy and rested and challenged and fulfilled. “Work-life balance” doesn’t mean you get a few extra days’ vacation — it means your outside life is important, and we’re all here to support your needs, whatever they are.
  • Self-awareness and the ability to communicate both positive and negative emotions are viewed as core competencies. (Yes, even for engineers! ❤)
  • The culture is treated like a flagship product and a major competitive advantage (our “social operating system”), and everyone is responsible for nurturing and growing it. There are no mandates handed down from on high; we’re all building a system where culture comes from the ground up.
  • At the end of the day, we’re working to make our customers happy because we like making people happy. (We also happen to believe that it’s good business, but that’s honestly kind of a logistical detail.)

I’m cynical enough to know that redefining corporate culture and surviving in a competitive market isn’t going to be easy. (Quests rarely are.) We’ll have to overcome all the usual hurdles that startups have to face, while at the same time defining entirely new social structures, and figuring out how to walk that tricky line between co-workers and family.

To do all that, we’re going to have to be open and empathetic and introspective and passionately self-improving — both as individuals and as a group — every single day. We’ll have to confront evil wizards and battle our own inner demons and never, ever give up, even when the whole world is against us.


No, no — this is totally gonna work out fine!

Despite the daunting road ahead, I’ve never felt so good about what I’m doing with my life, because this is a quest that I feel passionate about. How much better would the world be if work were just a framework for making each other happy? To me, that’s a vision worth fighting for.

I hope you have a quest that you’re just as passionate about. If not, I hope you can find one — or join ours! :)

Got questions or ideas about this? We’d love to hear them! Get started in the comments below, or shoot us an e-mail.