Does Your AI Know The Difference Between Order and Chaos?
If you haven’t engineered features that measure how regular your events and transactions are, your AI won’t know how to react to timestamped data.
Welcome to the fascinating world of order and chaos. Some people in this world are highly structured. The clock and calendar determine their daily experiences, and they always arrive on time. Others are more free-spirited, following their whims, and tend to arrive late, if they arrive at all. In some cities, the buses and trains run like clockwork, on time every time. In others, you cross your fingers and hope one arrives soon.
Knowing the regularity of transactions, behaviors, and events is essential for use cases such as fraud detection, predictive maintenance, marketing, and supply chain management.
Data scientists use regularity metrics that quantify the frequency, repeatability, and consistency of timestamp components. Here are a couple of examples:
- The entropy of the day of the week name (e.g. Mon, Tue, Wed…) component of event timestamps
- The proportion of events that occur on a weekend
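As a rough illustration of these two metrics, here is a minimal Python sketch using only the standard library. The function names and sample timestamps are made up for this example and are not taken from any particular SDK.

```python
import math
from collections import Counter
from datetime import datetime


def day_of_week_entropy(timestamps):
    """Shannon entropy (in bits) of the day-of-week distribution of events."""
    counts = Counter(ts.weekday() for ts in timestamps)  # Monday=0 ... Sunday=6
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def weekend_proportion(timestamps):
    """Share of events that fall on a Saturday or Sunday."""
    if not timestamps:
        return 0.0
    return sum(ts.weekday() >= 5 for ts in timestamps) / len(timestamps)


# Hypothetical data: one customer transacts every Monday,
# another transacts once on each day of a single week.
every_monday = [datetime(2024, 1, day, 9, 0) for day in (1, 8, 15, 22)]
every_day = [datetime(2024, 1, day, 9, 0) for day in range(1, 8)]

regular_entropy = day_of_week_entropy(every_monday)  # 0.0: fully predictable
spread_entropy = day_of_week_entropy(every_day)      # log2(7): evenly spread
weekend_share = weekend_proportion(every_day)        # 2 of 7 events
```

Low entropy means the events concentrate on a few days of the week; the maximum, log2(7) or about 2.81 bits, means they are spread evenly across all seven days.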
To feature engineer a regularity signal, here are a few tips:
- Use a metric that standardizes for the number of events, e.g. entropy or proportions
- Choose a time window that is long enough to capture all seasonal variations, preferably covering more than one full cycle of events
- Use your domain knowledge and experiment with applying regularity metrics to a subset of events
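The tips above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: events are represented as hypothetical (timestamp, event_type) pairs, the 28-day window is chosen to cover four weekly cycles, and the function names are invented for the example.

```python
import math
from collections import Counter
from datetime import datetime, timedelta


def weekday_entropy(timestamps):
    """Shannon entropy (in bits) of the day-of-week distribution."""
    counts = Counter(ts.weekday() for ts in timestamps)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Probabilities, not raw counts, so the result does not grow with volume.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def windowed_weekday_entropy(events, window_end, window_days=28, event_type=None):
    """Entropy over a fixed look-back window, optionally for one event type.

    `events` is an iterable of (timestamp, event_type) pairs (assumed layout).
    """
    window_start = window_end - timedelta(days=window_days)
    selected = [
        ts
        for ts, kind in events
        if window_start <= ts < window_end
        and (event_type is None or kind == event_type)
    ]
    return weekday_entropy(selected)


# Hypothetical customer: logs in every day but only purchases on Mondays.
logins = [(datetime(2024, 1, d, 12, 0), "login") for d in range(1, 29)]
purchases = [(datetime(2024, 1, d, 9, 0), "purchase") for d in (1, 8, 15, 22)]
events = logins + purchases
window_end = datetime(2024, 1, 29)

all_events = windowed_weekday_entropy(events, window_end)
logins_only = windowed_weekday_entropy(events, window_end, event_type="login")
purchases_only = windowed_weekday_entropy(events, window_end, event_type="purchase")
```

In this made-up data the purchase subset has zero entropy while the full event stream is close to the maximum, which is the kind of pattern that only shows up when you apply the metric to a subset of events.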
Regularity metrics require multiple observations across time, so use a tool that gives data scientists access to the detailed data within databases – don’t rely on aggregated data extracts that hide important patterns. We’ve built an open-source feature engineering library that makes it easy to calculate regularity metrics inside your database. Click here to access our open-source SDK, with worked examples in Python.