Understanding the Prior and Posterior in Bayesian Inference for Anomalous User Behavior Detection
Today, I want to dive deeper into the technical details of how we calculate the values of \(\alpha_{prior}\) and \(\beta_{prior}\) in our Bayesian inference model at Fortscale. In my previous posts, I explained how we use these values to incorporate prior knowledge and prevent false alerts for users who have never acted anomalously.
The prior is a crucial component of Bayesian inference, as it lets us fold existing knowledge into the probabilities we calculate. In our case, it addresses the problem that a user whose history contains only zero SMART values would otherwise trigger an alert on any positive value. By setting the right values for \(\alpha_{prior}\) and \(\beta_{prior}\), we can strike a balance between incorporating organizational knowledge and giving weight to the user’s actual data.
To determine the values of \(\alpha_{prior}\) and \(\beta_{prior}\), we need to consider the organization’s overall level of anomalous activity. When anomalous activities are common, the threshold at which an analyst finds an activity interesting is higher. We can simulate this effect through the prior by setting \(\alpha_{prior}\) to the number of SMART values observed in the organization and \(\beta_{prior}\) to their sum. This way, the prior encodes how much anomalous activity the organization sees.
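To make this concrete, here is a minimal sketch of that straightforward mapping. It is illustrative only: the function and variable names are mine, not taken from our production code.

```python
def naive_prior(org_smart_values):
    """Prior taken directly from the organization's SMART values:
    alpha_prior is the number of values, beta_prior is their sum."""
    alpha_prior = len(org_smart_values)
    beta_prior = sum(org_smart_values)
    return alpha_prior, beta_prior
```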
However, setting \(\alpha_{prior}\) too high makes the prior too influential, so the user’s own data has little impact on the calculated probability. To address this, we experimented with real-life data and found that setting \(\alpha_{prior}\) to a reasonably small number, such as 20, while updating \(\beta_{prior}\) to be \(\alpha_{prior}\) times the average of the organization’s SMART values, strikes the right balance.
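A corresponding sketch of the tuned variant (again with illustrative names; the fixed value of 20 is simply the number that worked well in the experiments described above):

```python
def tuned_prior(org_smart_values, alpha_prior=20):
    """Tuned prior: a small, fixed alpha_prior, with beta_prior scaled
    to alpha_prior times the organization's average SMART value."""
    org_average = sum(org_smart_values) / len(org_smart_values)
    beta_prior = alpha_prior * org_average
    return alpha_prior, beta_prior
```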
Choosing a smaller \(\alpha_{prior}\) reduces the prior’s influence, letting the user’s own data shape their threshold while still accounting for the organization’s level of anomalous activity. The variance of the prior also increases, which leaves room for uncertainty in the expected value. This balance between the organization’s knowledge and the user’s data is what personalizes the threshold and reduces false alerts.
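To see why this balances the two sources of information, consider the following sketch. It assumes the posterior parameters are formed by the standard conjugate update, \(\alpha_{posterior} = \alpha_{prior} + n\) and \(\beta_{posterior} = \beta_{prior} + \sum\) of the user’s SMART values; that update rule is an assumption of this illustration, not a detail spelled out in this post.

```python
def personalized_scale(user_smart_values, alpha_prior, beta_prior):
    """Illustrative sketch assuming the standard conjugate update:
    alpha_post = alpha_prior + n, beta_post = beta_prior + sum(user values).
    The ratio beta_post / alpha_post blends the organization's average
    SMART value with the user's own average, weighted by alpha_prior vs. n."""
    n = len(user_smart_values)
    alpha_post = alpha_prior + n
    beta_post = beta_prior + sum(user_smart_values)
    return beta_post / alpha_post
```

Under that assumed update, with \(\alpha_{prior} = 20\) a user who has contributed 20 SMART values gives equal weight to the organization’s average and to their own; as their history grows, their own data gradually dominates.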
In conclusion, calculating the values of \(\alpha_{prior}\) and \(\beta_{prior}\) in our Bayesian inference model requires careful consideration of the organization’s level of anomalous activities and the desired influence of the user’s data. By striking the right balance, we can effectively detect and prevent insider threats while minimizing false alerts and maintaining personalized thresholds.