Using Expanded Lookup Functionality for Security Use Cases
In this session of The Modern SOC Summit, Akash Kadakia discusses the expanded lookup functionality for security use cases. Akash leads the Security Professional Services Team at Sumo Logic. Today he will go through what's new in the expanded lookups and the difference between classic Sumo lookups and the expanded Sumo lookups. Akash then shows examples of how to use the expanded Sumo Logic lookups for deploying UEBA use cases.
Akash Kadakia: Thank you for joining today's session. My name is Akash Kadakia, and I lead the security side of the professional services team at Sumo Logic. Today, we will be talking about using the expanded lookup functionality for security use cases. For those of you who are not familiar with lookup tables, a lookup table is a table of data hosted on Sumo Logic that you can use to enrich the log data received by Sumo Logic. For example, in a Sumo Logic search, you could refer to a lookup table of user account data to map the user ID in an incoming log to a row in the lookup table and return other attributes of that user, for instance, the email address or phone number. The fields you look up appear as a part of your search results, so you could then use this enriched information during the course of investigations, or for threat detection use cases, right? Let's dig deeper into today's topic, using the expanded lookup functionality for security use cases. As a part of today's agenda, we will be talking about what's new in the expanded lookups and what the difference is between the classic Sumo lookups and the expanded Sumo lookups. We will be going over how you use these expanded Sumo Logic lookups for deploying UEBA use cases. Then we will talk about threat chains and dynamic watchlists, allow lists, and deny lists, and then leave some time aside for Q and A.
What's new in the expanded Sumo lookups? Firstly, the classic Sumo lookups had a limitation of 10 megabytes for the lookup table file size, right? Ten megabytes was enough for most of the use cases, but we did sometimes run into limitations related to the size. The expanded Sumo Logic lookups have a 10X increase in the size, which is now 100 megabytes. A hundred megabytes is large enough to hold quite a lot of data in one single lookup file. Secondly, TTL capabilities, right?
TTL stands for time to live, and the expanded Sumo Logic lookups have TTL capabilities, which means you can hold the entries in a lookup file for a specified duration, right? If your use case requires you to hold the entries in the lookup table for, let's say, a period of 14 days, you set the TTL to 14 days, and after a period of 14 days, the entries in the lookup file will just drop, right? They will be purged. You can also have a static TTL, meaning no TTL, right? You could hold the data in your lookup files forever. That's an option as well. The next thing is the expanded Sumo lookups are now treated as content objects, as opposed to the classic Sumo lookups, which are pretty much invisible to other users, right? When I create a classic lookup, I know where it exists, but no other user in that same account has visibility into that lookup file. With the expanded Sumo Logic lookups, we now treat these as content objects, just like any other dashboard, alert, or search that you have in your account, right? This gives you full visibility into the different lookups existing in your environment. The next thing is role-based access control, and that's the key piece. When I create a lookup, I want to make sure that no one else can update it, but maybe I want to give some users the ability to modify these lookups, or just read through these lookups, right? The full RBAC capability with the expanded Sumo lookups allows you to grant read, write, or full manage permissions. You can assign those permissions either to a role or to a different user. The next piece is major, right? A performance improvement, which, as per the metrics given to us by our engineering team, is 12X. There's a 12X improvement in the expanded Sumo Logic lookups, which now allows you to run these lookups against larger datasets, and the lookup could hold 100 megabytes of data and run against terabytes of data in real time. Lastly, CRUD operations, right?
CRUD stands for create, read, update, and delete. Previously, you could perform these operations from the UI. With the expanded Sumo lookups, you are now able to use the Swagger API to fully automate any CRUD operations, right? This enables you to maintain dynamic lookups. Let's say you want to hold a custom threat intel list in these lookup files; you could just use the API to periodically update these lookup files.
In the next section, let's talk about UEBA use cases. Let me emphasize here that Sumo Logic is not a native UEBA platform, but using the expanded Sumo lookups, we can build several UEBA use cases, right? UEBA is focused on detecting anomalies based on unusual activity, and unusual activity can be based on sudden spikes, or it could just be rare activity, something that's never been seen for that entity. There are several operators in Sumo Logic which could be used to detect anomalous spikes in activity, and they create a behavior profile for an entity at runtime, but detecting something rare, something that's unusual pertaining to that entity, requires maintaining a behavior profile for that entity, which is based on the entity's historical activity. This is where these expanded Sumo lookups can be leveraged to maintain an entity's profile, update it periodically, and detect anomalies in real time. Let's take a look at an example. This is a typical UEBA use case, which is user authentication from rare geolocations. Now, this use case is looking for users who are authenticating from a geolocation that they have never authenticated from before, right? The essence here is you maintain a profile for the user. You maintain a history of the geolocations, or the countries, that the user usually logs in from. The user could be logging in from, let's say, Italy and France, which are neighboring countries; maybe the user's traveling, right?
That's the usual behavior for the user, but one fine day, the user suddenly logs in from Argentina. Now, that's an anomaly for that user because the user has never logged in from that country. How would we go about building that use case using the expanded Sumo lookups? On the left-hand side, the top query here is what I'm using to create and update the profile for the entity, right? The entity in this case is the user. This is running on Office 365 logs. I filter down to authentication events. I enrich the log messages with the country name, region, and city based on the IP address, and this comes out of the built-in geolocation lookup, which also leverages the expanded Sumo lookups, right? Now I get the username and the country that the user has logged in from. I go ahead and save historical information related to every user and what countries they have logged in from, and save it in a lookup file, right? This is the simple command we would use in the query to update the lookup entries. On the right-hand side, what you're seeing is a lookup file, which is the expanded lookup. Now, this is the name of the lookup file. This is the time-to-live, right? Every entry in the lookup file stays in there for a period of 30 days. After 30 days, the entry expires and is dropped from the lookup, unless there is a new entry for the same values in the same lookup, right? If you have something coming in for a user from Romania, which is the country, it will reset the 30-day clock. Now in this lookup file, I have a source user and a country name, right? I'm blurring out the user names for data privacy reasons, but then you have the country that the user has logged in from. Now the user could have logged in from multiple countries, so those will all be separate entries in the lookup file. The second query at the bottom runs in real time on incoming data. As your data comes in, it checks against the lookup file to see if the user has previously logged in from that country.
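The two queries described above (one that saves user and country pairs with a 30-day time-to-live, and one that checks incoming logins against that profile) can be sketched roughly as follows. This is a minimal in-memory Python model of the idea and not the Sumo Logic query syntax shown on the slides; the `TTLLookup` class, the `is_rare_login` helper, and the sample user are hypothetical names introduced only for illustration.

```python
from datetime import datetime, timedelta


class TTLLookup:
    """Hypothetical in-memory stand-in for a lookup table with a time-to-live."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._expires_at = {}  # (user, country) -> expiry timestamp

    def save(self, user, country, now):
        # Writing an existing (user, country) pair resets its TTL clock.
        self._expires_at[(user, country)] = now + self.ttl

    def contains(self, user, country, now):
        expires_at = self._expires_at.get((user, country))
        if expires_at is None or expires_at <= now:
            # Missing or expired entries are treated as absent and dropped.
            self._expires_at.pop((user, country), None)
            return False
        return True


def is_rare_login(profile, user, country, now):
    """A login is anomalous when the pair is not in the user's profile."""
    return not profile.contains(user, country, now)


# Profile-building side: record where the user has logged in from.
profile = TTLLookup(ttl=timedelta(days=30))
day0 = datetime(2021, 1, 1)
profile.save("alice", "Italy", day0)
profile.save("alice", "France", day0)
# A repeat login from Italy on day 20 resets that entry's 30-day clock.
profile.save("alice", "Italy", day0 + timedelta(days=20))

# Detection side: evaluate logins arriving on day 35.
check_time = day0 + timedelta(days=35)
```

In this sketch, a login from Italy on day 35 is still considered usual because the entry was refreshed on day 20, while France has aged out and Argentina was never seen, so both would be flagged.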
How would we do that, right? Similar query, we run it on Office 365. It's running on incoming data, filtering down to the authentication events. Now we pull out the username and IP address, and enrich it with the geolocation information. I'm also leveraging the built-in ASN lookup, which also uses the expanded Sumo lookup functionality, and populating it with the ASN and the organization. Now, once I have the country for the real-time data, I will look up against this existing profile that I created for this use case and check if the user and country combination exists in the lookup file. If it does, there's no problem there, right? That's usual activity for that user. However, if it does not exist in the lookup file, that is what you would consider an anomaly. Now, several UEBA tools all use different mechanisms to maintain behavior profiles for entities, right? Some platforms use Solr indexing to maintain this. In Sumo Logic, you could do the exact same thing using the expanded Sumo lookups.
Let's look at some additional examples of UEBA use cases that could be built using the expanded Sumo lookups. Login from previously unseen countries: this is a similar use case, but in this case, I'm not running it on a specific entity. I'm just seeing what the different countries are that your employees authenticate from. There are certain countries where you have businesses and offices, and that is what you should be seeing in your previously seen countries. Now if a user authenticates from a country that no one in your organization has authenticated from, that is definitely something investigation worthy, right? We're keeping track in the Sumo lookup. We're keeping track of all the countries that your users authenticate from. Similarly, unusual AWS regions is also a use case, where you would look at the different AWS regions where you see activity from.
If you see a region that you have never seen activity from, that is again anomalous activity. The next one is account creation by unauthorized or unusual users. Typically in any organization, it's the responsibility of certain admins and IAM teams to make sure that they take care of any user management activity. Could be account creation, account deletion, right? One fine day, you see a user who is not usually performing user management activities suddenly creating an account. This use case could be done using the Sumo lookups as well. You're just keeping track of the different users that have performed user management, right? It all comes down to maintaining a profile within your organization at an entity level, or it could just be at an organization level. New user account [inaudible]. This could also be a standard use case where you have a list of all the usernames that exist in your environment, right? Now this is where the 100 megabyte limit comes into play. There could be thousands or tens of thousands of user accounts in your environment, and keeping track of all that takes a lot of space. Leveraging the 100 megabyte limit and the 12X performance improvement, you can keep track of every single account that exists in your environment. If you see a new account created or a new user account that has never been seen before, or maybe not been seen in the last 90 days, that is something anomalous. The next use case is, I would say, the most interesting one: user RDP to rare host, right? This is a very common UEBA use case, and it also comes out to be pretty valuable. It's typically one of the indicators for detecting lateral movement. Your users are working on a specific project, and the project is built up with three clusters or a single cluster with three hosts, right? The user typically logs into their workstation, and from the workstation, the user RDPs to the three hosts within the cluster, right?
For that user, there are four entries existing in the expanded lookup, which is the usual behavior of that user. One fine day, you see a user logging into multiple different hosts that the user has never logged into before. This is a very strong indicator of [inaudible], right?
That was for UEBA use cases. Let's talk about threat chains. In order to combat threats that occur over a period of time, we need to be able to tie the relevant alerts together, right? A single alert could be meaningful, but in most cases is not investigation worthy by itself, but when you tie these together, it allows you to prioritize and identify threats that occur over several steps, right? They're not actionable by themselves, but together, they become a high priority security incident. Let's look at how we can use expanded Sumo lookups to stitch these low and slow alerts together. This is a simple representation of a threat chain, where we have rules classified into separate tiers, right? In order for a tier two rule to fire, the tier one rule has to fire, and in order for the tier three rule to fire, both tier one and tier two rules need to fire. All of this has to be for the same entity, right? It has to be correlated back to the same entity. If the tier three rule fires now, this becomes a high priority alert that is investigation worthy. Let's look at a simple example, a two-tiered approach for a data exfiltration use case. What you're looking for here is flight risk users who are exfiltrating a high volume of data from your internal environment, right? In the first tier, we are detecting entities accessing job hosting websites. This rule runs, and this is a pretty high volume rule, right? It's low fidelity. It's going to generate a lot of results. We're going to keep track of every user accessing job hosting websites, and we're going to write it to the Sumo expanded lookup, right?
The lookup is going to keep track of every user who accesses a job hosting website, and let's say we keep that for a period of 30 days, and this is where we leverage the time-to-live feature. The second rule looks for users exfiltrating a high volume of data, but the tier two rule will only fire if the user who is exfiltrating a high volume of data exists in the first lookup that came out of the tier one rule, and it only fires if it exists when it tries to read the lookup. If it exists, that forms the entire threat chain of data exfiltration, and that alert becomes investigation worthy. This is a pretty common use case: a user who could be potentially leaving the organization based on their browsing activity and is also exfiltrating data from the internal network prior to their departure, right? In some cases, it's accidental. The users are not aware of policies. In some cases, it's intentional.
Let's look at a larger example. This is a four-tiered approach. The tier one rule is basically looking for a user account that was created. Now this is normal activity by itself. Nothing interesting coming out of this, right? User accounts are created. The source, that is, the user that creates the account, and the target, which is the account that was created, are both written to the Sumo Logic expanded lookup. The tier two rule runs on top of it, right? It's looking for users added to privileged groups, and this is a privilege escalation phase, where now it will read the first lookup. If the target account exists in the first lookup, only then will it generate an entry in the second lookup, right? It writes the source and the target to the second lookup. In the third tier, it is looking for account deletion. This is a defense evasion phase, right?
The attacker has done their job using the new account, and now, in order to not be detected, they're going to delete that account, but this only fires if the first two tiers have triggered and there's an entry existing in the lookup file, right? Now it reads from there, and if it exists, it creates a third lookup file, where it maintains a history of the source and the target. Now in the tier four rule, which is audit log clearing, which is again a defense evasion phase, it will look for entries in the third lookup. If it exists, this forms your complete threat chain of account compromise, right? You can see how you could leverage the expanded Sumo lookups to build threat chains and UEBA use cases. Also, the expanded lookups, as I was saying, come with additional features like the time-to-live feature, the performance improvements, the increased size, complete visibility, and RBAC. All of that really helps you build these robust security use cases.
In the last section, let's look at dynamic lists using lookup tables, right? We spoke about the Swagger API for automation. That is what you could possibly use to maintain dynamic lookup tables. You could use the API to periodically, let's say daily or once a week, update the lookup file and maintain it for X number of days. The second option is to just run a Sumo query every X number of hours and update the lookup tables, and keep them dynamic, right? What falls under dynamic lists? Watchlists, this is very common: terminated users, users who were recently terminated. You want to keep a close eye on them, right? Any activity coming from terminated users is bad, and you can maintain a list of terminated users in the expanded lookup, right? These could be hundreds of users, and maybe you're looking across all your log sources, which could be 10 terabytes of data a day. In order to run 10 terabytes of data in real time against such a large lookup, you need good performance.
You need to be able to detect this in real time without any significant delays, right? Besides terminated users, upcoming terminations is another one: users or contractors who are due for termination, let's say, in the next 30 days. You want to keep an eye on them, right? This is a watchlist you should keep an eye on, and any exfiltration activity from users with an upcoming termination is an interesting indicator, right? This happens a lot for employees leaving the organization. It's a very common insider threat use case. The next thing is compromised hosts, right? Let's tie this back to threat chains. Let's say you have a threat chain that was fully satisfied and there was indeed a compromised host. You can keep track of those compromised hosts in the expanded lookup. Now, the incident response team came in. They remediated the host and the host was considered to be clean. Now there might still be remnants of compromise on this host, so you want to keep an eye on these compromised hosts, let's say, for the next 180 days, right? You could dynamically populate this list of compromised hosts using queries, and then just look up against these compromised hosts in case there's any suspicious activity. Similarly, privileged accounts or high-privilege accounts: HP accounts could be service accounts, certain admin accounts, or even executives in your organization. You want to keep a close eye on these accounts and dynamically maintain them based on your LDAP information, or it could be just HR data. All of this can be dynamically maintained using the expanded Sumo lookups, where you could automate the whole process of updating these lookups without having to worry about manually updating the entries. Another example is allow lists, right? This is key. Scanners are the main source of false positives. There are certain things you want to maintain dynamically, you want to automate the process, and just allow this activity, right?
Don't send an alert if the entity exists in the allow list. This could be a scanner host, this could be DNS servers, proxy servers, or any authorized businesses or domains. You might have certain partners, or certain businesses that you do business with. You want to whitelist, or rather allow-list, those domains so they are not flagged any time there's an alert coming out. Allow lists are pretty handy and can be maintained dynamically, and similarly, deny lists could also be maintained based on your alerts or threat chains, right? Let's say you have a rule looking for port scanning activity. Just dynamically update the expanded Sumo lookup and maintain a list of those items. Any activity from those IPs is a direct alert you want to look at. Similarly, restricted geolocations: locations where you do not have any business, you do not have any offices. Any activity from those geolocations should be a direct alert, just like competitor domains and also threat intelligence, right? Sumo Logic comes with CrowdStrike threat intel out of the box, which leverages this expanded lookup, but you could also maintain your own custom list using the Swagger API and the 100 megabyte limit, which really allows you to do a lot more with lookups in Sumo. If you want to learn more about lookups, I would suggest going through the help pages to see how you use the new lookups, or the expanded lookups. Also, there are some micro lessons which go deeper into Sumo's built-in lookups, which could be the ASN lookup, the geolocation lookup, or the CrowdStrike threat [inaudible] feeds, right? These are all good resources to brush up on the expanded lookups. If you have any questions, this is my contact information. Feel free to reach out. Thank you so much for joining this session. I'll open up for Q and A.
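As a recap of the two-tier data exfiltration threat chain from the session, the logic might be modeled roughly like this. It is only a sketch in Python, not Sumo Logic query syntax: the function names, the plain dictionary standing in for the tier one lookup, and the byte threshold are all assumptions made for illustration, while the 30-day retention comes from the talk.

```python
from datetime import datetime, timedelta

# 30-day retention for tier one entries, as described in the session.
TIER1_TTL = timedelta(days=30)
# Hypothetical threshold for "high volume"; the talk does not give a number.
EXFIL_THRESHOLD_BYTES = 1_000_000_000


def record_job_site_visit(tier1_lookup, user, now):
    """Tier one: low-fidelity rule, writes the user to the lookup with a TTL."""
    tier1_lookup[user] = now + TIER1_TTL


def tier2_exfil_alert(tier1_lookup, user, bytes_out, now):
    """Tier two: fires only for high-volume transfers by users seen in tier one."""
    if bytes_out < EXFIL_THRESHOLD_BYTES:
        return False
    expires_at = tier1_lookup.get(user)
    # The chain completes only if the tier one entry exists and has not expired.
    return expires_at is not None and expires_at > now


tier1 = {}
day0 = datetime(2021, 1, 1)
record_job_site_visit(tier1, "bob", day0)

# Same user, large transfer ten days later: both tiers are satisfied.
chain_complete = tier2_exfil_alert(tier1, "bob", 5_000_000_000, day0 + timedelta(days=10))
# A different user with no tier one entry does not complete the chain.
no_tier1 = tier2_exfil_alert(tier1, "carol", 5_000_000_000, day0 + timedelta(days=10))
# After the 30-day TTL, the tier one entry no longer counts.
after_ttl = tier2_exfil_alert(tier1, "bob", 5_000_000_000, day0 + timedelta(days=40))
```

Keying both tiers on the same user is what correlates the alerts back to one entity; a longer chain would repeat the same pattern, with each tier reading the previous tier's lookup and writing its own.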