Use case: How to improve the efficiency of cybersecurity teams using AI
Introduction
Cybersecurity is a field that demands constant updating and adaptation to the new threats and vulnerabilities that emerge in the digital world. Cybersecurity teams face increasingly complex and varied challenges, which demand a high capacity for analysis, response and prevention. In this context, artificial intelligence (AI) presents itself as a key tool for improving the efficiency and effectiveness of cybersecurity teams.
In particular, we will focus on large language models (LLMs): AI systems that can process and generate natural language automatically and accurately. These models have multiple applications in the field of cybersecurity, such as:
- The detection of cyberattacks through real-time analysis of data and its context, such as texts, emails, social networks, etc.
- The generation of reports, alerts and recommendations on the security measures to take.
- The automation of repetitive or routine tasks, such as incident classification, handling alert overload, filtering false positives and some manual analysis.
Beyond reducing the operational workload, the use of artificial intelligence addresses one of the biggest problems that security organizations face today: recruiting and retaining qualified security professionals during a chronic talent shortage in the sector. It allows organizations to maintain smaller but highly qualified teams dedicated to tasks of higher added value, which benefits both the professionals and the companies.
This article does not intend to go into the complexities of how AI works at a low level (so some technical inaccuracies may slip in); the objective is for the reader to get a realistic idea of how, with simple integrations, AI can help SOC, DFIR or Threat Hunting teams in their day-to-day tasks.
Basic concepts
To better understand the use cases and flow diagrams, let's first review the essential concepts of artificial intelligence, explained in the simplest (and admittedly somewhat imprecise) way:
· LLM (Large Language Model): LLMs are machine learning models trained on large sets of unstructured data with the aim of understanding, interpreting and generating text in natural language. Some of these models have been on the market for some time, such as GPT4 (OpenAI), Claude2 (Anthropic), Llama2 (Meta) or Bard (Google).
· Token: A token is the basic unit of text (a word or part of a word) used to train and query natural language models such as LLMs. The size of the context (which is limited, as we will see later) is measured in these units, tokens.
· Embeddings: Embeddings are a way of representing words or phrases as numerical vectors in a low-dimensional space. In simple terms, embeddings are a way of making computers understand the meaning of words and phrases in human language. For example, an embedding could represent the word “dog” as a specific numerical vector, which can then be used to train a natural language model to recognize the word “dog” in different contexts.
One of the most interesting practical uses of embeddings, which we will explore in other articles, is that if we use them to "index" our own texts and knowledge bases, we can then search them in natural language. In conventional indexing systems, searches work by term matching: searching for "phishing" returns all the paragraphs and documents where that word appears. However, if our data is "indexed" in a vector database (embeddings) such as ChromaDB or FAISS, we can search by asking questions such as: What are the incidents related to phishing where exfiltration has occurred? A minimal sketch of this idea follows.
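As a purely illustrative sketch (the ticket texts, IDs and collection name are invented), this is how such a natural-language search could look with ChromaDB and its default embedding function:

import chromadb

# In-memory client; ChromaDB embeds the documents automatically
# with its default embedding function.
client = chromadb.Client()
collection = client.create_collection(name="incidents")

# Hypothetical ticket summaries standing in for a real knowledge base.
collection.add(
    ids=["T-1001", "T-1002", "T-1003"],
    documents=[
        "Phishing email led to credential theft; data exfiltrated over HTTPS.",
        "Ransomware detected on a file server; restored from backups.",
        "Phishing campaign blocked at the mail gateway; no user interaction.",
    ],
)

# Natural-language question instead of a keyword match.
results = collection.query(
    query_texts=["incidents related to phishing where exfiltration occurred"],
    n_results=2,
)
print(results["ids"], results["documents"])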
· Context: We could define context as the "short-term memory" of an LLM: everything that surrounds the question we want to ask and gives it enough meaning for the model to answer as accurately as possible. We use context naturally when we use ChatGPT or BingChat; just notice how these tools are able to follow the thread of the conversation and become more accurate as it progresses.
The size of the context varies depending on the model we select, and it is vital that our choice fits our needs (the sketch after this list shows how to count tokens). Some examples of context size, in tokens, are:
- Llama2: 4k
- GPT4: 32k
- Claude2: 100k
- GPT4-Turbo: 128k
- Dolphin Yi 34b: 200k
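Before sending anything to a model, it is worth checking that it fits in the window. A minimal sketch using OpenAI's tiktoken library (tokenizers differ between vendors, so the count is only indicative for other models, and the report text is a placeholder):

import tiktoken

# Tokenizer for the GPT-4 family; other model families tokenize differently.
enc = tiktoken.encoding_for_model("gpt-4")

report = "Suspicious login for user john.doe from an unusual location..."
print(len(enc.encode(report)), "tokens")  # Prompt + context must fit the window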
· Prompt: A prompt is a phrase or set of words used to guide a natural language model, such as an LLM, toward a specific response. Prompts provide context and guidance to the model; in simple terms, they are the questions or instructions we give it to generate a specific response. Each prompt we execute is added to the context until we exhaust the context size or close the session with the LLM.
Use Case: Smart Autogenerated Playbooks
In future articles we will look at different use cases of AI in cybersecurity environments. Here we want to solve one of the classic problems of cybersecurity playbooks: how quickly they become outdated, and how difficult it can be to find the right playbook when you do not yet understand the investigation/incident/alert (I will say "alert" to simplify) that you are facing.
Now, imagine that every time an alert is generated in our system, our AI system understands the context of the alert, searches for all the information on similar alerts and how they were resolved, and generates a customized playbook explaining what the alert is about and how it should be addressed for optimal resolution. (Obviously the playbook has to be customized for each area of cybersecurity, but later we will see how it can be customized by steering the prompt of the question a little.)
Well, in principle this seems simple to approach, except that the LLM we are going to use almost certainly has no idea of the knowledge base accumulated in our company, whether in documents or in management tools. To solve this problem in AI there are two approaches: RAG and model fine-tuning. Which means we have to review a couple more concepts:
· Model Fine-tuning: The concept is simple: choose an existing model, either open source or private, and retrain it with our data. For this we need to generate what is known as a "dataset", which is nothing more than a collection of questions and answers in a specific format that lets the model learn more "natively" about our environment (a sketch of such a dataset appears after this block).
For this fine-tuning work we have private services such as Azure OpenAI Studio, or we can opt to retrain open source models such as Llama2 and its derivatives.
The advantage of this approach is that, if we have a good technical team capable of generating a quality dataset (and, in the open source case, adequate compute), the accuracy and speed of the model's responses will be remarkable.
The disadvantage is that fine-tuning is an expensive and slow process, so the data the model was trained on soon becomes outdated, all the more so in the field of cybersecurity.
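To make the "dataset" idea concrete, here is a minimal, hypothetical sketch in the JSONL chat format used by OpenAI-style fine-tuning jobs (the exact schema depends on the provider, and the question/answer pair is invented):

import json

# Hypothetical question/answer pairs extracted from resolved tickets.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a SOC assistant."},
            {"role": "user", "content": "How was the DOGO phishing incident contained?"},
            {"role": "assistant", "content": "The affected host was isolated and all credentials were reset."},
        ]
    },
]

# One JSON object per line (JSONL), as fine-tuning jobs usually expect.
with open("dataset.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")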
· RAG (Retrieval-Augmented Generation): RAG is the most popular approach nowadays, as it allows us to work with up-to-date data sources, whether public or private, without the need to retrain the model.
We could use a lot of technical vocabulary to explain what RAG is, but we can simplify it by saying that RAG is a method that takes advantage of the context window of an LLM to "dope" the model with the extra information we want it to use when answering.
For example, imagine that we open a session with GPT4-Turbo from OpenAI and ask: "Can you give me a summary of the DOGO incident from last month?". Obviously the model will answer that it does not know what we are talking about. But if, before asking, we paste in the text of the whole report (given that GPT4-Turbo supports 128k tokens, almost 250 pages of report would fit) and then formulate the same question, the model will be able to answer not only this question but any others related to the report.
And here lies the "trick" of RAG: enriching the context optimally to obtain accurate answers personalized to our scope, as in the sketch below.
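A minimal sketch of this trick with LangChain and Azure OpenAI (the deployment name and the report file are hypothetical):

from langchain.chat_models import AzureChatOpenAI
from langchain.schema import HumanMessage

llm = AzureChatOpenAI(deployment_name="gpt-4-turbo", temperature=0)

# Paste the whole report into the context before asking the question.
report_text = open("dogo_incident_report.txt").read()
prompt = (f"{report_text}\n\n"
          "Can you give me a summary of the DOGO incident from last month?")

answer = llm([HumanMessage(content=prompt)])
print(answer.content)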
As a security note, since we are dealing with very sensitive data, we must be careful with the LLM services we use:
- Safe Mode: Use private services such as Azure OpenAI or OpenAI Enterprise, which do not use customer data to retrain their models, share it with third parties, or do anything outside the scope of the client.
- Paranoia Mode: Use an open source model like Llama2 (or its variants) on on-premise equipment, as in the sketch below.
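As a hedged sketch of this "paranoia mode", assuming an Ollama server running locally with the llama2 model already pulled, LangChain can talk to it without any data leaving our equipment:

from langchain.llms import Ollama

# The model runs entirely on local hardware; no ticket data leaves the environment.
llm = Ollama(model="llama2")
print(llm("Summarize the MITRE ATT&CK tactic TA0001 in two sentences."))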
Use case design
For this use case we will use RAG as it is the simplest, fastest and cheapest approach to set up our automatic playbook system.
For this use case we will need the following elements:
- A ticketing system where we have registered the alerts or incidents with their corresponding follow-ups and solutions.
- For the AI to be able to generate good playbooks, it is very important that our technicians document those tickets precisely and professionally when closing them; that is, alerts should not be closed with the typical "Resolved", but with an explanation, even a brief one, of how the ticket was resolved.
- A minimum base of tickets to work with.
The flow would be as follows:
Let's detail these 6 simple steps of our automatic playbook:
- Step 1: The ticketing system receives a new alert from our cybersecurity systems.
- Step 2: The AI system collects the data of the new alert to start generating a context.
- Step 3: Based on the alert name, the user, the machine, the commands used… or whatever each company considers relevant, the AI system retrieves the data of tickets related to that alert and adds it to the context.
- Step 4: The AI system takes all that previous context, generates a prompt with the appropriate questions to force the structure of the desired playbook, and executes it against the LLM.
- Step 5: The AI system generates the playbook and attaches it to the ticket.
- Step 6: The user can now use the new playbook to carry out their investigation more efficiently.
This flow can be implemented in a few lines with frameworks like LangChain. An example code would be:
from dotenv import load_dotenv
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chat_models import AzureChatOpenAI
from libs.libticketing import ticket_login, get_ticket
from libs.libticketing import get_related_text, post_playbook

load_dotenv()  # Load environment variables (Azure OpenAI credentials, etc.)

ticketid = "107545"     # Incident ID (for this example)
model = "gpt-35-turbo"  # Azure OpenAI deployment name

token = ticket_login()                    # Log in to the ticketing system
alert = get_ticket(token, ticketid)       # Get the new alert's information
related = get_related_text(token, alert)  # Get the text of related tickets

# Context template
template = """Question:
Acting as Morgoth, a cybersecurity Advanced AI, evaluate the information
step by step before answering the questions at the end of the text:
{alert}
{related}
Answer the following in a structured way:
- First, could you explain in a short summary what is happening
in this alert?
- Make a list of all related tickets, following these rules:
  - Use bullet point format.
  - Choose a maximum of the five most relevant related tickets.
    No more than five.
  - Read carefully the follow-ups and solutions of the previous tickets
    and make a brief summary.
  - Each bullet only with Ticket ID, Type of Positive, and a
    short description of the solution.
- And the most important part: could you recommend
how this alert should be solved?
Answer: """

prompt = PromptTemplate(template=template,
                        input_variables=['alert', 'related'])
llm = AzureChatOpenAI(deployment_name=model, temperature=1,
                      verbose=True, max_tokens=2048)
llm_chain = LLMChain(llm=llm, prompt=prompt)

answer = llm_chain.run(alert=alert, related=related)  # Generate the playbook
print(answer)
post_playbook(token, answer)  # Attach the playbook to the ticket
This small example can be quickly implemented in the workflow of any cybersecurity team that uses management tools such as ServiceNow, Jira, GLPI… and would generate a small playbook like the following for each open ticket:
In this alert, it is reported that a user named john.doe has successfully logged in to internet-facing hosts through a phishing attack. MDATP has identified this activity as a high-severity initial access attack using tactic TA0001.
List of Related tickets:
- TicketID: #12345, Type of Positive: Suspicious Activity, Description: A similar alert was generated on the same day, which highlighted suspicious activity on a corporate device. In this incident, it was identified that the device was compromised through a phishing attack. Immediate remediation steps were taken by disconnecting the device from the network and conducting a thorough investigation.
- TicketID: #23456, Type of Positive: Malware, Description: Another alert generated a few days before this incident highlighted malware activity on the system. The malware was identified as a RAT and was used to gain unauthorized access to the system from a remote server. The device was immediately isolated from the network and all credentials were reset.
- TicketID: #34567, Type of Positive: Account Compromise, Description: This alert identified that an account belonging to a senior employee was compromised and used to access sensitive files. The investigation found that the attack vector was a phishing email, and the remediation steps involved resetting the compromised user account and advising all users to be vigilant against suspicious emails.
- TicketID: #45678, Type of Positive: Credential Theft, Description: This alert highlighted that a user’s credentials were stolen and used to log in to a corporate device. The investigation found that the user had fallen victim to a phishing email, and their credentials were used to gain unauthorized access to the system. The device was immediately isolated from the network, and the user was advised to change their password and enable multi-factor authentication.
- TicketID: #56789, Type of Positive: Insider Threat, Description: This alert highlighted suspicious activity on a corporate device belonging to an employee who had recently resigned. It was identified that the employee had accessed sensitive files and attempted to transfer them to an external drive. The device was immediately disconnected from the network, and the employee’s access to sensitive files was revoked.
Based on the related tickets, it is recommended that immediate remediation steps are taken, including disconnecting the device from the network and conducting a thorough investigation to identify the extent of the damage. All credentials for the affected user should be reset, and multi-factor authentication should be enabled. Additionally, all users should be advised to be vigilant against suspicious emails, and regular security awareness training should be provided. Further, it is recommended to conduct a threat hunting exercise to identify similar incidents in the environment.
In future articles we will add embeddings and vector databases to improve the accuracy in the search for related solutions.
Conclusion
We can conclude that we are entering a new stage in the world of cybersecurity in which AI will play a fundamental role, not only on the threat side (that too), but in helping cybersecurity teams become much more efficient and fast in their response. As we have seen in this use case, and as we will see in those to come in future articles, it is not difficult, so there is no excuse not to start using these technologies that are at our disposal.