In creating products and developing our expertise, we are guided primarily by the desire to improve the security of companies. However, our research is driven by more than customer care. For quite some time we have wanted to do research for the information security community on a volunteer basis, and now we are actively doing so: we publish network detections for high-profile attacks on Twitter, supply traffic analysis rules to the ANY.RUN service, and contribute to the ETOpen ruleset. There are many open source projects to which you can send a pull request, but until recently we had not gotten around to host-based detections.

Then we learned that a group of enthusiasts had decided to hold a two-week sprint to write rules for the Sigma project, which was created to develop a single format for describing rules for SIEM systems and is supported by more than 140 contributors. The news caught our interest because, as a SIEM vendor, we closely follow the development of the community.

Imagine our surprise when the organizers contacted us and invited the PT Expert Security Center team to take part in the sprint! The participants of the event formed the Open Security Collaborative Development (OSCD), an international initiative of information security experts aimed at spreading knowledge and improving computer security in general. We gladly agreed to participate and apply our experience for the benefit of security as a whole.

How this article appeared

When we started writing rules, we realized that there is no comprehensive description of the Sigma rule syntax, especially in Russian. The main sources of knowledge are GitHub and personal experience. There are some good articles (in Russian and in English), but they focus less on the rule syntax than on analyzing where Sigma rules are applicable or on creating one specific rule. We decided to make it easier for beginners to get acquainted with the Sigma project, to share our own experience, and to collect information about the syntax and how to use it in one place. And of course, we hope this will help the OSCD initiative expand and allow a large community to form in the future.

Since there was a lot of material, we decided to publish the description as a series of three articles:

  1. A short introduction, an example of creating a simple rule, and a description of event sources (the article you are reading now).
  2. A description of the detection logic. This is the most important part of the syntax; you need to know it to understand existing rules and write your own.
  3. A description of meta-information (attributes that are informative or infrastructural in nature, such as a description or identifier) and rule collections.

What is the Sigma format and why is it needed

Sigma is a unified format for describing detection rules based on log data. Rules are stored in separate YAML files. Sigma lets you write a rule once in a unified syntax and then use a dedicated converter to obtain the rule in the syntax of a supported SIEM system. In addition to query syntax for various SIEM systems, the converter can generate queries of the following types:

  • an Elasticsearch query,
  • a grep command line with the necessary parameters,
  • a PowerShell command line for querying Windows audit logs.

The last two types are notable because they require no additional software for analyzing logs: grep and PowerShell are available out of the box on Linux and Windows, respectively.

Having a single format for describing log-based detections makes it easier to share knowledge, develop open-source security tooling, and help the information security community deal with emerging threats.

General Syntax

First of all, a rule has mandatory and optional parts. This is described in the official wiki on GitHub. The outline of a rule (source: official wiki) is presented below:

[Figure: outline of a Sigma rule, from the official wiki]

Almost any rule can be divided into three parts:

  1. attributes describing the rule (meta-information);
  2. attributes describing data sources;
  3. attributes describing the conditions for the rule to be triggered.

Each of these parts corresponds to a required high-level attribute: title, logsource and detection, respectively. In addition to title, the first group also includes the remaining optional high-level attributes.
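As an illustration, the three parts might be laid out as in the skeleton below. This is a minimal sketch rather than the wiki's own figure: every value is a placeholder invented for this example, and the selection name and the field inside it are arbitrary.

```yaml
# Part 1: meta-information (title is required, the other attributes are optional)
title: Example Rule Title                       # placeholder
id: 00000000-0000-0000-0000-000000000000        # placeholder UUID
status: experimental
description: Placeholder description of what the rule detects

# Part 2: data sources (required)
logsource:
    category: process_creation
    product: windows

# Part 3: conditions under which the rule triggers (required)
detection:
    selection:                                  # arbitrary name chosen by the rule author
        CommandLine|contains: 'example-substring'   # placeholder value
    condition: selection
```

Only the three top-level keys title, logsource and detection have to be present; everything else in the first part can be added as needed.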

The rule structure has another feature worth mentioning. Rules are written in YAML, and the YAML format allows several YAML documents to be placed in one file. Sigma developers took advantage of this: several rules can be combined in one file to create a "rule collection". This approach is convenient when there are several ways to detect an attack and you do not want to duplicate the descriptive part (as the corresponding section will show, more than just the descriptive part can be shared).

In this case, the rule is conditionally divided into two parts:

  • part with common attributes for collection elements (usually all fields except the logsource and detection sections),
  • one or more parts with a description of the detection (logsource and detection sections).

If the file contains a single rule, this statement is also true, since we get a degenerate collection from one rule. The rule collections will be discussed in detail in the third part of the article series.
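A collection might look roughly like the sketch below. It assumes the convention from the Sigma specification in which the shared document is marked with action: global and documents are separated by ---; the title, description and the file name being matched are placeholders invented for this example.

```yaml
# Document 1: attributes shared by every rule in the collection
action: global
title: Example Collection Title                 # placeholder
status: experimental
description: Placeholder description shared by both detections
---
# Document 2: detection based on the Windows Security log
logsource:
    product: windows
    service: security
detection:
    selection:
        EventID: 4688
        NewProcessName|endswith: '\example.exe'   # placeholder value
    condition: selection
---
# Document 3: the same detection based on Sysmon events
logsource:
    product: windows
    service: sysmon
detection:
    selection:
        EventID: 1
        Image|endswith: '\example.exe'            # placeholder value
    condition: selection
```

The descriptive attributes are written once, while each following document contributes only its own logsource and detection sections.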

Next, we consider an example of a hypothetical rule. Note that comments like these are not normally used in rules; they are here only to describe the fields.

Description of the model rule

[Figure: a model rule with comments describing each field]

Example of creating a Sigma rule

Before describing the details of the syntax and the capabilities of Sigma rules, let us look at a small example of creating such a rule, to understand where particular attribute values come from in practice. There is a good article in English on this subject. If you have already tried writing your own rules and have figured out what data goes into the attributes of the YAML file, you can skip ahead to the section with the detailed description of event sources (we will also call them log sources).

We describe the creation of a rule that detects the use of SettingSyncHost.exe as a Living Off The Land Binary (LOLBin). Creating a rule usually consists of three steps:

  1. conducting an attack and collecting the necessary logs,
  2. description of the detection in the form of a rule,
  3. checking the created rule.

Conducting an attack

The idea for the rule is well described in the Hexacorn blog. A careful reading shows which steps are needed to reproduce the result described there:

  1. Copy the program you want to run to any writable directory. The article suggests %TEMP%, but you can choose any path you like. Keep in mind that a subdirectory with the name you specify in step 4 will be created in this directory.
  2. Rename the program you want to run to one of the names listed in the article (wevtutil.exe, makecab.exe, reg.exe, ipconfig.exe, settingsynchost.exe, tracelog.exe). Analysis of the logs showed that the name findstr.exe can also be used in addition to this list. The file must have one of these names because SettingSyncHost.exe is vulnerable to Binary Search Order Hijacking (MITRE ATT&CK ID: T1574.008).
  3. Make the selected directory the current working directory for all processes you run from now on (if you run settingsynchost.exe via cmd or PowerShell, simply run the CDMY0CDMY command).
  4. Run the command: CDMY1CDMY
  5. As a result, the executable file is launched by the legitimate program SettingSyncHost.exe.

[Figure: screenshot]

Sysmon was installed on the system with the configuration file from the sysmon-modular project, so the logs were collected automatically. Which logs are useful for writing the detection will become clear as the rule is written.

Detection description as a Sigma rule

Two approaches are possible at this step: find the existing rule closest in detection logic and adapt it to your needs, or write a rule from scratch. When you are starting out, the first approach is recommended. For clarity, we will write our rule using the second.

We create a new file and try to describe its essence briefly and succinctly in the name, following the style of existing rules. In our case, we chose the name win_using_settingsynchost_to_run_hijacked_binary.yml. Next, we fill it with content, starting with the meta-information at the beginning of the rule. We already have all the data needed for this.
We briefly state which attack the rule detects in the CDMY2CDMY field and give a more detailed explanation in the description field. For new rules, it is customary to set status: experimental. The unique identifier can be generated in various ways; on Windows, the easiest is to run the following code in the PowerShell interpreter:

PS C:\> "id: $(New-Guid)"
id: b2ddd389-f676-4ac4-845a-e00781a48e5f

The remaining fields speak for themselves. We only note that it is desirable to include links to the sources that helped you understand the attack. This helps people who later try to understand the rule, and it pays tribute to the effort the author of the original research put into describing the attack.

Our rule at this stage is as follows:

[Figure: the rule with its meta-information filled in]
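For orientation, the meta-information at this stage might look something like the sketch below. Only the identifier, the status value and the technique ID come from the text; the title and description wording, the tag format and the reference entry are assumptions made for this example.

```yaml
title: Using SettingSyncHost.exe as LOLBin          # assumed wording
id: b2ddd389-f676-4ac4-845a-e00781a48e5f            # the GUID generated with New-Guid
status: experimental                                # customary for new rules
description: Detects the use of SettingSyncHost.exe to run a hijacked binary   # assumed wording
references:
    - '<link to the Hexacorn blog post>'            # placeholder, replace with the real URL
tags:
    - attack.t1574.008                              # technique ID from the text; tag format assumed
```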

Next, you need to describe the log sources. As mentioned above, we rely on Sysmon logs; however, since generalized categories were introduced, it is customary to use the process_creation category for process creation events. Generalized categories are described in more detail below. Note that the definition field is customarily used for comments and tips on configuring the sources, such as specifics of the Sysmon configuration:

[Figure: the rule with the logsource section added]
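A log source section along these lines would match the description above. Treat it as a sketch: the category and product follow from the text, while the wording of the definition note is invented for this example.

```yaml
logsource:
    category: process_creation      # generalized category for process creation events
    product: windows
    # free-text hint for whoever deploys the rule (assumed wording)
    definition: 'Requires process creation logging, for example Sysmon with a suitable configuration'
```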

Now you need to describe the detection logic. This is the most time-consuming part. This attack can be detected in many ways, and our example does not claim to cover all of them, so we describe one possible option.

Looking at the recorded events, you can build the following chain.
First, a process (PID: 4712) was started with the command line c:\windows\system32\SettingSyncHost.exe -LoadAndRunDiagScript join_oscd

[Figure: process creation event for SettingSyncHost.exe]

Note that the current working directory of the process is the user's TEMP directory.

Next, the running process creates a batch file and starts executing it.

[Figure: event showing the batch file being created]

[Figure: event showing the batch file being executed]

The process executing the batch file instructions received the identifier 7076. Further analysis of the events shows ipconfig.exe being launched; it lacks the meta-information typical of system files and, on top of that, is located in a folder with temporary files:

[Figure: process creation event for ipconfig.exe]

We propose treating the following as a sign of the attack: the launch of a process whose executable file is not in the system directory (C:\Windows\System32) while the command line of its parent process contains the substrings "cmd.exe /c", "RoamDiag.cmd" and "-outputpath". We describe this in Sigma syntax and get the final rule (a detailed analysis of the constructs that can be used to describe detection logic will be given in the next part of our series of articles about Sigma):

[Figure: the final rule]
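The detection criteria described above could be written in Sigma roughly as follows. This is a sketch under assumptions, not a reproduction of the original figure: the selection names system_utility and parent_is_settingsynchost are made up here, and the substrings are taken from the preceding paragraph.

```yaml
detection:
    # executables located in the system directory are considered legitimate
    system_utility:
        Image|startswith: 'C:\Windows\System32\'
    # the parent command line must contain all three substrings
    parent_is_settingsynchost:
        ParentCommandLine|contains|all:
            - 'cmd.exe /c'
            - 'RoamDiag.cmd'
            - '-outputpath'
    # trigger when the parent matches and the child is NOT a system utility
    condition: not system_utility and parent_is_settingsynchost
```

Splitting the logic into a filter and a selection keeps the condition readable: the exclusion is expressed once with not, instead of enumerating every suspicious path.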

Checking that the rule works

Run the converter to produce a PowerShell query:

[Figure: converter output, a PowerShell query]

In our case, this query does not give the desired result, because the exclusion filter also matches the path to the executable image of the parent process. Therefore, we simply specify that the word Image must not be preceded by the letter t, the last letter of the word Parent:

[Figure: the corrected query and its result]

We see that our event was found. The rule works.

This is how Sigma rules are created in practice. Next, we describe in detail the fields responsible for detection, starting with the description of log sources.

Detection Description

The main part of a rule is the description of the detection, since this is where the information about where and how to look for signs of an attack is kept. This information is contained in the logsource (where) and detection (how) attributes. In this article, we take a closer look at the logsource section; the detection section will be described in the next part of our series.

Description of the event sources section (logsource attribute)

The event sources are described in the value of the logsource field. This section describes the data sources that supply events for the detection section (the detection attribute is discussed in the next part). It describes the source itself, the platform and the application needed for detection. It may contain three attributes that are processed automatically by the converters, plus an arbitrary number of optional elements. Main fields:

  • Category - describes classes of products. Example values: firewall, web, antivirus. The field may also contain generalized categories, which are described below.
  • Product - the software product or operating system that creates the logs.
  • Service - restricts the logs to a specific subset of services, for example "sshd" for Linux or "Security" for Windows.
  • Definition - an additional field for describing specifics of the source, for example audit configuration requirements (rarely used; an example of a rule with this field can be found on GitHub). It is recommended to use this attribute if the source has any special features.
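As a small illustration of how these fields combine, the fragment below narrows the source to the sshd service on Linux, using the example value mentioned in the list. It is only a sketch; the text of the definition note is invented for this example.

```yaml
logsource:
    product: linux          # operating system that produces the logs
    service: sshd           # restrict the logs to a single service
    # optional free-text note about source specifics (assumed wording)
    definition: 'Placeholder note about how logging must be configured'
```

No category is given here, since product and service already identify the source; the attributes can be used in any combination that describes it.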

The official wiki on GitHub defines a set of values for these fields that must be used for rules to be portable across products. They are tabulated below (a dash means the field is not set).

Category          Product  Service
-                 windows  security
-                 linux    auth
-                 apache   access
process_creation  windows  -

Next, we describe some log sources in more detail, listing the event fields used, and give examples of rules in which these fields appear.

Proxy category event fields

Category  Product/Service  Field        Example
proxy     -                c-uri        proxy_ursl
                           cs-bytes     -
                           cs-cookie    proxyaaama
                           cs-host      prosp_calt
                           cs-method    proxy_radle
                           r-dns        proxy_apt40.yml
                           cs-referrer  -
                           cs-version   -
                           sc-bytes     -
                           sc-status    proxy_ursn
                           src_ip       -
                           dst_ip       -

Description of the event fields for this source

c-uri - URI requested by the client
c-uri-extension - URI extension; usually the extension of the requested file
c-uri-query - Query part of the URI (the parameters that follow the "?")
c-uri-stem - Usually the part of the URI from the host (or host:port) up to the query string; most often it contains the path to the resource relative to the root directory of the web server
c-useragent - UserAgent header in the client HTTP request
cs-bytes - Number of bytes sent from the client to the server
cs-cookie - Cookie values that the client sends to the server
cs-host - Host header in the client HTTP request
cs-method - Method of the client HTTP request
r-dns - DNS name of the requested server
cs-referrer - Referrer header in the client HTTP request
cs-version - HTTP protocol version used by the client
sc-bytes - Number of bytes sent from the server to the client
sc-status - HTTP response code
src_ip - Client IP address
dst_ip - Server IP address
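To show how these fields are referenced, here is a hypothetical proxy rule fragment. It is not taken from the Sigma repository: the title, the domain and the method are placeholders chosen for illustration.

```yaml
title: Example POST Request to a Watched Domain     # hypothetical rule
logsource:
    category: proxy
detection:
    selection:
        cs-method: 'POST'                    # client HTTP request method
        r-dns|endswith: '.example.com'       # placeholder domain
    condition: selection
```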

Firewall category event fields

Category  Product/Service  Field     Example
firewall  -                src_ip    -
                           src_port  -
                           dst_ip    -
                           dst_port  net_high_dns_bytes_out.yml
                           username  -

Description of the event fields for this source

src_ip - Client IP address
src_port - Port from which the connection is made
dst_ip - Server IP address
dst_port - Port to which the connection is made
username - Name of the user who is connecting
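A hypothetical firewall rule fragment using these fields might look as follows. The title and the port numbers are placeholders for illustration, not a recommendation.

```yaml
title: Example Outbound Connection to a Watched Port    # hypothetical rule
logsource:
    category: firewall
detection:
    selection:
        dst_port:               # a list means "any of these values"
            - 4444              # placeholder port
            - 31337             # placeholder port
    condition: selection
```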

Web server category event fields

Category   Product/Service  Field            Example
webserver  -                c-uri            web_cve_2020_0688_msexchange.yml
                            c-uri-extension  -
                            c-uri-query      -
                            c-uri-stem       -
                            c-useragent      -
                            cs-bytes         -
                            cs-cookie        -
                            cs-host          -
                            cs-method        web_cve_2020_0688_msexchange.yml
                            r-dns            -
                            cs-referrer      -
                            cs-version       -
                            sc-bytes         -
                            sc-status        -
                            src_ip           -
                            dst_ip           -

Description of the event fields for this source

c-uri - URI requested by the client
c-uri-extension - URI extension; usually the extension of the requested file
c-uri-query - Query part of the URI (the parameters that follow the "?")
c-uri-stem - Usually the part of the URI from the host (or host:port) up to the query string; most often it contains the path to the resource relative to the root directory of the web server
c-useragent - UserAgent header in the client HTTP request
cs-bytes - Number of bytes sent from the client to the server
cs-cookie - Cookie values that the client sends to the server
cs-host - Host header in the client HTTP request
cs-method - Method of the client HTTP request
r-dns - DNS name of the requested server
cs-referrer - Referrer header in the client HTTP request
cs-version - HTTP protocol version used by the client
sc-bytes - Number of bytes sent from the server to the client
sc-status - HTTP response code
src_ip - Client IP address
dst_ip - Server IP address
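The same fields can be combined in a web server rule. The fragment below is a hypothetical sketch: the title, the query substring and the status code are placeholders, and it does not reproduce the CVE-2020-0688 rule listed in the table.

```yaml
title: Example Suspicious Query Parameter           # hypothetical rule
logsource:
    category: webserver
detection:
    selection:
        cs-method: 'GET'
        c-uri-query|contains: 'cmd='         # placeholder substring
        sc-status: 200                       # only successful responses
    condition: selection
```

All fields inside one selection must match the same event, so this fragment fires only for successful GET requests whose query string contains the substring.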

Generalized Categories

Since Sigma is a generalized format for describing log-based detection rules, its syntax must be able to express detection logic for different systems. Some systems use tables of aggregated data instead of events, and data describing the same situation may come from different sources. To unify the syntax and solve such problems, the mechanism of generic log sources was introduced. At the moment, one such category has been created: process_creation. You can read more about this in the blog. The list of fields in this category can be found on the taxonomy page (other supported categories are described there as well).

Event fields of the generalized process_creation category

Category          Product  Field              Example
process_creation  windows  UtcTime            -
                           ProcessGuid        -
                           FileVersion
                           Product
                           CommandLine        win_meterpreter_or_cobaltstrike_getsystem_service_start.yml
                           LogonGuid          -
                           LogonId            -
                           TerminalSessionId  -
                           IntegrityLevel     -
                           imphash            win_renamed_paexec.yml
                           md5                -
                           sha256             -
                           ParentProcessGuid  -
                           ParentProcessId    -
                           ParentImage        win_meterpreter_or_cobaltstrike_getsystem_service_start.yml
                           ParentCommandLine  win_cmstp_com_object_access.yml

Description of the event fields for this source

UtcTime - Event time in UTC
ProcessGuid - GUID of the created process
ProcessId - PID of the created process
Image - Path to the executable file being launched
FileVersion - Program version, taken from the resources of the executable file
Description - Program description, taken from the resources of the executable file
Product - Program name, taken from the resources of the executable file
Company - Name of the company that developed the program, taken from the resources of the executable file
CommandLine - Command line of the process being created
CurrentDirectory - Current directory of the created process
User - User on whose behalf the process is started
LogonGuid - GUID of the current user session
LogonId - Identifier of the current user session
TerminalSessionId - Identifier of the current terminal session
IntegrityLevel - Integrity level with which the process is started
imphash - Hash computed from the import table of the executable file
md5 - MD5 hash of the executable file from which the process is created
sha256 - SHA256 hash of the executable file from which the process is created
ParentProcessGuid - GUID of the parent process
ParentProcessId - PID of the parent process
ParentImage - Path to the executable file of the parent process
ParentCommandLine - Command line of the parent process
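A hypothetical rule fragment shows how the parent and child fields of this category are typically paired. The title and the process names are placeholders chosen for illustration.

```yaml
title: Example Office Application Spawning a Shell      # hypothetical rule
logsource:
    category: process_creation
    product: windows
detection:
    selection:
        ParentImage|endswith: '\winword.exe'     # placeholder parent process
        Image|endswith: '\powershell.exe'        # placeholder child process
    condition: selection
```

Because the rule uses the generalized category rather than a specific service, it does not depend on whether the events come from Sysmon or from the Windows Security Event Log.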

Statistics on the use of event sources in existing rules

The table below lists the constructs most often used to describe log sources. Most likely, you will find among them one that suits your rule.

Usage statistics for field combinations describing some of the most common sources (a dash means the field is absent):

Rules: 197. Category: process_creation. Product: windows. Service: -.
Syntax example:
logsource:
    category: process_creation
    product: windows
Comment: Generalized category of process creation logs on Windows systems. Includes Sysmon events and Windows Security Event Log events.

Rules: 68. Category: -. Product: windows. Service: sysmon.
Syntax example:
logsource:
    product: windows
    service: sysmon
Comment: Sysmon events.

Rules: (not given). Category: -. Product: windows. Service: security.
Syntax example:
logsource:
    product: windows
    service: security
Comment: Events from the Windows Security Event Log.

Rules: 24. Category: proxy. Product: -. Service: -.
Syntax example:
logsource:
    category: proxy
Comment: Events from proxy server logs.

Rules: 15. Category: -. Product: windows. Service: system.
Syntax example:
logsource:
    product: windows
    service: system
Comment: Events from the Windows System Event Log.

Rules: 12. Category: accounting. Product: cisco. Service: aaa.
Syntax example:
logsource:
    category: accounting
    product: cisco
    service: aaa
Comment: Events from the Cisco AAA Security Services log.

Rules: 10. Category: -. Product: windows. Service: powershell.
Syntax example:
logsource:
    product: windows
    service: powershell
Comment: Events from the Microsoft Windows PowerShell Event Log.

Rules: 9. Category: -. Product: linux. Service: -.
Syntax example:
logsource:
    product: linux
Comment: Linux audit events.

Rules: 8. Category: -. Product: linux. Service: auditd.
Syntax example:
logsource:
    product: linux
    service: auditd
Comment: Linux events, narrowed down to the logs of a specific service (the AuditD subsystem).

Tips for writing rules

When writing a new rule, one of two situations is possible:

  1. The event source you need has already been used in existing rules.
  2. The repository contains no rule that uses this event source.

In the first situation, use one of the existing rules as a template. If the log source you need is already used in other rules, the authors of the plugins (backend converters) for various SIEM systems have most likely accounted for it in their mappings, and your rule should be processed correctly right away.

In the second situation, use existing rules as examples to work out how to use the category, product and service identifiers correctly. When you create your own log source, it is recommended to add it to the mappings of all existing backends. Other contributors or even the developers can do this too; the main thing is to report that it is needed.

We created a visualization of the combinations of log source description fields in existing rules:

Distribution of log sources

[Figure: chart of the distribution of log sources]

Usage statistics for combinations of logsource attribute subfields

[Figure: chart of logsource subfield combinations]

In this article, we gave an example of creating a simple rule and described how event sources are specified. Now you can apply this knowledge: look at the rules in the Sigma repository and work out which sources a particular rule uses. Follow our publications: in the next part we will look at the most complex part of Sigma rules, the section describing the detection logic.

Author: Anton Kutepov, specialist in the expert services and development department at Positive Technologies (PT Expert Security Center).