Abstract: this note describes one method for analyzing the memory consumed by the components of a Go application.

Programs often keep data structures in memory whose size changes dynamically at run time. Examples include a data cache, an operation log, or data received from external systems. Memory consumption may then keep growing until the hardware can no longer cope, while the specific mechanism of the leak remains unclear.

The main way to profile Go applications is the pprof tool, which is enabled by importing the net/http/pprof package. It gives you a table or graph of memory allocations in a running program. However, using this tool can involve significant overhead and may not be practical, especially if you cannot run multiple instances of the program with real data.
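For reference, here is a minimal sketch of how pprof is typically connected. Importing net/http/pprof for its side effect registers the /debug/pprof/ handlers on http.DefaultServeMux. The helper name heapProfileSummary is hypothetical, and the throwaway httptest server is used only to keep the example self-contained; a real service would normally call http.ListenAndServe and be inspected with go tool pprof.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // side effect: registers /debug/pprof/* on http.DefaultServeMux
)

// heapProfileSummary (hypothetical helper) starts a temporary local server
// backed by http.DefaultServeMux and returns the text form of the heap profile.
func heapProfileSummary() (string, error) {
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()

	// debug=1 asks for a human-readable text profile instead of the binary format.
	resp, err := http.Get(srv.URL + "/debug/pprof/heap?debug=1")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	text, err := heapProfileSummary()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("received %d bytes of heap profile text\n", len(text))
}
```

The profile describes allocation sites across the whole process, which is why it does not directly answer the question "how large is this particular object right now".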

In that case it is tempting to measure the memory consumed by program objects on demand, for example to display system statistics or to send metrics to a monitoring system. However, the language itself generally does not make this possible: Go has no built-in tool for determining the full size of a variable at run time.

Therefore, I decided to write a small package that provides this capability. Its main tool is reflection (the "reflect" package). Anyone interested in this kind of profiling is invited to read on.


First, a few words about the built-in functions

unsafe.Sizeof(value) 

and

reflect.TypeOf(value).Size() 

These functions are equivalent and are often recommended online for determining the size of a variable. However, they return not the size of the data the variable refers to, but the size in bytes of the variable's container (roughly, a header that holds a pointer to the data). For a variable of type int64 they return the correct result, since such a variable holds the actual value rather than a reference to it. But for types that hold a pointer to the actual data, such as a slice or a string, they return the same value for every variable of that type: the size of the header that references the data. An example:

package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

func main() {
	s1 := "ABC"
	s2 := "ABCDEF"
	arr1 := []int{1, 2}
	arr2 := []int{1, 2, 3, 4, 5, 6}

	// Both calls report the size of the header, not of the referenced data.
	fmt.Printf("Var: %s, Size: %v\n", s1, unsafe.Sizeof(s1))
	fmt.Printf("Var: %s, Size: %v\n", s2, unsafe.Sizeof(s2))
	fmt.Printf("Var: %v, Size: %v\n", arr1, reflect.TypeOf(arr1).Size())
	fmt.Printf("Var: %v, Size: %v\n", arr2, reflect.TypeOf(arr2).Size())
}

As a result, we get (on a 64-bit platform):

Var: ABC, Size: 16
Var: ABCDEF, Size: 16
Var: [1 2], Size: 24
Var: [1 2 3 4 5 6], Size: 24

As you can see, the actual size of the variable is not calculated.
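To make the distinction concrete, the following sketch separates the fixed header size reported by unsafe.Sizeof from the amount of data the variable actually refers to. The helper names stringPayload and intSlicePayload are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringPayload returns the number of bytes of character data a string refers to.
func stringPayload(s string) uintptr {
	return uintptr(len(s))
}

// intSlicePayload returns the number of bytes reserved by the backing array
// of an []int: capacity multiplied by the size of one element.
func intSlicePayload(s []int) uintptr {
	return uintptr(cap(s)) * unsafe.Sizeof(int(0))
}

func main() {
	s1, s2 := "ABC", "ABCDEF"
	arr1 := []int{1, 2}
	arr2 := []int{1, 2, 3, 4, 5, 6}

	// The header size is the same for every value of a type; the payload differs.
	fmt.Printf("%q: header %d, payload %d\n", s1, unsafe.Sizeof(s1), stringPayload(s1))
	fmt.Printf("%q: header %d, payload %d\n", s2, unsafe.Sizeof(s2), stringPayload(s2))
	fmt.Printf("%v: header %d, payload %d\n", arr1, unsafe.Sizeof(arr1), intSlicePayload(arr1))
	fmt.Printf("%v: header %d, payload %d\n", arr2, unsafe.Sizeof(arr2), intSlicePayload(arr2))
}
```

Doing this by hand works for flat types, but it does not scale to nested structures whose shape is not known in advance.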

The standard library has the function binary.Size() (package encoding/binary), which returns the size of a value in bytes, but only for fixed-size types. That is, if a string, slice, map, or even a plain int appears among the fields of your structure, the function is not applicable. Nevertheless, I took this function as the basis for the size package, in which I tried to extend the mechanism above to data types without a fixed size.
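A short sketch of this limitation follows. The struct types fixed, dynamic, and platformInt are hypothetical examples; binary.Size returns -1 when the value is not of fixed size.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fixed contains only fixed-size fields, so binary.Size can handle it.
type fixed struct {
	A int64
	B int32
	C [3]int16
}

// dynamic contains a string, which has no fixed size.
type dynamic struct {
	A int64
	S string
}

// platformInt contains a plain int, whose width depends on the platform.
type platformInt struct {
	N int
}

// binarySizes returns what binary.Size reports for the three sample types.
func binarySizes() (int, int, int) {
	return binary.Size(fixed{}), binary.Size(dynamic{}), binary.Size(platformInt{})
}

func main() {
	f, d, p := binarySizes()
	fmt.Println(f, d, p) // prints: 18 -1 -1
}
```

Note that 18 is the size of the binary encoding (8 + 4 + 6 bytes), which does not include the alignment padding the struct may have in memory.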

To determine the size of an object at run time, you need to know its type together with the types of all nested objects if it is a structure. The structure to be analyzed generally forms a tree, so recursion is needed to determine the size of complex data types.

Thus, calculating the memory consumed by an arbitrary object involves the following:

  • an algorithm for determining the size of a variable of a simple (non-composite) type;
  • a recursive call of the algorithm for array elements, structure fields, and the keys and values of maps;
  • detection of infinite loops (cyclic references).
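The three parts listed above can be sketched roughly as follows. This is an illustration written for this note under stated assumptions, not the code of the size package: the names sizeOf, walk, and visit are hypothetical, headers of strings, slices, maps, and interfaces are deliberately not counted, and cycles are detected only through pointers and maps.

```go
package main

import (
	"fmt"
	"reflect"
)

// visit identifies an already-seen pointer or map by its type and address.
type visit struct {
	typ reflect.Type
	ptr uintptr
}

// sizeOf returns an approximate number of bytes of data reachable from v.
func sizeOf(v interface{}) int {
	return walk(reflect.ValueOf(v), map[visit]bool{})
}

func walk(v reflect.Value, seen map[visit]bool) int {
	total := 0
	switch v.Kind() {
	case reflect.Invalid, reflect.Chan, reflect.Func, reflect.UnsafePointer:
		return 0 // nothing to measure, or not followed in this sketch
	case reflect.String:
		return v.Len()
	case reflect.Array, reflect.Slice:
		for i := 0; i < v.Len(); i++ {
			total += walk(v.Index(i), seen)
		}
	case reflect.Struct:
		for i := 0; i < v.NumField(); i++ {
			total += walk(v.Field(i), seen)
		}
	case reflect.Map, reflect.Ptr:
		if v.IsNil() {
			return 0
		}
		key := visit{v.Type(), v.Pointer()}
		if seen[key] {
			return 0 // already counted: breaks cycles and avoids double counting
		}
		seen[key] = true
		if v.Kind() == reflect.Ptr {
			return walk(v.Elem(), seen)
		}
		iter := v.MapRange()
		for iter.Next() {
			total += walk(iter.Key(), seen) + walk(iter.Value(), seen)
		}
	case reflect.Interface:
		if !v.IsNil() {
			total = walk(v.Elem(), seen)
		}
	default:
		// Simple types (bool, integers, floats, complex) hold their value directly.
		return int(v.Type().Size())
	}
	return total
}

// node is a hypothetical self-referencing type used to show cycle handling.
type node struct {
	Val  int64
	Next *node
}

func main() {
	n := &node{Val: 1}
	n.Next = n // cycle: the node points to itself
	fmt.Println(sizeOf(n)) // prints: 8

	fmt.Println(sizeOf(map[string]int32{"ab": 1, "cde": 2})) // prints: 13
}
```

Tracking visited pointers by both type and address is a design choice of this sketch: a pointer to a struct and a pointer to its first field share an address, and keying on the address alone would skip one of them.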

To determine the actual size of a variable of a simple type (not an array or structure), you can use the Size() method from the "reflect" package shown above. It works correctly for variables that hold the actual value. For variables such as slices and strings, which hold a reference to the data, you need to walk through the elements or fields and compute the size of each one.

To analyze the type and value of a variable, the "reflect" package packs the variable into an empty interface (interface{}). In Go, an empty interface can hold any object. Moreover, an interface in Go is represented by a container with two fields: the type of the actual value and a reference to the actual value.

It is this mapping of the analyzed value to an empty interface and back that gave the technique its name: reflection.
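The mapping in both directions can be illustrated with a small sketch. The function names describe and roundTrip are hypothetical; the reflect calls themselves (ValueOf, Type, Kind, Interface) are part of the standard library.

```go
package main

import (
	"fmt"
	"reflect"
)

// describe reports the dynamic type and kind of the value stored in an empty interface.
func describe(x interface{}) string {
	if x == nil {
		return "nil interface"
	}
	v := reflect.ValueOf(x)
	return fmt.Sprintf("type=%s kind=%s", v.Type(), v.Kind())
}

// roundTrip packs a value into an empty interface, reflects on it,
// and recovers the original value from the reflection object.
func roundTrip(n int64) int64 {
	var boxed interface{} = n    // value -> empty interface
	v := reflect.ValueOf(boxed)  // empty interface -> reflect.Value
	return v.Interface().(int64) // reflect.Value -> empty interface -> value
}

func main() {
	fmt.Println(describe(int64(7)))    // type=int64 kind=int64
	fmt.Println(describe("ABC"))       // type=string kind=string
	fmt.Println(describe([]int{1, 2})) // type=[]int kind=slice
	fmt.Println(roundTrip(42))         // 42
}
```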

For a better understanding of how reflection works in Go, I recommend Rob Pike's article "The Laws of Reflection" on the official Go blog. A translation of this article was published on Habr.

The result is the size package, which you can use in your programs as follows:

package main

import (
	"fmt"

	"github.com/DmitriyVTitov/size"
)

func main() {
	a := struct {
		a int
		b string
		c bool
		d int32
		e []byte
		f [3]int64
	}{
		a: 10,                    // 8 bytes
		b: "Text",                // 4 bytes
		c: true,                  // 1 byte
		d: 25,                    // 4 bytes
		e: []byte{'c', 'd', 'e'}, // 3 bytes
		f: [3]int64{1, 2, 3},     // 24 bytes
	}
	fmt.Println(size.Of(a))
}

// Output: 44

Remarks:

  • In practice, calculating the size of large, deeply nested structures of about 10 GB takes 10-20 minutes. This is because reflection is a rather expensive operation: each variable has to be packed into an empty interface and then analyzed (see the article mentioned above).
  • Because of the relatively low speed, the package should be used for an approximate estimate of variable sizes, since in a real system the data will probably change while a large structure is being analyzed. Alternatively, ensure exclusive access to the data for the duration of the calculation with a mutex, if applicable.
  • The program does not take into account the size of the "containers" (headers) of slices, interfaces, and maps (on a 64-bit platform: 24 bytes for a slice header, 16 bytes for an interface value, and 8 bytes for a map pointer). Therefore, if you have a large number of small elements of these kinds, the undercount will be significant.
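The mutex option mentioned in the remarks can be sketched as follows. The cache type and its methods are hypothetical, and a simple sum of key and value lengths stands in for the full reflective calculation; the point is only that the measurement runs while the lock is held, so writers cannot change the data midway.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a hypothetical in-memory store guarded by a read-write mutex.
type cache struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func newCache() *cache {
	return &cache{data: make(map[string][]byte)}
}

// Set stores a value under the write lock.
func (c *cache) Set(key string, value []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = value
}

// payloadBytes sums the lengths of all keys and values while holding the
// read lock, so the result is a consistent snapshot of the data.
func (c *cache) payloadBytes() int {
	c.mu.RLock()
	defer c.mu.RUnlock()

	total := 0
	for k, v := range c.data {
		total += len(k) + len(v)
	}
	return total
}

func main() {
	c := newCache()
	c.Set("ab", []byte{1, 2, 3})
	c.Set("c", nil)
	fmt.Println(c.payloadBytes()) // prints: 6
}
```

Given the running times quoted above, holding a lock for the whole calculation blocks writers for just as long, which is why the remark adds "if applicable".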