In the above diagram, the user program Mutator requests memory on the heap through the memory allocator Allocator and the garbage collector Collector reclaims the memory space on the heap. The memory allocator and the garbage collector together manage the heap memory space in the program.
In this section, we will go over the key theories involved in Go language garbage collection to help us better understand the rest of this section. The Mark-Sweep algorithm is the most common garbage collection algorithm. The Mark-Sweep collector is a trace-based garbage collector whose execution can be divided into two phases: Mark and Sweep. The remaining three objects, B, E and F, are treated as garbage because they are not reachable from the root node. After the tagging phase, it enters the cleanup phase, in which the collector traverses all objects in the heap in turn, releasing the untagged B, E, and F objects and chaining the new free memory space in a chain structure for the memory allocator to use.
Here is the most traditional marker removal algorithm, the garbage collector starts from the root object of garbage collection, recursively iterates through the sub-objects pointed to by these objects and marks all the reachable objects as alive; after the marking phase, the garbage collector will sequentially iterate through the objects in the heap and remove the garbage from them, the whole process needs to mark the alive status of the objects, and the user program cannot be executed during the garbage collection process, we need to use a more complex mechanism to solve the problem of STW.
To address the long STW caused by the original marker removal algorithm, most modern trace garbage collectors implement a variant of the three-color marker algorithm to reduce the STW time. The three-color tagging algorithm classifies objects in the program into three categories: white, black, and gray.
When the garbage collector starts working, there are no black objects in the program, the root object of garbage collection will be marked as gray, the garbage collector will only take objects from the gray object collection and start scanning, the marking phase will end when there are no objects in the gray collection. The working principle of the three-colored marker garbage collector is simple and we can summarize it in the following steps. When the marking phase of the three-color marker cleanup is over, there are no gray objects in the heap of the application, we can only see black surviving objects and white garbage objects, which can be recycled by the garbage collector.
In the three-color marker process shown below, the user program creates a reference from object A to object D, but since there is no longer a gray object in the program, object D is incorrectly reclaimed by the garbage collector.
Objects that should not be reclaimed but are reclaimed are very serious errors in memory management. We call such errors hanging pointers, i. To mark objects concurrently or incrementally still requires the use of barrier techniques. The memory barrier technique is a barrier instruction that allows the CPU or compiler to follow specific constraints when executing memory-related operations. Most modern processors today execute instructions out of order to maximize performance, but the technique ensures that memory operations are sequential, and that operations executed before the memory barrier must precede those executed after the memory barrier.
To guarantee correctness in concurrent or incremental marking algorithms, we need to reach one of the following two types of Tri-color invariant TCI. The above diagram shows the heap memory with strong and weak tricolor invariance. By following either of the two invariants, we can guarantee the correctness of the garbage collection algorithm, and the barrier technique is an important technique to guarantee tricolor invariance during concurrent or incremental marking.
The barrier technique in garbage collection is more like a hook method, which is a piece of code that is executed when the user program reads an object, creates a new object, and updates the object pointer. Depending on the type of operation, we can divide them into Read barriers and Write barriers. Since Read barriers require code fragments to be added to the read operation, which has a significant impact on the performance of the user program, programming languages tend to use Write barriers to ensure tri-color invariance.
Here we would like to introduce two write barrier techniques used in Go, namely, the insertion write barrier proposed by Dijkstra and the deletion write barrier proposed by Yuasa, and analyze how they guarantee tri-color invariance and correctness of the garbage collector.
Dijkstra proposed in the insertion of a write barrier, by which the user program and the garbage collector can guarantee correct program execution while working alternately as follows. The above pseudo-code for inserting the write barrier is pretty self-explanatory.
If the ptr pointer is white, then this function will set the object to gray, otherwise it will stay the same. In the garbage collection process shown above, object B, which is no longer alive, is not recovered; if we change the pointer to object C back to B between the second and third steps, the garbage collector still considers object C alive, and these incorrectly marked garbage objects are recovered only in the next loop.
Although the plug-in Dijkstra write barrier is very simple to implement and guarantees strong tricolor invariance, it also has significant drawbacks.
Because objects on the stack are also considered root objects in garbage collection, Dijkstra must either add a write barrier to objects on the stack or rescan the stack at the end of the marking phase in order to ensure memory safety, each of which has its own disadvantages. The designer of the algorithm needs to make a trade-off between the two.
In his paper Real-time garbage collection on general-purpose machines, Yuasa proposed removing the write barrier because once it starts working, it guarantees the reachability of all objects on the heap when the write barrier is turned on. This is why it is also called Snapshot GC.
This guarantees that no objects will become unreachable to the garbage collector traversal all objects which are live at the beginning of garbage collection will be reached even if the pointers to them are overwritten. The algorithm will guarantee the correctness of the program when garbage collection is performed incrementally or concurrently using a write barrier as shown below.
The above code will paint the old object in white gray when the reference to the old object is removed, so that removing the write barrier will ensure weak tricolor invariance and the downstream object referenced by the old object can definitely be referenced by the gray object.
The third step in the above procedure triggers Yuasa to remove the coloring of the write barrier because the user program removes the pointer of B to the C object, so the two objects C and D will violate strong and weak trichromatic invariance, respectively. Yuasa removes the write barrier by coloring the C object to ensure that the C object and the downstream D object survive this garbage collection loop and avoid hanging pointers to ensure the correctness of the user program.
Traditional garbage collection algorithms suspend the application during the execution of garbage collection. Once garbage collection is triggered, the garbage collector will seize CPU usage to occupy a large amount of computational resources to complete the marking and purging work, however, many applications pursuing real-time cannot accept a long STW. In order to reduce the maximum time that an application can pause and the total pause time for garbage collection, we would optimize the modern garbage collector using the following strategy.
Because both incremental and concurrent can run alternately with the user program, we need to use barrier techniques to ensure correct garbage collection; at the same time, the application cannot wait until memory overflows to trigger garbage collection, because when memory is low, the application can no longer allocate memory, which is no different from directly suspending the program.
Incremental and concurrent garbage collection needs to be triggered in advance and complete the whole loop before memory runs out, avoiding a long suspension of the program. Incremental garbage collection is a solution to reduce the maximum amount of time a program can be paused by slicing an otherwise long pause into multiple smaller GC time slices, which reduces the maximum amount of time an application can be paused, even though it takes longer from the start of garbage collection to the end.
It should be noted that incremental garbage collection needs to be used with the tri-color marking method. To ensure correct garbage collection, we need to turn on the write barrier before garbage collection starts, so that user programs modifying memory will go through the write barrier first, ensuring strong tri-color invariance or weak tri-color invariance of object relationships in heap memory. Although incremental garbage collection can reduce the maximum program pause time, incremental collection also increases the total time of a GC loop, and during garbage collection, the user program also needs to bear additional computational overhead because of the write barrier, so incremental garbage collection does not bring only benefits, but the overall benefits still outweigh the disadvantages.
Concurrent garbage collection reduces not only the maximum suspension time of the program, but also the entire garbage collection phase. Although the concurrent collector can run with the user program, not all phases can run with the user program, and some phases still need to suspend the user program, but compared with the traditional algorithm, concurrent garbage collection can execute the work that can be executed concurrently as much as possible; of course, because of the introduction of the read-write barrier, concurrent garbage collector must also bring extra overhead, which will not only increase the total time of garbage collection, but also affect the user program, which is something we must pay attention to when designing the garbage collection strategy.
The Go language garbage collector has been evolving since day one, except for a few versions without major updates, almost every minor release will improve the performance of garbage collection, and along with the performance improvement is the complexity of the garbage collector code, this section will analyze the evolution of the garbage collector from the Go language v1. We can see from the evolution of the Go language garbage collector that the implementation and algorithms of the component have become more and more complex.
The initial garbage collector was an imprecise single-threaded STW collector, but the latest version of the garbage collector supports concurrent garbage collection, decentralized coordination, and other features, and we will introduce the components and features associated with the latest version of the garbage collector here.
The Go language introduced concurrent garbage collector in v1. The garbage collector uses the three-color abstraction and write barrier techniques we mentioned above to ensure the correctness of the garbage collector execution. Garbage collection is a tool that saves time for programmers.
For example it replaces the need for functions such as malloc and free which are found in C. It can also help in preventing memory leaks. The downside of garbage collection is that it has a negative impact on performance. GC has to regularly run though the program, checking object references and cleaning out memory. This takes up resources and often requires the program to pause. If an object has no references is no longer reachable then it is eligible for garbage collection.
This means it is then unreachable and will have its memory unallocated by the garbage collector. One example of garbage collection is ARC, short for automatic reference counting. This is used in Swift, for example. ARC boils down to keeping track of the references to all objects that are created.
If the amount of references drops to 0, the object will be marked for deallocation. If this article was helpful, tweet it.
Learn to code for free. Get started. There are many garbage collection algorithms such as reference counting, mark, and sweep, copying, etc. For example, when we see the reference counting algorithm, each dynamic memory will have a reference count. When a reference is created the reference count will increase and whenever a reference is deleted the reference count is decremented.
Once the reference count becomes zero it shows that the memory is unused and can be released. A specific pointer type should be declared and this can be used for purposes such as keeping track of all the references created, keeping track of the reference count when reference is created and deleted. According to the need, either the normal pointer or the specific garbage collector pointer can be used.
Thus, to sum up, garbage collection is a method opposite to manual memory management. In a garbage collector, the memory is released automatically based on a periodic time basis or based on specific criteria which tells if it is no more in use. Both methods have their own advantages and disadvantages. This can be implemented and used according to the complexity of the function, depending on the language used and its scope.
Here we discuss manual memory management and the garbage collection algorithm along with the advantages and disadvantages. You can also go through our other suggested articles to learn more —. Submit Next Question. By signing up, you agree to our Terms of Use and Privacy Policy. Forgot Password?
0コメント