Go 1.26 interactive tour
https://antonz.org/go-1-26/

New with expressions, type-safe error checking, and faster everything.

Go 1.26 is coming out in February, so it's a good time to explore what's new. The official release notes are pretty dry, so I prepared an interactive version with lots of examples showing what has changed and what the new behavior is.

Read on and see!

new(expr) • Recursive type constraints • Type-safe error checking • Green Tea GC • Faster cgo and syscalls • Faster memory allocation • Vectorized operations • Secret mode • Reader-less cryptography • Goroutine leak profile • Goroutine metrics • Reflective iterators • Peek into a buffer • Process handle • Signal as cause • Compare IP subnets • Context-aware dialing • Fake example.com • Optimized fmt.Errorf • Optimized io.ReadAll • Multiple log handlers • Test artifacts • Modernized go fix • Final thoughts

This article is based on the official release notes from The Go Authors and the Go source code, licensed under the BSD-3-Clause license. This is not an exhaustive list; see the official release notes for that.

I provide links to the documentation (𝗗), proposals (𝗣), commits (𝗖𝗟), and authors (𝗔) for the features described. Check them out for motivation, usage, and implementation details. I also have dedicated guides (𝗚) for some of the features.

Error handling is often skipped to keep things simple. Don't do this in production ツ

# new(expr)

Previously, you could only use the new built-in with types:

p := new(int)
*p = 42
fmt.Println(*p)
42

Now you can also use it with expressions:

// Pointer to an int variable with the value 42.
p := new(42)
fmt.Println(*p)
42

If the argument expr is an expression of type T, then new(expr) allocates a variable of type T, initializes it to the value of expr, and returns its address, a value of type *T.

This feature is especially helpful if you use pointer fields in a struct to represent optional values that you marshal to JSON or Protobuf:

type Cat struct {
    Name string `json:"name"`
    Fed  *bool  `json:"is_fed"` // you can never be sure
}

cat := Cat{Name: "Mittens", Fed: new(true)}
data, _ := json.Marshal(cat)
fmt.Println(string(data))
{"name":"Mittens","is_fed":true}

You can use new with composite values:

s := new([]int{11, 12, 13})
fmt.Println(*s)

type Person struct{ name string }
p := new(Person{name: "alice"})
fmt.Println(*p)
[11 12 13]
{alice}

And function calls:

f := func() string { return "go" }
p := new(f())
fmt.Println(*p)
go

Passing nil is still not allowed:

p := new(nil)
// compilation error

𝗗 spec • 𝗣 45624 • 𝗖𝗟 704935, 704737, 704955, 705157 • 𝗔 Alan Donovan

# Recursive type constraints

Generic functions and types take types as parameters:

// A list of values.
type List[T any] struct {}

// Reverses a slice in-place.
func Reverse[T any](s []T)

We can further restrict these type parameters by using type constraints:

// The map key must have a comparable type.
type Map[K comparable, V any] struct {}

// S is a slice with values of a comparable type,
// or a type derived from such a slice (e.g., type MySlice []int).
func Compact[S ~[]E, E comparable](s S) S

Previously, type constraints couldn't refer, directly or indirectly, to the type being declared:

type T[P T[P]] struct{}
// compile error:
// invalid recursive type: T refers to itself

Now they can:

type T[P T[P]] struct{}
ok

A typical use case is a generic type that supports operations with arguments or results of the same type as itself:

// A value that can be compared to other values
// of the same type using the less-than operation.
type Ordered[T Ordered[T]] interface {
    Less(T) bool
}

Now we can create a generic container with Ordered values and use it with any type that implements Less:

// A tree stores comparable values.
type Tree[T Ordered[T]] struct {
    nodes []T
}

// netip.Addr has a Less method with the right signature,
// so it meets the requirements for Ordered[netip.Addr].
t := Tree[netip.Addr]{}
_ = t
ok
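
To show why the constraint is useful, here's a sketch of a Tree method that relies on the Less operation guaranteed by Ordered (my own illustration, not from the release notes):

// Contains reports whether v is in the tree.
// Two values are equal if neither is less than the other.
func (t *Tree[T]) Contains(v T) bool {
    for _, n := range t.nodes {
        if !n.Less(v) && !v.Less(n) {
            return true
        }
    }
    return false
}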

This makes Go's generics a bit more expressive.

𝗣 68162, 75883 • 𝗖𝗟 711420, 711422 • 𝗔 Robert Griesemer

# Type-safe error checking

The new errors.AsType function is a generic version of errors.As:

// go 1.13+
func As(err error, target any) bool
// go 1.26+
func AsType[E error](err error) (E, bool)

It's type-safe and easier to use:

// using errors.As
var target *AppError
if errors.As(err, &target) {
    fmt.Println("application error:", target)
}
application error: database is down
// using errors.AsType
if target, ok := errors.AsType[*AppError](err); ok {
    fmt.Println("application error:", target)
}
application error: database is down
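
In these examples, AppError is assumed to be a simple custom error type with a pointer-receiver Error method, something like this:

// AppError is an assumed application error type;
// the exact definition isn't shown in the release notes.
type AppError struct {
    msg string
}

func (e *AppError) Error() string { return e.msg }

// err wraps an *AppError somewhere down the chain.
err := fmt.Errorf("query failed: %w", &AppError{msg: "database is down"})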

AsType is especially handy when checking for multiple types of errors. It makes the code shorter and keeps error variables scoped to their if blocks:

if connErr, ok := errors.AsType[*net.OpError](err); ok {
    fmt.Println("Network operation failed:", connErr.Op)
} else if dnsErr, ok := errors.AsType[*net.DNSError](err); ok {
    fmt.Println("DNS resolution failed:", dnsErr.Name)
} else {
    fmt.Println("Unknown error")
}
DNS resolution failed: does.not.exist

Another issue with As is that it uses reflection and can cause runtime panics if used incorrectly (like if you pass a non-pointer or a type that doesn't implement error):

// using errors.As
var target AppError
if errors.As(err, &target) {
    fmt.Println("application error:", target)
}
panic: errors: *target must be interface or implement error

AsType doesn't cause a runtime panic; it gives a clear compile-time error instead:

// using errors.AsType
if target, ok := errors.AsType[AppError](err); ok {
    fmt.Println("application error:", target)
}
./main.go:24:32: AppError does not satisfy error (method Error has pointer receiver)

AsType doesn't use reflect, executes faster, and allocates less than As:

goos: darwin
goarch: arm64
cpu: Apple M1
BenchmarkAs-8        12606744    95.62 ns/op    40 B/op    2 allocs/op
BenchmarkAsType-8    37961869    30.26 ns/op    24 B/op    1 allocs/op

source

Since AsType can handle everything that As does, it's a recommended drop-in replacement for new code.

𝗗 errors.AsType • 𝗣 51945 • 𝗖𝗟 707235 • 𝗔 Julien Cretel

# Green Tea garbage collector

The new garbage collector (first introduced as experimental in 1.25) is designed to make memory management more efficient on modern computers with many CPU cores.

Motivation

Go's traditional garbage collector algorithm operates on the object graph, treating objects as nodes and pointers as edges, without considering their physical location in memory. The scanner jumps between distant memory locations, causing frequent cache misses.

As a result, the CPU spends too much time waiting for data to arrive from memory. More than 35% of the time spent scanning memory is wasted just stalling while waiting for memory accesses. As computers get more CPU cores, this problem gets even worse.

Implementation

Green Tea shifts the focus from being processor-centered to being memory-aware. Instead of scanning individual objects, it scans memory in contiguous 8 KiB blocks called spans. The algorithm focuses on small objects (up to 512 bytes) because they are the most common and hardest to scan efficiently.

Each span is divided into equal slots based on its assigned size class, and it only contains objects of that size class. For example, if a span is assigned to the 32-byte size class, the whole block is split into 32-byte slots, and objects are placed directly into these slots, each starting at the beginning of its slot. Because of this fixed layout, the garbage collector can easily find an object's metadata using simple address arithmetic, without checking the size of each object it finds.

When the algorithm finds an object that needs to be scanned, it marks the object's location in its span but doesn't scan it immediately. Instead, it waits until there are several objects in the same span that need scanning. Then, when the garbage collector processes that span, it scans multiple objects at once. This is much faster than going over the same area of memory multiple times.

To make better use of CPU cores, GC workers share the workload by stealing tasks from each other. Each worker has its own local queue of spans to scan, and if a worker is idle, it can grab tasks from the queues of other busy workers. This decentralized approach removes the need for a central global list, prevents delays, and reduces contention between CPU cores.

Green Tea uses vectorized CPU instructions (only on amd64 architectures) to process memory spans in bulk when there are enough objects.

Benchmarks

Benchmark results vary, but the Go team expects a 10–40% reduction in garbage collection overhead in real-world programs that rely heavily on the garbage collector. Plus, with vectorized implementation, an extra 10% reduction in GC overhead when running on CPUs like Intel Ice Lake or AMD Zen 4 and newer.

Unfortunately, I couldn't find any public benchmark results from the Go team for the latest version of Green Tea, and I wasn't able to create a good synthetic benchmark myself. So, no details this time :(

The new garbage collector is enabled by default. To use the old garbage collector, set GOEXPERIMENT=nogreenteagc at build time (this option is expected to be removed in Go 1.27).
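
For example, to build a program with the old collector (assuming a module in the current directory):

GOEXPERIMENT=nogreenteagc go build ./...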

𝗣 73581 • 𝗔 Michael Knyszek

# Faster cgo and syscalls

In the Go runtime, a processor (often referred to as a P) is a resource required to run the code. For a thread (a machine or M) to execute a goroutine (G), it must first acquire a processor.

Processors move through different states. They can be _Prunning (executing code), _Pidle (waiting for work), or _Pgcstop (paused because of the garbage collection).

Previously, processors had a state called _Psyscall, used when a goroutine was making a system or cgo call. Now, this state has been removed. Instead of using a separate processor state, the runtime checks the status of the goroutine assigned to the processor to see if it's involved in a system call.

This reduces internal runtime overhead and simplifies code paths for cgo and syscalls. The Go release notes say -30% in cgo runtime overhead, and the commit mentions an 18% sec/op improvement:

goos: linux
goarch: amd64
pkg: internal/runtime/cgobench
cpu: AMD EPYC 7B13
                   │ before.out  │             after.out              │
                   │   sec/op    │   sec/op     vs base               │
CgoCall-64           43.69n ± 1%   35.83n ± 1%  -17.99% (p=0.002 n=6)
CgoCallParallel-64   5.306n ± 1%   5.338n ± 1%        ~ (p=0.132 n=6)

I decided to run the CgoCall benchmarks locally as well:

goos: darwin
goarch: arm64
cpu: Apple M1
                      │ go1_25.txt  │             go1_26.txt              │
                      │   sec/op    │   sec/op     vs base                │
CgoCall-8               28.55n ± 4%   19.02n ± 2%  -33.40% (p=0.000 n=10)
CgoCallWithCallback-8   72.76n ± 5%   57.38n ± 2%  -21.14% (p=0.000 n=10)
geomean                 45.58n        33.03n       -27.53%

Either way, both a 20% and a 30% improvement are pretty impressive.
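
For reference, a minimal cgo call benchmark looks roughly like this (my own sketch, not the benchmark from the Go repository):

package cgobench

// static int answer() { return 42; }
import "C"

import "testing"

func BenchmarkCgoCall(b *testing.B) {
    for b.Loop() {
        // Each iteration crosses the Go -> C boundary once.
        _ = C.answer()
    }
}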

And here are the results from a local syscall benchmark:

goos: darwin
goarch: arm64
cpu: Apple M1
          │ go1_25.txt  │             go1_26.txt             │
          │   sec/op    │   sec/op     vs base               │
Syscall-8   195.6n ± 4%   178.1n ± 1%  -8.95% (p=0.000 n=10)
source
func BenchmarkSyscall(b *testing.B) {
    for b.Loop() {
        _, _, _ = syscall.Syscall(syscall.SYS_GETPID, 0, 0, 0)
    }
}

That's pretty good too.

𝗖𝗟 646198 • 𝗔 Michael Knyszek

# Faster memory allocation

The Go runtime now has specialized versions of its memory allocation function for small objects (from 1 to 512 bytes). It uses jump tables to quickly choose the right function for each size, instead of relying on a single general-purpose implementation.

The Go release notes say "the compiler will now generate calls to size-specialized memory allocation routines". But based on the code, that's not completely accurate: the compiler still emits calls to the general-purpose mallocgc function. Then, at runtime, mallocgc dispatches those calls to the new specialized allocation functions.

This change reduces the cost of small object memory allocations by up to 30%. The Go team expects the overall improvement to be ~1% in real allocation-heavy programs.

I couldn't find any existing benchmarks, so I came up with my own. And indeed, running it on Go 1.25 compared to 1.26 shows a significant improvement:

goos: darwin
goarch: arm64
cpu: Apple M1
           │  go1_25.txt   │              go1_26.txt              │
           │    sec/op     │    sec/op     vs base                │
Alloc1-8      8.190n ±  6%   6.594n ± 28%  -19.48% (p=0.011 n=10)
Alloc8-8      8.648n ± 16%   7.522n ±  4%  -13.02% (p=0.000 n=10)
Alloc64-8     15.70n ± 15%   12.57n ±  4%  -19.88% (p=0.000 n=10)
Alloc128-8    56.80n ±  4%   17.56n ±  4%  -69.08% (p=0.000 n=10)
Alloc512-8    81.50n ± 10%   55.24n ±  5%  -32.23% (p=0.000 n=10)
geomean       21.99n         14.33n        -34.83%
source
var sink *byte

func benchmarkAlloc(b *testing.B, size int) {
    b.ReportAllocs()
    for b.Loop() {
        obj := make([]byte, size)
        sink = &obj[0]
    }
}

func BenchmarkAlloc1(b *testing.B)   { benchmarkAlloc(b, 1) }
func BenchmarkAlloc8(b *testing.B)   { benchmarkAlloc(b, 8) }
func BenchmarkAlloc64(b *testing.B)  { benchmarkAlloc(b, 64) }
func BenchmarkAlloc128(b *testing.B) { benchmarkAlloc(b, 128) }
func BenchmarkAlloc512(b *testing.B) { benchmarkAlloc(b, 512) }

The new implementation is enabled by default. You can disable it by setting GOEXPERIMENT=nosizespecializedmalloc at build time (this option is expected to be removed in Go 1.27).

𝗖𝗟 665835 • 𝗔 Michael Matloob

# Vectorized operations (experimental)

The new simd/archsimd package provides access to architecture-specific vectorized operations (SIMD — single instruction, multiple data). This is a low-level package that exposes hardware-specific functionality. It currently only supports amd64 platforms.

Because different CPU architectures have very different SIMD operations, it's hard to create a single portable API that works for all of them. So the Go team decided to start with a low-level, architecture-specific API first, giving "power users" immediate access to SIMD features on the most common server platform — amd64.

The package defines vector types as structs, like Int8x16 (a 128-bit SIMD vector with sixteen 8-bit integers) and Float64x8 (a 512-bit SIMD vector with eight 64-bit floats). These match the hardware's vector registers. The package supports vectors that are 128, 256, or 512 bits wide.

Most operations are defined as methods on vector types. They usually map directly to hardware instructions with zero overhead.

To give you a taste, here's a custom function that uses SIMD instructions to add 32-bit float vectors:

func Add(a, b []float32) []float32 {
    if len(a) != len(b) {
        panic("slices of different length")
    }

    // If AVX-512 isn't supported, fall back to scalar addition,
    // since the Float32x16.Add method needs the AVX-512 instruction set.
    if !archsimd.X86.AVX512() {
        return fallbackAdd(a, b)
    }

    res := make([]float32, len(a))
    n := len(a)
    i := 0

    // 1. SIMD loop: Process 16 elements at a time.
    for i <= n-16 {
        // Load 16 elements from a and b vectors.
        va := archsimd.LoadFloat32x16Slice(a[i : i+16])
        vb := archsimd.LoadFloat32x16Slice(b[i : i+16])

        // Add all 16 elements in a single instruction
        // and store the results in the result vector.
        vSum := va.Add(vb) // translates to VADDPS asm instruction
        vSum.StoreSlice(res[i : i+16])

        i += 16
    }

    // 2. Scalar tail: Process any remaining elements (0-15).
    for ; i < n; i++ {
        res[i] = a[i] + b[i]
    }

    return res
}

Let's try it on two vectors:

func main() {
    a := []float32{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17}
    b := []float32{17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1}
    res := Add(a, b)
    fmt.Println(res)
}
[18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18]

Common operations in the archsimd package include:

  • Load a vector from array/slice, or Store a vector to array/slice.
  • Arithmetic: Add, Sub, Mul, Div, DotProduct.
  • Bitwise: And, Or, Not, Xor, Shift.
  • Comparison: Equal, Greater, Less, Min, Max.
  • Conversion: As, SaturateTo, TruncateTo.
  • Masking: Compress, Masked, Merge.
  • Rearrangement: Permute.

The package uses only AVX instructions, not SSE.

Here's a simple benchmark for adding two vectors (both the "plain" and SIMD versions use pre-allocated slices):

goos: linux
goarch: amd64
cpu: AMD EPYC 9575F 64-Core Processor
BenchmarkAddPlain/1k-2         	 1517698	       889.9 ns/op	13808.74 MB/s
BenchmarkAddPlain/65k-2        	   23448	     52613 ns/op	14947.46 MB/s
BenchmarkAddPlain/1m-2         	    2047	   1005628 ns/op	11932.84 MB/s
BenchmarkAddSIMD/1k-2          	36594340	        33.58 ns/op	365949.74 MB/s
BenchmarkAddSIMD/65k-2         	  410742	      3199 ns/op	245838.52 MB/s
BenchmarkAddSIMD/1m-2          	   12955	     94228 ns/op	127351.33 MB/s

source

The package is experimental and can be enabled by setting GOEXPERIMENT=simd at build time.

𝗗 simd/archsimd • 𝗣 73787 • 𝗖𝗟 701915, 712880, 729900, 732020 • 𝗔 Junyang Shao, Sean Liao, Tom Thorogood

# Secret mode (experimental)

Cryptographic protocols like WireGuard or TLS have a property called "forward secrecy". This means that even if an attacker gains access to long-term secrets (like a private key in TLS), they shouldn't be able to decrypt past communication sessions. To make this work, ephemeral keys (temporary keys used to negotiate the session) need to be erased from memory immediately after the handshake. If there's no reliable way to clear this memory, these keys could stay there indefinitely. An attacker who finds them later could re-derive the session key and decrypt past traffic, breaking forward secrecy.

In Go, the runtime manages memory, and it doesn't guarantee when or how memory is cleared. Sensitive data might remain in heap allocations or stack frames, potentially exposed in core dumps or through memory attacks. Developers often have to use unreliable "hacks" with reflection to try to zero out internal buffers in cryptographic libraries. Even so, some data might still stay in memory where the developer can't reach or control it.

The Go team's solution to this problem is the new runtime/secret package. It lets you run a function in secret mode. After the function finishes, it immediately erases (zeroes out) the registers and stack it used. Heap allocations made by the function are erased as soon as the garbage collector decides they are no longer reachable.

secret.Do(func() {
    // Generate an ephemeral key and
    // use it to negotiate the session.
})

This helps make sure sensitive information doesn't stay in memory longer than needed, lowering the risk of attackers getting to it.

Here's an example that shows how secret.Do might be used in a more or less realistic setting. Let's say you want to generate a session key while keeping the ephemeral private key and shared secret safe:

// DeriveSessionKey does an ephemeral key exchange to create a session key.
func DeriveSessionKey(peerPublicKey *ecdh.PublicKey) (*ecdh.PublicKey, []byte, error) {
    var pubKey *ecdh.PublicKey
    var sessionKey []byte
    var err error

    // Use secret.Do to contain the sensitive data during the handshake.
    // The ephemeral private key and the raw shared secret will be
    // wiped out when this function finishes.
    secret.Do(func() {
        // 1. Generate an ephemeral private key.
        // This is highly sensitive; if leaked later, forward secrecy is broken.
        privKey, e := ecdh.P256().GenerateKey(rand.Reader)
        if e != nil {
            err = e
            return
        }

        // 2. Compute the shared secret (ECDH).
        // This raw secret is also highly sensitive.
        sharedSecret, e := privKey.ECDH(peerPublicKey)
        if e != nil {
            err = e
            return
        }

        // 3. Derive the final session key (e.g., using HKDF).
        // We copy the result out; the inputs (privKey, sharedSecret)
        // will be destroyed by secret.Do when they become unreachable.
        sessionKey = performHKDF(sharedSecret)
        pubKey = privKey.PublicKey()
    })

    // The session key is returned for use, but the "recipe" to recreate it
    // is destroyed. Additionally, because the session key was allocated
    // inside the secret block, the runtime will automatically zero it out
    // when the application is finished using it.
    return pubKey, sessionKey, err
}
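
The performHKDF helper isn't part of the standard library; here's a minimal stand-in (a real implementation would use a proper KDF such as HKDF with a salt and context info):

// performHKDF derives a session key from the shared secret.
// Simplified stand-in: it just hashes the secret once.
func performHKDF(sharedSecret []byte) []byte {
    sum := sha256.Sum256(sharedSecret)
    return sum[:]
}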

Here, the ephemeral private key and the raw shared secret are effectively "toxic waste" — they are necessary to create the final session key, but dangerous to keep around.

If these values stay in the heap and an attacker later gets access to the application's memory (for example, via a core dump or a vulnerability like Heartbleed), they could use these intermediates to re-derive the session key and decrypt past conversations.

By wrapping the calculation in secret.Do, we make sure that as soon as the session key is created, the "ingredients" used to make it are permanently destroyed. This means that even if the server is compromised in the future, this specific past session can't be exposed, which ensures forward secrecy.

func main() {
    // Generate a dummy peer public key.
    priv, _ := ecdh.P256().GenerateKey(nil)
    peerPubKey := priv.PublicKey()

    // Derive the session key.
    pubKey, sessionKey, err := DeriveSessionKey(peerPubKey)
    fmt.Printf("public key = %x...\n", pubKey.Bytes()[:16])
    fmt.Printf("error = %v\n", err)
    var _ = sessionKey
}
public key = 04288d5ade66bab4320a86d80993f628...
error = <nil>

The current secret.Do implementation only supports Linux (amd64 and arm64). On unsupported platforms, Do invokes the function directly. Also, trying to start a goroutine within the function causes a panic (this will be fixed in Go 1.27).

The runtime/secret package is mainly for developers who work on cryptographic libraries. Most apps should use higher-level libraries that use secret.Do behind the scenes.

The package is experimental and can be enabled by setting GOEXPERIMENT=runtimesecret at build time.

𝗗 runtime/secret • 𝗣 21865 • 𝗖𝗟 704615 • 𝗔 Daniel Morsing

# Reader-less cryptography

Current cryptographic APIs, like ecdsa.GenerateKey or rand.Prime, often accept an io.Reader as the source of random data:

// Generate a new ECDSA private key for the specified curve.
key, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
fmt.Println(key.D)

// Generate a 64-bit integer that is prime with high probability.
prim, _ := rand.Prime(rand.Reader, 64)
fmt.Println(prim)
31253152889057471714062019675387570049552680140182252615946165331094890182019
17433987073571224703

These APIs don't commit to a specific way of using random bytes from the reader. Any change to the underlying cryptographic algorithms can change the sequence or number of bytes read. Because of this, application code that (mistakenly) relies on a specific implementation in Go version X might fail or behave differently in version X+1.

The Go team chose a pretty bold solution to this problem. Now, most crypto APIs will just ignore the random io.Reader parameter and always use the system random source (crypto/internal/sysrand.Read).

// The reader parameter is no longer used, so you can just pass nil.

// Generate a new ECDSA private key for the specified curve.
key, _ := ecdsa.GenerateKey(elliptic.P256(), nil)
fmt.Println(key.D)

// Generate a 64-bit integer that is prime with high probability.
prim, _ := rand.Prime(nil, 64)
fmt.Println(prim)
16265662996876675161677719946085651215874831846675169870638460773593241527197
14874320216361938581

The change applies to the following crypto subpackages:

// crypto/dsa
func GenerateKey(priv *PrivateKey, rand io.Reader) error

// crypto/ecdh
type Curve interface {
    // ...
    GenerateKey(rand io.Reader) (*PrivateKey, error)
}

// crypto/ecdsa
func GenerateKey(c elliptic.Curve, rand io.Reader) (*PrivateKey, error)
func SignASN1(rand io.Reader, priv *PrivateKey, hash []byte) ([]byte, error)
func Sign(rand io.Reader, priv *PrivateKey, hash []byte) (r, s *big.Int, err error)
func (priv *PrivateKey) Sign(rand io.Reader, digest []byte, opts crypto.SignerOpts) ([]byte, error)

// crypto/rand
func Prime(rand io.Reader, bits int) (*big.Int, error)

// crypto/rsa
func GenerateKey(random io.Reader, bits int) (*PrivateKey, error)
func GenerateMultiPrimeKey(random io.Reader, nprimes int, bits int) (*PrivateKey, error)
func EncryptPKCS1v15(random io.Reader, pub *PublicKey, msg []byte) ([]byte, error)

ed25519.GenerateKey(rand) still uses the random reader if provided. But if rand is nil, it uses an internal secure source of random bytes instead of crypto/rand.Reader (which could be overridden).

To support deterministic testing, there's a new testing/cryptotest package with a single SetGlobalRandom function. It sets a global, deterministic cryptographic randomness source for the duration of the given test:

func Test(t *testing.T) {
    cryptotest.SetGlobalRandom(t, 42)

    // All test runs will generate the same numbers.
    p1, _ := rand.Prime(nil, 32)
    p2, _ := rand.Prime(nil, 32)
    p3, _ := rand.Prime(nil, 32)

    got := [3]int64{p1.Int64(), p2.Int64(), p3.Int64()}
    want := [3]int64{3713413729, 3540452603, 4293217813}
    if got != want {
        t.Errorf("got %v, want %v", got, want)
    }
}
PASS

SetGlobalRandom affects crypto/rand and all implicit sources of cryptographic randomness in the crypto/* packages:

func Test(t *testing.T) {
    cryptotest.SetGlobalRandom(t, 42)

    t.Run("rand.Read", func(t *testing.T) {
        var got [4]byte
        rand.Read(got[:])
        want := [4]byte{34, 48, 31, 184}
        if got != want {
            t.Errorf("got %v, want %v", got, want)
        }
    })

    t.Run("rand.Int", func(t *testing.T) {
        got, _ := rand.Int(rand.Reader, big.NewInt(10000))
        const want = 6185
        if got.Int64() != want {
            t.Errorf("got %v, want %v", got.Int64(), want)
        }
    })
}
PASS

To temporarily restore the old reader-respecting behavior, set GODEBUG=cryptocustomrand=1 (this option will be removed in a future release).

𝗗 testing/cryptotest • 𝗣 70942 • 𝗖𝗟 724480 • 𝗔 Filippo Valsorda, qiulaidongfeng

# Goroutine leak profile (experimental)

A leak occurs when one or more goroutines are indefinitely blocked on synchronization primitives like channels, while other goroutines continue running and the program as a whole keeps functioning. Here's a simple example:

func leak() <-chan int {
    out := make(chan int)
    go func() {
        out <- 42 // leaks if nobody reads from out
    }()
    return out
}

If we call leak and don't read from the output channel, the inner leak goroutine will stay blocked trying to send to the channel for the rest of the program:

func main() {
    leak()
    // ...
}
ok

Unlike deadlocks, leaks do not cause panics, so they are much harder to spot. Also, unlike data races, Go's tooling did not address them for a long time.

Things started to change in Go 1.24 with the introduction of the synctest package. Not many people talk about it, but synctest is a great tool for catching leaks during testing.

Go 1.26 adds a new experimental goroutineleak profile designed to report leaked goroutines in production. Here's how we can use it in the example above:

func main() {
    prof := pprof.Lookup("goroutineleak")
    leak()
    time.Sleep(50 * time.Millisecond)
    prof.WriteTo(os.Stdout, 2)
    // ...
}
goroutine 7 [chan send (leaked)]:
main.leak.func1()
    /tmp/sandbox/main.go:16 +0x1e
created by main.leak in goroutine 1
    /tmp/sandbox/main.go:15 +0x67

As you can see, we have a nice goroutine stack trace that shows exactly where the leak happens.

The goroutineleak profile finds leaks by using the garbage collector's marking phase to check which blocked goroutines are still connected to active code. It starts with runnable goroutines, marks all sync objects they can reach, and keeps adding any blocked goroutines waiting on those objects. When it can't add any more, any blocked goroutines left are waiting on resources that can't be reached — so they're considered leaked.

Tell me more

Here's the gist of it:

   [ Start: GC mark phase ]
             │ 1. Collect live goroutines
             v
   ┌───────────────────────┐
   │   Initial roots       │ <────────────────┐
   │ (runnable goroutines) │                  │
   └───────────────────────┘                  │
             │                                │
             │ 2. Mark reachable memory       │
             v                                │
   ┌───────────────────────┐                  │
   │   Reachable objects   │                  │
   │  (channels, mutexes)  │                  │
   └───────────────────────┘                  │
             │                                │
             │ 3a. Check blocked goroutines   │
             v                                │
   ┌───────────────────────┐          (Yes)   │
   │ Is blocked G waiting  │ ─────────────────┘
   │ on a reachable obj?   │ 3b. Add G to roots
   └───────────────────────┘
             │ (No - repeat until no new Gs found)
             v
   ┌───────────────────────┐
   │   Remaining blocked   │
   │      goroutines       │
   └───────────────────────┘
             │ 5. Report the leaks
             v
      [   LEAKED!   ]
 (Blocked on unreachable
  synchronization objects)
  1. Collect live goroutines. Start with currently active (runnable or running) goroutines as roots. Ignore blocked goroutines for now.
  2. Mark reachable memory. Trace pointers from roots to find which synchronization objects (like channels or wait groups) are currently reachable by these roots.
  3. Resurrect blocked goroutines. Check all currently blocked goroutines. If a blocked goroutine is waiting for a synchronization resource that was just marked as reachable — add that goroutine to the roots.
  4. Iterate. Repeat steps 2 and 3 until there are no more new goroutines blocked on reachable objects.
  5. Report the leaks. Any goroutines left in the blocked state are waiting for resources that no active part of the program can access. They're considered leaked.

For even more details, see the paper by Saioc et al.

If you want to see how goroutineleak (and synctest) can catch typical leaks that often happen in production — check out my article on goroutine leaks.

The goroutineleak profile is experimental and can be enabled by setting GOEXPERIMENT=goroutineleakprofile at build time. Enabling the experiment also makes the profile available as a net/http/pprof endpoint, /debug/pprof/goroutineleak.
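
If your service already exposes pprof over HTTP, the new profile becomes available alongside the existing ones (a sketch, assuming the experiment is enabled at build time):

import (
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/ handlers
)

func main() {
    // Serve the pprof endpoints; the goroutine leak profile is at
    // /debug/pprof/goroutineleak when the experiment is enabled.
    go http.ListenAndServe("localhost:6060", nil)
    // ...
}

Then fetch the profile with go tool pprof http://localhost:6060/debug/pprof/goroutineleak.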

According to the authors, the implementation is already production-ready. It's only marked as experimental so they can get feedback on the API, especially about making it a new profile.

𝗗 runtime/pprof • 𝗚 Detecting leaks • 𝗣 74609, 75280 • 𝗖𝗟 688335 • 𝗔 Vlad Saioc

# Goroutine metrics

New metrics in the runtime/metrics package give better insight into goroutine scheduling:

  • Total number of goroutines since the program started.
  • Number of goroutines in each state.
  • Number of active threads.

Here's the full list:

/sched/goroutines-created:goroutines
    Count of goroutines created since program start.

/sched/goroutines/not-in-go:goroutines
    Approximate count of goroutines running
    or blocked in a system call or cgo call.

/sched/goroutines/runnable:goroutines
    Approximate count of goroutines ready to execute,
    but not executing.

/sched/goroutines/running:goroutines
    Approximate count of goroutines executing.
    Always less than or equal to /sched/gomaxprocs:threads.

/sched/goroutines/waiting:goroutines
    Approximate count of goroutines waiting
    on a resource (I/O or sync primitives).

/sched/threads/total:threads
    The current count of live threads
    that are owned by the Go runtime.

Per-state goroutine metrics can be linked to common production issues. For example, an increasing waiting count can show a lock contention problem. A high not-in-go count means goroutines are stuck in syscalls or cgo. A growing runnable backlog suggests the CPUs can't keep up with demand.

You can read the new metric values using the regular metrics.Read function:

func main() {
    go work() // omitted for brevity
    time.Sleep(100 * time.Millisecond)

    fmt.Println("Goroutine metrics:")
    printMetric("/sched/goroutines-created:goroutines", "Created")
    printMetric("/sched/goroutines:goroutines", "Live")
    printMetric("/sched/goroutines/not-in-go:goroutines", "Syscall/CGO")
    printMetric("/sched/goroutines/runnable:goroutines", "Runnable")
    printMetric("/sched/goroutines/running:goroutines", "Running")
    printMetric("/sched/goroutines/waiting:goroutines", "Waiting")

    fmt.Println("Thread metrics:")
    printMetric("/sched/gomaxprocs:threads", "Max")
    printMetric("/sched/threads/total:threads", "Live")
}

func printMetric(name string, descr string) {
    sample := []metrics.Sample{{Name: name}}
    metrics.Read(sample)
    // Assuming a uint64 value; don't do this in production.
    // Instead, check sample[0].Value.Kind and handle accordingly.
    fmt.Printf("  %s: %v\n", descr, sample[0].Value.Uint64())
}
Goroutine metrics:
  Created: 57
  Live: 21
  Syscall/CGO: 0
  Runnable: 0
  Running: 1
  Waiting: 20
Thread metrics:
  Max: 2
  Live: 4

The per-state numbers (not-in-go + runnable + running + waiting) are not guaranteed to add up to the live goroutine count (/sched/goroutines:goroutines, available since Go 1.16).

All new metrics use uint64 counters.

𝗗 runtime/metrics • 𝗣 15490 • 𝗖𝗟 690397, 690398, 690399 • 𝗔 Michael Knyszek

# Reflective iterators

The new Type.Fields and Type.Methods methods in the reflect package return iterators for a type's fields and methods:

// List the fields of a struct type.
typ := reflect.TypeFor[http.Client]()
for f := range typ.Fields() {
    fmt.Println(f.Name, f.Type)
}
Transport http.RoundTripper
CheckRedirect func(*http.Request, []*http.Request) error
Jar http.CookieJar
Timeout time.Duration
// List the methods of a struct type.
typ := reflect.TypeFor[*http.Client]()
for m := range typ.Methods() {
    fmt.Println(m.Name, m.Type)
}
CloseIdleConnections func(*http.Client)
Do func(*http.Client, *http.Request) (*http.Response, error)
Get func(*http.Client, string) (*http.Response, error)
Head func(*http.Client, string) (*http.Response, error)
Post func(*http.Client, string, string, io.Reader) (*http.Response, error)
PostForm func(*http.Client, string, url.Values) (*http.Response, error)

The new methods Type.Ins and Type.Outs return iterators for the input and output parameters of a function type:

typ := reflect.TypeFor[filepath.WalkFunc]()

fmt.Println("Inputs:")
for par := range typ.Ins() {
    fmt.Println("-", par.Name())
}

fmt.Println("Outputs:")
for par := range typ.Outs() {
    fmt.Println("-", par.Name())
}
Inputs:
- string
- FileInfo
- error
Outputs:
- error

The new methods Value.Fields and Value.Methods return iterators for a value's fields and methods. Each iteration yields both the type information (StructField or Method) and the value:

client := &http.Client{}
val := reflect.ValueOf(client)

fmt.Println("Fields:")
for f, v := range val.Elem().Fields() {
    fmt.Printf("- name=%s kind=%s\n", f.Name, v.Kind())
}

fmt.Println("Methods:")
for m, v := range val.Methods() {
    fmt.Printf("- name=%s kind=%s\n", m.Name, v.Kind())
}
Fields:
- name=Transport kind=interface
- name=CheckRedirect kind=func
- name=Jar kind=interface
- name=Timeout kind=int64
Methods:
- name=CloseIdleConnections kind=func
- name=Do kind=func
- name=Get kind=func
- name=Head kind=func
- name=Post kind=func
- name=PostForm kind=func

Previously, you could get all this information by using a for-range loop with NumX methods (which is what iterators do internally):

// go 1.25
typ := reflect.TypeFor[http.Client]()
for i := range typ.NumField() {
    field := typ.Field(i)
    fmt.Println(field.Name, field.Type)
}
Transport http.RoundTripper
CheckRedirect func(*http.Request, []*http.Request) error
Jar http.CookieJar
Timeout time.Duration

Using an iterator is more concise. I hope it justifies the increased API surface.

𝗗 reflect • 𝗣 66631 • 𝗖𝗟 707356 • 𝗔 Quentin Quaadgras

# Peek into a buffer

The new Buffer.Peek method in the bytes package returns the next N bytes from the buffer without advancing it:

buf := bytes.NewBufferString("I love bytes")

sample, err := buf.Peek(1)
fmt.Printf("peek=%s err=%v\n", sample, err)

buf.Next(2)

sample, err = buf.Peek(4)
fmt.Printf("peek=%s err=%v\n", sample, err)
peek=I err=<nil>
peek=love err=<nil>

If Peek returns fewer than N bytes, it also returns io.EOF:

buf := bytes.NewBufferString("hello")
sample, err := buf.Peek(10)
fmt.Printf("peek=%s err=%v\n", sample, err)
peek=hello err=EOF

The slice returned by Peek points to the buffer's content and stays valid until the buffer is changed. So, if you change the slice right away, it will affect future reads:

buf := bytes.NewBufferString("car")
sample, err := buf.Peek(3)
fmt.Printf("peek=%s err=%v\n", sample, err)

sample[2] = 't' // changes the underlying buffer

data, err := buf.ReadBytes(0)
fmt.Printf("data=%s err=%v\n", data, err)
peek=car err=<nil>
data=cat err=EOF

The slice returned by Peek is only valid until the next call to a read or write method.

𝗗 Buffer.Peek • 𝗣 73794 • 𝗖𝗟 674415 • 𝗔 Ilia Choly

# Process handle

After you start a process in Go, you can access its ID:

attr := &os.ProcAttr{Files: []*os.File{os.Stdin, os.Stdout, os.Stderr}}
proc, _ := os.StartProcess("/bin/echo", []string{"echo", "hello"}, attr)
defer proc.Wait()

fmt.Println("pid =", proc.Pid)
pid = 41
hello

Internally, the os.Process type uses a process handle instead of the PID (which is just an integer), if the operating system supports it. Specifically, in Linux it uses pidfd, which is a file descriptor that refers to a process. Using the handle instead of the PID makes sure that Process methods always work with the same OS process, and not a different process that just happens to have the same ID.

Previously, you couldn't access the process handle. Now you can, thanks to the new Process.WithHandle method:

func (p *Process) WithHandle(f func(handle uintptr)) error

WithHandle calls a specified function and passes a process handle as an argument:

attr := &os.ProcAttr{Files: []*os.File{os.Stdin, os.Stdout, os.Stderr}}
proc, _ := os.StartProcess("/bin/echo", []string{"echo", "hello"}, attr)
defer proc.Wait()

fmt.Println("pid =", proc.Pid)
proc.WithHandle(func(handle uintptr) {
    fmt.Println("handle =", handle)
})
pid = 49
handle = 6
hello

The handle is guaranteed to refer to the process until the callback function returns, even if the process has already terminated. That's why it's implemented as a callback instead of a Process.Handle field or method.

WithHandle is only supported on Linux 5.4+ and Windows. On other operating systems, it doesn't execute the callback and returns an os.ErrNoHandle error.

𝗗 Process.WithHandle • 𝗣 70352 • 𝗖𝗟 699615 • 𝗔 Kir Kolyshkin

# Signal as cause

signal.NotifyContext returns a context that gets canceled when any of the specified signals is received. Previously, the canceled context only showed the standard "context canceled" cause:

// go 1.25

// The context will be canceled on SIGINT signal.
ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
defer stop()

// Send SIGINT to self.
p, _ := os.FindProcess(os.Getpid())
_ = p.Signal(syscall.SIGINT)

// Wait for SIGINT.
<-ctx.Done()
fmt.Println("err =", ctx.Err())
fmt.Println("cause =", context.Cause(ctx))
err = context canceled
cause = context canceled

Now the context's cause shows exactly which signal was received:

// go 1.26

// The context will be canceled on SIGINT signal.
ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
defer stop()

// Send SIGINT to self.
p, _ := os.FindProcess(os.Getpid())
_ = p.Signal(syscall.SIGINT)

// Wait for SIGINT.
<-ctx.Done()
fmt.Println("err =", ctx.Err())
fmt.Println("cause =", context.Cause(ctx))
err = context canceled
cause = interrupt signal received

The returned type, signal.signalError, is based on string, so it doesn't provide the actual os.Signal value — just its string representation.

𝗗 signal.NotifyContext • 𝗖𝗟 721700 • 𝗔 Filippo Valsorda

# Compare IP subnets

An IP address prefix represents an IP subnet. These prefixes are usually written in CIDR notation:

10.0.0.0/16
127.0.0.0/8
169.254.0.0/16
203.0.113.0/24

In Go, an IP prefix is represented by the netip.Prefix type.

The new Prefix.Compare method lets you compare two IP prefixes, making it easy to sort them without having to write your own comparison code:

prefixes := []netip.Prefix{
    netip.MustParsePrefix("10.1.0.0/16"),
    netip.MustParsePrefix("203.0.113.0/24"),
    netip.MustParsePrefix("10.0.0.0/16"),
    netip.MustParsePrefix("169.254.0.0/16"),
    netip.MustParsePrefix("203.0.113.0/8"),
}

slices.SortFunc(prefixes, netip.Prefix.Compare)

for _, p := range prefixes {
    fmt.Println(p.String())
}
10.0.0.0/16
10.1.0.0/16
169.254.0.0/16
203.0.113.0/8
203.0.113.0/24

Compare orders two prefixes as follows:

  • First by validity (invalid before valid).
  • Then by address family (IPv4 before IPv6).
    10.0.0.0/8 < ::/8
  • Then by masked IP address (network IP).
    10.0.0.0/16 < 10.1.0.0/16
  • Then by prefix length.
    10.0.0.0/8 < 10.0.0.0/16
  • Then by unmasked address (original IP).
    10.0.0.0/8 < 10.0.0.1/8

This follows the same order as Python's netaddr.IPNetwork and the standard IANA (Internet Assigned Numbers Authority) convention.
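
For example, two prefixes with the same network IP are ordered by prefix length (my own illustration):

a := netip.MustParsePrefix("10.0.0.0/8")
b := netip.MustParsePrefix("10.0.0.0/16")
// a sorts before b: same masked address, shorter prefix first.
fmt.Println(a.Compare(b) < 0)
true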

𝗗 Prefix.Compare • 𝗣 61642 • 𝗖𝗟 700355 • 𝗔 database64128

# Context-aware dialing

The net package has top-level functions for connecting to an address using different networks (protocols) — DialTCP, DialUDP, DialIP, and DialUnix. They were made before context.Context was introduced, so they don't support cancellation:

raddr, _ := net.ResolveTCPAddr("tcp", "127.0.0.1:12345")
conn, err := net.DialTCP("tcp", nil, raddr)
fmt.Printf("connected, err=%v\n", err)
defer conn.Close()
connected, err=<nil>

There's also a net.Dialer type with a general-purpose DialContext method. It supports cancellation and can be used to connect to any of the known networks:

var d net.Dialer
ctx := context.Background()
conn, err := d.DialContext(ctx, "tcp", "127.0.0.1:12345")
fmt.Printf("connected, err=%v\n", err)
defer conn.Close()
connected, err=<nil>

However, DialContext is a bit less efficient than network-specific functions like net.DialTCP — because of the extra overhead from address resolution and network type dispatching.

So, network-specific functions in the net package are more efficient, but they don't support cancellation. The Dialer type supports cancellation, but it's less efficient. The Go team decided to resolve this contradiction.

The new context-aware Dialer methods (DialTCP, DialUDP, DialIP, and DialUnix) combine the efficiency of the existing network-specific net functions with the cancellation capabilities of Dialer.DialContext:

var d net.Dialer
ctx := context.Background()
raddr := netip.MustParseAddrPort("127.0.0.1:12345")
conn, err := d.DialTCP(ctx, "tcp", netip.AddrPort{}, raddr)
fmt.Printf("connected, err=%v\n", err)
defer conn.Close()
connected, err=<nil>
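
Since the method takes a context, cancellation and timeouts work the same way as with DialContext. Here's a sketch reusing the signature from the example above (192.0.2.1 is a reserved test address, so the dial should fail with a timeout):

var d net.Dialer
ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
defer cancel()

raddr := netip.MustParseAddrPort("192.0.2.1:80")
_, err := d.DialTCP(ctx, "tcp", netip.AddrPort{}, raddr)
fmt.Println(err) // prints a dial timeout error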

I wouldn't say that having three different ways to dial is very convenient, but that's the price of backward compatibility.

𝗗 net.Dialer • 𝗣 49097 • 𝗖𝗟 490975 • 𝗔 Michael Fraenkel

# Fake example.com

The default httptest.Server certificate already lists example.com in its DNSNames (the hostnames the certificate is authorized to secure). However, the client returned by Server.Client only trusts the test server's certificate, so it doesn't trust responses from the real example.com:

// go 1.25
func Test(t *testing.T) {
    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("hello"))
    })
    srv := httptest.NewTLSServer(handler)
    defer srv.Close()

    _, err := srv.Client().Get("https://example.com")
    if err != nil {
        t.Fatal(err)
    }
}
--- FAIL: Test (0.29s)
    main_test.go:19: Get "https://example.com":
    tls: failed to verify certificate:
    x509: certificate signed by unknown authority

To fix this issue, the HTTP client returned by httptest.Server.Client now redirects requests for example.com and its subdomains to the test server:

// go 1.26
func Test(t *testing.T) {
    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("hello"))
    })
    srv := httptest.NewTLSServer(handler)
    defer srv.Close()

    resp, err := srv.Client().Get("https://example.com")
    if err != nil {
        t.Fatal(err)
    }

    body, _ := io.ReadAll(resp.Body)
    resp.Body.Close()

    if string(body) != "hello" {
        t.Errorf("Unexpected response body: %s", body)
    }
}
PASS

𝗗 Server.Client • 𝗖𝗟 666855 • 𝗔 Sean Liao

# Optimized fmt.Errorf

People often point out that using fmt.Errorf("x") for plain strings causes more memory allocations than errors.New("x"). Because of this, some suggest switching code from fmt.Errorf to errors.New when formatting isn't needed.

The Go team disagrees. Here's a quote from Russ Cox:

Using fmt.Errorf("foo") is completely fine, especially in a program where all the errors are constructed with fmt.Errorf. Having to mentally switch between two functions based on the argument is unnecessary noise.

With the new Go release, this debate should finally be settled. For unformatted strings, fmt.Errorf now allocates less and generally matches the allocations for errors.New.

Specifically, fmt.Errorf goes from 2 allocations to 0 allocations for a non-escaping error, and from 2 allocations to 1 allocation for an escaping error:

_ = fmt.Errorf("foo")    // non-escaping error
sink = fmt.Errorf("foo") // escaping error

This matches the allocations for errors.New in both cases.

The difference in CPU cost is also much smaller now. Previously, it was ~64ns vs. ~21ns for fmt.Errorf vs. errors.New for escaping errors, now it's ~25ns vs. ~21ns.

Tell me more

Here are the "before and after" benchmarks for the fmt.Errorf change. The non-escaping case is called local, and the escaping case is called sink. If there's just a plain error string, it's no-args. If the error includes formatting, it's int-arg.

Seconds per operation:

goos: linux
goarch: amd64
pkg: fmt
cpu: AMD EPYC 7B13
                         │    old.txt    │        new.txt        │
                         │      sec/op   │   sec/op     vs base  │
Errorf/no-args/local-16     63.76n ± 1%     4.874n ± 0%  -92.36% (n=120)
Errorf/no-args/sink-16      64.25n ± 1%     25.81n ± 0%  -59.83% (n=120)
Errorf/int-arg/local-16     90.86n ± 1%     90.97n ± 1%        ~ (p=0.713 n=120)
Errorf/int-arg/sink-16      91.81n ± 1%     91.10n ± 1%   -0.76% (p=0.036 n=120)

Bytes per operation:

                         │    old.txt    │        new.txt       │
                         │       B/op    │    B/op     vs base  │
Errorf/no-args/local-16      19.00 ± 0%      0.00 ± 0%  -100.00% (n=120)
Errorf/no-args/sink-16       19.00 ± 0%     16.00 ± 0%   -15.79% (n=120)
Errorf/int-arg/local-16      24.00 ± 0%     24.00 ± 0%         ~ (p=1.000 n=120)
Errorf/int-arg/sink-16       24.00 ± 0%     24.00 ± 0%         ~ (p=1.000 n=120)

Allocations per operation:

                         │    old.txt    │        new.txt       │
                         │    allocs/op  │  allocs/op   vs base │
Errorf/no-args/local-16      2.000 ± 0%     0.000 ± 0%  -100.00% (n=120)
Errorf/no-args/sink-16       2.000 ± 0%     1.000 ± 0%   -50.00% (n=120)
Errorf/int-arg/local-16      2.000 ± 0%     2.000 ± 0%         ~ (p=1.000 n=120)
Errorf/int-arg/sink-16       2.000 ± 0%     2.000 ± 0%         ~ (p=1.000 n=120)

source

If you're interested in the details, I highly recommend reading the CL — it's perfectly written.

𝗗 fmt.Errorf • 𝗖𝗟 708836 • 𝗔 thepudds

# Optimized io.ReadAll

Previously, io.ReadAll allocated a lot of intermediate memory as it grew its result slice to the size of the input data. Now, it uses intermediate slices of exponentially growing size, and then copies them into a final perfectly-sized slice at the end.

The new implementation is about twice as fast and uses roughly half the memory for a 65KiB input; it's even more efficient with larger inputs. Here are the geomean results comparing the old and new versions for different input sizes:

                      │     old     │      new       vs base    │
          sec/op           132.2µ        66.32µ     -49.83%
            B/op          645.4Ki       324.6Ki     -49.70%
  final-capacity           178.3k        151.3k     -15.10%
    excess-ratio            1.216         1.033     -15.10%

See the full benchmark results in the commit. Unfortunately, the author didn't provide the benchmark source code.
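
A basic benchmark to reproduce the setup could look like this (my own sketch; it doesn't report the extra final-capacity and excess-ratio metrics from the table above):

func BenchmarkReadAll(b *testing.B) {
    data := bytes.Repeat([]byte("x"), 64*1024)
    b.ReportAllocs()
    for b.Loop() {
        r := bytes.NewReader(data)
        if _, err := io.ReadAll(r); err != nil {
            b.Fatal(err)
        }
    }
}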

Ensuring the final slice is minimally sized is also quite helpful. The slice might persist for a long time, and the unused capacity in a backing array (as in the old version) would just waste memory.

As with the fmt.Errorf optimization, I recommend reading the CL — it's very good. Both changes come from thepudds, whose change descriptions are every reviewer's dream come true.

𝗗 io.ReadAll • 𝗖𝗟 722500 • 𝗔 thepudds

# Multiple log handlers

The log/slog package, introduced in version 1.21, offers a reliable, production-ready logging solution. Since its release, many projects have switched from third-party logging packages to use it. However, it was missing one key feature: the ability to send log records to multiple handlers, such as stdout or a log file.

The new MultiHandler type solves this problem. It implements the standard Handler interface and calls all the handlers you set up.

For example, we can create a log handler that writes to stdout:

stdoutHandler := slog.NewTextHandler(os.Stdout, nil)

And another handler that writes to a file:

const flags = os.O_CREATE | os.O_WRONLY | os.O_APPEND
file, _ := os.OpenFile("/tmp/app.log", flags, 0644)
defer file.Close()
fileHandler := slog.NewJSONHandler(file, nil)

Finally, combine them using a MultiHandler:

// MultiHandler that writes to both stdout and app.log.
multiHandler := slog.NewMultiHandler(stdoutHandler, fileHandler)
logger := slog.New(multiHandler)

// Log a sample message.
logger.Info("login",
    slog.String("name", "whoami"),
    slog.Int("id", 42),
)
time=2025-12-31T11:46:14.521Z level=INFO msg=login name=whoami id=42
{"time":"2025-12-31T11:46:14.521126342Z","level":"INFO","msg":"login","name":"whoami","id":42}

I'm also printing the file contents here to show the results.

When the MultiHandler receives a log record, it sends it to each enabled handler one by one. If any handler returns an error, MultiHandler doesn't stop; instead, it combines all the errors using errors.Join:

hInfo := slog.NewTextHandler(
    os.Stdout, &slog.HandlerOptions{Level: slog.LevelInfo},
)
hErrorsOnly := slog.NewTextHandler(
    os.Stdout, &slog.HandlerOptions{Level: slog.LevelError},
)
hBroken := &BrokenHandler{
    Handler: hInfo,
    err:     fmt.Errorf("broken handler"),
}

handler := slog.NewMultiHandler(hBroken, hInfo, hErrorsOnly)
rec := slog.NewRecord(time.Now(), slog.LevelInfo, "hello", 0)

// Calls hInfo and hBroken, skips hErrorsOnly.
// Returns an error from hBroken.
err := handler.Handle(context.Background(), rec)
fmt.Println(err)
time=2025-12-31T13:32:52.110Z level=INFO msg=hello
broken handler
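
Here, BrokenHandler is a small test helper (not part of slog) that could be defined like this:

// BrokenHandler delegates Enabled/WithAttrs/WithGroup to the embedded
// handler, but always fails to handle records with the configured error.
type BrokenHandler struct {
    slog.Handler
    err error
}

func (h *BrokenHandler) Handle(ctx context.Context, r slog.Record) error {
    return h.err
}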

The Enabled method reports whether any of the configured handlers is enabled:

hInfo := slog.NewTextHandler(
    os.Stdout, &slog.HandlerOptions{Level: slog.LevelInfo},
)
hErrors := slog.NewTextHandler(
    os.Stdout, &slog.HandlerOptions{Level: slog.LevelError},
)
handler := slog.NewMultiHandler(hInfo, hErrors)

// hInfo is enabled.
enabled := handler.Enabled(context.Background(), slog.LevelInfo)
fmt.Println(enabled)
true

Other methods — WithAttrs and WithGroup — call the corresponding methods on each of the enabled handlers.

𝗗 slog.MultiHandler • 𝗣 65954 • 𝗖𝗟 692237 • 𝗔 Jes Cok

# Test artifacts

Test artifacts are files created by tests or benchmarks, such as execution logs, memory dumps, or analysis reports. They are important for debugging failures in remote environments (like CI), where developers can't step through the code manually.

Previously, the Go test framework and tools didn't support test artifacts. Now they do.

The new methods T.ArtifactDir, B.ArtifactDir, and F.ArtifactDir return a directory where you can write test output files:

func TestFunc(t *testing.T) {
    dir := t.ArtifactDir()
    logFile := filepath.Join(dir, "app.log")
    content := []byte("Loading user_id=123...\nERROR: Connection failed\n")
    os.WriteFile(logFile, content, 0644)
    t.Log("Saved app.log")
}

If you use go test with -artifacts, this directory will be inside the output directory (specified by -outputdir, or the current directory by default):

go1.26rc1 test -v -artifacts -outputdir=/tmp/output
=== RUN   TestFunc
=== ARTIFACTS TestFunc /tmp/output/_artifacts/2933211134
    artifacts_test.go:14: Saved app.log
--- PASS: TestFunc (0.00s)

As you can see, the first time ArtifactDir is called, it writes the directory location to the test log, which is quite handy.

If you don't use -artifacts, artifacts are stored in a temporary directory which is deleted after the test completes.

Each test or subtest within each package has its own unique artifact directory. Subtest outputs are not stored inside the parent test's output directory — all artifact directories for a given package are created at the same level:

func TestFunc(t *testing.T) {
    t.ArtifactDir()
    t.Run("subtest 1", func(t *testing.T) {
        t.ArtifactDir()
    })
    t.Run("subtest 2", func(t *testing.T) {
        t.ArtifactDir()
    })
}
=== RUN   TestFunc
=== ARTIFACTS TestFunc /tmp/output/_artifacts/2878232317
=== RUN   TestFunc/subtest_1
=== ARTIFACTS TestFunc/subtest_1 /tmp/output/_artifacts/1651881503
=== RUN   TestFunc/subtest_2
=== ARTIFACTS TestFunc/subtest_2 /tmp/output/_artifacts/3341607601

The artifact directory path normally looks like this:

<output dir>/_artifacts/<test package>/<test name>/<random>

But if this path can't be safely converted into a local file path (which, for some reason, always happens on my machine), the path will simply be:

<output dir>/_artifacts/<random>

(which is what happens in the examples above)

Repeated calls to ArtifactDir in the same test or subtest return the same directory.

𝗗 T.ArtifactDir • 𝗣 71287 • 𝗖𝗟 696399 • 𝗔 Damien Neil

# Modernized go fix

Over the years, the go fix command became a sad, neglected bag of rewrites for very ancient Go features. But now, it's making a comeback.

The new go fix is re-implemented using the Go analysis framework — the same one go vet uses.

While go fix and go vet now use the same infrastructure, they have different purposes and use different sets of analyzers:

  • Vet is for reporting problems. Its analyzers describe actual issues, but they don't always suggest fixes, and the fixes aren't always safe to apply.
  • Fix is (mostly) for modernizing the code to use newer language and library features. Its analyzers produce fixes that are always safe to apply, but don't necessarily indicate problems with the code.

usage: go fix [build flags] [-fixtool prog] [fix flags] [packages]

Fix runs the Go fix tool (cmd/fix) on the named packages
and applies suggested fixes.

It supports these flags:

  -diff
        instead of applying each fix, print the patch as a unified diff

The -fixtool=prog flag selects a different analysis tool with
alternative or additional fixers.

By default, go fix runs a full set of analyzers (currently, there are more than 20). To choose specific analyzers, use the -NAME flag for each one, or use -NAME=false to run all analyzers except the ones you turned off.

For example, here we only enable the forvar analyzer:

go fix -forvar .

And here, we enable all analyzers except omitzero:

go fix -omitzero=false .

Currently, there's no way to suppress specific analyzers for certain files or sections of code.

To give you a taste of go fix analyzers, here's one of them in action. It replaces loops with slices.Contains or slices.ContainsFunc:

// before go fix
func find(s []int, x int) bool {
    for _, v := range s {
        if x == v {
            return true
        }
    }
    return false
}
// after go fix
func find(s []int, x int) bool {
    return slices.Contains(s, x)
}

If you're interested, check out the dedicated blog post for the full list of analyzers with examples.

𝗗 cmd/fix • 𝗚 go fix • 𝗣 71859 • 𝗔 Alan Donovan

# Final thoughts

Go 1.26 is incredibly big — it's the largest release I've ever seen, and for good reason:

  • It brings a lot of useful updates, like the improved new builtin, type-safe error checking, and goroutine leak detector.
  • There are also many performance upgrades, including the new garbage collector, faster cgo and memory allocation, and optimized fmt.Errorf and io.ReadAll.
  • On top of that, it adds quality-of-life features like multiple log handlers, test artifacts, and the updated go fix tool.
  • Finally, there are two specialized experimental packages: one with SIMD support and another with secret mode for forward secrecy.

All in all, a great release!

You might be wondering about the json/v2 package that was introduced as experimental in 1.25. It's still experimental and available with the GOEXPERIMENT=jsonv2 flag.

P.S. To catch up on other Go releases, check out the Go features by version list or explore the interactive tours for Go 1.25 and 1.24.

P.P.S. Want to learn more about Go? Check out my interactive book on concurrency

]]>
Fear is not advocacyhttps://antonz.org/ai-advocacy/Sun, 04 Jan 2026 12:00:00 +0000https://antonz.org/ai-advocacy/And you are going to be fine.AI advocates seem to be the only kind of technology advocates who feel this imminent urge to constantly criticize developers for not being excited enough about their tech.

It would be crazy if I presented new Go features like this:

If you still don't use the synctest package, all your systems will eventually succumb to concurrency bugs.

or

If you don't use iterators, you have absolutely nothing interesting to build.

The job of an advocate is to spark interest, not to reproach people or instill FOMO. And yet that's exactly what AI advocates do.

What a weird way to advocate.

It's okay not to be early

This whole "devote your life to AI right now, or you'll be out of a job soon" narrative is false.

You don't have to be a world-class algorithm expert to write good software. You don't have to be a Linux expert to use containers. And you don't have to spend all your time now trying to become an expert in chasing ever-changing AI tech.

As with any new technology, developers adopting AI typically fall into four groups: early adopters, early majority, late majority, and laggards. Right now, AI advocates are trying to shame everyone into becoming early adopters. But it's perfectly okay to wait if you're sceptical. Being part of the late majority is a safe and reasonable choice. If anything, you'll have fewer bugs to deal with.

As the industry adopts AI practices, you'll naturally absorb just the right amount of them.

You are going to be fine.

]]>
2026https://antonz.org/2026/Thu, 01 Jan 2026 00:00:00 +0000https://antonz.org/2026/'Better C' playgroundshttps://antonz.org/better-c/Fri, 26 Dec 2025 19:00:00 +0000https://antonz.org/better-c/Playgrounds for C3, Hare, Odin, V, and Zig.I have a soft spot for the "better C" family of languages: C3, Hare, Odin, V, and Zig.

I'm not saying these languages are actually better than C — they're just different. But I needed to come up with an umbrella term for them, and "better C" was the only thing that came to mind.

I believe playgrounds and interactive documentation make programming languages easier for more people to learn. That's why I created online sandboxes for these langs. You can try them out below, embed them on your own website, or self-host and customize them.

If you're already familiar with one of these languages, maybe you could even create an interactive guide for it? I'm happy to help if you want to give it a try.

C3 • Hare • Odin • V • Zig • Editors

C3

An ergonomic, safe, and familiar evolution of C.

import std::io;

fn void greet(String name)
{
    io::printfn("Hello, %s!", name);
}

fn void main()
{
    greet("World");
}
Hello, World!

⛫ homepage • αω tutorial • ⚘ community

Hare

A systems programming language designed to be simple, stable, and robust.

use fmt;

fn greet(user: str) void = {
	fmt::printfln("Hello, {}!", user)!;
};

export fn main() void = {
	greet("World");
};
Hello, World!

⛫ homepage • αω tutorial • ⚘ community

Odin

A high-performance, data-oriented systems programming language.

package main

import "core:fmt"

greet :: proc(name: string) {
    fmt.printf("Hello, %s!\n", name)
}

main :: proc() {
    greet("World")
}
Hello, World!

⛫ homepage • αω tutorial • ⚘ community

V

A language with C-level performance and rapid compilation speeds.

fn greet(name string) {
	println('Hello, ${name}!')
}

fn main() {
    greet("World")
}
Hello, World!

⛫ homepage • αω tutorial • ⚘ community

Zig

A language designed for performance and explicit control with powerful metaprogramming.

const std = @import("std");

pub fn greet(name: []const u8) void {
    std.debug.print("Hello, {s}!\n", .{name});
}

pub fn main() void {
    greet("World");
}
Hello, World!

⛫ homepage • αω tutorial • ⚘ community

Editors

If you want to do more than just "hello world," there are also full-size online editors. They're pretty basic, but still can be useful.

]]>
Go feature: Modernized go fixhttps://antonz.org/accepted/modernized-go-fix/Sat, 20 Dec 2025 16:00:00 +0000https://antonz.org/accepted/modernized-go-fix/With a fresh set of analyzers and the same backend as go vet.Part of the Accepted! series: Go proposals and features explained in simple terms.

The modernized go fix command uses a fresh set of analyzers and the same infrastructure as go vet.

Ver. 1.26 • Tools • Medium impact

Summary

The go fix command is re-implemented using the Go analysis framework — the same one go vet uses.

While go fix and go vet now use the same infrastructure, they have different purposes and use different sets of analyzers:

  • Vet is for reporting problems. Its analyzers describe actual issues, but they don't always suggest fixes, and the fixes aren't always safe to apply.
  • Fix is (mostly) for modernizing the code to use newer language and library features. Its analyzers produce fixes that are always safe to apply, but don't necessarily indicate problems with the code.

See the full set of fix's analyzers in the Analyzers section.

Motivation

The main goal is to bring modernization tools from the Go language server (gopls) to the command line. If go fix includes the modernize suite, developers can easily and safely update their entire codebase after a new Go release with just one command.

Re-implementing go fix also makes the Go toolchain simpler. The unified go fix and go vet use the same backend framework and extension mechanism. This makes the tools more consistent, easier to maintain, and more flexible for developers who want to use custom analysis tools.

Description

Implement the new go fix command:

usage: go fix [build flags] [-fixtool prog] [fix flags] [packages]

Fix runs the Go fix tool (cmd/fix) on the named packages
and applies suggested fixes.

It supports these flags:

  -diff
        instead of applying each fix, print the patch as a unified diff

The -fixtool=prog flag selects a different analysis tool with
alternative or additional fixers.

By default, go fix runs a full set of analyzers (see the list below). To choose specific analyzers, use the -NAME flag for each one, or use -NAME=false to run all analyzers except the ones you turned off.

For example, here we only enable the forvar analyzer:

go fix -forvar .

And here, we enable all analyzers except omitzero:

go fix -omitzero=false .

Currently, there's no way to suppress specific analyzers for certain files or sections of code.

The -fixtool=prog flag selects a different analysis tool instead of the default one. For example, you can build and run the "stringintconv" analyzer, which fixes string(int) conversions, by using these commands:

go install golang.org/x/tools/go/analysis/passes/stringintconv/cmd/stringintconv@latest
go fix -fixtool=$(which stringintconv)

Alternative fix tools should be built atop unitchecker, which handles the interaction with go fix.
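
For example, here's a minimal sketch of such a tool that bundles a single analyzer (I'm using stringintconv here just as an illustration):

package main

import (
    "golang.org/x/tools/go/analysis/passes/stringintconv"
    "golang.org/x/tools/go/analysis/unitchecker"
)

func main() {
    // unitchecker.Main implements the protocol that go fix
    // expects from a -fixtool binary.
    unitchecker.Main(stringintconv.Analyzer)
}

Build the binary and point go fix at it with -fixtool, as shown above.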

Analyzers

Here's the list of fixes currently available in go fix, along with examples.

any • bloop • fmtappendf • forvar • hostport • inline • mapsloop • minmax • newexpr • omitzero • plusbuild • rangeint • reflecttypefor • slicescontains • slicessort • stditerators • stringsbuilder • stringscut • stringscutprefix • stringsseq • testingcontext • waitgroup

any

Replace interface{} with any:

// before
func main() {
    var val interface{}
    val = 42
    fmt.Println(val)
}
// after
func main() {
    var val any
    val = 42
    fmt.Println(val)
}

bloop

Replace for-range over b.N with b.Loop and remove unnecessary manual timer control:

// before
func Benchmark(b *testing.B) {
    s := make([]int, 1000)
    for i := range s {
        s[i] = i
    }
    b.ResetTimer()

    for range b.N {
        Calc(s)
    }
}
// after
func Benchmark(b *testing.B) {
    s := make([]int, 1000)
    for i := range s {
        s[i] = i
    }

    for b.Loop() {
        Calc(s)
    }
}

fmtappendf

Replace []byte(fmt.Sprintf) with fmt.Appendf to avoid intermediate string allocation:

// before
func format(id int, name string) []byte {
    return []byte(fmt.Sprintf("ID: %d, Name: %s", id, name))
}
// after
func format(id int, name string) []byte {
    return fmt.Appendf(nil, "ID: %d, Name: %s", id, name)
}

forvar

Remove unnecessary shadowing of loop variables:

// before
func main() {
    for x := range 4 {
        x := x
        go func() {
            fmt.Println(x)
        }()
    }
}
// after
func main() {
    for x := range 4 {
        go func() {
            fmt.Println(x)
        }()
    }
}

hostport

Replace network addresses created with fmt.Sprintf with calls to net.JoinHostPort, because host-port pairs built with %s:%d or %s:%s format strings don't work with IPv6:

// before
func main() {
    host := "::1"
    port := 8080
    addr := fmt.Sprintf("%s:%d", host, port)
    net.Dial("tcp", addr)
}
// after
func main() {
    host := "::1"
    port := 8080
    addr := net.JoinHostPort(host, fmt.Sprintf("%d", port))
    net.Dial("tcp", addr)
}

inline

Inline function calls according to the go:fix inline comment directives:

// before
//go:fix inline
func Square(x float64) float64 {
    return math.Pow(float64(x), 2)
}

func main() {
    fmt.Println(Square(5))
}
// after
//go:fix inline
func Square(x float64) float64 {
    return math.Pow(float64(x), 2)
}

func main() {
    fmt.Println(math.Pow(float64(5), 2))
}

mapsloop

Replace explicit loops over maps with calls to the maps package (Copy, Insert, Clone, or Collect, depending on the context):

// before
func copyMap(src map[string]int) map[string]int {
    dest := make(map[string]int, len(src))
    for k, v := range src {
        dest[k] = v
    }
    return dest
}
// after
func copyMap(src map[string]int) map[string]int {
    dest := make(map[string]int, len(src))
    maps.Copy(dest, src)
    return dest
}

minmax

Replace if/else statements with calls to min or max:

// before
func calc(a, b int) int {
    var m int
    if a > b {
        m = a
    } else {
        m = b
    }
    return m * (b - a)
}
// after
func calc(a, b int) int {
    var m int
    m = max(a, b)
    return m * (b - a)
}

newexpr

Replace custom "pointer to" functions with new(expr):

// before
type Pet struct {
    Name  string
    Happy *bool
}

func ptrOf[T any](v T) *T {
    return &v
}

func main() {
    p := Pet{Name: "Fluffy", Happy: ptrOf(true)}
    fmt.Println(p)
}
// after
type Pet struct {
    Name  string
    Happy *bool
}

//go:fix inline
func ptrOf[T any](v T) *T {
    return new(v)
}

func main() {
    p := Pet{Name: "Fluffy", Happy: new(true)}
    fmt.Println(p)
}

omitzero

Remove omitempty from struct-type fields because this tag doesn't have any effect on them:

// before
type Person struct {
    Name string `json:"name"`
    Pet  Pet    `json:"pet,omitempty"`
}

type Pet struct {
    Name string
}
// after
type Person struct {
    Name string `json:"name"`
    Pet  Pet    `json:"pet"`
}

type Pet struct {
    Name string
}

plusbuild

Remove obsolete //+build comments:

// before
//go:build linux && amd64
// +build linux,amd64

package main

func main() {
    var _ = 42
}
// after
//go:build linux && amd64

package main

func main() {
    var _ = 42
}

rangeint

Replace 3-clause for loops with for-range over integers:

// before
func main() {
    for i := 0; i < 5; i++ {
        fmt.Print(i)
    }
}
// after
func main() {
    for i := range 5 {
        fmt.Print(i)
    }
}

reflecttypefor

Replace reflect.TypeOf(x) with reflect.TypeFor[T]() when the type is known at compile time:

// before
func main() {
    n := uint64(0)
    typ := reflect.TypeOf(n)
    fmt.Println("size =", typ.Bits())
}
// after
func main() {
    typ := reflect.TypeFor[uint64]()
    fmt.Println("size =", typ.Bits())
}

slicescontains

Replace loops with slices.Contains or slices.ContainsFunc:

// before
func find(s []int, x int) bool {
    for _, v := range s {
        if x == v {
            return true
        }
    }
    return false
}
// after
func find(s []int, x int) bool {
    return slices.Contains(s, x)
}

slicessort

Replace sort.Slice with slices.Sort for basic types:

// before
func main() {
    s := []int{22, 11, 33, 55, 44}
    sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
    fmt.Println(s)
}
// after
func main() {
    s := []int{22, 11, 33, 55, 44}
    slices.Sort(s)
    fmt.Println(s)
}

stditerators

Use iterators instead of Len/At-style APIs for certain types in the standard library:

// before
func main() {
    typ := reflect.TypeFor[Person]()
    for i := range typ.NumField() {
        field := typ.Field(i)
        fmt.Println(field.Name, field.Type.String())
    }
}
// after
func main() {
    typ := reflect.TypeFor[Person]()
    for field := range typ.Fields() {
        fmt.Println(field.Name, field.Type.String())
    }
}

stringsbuilder

Replace repeated += with strings.Builder:

// before
func abbr(s []string) string {
    res := ""
    for _, str := range s {
        if len(str) > 0 {
            res += string(str[0])
        }
    }
    return res
}
// after
func abbr(s []string) string {
    var res strings.Builder
    for _, str := range s {
        if len(str) > 0 {
            res.WriteString(string(str[0]))
        }
    }
    return res.String()
}

stringscut

Replace some uses of strings.Index and string slicing with strings.Cut or strings.Contains:

// before
func nospace(s string) string {
    idx := strings.Index(s, " ")
    if idx == -1 {
        return s
    }
    return strings.ReplaceAll(s, " ", "")
}
// after
func nospace(s string) string {
    found := strings.Contains(s, " ")
    if !found {
        return s
    }
    return strings.ReplaceAll(s, " ", "")
}

stringscutprefix

Replace strings.HasPrefix/TrimPrefix with strings.CutPrefix and strings.HasSuffix/TrimSuffix with strings.CutSuffix:

// before
func unindent(s string) string {
    if strings.HasPrefix(s, "> ") {
        return strings.TrimPrefix(s, "> ")
    }
    return s
}
// after
func unindent(s string) string {
    if after, ok := strings.CutPrefix(s, "> "); ok {
        return after
    }
    return s
}

stringsseq

Replace ranging over strings.Split/Fields with strings.SplitSeq/FieldsSeq:

// before
func main() {
    s := "go is awesome"
    for _, word := range strings.Fields(s) {
        fmt.Println(len(word))
    }
}
// after
func main() {
    s := "go is awesome"
    for word := range strings.FieldsSeq(s) {
        fmt.Println(len(word))
    }
}

testingcontext

Replace context.WithCancel with t.Context in tests:

// before
func Test(t *testing.T) {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()
    if ctx.Err() != nil {
        t.Fatal("context should be active")
    }
}
// after
func Test(t *testing.T) {
    ctx := t.Context()
    if ctx.Err() != nil {
        t.Fatal("context should be active")
    }
}

waitgroup

Replace wg.Add+wg.Done with wg.Go:

// before
func main() {
    var wg sync.WaitGroup

    wg.Add(1)
    go func() {
        defer wg.Done()
        fmt.Println("go!")
    }()

    wg.Wait()
}
// after
func main() {
    var wg sync.WaitGroup

    wg.Go(func() {
        fmt.Println("go!")
    })

    wg.Wait()
}

𝗣 71859 • 𝗔 Alan Donovan

]]>
Detecting goroutine leaks with synctest/pprofhttps://antonz.org/detecting-goroutine-leaks/Thu, 18 Dec 2025 14:30:00 +0000https://antonz.org/detecting-goroutine-leaks/Explore different types of leaks and how to detect them in modern Go versions.Deadlocks, race conditions, and goroutine leaks are probably the three most common problems in concurrent Go programming. Deadlocks usually cause panics, so they're easier to spot. The race detector can help find data races (although it doesn't catch everything and doesn't help with other types of race conditions). As for goroutine leaks, Go's tooling did not address them for a long time.

A leak occurs when one or more goroutines are indefinitely blocked on synchronization primitives like channels, while other goroutines continue running and the program as a whole keeps functioning. We'll look at some examples shortly.

Things started to change in Go 1.24 with the introduction of the synctest package. There will be even bigger changes in Go 1.26, which adds a new experimental goroutineleak profile that reports leaked goroutines. Let's take a look!

A simple leak • Detection: goleak • Detection: synctest • Detection: pprof • Algorithm • Range over channel • Double send • Early return • Take first • Cancel/timeout • Orphans • Final thoughts

A simple leak

Let's say there's a function that runs the given functions concurrently and sends their results to an output channel:

// Gather runs the given functions concurrently
// and collects the results.
func Gather(funcs ...func() int) <-chan int {
    out := make(chan int)
    for _, f := range funcs {
        go func() {
            out <- f()
        }()
    }
    return out
}

And a simple test:

func Test(t *testing.T) {
    out := Gather(
        func() int { return 11 },
        func() int { return 22 },
        func() int { return 33 },
    )

    total := 0
    for range 3 {
        total += <-out
    }

    if total != 66 {
        t.Errorf("got %v, want 66", total)
    }
}
PASS

Send three functions to be executed and collect the results from the output channel. The test passed, so the function works correctly. But does it really?

Let's pass three functions to Gather without collecting the results, and count the goroutines:

func main() {
    Gather(
        func() int { return 11 },
        func() int { return 22 },
        func() int { return 33 },
    )

    time.Sleep(50 * time.Millisecond)
    nGoro := runtime.NumGoroutine() - 1 // minus the main goroutine
    fmt.Println("nGoro =", nGoro)
}
nGoro = 3

After 50 ms — when all the functions should definitely have finished — there are still three running goroutines (runtime.NumGoroutine). In other words, all the goroutines are stuck.

The reason is that the out channel is unbuffered. If the client doesn't read from it, or doesn't read all the results, the goroutines inside Gather get blocked on sending the f() result to out.
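
For reference, the minimal fix is to give out enough buffer so the senders never block, even if the caller reads only some of the results (we'll keep the leaky version below to have something to detect):

// Gather with a buffered channel: sends never block.
func Gather(funcs ...func() int) <-chan int {
    out := make(chan int, len(funcs))
    for _, f := range funcs {
        go func() {
            out <- f()
        }()
    }
    return out
}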

Let's modify the test to catch the leak.

Detecting the leak: goleak

Obviously, we don't want to rely on runtime.NumGoroutine in tests — such a check is too fragile. Let's use the third-party goleak package instead:

// Gather runs the given functions concurrently
// and collects the results.
func Gather(funcs ...func() int) <-chan int {
    out := make(chan int)
    for _, f := range funcs {
        go func() {
            out <- f()
        }()
    }
    return out
}

func Test(t *testing.T) {
    defer goleak.VerifyNone(t)

    Gather(
        func() int { return 11 },
        func() int { return 22 },
        func() int { return 33 },
    )
}

playground ▶

--- FAIL: Test (0.44s)
goleak_test.go:28: found unexpected goroutines:

Goroutine 8 in state chan send, with play.Gather.func1 on top of the stack:
play.Gather.func1()
    /tmp/sandbox4216740326/prog_test.go:16 +0x37
created by play.Gather in goroutine 7
    /tmp/sandbox4216740326/prog_test.go:15 +0x45

Goroutine 9 in state chan send, with play.Gather.func1 on top of the stack:
play.Gather.func1()
    /tmp/sandbox4216740326/prog_test.go:16 +0x37
created by play.Gather in goroutine 7
    /tmp/sandbox4216740326/prog_test.go:15 +0x45

Goroutine 10 in state chan send, with play.Gather.func1 on top of the stack:
play.Gather.func1()
    /tmp/sandbox4216740326/prog_test.go:16 +0x37
created by play.Gather in goroutine 7
    /tmp/sandbox4216740326/prog_test.go:15 +0x45

The test output clearly shows where the leak occurs.

Goleak uses time.Sleep internally, but it does so quite efficiently. It inspects the stack for unexpected goroutines up to 20 times, with the wait time between checks increasing exponentially, starting at 1 microsecond and going up to 100 milliseconds. This way, the test runs almost instantly.

Still, I'd prefer not to use third-party packages and time.Sleep.

Detecting the leak: synctest

Let's check for leaks without any third-party packages by using the synctest package (experimental in Go 1.24, production-ready in Go 1.25+):

// Gather runs the given functions concurrently
// and collects the results.
func Gather(funcs ...func() int) <-chan int {
    out := make(chan int)
    for _, f := range funcs {
        go func() {
            out <- f()
        }()
    }
    return out
}

func Test(t *testing.T) {
    synctest.Test(t, func(t *testing.T) {
        Gather(
            func() int { return 11 },
            func() int { return 22 },
            func() int { return 33 },
        )
        synctest.Wait()
    })
}
--- FAIL: Test (0.00s)
panic: deadlock: main bubble goroutine has exited but blocked goroutines remain [recovered, repanicked]

goroutine 10 [chan send (durable), synctest bubble 1]:
sandbox.Gather.func1()
    /tmp/sandbox/main_test.go:34 +0x37
created by sandbox.Gather in goroutine 9
    /tmp/sandbox/main_test.go:33 +0x45

goroutine 11 [chan send (durable), synctest bubble 1]:
sandbox.Gather.func1()
    /tmp/sandbox/main_test.go:34 +0x37
created by sandbox.Gather in goroutine 9
    /tmp/sandbox/main_test.go:33 +0x45

goroutine 12 [chan send (durable), synctest bubble 1]:
sandbox.Gather.func1()
    /tmp/sandbox/main_test.go:34 +0x37
created by sandbox.Gather in goroutine 9
    /tmp/sandbox/main_test.go:33 +0x45

I'll keep this explanation short since synctest isn't the main focus of this article. If you want to learn more about it, check out the Concurrency testing guide. I highly recommend it — synctest is super useful!

Here's what happens:

  1. The call to synctest.Test starts a testing bubble in a separate goroutine.
  2. The call to Gather starts three goroutines.
  3. The call to synctest.Wait blocks the root bubble goroutine.
  4. One of the goroutines executes f, tries to write to out, and gets blocked (because no one is reading from out).
  5. The same thing happens to the other two goroutines.
  6. synctest.Wait sees that all the child goroutines in the bubble are durably blocked, so it unblocks the root goroutine.
  7. The inner test function finishes.

Next, synctest.Test comes into play. It tries to wait for all child goroutines to finish before it returns. But if it sees that some goroutines are durably blocked (in our case, all three are blocked trying to send to the channel), it panics:

main bubble goroutine has exited but blocked goroutines remain

So, here we found the leak without using time.Sleep or goleak. Pretty useful!

Detecting the leak: pprof

Let's check for leaks using the new profile type goroutineleak (experimental in Go 1.26). We'll use a helper function to run the profiled code and print the results when the profile is ready:

func printLeaks(f func()) {
    prof := pprof.Lookup("goroutineleak")

    defer func() {
        time.Sleep(50 * time.Millisecond)
        var content strings.Builder
        prof.WriteTo(&content, 2)
        // Print only the leaked goroutines.
        goros := strings.Split(content.String(), "\n\n")
        for _, goro := range goros {
            if strings.Contains(goro, "(leaked)") {
                fmt.Println(goro + "\n")
            }
        }
    }()

    f()
}

(If you try this locally, don't forget to set the GOEXPERIMENT=goroutineleakprofile environment variable.)

Call Gather with three functions and observe all three leaks:

func main() {
    printLeaks(func() {
        Gather(
            func() int { return 11 },
            func() int { return 22 },
            func() int { return 33 },
        )
    })
}
goroutine 5 [chan send (leaked)]:
main.Gather.func1()
    /tmp/sandbox/main.go:35 +0x37
created by main.Gather in goroutine 1
    /tmp/sandbox/main.go:34 +0x45

goroutine 6 [chan send (leaked)]:
main.Gather.func1()
    /tmp/sandbox/main.go:35 +0x37
created by main.Gather in goroutine 1
    /tmp/sandbox/main.go:34 +0x45

goroutine 7 [chan send (leaked)]:
main.Gather.func1()
    /tmp/sandbox/main.go:35 +0x37
created by main.Gather in goroutine 1
    /tmp/sandbox/main.go:34 +0x45

We have a nice goroutine stack trace that shows exactly where the leak happens. Unfortunately, we had to use time.Sleep again, so this probably isn't the best way to test — unless we combine it with synctest to use the fake clock.

On the other hand, we can collect a profile from a running program, which makes it really useful for finding leaks in production systems (unlike synctest). Pretty neat.
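
For example, here's a minimal sketch of exposing the profile in a long-running service. I'm assuming the usual net/http/pprof convention where named profiles are served under /debug/pprof/<name>, and the binary is built with GOEXPERIMENT=goroutineleakprofile:

package main

import (
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/ handlers
)

func main() {
    // The leak profile should then be available at:
    // http://localhost:6060/debug/pprof/goroutineleak?debug=2
    http.ListenAndServe("localhost:6060", nil)
}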

Leak detection algorithm

This goroutineleak profile uses the garbage collector's marking phase to find goroutines that are permanently blocked (leaked). The approach is explained in detail in the proposal and the paper by Saioc et al. — check it out if you're interested.

Here's the gist of it:

   [ Start: GC mark phase ]
             │ 1. Collect live goroutines
             v
   ┌───────────────────────┐
   │   Initial roots       │ <────────────────┐
   │ (runnable goroutines) │                  │
   └───────────────────────┘                  │
             │                                │
             │ 2. Mark reachable memory       │
             v                                │
   ┌───────────────────────┐                  │
   │   Reachable objects   │                  │
   │  (channels, mutexes)  │                  │
   └───────────────────────┘                  │
             │                                │
             │ 3a. Check blocked goroutines   │
             v                                │
   ┌───────────────────────┐          (Yes)   │
   │ Is blocked G waiting  │ ─────────────────┘
   │ on a reachable obj?   │ 3b. Add G to roots
   └───────────────────────┘
             │ (No - repeat until no new Gs found)
             v
   ┌───────────────────────┐
   │   Remaining blocked   │
   │      goroutines       │
   └───────────────────────┘
             │ 5. Report the leaks
             v
      [   LEAKED!   ]
 (Blocked on unreachable
  synchronization objects)
  1. Collect live goroutines. Start with currently active (runnable or running) goroutines as roots. Ignore blocked goroutines for now.
  2. Mark reachable memory. Trace pointers from roots to find which synchronization objects (like channels or wait groups) are currently reachable by these roots.
  3. Resurrect blocked goroutines. Check all currently blocked goroutines. If a blocked goroutine is waiting for a synchronization resource that was just marked as reachable — add that goroutine to the roots.
  4. Iterate. Repeat steps 2 and 3 until there are no more new goroutines blocked on reachable objects.
  5. Report the leaks. Any goroutines left in the blocked state are waiting for resources that no active part of the program can access. They're considered leaked.
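
Here's a self-contained toy model of that loop. It's my simplification and looks nothing like the actual runtime code, but it shows the resurrection idea:

package main

import "fmt"

// A goroutine is either runnable or blocked on a sync object.
type goro struct {
    id        int
    runnable  bool
    waitingOn *syncObj   // non-nil if blocked
    refs      []*syncObj // sync objects reachable from this goroutine
}

type syncObj struct{ name string }

// leaked reports goroutines blocked on objects no live goroutine can reach.
func leaked(all []*goro) []*goro {
    // Step 1: start with the active goroutines as roots.
    live := map[*goro]bool{}
    for _, g := range all {
        if g.runnable {
            live[g] = true
        }
    }
    for changed := true; changed; {
        changed = false
        // Step 2: mark sync objects reachable from live goroutines.
        reachable := map[*syncObj]bool{}
        for g := range live {
            for _, o := range g.refs {
                reachable[o] = true
            }
        }
        // Steps 3-4: resurrect blocked goroutines waiting on reachable objects.
        for _, g := range all {
            if !live[g] && g.waitingOn != nil && reachable[g.waitingOn] {
                live[g] = true
                changed = true
            }
        }
    }
    // Step 5: whatever is still blocked and not live has leaked.
    var out []*goro
    for _, g := range all {
        if !live[g] {
            out = append(out, g)
        }
    }
    return out
}

func main() {
    ch := &syncObj{name: "ch"}
    root := &goro{id: 1, runnable: true}                        // no longer references ch
    worker := &goro{id: 2, waitingOn: ch, refs: []*syncObj{ch}} // blocked sending to ch
    for _, g := range leaked([]*goro{root, worker}) {
        fmt.Println("leaked goroutine", g.id)
    }
}
leaked goroutine 2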

In the rest of the article, we'll review the different types of leaks often observed in production and see whether synctest and goroutineleak are able to detect each of them (spoiler: they are).

Based on the code examples from the common-goroutine-leak-patterns repository by Georgian-Vlad Saioc, licensed under the Apache-2.0 license.

Range over channel

One or more goroutines receive from a channel using range, but the sender never closes the channel, so all the receivers eventually leak:

func RangeOverChan(list []any, workers int) {
    ch := make(chan any)

    // Launch workers.
    for range workers {
        go func() {
            // Each worker processes items one by one.
            // The channel is never closed, so every worker leaks
            // once there are no more items left to process.
            for item := range ch {
                _ = item
            }
        }()
    }

    // Send items for processing.
    for _, item := range list {
        ch <- item
    }

    // close(ch) // (X) uncomment to fix
}

Using synctest:

func Test(t *testing.T) {
    synctest.Test(t, func(t *testing.T) {
        RangeOverChan([]any{11, 22, 33, 44}, 2)
        synctest.Wait()
    })
}
panic: deadlock: main bubble goroutine has exited but blocked goroutines remain

goroutine 10 [chan receive (durable), synctest bubble 1]:
sandbox.RangeOverChan.func1()
    /tmp/sandbox/main_test.go:36 +0x34
created by sandbox.RangeOverChan in goroutine 9
    /tmp/sandbox/main_test.go:34 +0x45

goroutine 11 [chan receive (durable), synctest bubble 1]:
sandbox.RangeOverChan.func1()
    /tmp/sandbox/main_test.go:36 +0x34
created by sandbox.RangeOverChan in goroutine 9
    /tmp/sandbox/main_test.go:34 +0x45

Using goroutineleak:

func main() {
    printLeaks(func() {
        RangeOverChan([]any{11, 22, 33, 44}, 2)
    })
}
goroutine 19 [chan receive (leaked)]:
main.RangeOverChan.func1()
    /tmp/sandbox/main.go:36 +0x34
created by main.RangeOverChan in goroutine 1
    /tmp/sandbox/main.go:34 +0x45

goroutine 20 [chan receive (leaked)]:
main.RangeOverChan.func1()
    /tmp/sandbox/main.go:36 +0x34
created by main.RangeOverChan in goroutine 1
    /tmp/sandbox/main.go:34 +0x45

Notice how synctest and goroutineleak give almost the same stack traces, clearly showing the root cause of the problem. You'll see this in the next examples as well.

Fix: The sender should close the channel after it finishes sending.

Try uncommenting the ⓧ line and see if both checks pass.

Double send

The sender accidentally sends more values to a channel than intended, and leaks:

func DoubleSend() <-chan any {
    ch := make(chan any)

    go func() {
        res, err := work(13)
        if err != nil {
            // In case of an error, send nil.
            ch <- nil
            // return // (X) uncomment to fix
        }
        // Otherwise, continue with normal behaviour.
        // This leaks if err != nil.
        ch <- res
    }()

    return ch
}

Using synctest:

func Test(t *testing.T) {
    synctest.Test(t, func(t *testing.T) {
        <-DoubleSend()
        synctest.Wait()
    })
}
panic: deadlock: main bubble goroutine has exited but blocked goroutines remain

goroutine 22 [chan send (durable), synctest bubble 1]:
sandbox.DoubleSend.func1()
    /tmp/sandbox/main_test.go:42 +0x4c
created by sandbox.DoubleSend in goroutine 21
    /tmp/sandbox/main_test.go:32 +0x5f

Using goroutineleak:

func main() {
    printLeaks(func() {
        <-DoubleSend()
    })
}
goroutine 19 [chan send (leaked)]:
main.DoubleSend.func1()
    /tmp/sandbox/main.go:42 +0x4c
created by main.DoubleSend in goroutine 1
    /tmp/sandbox/main.go:32 +0x67

Fix: Make sure that each possible path in the code sends to the channel no more times than the receiver is ready for. Alternatively, make the channel's buffer large enough to handle all possible sends.

Try uncommenting the ⓧ line and see if both checks pass.

Early return

The parent goroutine exits without receiving a value from the child goroutine, so the child leaks:

func EarlyReturn() {
    ch := make(chan any) // (X) should be buffered

    go func() {
        res, _ := work(42)
        // Leaks if the parent goroutine terminates early.
        ch <- res
    }()

    _, err := work(13)
    if err != nil {
        // Early return in case of error.
        // The child goroutine leaks.
        return
    }

    // Only receive if there is no error.
    <-ch
}

Using synctest:

func Test(t *testing.T) {
    synctest.Test(t, func(t *testing.T) {
        EarlyReturn()
        synctest.Wait()
    })
}
panic: deadlock: main bubble goroutine has exited but blocked goroutines remain

goroutine 22 [chan send (durable), synctest bubble 1]:
sandbox.EarlyReturn.func1()
    /tmp/sandbox/main_test.go:35 +0x45
created by sandbox.EarlyReturn in goroutine 21
    /tmp/sandbox/main_test.go:32 +0x5f

Using goroutineleak:

func main() {
    printLeaks(func() {
        EarlyReturn()
    })
}
goroutine 7 [chan send (leaked)]:
main.EarlyReturn.func1()
    /tmp/sandbox/main.go:35 +0x45
created by main.EarlyReturn in goroutine 1
    /tmp/sandbox/main.go:32 +0x67

Fix: Make the channel buffered so the child goroutine doesn't get blocked when sending.

Try making the channel buffered at line ⓧ and see if both checks pass.

Cancel/timeout

Similar to "early return". If the parent is canceled before receiving a value from the child goroutine, the child leaks:

func Canceled(ctx context.Context) {
    ch := make(chan any) // (X) should be buffered

    go func() {
        res, _ := work(100)
        // Leaks if the parent goroutine gets canceled.
        ch <- res
    }()

    // Wait for the result or for cancellation.
    select {
    case <-ctx.Done():
        // The child goroutine leaks.
        return
    case res := <-ch:
        // Process the result.
        _ = res
    }
}

Using synctest:

func Test(t *testing.T) {
    synctest.Test(t, func(t *testing.T) {
        ctx, cancel := context.WithCancel(t.Context())
        cancel()
        Canceled(ctx)

        time.Sleep(time.Second)
        synctest.Wait()
    })
}
panic: deadlock: main bubble goroutine has exited but blocked goroutines remain

goroutine 22 [chan send (durable), synctest bubble 1]:
sandbox.Canceled.func1()
    /tmp/sandbox/main_test.go:35 +0x45
created by sandbox.Canceled in goroutine 21
    /tmp/sandbox/main_test.go:32 +0x76

Using goroutineleak:

func main() {
    printLeaks(func() {
        ctx, cancel := context.WithCancel(context.Background())
        cancel()
        Canceled(ctx)
    })
}
goroutine 19 [chan send (leaked)]:
main.Canceled.func1()
    /tmp/sandbox/main.go:35 +0x45
created by main.Canceled in goroutine 1
    /tmp/sandbox/main.go:32 +0x7b

Fix: Make the channel buffered so the child goroutine doesn't get blocked when sending.

Try making the channel buffered at line ⓧ and see if both checks pass.

Take first

The parent launches N child goroutines, but is only interested in the first result. The remaining N-1 children leak:

func TakeFirst(items []any) {
    ch := make(chan any)

    // Iterate over every item.
    for _, item := range items {
        go func() {
            ch <- process(item)
        }()
    }

    // Retrieve the first result. All other children leak.
    // Also, the parent leaks if len(items) == 0.
    <-ch
}

Using synctest (zero items, the parent leaks):

func Test(t *testing.T) {
    synctest.Test(t, func(t *testing.T) {
        go TakeFirst(nil)
        synctest.Wait()
    })
}
panic: deadlock: main bubble goroutine has exited but blocked goroutines remain

goroutine 22 [chan receive (durable), synctest bubble 1]:
sandbox.TakeFirst({0x0, 0x0, 0x0?})
    /tmp/sandbox/main_test.go:40 +0xdd
created by sandbox.Test.func1 in goroutine 21
    /tmp/sandbox/main_test.go:44 +0x1a

Using synctest (multiple items, children leak):

func Test(t *testing.T) {
    synctest.Test(t, func(t *testing.T) {
        go TakeFirst([]any{11, 22, 33})
        synctest.Wait()
    })
}
panic: deadlock: main bubble goroutine has exited but blocked goroutines remain

goroutine 10 [chan send (durable), synctest bubble 1]:
sandbox.TakeFirst.func1()
    /tmp/sandbox/main_test.go:35 +0x2e
created by sandbox.TakeFirst in goroutine 9
    /tmp/sandbox/main_test.go:34 +0x51

goroutine 11 [chan send (durable), synctest bubble 1]:
sandbox.TakeFirst.func1()
    /tmp/sandbox/main_test.go:35 +0x2e
created by sandbox.TakeFirst in goroutine 9
    /tmp/sandbox/main_test.go:34 +0x51

Using goroutineleak (zero items, the parent leaks):

func main() {
    printLeaks(func() {
        go TakeFirst(nil)
    })
}
goroutine 19 [chan receive (leaked)]:
main.TakeFirst({0x0, 0x0, 0x0?})
    /tmp/sandbox/main.go:40 +0xeb
created by main.main.func1 in goroutine 1
    /tmp/sandbox/main.go:44 +0x1a

Using goroutineleak (multiple items, children leak):

func main() {
    printLeaks(func() {
        go TakeFirst([]any{11, 22, 33})
    })
}
goroutine 20 [chan send (leaked)]:
main.TakeFirst.func1()
    /tmp/sandbox/main.go:35 +0x2e
created by main.TakeFirst in goroutine 19
    /tmp/sandbox/main.go:34 +0x51

goroutine 21 [chan send (leaked)]:
main.TakeFirst.func1()
    /tmp/sandbox/main.go:35 +0x2e
created by main.TakeFirst in goroutine 19
    /tmp/sandbox/main.go:34 +0x51

Fix: Make the channel's buffer large enough to hold values from all child goroutines. Also, return early if the source collection is empty.

Try changing the TakeFirst implementation as follows and see if both checks pass:

func TakeFirst(items []any) {
    if len(items) == 0 {
        // Return early if the source collection is empty.
        return
    }
    // Make the channel's buffer large enough.
    ch := make(chan any, len(items))

    // Iterate over every item
    for _, item := range items {
        go func() {
            ch <- process(item)
        }()
    }

    // Retrieve first result.
    <-ch
}

Orphans

Inner goroutines leak because the client doesn't follow the contract described in the type's interface and documentation.

Let's say we have a Worker type with the following contract:

// A worker processes a queue of items one by one in the background.
// A started worker must eventually be stopped.
// Failing to stop a worker results in a goroutine leak.
type Worker struct {
    // ...
}

// NewWorker creates a new worker.
func NewWorker() *Worker

// Start starts the processing.
func (w *Worker) Start()

// Stop stops the processing.
func (w *Worker) Stop()

// Push adds an item to the processing queue.
func (w *Worker) Push(item any)

The implementation isn't particularly important — what really matters is the public contract.
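
For completeness, here's one possible minimal implementation (a sketch of mine, not the article's actual code; it's enough to reproduce the leak):

type Worker struct {
    queue chan any
    done  chan struct{}
}

func NewWorker() *Worker {
    return &Worker{queue: make(chan any), done: make(chan struct{})}
}

func (w *Worker) Start() {
    go w.run()
}

func (w *Worker) run() {
    for {
        select {
        case item := <-w.queue:
            _ = item // process the item
        case <-w.done:
            return
        }
    }
}

func (w *Worker) Stop() {
    close(w.done)
}

func (w *Worker) Push(item any) {
    w.queue <- item
}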

Let's say the client breaks the contract and doesn't stop the worker:

func Orphans() {
    w := NewWorker()
    w.Start()
    // defer w.Stop() // (X) uncomment to fix

    items := make([]any, 10)
    for _, item := range items {
        w.Push(item)
    }
}

Then the worker goroutines will leak, just like the documentation says.

Using synctest:

func Test(t *testing.T) {
    synctest.Test(t, func(t *testing.T) {
        Orphans()
        synctest.Wait()
    })
}
panic: deadlock: main bubble goroutine has exited but blocked goroutines remain

goroutine 10 [select (durable), synctest bubble 1]:
sandbox.(*Worker).run(0xc00009c190)
    /tmp/sandbox/main_test.go:113 +0xcc
created by sandbox.(*Worker).Start.func1 in goroutine 9
    /tmp/sandbox/main_test.go:89 +0xb6

goroutine 11 [select (durable), synctest bubble 1]:
sandbox.(*Worker).run(0xc00009c190)
    /tmp/sandbox/main_test.go:113 +0xcc
created by sandbox.(*Worker).Start.func1 in goroutine 9
    /tmp/sandbox/main_test.go:90 +0xf6

Using goroutineleak:

func main() {
    printLeaks(func() {
        Orphans()
    })
}
goroutine 19 [select (leaked)]:
main.(*Worker).run(0x147fe4630000)
    /tmp/sandbox/main.go:112 +0xce
created by main.(*Worker).Start.func1 in goroutine 1
    /tmp/sandbox/main.go:88 +0xba

goroutine 20 [select (leaked)]:
main.(*Worker).run(0x147fe4630000)
    /tmp/sandbox/main.go:112 +0xce
created by main.(*Worker).Start.func1 in goroutine 1
    /tmp/sandbox/main.go:89 +0x105

Fix: Follow the contract and stop the worker to make sure all goroutines are stopped.

Try uncommenting the ⓧ line and see if both checks pass.

Final thoughts

Thanks to improvements in Go 1.24-1.26, it's now much easier to catch goroutine leaks, both during testing and in production.

The synctest package is available in 1.24 (experimental) and 1.25+ (production-ready). If you're interested, I have a detailed interactive guide on it.

The goroutineleak profile will be available in 1.26 (experimental). According to the authors, the implementation is already production-ready. It's only marked as experimental so they can get feedback on the API, especially about making it a new profile.

See the proposal and the commits for more details on goroutineleak:

𝗣 74609, 75280 • 𝗖𝗟 688335 • 𝗔 Vlad Saioc

P.S. If you are into concurrency, check out my interactive book.

]]>
Timing 'Hello, world'https://antonz.org/timing-hello-world/Mon, 15 Dec 2025 10:00:00 +0000https://antonz.org/timing-hello-world/Compiling and running 'Hello, World!' in 20 programming languages.Here's a little unscientific chart showing the compile+run times of a "hello world" program in different languages:

Hello world timings

Bash        <0.4s  ■
C           <0.4s  ■
JavaScript  <0.4s  ■
Lua         <0.4s  ■
PHP         <0.4s  ■
Python      <0.4s  ■
Ruby        <0.4s  ■
Rust         0.5s  ■■
V            0.5s  ■■
R            0.5s  ■■
Swift        0.6s  ■■■
Go           0.6s  ■■■
Haskell      0.8s  ■■■■■
C++          0.9s  ■■■■■■
Zig          1.0s  ■■■■■■■
Elixir       1.2s  ■■■■■■■■■
C#           1.3s  ■■■■■■■■■■
Java         1.7s  ■■■■■■■■■■■■■■
Odin         1.7s  ■■■■■■■■■■■■■■
Dart         1.9s  ■■■■■■■■■■■■■■■■
Kotlin       8.4s  ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■...■■■■■■■■■■■■■■■■■■■■■■■■■■■■■

I had to shorten the Kotlin bar a bit to make it fit within 80 characters.

All measurements were done in single-core, containerized sandboxes on an ancient CPU, and the timings include the overhead of docker run. So the exact times aren't very interesting, especially for the top group (Bash to Ruby) — they all took about the same amount of time. But the difference in speed between different compilers is real: for example, Rust is actually faster than C++, C# is faster than Java, and everyone is faster than Kotlin by a large margin.

Here is the program source code in C:

#include <stdio.h>

void greet(const char* name) {
    printf("Hello, %s!\n", name);
}

int main() {
    greet("World");
}
Hello, World!

Other languages: Bash · C# · C++ · Dart · Elixir · Go · Haskell · Java · JavaScript · Kotlin · Lua · Odin · PHP · Python · R · Ruby · Rust · Swift · V · Zig

Of course, this ranking will be different for real-world projects with lots of code and dependencies. Still, it makes a lot of sense for my use case (I run small snippets of untrusted code in one-off dockerized sandboxes), so I decided to share it.

]]>
'Gist of Go: Concurrency' is out!https://antonz.org/go-concurrency-released/Fri, 12 Dec 2025 10:30:00 +0000https://antonz.org/go-concurrency-released/Interactive book on concurrent programming with auto-tested exercises.

My book on concurrent programming in Go is finally finished. It walks you through goroutines, channels, select, pipelines, synchronization, race prevention, time handling, signaling, atomicity, testing, and concurrency internals.

The book follows my usual style: clear explanations with interactive examples, plus auto-tested exercises so you can practice as you go. I genuinely think it's the best practical guide for everyone learning concurrency from scratch or looking to go beyond the basics.

Book cover

There's a dedicated page with all the book details — check it out!

]]>
Go feature: Secret modehttps://antonz.org/accepted/runtime-secret/Tue, 09 Dec 2025 10:00:00 +0000https://antonz.org/accepted/runtime-secret/Automatically erase memory to prevent secret leaks.Part of the Accepted! series: Go proposals and features explained in simple terms.

Automatically erase used memory to prevent secret leaks.

Ver. 1.26 • Stdlib • Low impact

Summary

The new runtime/secret package lets you run a function in secret mode. After the function finishes, it immediately erases (zeroes out) the registers and stack it used. Heap allocations made by the function are erased as soon as the garbage collector decides they are no longer reachable.

secret.Do(func() {
    // Generate an ephemeral key and
    // use it to negotiate the session.
})

This helps make sure sensitive information doesn't stay in memory longer than needed, lowering the risk of attackers getting to it.

The package is experimental and is mainly for developers of cryptographic libraries, not for application developers.

Motivation

Cryptographic protocols like WireGuard or TLS have a property called "forward secrecy". This means that even if an attacker gains access to long-term secrets (like a private key in TLS), they shouldn't be able to decrypt past communication sessions. To make this work, ephemeral keys (temporary keys used to negotiate the session) need to be erased from memory immediately after the handshake. If there's no reliable way to clear this memory, these keys could stay there indefinitely. An attacker who finds them later could re-derive the session key and decrypt past traffic, breaking forward secrecy.

In Go, the runtime manages memory, and it doesn't guarantee when or how memory is cleared. Sensitive data might remain in heap allocations or stack frames, potentially exposed in core dumps or through memory attacks. Developers often have to use unreliable "hacks" with reflection to try to zero out internal buffers in cryptographic libraries. Even so, some data might still stay in memory where the developer can't reach or control it.

The solution is to provide a runtime mechanism that automatically erases all temporary storage used during sensitive operations. This will make it easier for library developers to write secure code without using workarounds.

Description

Add the runtime/secret package with Do and Enabled functions:

// Do invokes f.
//
// Do ensures that any temporary storage used by f is erased in a
// timely manner. (In this context, "f" is shorthand for the
// entire call tree initiated by f.)
//   - Any registers used by f are erased before Do returns.
//   - Any stack used by f is erased before Do returns.
//   - Any heap allocation done by f is erased as soon as the garbage
//     collector realizes that it is no longer reachable.
//   - Do works even if f panics or calls runtime.Goexit. As part of
//     that, any panic raised by f will appear as if it originates from
//     Do itself.
func Do(f func())
// Enabled reports whether Do appears anywhere on the call stack.
func Enabled() bool
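
For example, here's a hypothetical sketch of how a library might use Enabled to refuse to handle key material outside of secret mode (deriveKey and its body are made up for illustration):

func deriveKey(material []byte) []byte {
    if !secret.Enabled() {
        panic("deriveKey must be called from within secret.Do")
    }
    key := make([]byte, 32)
    copy(key, material) // placeholder for the real key derivation
    return key
}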

The current implementation has several limitations:

  • Only supported on linux/amd64 and linux/arm64. On unsupported platforms, Do invokes f directly.
  • Protection does not cover any global variables that f writes to.
  • Trying to start a goroutine within f causes a panic.
  • If f calls runtime.Goexit, erasure is delayed until all deferred functions are executed.
  • Heap allocations are only erased if ➊ the program drops all references to them, and ➋ then the garbage collector notices that those references are gone. The program controls the first part, but the second part depends on when the runtime decides to act.
  • If f panics, the panicked value might reference memory allocated inside f. That memory won't be erased until (at least) the panicked value is no longer reachable.
  • Pointer addresses might leak into data buffers that the runtime uses for garbage collection. Do not put confidential information into pointers.

Confidential information in pointers

Imagine you're working on a cryptographic algorithm that uses a lookup table. You have a public table array and a secret byte b taken from your private key. You might be tempted to create a pointer to the result:

// BAD: The address stored in p depends on the secret b.
p := &table[b]

In this case, the pointer p stores the memory address AddressOf(table) + b. Since the garbage collector tracks all active pointers, it might save this address in its internal buffers. If an attacker can access the GC's memory, they could see the address in p, subtract the base address of table, and figure out your secret byte b.
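
A safer pattern (a sketch, assuming the same table and b) is to copy the value out instead of keeping a pointer whose address encodes the secret:

// OK: v holds the entry by value, so no pointer address depends on b.
// The copy lives in registers and stack memory that secret.Do erases on return.
v := table[b]
_ = v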

The package is mainly for developers who work on cryptographic libraries. Most apps should use higher-level libraries that use secret.Do behind the scenes.

As of Go 1.26, the runtime/secret package is experimental and can be enabled by setting GOEXPERIMENT=runtimesecret at build time.

Example

Generate a session key while keeping the ephemeral private key and shared secret safe using secret.Do:

// DeriveSessionKey does an ephemeral key exchange to create a session key.
func DeriveSessionKey(peerPublicKey *ecdh.PublicKey) (*ecdh.PublicKey, []byte, error) {
    var pubKey *ecdh.PublicKey
    var sessionKey []byte
    var err error

    // Use secret.Do to contain the sensitive data during the handshake.
    // The ephemeral private key and the raw shared secret will be
    // wiped out when this function finishes.
    secret.Do(func() {
        // 1. Generate an ephemeral private key.
        // This is highly sensitive; if leaked later, forward secrecy is broken.
        privKey, e := ecdh.P256().GenerateKey(rand.Reader)
        if e != nil {
            err = e
            return
        }

        // 2. Compute the shared secret (ECDH).
        // This raw secret is also highly sensitive.
        sharedSecret, e := privKey.ECDH(peerPublicKey)
        if e != nil {
            err = e
            return
        }

        // 3. Derive the final session key (e.g., using HKDF).
        // We copy the result out; the inputs (privKey, sharedSecret)
        // will be destroyed by secret.Do when they become unreachable.
        sessionKey = performHKDF(sharedSecret)
        pubKey = privKey.PublicKey()
    })

    // The session key is returned for use, but the "recipe" to recreate it
    // is destroyed. Additionally, because the session key was allocated
    // inside the secret block, the runtime will automatically zero it out
    // when the application is finished using it.
    return pubKey, sessionKey, err
}

Here, the ephemeral private key and the raw shared secret are effectively "toxic waste" — they are necessary to create the final session key, but dangerous to keep around.

If these values stay in the heap and an attacker later gets access to the application's memory (for example, via a core dump or a vulnerability like Heartbleed), they could use these intermediates to re-derive the session key and decrypt past conversations.

By wrapping the calculation in secret.Do, we make sure that as soon as the session key is created, the "ingredients" used to make it are permanently destroyed. This means that even if the server is compromised in the future, this specific past session can't be exposed, which ensures forward secrecy.

𝗣 21865 • 𝗖𝗟 704615 • 𝗔 Daniel Morsing

]]>
Gist of Go: Concurrency internalshttps://antonz.org/go-concurrency/internals/Fri, 05 Dec 2025 10:00:00 +0000https://antonz.org/go-concurrency/internals/CPU cores, threads, goroutines, and the scheduler.This is a chapter from my book on Go concurrency, which teaches the topic from the ground up through interactive examples.

Here's where we started this book:

Functions that run with go are called goroutines. The Go runtime juggles these goroutines and distributes them among operating system threads running on CPU cores. Compared to OS threads, goroutines are lightweight, so you can create hundreds or thousands of them.

That's generally correct, but it's a little too brief. In this chapter, we'll take a closer look at how goroutines work. We'll still use a simplified model, but it should help you understand how everything fits together.

Concurrency • Goroutine scheduler • GOMAXPROCS • Concurrency primitives • Scheduler metrics • Profiling • Tracing • Keep it up

Concurrency

At the hardware level, CPU cores are responsible for running parallel tasks. If a processor has 4 cores, it can run 4 instructions at the same time — one on each core.

  instr A     instr B     instr C     instr D
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ Core 1  │ │ Core 2  │ │ Core 3  │ │ Core 4  │ CPU
└─────────┘ └─────────┘ └─────────┘ └─────────┘

At the operating system level, a thread is the basic unit of execution. There are usually many more threads than CPU cores, so the operating system's scheduler decides which threads to run and which ones to pause. The scheduler keeps switching between threads to make sure each one gets a turn to run on a CPU, instead of waiting in line forever. This is how the operating system handles concurrency.

┌──────────┐              ┌──────────┐
│ Thread E │              │ Thread F │              OS
└──────────┘              └──────────┘
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Thread A │ │ Thread B │ │ Thread C │ │ Thread D │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
     │           │           │           │
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Core 1   │ │ Core 2   │ │ Core 3   │ │ Core 4   │ CPU
└──────────┘ └──────────┘ └──────────┘ └──────────┘

At the Go runtime level, a goroutine is the basic unit of execution. The runtime scheduler runs a fixed number of OS threads, often one per CPU core. There can be many more goroutines than threads, so the scheduler decides which goroutines to run on the available threads and which ones to pause. The scheduler keeps switching between goroutines to make sure each one gets a turn to run on a thread, instead of waiting in line forever. This is how Go handles concurrency.

┌─────┐┌─────┐┌─────┐┌─────┐┌─────┐┌─────┐
│ G15 ││ G16 ││ G17 ││ G18 ││ G19 ││ G20 │
└─────┘└─────┘└─────┘└─────┘└─────┘└─────┘
┌─────┐      ┌─────┐      ┌─────┐      ┌─────┐
│ G11 │      │ G12 │      │ G13 │      │ G14 │      Go runtime
└─────┘      └─────┘      └─────┘      └─────┘
  │            │            │            │
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Thread A │ │ Thread B │ │ Thread C │ │ Thread D │ OS
└──────────┘ └──────────┘ └──────────┘ └──────────┘

The Go runtime scheduler doesn't decide which threads run on the CPU — that's the operating system scheduler's job. The Go runtime makes sure all goroutines run on the threads it manages, but the OS controls how and when those threads actually get CPU time.

Goroutine scheduler

The scheduler's job is to run M goroutines on N operating system threads, where M can be much larger than N. Here's a simple way to do it:

  1. Put all goroutines in a queue.
  2. Take N goroutines from the queue and run them.
  3. If a running goroutine gets blocked (for example, waiting to read from a channel or waiting on a mutex), put it back in the queue and run the next goroutine from the queue.

Take goroutines G11-G14 and run them:

┌─────┐┌─────┐┌─────┐┌─────┐┌─────┐┌─────┐
│ G15 ││ G16 ││ G17 ││ G18 ││ G19 ││ G20 │          queue
└─────┘└─────┘└─────┘└─────┘└─────┘└─────┘
┌─────┐      ┌─────┐      ┌─────┐      ┌─────┐
│ G11 │      │ G12 │      │ G13 │      │ G14 │      running
└─────┘      └─────┘      └─────┘      └─────┘
  │            │            │            │
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Thread A │ │ Thread B │ │ Thread C │ │ Thread D │
└──────────┘ └──────────┘ └──────────┘ └──────────┘

Goroutine G12 got blocked while reading from the channel. Put it back in the queue and replace it with G15:

┌─────┐┌─────┐┌─────┐┌─────┐┌─────┐┌─────┐
│ G16 ││ G17 ││ G18 ││ G19 ││ G20 ││ G12 │          queue
└─────┘└─────┘└─────┘└─────┘└─────┘└─────┘
┌─────┐      ┌─────┐      ┌─────┐      ┌─────┐
│ G11 │      │ G15 │      │ G13 │      │ G14 │      running
└─────┘      └─────┘      └─────┘      └─────┘
  │            │            │            │
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Thread A │ │ Thread B │ │ Thread C │ │ Thread D │
└──────────┘ └──────────┘ └──────────┘ └──────────┘

But there are a few things to keep in mind.

Starvation

Let's say goroutines G11–G14 are running smoothly without getting blocked by mutexes or channels. Does that mean goroutines G15–G20 won't run at all and will just have to wait (starve) until one of G11–G14 finally finishes? That would be unfortunate.

That's why the scheduler checks each running goroutine roughly every 10 ms to decide if it's time to pause it and put it back in the queue. This approach is called preemptive scheduling: the scheduler can interrupt running goroutines when needed so others have a chance to run too.
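
Here's a minimal sketch of preemption in action (it assumes Go 1.14 or later, where the runtime can preempt even tight loops). With GOMAXPROCS set to 1, the busy-loop goroutine would monopolize the only thread under purely cooperative scheduling, yet main still gets its turn:

package main

import (
    "fmt"
    "runtime"
    "time"
)

func main() {
    // Use a single thread for running goroutines.
    runtime.GOMAXPROCS(1)

    // A CPU-bound goroutine with no function calls,
    // channel operations, or other natural yield points.
    go func() {
        for {
        }
    }()

    // Park main for a bit so the busy goroutine takes over the thread.
    time.Sleep(100 * time.Millisecond)

    // This still prints: the scheduler preempts the busy
    // goroutine and gives main a chance to run.
    fmt.Println("main is alive")
}
main is alive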

System calls

The scheduler can manage a goroutine while it's running Go code. But what happens if a goroutine makes a system call, like reading from disk? In that case, the scheduler can't take the goroutine off the thread, and there's no way to know how long the system call will take. For example, if goroutines G11–G14 in our example spend a long time in system calls, all worker threads will be blocked, and the program will basically "freeze".

To solve this problem, the scheduler starts new threads if the existing ones get blocked in a system call. For example, here's what happens if G11 and G12 make system calls:

┌─────┐┌─────┐┌─────┐┌─────┐
│ G17 ││ G18 ││ G19 ││ G20 │                        queue
└─────┘└─────┘└─────┘└─────┘

┌─────┐      ┌─────┐      ┌─────┐      ┌─────┐
│ G15 │      │ G16 │      │ G13 │      │ G14 │      running
└─────┘      └─────┘      └─────┘      └─────┘
  │            │            │            │
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Thread E │ │ Thread F │ │ Thread C │ │ Thread D │
└──────────┘ └──────────┘ └──────────┘ └──────────┘

┌─────┐      ┌─────┐
│ G11 │      │ G12 │                                syscalls
└─────┘      └─────┘
  │            │
┌──────────┐ ┌──────────┐
│ Thread A │ │ Thread B │
└──────────┘ └──────────┘

Here, the scheduler started two new threads, E and F, and assigned goroutines G15 and G16 from the queue to these threads.

When G11 and G12 finish their system calls, the scheduler will stop or terminate the extra threads (E and F) and keep running the goroutines on four threads: A-B-C-D.

This is a simplified model of how the goroutine scheduler works in Go. If you want to learn more, I recommend watching the talk by Dmitry Vyukov, one of the scheduler's developers: Go scheduler: Implementing language with lightweight concurrency (video, slides)

GOMAXPROCS

We said that the scheduler uses N threads to run goroutines. In the Go runtime, the value of N is set by a parameter called GOMAXPROCS.

The GOMAXPROCS runtime setting controls the maximum number of operating system threads the Go scheduler can use to execute goroutines concurrently (not counting the goroutines running syscalls). It defaults to the value of runtime.NumCPU, which is the number of logical CPUs on the machine.

Strictly speaking, runtime.NumCPU returns either the total number of logical CPUs or the number allowed by the CPU affinity mask, whichever is lower. The default GOMAXPROCS value can also be lowered by the CPU quota, as explained below.

For example, on my 8-core laptop, the default value of GOMAXPROCS is also 8:

maxProcs := runtime.GOMAXPROCS(0) // returns the current value
fmt.Println("NumCPU:", runtime.NumCPU())
fmt.Println("GOMAXPROCS:", maxProcs)
NumCPU: 8
GOMAXPROCS: 8

You can change GOMAXPROCS by setting the GOMAXPROCS environment variable or by calling runtime.GOMAXPROCS():

// Get the default value.
fmt.Println("GOMAXPROCS default:", runtime.GOMAXPROCS(0))

// Change the value.
runtime.GOMAXPROCS(1)
fmt.Println("GOMAXPROCS custom:", runtime.GOMAXPROCS(0))
GOMAXPROCS default: 8
GOMAXPROCS custom: 1

You can also undo the manual changes and go back to the default value set by the runtime. To do this, use the runtime.SetDefaultGOMAXPROCS function (Go 1.25+):

GOMAXPROCS=2 go run nproc.go
// Using the environment variable.
fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))

// Using the manual setting.
runtime.GOMAXPROCS(4)
fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))

// Back to the default value.
runtime.SetDefaultGOMAXPROCS()
fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
GOMAXPROCS: 2
GOMAXPROCS: 4
GOMAXPROCS: 8

CPU quota

Go programs often run in containers, like those managed by Docker or Kubernetes. These systems let you limit the CPU resources for a container using a Linux feature called cgroups.

A cgroup (control group) in Linux lets you group processes together and control how much CPU, memory, and network I/O they can use by setting limits and priorities.

For example, here's how you can limit a Docker container to use only four CPUs:

docker run --cpus=4 golang:1.24-alpine go run /app/nproc.go
// /app/nproc.go
maxProcs := runtime.GOMAXPROCS(0) // returns the current value
fmt.Println("NumCPU:", runtime.NumCPU())
fmt.Println("GOMAXPROCS:", maxProcs)

Before version 1.25, the Go runtime didn't consider the CPU quota when setting the GOMAXPROCS value. No matter how you limited CPU resources, GOMAXPROCS was always set to the number of logical CPUs on the host machine:

docker run --cpus=4 golang:1.24-alpine go run /app/nproc.go
NumCPU: 8
GOMAXPROCS: 8

Starting with version 1.25, the Go runtime respects the CPU quota:

docker run --cpus=4 golang:1.25-alpine go run /app/nproc.go
NumCPU: 8
GOMAXPROCS: 4

So, the default GOMAXPROCS value is set to either the number of logical CPUs or the CPU limit enforced by cgroup settings for the process, whichever is lower.

Note on CPU limits

Cgroups actually offer not just one, but two ways to limit CPU resources:

  • CPU quota — the maximum CPU time the cgroup may use within a given period.
  • CPU shares — relative CPU priorities given to the kernel scheduler.

Docker's --cpus and --cpu-period/--cpu-quota set the quota, while --cpu-shares sets the shares.

Kubernetes' CPU limit sets the quota, while CPU request sets the shares.

The Go runtime only takes the CPU quota into account when setting GOMAXPROCS, not the shares.

Fractional CPU limits are rounded up:

docker run --cpus=2.3 golang:1.25-alpine go run /app/nproc.go
NumCPU: 8
GOMAXPROCS: 3

On a machine with multiple CPUs, the minimum default value for GOMAXPROCS is 2, even if the CPU limit is set lower:

docker run --cpus=1 golang:1.25-alpine go run /app/nproc.go
NumCPU: 8
GOMAXPROCS: 2

The Go runtime automatically updates GOMAXPROCS if the CPU limit changes. It happens up to once per second (less frequently if the application is idle).

Concurrency primitives

Let's take a quick look at the three main concurrency tools for Go: goroutines, channels, and select.

Goroutine

A goroutine is implemented as a pointer to a runtime.g structure. Here's what it looks like:

// runtime/runtime2.go
type g struct {
    atomicstatus atomic.Uint32 // goroutine status
    stack        stack         // goroutine stack
    m            *m            // thread that runs the goroutine
    // ...
}

The g structure has many fields, but most of its memory is taken up by the stack, which holds the goroutine's local variables. By default, each stack gets 2 KB of memory, and it grows if needed.

Because goroutines use very little memory, they're much more efficient than operating system threads, which usually need about 1 MB each. Also, switching between goroutines is very fast because it's handled by Go's scheduler and doesn't involve the operating system's kernel (unlike switching between threads managed by the OS). This lets Go run hundreds of thousands, or even millions, of goroutines on a single machine.
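
To get a feel for these numbers, here's a sketch that starts 100,000 parked goroutines and measures how much stack memory they take. The exact figures vary by Go version and platform, but the result should be close to the 2 KB initial stack mentioned above:

package main

import (
    "fmt"
    "runtime"
)

func main() {
    var before, after runtime.MemStats
    runtime.ReadMemStats(&before)

    // Launch 100,000 goroutines that just sit and wait.
    // Each one gets its own stack when it's created.
    const n = 100_000
    stop := make(chan struct{})
    for range n {
        go func() { <-stop }()
    }

    runtime.ReadMemStats(&after)
    fmt.Println("live goroutines:", runtime.NumGoroutine())
    fmt.Printf("stack memory per goroutine: ~%d KB\n",
        (after.StackInuse-before.StackInuse)/n/1024)

    close(stop)
}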

Channel

A channel is implemented as a pointer to a runtime.hchan structure. Here's what it looks like:

// runtime/chan.go
type hchan struct {
    // channel buffer
    qcount   uint           // number of items in the buffer
    dataqsiz uint           // buffer array size
    buf      unsafe.Pointer // pointer to the buffer array

    // closed channel flag
    closed uint32

    // queues of goroutines waiting to receive and send
    recvq waitq // waiting to receive from the channel
    sendq waitq // waiting to send to the channel

    // protects the channel state
    lock mutex

    // ...
}

The buffer array (buf) has a fixed size (dataqsiz, which you can get with the cap() builtin). It's created when you make a buffered channel. The number of items in the channel (qcount, which you can get with the len() builtin) increases when you send to the channel and decreases when you receive from it.

The close() builtin sets the closed field to 1.

Sending an item to an unbuffered channel, or to a buffered channel that's already full, puts the goroutine into the sendq queue. Receiving from an empty channel puts the goroutine into the recvq queue.
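
Here's a small sketch that connects these fields to behavior you can observe with the len and cap builtins:

ch := make(chan int, 2) // buf holds up to 2 items, so dataqsiz = 2
fmt.Println(len(ch), cap(ch))

ch <- 10 // qcount = 1
ch <- 20 // qcount = 2, the buffer is now full
fmt.Println(len(ch), cap(ch))

// A third send would block this goroutine
// and put it into the sendq queue:
// ch <- 30

<-ch // receiving decrements qcount
fmt.Println(len(ch), cap(ch))

close(ch) // sets the closed flag to 1
0 2
2 2
1 2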

Select

The select logic is implemented in the runtime.selectgo function. It's a huge function that takes a list of select cases and (very simply put) works as follows:

  • Go through the cases and check if the matching channels are ready to send or receive.
  • If several cases are ready, choose one at random (to prevent starvation, where some cases are always chosen and others are never chosen).
  • Once a case is selected, perform the send or receive operation on the matching channel.
  • If there is a default case and no other cases are ready, pick the default.
  • If no cases are ready, block the goroutine and add it to the channel queue for each case.
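
Here's a small sketch that shows two of these rules: when several channels are ready, select picks one of them at random; when nothing is ready, the default case runs instead of blocking:

a := make(chan string, 1)
b := make(chan string, 1)
a <- "from a"
b <- "from b"

// Both cases are ready, so one is chosen at random.
select {
case msg := <-a:
    fmt.Println(msg)
case msg := <-b:
    fmt.Println(msg)
}

// Nothing to receive here, so the default case runs
// instead of blocking the goroutine.
empty := make(chan string)
select {
case msg := <-empty:
    fmt.Println(msg)
default:
    fmt.Println("nothing ready")
}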

✎ Exercise: Runtime simulator

Theory alone isn't enough: practice is what turns abstract knowledge into skills. The full version of the book contains a lot of exercises — that's why I recommend getting it.

If you are okay with just theory for now, let's continue.

Scheduler metrics

Metrics show how the Go runtime is performing, like how much heap memory it uses or how long garbage collection pauses take. Each metric has a unique name (for example, /sched/gomaxprocs:threads) and a value, which can be a number or a histogram.

We use the runtime/metrics package to work with metrics.

List all available metrics with descriptions:

func main() {
    descs := metrics.All()
    for _, d := range descs {
        fmt.Printf("Name: %s\n", d.Name)
        fmt.Printf("Description: %s\n", d.Description)
        fmt.Printf("Kind: %s\n", kindToString(d.Kind))
        fmt.Println()
    }
}

func kindToString(k metrics.ValueKind) string {
    switch k {
    case metrics.KindUint64:
        return "KindUint64"
    case metrics.KindFloat64:
        return "KindFloat64"
    case metrics.KindFloat64Histogram:
        return "KindFloat64Histogram"
    case metrics.KindBad:
        return "KindBad"
    default:
        return "Unknown"
    }
}
Name: /cgo/go-to-c-calls:calls
Description: Count of calls made from Go to C by the current process.
Kind: KindUint64

Name: /cpu/classes/gc/mark/assist:cpu-seconds
Description: Estimated total CPU time goroutines spent performing GC
tasks to assist the GC and prevent it from falling behind the application.
This metric is an overestimate, and not directly comparable to system
CPU time measurements. Compare only with other /cpu/classes metrics.
Kind: KindFloat64
...

Get the value of a specific metric:

samples := []metrics.Sample{
    {Name: "/sched/gomaxprocs:threads"},
    {Name: "/sched/goroutines:goroutines"},
}
metrics.Read(samples)

for _, s := range samples {
    // Assumes the value is a uint64. Check the metric description
    // or use s.Value.Kind() if you're not sure.
    fmt.Printf("%s: %v\n", s.Name, s.Value.Uint64())
}
/sched/gomaxprocs:threads: 8
/sched/goroutines:goroutines: 1
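
Histogram-valued metrics need a bit more handling. Here's a sketch that reads the scheduling latency distribution (/sched/latencies:seconds, a KindFloat64Histogram metric) and prints its non-empty buckets:

samples := []metrics.Sample{
    {Name: "/sched/latencies:seconds"},
}
metrics.Read(samples)

// Counts[i] is the number of samples that fall
// between Buckets[i] and Buckets[i+1].
h := samples[0].Value.Float64Histogram()
for i, count := range h.Counts {
    if count == 0 {
        continue
    }
    fmt.Printf("%g..%g sec: %d\n", h.Buckets[i], h.Buckets[i+1], count)
}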

Here are some goroutine-related metrics:

/sched/goroutines-created:goroutines

  • Count of goroutines created since program start (Go 1.26+).

/sched/goroutines:goroutines

  • Count of live goroutines (created but not finished yet).
  • An increase in this metric may indicate a goroutine leak (see the sketch after this list).

/sched/goroutines/not-in-go:goroutines

  • Approximate count of goroutines running or blocked in a system call or cgo call (Go 1.26+).
  • An increase in this metric may indicate problems with such calls.

/sched/goroutines/runnable:goroutines

  • Approximate count of goroutines ready to execute, but not executing (Go 1.26+).
  • An increase in this metric may mean the system is overloaded and the CPU can't keep up with the growing number of goroutines.

/sched/goroutines/running:goroutines

  • Approximate count of goroutines executing (Go 1.26+).
  • Always less than or equal to /sched/gomaxprocs:threads.

/sched/goroutines/waiting:goroutines

  • Approximate count of goroutines waiting on a resource — I/O or sync primitives (Go 1.26+).
  • An increase in this metric may indicate problems with mutexes, other synchronization primitives, or I/O.

/sched/threads/total:threads

  • The current count of live threads that are owned by the runtime (Go 1.26+).

/sched/gomaxprocs:threads

  • The current runtime.GOMAXPROCS setting — the maximum number of operating system threads the scheduler can use to execute goroutines concurrently.
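
For example, a simple way to catch the goroutine leak mentioned above is to sample /sched/goroutines:goroutines periodically and alert when the count grows too large. Here's a sketch; the interval and the threshold are arbitrary:

// Watch for a suspiciously high number of live goroutines.
go func() {
    sample := []metrics.Sample{{Name: "/sched/goroutines:goroutines"}}
    for range time.Tick(10 * time.Second) {
        metrics.Read(sample)
        if n := sample[0].Value.Uint64(); n > 10_000 {
            log.Printf("possible goroutine leak: %d live goroutines", n)
        }
    }
}()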

In real projects, runtime metrics are usually exported automatically with client libraries for Prometheus, OpenTelemetry, or other observability tools. Here's an example for Prometheus:

package main

import (
    "net/http"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
    // Export runtime/metrics in Prometheus format at the /metrics endpoint.
    http.Handle("/metrics", promhttp.Handler())
    http.ListenAndServe("localhost:2112", nil)
}

The exported metrics are then collected by Prometheus, visualized, and used to set up alerts.

Profiling

Profiling helps you understand exactly what the program is doing, what resources it uses, and where in the code this happens. Profiling is often not recommended in production because it's a "heavy" process that can slow things down. But that's not the case with Go.

Go's profiler is designed for production use. It uses sampling, so it doesn't track every single operation. Instead, it takes quick snapshots of the runtime every 10 ms and puts them together to give you a full picture.

Go supports the following profiles:

  • CPU. Shows how much CPU time each function uses. Use it to find performance bottlenecks if your program is running slowly because of CPU-heavy tasks.
  • Heap. Shows the heap memory currently used by each function. Use it to detect memory leaks or excessive memory usage.
  • Allocs. Shows which functions have allocated heap memory since the program started (not just what's currently in use). Use it to optimize garbage collection or reduce allocations that impact performance.
  • Goroutine. Shows the stack traces of all current goroutines. Use it to get an overview of what the program is doing.
  • Block. Shows where goroutines block waiting on synchronization primitives like channels, mutexes and wait groups. Use it to identify synchronization bottlenecks and issues in data exchange between goroutines. Disabled by default.
  • Mutex. Shows lock contentions on mutexes and internal runtime locks. Use it to find "problematic" mutexes that goroutines are frequently waiting for. Disabled by default.

The easiest way to add a profiler to your app is by using the net/http/pprof package. When you import it, it automatically registers HTTP handlers for collecting profiles:

package main

import (
    "net/http"
    _ "net/http/pprof"
    "sync"
)

func main() {
    // Enable block and mutex profiles.
    runtime.SetBlockProfileRate(1)
    runtime.SetMutexProfileFraction(1)
    // Start an HTTP server on localhost.
    // Profiler HTTP handlers are automatically
    // registered when you import "net/http/pprof".
    http.ListenAndServe("localhost:6060", nil)
}

Or you can register profiler handlers manually:

var wg sync.WaitGroup

wg.Go(func() {
    // Application server running on port 8080.
    mux := http.NewServeMux()
    mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("Hello, World!"))
    })
    log.Println("Starting hello server on :8080")
    log.Fatal(http.ListenAndServe(":8080", mux))
})

wg.Go(func() {
    // Profiling server running on localhost on port 6060.
    runtime.SetBlockProfileRate(1)
    runtime.SetMutexProfileFraction(1)

    mux := http.NewServeMux()
    mux.HandleFunc("/debug/pprof/", pprof.Index)
    mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
    mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
    log.Println("Starting pprof server on :6060")
    log.Fatal(http.ListenAndServe("localhost:6060", mux))
})

wg.Wait()

After that, you can collect a specific profile by running the go tool pprof command with the matching URL, or just open that URL in your browser:

go tool pprof -proto \
  "http://localhost:6060/debug/pprof/profile?seconds=N" > cpu.pprof

go tool pprof -proto \
  http://localhost:6060/debug/pprof/heap > heap.pprof

go tool pprof -proto \
  http://localhost:6060/debug/pprof/allocs > allocs.pprof

go tool pprof -proto \
  http://localhost:6060/debug/pprof/goroutine > goroutine.pprof

go tool pprof -proto \
  http://localhost:6060/debug/pprof/block > block.pprof

go tool pprof -proto \
  http://localhost:6060/debug/pprof/mutex > mutex.pprof

For the CPU profile, you can choose how long the profiler runs (the default is 30 seconds). Other profiles are taken instantly.

After running the profiler, you'll get a binary file that you can open in the browser using the same go tool pprof utility. For example:

go tool pprof -http=localhost:8080 cpu.pprof

The pprof web interface lets you view the same profile in different ways. My personal favorites are the flame graph, which clearly shows the call hierarchy and resource usage, and the source view, which shows the exact lines of code.

You can also profile manually. To collect a CPU profile, use StartCPUProfile and StopCPUProfile:

func main() {
    // Start profiling and stop it when main exits.
    // Ignore errors for simplicity.
    file, _ := os.Create("cpu.prof")
    defer file.Close()
    pprof.StartCPUProfile(file)
    defer pprof.StopCPUProfile()

    // The rest of the program code.
    // ...
}

To collect other profiles, use Lookup:

// profile collects a profile with the given name.
func profile(name string) {
    // Ignore errors for simplicity.
    file, _ := os.Create(name + ".prof")
    defer file.Close()
    p := pprof.Lookup(name)
    if p != nil {
        p.WriteTo(file, 0)
    }
}

func main() {
    runtime.SetBlockProfileRate(1)
    runtime.SetMutexProfileFraction(1)

    // ...
    profile("heap")
    profile("allocs")
    // ...
}

Profiling is a broad topic, and we've only scratched the surface here.

Tracing

Tracing records certain types of events while the program is running, mainly those related to concurrency and memory:

  • goroutine creation and state changes;
  • system calls;
  • garbage collection;
  • heap size changes;
  • and more.

If you enabled the profiling server as described earlier, you can collect a trace using this URL:

http://localhost:6060/debug/pprof/trace?seconds=N

Trace files can be quite large, so it's better to use a small N value.

After tracing is complete, you'll get a binary file that you can open in the browser using the go tool trace utility:

go tool trace -http=localhost:6060 trace.out

In the trace web interface, you'll see each goroutine's "lifecycle" on its own line. You can zoom in and out of the trace with the W and S keys, and you can click on any event to see more details:

You can also collect a trace manually:

func main() {
    // Start tracing and stop it when main exits.
    // Ignore errors for simplicity.
    file, _ := os.Create("trace.out")
    defer file.Close()
    trace.Start(file)
    defer trace.Stop()

    // The rest of the program code.
    // ...
}

Flight recorder

Flight recording is a tracing technique that collects execution data, such as function calls and memory allocations, within a sliding window that's limited by size or duration. It helps to record traces of interesting program behavior, even if you don't know in advance when it will happen.

The trace.FlightRecorder type (Go 1.25+) implements a flight recorder in Go. It tracks a moving window over the execution trace produced by the runtime, always containing the most recent trace data.

Here's an example of how you might use it.

First, configure the sliding window:

// Configure the flight recorder to keep
// at least 5 seconds of trace data,
// with a maximum buffer size of 3MB.
// Both of these are hints, not strict limits.
cfg := trace.FlightRecorderConfig{
    MinAge:   5 * time.Second,
    MaxBytes: 3 << 20, // 3MB
}

Then create the recorder and start it:

// Create and start the flight recorder.
rec := trace.NewFlightRecorder(cfg)
rec.Start()
defer rec.Stop()

Continue with the application code as usual:

// Simulate some workload.
done := make(chan struct{})
go func() {
    defer close(done)
    const n = 1 << 20
    var s []int
    for range n {
        s = append(s, rand.IntN(n))
    }
    fmt.Printf("done filling slice of %d elements\n", len(s))
}()
<-done

Finally, save the trace snapshot to a file when an important event occurs:

// Save the trace snapshot to a file.
file, _ := os.Create("/tmp/trace.out")
defer file.Close()
n, _ := rec.WriteTo(file)
fmt.Printf("wrote %dB to trace file\n", n)
done filling slice of 1048576 elements
wrote 8441B to trace file

Use go tool trace to view the trace in the browser:

go tool trace -http=localhost:6060 /tmp/trace.out

✎ Exercise: Comparing blocks

Theory alone isn't enough: practice is what turns abstract knowledge into skills. The full version of the book contains a lot of exercises — that's why I recommend getting it.

If you are okay with just theory for now, let's continue.

Keep it up

Now you can see how challenging the Go scheduler's job is. Fortunately, most of the time you don't need to worry about how it works behind the scenes — sticking to goroutines, channels, select, and other synchronization primitives is usually enough.

This is the final chapter of my "Gist of Go: Concurrency" book. I invite you to read it — the book is an easy-to-understand, interactive guide to concurrency programming in Go.

Buy for $25   or read online
