doc/go_mem.html

   1 <!--{
   2         "Title": "The Go Memory Model",
   3         "Subtitle": "Version of June 6, 2022",
   4         "Path": "/ref/mem"
   5 }-->
   6
   7 <style>
   8 p.rule {
   9   font-style: italic;
  10 }
  11 </style>
  12
  13 <h2 id="introduction">Introduction</h2>
  14
  15 <p>
  16 The Go memory model specifies the conditions under which
  17 reads of a variable in one goroutine can be guaranteed to
  18 observe values produced by writes to the same variable in a different goroutine.
  19 </p>
  20
  21
  22 <h3 id="advice">Advice</h3>
  23
  24 <p>
  25 Programs that modify data being simultaneously accessed by multiple goroutines
  26 must serialize such access.
  27 </p>
  28
  29 <p>
  30 To serialize access, protect the data with channel operations or other synchronization primitives
  31 such as those in the <a href="/pkg/sync/"><code>sync</code></a>
  32 and <a href="/pkg/sync/atomic/"><code>sync/atomic</code></a> packages.
  33 </p>
  34
  35 <p>
  36 If you must read the rest of this document to understand the behavior of your program,
  37 you are being too clever.
  38 </p>
  39
  40 <p>
  41 Don't be clever.
  42 </p>
  43
  44 <h3 id="overview">Informal Overview</h3>
  45
  46 <p>
  47 Go approaches its memory model in much the same way as the rest of the language,
  48 aiming to keep the semantics simple, understandable, and useful.
  49 This section gives a general overview of the approach and should suffice for most programmers.
  50 The memory model is specified more formally in the next section.
  51 </p>
  52
  53 <p>
  54 A <em>data race</em> is defined as
  55 a write to a memory location happening concurrently with another read or write to that same location,
  56 unless all the accesses involved are atomic data accesses as provided by the <code>sync/atomic</code> package.
  57 As noted already, programmers are strongly encouraged to use appropriate synchronization
  58 to avoid data races.
  59 In the absence of data races, Go programs behave as if all the goroutines
  60 were multiplexed onto a single processor.
  61 This property is sometimes referred to as DRF-SC: data-race-free programs
  62 execute in a sequentially consistent manner.
  63 </p>
  64
  65 <p>
  66 While programmers should write Go programs without data races,
  67 there are limitations to what a Go implementation can do in response to a data race.
  68 An implementation may always react to a data race by reporting the race and terminating the program.
  69 Otherwise, each read of a single-word-sized or sub-word-sized memory location
  70 must observe a value actually written to that location (perhaps by a concurrent executing goroutine)
  71 and not yet overwritten.
  72 These implementation constraints make Go more like Java or JavaScript,
  73 in that most races have a limited number of outcomes,
  74 and less like C and C++, where the meaning of any program with a race
  75 is entirely undefined, and the compiler may do anything at all.
  76 Go's approach aims to make errant programs more reliable and easier to debug,
  77 while still insisting that races are errors and that tools can diagnose and report them.
  78 </p>
  79
  80 <h2 id="model">Memory Model</h2>
  81
  82 <p>
  83 The following formal definition of Go's memory model closely follows
  84 the approach presented by Hans-J. Boehm and Sarita V. Adve in
  85 “<a href="https://www.hpl.hp.com/techreports/2008/HPL-2008-56.pdf">Foundations of the C++ Concurrency Memory Model</a>”,
  86 published in PLDI 2008.
  87 The definition of data-race-free programs and the guarantee of sequential consistency
  88 for race-free programs are equivalent to the ones in that work.
  89 </p>
  90
  91 <p>
  92 The memory model describes the requirements on program executions,
  93 which are made up of goroutine executions,
  94 which in turn are made up of memory operations.
  95 </p>
  96
  97 <p>
  98 A <i>memory operation</i> is modeled by four details:
  99 </p>
 100 <ul>
 101 <li>its kind, indicating whether it is an ordinary data read, an ordinary data write,
 102 or a <i>synchronizing operation</i> such as an atomic data access,
 103 a mutex operation, or a channel operation,
 104 <li>its location in the program,
 105 <li>the memory location or variable being accessed, and
 106 <li>the values read or written by the operation.
 107 </ul>
 108 <p>
 109 Some memory operations are <i>read-like</i>, including read, atomic read, mutex lock, and channel receive.
 110 Other memory operations are <i>write-like</i>, including write, atomic write, mutex unlock, channel send, and channel close.
 111 Some, such as atomic compare-and-swap, are both read-like and write-like.
 112 </p>
 113
 114 <p>
 115 A <i>goroutine execution</i> is modeled as a set of memory operations executed by a single goroutine.
 116 </p>
 117
 118 <p>
 119 <b>Requirement 1</b>:
 120 The memory operations in each goroutine must correspond to a correct sequential execution of that goroutine,
 121 given the values read from and written to memory.
 122 That execution must be consistent with the <i>sequenced before</i> relation,
 123 defined as the partial order requirements set out by the <a href="/ref/spec">Go language specification</a>
 124 for Go's control flow constructs as well as the <a href="/ref/spec#Order_of_evaluation">order of evaluation for expressions</a>.
 125 </p>
 126
 127 <p>
 128 A Go <i>program execution</i> is modeled as a set of goroutine executions,
 129 together with a mapping <i>W</i> that specifies the write-like operation that each read-like operation reads from.
 130 (Multiple executions of the same program can have different program executions.)
 131 </p>
 132
 133 <p>
 134 <b>Requirement 2</b>:
 135 For a given program execution, the mapping <i>W</i>, when limited to synchronizing operations,
 136 must be explainable by some implicit total order of the synchronizing operations
 137 that is consistent with sequencing and the values read and written by those operations.
 138 </p>
 139
 140 <p>
 141 The <i>synchronized before</i> relation is a partial order on synchronizing memory operations,
 142 derived from <i>W</i>.
 143 If a synchronizing read-like memory operation <i>r</i>
 144 observes a synchronizing write-like memory operation <i>w</i>
 145 (that is, if <i>W</i>(<i>r</i>) = <i>w</i>),
 146 then <i>w</i> is synchronized before <i>r</i>.
 147 Informally, the synchronized before relation is a subset of the implied total order
 148 mentioned in the previous paragraph,
 149 limited to the information that <i>W</i> directly observes.
 150 </p>
 151
 152 <p>
 153 The <i>happens before</i> relation is defined as the transitive closure of the
 154 union of the sequenced before and synchronized before relations.
 155 </p>
 156
 157 <p>
 158 <b>Requirement 3</b>:
 159 For an ordinary (non-synchronizing) data read <i>r</i> on a memory location <i>x</i>,
 160 <i>W</i>(<i>r</i>) must be a write <i>w</i> that is <i>visible</i> to <i>r</i>,
 161 where visible means that both of the following hold:
 162 </p>
 163
 164 <ol>
 165 <li><i>w</i> happens before <i>r</i>.
 166 <li><i>w</i> does not happen before any other write <i>w'</i> (to <i>x</i>) that happens before <i>r</i>.
 167 </ol>
 168
 169 <p>
 170 A <i>read-write data race</i> on memory location <i>x</i>
 171 consists of a read-like memory operation <i>r</i> on <i>x</i>
 172 and a write-like memory operation <i>w</i> on <i>x</i>,
 173 at least one of which is non-synchronizing,
 174 which are unordered by happens before
 175 (that is, neither <i>r</i> happens before <i>w</i>
 176 nor <i>w</i> happens before <i>r</i>).
 177 </p>
 178
 179 <p>
 180 A <i>write-write data race</i> on memory location <i>x</i>
 181 consists of two write-like memory operations <i>w</i> and <i>w'</i> on <i>x</i>,
 182 at least one of which is non-synchronizing,
 183 which are unordered by happens before.
 184 </p>
 185
 186 <p>
 187 Note that if there are no read-write or write-write data races on memory location <i>x</i>,
 188 then any read <i>r</i> on <i>x</i> has only one possible <i>W</i>(<i>r</i>):
 189 the single <i>w</i> that immediately precedes it in the happens before order.
 190 </p>
 191
 192 <p>
 193 More generally, it can be shown that any Go program that is data-race-free,
 194 meaning it has no program executions with read-write or write-write data races,
 195 can only have outcomes explained by some sequentially consistent interleaving
 196 of the goroutine executions.
 197 (The proof is the same as Section 7 of Boehm and Adve's paper cited above.)
 198 This property is called DRF-SC.
 199 </p>
 200
 201 <p>
 202 The intent of the formal definition is to match
 203 the DRF-SC guarantee provided to race-free programs
 204 by other languages, including C, C++, Java, JavaScript, Rust, and Swift.
 205 </p>
 206
 207 <p>
 208 Certain Go language operations such as goroutine creation and memory allocation
 209 act as synchronization operations.
 210 The effect of these operations on the synchronized-before partial order
 211 is documented in the “Synchronization” section below.
 212 Individual packages are responsible for providing similar documentation
 213 for their own operations.
 214 </p>
 215
 216 <h2 id="restrictions">Implementation Restrictions for Programs Containing Data Races</h2>
 217
 218 <p>
 219 The preceding section gave a formal definition of data-race-free program execution.
 220 This section informally describes the semantics that implementations must provide
 221 for programs that do contain races.
 222 </p>
 223
 224 <p>
 225 Any implementation can, upon detecting a data race,
 226 report the race and halt execution of the program.
 227 Implementations using ThreadSanitizer
 228 (accessed with “<code>go</code> <code>build</code> <code>-race</code>”)
 229 do exactly this.
 230 </p>
 231
 232 <p>
 233 A read of an array, struct, or complex number
 234 may by implemented as a read of each individual sub-value
 235 (array element, struct field, or real/imaginary component),
 236 in any order.
 237 Similarly, a write of an array, struct, or complex number
 238 may be implemented as a write of each individual sub-value,
 239 in any order.
 240 </p>
 241
 242 <p>
 243 A read <i>r</i> of a memory location <i>x</i>
 244 holding a value
 245 that is not larger than a machine word must observe
 246 some write <i>w</i> such that <i>r</i> does not happen before <i>w</i>
 247 and there is no write <i>w'</i> such that <i>w</i> happens before <i>w'</i>
 248 and <i>w'</i> happens before <i>r</i>.
 249 That is, each read must observe a value written by a preceding or concurrent write.
 250 </p>
 251
 252 <p>
 253 Additionally, observation of acausal and “out of thin air” writes is disallowed.
 254 </p>
 255
 256 <p>
 257 Reads of memory locations larger than a single machine word
 258 are encouraged but not required to meet the same semantics
 259 as word-sized memory locations,
 260 observing a single allowed write <i>w</i>.
 261 For performance reasons,
 262 implementations may instead treat larger operations
 263 as a set of individual machine-word-sized operations
 264 in an unspecified order.
 265 This means that races on multiword data structures
 266 can lead to inconsistent values not corresponding to a single write.
 267 When the values depend on the consistency
 268 of internal (pointer, length) or (pointer, type) pairs,
 269 as can be the case for interface values, maps,
 270 slices, and strings in most Go implementations,
 271 such races can in turn lead to arbitrary memory corruption.
 272 </p>
 273
 274 <p>
 275 Examples of incorrect synchronization are given in the
 276 “Incorrect synchronization” section below.
 277 </p>
 278
 279 <p>
 280 Examples of the limitations on implementations are given in the
 281 “Incorrect compilation” section below.
 282 </p>
 283
 284 <h2 id="synchronization">Synchronization</h2>
 285
 286 <h3 id="init">Initialization</h3>
 287
 288 <p>
 289 Program initialization runs in a single goroutine,
 290 but that goroutine may create other goroutines,
 291 which run concurrently.
 292 </p>
 293
 294 <p class="rule">
 295 If a package <code>p</code> imports package <code>q</code>, the completion of
 296 <code>q</code>'s <code>init</code> functions happens before the start of any of <code>p</code>'s.
 297 </p>
 298
 299 <p class="rule">
 300 The completion of all <code>init</code> functions is synchronized before
 301 the start of the function <code>main.main</code>.
 302 </p>
 303
 304 <h3 id="go">Goroutine creation</h3>
 305
 306 <p class="rule">
 307 The <code>go</code> statement that starts a new goroutine
 308 is synchronized before the start of the goroutine's execution.
 309 </p>
 310
 311 <p>
 312 For example, in this program:
 313 </p>
 314
 315 <pre>
 316 var a string
 317
 318 func f() {
 319         print(a)
 320 }
 321
 322 func hello() {
 323         a = "hello, world"
 324         go f()
 325 }
 326 </pre>
 327
 328 <p>
 329 calling <code>hello</code> will print <code>"hello, world"</code>
 330 at some point in the future (perhaps after <code>hello</code> has returned).
 331 </p>
 332
 333 <h3 id="goexit">Goroutine destruction</h3>
 334
 335 <p>
 336 The exit of a goroutine is not guaranteed to be synchronized before
 337 any event in the program.
 338 For example, in this program:
 339 </p>
 340
 341 <pre>
 342 var a string
 343
 344 func hello() {
 345         go func() { a = "hello" }()
 346         print(a)
 347 }
 348 </pre>
 349
 350 <p>
 351 the assignment to <code>a</code> is not followed by
 352 any synchronization event, so it is not guaranteed to be
 353 observed by any other goroutine.
 354 In fact, an aggressive compiler might delete the entire <code>go</code> statement.
 355 </p>
 356
 357 <p>
 358 If the effects of a goroutine must be observed by another goroutine,
 359 use a synchronization mechanism such as a lock or channel
 360 communication to establish a relative ordering.
 361 </p>
 362
 363 <h3 id="chan">Channel communication</h3>
 364
 365 <p>
 366 Channel communication is the main method of synchronization
 367 between goroutines.  Each send on a particular channel
 368 is matched to a corresponding receive from that channel,
 369 usually in a different goroutine.
 370 </p>
 371
 372 <p class="rule">
 373 A send on a channel is synchronized before the completion of the
 374 corresponding receive from that channel.
 375 </p>
 376
 377 <p>
 378 This program:
 379 </p>
 380
 381 <pre>
 382 var c = make(chan int, 10)
 383 var a string
 384
 385 func f() {
 386         a = "hello, world"
 387         c &lt;- 0
 388 }
 389
 390 func main() {
 391         go f()
 392         &lt;-c
 393         print(a)
 394 }
 395 </pre>
 396
 397 <p>
 398 is guaranteed to print <code>"hello, world"</code>.  The write to <code>a</code>
 399 is sequenced before the send on <code>c</code>, which is synchronized before
 400 the corresponding receive on <code>c</code> completes, which is sequenced before
 401 the <code>print</code>.
 402 </p>
 403
 404 <p class="rule">
 405 The closing of a channel is synchronized before a receive that returns a zero value
 406 because the channel is closed.
 407 </p>
 408
 409 <p>
 410 In the previous example, replacing
 411 <code>c &lt;- 0</code> with <code>close(c)</code>
 412 yields a program with the same guaranteed behavior.
 413 </p>
 414
 415 <p class="rule">
 416 A receive from an unbuffered channel is synchronized before the completion of
 417 the corresponding send on that channel.
 418 </p>
 419
 420 <p>
 421 This program (as above, but with the send and receive statements swapped and
 422 using an unbuffered channel):
 423 </p>
 424
 425 <pre>
 426 var c = make(chan int)
 427 var a string
 428
 429 func f() {
 430         a = "hello, world"
 431         &lt;-c
 432 }
 433
 434 func main() {
 435         go f()
 436         c &lt;- 0
 437         print(a)
 438 }
 439 </pre>
 440
 441 <p>
 442 is also guaranteed to print <code>"hello, world"</code>.  The write to <code>a</code>
 443 is sequenced before the receive on <code>c</code>, which is synchronized before
 444 the corresponding send on <code>c</code> completes, which is sequenced
 445 before the <code>print</code>.
 446 </p>
 447
 448 <p>
 449 If the channel were buffered (e.g., <code>c = make(chan int, 1)</code>)
 450 then the program would not be guaranteed to print
 451 <code>"hello, world"</code>.  (It might print the empty string,
 452 crash, or do something else.)
 453 </p>
 454
 455 <p class="rule">
 456 The <i>k</i>th receive on a channel with capacity <i>C</i> is synchronized before the completion of the <i>k</i>+<i>C</i>th send from that channel completes.
 457 </p>
 458
 459 <p>
 460 This rule generalizes the previous rule to buffered channels.
 461 It allows a counting semaphore to be modeled by a buffered channel:
 462 the number of items in the channel corresponds to the number of active uses,
 463 the capacity of the channel corresponds to the maximum number of simultaneous uses,
 464 sending an item acquires the semaphore, and receiving an item releases
 465 the semaphore.
 466 This is a common idiom for limiting concurrency.
 467 </p>
 468
 469 <p>
 470 This program starts a goroutine for every entry in the work list, but the
 471 goroutines coordinate using the <code>limit</code> channel to ensure
 472 that at most three are running work functions at a time.
 473 </p>
 474
 475 <pre>
 476 var limit = make(chan int, 3)
 477
 478 func main() {
 479         for _, w := range work {
 480                 go func(w func()) {
 481                         limit &lt;- 1
 482                         w()
 483                         &lt;-limit
 484                 }(w)
 485         }
 486         select{}
 487 }
 488 </pre>
 489
 490 <h3 id="locks">Locks</h3>
 491
 492 <p>
 493 The <code>sync</code> package implements two lock data types,
 494 <code>sync.Mutex</code> and <code>sync.RWMutex</code>.
 495 </p>
 496
 497 <p class="rule">
 498 For any <code>sync.Mutex</code> or <code>sync.RWMutex</code> variable <code>l</code> and <i>n</i> &lt; <i>m</i>,
 499 call <i>n</i> of <code>l.Unlock()</code> is synchronized before call <i>m</i> of <code>l.Lock()</code> returns.
 500 </p>
 501
 502 <p>
 503 This program:
 504 </p>
 505
 506 <pre>
 507 var l sync.Mutex
 508 var a string
 509
 510 func f() {
 511         a = "hello, world"
 512         l.Unlock()
 513 }
 514
 515 func main() {
 516         l.Lock()
 517         go f()
 518         l.Lock()
 519         print(a)
 520 }
 521 </pre>
 522
 523 <p>
 524 is guaranteed to print <code>"hello, world"</code>.
 525 The first call to <code>l.Unlock()</code> (in <code>f</code>) is synchronized
 526 before the second call to <code>l.Lock()</code> (in <code>main</code>) returns,
 527 which is sequenced before the <code>print</code>.
 528 </p>
 529
 530 <p class="rule">
 531 For any call to <code>l.RLock</code> on a <code>sync.RWMutex</code> variable <code>l</code>,
 532 there is an <i>n</i> such that the <i>n</i>th call to <code>l.Unlock</code>
 533 is synchronized before the return from <code>l.RLock</code>,
 534 and the matching call to <code>l.RUnlock</code> is synchronized before the return from call <i>n</i>+1 to <code>l.Lock</code>.
 535 </p>
 536
 537 <p class="rule">
 538 A successful call to <code>l.TryLock</code> (or <code>l.TryRLock</code>)
 539 is equivalent to a call to <code>l.Lock</code> (or <code>l.RLock</code>).
 540 An unsuccessful call has no synchronizing effect at all.
 541 As far as the memory model is concerned,
 542 <code>l.TryLock</code> (or <code>l.TryRLock</code>)
 543 may be considered to be able to return false
 544 even when the mutex <i>l</i> is unlocked.
 545 </p>
 546
 547 <h3 id="once">Once</h3>
 548
 549 <p>
 550 The <code>sync</code> package provides a safe mechanism for
 551 initialization in the presence of multiple goroutines
 552 through the use of the <code>Once</code> type.
 553 Multiple threads can execute <code>once.Do(f)</code> for a particular <code>f</code>,
 554 but only one will run <code>f()</code>, and the other calls block
 555 until <code>f()</code> has returned.
 556 </p>
 557
 558 <p class="rule">
 559 The completion of a single call of <code>f()</code> from <code>once.Do(f)</code>
 560 is synchronized before the return of any call of <code>once.Do(f)</code>.
 561 </p>
 562
 563 <p>
 564 In this program:
 565 </p>
 566
 567 <pre>
 568 var a string
 569 var once sync.Once
 570
 571 func setup() {
 572         a = "hello, world"
 573 }
 574
 575 func doprint() {
 576         once.Do(setup)
 577         print(a)
 578 }
 579
 580 func twoprint() {
 581         go doprint()
 582         go doprint()
 583 }
 584 </pre>
 585
 586 <p>
 587 calling <code>twoprint</code> will call <code>setup</code> exactly
 588 once.
 589 The <code>setup</code> function will complete before either call
 590 of <code>print</code>.
 591 The result will be that <code>"hello, world"</code> will be printed
 592 twice.
 593 </p>
 594
 595 <h3 id="atomic">Atomic Values</h3>
 596
 597 <p>
 598 The APIs in the <a href="/pkg/sync/atomic/"><code>sync/atomic</code></a>
 599 package are collectively “atomic operations”
 600 that can be used to synchronize the execution of different goroutines.
 601 If the effect of an atomic operation <i>A</i> is observed by atomic operation <i>B</i>,
 602 then <i>A</i> is synchronized before <i>B</i>.
 603 All the atomic operations executed in a program behave as though executed
 604 in some sequentially consistent order.
 605 </p>
 606
 607 <p>
 608 The preceding definition has the same semantics as C++’s sequentially consistent atomics
 609 and Java’s <code>volatile</code> variables.
 610 </p>
 611
 612 <h3 id="finalizer">Finalizers</h3>
 613
 614 <p>
 615 The <a href="/pkg/runtime/"><code>runtime</code></a> package provides
 616 a <code>SetFinalizer</code> function that adds a finalizer to be called when
 617 a particular object is no longer reachable by the program.
 618 A call to <code>SetFinalizer(x, f)</code> is synchronized before the finalization call <code>f(x)</code>.
 619 </p>
 620
 621 <h3 id="more">Additional Mechanisms</h3>
 622
 623 <p>
 624 The <code>sync</code> package provides additional synchronization abstractions,
 625 including <a href="/pkg/sync/#Cond">condition variables</a>,
 626 <a href="/pkg/sync/#Map">lock-free maps</a>,
 627 <a href="/pkg/sync/#Pool">allocation pools</a>,
 628 and
 629 <a href="/pkg/sync/#WaitGroup">wait groups</a>.
 630 The documentation for each of these specifies the guarantees it
 631 makes concerning synchronization.
 632 </p>
 633
 634 <p>
 635 Other packages that provide synchronization abstractions
 636 should document the guarantees they make too.
 637 </p>
 638
 639
 640 <h2 id="badsync">Incorrect synchronization</h2>
 641
 642 <p>
 643 Programs with races are incorrect and
 644 can exhibit non-sequentially consistent executions.
 645 In particular, note that a read <i>r</i> may observe the value written by any write <i>w</i>
 646 that executes concurrently with <i>r</i>.
 647 Even if this occurs, it does not imply that reads happening after <i>r</i>
 648 will observe writes that happened before <i>w</i>.
 649 </p>
 650
 651 <p>
 652 In this program:
 653 </p>
 654
 655 <pre>
 656 var a, b int
 657
 658 func f() {
 659         a = 1
 660         b = 2
 661 }
 662
 663 func g() {
 664         print(b)
 665         print(a)
 666 }
 667
 668 func main() {
 669         go f()
 670         g()
 671 }
 672 </pre>
 673
 674 <p>
 675 it can happen that <code>g</code> prints <code>2</code> and then <code>0</code>.
 676 </p>
 677
 678 <p>
 679 This fact invalidates a few common idioms.
 680 </p>
 681
 682 <p>
 683 Double-checked locking is an attempt to avoid the overhead of synchronization.
 684 For example, the <code>twoprint</code> program might be
 685 incorrectly written as:
 686 </p>
 687
 688 <pre>
 689 var a string
 690 var done bool
 691
 692 func setup() {
 693         a = "hello, world"
 694         done = true
 695 }
 696
 697 func doprint() {
 698         if !done {
 699                 once.Do(setup)
 700         }
 701         print(a)
 702 }
 703
 704 func twoprint() {
 705         go doprint()
 706         go doprint()
 707 }
 708 </pre>
 709
 710 <p>
 711 but there is no guarantee that, in <code>doprint</code>, observing the write to <code>done</code>
 712 implies observing the write to <code>a</code>.  This
 713 version can (incorrectly) print an empty string
 714 instead of <code>"hello, world"</code>.
 715 </p>
 716
 717 <p>
 718 Another incorrect idiom is busy waiting for a value, as in:
 719 </p>
 720
 721 <pre>
 722 var a string
 723 var done bool
 724
 725 func setup() {
 726         a = "hello, world"
 727         done = true
 728 }
 729
 730 func main() {
 731         go setup()
 732         for !done {
 733         }
 734         print(a)
 735 }
 736 </pre>
 737
 738 <p>
 739 As before, there is no guarantee that, in <code>main</code>,
 740 observing the write to <code>done</code>
 741 implies observing the write to <code>a</code>, so this program could
 742 print an empty string too.
 743 Worse, there is no guarantee that the write to <code>done</code> will ever
 744 be observed by <code>main</code>, since there are no synchronization
 745 events between the two threads.  The loop in <code>main</code> is not
 746 guaranteed to finish.
 747 </p>
 748
 749 <p>
 750 There are subtler variants on this theme, such as this program.
 751 </p>
 752
 753 <pre>
 754 type T struct {
 755         msg string
 756 }
 757
 758 var g *T
 759
 760 func setup() {
 761         t := new(T)
 762         t.msg = "hello, world"
 763         g = t
 764 }
 765
 766 func main() {
 767         go setup()
 768         for g == nil {
 769         }
 770         print(g.msg)
 771 }
 772 </pre>
 773
 774 <p>
 775 Even if <code>main</code> observes <code>g != nil</code> and exits its loop,
 776 there is no guarantee that it will observe the initialized
 777 value for <code>g.msg</code>.
 778 </p>
 779
 780 <p>
 781 In all these examples, the solution is the same:
 782 use explicit synchronization.
 783 </p>
 784
 785 <h2 id="badcompiler">Incorrect compilation</h2>
 786
 787 <p>
 788 The Go memory model restricts compiler optimizations as much as it does Go programs.
 789 Some compiler optimizations that would be valid in single-threaded programs are not valid in all Go programs.
 790 In particular, a compiler must not introduce writes that do not exist in the original program,
 791 it must not allow a single read to observe multiple values,
 792 and it must not allow a single write to write multiple values.
 793 </p>
 794
 795 <p>
 796 All the following examples assume that `*p` and `*q` refer to
 797 memory locations accessible to multiple goroutines.
 798 </p>
 799
 800 <p>
 801 Not introducing data races into race-free programs means not moving
 802 writes out of conditional statements in which they appear.
 803 For example, a compiler must not invert the conditional in this program:
 804 </p>
 805
 806 <pre>
 807 *p = 1
 808 if cond {
 809         *p = 2
 810 }
 811 </pre>
 812
 813 <p>
 814 That is, the compiler must not rewrite the program into this one:
 815 </p>
 816
 817 <pre>
 818 *p = 2
 819 if !cond {
 820         *p = 1
 821 }
 822 </pre>
 823
 824 <p>
 825 If <code>cond</code> is false and another goroutine is reading <code>*p</code>,
 826 then in the original program, the other goroutine can only observe any prior value of <code>*p</code> and <code>1</code>.
 827 In the rewritten program, the other goroutine can observe <code>2</code>, which was previously impossible.
 828 </p>
 829
 830 <p>
 831 Not introducing data races also means not assuming that loops terminate.
 832 For example, a compiler must in general not move the accesses to <code>*p</code> or <code>*q</code>
 833 ahead of the loop in this program:
 834 </p>
 835
 836 <pre>
 837 n := 0
 838 for e := list; e != nil; e = e.next {
 839         n++
 840 }
 841 i := *p
 842 *q = 1
 843 </pre>
 844
 845 <p>
 846 If <code>list</code> pointed to a cyclic list,
 847 then the original program would never access <code>*p</code> or <code>*q</code>,
 848 but the rewritten program would.
 849 (Moving `*p` ahead would be safe if the compiler can prove `*p` will not panic;
 850 moving `*q` ahead would also require the compiler proving that no other
 851 goroutine can access `*q`.)
 852 </p>
 853
 854 <p>
 855 Not introducing data races also means not assuming that called functions
 856 always return or are free of synchronization operations.
 857 For example, a compiler must not move the accesses to <code>*p</code> or <code>*q</code>
 858 ahead of the function call in this program
 859 (at least not without direct knowledge of the precise behavior of <code>f</code>):
 860 </p>
 861
 862 <pre>
 863 f()
 864 i := *p
 865 *q = 1
 866 </pre>
 867
 868 <p>
 869 If the call never returned, then once again the original program
 870 would never access <code>*p</code> or <code>*q</code>, but the rewritten program would.
 871 And if the call contained synchronizing operations, then the original program
 872 could establish happens before edges preceding the accesses
 873 to <code>*p</code> and <code>*q</code>, but the rewritten program would not.
 874 </p>
 875
 876 <p>
 877 Not allowing a single read to observe multiple values means
 878 not reloading local variables from shared memory.
 879 For example, a compiler must not discard <code>i</code> and reload it
 880 a second time from <code>*p</code> in this program:
 881 </p>
 882
 883 <pre>
 884 i := *p
 885 if i &lt; 0 || i &gt;= len(funcs) {
 886         panic("invalid function index")
 887 }
 888 ... complex code ...
 889 // compiler must NOT reload i = *p here
 890 funcs[i]()
 891 </pre>
 892
 893 <p>
 894 If the complex code needs many registers, a compiler for single-threaded programs
 895 could discard <code>i</code> without saving a copy and then reload
 896 <code>i = *p</code> just before
 897 <code>funcs[i]()</code>.
 898 A Go compiler must not, because the value of <code>*p</code> may have changed.
 899 (Instead, the compiler could spill <code>i</code> to the stack.)
 900 </p>
 901
 902 <p>
 903 Not allowing a single write to write multiple values also means not using
 904 the memory where a local variable will be written as temporary storage before the write.
 905 For example, a compiler must not use <code>*p</code> as temporary storage in this program:
 906 </p>
 907
 908 <pre>
 909 *p = i + *p/2
 910 </pre>
 911
 912 <p>
 913 That is, it must not rewrite the program into this one:
 914 </p>
 915
 916 <pre>
 917 *p /= 2
 918 *p += i
 919 </pre>
 920
 921 <p>
 922 If <code>i</code> and <code>*p</code> start equal to 2,
 923 the original code does <code>*p = 3</code>,
 924 so a racing thread can read only 2 or 3 from <code>*p</code>.
 925 The rewritten code does <code>*p = 1</code> and then <code>*p = 3</code>,
 926 allowing a racing thread to read 1 as well.
 927 </p>
 928
 929 <p>
 930 Note that all these optimizations are permitted in C/C++ compilers:
 931 a Go compiler sharing a back end with a C/C++ compiler must take care
 932 to disable optimizations that are invalid for Go.
 933 </p>
 934
 935 <p>
 936 Note that the prohibition on introducing data races
 937 does not apply if the compiler can prove that the races
 938 do not affect correct execution on the target platform.
 939 For example, on essentially all CPUs, it is valid to rewrite
 940 </p>
 941
 942 <pre>
 943 n := 0
 944 for i := 0; i < m; i++ {
 945         n += *shared
 946 }
 947 </pre>
 948
 949 into:
 950
 951 <pre>
 952 n := 0
 953 local := *shared
 954 for i := 0; i < m; i++ {
 955         n += local
 956 }
 957 </pre>
 958
 959 <p>
 960 provided it can be proved that <code>*shared</code> will not fault on access,
 961 because the potential added read will not affect any existing concurrent reads or writes.
 962 On the other hand, the rewrite would not be valid in a source-to-source translator.
 963 </p>
 964
 965 <h2 id="conclusion">Conclusion</h2>
 966
 967 <p>
 968 Go programmers writing data-race-free programs can rely on
 969 sequentially consistent execution of those programs,
 970 just as in essentially all other modern programming languages.
 971 </p>
 972
 973 <p>
 974 When it comes to programs with races,
 975 both programmers and compilers should remember the advice:
 976 don't be clever.
 977 </p>