doc/go_mem.html

   1 <!--{
   2         "Title": "The Go Memory Model",
   3         "Subtitle": "Version of June 6, 2022",
   4         "Path": "/ref/mem"
   5 }-->
   6
   7 <style>
   8 p.rule {
   9   font-style: italic;
  10 }
  11 </style>
  12
  13 <h2 id="introduction">Introduction</h2>
  14
  15 <p>
  16 The Go memory model specifies the conditions under which
  17 reads of a variable in one goroutine can be guaranteed to
  18 observe values produced by writes to the same variable in a different goroutine.
  19 </p>
  20
  21
  22 <h3 id="advice">Advice</h3>
  23
  24 <p>
  25 Programs that modify data being simultaneously accessed by multiple goroutines
  26 must serialize such access.
  27 </p>
  28
  29 <p>
  30 To serialize access, protect the data with channel operations or other synchronization primitives
  31 such as those in the <a href="/pkg/sync/"><code>sync</code></a>
  32 and <a href="/pkg/sync/atomic/"><code>sync/atomic</code></a> packages.
  33 </p>
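<p>
As a minimal sketch of this advice, the following program serializes access to a shared
integer with a <code>sync.Mutex</code>.
The <code>counter</code> type and the <code>countTo</code> helper are hypothetical names
introduced only for illustration.
</p>

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical shared value; mu serializes every access to n.
type counter struct {
	mu sync.Mutex
	n  int
}

// inc increments n while holding the lock.
func (c *counter) inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// value reads n while holding the lock.
func (c *counter) value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countTo starts the given number of goroutines, each of which increments
// the counter once, and returns the final value after all have finished.
func countTo(workers int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.inc()
		}()
	}
	wg.Wait()
	return c.value()
}

func main() {
	fmt.Println(countTo(100)) // prints 100
}
```

<p>
Without the lock, the increments would race with each other and the final count would not be guaranteed.
</p>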
  34
  35 <p>
  36 If you must read the rest of this document to understand the behavior of your program,
  37 you are being too clever.
  38 </p>
  39
  40 <p>
  41 Don't be clever.
  42 </p>
  43
  44 <h3 id="overview">Informal Overview</h3>
  45
  46 <p>
  47 Go approaches its memory model in much the same way as the rest of the language,
  48 aiming to keep the semantics simple, understandable, and useful.
  49 This section gives a general overview of the approach and should suffice for most programmers.
  50 The memory model is specified more formally in the next section.
  51 </p>
  52
  53 <p>
  54 A <em>data race</em> is defined as
  55 a write to a memory location happening concurrently with another read or write to that same location,
  56 unless all the accesses involved are atomic data accesses as provided by the <code>sync/atomic</code> package.
  57 As noted already, programmers are strongly encouraged to use appropriate synchronization
  58 to avoid data races.
  59 In the absence of data races, Go programs behave as if all the goroutines
  60 were multiplexed onto a single processor.
  61 This property is sometimes referred to as DRF-SC: data-race-free programs
  62 execute in a sequentially consistent manner.
  63 </p>
  64
  65 <p>
  66 While programmers should write Go programs without data races,
  67 there are limitations to what a Go implementation can do in response to a data race.
  68 An implementation may always react to a data race by reporting the race and terminating the program.
  69 Otherwise, each read of a single-word-sized or sub-word-sized memory location
   70 must observe a value actually written to that location (perhaps by a concurrently executing goroutine)
  71 and not yet overwritten.
  72 These implementation constraints make Go more like Java or JavaScript,
  73 in that most races have a limited number of outcomes,
  74 and less like C and C++, where the meaning of any program with a race
  75 is entirely undefined, and the compiler may do anything at all.
  76 Go's approach aims to make errant programs more reliable and easier to debug,
  77 while still insisting that races are errors and that tools can diagnose and report them.
  78 </p>
  79
  80 <h2 id="model">Memory Model</h2>
  81
  82 <p>
  83 The following formal definition of Go's memory model closely follows
  84 the approach presented by Hans-J. Boehm and Sarita V. Adve in
  85 “<a href="https://www.hpl.hp.com/techreports/2008/HPL-2008-56.pdf">Foundations of the C++ Concurrency Memory Model</a>”,
  86 published in PLDI 2008.
  87 The definition of data-race-free programs and the guarantee of sequential consistency
  88 for race-free programs are equivalent to the ones in that work.
  89 </p>
  90
  91 <p>
  92 The memory model describes the requirements on program executions,
  93 which are made up of goroutine executions,
  94 which in turn are made up of memory operations.
  95 </p>
  96
  97 <p>
  98 A <i>memory operation</i> is modeled by four details:
  99 </p>
 100 <ul>
 101 <li>its kind, indicating whether it is an ordinary data read, an ordinary data write,
 102 or a <i>synchronizing operation</i> such as an atomic data access,
 103 a mutex operation, or a channel operation,
 104 <li>its location in the program,
 105 <li>the memory location or variable being accessed, and
 106 <li>the values read or written by the operation.
 107 </ul>
 108 <p>
 109 Some memory operations are <i>read-like</i>, including read, atomic read, mutex lock, and channel receive.
 110 Other memory operations are <i>write-like</i>, including write, atomic write, mutex unlock, channel send, and channel close.
 111 Some, such as atomic compare-and-swap, are both read-like and write-like.
 112 </p>
 113
 114 <p>
 115 A <i>goroutine execution</i> is modeled as a set of memory operations executed by a single goroutine.
 116 </p>
 117
 118 <p>
 119 <b>Requirement 1</b>:
 120 The memory operations in each goroutine must correspond to a correct sequential execution of that goroutine,
 121 given the values read from and written to memory.
 122 That execution must be consistent with the <i>sequenced before</i> relation,
 123 defined as the partial order requirements set out by the <a href="/ref/spec">Go language specification</a>
 124 for Go's control flow constructs as well as the <a href="/ref/spec#Order_of_evaluation">order of evaluation for expressions</a>.
 125 </p>
 126
 127 <p>
 128 A Go <i>program execution</i> is modeled as a set of goroutine executions,
 129 together with a mapping <i>W</i> that specifies the write-like operation that each read-like operation reads from.
 130 (Multiple executions of the same program can have different program executions.)
 131 </p>
 132
 133 <p>
 134 <b>Requirement 2</b>:
 135 For a given program execution, the mapping <i>W</i>, when limited to synchronizing operations,
 136 must be explainable by some implicit total order of the synchronizing operations
 137 that is consistent with sequencing and the values read and written by those operations.
 138 </p>
 139
 140 <p>
 141 The <i>synchronized before</i> relation is a partial order on synchronizing memory operations,
 142 derived from <i>W</i>.
 143 If a synchronizing read-like memory operation <i>r</i>
 144 observes a synchronizing write-like memory operation <i>w</i>
 145 (that is, if <i>W</i>(<i>r</i>) = <i>w</i>),
 146 then <i>w</i> is synchronized before <i>r</i>.
  147 Informally, the synchronized before relation is a subset of the implicit total order
 148 mentioned in the previous paragraph,
 149 limited to the information that <i>W</i> directly observes.
 150 </p>
 151
 152 <p>
 153 The <i>happens before</i> relation is defined as the transitive closure of the
 154 union of the sequenced before and synchronized before relations.
 155 </p>
 156
 157 <p>
 158 <b>Requirement 3</b>:
 159 For an ordinary (non-synchronizing) data read <i>r</i> on a memory location <i>x</i>,
 160 <i>W</i>(<i>r</i>) must be a write <i>w</i> that is <i>visible</i> to <i>r</i>,
 161 where visible means that both of the following hold:
 162
 163 <ol>
 164 <li><i>w</i> happens before <i>r</i>.
 165 <li><i>w</i> does not happen before any other write <i>w'</i> (to <i>x</i>) that happens before <i>r</i>.
 166 </ol>
 167
 168 <p>
 169 A <i>read-write data race</i> on memory location <i>x</i>
 170 consists of a read-like memory operation <i>r</i> on <i>x</i>
 171 and a write-like memory operation <i>w</i> on <i>x</i>,
 172 at least one of which is non-synchronizing,
 173 which are unordered by happens before
 174 (that is, neither <i>r</i> happens before <i>w</i>
 175 nor <i>w</i> happens before <i>r</i>).
 176 </p>
 177
 178 <p>
 179 A <i>write-write data race</i> on memory location <i>x</i>
 180 consists of two write-like memory operations <i>w</i> and <i>w'</i> on <i>x</i>,
 181 at least one of which is non-synchronizing,
 182 which are unordered by happens before.
 183 </p>
 184
 185 <p>
 186 Note that if there are no read-write or write-write data races on memory location <i>x</i>,
 187 then any read <i>r</i> on <i>x</i> has only one possible <i>W</i>(<i>r</i>):
 188 the single <i>w</i> that immediately precedes it in the happens before order.
 189 </p>
 190
 191 <p>
 192 More generally, it can be shown that any Go program that is data-race-free,
 193 meaning it has no program executions with read-write or write-write data races,
 194 can only have outcomes explained by some sequentially consistent interleaving
 195 of the goroutine executions.
 196 (The proof is the same as Section 7 of Boehm and Adve's paper cited above.)
 197 This property is called DRF-SC.
 198 </p>
 199
 200 <p>
 201 The intent of the formal definition is to match
 202 the DRF-SC guarantee provided to race-free programs
 203 by other languages, including C, C++, Java, JavaScript, Rust, and Swift.
 204 </p>
 205
 206 <p>
 207 Certain Go language operations such as goroutine creation and memory allocation
 208 act as synchronization operations.
 209 The effect of these operations on the synchronized-before partial order
 210 is documented in the “Synchronization” section below.
 211 Individual packages are responsible for providing similar documentation
 212 for their own operations.
 213 </p>
 214
 215 <h2 id="restrictions">Implementation Restrictions for Programs Containing Data Races</h2>
 216
 217 <p>
 218 The preceding section gave a formal definition of data-race-free program execution.
 219 This section informally describes the semantics that implementations must provide
 220 for programs that do contain races.
 221 </p>
 222
 223 <p>
 224 Any implementation can, upon detecting a data race,
 225 report the race and halt execution of the program.
 226 Implementations using ThreadSanitizer
 227 (accessed with “<code>go</code> <code>build</code> <code>-race</code>”)
 228 do exactly this.
 229 </p>
 230
 231 <p>
 232 A read of an array, struct, or complex number
  233 may be implemented as a read of each individual sub-value
 234 (array element, struct field, or real/imaginary component),
 235 in any order.
 236 Similarly, a write of an array, struct, or complex number
 237 may be implemented as a write of each individual sub-value,
 238 in any order.
 239 </p>
 240
 241 <p>
 242 A read <i>r</i> of a memory location <i>x</i>
 243 holding a value
 244 that is not larger than a machine word must observe
 245 some write <i>w</i> such that <i>r</i> does not happen before <i>w</i>
 246 and there is no write <i>w'</i> such that <i>w</i> happens before <i>w'</i>
 247 and <i>w'</i> happens before <i>r</i>.
 248 That is, each read must observe a value written by a preceding or concurrent write.
 249 </p>
 250
 251 <p>
 252 Additionally, observation of acausal and “out of thin air” writes is disallowed.
 253 </p>
 254
 255 <p>
 256 Reads of memory locations larger than a single machine word
 257 are encouraged but not required to meet the same semantics
 258 as word-sized memory locations,
 259 observing a single allowed write <i>w</i>.
 260 For performance reasons,
 261 implementations may instead treat larger operations
 262 as a set of individual machine-word-sized operations
 263 in an unspecified order.
 264 This means that races on multiword data structures
 265 can lead to inconsistent values not corresponding to a single write.
 266 When the values depend on the consistency
 267 of internal (pointer, length) or (pointer, type) pairs,
 268 as can be the case for interface values, maps,
 269 slices, and strings in most Go implementations,
 270 such races can in turn lead to arbitrary memory corruption.
 271 </p>
 272
 273 <p>
 274 Examples of incorrect synchronization are given in the
 275 “Incorrect synchronization” section below.
 276 </p>
 277
 278 <p>
 279 Examples of the limitations on implementations are given in the
 280 “Incorrect compilation” section below.
 281 </p>
 282
 283 <h2 id="synchronization">Synchronization</h2>
 284
 285 <h3 id="init">Initialization</h3>
 286
 287 <p>
 288 Program initialization runs in a single goroutine,
 289 but that goroutine may create other goroutines,
 290 which run concurrently.
 291 </p>
 292
 293 <p class="rule">
 294 If a package <code>p</code> imports package <code>q</code>, the completion of
 295 <code>q</code>'s <code>init</code> functions happens before the start of any of <code>p</code>'s.
 296 </p>
 297
 298 <p class="rule">
 299 The completion of all <code>init</code> functions is synchronized before
 300 the start of the function <code>main.main</code>.
 301 </p>
 302
 303 <h3 id="go">Goroutine creation</h3>
 304
 305 <p class="rule">
 306 The <code>go</code> statement that starts a new goroutine
 307 is synchronized before the start of the goroutine's execution.
 308 </p>
 309
 310 <p>
 311 For example, in this program:
 312 </p>
 313
 314 <pre>
 315 var a string
 316
 317 func f() {
 318         print(a)
 319 }
 320
 321 func hello() {
 322         a = "hello, world"
 323         go f()
 324 }
 325 </pre>
 326
 327 <p>
 328 calling <code>hello</code> will print <code>"hello, world"</code>
 329 at some point in the future (perhaps after <code>hello</code> has returned).
 330 </p>
 331
 332 <h3 id="goexit">Goroutine destruction</h3>
 333
 334 <p>
 335 The exit of a goroutine is not guaranteed to be synchronized before
 336 any event in the program.
 337 For example, in this program:
 338 </p>
 339
 340 <pre>
 341 var a string
 342
 343 func hello() {
 344         go func() { a = "hello" }()
 345         print(a)
 346 }
 347 </pre>
 348
 349 <p>
 350 the assignment to <code>a</code> is not followed by
 351 any synchronization event, so it is not guaranteed to be
 352 observed by any other goroutine.
 353 In fact, an aggressive compiler might delete the entire <code>go</code> statement.
 354 </p>
 355
 356 <p>
 357 If the effects of a goroutine must be observed by another goroutine,
 358 use a synchronization mechanism such as a lock or channel
 359 communication to establish a relative ordering.
 360 </p>
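<p>
One possible repair of the program above, sketched with a channel:
the goroutine signals after its write, and the caller receives before reading.
The <code>done</code> channel and the string return value are additions made for illustration.
</p>

```go
package main

import "fmt"

// hello starts a goroutine that writes a, waits for the goroutine's send,
// and only then reads a.
func hello() string {
	var a string
	done := make(chan bool)
	go func() {
		a = "hello"
		done <- true // the send is synchronized before the receive completes
	}()
	<-done
	return a
}

func main() {
	fmt.Println(hello()) // prints hello
}
```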
 361
 362 <h3 id="chan">Channel communication</h3>
 363
 364 <p>
 365 Channel communication is the main method of synchronization
 366 between goroutines.  Each send on a particular channel
 367 is matched to a corresponding receive from that channel,
 368 usually in a different goroutine.
 369 </p>
 370
 371 <p class="rule">
 372 A send on a channel is synchronized before the completion of the
 373 corresponding receive from that channel.
 374 </p>
 375
 376 <p>
 377 This program:
 378 </p>
 379
 380 <pre>
 381 var c = make(chan int, 10)
 382 var a string
 383
 384 func f() {
 385         a = "hello, world"
 386         c &lt;- 0
 387 }
 388
 389 func main() {
 390         go f()
 391         &lt;-c
 392         print(a)
 393 }
 394 </pre>
 395
 396 <p>
 397 is guaranteed to print <code>"hello, world"</code>.  The write to <code>a</code>
 398 is sequenced before the send on <code>c</code>, which is synchronized before
 399 the corresponding receive on <code>c</code> completes, which is sequenced before
 400 the <code>print</code>.
 401 </p>
 402
 403 <p class="rule">
 404 The closing of a channel is synchronized before a receive that returns a zero value
 405 because the channel is closed.
 406 </p>
 407
 408 <p>
 409 In the previous example, replacing
 410 <code>c &lt;- 0</code> with <code>close(c)</code>
 411 yields a program with the same guaranteed behavior.
 412 </p>
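<p>
Because every receive on a closed channel returns, a single <code>close</code> can release
any number of waiting goroutines.
The following sketch illustrates that use; the <code>broadcast</code> function and its
variables are hypothetical names, and a <code>sync.WaitGroup</code> is used only to collect the results.
</p>

```go
package main

import (
	"fmt"
	"sync"
)

// broadcast writes a shared message, closes ready, and returns what each of
// n waiting goroutines observed after its receive on ready returned.
func broadcast(n int) []string {
	var msg string
	ready := make(chan struct{})
	seen := make([]string, n)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			<-ready       // returns a zero value once ready is closed
			seen[i] = msg // ordered after the write to msg
		}(i)
	}

	msg = "hello, world"
	close(ready) // synchronized before each receive that it releases
	wg.Wait()
	return seen
}

func main() {
	fmt.Println(broadcast(3))
}
```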
 413
 414 <p class="rule">
 415 A receive from an unbuffered channel is synchronized before the completion of
 416 the corresponding send on that channel.
 417 </p>
 418
 419 <p>
 420 This program (as above, but with the send and receive statements swapped and
 421 using an unbuffered channel):
 422 </p>
 423
 424 <pre>
 425 var c = make(chan int)
 426 var a string
 427
 428 func f() {
 429         a = "hello, world"
 430         &lt;-c
 431 }
 432
 433 func main() {
 434         go f()
 435         c &lt;- 0
 436         print(a)
 437 }
 438 </pre>
 439
 440 <p>
 441 is also guaranteed to print <code>"hello, world"</code>.  The write to <code>a</code>
 442 is sequenced before the receive on <code>c</code>, which is synchronized before
 443 the corresponding send on <code>c</code> completes, which is sequenced
 444 before the <code>print</code>.
 445 </p>
 446
 447 <p>
 448 If the channel were buffered (e.g., <code>c = make(chan int, 1)</code>)
 449 then the program would not be guaranteed to print
 450 <code>"hello, world"</code>.  (It might print the empty string,
 451 crash, or do something else.)
 452 </p>
 453
 454 <p class="rule">
  455 The <i>k</i>th receive on a channel with capacity <i>C</i> is synchronized before the completion of the <i>k</i>+<i>C</i>th send on that channel.
 456 </p>
 457
 458 <p>
 459 This rule generalizes the previous rule to buffered channels.
 460 It allows a counting semaphore to be modeled by a buffered channel:
 461 the number of items in the channel corresponds to the number of active uses,
 462 the capacity of the channel corresponds to the maximum number of simultaneous uses,
 463 sending an item acquires the semaphore, and receiving an item releases
 464 the semaphore.
 465 This is a common idiom for limiting concurrency.
 466 </p>
 467
 468 <p>
 469 This program starts a goroutine for every entry in the work list, but the
 470 goroutines coordinate using the <code>limit</code> channel to ensure
 471 that at most three are running work functions at a time.
 472 </p>
 473
 474 <pre>
 475 var limit = make(chan int, 3)
 476
 477 func main() {
 478         for _, w := range work {
 479                 go func(w func()) {
 480                         limit &lt;- 1
 481                         w()
 482                         &lt;-limit
 483                 }(w)
 484         }
 485         select{}
 486 }
 487 </pre>
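<p>
A terminating variant of the same idiom is sketched here so that the limit can be checked.
The <code>runLimited</code> function, its <code>running</code> and <code>peak</code> counters,
and the use of <code>sync.WaitGroup</code> are illustrative assumptions, not part of the program above.
</p>

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runLimited runs every function in work in its own goroutine, allowing at
// most n of them inside a work function at once. It returns the highest
// number of work functions observed running simultaneously.
func runLimited(work []func(), n int) int32 {
	limit := make(chan struct{}, n)
	var running, peak int32
	var wg sync.WaitGroup
	for _, w := range work {
		wg.Add(1)
		go func(w func()) {
			defer wg.Done()
			limit <- struct{}{} // acquire a slot
			cur := atomic.AddInt32(&running, 1)
			for { // record a new peak if cur exceeds the stored one
				p := atomic.LoadInt32(&peak)
				if cur <= p || atomic.CompareAndSwapInt32(&peak, p, cur) {
					break
				}
			}
			w()
			atomic.AddInt32(&running, -1)
			<-limit // release the slot
		}(w)
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	work := make([]func(), 10)
	for i := range work {
		work[i] = func() {}
	}
	fmt.Println(runLimited(work, 3) <= 3) // prints true
}
```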
 488
 489 <h3 id="locks">Locks</h3>
 490
 491 <p>
 492 The <code>sync</code> package implements two lock data types,
 493 <code>sync.Mutex</code> and <code>sync.RWMutex</code>.
 494 </p>
 495
 496 <p class="rule">
 497 For any <code>sync.Mutex</code> or <code>sync.RWMutex</code> variable <code>l</code> and <i>n</i> &lt; <i>m</i>,
 498 call <i>n</i> of <code>l.Unlock()</code> is synchronized before call <i>m</i> of <code>l.Lock()</code> returns.
 499 </p>
 500
 501 <p>
 502 This program:
 503 </p>
 504
 505 <pre>
 506 var l sync.Mutex
 507 var a string
 508
 509 func f() {
 510         a = "hello, world"
 511         l.Unlock()
 512 }
 513
 514 func main() {
 515         l.Lock()
 516         go f()
 517         l.Lock()
 518         print(a)
 519 }
 520 </pre>
 521
 522 <p>
 523 is guaranteed to print <code>"hello, world"</code>.
 524 The first call to <code>l.Unlock()</code> (in <code>f</code>) is synchronized
 525 before the second call to <code>l.Lock()</code> (in <code>main</code>) returns,
 526 which is sequenced before the <code>print</code>.
 527 </p>
 528
 529 <p class="rule">
 530 For any call to <code>l.RLock</code> on a <code>sync.RWMutex</code> variable <code>l</code>,
 531 there is an <i>n</i> such that the <i>n</i>th call to <code>l.Unlock</code>
 532 is synchronized before the return from <code>l.RLock</code>,
 533 and the matching call to <code>l.RUnlock</code> is synchronized before the return from call <i>n</i>+1 to <code>l.Lock</code>.
 534 </p>
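<p>
The following sketch shows a typical use of <code>sync.RWMutex</code>:
writers take the lock with <code>Lock</code>, readers with <code>RLock</code>.
The <code>settings</code> type and the <code>demo</code> function are hypothetical names chosen for illustration.
</p>

```go
package main

import (
	"fmt"
	"sync"
)

// settings is a hypothetical read-mostly table guarded by an RWMutex.
type settings struct {
	mu sync.RWMutex
	m  map[string]string
}

// set takes the write lock, so it excludes all readers and other writers.
func (s *settings) set(k, v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[k] = v
}

// get takes the read lock, so any number of get calls may run at once.
func (s *settings) get(k string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.m[k]
	return v, ok
}

// demo runs one writer alongside several readers and returns the number of
// reads that found the key with an unexpected value.
func demo(readers int) int {
	s := &settings{m: make(map[string]string)}
	bad := make([]int, readers)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		s.set("greeting", "hello, world")
	}()
	for i := 0; i < readers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if v, ok := s.get("greeting"); ok && v != "hello, world" {
				bad[i]++
			}
		}(i)
	}
	wg.Wait()
	total := 0
	for _, n := range bad {
		total += n
	}
	return total
}

func main() {
	fmt.Println(demo(8)) // prints 0
}
```

<p>
A reader either runs before the writer and finds no entry, or runs after it and finds the complete value.
</p>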
 535
 536 <p class="rule">
 537 A successful call to <code>l.TryLock</code> (or <code>l.TryRLock</code>)
 538 is equivalent to a call to <code>l.Lock</code> (or <code>l.RLock</code>).
 539 An unsuccessful call has no synchronizing effect at all.
 540 As far as the memory model is concerned,
 541 <code>l.TryLock</code> (or <code>l.TryRLock</code>)
 542 may be considered to be able to return false
 543 even when the mutex <i>l</i> is unlocked.
 544 </p>
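<p>
A minimal sketch of code that respects this rule: the guarded data is touched only
after a successful <code>TryLock</code>, and a failed attempt is treated as having learned nothing.
The names <code>tryBump</code> and <code>stats</code> are hypothetical.
</p>

```go
package main

import (
	"fmt"
	"sync"
)

var (
	mu    sync.Mutex
	stats int // guarded by mu
)

// tryBump increments stats only if the lock can be acquired without waiting.
// It reports whether the increment happened.
func tryBump() bool {
	if !mu.TryLock() {
		// No synchronization took place, so stats must not be accessed here.
		return false
	}
	defer mu.Unlock()
	stats++
	return true
}

func main() {
	fmt.Println(tryBump())
}
```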
 545
 546 <h3 id="once">Once</h3>
 547
 548 <p>
 549 The <code>sync</code> package provides a safe mechanism for
 550 initialization in the presence of multiple goroutines
 551 through the use of the <code>Once</code> type.
  552 Multiple goroutines can execute <code>once.Do(f)</code> for a particular <code>f</code>,
 553 but only one will run <code>f()</code>, and the other calls block
 554 until <code>f()</code> has returned.
 555 </p>
 556
 557 <p class="rule">
 558 The completion of a single call of <code>f()</code> from <code>once.Do(f)</code>
 559 is synchronized before the return of any call of <code>once.Do(f)</code>.
 560 </p>
 561
 562 <p>
 563 In this program:
 564 </p>
 565
 566 <pre>
 567 var a string
 568 var once sync.Once
 569
 570 func setup() {
 571         a = "hello, world"
 572 }
 573
 574 func doprint() {
 575         once.Do(setup)
 576         print(a)
 577 }
 578
 579 func twoprint() {
 580         go doprint()
 581         go doprint()
 582 }
 583 </pre>
 584
 585 <p>
 586 calling <code>twoprint</code> will call <code>setup</code> exactly
 587 once.
 588 The <code>setup</code> function will complete before either call
 589 of <code>print</code>.
 590 The result will be that <code>"hello, world"</code> will be printed
 591 twice.
 592 </p>
 593
 594 <h3 id="atomic">Atomic Values</h3>
 595
 596 <p>
 597 The APIs in the <a href="/pkg/sync/atomic/"><code>sync/atomic</code></a>
 598 package are collectively “atomic operations”
 599 that can be used to synchronize the execution of different goroutines.
 600 If the effect of an atomic operation <i>A</i> is observed by atomic operation <i>B</i>,
 601 then <i>A</i> is synchronized before <i>B</i>.
 602 All the atomic operations executed in a program behave as though executed
 603 in some sequentially consistent order.
 604 </p>
 605
 606 <p>
 607 The preceding definition has the same semantics as C++’s sequentially consistent atomics
 608 and Java’s <code>volatile</code> variables.
 609 </p>
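<p>
As an illustration, the following sketch publishes an ordinary variable through an atomic flag.
It assumes a spin loop is acceptable for demonstration purposes; the <code>publish</code>
function and its variables are hypothetical names.
</p>

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// publish starts a goroutine that writes a and then sets an atomic flag.
// The caller spins until it observes the flag and then reads a.
func publish() string {
	var a string
	var done int32 // accessed only through sync/atomic

	go func() {
		a = "hello, world"
		atomic.StoreInt32(&done, 1)
	}()

	for atomic.LoadInt32(&done) == 0 {
		runtime.Gosched() // let the other goroutine run
	}
	// The load that observed the store is synchronized after it,
	// so the earlier write to a is visible here.
	return a
}

func main() {
	fmt.Println(publish()) // prints hello, world
}
```

<p>
In most programs a channel or a <code>sync.WaitGroup</code> expresses the same hand-off without spinning.
</p>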
 610
 611 <h3 id="finalizer">Finalizers</h3>
 612
 613 <p>
 614 The <a href="/pkg/runtime/"><code>runtime</code></a> package provides
 615 a <code>SetFinalizer</code> function that adds a finalizer to be called when
 616 a particular object is no longer reachable by the program.
 617 A call to <code>SetFinalizer(x, f)</code> is synchronized before the finalization call <code>f(x)</code>.
 618 </p>
 619
 620 <h3 id="more">Additional Mechanisms</h3>
 621
 622 <p>
 623 The <code>sync</code> package provides additional synchronization abstractions,
 624 including <a href="/pkg/sync/#Cond">condition variables</a>,
 625 <a href="/pkg/sync/#Map">lock-free maps</a>,
 626 <a href="/pkg/sync/#Pool">allocation pools</a>,
 627 and
 628 <a href="/pkg/sync/#WaitGroup">wait groups</a>.
 629 The documentation for each of these specifies the guarantees it
 630 makes concerning synchronization.
 631 </p>
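<p>
For instance, the <code>sync.WaitGroup</code> documentation states that a call to
<code>Done</code> is synchronized before the return of any <code>Wait</code> call that it unblocks.
The following sketch relies on that guarantee; the <code>squares</code> function is a hypothetical example.
</p>

```go
package main

import (
	"fmt"
	"sync"
)

// squares computes i*i for 0 <= i < n, using one goroutine per element.
// Each goroutine writes a distinct element, and Wait orders all of those
// writes before the slice is returned.
func squares(n int) []int {
	out := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			out[i] = i * i
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(squares(5)) // prints [0 1 4 9 16]
}
```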
 632
 633 <p>
 634 Other packages that provide synchronization abstractions
 635 should document the guarantees they make too.
 636 </p>
 637
 638
 639 <h2 id="badsync">Incorrect synchronization</h2>
 640
 641 <p>
 642 Programs with races are incorrect and
 643 can exhibit non-sequentially consistent executions.
 644 In particular, note that a read <i>r</i> may observe the value written by any write <i>w</i>
 645 that executes concurrently with <i>r</i>.
 646 Even if this occurs, it does not imply that reads happening after <i>r</i>
 647 will observe writes that happened before <i>w</i>.
 648 </p>
 649
 650 <p>
 651 In this program:
 652 </p>
 653
 654 <pre>
 655 var a, b int
 656
 657 func f() {
 658         a = 1
 659         b = 2
 660 }
 661
 662 func g() {
 663         print(b)
 664         print(a)
 665 }
 666
 667 func main() {
 668         go f()
 669         g()
 670 }
 671 </pre>
 672
 673 <p>
 674 it can happen that <code>g</code> prints <code>2</code> and then <code>0</code>.
 675 </p>
 676
 677 <p>
 678 This fact invalidates a few common idioms.
 679 </p>
 680
 681 <p>
 682 Double-checked locking is an attempt to avoid the overhead of synchronization.
 683 For example, the <code>twoprint</code> program might be
 684 incorrectly written as:
 685 </p>
 686
 687 <pre>
 688 var a string
 689 var done bool
 690
 691 func setup() {
 692         a = "hello, world"
 693         done = true
 694 }
 695
 696 func doprint() {
 697         if !done {
 698                 once.Do(setup)
 699         }
 700         print(a)
 701 }
 702
 703 func twoprint() {
 704         go doprint()
 705         go doprint()
 706 }
 707 </pre>
 708
 709 <p>
 710 but there is no guarantee that, in <code>doprint</code>, observing the write to <code>done</code>
 711 implies observing the write to <code>a</code>.  This
 712 version can (incorrectly) print an empty string
 713 instead of <code>"hello, world"</code>.
 714 </p>
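<p>
A minimal sketch of a repaired version drops the unsynchronized <code>done</code> check
and lets every caller go through <code>once.Do</code>.
The <code>calls</code> counter and the <code>get</code> and <code>twoget</code> helpers are
illustrative additions that return values instead of printing them.
</p>

```go
package main

import (
	"fmt"
	"sync"
)

var (
	a     string
	once  sync.Once
	calls int // number of times setup has run; written only inside setup
)

func setup() {
	a = "hello, world"
	calls++
}

// get always calls once.Do, so the completion of setup is synchronized
// before the read of a in every caller.
func get() string {
	once.Do(setup)
	return a
}

// twoget calls get from two goroutines and returns both results.
func twoget() (string, string) {
	var r [2]string
	var wg sync.WaitGroup
	for i := range r {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			r[i] = get()
		}(i)
	}
	wg.Wait()
	return r[0], r[1]
}

func main() {
	x, y := twoget()
	fmt.Println(x, y, calls)
}
```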
 715
 716 <p>
 717 Another incorrect idiom is busy waiting for a value, as in:
 718 </p>
 719
 720 <pre>
 721 var a string
 722 var done bool
 723
 724 func setup() {
 725         a = "hello, world"
 726         done = true
 727 }
 728
 729 func main() {
 730         go setup()
 731         for !done {
 732         }
 733         print(a)
 734 }
 735 </pre>
 736
 737 <p>
 738 As before, there is no guarantee that, in <code>main</code>,
 739 observing the write to <code>done</code>
 740 implies observing the write to <code>a</code>, so this program could
 741 print an empty string too.
 742 Worse, there is no guarantee that the write to <code>done</code> will ever
 743 be observed by <code>main</code>, since there are no synchronization
  744 events between the two goroutines.  The loop in <code>main</code> is not
 745 guaranteed to finish.
 746 </p>
 747
 748 <p>
 749 There are subtler variants on this theme, such as this program.
 750 </p>
 751
 752 <pre>
 753 type T struct {
 754         msg string
 755 }
 756
 757 var g *T
 758
 759 func setup() {
 760         t := new(T)
 761         t.msg = "hello, world"
 762         g = t
 763 }
 764
 765 func main() {
 766         go setup()
 767         for g == nil {
 768         }
 769         print(g.msg)
 770 }
 771 </pre>
 772
 773 <p>
 774 Even if <code>main</code> observes <code>g != nil</code> and exits its loop,
 775 there is no guarantee that it will observe the initialized
 776 value for <code>g.msg</code>.
 777 </p>
 778
 779 <p>
 780 In all these examples, the solution is the same:
 781 use explicit synchronization.
 782 </p>
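<p>
As one possible sketch, the pointer-publishing example can hand the value over on a channel
instead of polling a shared variable.
The <code>receive</code> function and the <code>ch</code> parameter are hypothetical names
introduced for illustration.
</p>

```go
package main

import "fmt"

type T struct {
	msg string
}

// setup initializes a T and sends the pointer on ch. The write to t.msg is
// sequenced before the send.
func setup(ch chan<- *T) {
	t := new(T)
	t.msg = "hello, world"
	ch <- t
}

// receive starts setup and waits for the pointer. The send is synchronized
// before the completion of the receive, so t.msg is fully initialized.
func receive() string {
	ch := make(chan *T)
	go setup(ch)
	g := <-ch
	return g.msg
}

func main() {
	fmt.Println(receive()) // prints hello, world
}
```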
 783
 784 <h2 id="badcompiler">Incorrect compilation</h2>
 785
 786 <p>
 787 The Go memory model restricts compiler optimizations as much as it does Go programs.
 788 Some compiler optimizations that would be valid in single-threaded programs are not valid in all Go programs.
 789 In particular, a compiler must not introduce writes that do not exist in the original program,
 790 it must not allow a single read to observe multiple values,
 791 and it must not allow a single write to write multiple values.
 792 </p>
 793
 794 <p>
  795 All the following examples assume that <code>*p</code> and <code>*q</code> refer to
 796 memory locations accessible to multiple goroutines.
 797 </p>
 798
 799 <p>
 800 Not introducing data races into race-free programs means not moving
 801 writes out of conditional statements in which they appear.
 802 For example, a compiler must not invert the conditional in this program:
 803 </p>
 804
 805 <pre>
 806 *p = 1
 807 if cond {
 808         *p = 2
 809 }
 810 </pre>
 811
 812 <p>
 813 That is, the compiler must not rewrite the program into this one:
 814 </p>
 815
 816 <pre>
 817 *p = 2
 818 if !cond {
 819         *p = 1
 820 }
 821 </pre>
 822
 823 <p>
 824 If <code>cond</code> is false and another goroutine is reading <code>*p</code>,
  825 then in the original program, the other goroutine can observe only a prior value of <code>*p</code> or <code>1</code>.
 826 In the rewritten program, the other goroutine can observe <code>2</code>, which was previously impossible.
 827 </p>
 828
 829 <p>
 830 Not introducing data races also means not assuming that loops terminate.
 831 For example, a compiler must in general not move the accesses to <code>*p</code> or <code>*q</code>
 832 ahead of the loop in this program:
 833 </p>
 834
 835 <pre>
 836 n := 0
 837 for e := list; e != nil; e = e.next {
 838         n++
 839 }
 840 i := *p
 841 *q = 1
 842 </pre>
 843
 844 <p>
 845 If <code>list</code> pointed to a cyclic list,
 846 then the original program would never access <code>*p</code> or <code>*q</code>,
 847 but the rewritten program would.
  848 (Moving <code>*p</code> ahead would be safe if the compiler can prove <code>*p</code> will not panic;
  849 moving <code>*q</code> ahead would also require the compiler proving that no other
  850 goroutine can access <code>*q</code>.)
 851 </p>
 852
 853 <p>
 854 Not introducing data races also means not assuming that called functions
 855 always return or are free of synchronization operations.
 856 For example, a compiler must not move the accesses to <code>*p</code> or <code>*q</code>
 857 ahead of the function call in this program
 858 (at least not without direct knowledge of the precise behavior of <code>f</code>):
 859 </p>
 860
 861 <pre>
 862 f()
 863 i := *p
 864 *q = 1
 865 </pre>
 866
 867 <p>
 868 If the call never returned, then once again the original program
 869 would never access <code>*p</code> or <code>*q</code>, but the rewritten program would.
 870 And if the call contained synchronizing operations, then the original program
 871 could establish happens before edges preceding the accesses
 872 to <code>*p</code> and <code>*q</code>, but the rewritten program would not.
 873 </p>
 874
 875 <p>
 876 Not allowing a single read to observe multiple values means
 877 not reloading local variables from shared memory.
 878 For example, a compiler must not discard <code>i</code> and reload it
 879 a second time from <code>*p</code> in this program:
 880 </p>
 881
 882 <pre>
 883 i := *p
 884 if i &lt; 0 || i &gt;= len(funcs) {
 885         panic("invalid function index")
 886 }
 887 ... complex code ...
 888 // compiler must NOT reload i = *p here
 889 funcs[i]()
 890 </pre>
 891
 892 <p>
 893 If the complex code needs many registers, a compiler for single-threaded programs
 894 could discard <code>i</code> without saving a copy and then reload
 895 <code>i = *p</code> just before
 896 <code>funcs[i]()</code>.
 897 A Go compiler must not, because the value of <code>*p</code> may have changed.
 898 (Instead, the compiler could spill <code>i</code> to the stack.)
 899 </p>
 900
 901 <p>
 902 Not allowing a single write to write multiple values also means not using
 903 the memory where a local variable will be written as temporary storage before the write.
 904 For example, a compiler must not use <code>*p</code> as temporary storage in this program:
 905 </p>
 906
 907 <pre>
 908 *p = i + *p/2
 909 </pre>
 910
 911 <p>
 912 That is, it must not rewrite the program into this one:
 913 </p>
 914
 915 <pre>
 916 *p /= 2
 917 *p += i
 918 </pre>
 919
 920 <p>
 921 If <code>i</code> and <code>*p</code> start equal to 2,
 922 the original code does <code>*p = 3</code>,
  923 so a racing goroutine can read only 2 or 3 from <code>*p</code>.
  924 The rewritten code does <code>*p = 1</code> and then <code>*p = 3</code>,
  925 allowing a racing goroutine to read 1 as well.
 926 </p>
 927
 928 <p>
 929 Note that all these optimizations are permitted in C/C++ compilers:
 930 a Go compiler sharing a back end with a C/C++ compiler must take care
 931 to disable optimizations that are invalid for Go.
 932 </p>
 933
 934 <p>
 935 Note that the prohibition on introducing data races
 936 does not apply if the compiler can prove that the races
 937 do not affect correct execution on the target platform.
 938 For example, on essentially all CPUs, it is valid to rewrite
 939 </p>
 940
 941 <pre>
 942 n := 0
  943 for i := 0; i &lt; m; i++ {
 944         n += *shared
 945 }
 946 </pre>
 947
  948 <p>into:</p>
 949
 950 <pre>
 951 n := 0
 952 local := *shared
  953 for i := 0; i &lt; m; i++ {
 954         n += local
 955 }
 956 </pre>
 957
 958 <p>
 959 provided it can be proved that <code>*shared</code> will not fault on access,
 960 because the potential added read will not affect any existing concurrent reads or writes.
 961 On the other hand, the rewrite would not be valid in a source-to-source translator.
 962 </p>
 963
 964 <h2 id="conclusion">Conclusion</h2>
 965
 966 <p>
 967 Go programmers writing data-race-free programs can rely on
 968 sequentially consistent execution of those programs,
 969 just as in essentially all other modern programming languages.
 970 </p>
 971
 972 <p>
 973 When it comes to programs with races,
 974 both programmers and compilers should remember the advice:
 975 don't be clever.
 976 </p>