https://lab.whitequark.org/
whitequark's lab notebook
2023-07-22T03:37:00Z
whitequark
https://lab.whitequark.org/
tag:lab.whitequark.org,2023-07-22:/notes/2023-07-22/blinking-a-led-using-stm32-dma/
Blinking a LED using STM32 DMA
2023-07-22T03:37:00Z
2023-07-22T03:37:00Z
<p>Every embedded developer has blinked a LED—this is both a rite of passage and a smoke test for bring-up of new hardware. Most often you do it in software. Sometimes you do it by connecting the LED directly to a peripheral.</p>
<p>Have you ever blinked a LED using DMA? No loops. No use of the CPU after the initial configuration. No use of the timer PWM output, either. Just the DMA peripheral, unbothered, moisturized, happy, in its lane, focused, flourishing, reconfiguring itself by writing to its own registers every time the timer ticks.</p>
<figure><figcaption>main.rs (<a href="/files/dmablink/main.rs">download</a>)</figcaption><pre><code class="language-rust"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
</pre></td>
<td class="rouge-code"><pre><span class="nd">#![no_std]</span>
<span class="nd">#![no_main]</span>
<span class="k">use</span> <span class="nn">core</span><span class="p">::{</span><span class="n">ptr</span><span class="p">,</span> <span class="n">sync</span><span class="p">};</span>
<span class="k">use</span> <span class="nn">defmt</span><span class="p">::{</span><span class="o">*</span><span class="p">,</span> <span class="n">unreachable</span><span class="p">};</span>
<span class="k">use</span> <span class="p">{</span><span class="n">defmt_rtt</span> <span class="k">as</span> <span class="n">_</span><span class="p">,</span> <span class="n">panic_probe</span> <span class="k">as</span> <span class="n">_</span><span class="p">};</span>
<span class="k">use</span> <span class="n">stm32_metapac</span> <span class="k">as</span> <span class="n">pac</span><span class="p">;</span>
<span class="c1">// -8<-8<- configuration</span>
<span class="k">const</span> <span class="n">GPIO_REGS</span><span class="p">:</span> <span class="nn">pac</span><span class="p">::</span><span class="nn">gpio</span><span class="p">::</span><span class="n">Gpio</span> <span class="o">=</span> <span class="nn">pac</span><span class="p">::</span><span class="n">GPIOC</span><span class="p">;</span>
<span class="k">const</span> <span class="n">GPIO_LED</span><span class="p">:</span> <span class="nb">usize</span> <span class="o">=</span> <span class="mi">5</span><span class="p">;</span>
<span class="c1">// -8<-8<- end of config</span>
<span class="c1">// The DMA registers are laid out in memory like this:</span>
<span class="c1">// - CH(n+0)_CR <- control register (R/O when EN=1, except for EN)</span>
<span class="c1">// - CH(n+0)_NDTR <- length register (R/O when EN=1)</span>
<span class="c1">// - CH(n+0)_PAR <- peripheral addr (R/W when EN=1!)</span>
<span class="c1">// - CH(n+0)_MAR <- memory addr (R/W when EN=1!)</span>
<span class="c1">// - CH(n+1)_CR</span>
<span class="c1">// - CH(n+1)_...</span>
<span class="c1">//</span>
<span class="c1">// This means that a single DMA transfer to the DMA registers themselves can set</span>
<span class="c1">// up the transfer (without setting EN=1) for CH(n+0)..CH(n+k-1) and then trigger</span>
<span class="c1">// CH(n+k), but it cannot trigger the transfers it just set up; these should be</span>
<span class="c1">// triggered by a previously configured CH(n+k) channel that writes the same data</span>
<span class="c1">// over except now with EN=1.</span>
<span class="c1">//</span>
<span class="c1">// To work around this limitation it helps to see the DMA channel as having three</span>
<span class="c1">// distinct phases it is circularly traversing:</span>
<span class="c1">// 1. Setup phase: EN=0; configuration is being written to the channel</span>
<span class="c1">// 2. Transfer phase: EN=1, NDTR>0; data is being transferred</span>
<span class="c1">// 3. Dormant phase: EN=1, NDTR=0; transfer has finished but the channel is locked</span>
<span class="c1">// Two of the three phases (with EN=1) make it impossible to reconfigure the channel</span>
<span class="c1">// directly, instead requiring EN bit to be reset. For a system made out of DMA</span>
<span class="c1">// channels to return to its initial state, it needs at least three channels,</span>
<span class="c1">// so that the one channel in the transfer phase can advance the state of the system.</span>
<span class="k">const</span> <span class="n">DMA1_CHX_BASE</span><span class="p">:</span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u32</span> <span class="o">=</span> <span class="mi">0x4002_0008</span> <span class="k">as</span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u32</span><span class="p">;</span>
<span class="nd">macro_rules!</span> <span class="n">dma_control</span> <span class="p">{</span>
<span class="p">(</span> <span class="n">en</span> <span class="nv">$</span><span class="p">(|</span> <span class="nv">$ctrl:tt</span><span class="p">)</span><span class="o">*</span> <span class="p">)</span> <span class="k">=></span> <span class="p">{</span> <span class="p">(</span><span class="mi">1</span><span class="o"><<</span><span class="mi">0</span><span class="p">)</span> <span class="p">|</span> <span class="nd">dma_control!</span><span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl</span><span class="p">)</span><span class="o">*</span><span class="p">)</span> <span class="p">};</span>
<span class="p">(</span> <span class="n">m2m</span> <span class="nv">$</span><span class="p">(|</span> <span class="nv">$ctrl:tt</span><span class="p">)</span><span class="o">*</span> <span class="p">)</span> <span class="k">=></span> <span class="p">{</span> <span class="p">(</span><span class="mi">1</span><span class="o"><<</span><span class="mi">4</span><span class="p">)</span> <span class="p">|</span> <span class="p">(</span><span class="mi">1</span><span class="o"><<</span><span class="mi">14</span><span class="p">)</span> <span class="p">|</span> <span class="nd">dma_control!</span><span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl</span><span class="p">)</span><span class="o">*</span><span class="p">)</span> <span class="p">};</span>
<span class="p">(</span> <span class="n">m2p</span> <span class="nv">$</span><span class="p">(|</span> <span class="nv">$ctrl:tt</span><span class="p">)</span><span class="o">*</span> <span class="p">)</span> <span class="k">=></span> <span class="p">{</span> <span class="p">(</span><span class="mi">1</span><span class="o"><<</span><span class="mi">4</span><span class="p">)</span> <span class="p">|</span> <span class="nd">dma_control!</span><span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl</span><span class="p">)</span><span class="o">*</span><span class="p">)</span> <span class="p">};</span>
<span class="p">(</span> <span class="n">p2m</span> <span class="nv">$</span><span class="p">(|</span> <span class="nv">$ctrl:tt</span><span class="p">)</span><span class="o">*</span> <span class="p">)</span> <span class="k">=></span> <span class="p">{</span> <span class="nd">dma_control!</span><span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl</span><span class="p">)</span><span class="o">*</span><span class="p">)</span> <span class="p">};</span>
<span class="c1">// CCRx: MEM2MEM=0 PL=00 MSIZE=PSIZE=10 MINC=1 PINC=1 CIRC=0 DIR=0 *IE=000 EN=0</span>
<span class="p">()</span> <span class="k">=></span> <span class="p">{</span> <span class="mi">0b0_00_10_10_1_1_0_0_000_0u32</span> <span class="p">}</span>
<span class="p">}</span>
<span class="nd">macro_rules!</span> <span class="n">dma_configs</span> <span class="p">{</span>
<span class="p">[</span>
<span class="nv">$</span><span class="p">(</span> <span class="nv">$n:literal</span><span class="p">:</span> <span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl:tt</span><span class="p">)|</span><span class="o">+</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="nv">$len:expr</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nv">$dst:expr</span><span class="p">,</span> <span class="n">src</span><span class="p">:</span> <span class="nv">$src:expr</span><span class="p">)</span> <span class="p">),</span><span class="o">*</span>
<span class="nv">$</span><span class="p">(</span> <span class="p">,</span> <span class="p">)</span><span class="o">?</span>
<span class="p">]</span> <span class="k">=></span> <span class="p">{</span>
<span class="p">[</span>
<span class="nv">$</span><span class="p">(</span>
<span class="c1">// CCRx</span>
<span class="nd">dma_control!</span><span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl</span><span class="p">)|</span><span class="o">+</span><span class="p">),</span>
<span class="c1">// CNDTRx</span>
<span class="nv">$len</span> <span class="k">as</span> <span class="o">*</span><span class="k">const</span> <span class="nb">u32</span> <span class="k">as</span> <span class="nb">u32</span><span class="p">,</span>
<span class="c1">// CPARx</span>
<span class="nv">$dst</span> <span class="k">as</span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u32</span> <span class="k">as</span> <span class="nb">u32</span><span class="p">,</span>
<span class="c1">// CMARx</span>
<span class="nv">$src</span> <span class="k">as</span> <span class="o">*</span><span class="k">const</span> <span class="nb">u32</span> <span class="k">as</span> <span class="nb">u32</span><span class="p">,</span>
<span class="c1">// reserved</span>
<span class="mi">0u32</span>
<span class="p">),</span><span class="o">*</span>
<span class="p">]</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">const</span> <span class="k">fn</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="n">number</span><span class="p">:</span> <span class="nb">usize</span><span class="p">)</span> <span class="k">-></span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u32</span> <span class="p">{</span>
<span class="n">DMA1_CHX_BASE</span><span class="nf">.wrapping_add</span><span class="p">(</span><span class="mi">5</span> <span class="o">*</span> <span class="p">(</span><span class="n">number</span> <span class="o">-</span> <span class="mi">1</span><span class="p">))</span>
<span class="p">}</span>
<span class="k">unsafe</span> <span class="k">fn</span> <span class="nf">dma1_setup_ch</span><span class="p">(</span><span class="n">first</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span> <span class="n">config</span><span class="p">:</span> <span class="o">&</span><span class="p">[</span><span class="nb">u32</span><span class="p">])</span> <span class="p">{</span>
<span class="nn">sync</span><span class="p">::</span><span class="nn">atomic</span><span class="p">::</span><span class="nf">fence</span><span class="p">(</span><span class="nn">sync</span><span class="p">::</span><span class="nn">atomic</span><span class="p">::</span><span class="nn">Ordering</span><span class="p">::</span><span class="n">SeqCst</span><span class="p">);</span>
<span class="k">for</span> <span class="n">offset</span> <span class="k">in</span> <span class="mi">0</span><span class="o">..</span><span class="n">config</span><span class="nf">.len</span><span class="p">()</span> <span class="p">{</span>
<span class="nn">ptr</span><span class="p">::</span><span class="nf">write_volatile</span><span class="p">(</span><span class="nf">dma1_chx</span><span class="p">(</span><span class="n">first</span><span class="p">)</span><span class="nf">.offset</span><span class="p">(</span><span class="n">offset</span> <span class="k">as</span> <span class="nb">isize</span><span class="p">),</span> <span class="n">config</span><span class="p">[</span><span class="n">offset</span><span class="p">]);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nd">#[cortex_m_rt::entry]</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="k">-></span> <span class="o">!</span> <span class="p">{</span>
<span class="nd">info!</span><span class="p">(</span><span class="s">"running!"</span><span class="p">);</span>
<span class="c1">// Configure SYSCFG to make TIM6 trigger DMA1_CH3. This is STM32F302R8Tx specific.</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">RCC</span><span class="nf">.apb2enr</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_syscfgen</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">SYSCFG</span><span class="nf">.cfgr1</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span>
<span class="n">r</span><span class="nf">.set_tim6_dac1_ch1_dma_rmp</span><span class="p">(</span><span class="nn">pac</span><span class="p">::</span><span class="nn">syscfg</span><span class="p">::</span><span class="nn">vals</span><span class="p">::</span><span class="nf">Tim6Dac1Ch1DmaRmp</span><span class="p">(</span><span class="mi">1</span><span class="p">)));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">RCC</span><span class="nf">.ahbenr</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_gpiocen</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="n">GPIO_REGS</span><span class="nf">.moder</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_moder</span><span class="p">(</span><span class="n">GPIO_LED</span><span class="p">,</span> <span class="nn">pac</span><span class="p">::</span><span class="nn">gpio</span><span class="p">::</span><span class="nn">vals</span><span class="p">::</span><span class="nn">Moder</span><span class="p">::</span><span class="n">OUTPUT</span><span class="p">));</span>
<span class="k">let</span> <span class="n">gpio_bsrr</span> <span class="o">=</span> <span class="n">GPIO_REGS</span><span class="nf">.bsrr</span><span class="p">()</span><span class="nf">.as_ptr</span><span class="p">()</span> <span class="k">as</span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u32</span><span class="p">;</span>
<span class="k">let</span> <span class="n">gpio_setb</span> <span class="o">=</span> <span class="mi">1u32</span> <span class="o"><<</span> <span class="n">GPIO_LED</span><span class="p">;</span>
<span class="k">let</span> <span class="n">gpio_clrb</span> <span class="o">=</span> <span class="n">gpio_setb</span> <span class="o"><<</span> <span class="mi">16</span><span class="p">;</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_0</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">1</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_a0</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">5</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_a1</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">5</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_a2</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">2</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_b0</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">5</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_b1</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">5</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_b2</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">2</span><span class="p">];</span>
<span class="n">dma1_config_0</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">3</span><span class="p">:</span> <span class="p">(</span><span class="n">m2p</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">4</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="nd">dma_control!</span><span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">)),</span>
<span class="p">];</span>
<span class="n">dma1_config_a0</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">2</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="n">gpio_bsrr</span><span class="p">,</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">gpio_setb</span><span class="p">),</span>
<span class="mi">3</span><span class="p">:</span> <span class="p">(</span><span class="n">m2p</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">4</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="nd">dma_control!</span><span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">)),</span>
<span class="mi">4</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a1</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a1</span><span class="p">),</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b0</span><span class="p">),</span>
<span class="p">];</span>
<span class="n">dma1_config_a1</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">2</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="n">gpio_bsrr</span><span class="p">,</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">gpio_setb</span><span class="p">),</span>
<span class="mi">3</span><span class="p">:</span> <span class="p">(</span><span class="n">m2p</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">4</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="nd">dma_control!</span><span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">)),</span>
<span class="mi">4</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a1</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a1</span><span class="p">),</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b0</span><span class="p">),</span>
<span class="p">];</span>
<span class="n">dma1_config_a2</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b0</span><span class="p">),</span>
<span class="p">];</span>
<span class="n">dma1_config_b0</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">2</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="n">gpio_bsrr</span><span class="p">,</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">gpio_clrb</span><span class="p">),</span>
<span class="mi">3</span><span class="p">:</span> <span class="p">(</span><span class="n">m2p</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">4</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="nd">dma_control!</span><span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">)),</span>
<span class="mi">4</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b1</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b1</span><span class="p">),</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a0</span><span class="p">),</span>
<span class="p">];</span>
<span class="n">dma1_config_b1</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">2</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="n">gpio_bsrr</span><span class="p">,</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">gpio_clrb</span><span class="p">),</span>
<span class="mi">3</span><span class="p">:</span> <span class="p">(</span><span class="n">m2p</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">4</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="nd">dma_control!</span><span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">)),</span>
<span class="mi">4</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b1</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b1</span><span class="p">),</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a0</span><span class="p">),</span>
<span class="p">];</span>
<span class="n">dma1_config_b2</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a0</span><span class="p">),</span>
<span class="p">];</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">RCC</span><span class="nf">.ahbenr</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_dma1en</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="k">unsafe</span> <span class="p">{</span>
<span class="nf">dma1_setup_ch</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="n">dma1_config_0</span><span class="nf">.as_ref</span><span class="p">());</span>
<span class="nf">dma1_setup_ch</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="n">dma1_config_a0</span><span class="nf">.as_ref</span><span class="p">());</span>
<span class="p">}</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">RCC</span><span class="nf">.apb1enr</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_tim6en</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">TIM6</span><span class="nf">.psc</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_psc</span><span class="p">(</span><span class="mi">8000</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">TIM6</span><span class="nf">.arr</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_arr</span><span class="p">(</span><span class="mi">500</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">TIM6</span><span class="nf">.dier</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_ude</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">TIM6</span><span class="nf">.cr2</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_mms</span><span class="p">(</span><span class="nn">pac</span><span class="p">::</span><span class="nn">timer</span><span class="p">::</span><span class="nn">vals</span><span class="p">::</span><span class="nn">Mms</span><span class="p">::</span><span class="n">UPDATE</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">TIM6</span><span class="nf">.cr1</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_cen</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="nn">cortex_m</span><span class="p">::</span><span class="nn">interrupt</span><span class="p">::</span><span class="nf">disable</span><span class="p">();</span>
<span class="nn">cortex_m</span><span class="p">::</span><span class="nn">asm</span><span class="p">::</span><span class="nf">wfi</span><span class="p">();</span>
<span class="nd">unreachable!</span><span class="p">()</span>
<span class="p">}</span>
</pre></td>
</tr></tbody></table></code></pre></figure>
<p>The timer is only used here to make the blinking observable to the human eye. You could remove it entirely and have the DMA generate high frequency waveforms, though not as quickly as the CPU could do it with the bit banding areas.</p>
<p>Actually, the project I designed this technique for would be using the DMA to generate composite video waveforms without any involvement of the CPU—by writing the analog level to the pins with a resistor ladder (the DAC isn’t fast enough), configuring the timer for the next interval, all while retriggering itself several times.</p>
<p>The <a href="/files/dmablink/dmablink.zip">complete project</a> can be downloaded. It targets STM32F302R8, but will work on any STM32 chip with the “BDMA” type of the DMA peripheral. (There are several, and this one is the most basic.) If you have a checkout of <a href="https://github.com/embassy-rs/stm32-data/">stm32-data</a>, searching for <code>"kind": "bdma"</code> under <code>build/data/chips/</code> will show the devices that are compatible.</p>
<p>Every embedded developer has blinked a LED—this is both a rite of passage and a smoke test for bring-up of new hardware. Most often you do it in software. Sometimes you do it by connecting the LED directly to a peripheral.</p>
<p>Have you ever blinked a LED using DMA? No loops. No use of the CPU after the initial configuration. No use of the timer PWM output, either. Just the DMA peripheral, unbothered, moisturized, happy, in its lane, focused, flourishing, reconfiguring itself by writing to its own registers every time the timer ticks.</p>
<figure><figcaption>main.rs (<a href="/files/dmablink/main.rs">download</a>)</figcaption><pre><code class="language-rust"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
</pre></td>
<td class="rouge-code"><pre><span class="nd">#![no_std]</span>
<span class="nd">#![no_main]</span>
<span class="k">use</span> <span class="nn">core</span><span class="p">::{</span><span class="n">ptr</span><span class="p">,</span> <span class="n">sync</span><span class="p">};</span>
<span class="k">use</span> <span class="nn">defmt</span><span class="p">::{</span><span class="o">*</span><span class="p">,</span> <span class="n">unreachable</span><span class="p">};</span>
<span class="k">use</span> <span class="p">{</span><span class="n">defmt_rtt</span> <span class="k">as</span> <span class="n">_</span><span class="p">,</span> <span class="n">panic_probe</span> <span class="k">as</span> <span class="n">_</span><span class="p">};</span>
<span class="k">use</span> <span class="n">stm32_metapac</span> <span class="k">as</span> <span class="n">pac</span><span class="p">;</span>
<span class="c1">// -8<-8<- configuration</span>
<span class="k">const</span> <span class="n">GPIO_REGS</span><span class="p">:</span> <span class="nn">pac</span><span class="p">::</span><span class="nn">gpio</span><span class="p">::</span><span class="n">Gpio</span> <span class="o">=</span> <span class="nn">pac</span><span class="p">::</span><span class="n">GPIOC</span><span class="p">;</span>
<span class="k">const</span> <span class="n">GPIO_LED</span><span class="p">:</span> <span class="nb">usize</span> <span class="o">=</span> <span class="mi">5</span><span class="p">;</span>
<span class="c1">// -8<-8<- end of config</span>
<span class="c1">// The DMA registers are laid out in memory like this:</span>
<span class="c1">// - CH(n+0)_CR <- control register (R/O when EN=1, except for EN)</span>
<span class="c1">// - CH(n+0)_NDTR <- length register (R/O when EN=1)</span>
<span class="c1">// - CH(n+0)_PAR <- peripheral addr (R/W when EN=1!)</span>
<span class="c1">// - CH(n+0)_MAR <- memory addr (R/W when EN=1!)</span>
<span class="c1">// - CH(n+1)_CR</span>
<span class="c1">// - CH(n+1)_...</span>
<span class="c1">//</span>
<span class="c1">// This means that a single DMA transfer to the DMA registers themselves can set</span>
<span class="c1">// up the transfer (without setting EN=1) for CH(n+0)..CH(n+k-1) and then trigger</span>
<span class="c1">// CH(n+k), but it cannot trigger the transfers it just set up; these should be</span>
<span class="c1">// triggered by a previously configured CH(n+k) channel that writes the same data</span>
<span class="c1">// over except now with EN=1.</span>
<span class="c1">//</span>
<span class="c1">// To work around this limitation it helps to see the DMA channel as having three</span>
<span class="c1">// distinct phases it is circularly traversing:</span>
<span class="c1">// 1. Setup phase: EN=0; configuration is being written to the channel</span>
<span class="c1">// 2. Transfer phase: EN=1, NDTR>0; data is being transferred</span>
<span class="c1">// 3. Dormant phase: EN=1, NDTR=0; transfer has finished but the channel is locked</span>
<span class="c1">// Two of the three phases (with EN=1) make it impossible to reconfigure the channel</span>
<span class="c1">// directly, instead requiring EN bit to be reset. For a system made out of DMA</span>
<span class="c1">// channels to return to its initial state, it needs at least three channels,</span>
<span class="c1">// so that the one channel in the transfer phase can advance the state of the system.</span>
<span class="k">const</span> <span class="n">DMA1_CHX_BASE</span><span class="p">:</span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u32</span> <span class="o">=</span> <span class="mi">0x4002_0008</span> <span class="k">as</span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u32</span><span class="p">;</span>
<span class="nd">macro_rules!</span> <span class="n">dma_control</span> <span class="p">{</span>
<span class="p">(</span> <span class="n">en</span> <span class="nv">$</span><span class="p">(|</span> <span class="nv">$ctrl:tt</span><span class="p">)</span><span class="o">*</span> <span class="p">)</span> <span class="k">=></span> <span class="p">{</span> <span class="p">(</span><span class="mi">1</span><span class="o"><<</span><span class="mi">0</span><span class="p">)</span> <span class="p">|</span> <span class="nd">dma_control!</span><span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl</span><span class="p">)</span><span class="o">*</span><span class="p">)</span> <span class="p">};</span>
<span class="p">(</span> <span class="n">m2m</span> <span class="nv">$</span><span class="p">(|</span> <span class="nv">$ctrl:tt</span><span class="p">)</span><span class="o">*</span> <span class="p">)</span> <span class="k">=></span> <span class="p">{</span> <span class="p">(</span><span class="mi">1</span><span class="o"><<</span><span class="mi">4</span><span class="p">)</span> <span class="p">|</span> <span class="p">(</span><span class="mi">1</span><span class="o"><<</span><span class="mi">14</span><span class="p">)</span> <span class="p">|</span> <span class="nd">dma_control!</span><span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl</span><span class="p">)</span><span class="o">*</span><span class="p">)</span> <span class="p">};</span>
<span class="p">(</span> <span class="n">m2p</span> <span class="nv">$</span><span class="p">(|</span> <span class="nv">$ctrl:tt</span><span class="p">)</span><span class="o">*</span> <span class="p">)</span> <span class="k">=></span> <span class="p">{</span> <span class="p">(</span><span class="mi">1</span><span class="o"><<</span><span class="mi">4</span><span class="p">)</span> <span class="p">|</span> <span class="nd">dma_control!</span><span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl</span><span class="p">)</span><span class="o">*</span><span class="p">)</span> <span class="p">};</span>
<span class="p">(</span> <span class="n">p2m</span> <span class="nv">$</span><span class="p">(|</span> <span class="nv">$ctrl:tt</span><span class="p">)</span><span class="o">*</span> <span class="p">)</span> <span class="k">=></span> <span class="p">{</span> <span class="nd">dma_control!</span><span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl</span><span class="p">)</span><span class="o">*</span><span class="p">)</span> <span class="p">};</span>
<span class="c1">// CCRx: MEM2MEM=0 PL=00 MSIZE=PSIZE=10 MINC=1 PINC=1 CIRC=0 DIR=0 *IE=000 EN=0</span>
<span class="p">()</span> <span class="k">=></span> <span class="p">{</span> <span class="mi">0b0_00_10_10_1_1_0_0_000_0u32</span> <span class="p">}</span>
<span class="p">}</span>
<span class="nd">macro_rules!</span> <span class="n">dma_configs</span> <span class="p">{</span>
<span class="p">[</span>
<span class="nv">$</span><span class="p">(</span> <span class="nv">$n:literal</span><span class="p">:</span> <span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl:tt</span><span class="p">)|</span><span class="o">+</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="nv">$len:expr</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nv">$dst:expr</span><span class="p">,</span> <span class="n">src</span><span class="p">:</span> <span class="nv">$src:expr</span><span class="p">)</span> <span class="p">),</span><span class="o">*</span>
<span class="nv">$</span><span class="p">(</span> <span class="p">,</span> <span class="p">)</span><span class="o">?</span>
<span class="p">]</span> <span class="k">=></span> <span class="p">{</span>
<span class="p">[</span>
<span class="nv">$</span><span class="p">(</span>
<span class="c1">// CCRx</span>
<span class="nd">dma_control!</span><span class="p">(</span><span class="nv">$</span><span class="p">(</span><span class="nv">$ctrl</span><span class="p">)|</span><span class="o">+</span><span class="p">),</span>
<span class="c1">// CNDTRx</span>
<span class="nv">$len</span> <span class="k">as</span> <span class="o">*</span><span class="k">const</span> <span class="nb">u32</span> <span class="k">as</span> <span class="nb">u32</span><span class="p">,</span>
<span class="c1">// CPARx</span>
<span class="nv">$dst</span> <span class="k">as</span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u32</span> <span class="k">as</span> <span class="nb">u32</span><span class="p">,</span>
<span class="c1">// CMARx</span>
<span class="nv">$src</span> <span class="k">as</span> <span class="o">*</span><span class="k">const</span> <span class="nb">u32</span> <span class="k">as</span> <span class="nb">u32</span><span class="p">,</span>
<span class="c1">// reserved</span>
<span class="mi">0u32</span>
<span class="p">),</span><span class="o">*</span>
<span class="p">]</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">const</span> <span class="k">fn</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="n">number</span><span class="p">:</span> <span class="nb">usize</span><span class="p">)</span> <span class="k">-></span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u32</span> <span class="p">{</span>
<span class="n">DMA1_CHX_BASE</span><span class="nf">.wrapping_add</span><span class="p">(</span><span class="mi">5</span> <span class="o">*</span> <span class="p">(</span><span class="n">number</span> <span class="o">-</span> <span class="mi">1</span><span class="p">))</span>
<span class="p">}</span>
<span class="k">unsafe</span> <span class="k">fn</span> <span class="nf">dma1_setup_ch</span><span class="p">(</span><span class="n">first</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span> <span class="n">config</span><span class="p">:</span> <span class="o">&</span><span class="p">[</span><span class="nb">u32</span><span class="p">])</span> <span class="p">{</span>
<span class="nn">sync</span><span class="p">::</span><span class="nn">atomic</span><span class="p">::</span><span class="nf">fence</span><span class="p">(</span><span class="nn">sync</span><span class="p">::</span><span class="nn">atomic</span><span class="p">::</span><span class="nn">Ordering</span><span class="p">::</span><span class="n">SeqCst</span><span class="p">);</span>
<span class="k">for</span> <span class="n">offset</span> <span class="k">in</span> <span class="mi">0</span><span class="o">..</span><span class="n">config</span><span class="nf">.len</span><span class="p">()</span> <span class="p">{</span>
<span class="nn">ptr</span><span class="p">::</span><span class="nf">write_volatile</span><span class="p">(</span><span class="nf">dma1_chx</span><span class="p">(</span><span class="n">first</span><span class="p">)</span><span class="nf">.offset</span><span class="p">(</span><span class="n">offset</span> <span class="k">as</span> <span class="nb">isize</span><span class="p">),</span> <span class="n">config</span><span class="p">[</span><span class="n">offset</span><span class="p">]);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nd">#[cortex_m_rt::entry]</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="k">-></span> <span class="o">!</span> <span class="p">{</span>
<span class="nd">info!</span><span class="p">(</span><span class="s">"running!"</span><span class="p">);</span>
<span class="c1">// Configure SYSCFG to make TIM6 trigger DMA1_CH3. This is STM32F302R8Tx specific.</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">RCC</span><span class="nf">.apb2enr</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_syscfgen</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">SYSCFG</span><span class="nf">.cfgr1</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span>
<span class="n">r</span><span class="nf">.set_tim6_dac1_ch1_dma_rmp</span><span class="p">(</span><span class="nn">pac</span><span class="p">::</span><span class="nn">syscfg</span><span class="p">::</span><span class="nn">vals</span><span class="p">::</span><span class="nf">Tim6Dac1Ch1DmaRmp</span><span class="p">(</span><span class="mi">1</span><span class="p">)));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">RCC</span><span class="nf">.ahbenr</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_gpiocen</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="n">GPIO_REGS</span><span class="nf">.moder</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_moder</span><span class="p">(</span><span class="n">GPIO_LED</span><span class="p">,</span> <span class="nn">pac</span><span class="p">::</span><span class="nn">gpio</span><span class="p">::</span><span class="nn">vals</span><span class="p">::</span><span class="nn">Moder</span><span class="p">::</span><span class="n">OUTPUT</span><span class="p">));</span>
<span class="k">let</span> <span class="n">gpio_bsrr</span> <span class="o">=</span> <span class="n">GPIO_REGS</span><span class="nf">.bsrr</span><span class="p">()</span><span class="nf">.as_ptr</span><span class="p">()</span> <span class="k">as</span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u32</span><span class="p">;</span>
<span class="k">let</span> <span class="n">gpio_setb</span> <span class="o">=</span> <span class="mi">1u32</span> <span class="o"><<</span> <span class="n">GPIO_LED</span><span class="p">;</span>
<span class="k">let</span> <span class="n">gpio_clrb</span> <span class="o">=</span> <span class="n">gpio_setb</span> <span class="o"><<</span> <span class="mi">16</span><span class="p">;</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_0</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">1</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_a0</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">5</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_a1</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">5</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_a2</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">2</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_b0</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">5</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_b1</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">5</span><span class="p">];</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">dma1_config_b2</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u32</span><span class="p">;</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">2</span><span class="p">];</span>
<span class="n">dma1_config_0</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">3</span><span class="p">:</span> <span class="p">(</span><span class="n">m2p</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">4</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="nd">dma_control!</span><span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">)),</span>
<span class="p">];</span>
<span class="n">dma1_config_a0</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">2</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="n">gpio_bsrr</span><span class="p">,</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">gpio_setb</span><span class="p">),</span>
<span class="mi">3</span><span class="p">:</span> <span class="p">(</span><span class="n">m2p</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">4</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="nd">dma_control!</span><span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">)),</span>
<span class="mi">4</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a1</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a1</span><span class="p">),</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b0</span><span class="p">),</span>
<span class="p">];</span>
<span class="n">dma1_config_a1</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">2</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="n">gpio_bsrr</span><span class="p">,</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">gpio_setb</span><span class="p">),</span>
<span class="mi">3</span><span class="p">:</span> <span class="p">(</span><span class="n">m2p</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">4</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="nd">dma_control!</span><span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">)),</span>
<span class="mi">4</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a1</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a1</span><span class="p">),</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b0</span><span class="p">),</span>
<span class="p">];</span>
<span class="n">dma1_config_a2</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b0</span><span class="p">),</span>
<span class="p">];</span>
<span class="n">dma1_config_b0</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">2</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="n">gpio_bsrr</span><span class="p">,</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">gpio_clrb</span><span class="p">),</span>
<span class="mi">3</span><span class="p">:</span> <span class="p">(</span><span class="n">m2p</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">4</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="nd">dma_control!</span><span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">)),</span>
<span class="mi">4</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b1</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b1</span><span class="p">),</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a0</span><span class="p">),</span>
<span class="p">];</span>
<span class="n">dma1_config_b1</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">2</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="n">gpio_bsrr</span><span class="p">,</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">gpio_clrb</span><span class="p">),</span>
<span class="mi">3</span><span class="p">:</span> <span class="p">(</span><span class="n">m2p</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">4</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="nd">dma_control!</span><span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">)),</span>
<span class="mi">4</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b1</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b1</span><span class="p">),</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a0</span><span class="p">),</span>
<span class="p">];</span>
<span class="n">dma1_config_b2</span> <span class="o">=</span> <span class="nd">dma_configs!</span><span class="p">[</span>
<span class="mi">5</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_b2</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">5</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_b2</span><span class="p">),</span>
<span class="mi">6</span><span class="p">:</span> <span class="p">(</span><span class="n">m2m</span><span class="p">|</span><span class="n">en</span><span class="p">,</span> <span class="n">len</span><span class="p">:</span> <span class="n">dma1_config_a0</span><span class="nf">.len</span><span class="p">(),</span> <span class="n">dst</span><span class="p">:</span> <span class="nf">dma1_chx</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">src</span><span class="p">:</span> <span class="o">&</span><span class="n">dma1_config_a0</span><span class="p">),</span>
<span class="p">];</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">RCC</span><span class="nf">.ahbenr</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_dma1en</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="k">unsafe</span> <span class="p">{</span>
<span class="nf">dma1_setup_ch</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="n">dma1_config_0</span><span class="nf">.as_ref</span><span class="p">());</span>
<span class="nf">dma1_setup_ch</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="n">dma1_config_a0</span><span class="nf">.as_ref</span><span class="p">());</span>
<span class="p">}</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">RCC</span><span class="nf">.apb1enr</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_tim6en</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">TIM6</span><span class="nf">.psc</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_psc</span><span class="p">(</span><span class="mi">8000</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">TIM6</span><span class="nf">.arr</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_arr</span><span class="p">(</span><span class="mi">500</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">TIM6</span><span class="nf">.dier</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_ude</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">TIM6</span><span class="nf">.cr2</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_mms</span><span class="p">(</span><span class="nn">pac</span><span class="p">::</span><span class="nn">timer</span><span class="p">::</span><span class="nn">vals</span><span class="p">::</span><span class="nn">Mms</span><span class="p">::</span><span class="n">UPDATE</span><span class="p">));</span>
<span class="nn">pac</span><span class="p">::</span><span class="n">TIM6</span><span class="nf">.cr1</span><span class="p">()</span><span class="nf">.modify</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="n">r</span><span class="nf">.set_cen</span><span class="p">(</span><span class="k">true</span><span class="p">));</span>
<span class="nn">cortex_m</span><span class="p">::</span><span class="nn">interrupt</span><span class="p">::</span><span class="nf">disable</span><span class="p">();</span>
<span class="nn">cortex_m</span><span class="p">::</span><span class="nn">asm</span><span class="p">::</span><span class="nf">wfi</span><span class="p">();</span>
<span class="nd">unreachable!</span><span class="p">()</span>
<span class="p">}</span>
</pre></td>
</tr></tbody></table></code></pre></figure>
<p>The timer is only used here to make the blinking observable to the human eye. You could remove it entirely and have the DMA generate high frequency waveforms, though not as quickly as the CPU could do it with the bit banding areas.</p>
<p>Actually, the project I designed this technique for would be using the DMA to generate composite video waveforms without any involvement of the CPU—by writing the analog level to the pins with a resistor ladder (the DAC isn’t fast enough), configuring the timer for the next interval, all while retriggering itself several times.</p>
<p>The <a href="/files/dmablink/dmablink.zip">complete project</a> can be downloaded. It targets STM32F302R8, but will work on any STM32 chip with the “BDMA” type of the DMA peripheral. (There are several, and this one is the most basic.) If you have a checkout of <a href="https://github.com/embassy-rs/stm32-data/">stm32-data</a>, searching for <code>"kind": "bdma"</code> under <code>build/data/chips/</code> will show the devices that are compatible.</p>
tag:lab.whitequark.org,2020-04-06:/notes/2020-04-06/minimizing-logic-expressions/
Minimizing logic expressions
2020-04-06T17:25:03Z
2020-04-06T17:25:03Z
<p>While working on reverse-engineering the Microchip ATF15xx CPLD family, I found myself deriving minimal logic functions from a truth table. This useful because while it is easy to sample all possible states of a black box combinatorial function using e.g. <a href="https://en.wikipedia.org/wiki/Boundary_scan">boundary scan</a>, these truth tables are unwieldy and don’t provide much insight into the hardware. While a minimal function with the same truth table would not necessarily be <em>the</em> function implemented by the hardware (which may have hidden variables, or simply use a larger equivalent function that is more convenient to implement), deriving one still provides great insight. In this note I explore this process.</p>
<!--more-->
<p>My chosen approach (thanks to <a href="https://www.cs.utah.edu/~regehr/">John Regehr</a> for the <a href="https://twitter.com/johnregehr/status/1212563858524499968">suggestion</a>) I got for <a href="/notes/2020-04-06/synthesizing-optimal-8051-code/">an earlier project</a> is to implement an interpreter for a simple logic expression abstract syntax tree in <a href="https://racket-lang.org">Racket</a> and then use <a href="https://emina.github.io/rosette/">Rosette</a> to translate assertions about the results of interpreting an arbitrary logic expression, as well as a cost function, into a query for an <a href="https://en.wikipedia.org/wiki/Satisfiability_modulo_theories">SMT solver</a>.</p>
<p>Although I could use an off-the-shelf logic minimizer here (like <a href="https://ptolemy.berkeley.edu/projects/embedded/pubs/downloads/espresso/">Espresso</a>), most logic minimizers solve a different problem: quickly translating large designs to simple netlists. However, I would like to have a complex output netlist: the ATF15xx CPLDs have a hardware XOR gate that I would like the minimizer to infer on its own. On the other hand, I don’t really care about the runtime of the minimizer as long as it’s on the order of minutes to hours. Rosette’s flexibility is a perfect match for this task.</p>
<p>The following code demonstrates the approach and its ability to derive a XOR gate from3 the input expression. It can be easily modified for a particular application by extending (or reducing, e.g. for translation to an <a href="https://en.wikipedia.org/wiki/And-inverter_graph">and-inverter graph</a>) the logic language, or altering the cost function.</p>
<figure><figcaption>minlogic.rkt (<a href="/files/minlogic.rkt">download</a>)</figcaption><pre><code class="language-racket"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
</pre></td>
<td class="rouge-code"><pre><span class="o">#</span><span class="nv">lang</span> <span class="nv">rosette/safe</span>
<span class="p">(</span><span class="k">require</span> <span class="nv">rosette/lib/angelic</span>
<span class="nv">rosette/lib/match</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">^^</span> <span class="nv">x</span> <span class="nv">y</span><span class="p">)</span> <span class="p">(</span><span class="nf">||</span> <span class="p">(</span><span class="nf">&&</span> <span class="nv">x</span> <span class="p">(</span><span class="nf">!</span> <span class="nv">y</span><span class="p">))</span> <span class="p">(</span><span class="nf">&&</span> <span class="p">(</span><span class="nf">!</span> <span class="nv">x</span><span class="p">)</span> <span class="nv">y</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">lnot</span> <span class="p">(</span><span class="nf">a</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">land</span> <span class="p">(</span><span class="nf">a</span> <span class="nv">b</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">lor</span> <span class="p">(</span><span class="nf">a</span> <span class="nv">b</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">lxor</span> <span class="p">(</span><span class="nf">a</span> <span class="nv">b</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">lvar</span> <span class="p">(</span><span class="nf">v</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">llit</span> <span class="p">(</span><span class="nf">v</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">ldump</span> <span class="nv">e</span><span class="p">)</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">e</span>
<span class="p">[(</span><span class="nf">lnot</span> <span class="nv">a</span><span class="p">)</span> <span class="o">`</span><span class="p">(</span><span class="nf">!</span> <span class="o">,</span><span class="p">(</span><span class="nf">ldump</span> <span class="nv">a</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">land</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="o">`</span><span class="p">(</span><span class="nf">&&</span> <span class="o">,</span><span class="p">(</span><span class="nf">ldump</span> <span class="nv">a</span><span class="p">)</span> <span class="o">,</span><span class="p">(</span><span class="nf">ldump</span> <span class="nv">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">lor</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="o">`</span><span class="p">(</span><span class="nf">\|\|</span> <span class="o">,</span><span class="p">(</span><span class="nf">ldump</span> <span class="nv">a</span><span class="p">)</span> <span class="o">,</span><span class="p">(</span><span class="nf">ldump</span> <span class="nv">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">lxor</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="o">`</span><span class="p">(</span><span class="nf">^^</span> <span class="o">,</span><span class="p">(</span><span class="nf">ldump</span> <span class="nv">a</span><span class="p">)</span> <span class="o">,</span><span class="p">(</span><span class="nf">ldump</span> <span class="nv">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">lvar</span> <span class="nv">v</span><span class="p">)</span> <span class="nv">v</span><span class="p">]</span>
<span class="p">[(</span><span class="nf">llit</span> <span class="nv">v</span><span class="p">)</span> <span class="nv">v</span><span class="p">]))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">leval</span> <span class="nv">e</span><span class="p">)</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">e</span>
<span class="p">[(</span><span class="nf">lnot</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="nf">!</span> <span class="p">(</span><span class="nf">leval</span> <span class="nv">a</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">land</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="p">(</span><span class="nf">&&</span> <span class="p">(</span><span class="nf">leval</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="nf">leval</span> <span class="nv">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">lor</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="p">(</span><span class="nf">||</span> <span class="p">(</span><span class="nf">leval</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="nf">leval</span> <span class="nv">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">lxor</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="p">(</span><span class="nf">^^</span> <span class="p">(</span><span class="nf">leval</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="nf">leval</span> <span class="nv">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">lvar</span> <span class="nv">v</span><span class="p">)</span> <span class="nv">v</span><span class="p">]</span>
<span class="p">[(</span><span class="nf">llit</span> <span class="nv">v</span><span class="p">)</span> <span class="nv">v</span><span class="p">]))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">lcost</span> <span class="nv">e</span><span class="p">)</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">e</span>
<span class="p">[(</span><span class="nf">lnot</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="nb">+</span> <span class="mi">1</span> <span class="p">(</span><span class="nf">lcost</span> <span class="nv">a</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">land</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="p">(</span><span class="nb">+</span> <span class="mi">2</span> <span class="p">(</span><span class="nf">lcost</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="nf">lcost</span> <span class="nv">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">lor</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="p">(</span><span class="nb">+</span> <span class="mi">2</span> <span class="p">(</span><span class="nf">lcost</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="nf">lcost</span> <span class="nv">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">lxor</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="p">(</span><span class="nb">+</span> <span class="mi">2</span> <span class="p">(</span><span class="nf">lcost</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="nf">lcost</span> <span class="nv">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">lvar</span> <span class="nv">v</span><span class="p">)</span> <span class="mi">0</span><span class="p">]</span>
<span class="p">[(</span><span class="nf">llit</span> <span class="nv">v</span><span class="p">)</span> <span class="mi">1</span><span class="p">]))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">??lexpr</span> <span class="nv">terminals</span> <span class="nt">#:depth</span> <span class="nv">depth</span><span class="p">)</span>
<span class="p">(</span><span class="nb">apply</span> <span class="nv">choose*</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb"><=</span> <span class="nv">depth</span> <span class="mi">0</span><span class="p">)</span> <span class="nv">terminals</span>
<span class="p">(</span><span class="k">let</span> <span class="p">[(</span><span class="nf">a</span> <span class="p">(</span><span class="nf">??lexpr</span> <span class="nv">terminals</span> <span class="nt">#:depth</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">depth</span> <span class="mi">1</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">b</span> <span class="p">(</span><span class="nf">??lexpr</span> <span class="nv">terminals</span> <span class="nt">#:depth</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">depth</span> <span class="mi">1</span><span class="p">)))]</span>
<span class="p">(</span><span class="nb">append</span> <span class="nv">terminals</span>
<span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="nf">lnot</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="nf">land</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="p">(</span><span class="nf">lor</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="p">(</span><span class="nf">lxor</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)))))))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">lmincost</span> <span class="nt">#:forall</span> <span class="nv">inputs</span> <span class="nt">#:tactic</span> <span class="nv">template</span> <span class="nt">#:equiv</span> <span class="nv">behavior</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">model</span>
<span class="p">(</span><span class="nf">optimize</span>
<span class="nt">#:minimize</span> <span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="nf">lcost</span> <span class="nv">template</span><span class="p">))</span>
<span class="nt">#:guarantee</span> <span class="p">(</span><span class="nf">assert</span> <span class="p">(</span><span class="nf">forall</span> <span class="nv">inputs</span> <span class="p">(</span><span class="nb">equal?</span> <span class="p">(</span><span class="nf">leval</span> <span class="nv">template</span><span class="p">)</span> <span class="nv">behavior</span><span class="p">)))))</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nf">unsat?</span> <span class="nv">model</span><span class="p">)</span> <span class="nv">model</span>
<span class="p">(</span><span class="nf">evaluate</span> <span class="nv">template</span> <span class="nv">model</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">define-symbolic</span> <span class="nv">a</span> <span class="nv">b</span> <span class="nv">c</span> <span class="nv">boolean?</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">f</span>
<span class="p">(</span><span class="nf">lmincost</span>
<span class="nt">#:forall</span> <span class="p">(</span><span class="nb">list</span> <span class="nv">a</span> <span class="nv">b</span> <span class="nv">c</span><span class="p">)</span>
<span class="nt">#:tactic</span> <span class="p">(</span><span class="nf">??lexpr</span> <span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="nf">lvar</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="nf">lvar</span> <span class="nv">b</span><span class="p">)</span> <span class="p">(</span><span class="nf">lvar</span> <span class="nv">c</span><span class="p">)</span> <span class="p">(</span><span class="nf">llit</span> <span class="no">#f</span><span class="p">))</span> <span class="nt">#:depth</span> <span class="mi">3</span><span class="p">)</span>
<span class="nt">#:equiv</span> <span class="p">(</span><span class="nf">&&</span> <span class="p">(</span><span class="nf">||</span> <span class="nv">a</span> <span class="p">(</span><span class="nf">!</span> <span class="p">(</span><span class="nf">&&</span> <span class="nv">b</span> <span class="nv">c</span><span class="p">)))</span> <span class="p">(</span><span class="nf">!</span> <span class="p">(</span><span class="nf">&&</span> <span class="nv">a</span> <span class="p">(</span><span class="nf">||</span> <span class="p">(</span><span class="nf">!</span> <span class="nv">b</span><span class="p">)</span> <span class="p">(</span><span class="nf">!</span> <span class="nv">c</span><span class="p">)))))))</span>
<span class="p">(</span><span class="nb">displayln</span> <span class="p">(</span><span class="nf">ldump</span> <span class="nv">f</span><span class="p">))</span> <span class="c1">; (! (^^ (&& c b) a))</span>
</pre></td>
</tr></tbody></table></code></pre></figure>
<p>While working on reverse-engineering the Microchip ATF15xx CPLD family, I found myself deriving minimal logic functions from a truth table. This useful because while it is easy to sample all possible states of a black box combinatorial function using e.g. <a href="https://en.wikipedia.org/wiki/Boundary_scan">boundary scan</a>, these truth tables are unwieldy and don’t provide much insight into the hardware. While a minimal function with the same truth table would not necessarily be <em>the</em> function implemented by the hardware (which may have hidden variables, or simply use a larger equivalent function that is more convenient to implement), deriving one still provides great insight. In this note I explore this process.</p>
tag:lab.whitequark.org,2020-04-06:/notes/2020-04-06/synthesizing-optimal-8051-code/
Synthesizing optimal 8051 code
2020-04-06T16:44:20Z
2020-04-06T16:44:20Z
<p>While working on an application targeting Nordic nRF24LE1, a wireless SoC with a fairly slow 8051 core, I was wondering if I can have fast, or at least not unusably slow, cryptography. Most cryptographic algorithms involve wide rotates, and the 8051 only has instructions for rotating a 8-bit accumulator by one bit at a time. In this note I explore deriving optimal code for rotating values in registers (that may be bigger than 8 bits) by multiple bits.</p>
<!--more-->
<ul id="markdown-toc">
<li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
<li><a href="#code-generator" id="markdown-toc-code-generator">Code generator</a></li>
<li>
<a href="#results" id="markdown-toc-results">Results</a> <ul>
<li><a href="#bit-rotates" id="markdown-toc-bit-rotates">8-bit rotates</a></li>
<li><a href="#bit-rotates-1" id="markdown-toc-bit-rotates-1">16-bit rotates</a></li>
</ul>
</li>
</ul>
<h1 id="introduction">Introduction</h1>
<p>My chosen approach (thanks to <a href="https://www.cs.utah.edu/~regehr/">John Regehr</a> for the <a href="https://twitter.com/johnregehr/status/1212563858524499968">suggestion</a>) is to implement an interpreter for an abstract 8051 assembly representation in <a href="https://racket-lang.org">Racket</a> and then use <a href="https://emina.github.io/rosette/">Rosette</a> to translate assertions about the results of interpreting an arbitrary piece of code into a query for an <a href="https://en.wikipedia.org/wiki/Satisfiability_modulo_theories">SMT solver</a>.</p>
<p>Rosette greatly simplifies this task because it lets me avoid learning anything about SMT solvers, and only requires me to understand the constraints of its symbolic execution approach. (Only a small subset of Racket is safe to use in Rosette, and functions outside of that subset are hard to use correctly without an in-depth understaning of how Rosette works.)</p>
<h1 id="code-generator">Code generator</h1>
<p>The following code generates all possible optimal (more on that below) 8-bit and 16-bit rotates. It uses a rather hacky and complicated scheme where it runs several solvers in parallel, one per CPU, each aiming for a particular fixed number of instructions, and then picks the smallest result as the solvers finish. This is because at the time of writing it, I did not understand that Rosette allows optimizing exists-forall problems. (It is quite easy to do so, as I describe <a href="/notes/2020-04-06/minimizing-logic-expressions/">in a later note</a>.)</p>
<p>However, that turned out to be a blessing in disguise; when writing this note, I <a href="/files/synth51/synth51-broken.rkt">rewrote the query as an optimization problem</a> for the solver, and it doesn’t seem like that would work for this use case. First, of the solvers that can be used by Rosette, only Z3 supports quantified formulas, whereas Boolector had the best performance with the simpler queries. Second, even for very small programs (such as 8-bit rotates, which all fit in 4 instructions, and even restricting the usable registers to 2 out of 8), the memory footprint of Z3 grows extremely quickly, and I always ran out of memory before getting a solution.</p>
<p>By “optimal” here I mean “optimal within the limited model being used”, of course. The model I’m using specifically omits any memory access (preventing the use of the <code>XCHD</code> instruction among other things), and in general has a very limited number of instructions to make solver runtime manageable. It is possible (but unlikely) that some of the instructions missing in the model but present in every 8051 CPU provide a faster way to do rotates. It is possible (and fairly likely) that your specific flavor of 8051 CPU provides a faster way to do rotates that involves memory-mapped I/O; indeed, nRF24LE1 does, but I was interested in more portable code.</p>
<figure><figcaption>synth51.rkt (<a href="/files/synth51/synth51.rkt">download</a>)</figcaption><pre><code class="language-racket"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
</pre></td>
<td class="rouge-code"><pre><span class="o">#</span><span class="nv">lang</span> <span class="nv">rosette/safe</span>
<span class="p">(</span><span class="k">require</span> <span class="p">(</span><span class="nf">only-in</span> <span class="nv">racket</span> <span class="nv">hash</span> <span class="nv">in-range</span> <span class="nv">for</span> <span class="nv">for/list</span> <span class="nv">with-handlers</span> <span class="nv">flush-output</span>
<span class="nv">thread</span> <span class="nv">thread-wait</span> <span class="nv">break-thread</span> <span class="nv">exn:break?</span>
<span class="nv">make-semaphore</span> <span class="nv">semaphore-wait</span> <span class="nv">semaphore-post</span> <span class="nv">call-with-semaphore/enable-break</span>
<span class="nv">processor-count</span><span class="p">))</span>
<span class="p">(</span><span class="k">require</span> <span class="nv">rosette/solver/smt/z3</span>
<span class="nv">rosette/solver/smt/boolector</span>
<span class="nv">rosette/solver/smt/yices</span><span class="p">)</span>
<span class="c1">;(current-solver (z3 #:logic 'QF_BV #:options (hash</span>
<span class="c1">; ':parallel.enable 'true</span>
<span class="c1">; ':parallel.threads.max 4)))</span>
<span class="c1">;(current-solver (yices #:logic 'QF_BV))</span>
<span class="p">(</span><span class="nf">current-solver</span> <span class="p">(</span><span class="nf">boolector</span> <span class="nt">#:logic</span> <span class="ss">'QF_BV</span><span class="p">))</span>
<span class="p">(</span><span class="k">require</span> <span class="nv">rosette/lib/angelic</span>
<span class="nv">rosette/lib/match</span><span class="p">)</span>
<span class="p">(</span><span class="nf">current-bitwidth</span> <span class="mi">5</span><span class="p">)</span>
<span class="c1">; bit operations</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">rotate-right</span> <span class="nv">s</span> <span class="nv">i</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="k">cond</span>
<span class="p">[(</span><span class="nb">=</span> <span class="nv">i</span> <span class="mi">0</span><span class="p">)</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">[</span><span class="nf">else</span> <span class="p">(</span><span class="nf">concat</span> <span class="p">(</span><span class="nf">extract</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">i</span> <span class="mi">1</span><span class="p">)</span> <span class="mi">0</span> <span class="nv">x</span><span class="p">)</span> <span class="p">(</span><span class="nf">extract</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">s</span> <span class="mi">1</span><span class="p">)</span> <span class="nv">i</span> <span class="nv">x</span><span class="p">))]))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">rotate-left</span> <span class="nv">s</span> <span class="nv">i</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="k">cond</span>
<span class="p">[(</span><span class="nb">=</span> <span class="nv">i</span> <span class="mi">0</span><span class="p">)</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">[</span><span class="nf">else</span> <span class="p">(</span><span class="nf">concat</span> <span class="p">(</span><span class="nf">extract</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">s</span> <span class="nv">i</span> <span class="mi">1</span><span class="p">)</span> <span class="mi">0</span> <span class="nv">x</span><span class="p">)</span> <span class="p">(</span><span class="nf">extract</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">s</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">s</span> <span class="nv">i</span><span class="p">)</span> <span class="nv">x</span><span class="p">))]))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">replace-bit</span> <span class="nv">s</span> <span class="nv">i</span> <span class="nv">x</span> <span class="nv">y</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">m</span> <span class="p">(</span><span class="nf">bvshl</span> <span class="p">(</span><span class="nf">bv</span> <span class="mi">1</span> <span class="nv">s</span><span class="p">)</span> <span class="p">(</span><span class="nf">integer->bitvector</span> <span class="nv">i</span> <span class="nv">s</span><span class="p">)))</span>
<span class="p">(</span><span class="k">cond</span>
<span class="p">[(</span><span class="nf">bveq</span> <span class="nv">y</span> <span class="p">(</span><span class="nf">bv</span> <span class="mi">0</span> <span class="mi">1</span><span class="p">))</span> <span class="p">(</span><span class="nf">bvand</span> <span class="nv">x</span> <span class="p">(</span><span class="nf">bvnot</span> <span class="nv">m</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">bveq</span> <span class="nv">y</span> <span class="p">(</span><span class="nf">bv</span> <span class="mi">1</span> <span class="mi">1</span><span class="p">))</span> <span class="p">(</span><span class="nf">bvor</span> <span class="nv">x</span> <span class="nv">m</span><span class="p">)]</span>
<span class="p">[</span><span class="nf">else</span> <span class="p">(</span><span class="nf">assert</span> <span class="no">#f</span><span class="p">)]))</span>
<span class="c1">; CPU state</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">state</span> <span class="p">(</span><span class="nf">A</span> <span class="nv">C</span> <span class="nv">Rn</span><span class="p">)</span> <span class="nt">#:mutable</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nb">vector-ref</span> <span class="p">(</span><span class="nf">state-Rn</span> <span class="nv">S</span><span class="p">)</span> <span class="nv">n</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">state-Rn-set!</span> <span class="nv">S</span> <span class="nv">n</span> <span class="nv">v</span><span class="p">)</span>
<span class="p">(</span><span class="nb">vector-set!</span> <span class="p">(</span><span class="nf">state-Rn</span> <span class="nv">S</span><span class="p">)</span> <span class="nv">n</span> <span class="nv">v</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">state-R0</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="mi">0</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">state-R1</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="mi">1</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">state-R2</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="mi">2</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">state-R3</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="mi">3</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">state-R4</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="mi">4</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">state-R5</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="mi">5</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">state-R6</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="mi">6</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">state-R7</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="mi">7</span><span class="p">))</span>
<span class="p">(</span><span class="nf">define-symbolic</span> <span class="nv">A</span> <span class="nv">R0</span> <span class="nv">R1</span> <span class="nv">R2</span> <span class="nv">R3</span> <span class="nv">R4</span> <span class="nv">R5</span> <span class="nv">R6</span> <span class="nv">R7</span> <span class="p">(</span><span class="nf">bitvector</span> <span class="mi">8</span><span class="p">))</span>
<span class="p">(</span><span class="nf">define-symbolic</span> <span class="nv">C</span> <span class="p">(</span><span class="nf">bitvector</span> <span class="mi">1</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">make-state</span><span class="p">)</span>
<span class="p">(</span><span class="nf">state</span> <span class="nv">A</span> <span class="nv">C</span> <span class="p">(</span><span class="nb">vector</span> <span class="nv">R0</span> <span class="nv">R1</span> <span class="nv">R2</span> <span class="nv">R3</span> <span class="nv">R4</span> <span class="nv">R5</span> <span class="nv">R6</span> <span class="nv">R7</span><span class="p">)))</span>
<span class="c1">; instructions</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">MOV-A-Rn</span> <span class="p">(</span><span class="nf">n</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">MOV-Rn-A</span> <span class="p">(</span><span class="nf">n</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">ANL-A-Rn</span> <span class="p">(</span><span class="nf">n</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">ORL-A-Rn</span> <span class="p">(</span><span class="nf">n</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">XRL-A-Rn</span> <span class="p">(</span><span class="nf">n</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">XCH-A-Rn</span> <span class="p">(</span><span class="nf">n</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">MOV-A-i</span> <span class="p">(</span><span class="nf">i</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">ANL-A-i</span> <span class="p">(</span><span class="nf">i</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">ORL-A-i</span> <span class="p">(</span><span class="nf">i</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">SWAP-A</span> <span class="p">()</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">CLR-C</span> <span class="p">()</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">MOV-C-An</span> <span class="p">(</span><span class="nf">n</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">MOV-An-C</span> <span class="p">(</span><span class="nf">n</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">RLC-A</span> <span class="p">()</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">RRC-A</span> <span class="p">()</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">RL-A</span> <span class="p">()</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">RR-A</span> <span class="p">()</span> <span class="nt">#:transparent</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">print-insn</span> <span class="nv">insn</span><span class="p">)</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">insn</span>
<span class="p">[(</span><span class="nf">MOV-A-Rn</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"MOV A, R~s~n"</span> <span class="nv">n</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">MOV-Rn-A</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"MOV R~s, A~n"</span> <span class="nv">n</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">ANL-A-Rn</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"ANL A, R~s~n"</span> <span class="nv">n</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">ORL-A-Rn</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"ORL A, R~s~n"</span> <span class="nv">n</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">XRL-A-Rn</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"XRL A, R~s~n"</span> <span class="nv">n</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">XCH-A-Rn</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"XCH A, R~s~n"</span> <span class="nv">n</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">MOV-A-i</span> <span class="nv">i</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"MOV A, #0x~x~n"</span> <span class="p">(</span><span class="nf">bitvector->natural</span> <span class="nv">i</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">ANL-A-i</span> <span class="nv">i</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"ANL A, #0x~x~n"</span> <span class="p">(</span><span class="nf">bitvector->natural</span> <span class="nv">i</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">ORL-A-i</span> <span class="nv">i</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"ORL A, #0x~x~n"</span> <span class="p">(</span><span class="nf">bitvector->natural</span> <span class="nv">i</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">SWAP-A</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"SWAP A~n"</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">CLR-C</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"CLR C~n"</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">MOV-C-An</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"MOV C, ACC.~s~n"</span> <span class="nv">n</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">MOV-An-C</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"MOV ACC.~s, C~n"</span> <span class="nv">n</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">RLC-A</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"RLC A~n"</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">RRC-A</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"RRC A~n"</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">RL-A</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"RL A~n"</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">RR-A</span><span class="p">)</span> <span class="p">(</span><span class="nb">printf</span> <span class="s">"RR A~n"</span><span class="p">)]))</span>
<span class="c1">; sketches</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">??insn</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">n</span> <span class="p">(</span><span class="nf">choose*</span> <span class="mi">0</span> <span class="mi">1</span><span class="p">))</span><span class="c1">; 2 3 4 5 6 7))</span>
<span class="p">(</span><span class="nf">define-symbolic*</span> <span class="nv">i</span> <span class="p">(</span><span class="nf">bitvector</span> <span class="mi">8</span><span class="p">))</span>
<span class="c1">;(define i (choose* (bv #xf0 8) (bv #x0f 8)))</span>
<span class="p">(</span><span class="nf">choose*</span> <span class="p">(</span><span class="nf">MOV-A-Rn</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">MOV-Rn-A</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">ANL-A-Rn</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">ORL-A-Rn</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">XRL-A-Rn</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">XCH-A-Rn</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">MOV-A-i</span> <span class="nv">i</span><span class="p">)</span>
<span class="p">(</span><span class="nf">ANL-A-i</span> <span class="nv">i</span><span class="p">)</span>
<span class="p">(</span><span class="nf">ORL-A-i</span> <span class="nv">i</span><span class="p">)</span>
<span class="p">(</span><span class="nf">SWAP-A</span><span class="p">)</span>
<span class="p">(</span><span class="nf">CLR-C</span><span class="p">)</span>
<span class="p">(</span><span class="nf">MOV-C-An</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">MOV-An-C</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">RLC-A</span><span class="p">)</span>
<span class="p">(</span><span class="nf">RRC-A</span><span class="p">)</span>
<span class="p">(</span><span class="nf">RL-A</span><span class="p">)</span>
<span class="p">(</span><span class="nf">RR-A</span><span class="p">)))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">??prog</span> <span class="nv">fuel</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">fuel</span> <span class="mi">0</span><span class="p">)</span> <span class="nv">null</span>
<span class="p">(</span><span class="nb">cons</span> <span class="p">(</span><span class="nf">??insn</span><span class="p">)</span> <span class="p">(</span><span class="nf">??prog</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">fuel</span> <span class="mi">1</span><span class="p">)))))</span>
<span class="c1">; symbolic interpreter</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">run-insn</span> <span class="nv">S</span> <span class="nv">insn</span><span class="p">)</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">insn</span>
<span class="p">[(</span><span class="nf">MOV-A-Rn</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="nv">n</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">MOV-Rn-A</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">state-Rn-set!</span> <span class="nv">S</span> <span class="nv">n</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">ANL-A-Rn</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">bvand</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="nv">n</span><span class="p">)))]</span>
<span class="p">[(</span><span class="nf">ORL-A-Rn</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">bvor</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="nv">n</span><span class="p">)))]</span>
<span class="p">[(</span><span class="nf">XRL-A-Rn</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">bvxor</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="nv">n</span><span class="p">)))]</span>
<span class="p">[(</span><span class="nf">XCH-A-Rn</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="nf">A</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)]</span> <span class="p">[</span><span class="nf">Rn</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="nv">n</span><span class="p">)])</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="nv">Rn</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-set!</span> <span class="nv">S</span> <span class="nv">n</span> <span class="nv">A</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">MOV-A-i</span> <span class="nv">i</span><span class="p">)</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="nv">i</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">ANL-A-i</span> <span class="nv">i</span><span class="p">)</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">bvand</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)</span> <span class="nv">i</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">ORL-A-i</span> <span class="nv">i</span><span class="p">)</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">bvor</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)</span> <span class="nv">i</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">SWAP-A</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="nf">A</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)])</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">concat</span> <span class="p">(</span><span class="nf">extract</span> <span class="mi">3</span> <span class="mi">0</span> <span class="nv">A</span><span class="p">)</span> <span class="p">(</span><span class="nf">extract</span> <span class="mi">7</span> <span class="mi">4</span> <span class="nv">A</span><span class="p">))))]</span>
<span class="p">[(</span><span class="nf">CLR-C</span><span class="p">)</span>
<span class="p">(</span><span class="nf">set-state-C!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">bv</span> <span class="mi">0</span> <span class="mi">1</span><span class="p">))]</span>
<span class="p">[(</span><span class="nf">MOV-C-An</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">set-state-C!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">extract</span> <span class="nv">n</span> <span class="nv">n</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)))]</span>
<span class="p">[(</span><span class="nf">MOV-An-C</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">replace-bit</span> <span class="mi">8</span> <span class="nv">n</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-C</span> <span class="nv">S</span><span class="p">)))]</span>
<span class="p">[(</span><span class="nf">RLC-A</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="nf">A</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)]</span> <span class="p">[</span><span class="nf">C</span> <span class="p">(</span><span class="nf">state-C</span> <span class="nv">S</span><span class="p">)])</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">concat</span> <span class="p">(</span><span class="nf">extract</span> <span class="mi">6</span> <span class="mi">0</span> <span class="nv">A</span><span class="p">)</span> <span class="nv">C</span><span class="p">))</span>
<span class="p">(</span><span class="nf">set-state-C!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">extract</span> <span class="mi">7</span> <span class="mi">7</span> <span class="nv">A</span><span class="p">)))]</span>
<span class="p">[(</span><span class="nf">RRC-A</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="nf">A</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)]</span> <span class="p">[</span><span class="nf">C</span> <span class="p">(</span><span class="nf">state-C</span> <span class="nv">S</span><span class="p">)])</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">concat</span> <span class="nv">C</span> <span class="p">(</span><span class="nf">extract</span> <span class="mi">7</span> <span class="mi">1</span> <span class="nv">A</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">set-state-C!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">extract</span> <span class="mi">0</span> <span class="mi">0</span> <span class="nv">A</span><span class="p">)))]</span>
<span class="p">[(</span><span class="nf">RL-A</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="nf">A</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)])</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">concat</span> <span class="p">(</span><span class="nf">extract</span> <span class="mi">6</span> <span class="mi">0</span> <span class="nv">A</span><span class="p">)</span> <span class="p">(</span><span class="nf">extract</span> <span class="mi">7</span> <span class="mi">7</span> <span class="nv">A</span><span class="p">))))]</span>
<span class="p">[(</span><span class="nf">RR-A</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="nf">A</span> <span class="p">(</span><span class="nf">state-A</span> <span class="nv">S</span><span class="p">)])</span>
<span class="p">(</span><span class="nf">set-state-A!</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">concat</span> <span class="p">(</span><span class="nf">extract</span> <span class="mi">0</span> <span class="mi">0</span> <span class="nv">A</span><span class="p">)</span> <span class="p">(</span><span class="nf">extract</span> <span class="mi">7</span> <span class="mi">1</span> <span class="nv">A</span><span class="p">))))]</span>
<span class="p">))</span>
<span class="c1">; program verifier</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">verify-prog</span> <span class="nv">prog</span> <span class="nv">asserts</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">make-state</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">S*</span> <span class="p">(</span><span class="nf">make-state</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">solution</span>
<span class="p">(</span><span class="nf">verify</span>
<span class="nt">#:guarantee</span>
<span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="nb">for-each</span> <span class="p">(</span><span class="nf">curry</span> <span class="nv">run-insn</span> <span class="nv">S*</span><span class="p">)</span> <span class="nv">prog</span><span class="p">)</span>
<span class="p">(</span><span class="nf">asserts</span> <span class="nv">S</span> <span class="nv">S*</span><span class="p">))))</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nf">unsat?</span> <span class="nv">solution</span><span class="p">)</span> <span class="no">#t</span>
<span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="nb">displayln</span> <span class="p">(</span><span class="nf">evaluate</span> <span class="nv">S</span> <span class="nv">solution</span><span class="p">))</span>
<span class="p">(</span><span class="nb">displayln</span> <span class="p">(</span><span class="nf">evaluate</span> <span class="nv">S*</span> <span class="nv">solution</span><span class="p">))</span>
<span class="no">#f</span><span class="p">)))</span>
<span class="c1">; program synthesizer</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">synthesize-prog</span> <span class="nv">sketch</span> <span class="nv">asserts</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">S</span> <span class="p">(</span><span class="nf">make-state</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">S*</span> <span class="p">(</span><span class="nf">make-state</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">solution</span>
<span class="p">(</span><span class="nf">synthesize</span>
<span class="nt">#:forall</span> <span class="nv">S</span>
<span class="nt">#:guarantee</span>
<span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="nb">for-each</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nf">insn</span><span class="p">)</span> <span class="p">(</span><span class="nf">run-insn</span> <span class="nv">S*</span> <span class="nv">insn</span><span class="p">))</span> <span class="nv">sketch</span><span class="p">)</span>
<span class="p">(</span><span class="nf">asserts</span> <span class="nv">S</span> <span class="nv">S*</span><span class="p">))))</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nf">unsat?</span> <span class="nv">solution</span><span class="p">)</span> <span class="no">#f</span>
<span class="p">(</span><span class="nf">evaluate</span> <span class="nv">sketch</span> <span class="nv">solution</span><span class="p">)))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">optimize-prog</span> <span class="nv">max-fuel</span> <span class="nv">sketch-gen</span> <span class="nv">asserts</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">worker</span> <span class="nv">fuel</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">prog</span> <span class="p">(</span><span class="nf">synthesize-prog</span> <span class="p">(</span><span class="nf">sketch-gen</span> <span class="nv">fuel</span><span class="p">)</span> <span class="nv">asserts</span><span class="p">))</span>
<span class="p">(</span><span class="k">if</span> <span class="nv">prog</span>
<span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="nb">eprintf</span> <span class="s">"sat! ~s~n"</span> <span class="nv">fuel</span><span class="p">)</span>
<span class="p">(</span><span class="nb">for-each</span> <span class="nv">print-insn</span> <span class="nv">prog</span><span class="p">))</span>
<span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="nb">eprintf</span> <span class="s">"unsat! ~s~n"</span> <span class="nv">fuel</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">>=</span> <span class="nv">fuel</span> <span class="nv">max-fuel</span><span class="p">)</span> <span class="no">#f</span>
<span class="p">(</span><span class="nf">worker</span> <span class="p">(</span><span class="nb">+</span> <span class="nv">fuel</span> <span class="mi">1</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">worker</span> <span class="mi">0</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">optimize-prog/parallel</span> <span class="nv">max-fuel</span> <span class="nv">sketch-gen</span> <span class="nv">asserts</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">solved</span> <span class="p">(</span><span class="nb">box</span> <span class="no">#f</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">solved-fuel</span> <span class="p">(</span><span class="nb">box</span> <span class="mi">1000</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">threads</span> <span class="p">(</span><span class="nb">box</span> <span class="o">'</span><span class="p">()))</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">report-sema</span> <span class="p">(</span><span class="nb">make-semaphore</span> <span class="mi">1</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">worker</span> <span class="nv">fuel</span><span class="p">)</span>
<span class="p">(</span><span class="k">cond</span>
<span class="p">[(</span><span class="k">or</span> <span class="p">(</span><span class="nb">not</span> <span class="p">(</span><span class="nb">unbox</span> <span class="nv">solved</span><span class="p">))</span> <span class="p">(</span><span class="nb"><</span> <span class="nv">fuel</span> <span class="p">(</span><span class="nb">unbox</span> <span class="nv">solved-fuel</span><span class="p">)))</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">prog</span> <span class="p">(</span><span class="nf">synthesize-prog</span> <span class="p">(</span><span class="nf">sketch-gen</span> <span class="nv">fuel</span><span class="p">)</span> <span class="nv">asserts</span><span class="p">))</span>
<span class="p">(</span><span class="nb">call-with-semaphore/enable-break</span> <span class="nv">report-sema</span>
<span class="p">(</span><span class="k">lambda</span> <span class="p">()</span>
<span class="p">(</span><span class="k">if</span> <span class="nv">prog</span>
<span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="nb">eprintf</span> <span class="s">"sat! ~s~n"</span> <span class="nv">fuel</span><span class="p">)</span>
<span class="p">(</span><span class="nb">for-each</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nf">thd-fuel</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">></span> <span class="p">(</span><span class="nb">cdr</span> <span class="nv">thd-fuel</span><span class="p">)</span> <span class="nv">fuel</span><span class="p">)</span>
<span class="p">(</span><span class="nb">break-thread</span> <span class="p">(</span><span class="nb">car</span> <span class="nv">thd-fuel</span><span class="p">))</span>
<span class="p">(</span><span class="nb">void</span><span class="p">)))</span> <span class="p">(</span><span class="nb">unbox</span> <span class="nv">threads</span><span class="p">))</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="k">or</span> <span class="p">(</span><span class="nb">not</span> <span class="p">(</span><span class="nb">unbox</span> <span class="nv">solved</span><span class="p">))</span> <span class="p">(</span><span class="nb"><</span> <span class="nv">fuel</span> <span class="p">(</span><span class="nb">unbox</span> <span class="nv">solved-fuel</span><span class="p">)))</span>
<span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="nb">set-box!</span> <span class="nv">solved-fuel</span> <span class="nv">fuel</span><span class="p">)</span>
<span class="p">(</span><span class="nb">set-box!</span> <span class="nv">solved</span> <span class="nv">prog</span><span class="p">))</span>
<span class="p">(</span><span class="nb">void</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">eprintf</span> <span class="s">"unsat! ~s~n"</span> <span class="nv">fuel</span><span class="p">))))]))</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">core-sema</span> <span class="p">(</span><span class="nb">make-semaphore</span> <span class="p">(</span><span class="nf">processor-count</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">for</span> <span class="p">([</span><span class="nf">fuel</span> <span class="p">(</span><span class="nf">in-range</span> <span class="p">(</span><span class="nb">add1</span> <span class="nv">max-fuel</span><span class="p">))])</span>
<span class="p">(</span><span class="nb">semaphore-wait</span> <span class="nv">core-sema</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">thd</span>
<span class="p">(</span><span class="nb">thread</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">()</span>
<span class="p">(</span><span class="k">with-handlers</span> <span class="p">([</span><span class="nb">exn:break?</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nf">x</span><span class="p">)</span> <span class="p">(</span><span class="nb">void</span><span class="p">))])</span>
<span class="p">(</span><span class="nf">worker</span> <span class="nv">fuel</span><span class="p">))</span>
<span class="p">(</span><span class="nb">semaphore-post</span> <span class="nv">core-sema</span><span class="p">))))</span>
<span class="p">(</span><span class="nb">set-box!</span> <span class="nv">threads</span> <span class="p">(</span><span class="nb">cons</span> <span class="p">(</span><span class="nb">cons</span> <span class="nv">thd</span> <span class="nv">fuel</span><span class="p">)</span> <span class="p">(</span><span class="nb">unbox</span> <span class="nv">threads</span><span class="p">))))</span>
<span class="p">(</span><span class="nb">for-each</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nf">thd-fuel</span><span class="p">)</span> <span class="p">(</span><span class="nb">thread-wait</span> <span class="p">(</span><span class="nb">car</span> <span class="nv">thd-fuel</span><span class="p">)))</span> <span class="p">(</span><span class="nb">unbox</span> <span class="nv">threads</span><span class="p">))</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">not</span> <span class="p">(</span><span class="nb">unbox</span> <span class="nv">solved</span><span class="p">))</span> <span class="p">(</span><span class="nb">void</span><span class="p">)</span>
<span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="nb">for-each</span> <span class="nv">print-insn</span> <span class="p">(</span><span class="nb">unbox</span> <span class="nv">solved</span><span class="p">))</span>
<span class="p">(</span><span class="nb">flush-output</span><span class="p">))))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">assert-preserve</span> <span class="nv">S</span> <span class="nv">S*</span> <span class="o">.</span> <span class="nv">regs</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">assert-preserve-reg</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">assert</span> <span class="p">(</span><span class="nf">bveq</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-Rn-ref</span> <span class="nv">S*</span> <span class="nv">n</span><span class="p">))))</span>
<span class="p">(</span><span class="nb">for-each</span> <span class="nv">assert-preserve-reg</span> <span class="nv">regs</span><span class="p">))</span>
<span class="c1">; examples</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">optimize-8b-rotate-right</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">optimize-prog/parallel</span>
<span class="mi">4</span>
<span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nf">fuel</span><span class="p">)</span> <span class="p">(</span><span class="nf">??prog</span> <span class="nv">fuel</span><span class="p">))</span>
<span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nf">S</span> <span class="nv">S*</span><span class="p">)</span>
<span class="p">(</span><span class="nf">assert</span> <span class="p">(</span><span class="nf">bveq</span> <span class="p">(</span><span class="nf">rotate-right</span> <span class="mi">8</span> <span class="nv">n</span> <span class="p">(</span><span class="nf">state-R0</span> <span class="nv">S</span><span class="p">))</span> <span class="p">(</span><span class="nf">state-R0</span> <span class="nv">S*</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">assert-preserve</span> <span class="nv">S</span> <span class="nv">S*</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span><span class="p">))))</span>
<span class="p">(</span><span class="nf">for</span> <span class="p">([</span><span class="nf">n</span> <span class="p">(</span><span class="nf">in-range</span> <span class="mi">8</span><span class="p">)])</span>
<span class="p">(</span><span class="k">time</span>
<span class="p">(</span><span class="nb">printf</span> <span class="s">"; rotate right R0 by ~a~n"</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nb">flush-output</span><span class="p">)</span>
<span class="p">(</span><span class="nf">optimize-8b-rotate-right</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nb">printf</span> <span class="s">"; "</span><span class="p">)))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">optimize-16b-rotate-right</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nf">optimize-prog/parallel</span>
<span class="mi">20</span>
<span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nf">fuel</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">fuel</span> <span class="mi">0</span><span class="p">)</span> <span class="o">'</span><span class="p">()</span>
<span class="p">(</span><span class="nb">append</span>
<span class="c1">; help the synthesizer out a bit</span>
<span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="nf">MOV-A-Rn</span> <span class="p">(</span><span class="nf">choose*</span> <span class="mi">0</span> <span class="mi">1</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">??prog</span> <span class="nv">fuel</span><span class="p">)</span>
<span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="nf">MOV-Rn-A</span> <span class="p">(</span><span class="nf">choose*</span> <span class="mi">0</span> <span class="mi">1</span><span class="p">))))))</span>
<span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nf">S</span> <span class="nv">S*</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">R10</span> <span class="p">(</span><span class="nf">concat</span> <span class="p">(</span><span class="nf">state-R1</span> <span class="nv">S</span> <span class="p">)</span> <span class="p">(</span><span class="nf">state-R0</span> <span class="nv">S</span> <span class="p">)))</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">R10*</span> <span class="p">(</span><span class="nf">concat</span> <span class="p">(</span><span class="nf">state-R1</span> <span class="nv">S*</span><span class="p">)</span> <span class="p">(</span><span class="nf">state-R0</span> <span class="nv">S*</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">assert</span> <span class="p">(</span><span class="nf">bveq</span> <span class="p">(</span><span class="nf">rotate-right</span> <span class="mi">16</span> <span class="nv">n</span> <span class="nv">R10</span><span class="p">)</span> <span class="nv">R10*</span><span class="p">))</span>
<span class="p">(</span><span class="nf">assert-preserve</span> <span class="nv">S</span> <span class="nv">S*</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span><span class="p">))))</span>
<span class="p">(</span><span class="nf">for</span> <span class="p">([</span><span class="nf">n</span> <span class="p">(</span><span class="nf">in-range</span> <span class="mi">16</span><span class="p">)])</span>
<span class="p">(</span><span class="k">time</span>
<span class="p">(</span><span class="nb">printf</span> <span class="s">"; rotate right R1:R0 by ~a~n"</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nb">flush-output</span><span class="p">)</span>
<span class="p">(</span><span class="nf">optimize-16b-rotate-right</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nb">printf</span> <span class="s">"; "</span><span class="p">)))</span>
</pre></td>
</tr></tbody></table></code></pre></figure>
<h1 id="results">Results</h1>
<p>Generating the optimal 8-bit and 16-bit rotates took about half a day on a modern laptop (Dell XPS13 9360, using all cores and with mitigations disabled). Because of that I have not attempted generating wider ones so far.</p>
<h2 id="bit-rotates">8-bit rotates</h2>
<p>The following code lists all optimal 8-bit rotates, by 0 to 7 bits.</p>
<figure><figcaption>rot8.asm (<a href="/files/synth51/rot8.asm">download</a>)</figcaption><pre><code class="language-text"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
</pre></td>
<td class="rouge-code"><pre>; rotate right R0 by 0
; rotate right R0 by 1
MOV A, R0
RR A
MOV R0, A
; rotate right R0 by 2
MOV A, R0
RR A
RR A
MOV R0, A
; rotate right R0 by 3
XCH A, R0
RL A
SWAP A
MOV R0, A
; rotate right R0 by 4
MOV A, R0
SWAP A
MOV R0, A
; rotate right R0 by 5
MOV A, R0
SWAP A
RR A
XCH A, R0
; rotate right R0 by 6
MOV A, R0
RL A
RL A
MOV R0, A
; rotate right R0 by 7
MOV A, R0
RL A
MOV R0, A
</pre></td>
</tr></tbody></table></code></pre></figure>
<h2 id="bit-rotates-1">16-bit rotates</h2>
<p>The following code lists all optimal 16-bit rotates, by 0 to 15 bits. I find the approach the solver used for the rotate by 10 nothing short of brilliant, and the approach it took for rotate by 3/5/11/13 pretty neat as well.</p>
<figure><figcaption>rot16.asm (<a href="/files/synth51/rot16.asm">download</a>)</figcaption><pre><code class="language-text"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
</pre></td>
<td class="rouge-code"><pre>; rotate right R1:R0 by 0
; rotate right R1:R0 by 1
MOV A, R1
MOV C, ACC.0
XCH A, R0
RRC A
XCH A, R0
RRC A
MOV R1, A
; rotate right R1:R0 by 2
MOV A, R1
MOV C, ACC.0
XCH A, R0
RRC A
XCH A, R0
RRC A
MOV C, ACC.0
XCH A, R0
RRC A
XCH A, R0
RRC A
MOV R1, A
; rotate right R1:R0 by 3
MOV A, R0
XRL A, R1
MOV R1, A
ANL A, #0xf8
XRL A, R0
SWAP A
RL A
XCH A, R1
SWAP A
RL A
XRL A, R1
MOV R0, A
; rotate right R1:R0 by 4
MOV A, R0
XRL A, R1
ANL A, #0xf0
XCH A, R1
XRL A, R1
SWAP A
XCH A, R0
XRL A, R1
SWAP A
MOV R1, A
; rotate right R1:R0 by 5
MOV A, R1
XRL A, R0
ANL A, #0x1f
XCH A, R0
XRL A, R0
RR A
SWAP A
XCH A, R0
XRL A, R1
SWAP A
RR A
MOV R1, A
; rotate right R1:R0 by 6
MOV A, R1
RLC A
XCH A, R0
RLC A
XCH A, R1
RLC A
MOV C, ACC.7
XCH A, R1
RLC A
XCH A, R1
RLC A
MOV R0, A
; rotate right R1:R0 by 7
MOV A, R0
MOV C, ACC.7
MOV A, R1
RLC A
XCH A, R0
RLC A
MOV R1, A
; rotate right R1:R0 by 8
MOV A, R1
XCH A, R0
MOV R1, A
; rotate right R1:R0 by 9
MOV A, R1
RRC A
MOV A, R0
RRC A
XCH A, R1
RRC A
MOV R0, A
; rotate right R1:R0 by 10
MOV A, R1
XRL A, R0
ANL A, #0x3
XRL A, R1
RR A
RR A
XCH A, R0
XRL A, R1
RR A
RR A
XRL A, R0
MOV R1, A
; rotate right R1:R0 by 11
MOV A, R1
XRL A, R0
MOV R1, A
ANL A, #0x7
XRL A, R0
RL A
SWAP A
XCH A, R1
RL A
SWAP A
XRL A, R1
MOV R0, A
; rotate right R1:R0 by 12
MOV A, R0
XRL A, R1
ANL A, #0xf0
XCH A, R1
XRL A, R1
SWAP A
XCH A, R1
XRL A, R0
SWAP A
MOV R0, A
; rotate right R1:R0 by 13
MOV A, R1
XRL A, R0
ANL A, #0xe0
XCH A, R0
XRL A, R0
SWAP A
RR A
XCH A, R0
XRL A, R1
RR A
SWAP A
MOV R1, A
; rotate right R1:R0 by 14
MOV A, R1
MOV C, ACC.7
XCH A, R0
RLC A
XCH A, R0
RLC A
MOV C, ACC.7
XCH A, R0
RLC A
XCH A, R0
RLC A
MOV R1, A
; rotate right R1:R0 by 15
MOV A, R1
MOV C, ACC.7
XCH A, R0
RLC A
XCH A, R0
RLC A
MOV R1, A
</pre></td>
</tr></tbody></table></code></pre></figure>
<p>While working on an application targeting Nordic nRF24LE1, a wireless SoC with a fairly slow 8051 core, I was wondering if I can have fast, or at least not unusably slow, cryptography. Most cryptographic algorithms involve wide rotates, and the 8051 only has instructions for rotating a 8-bit accumulator by one bit at a time. In this note I explore deriving optimal code for rotating values in registers (that may be bigger than 8 bits) by multiple bits.</p>
tag:lab.whitequark.org,2020-01-27:/notes/2020-01-27/undocumented-nrf24lu1-quirks/
Undocumented nRF24LU1+ quirks
2020-01-27T23:48:52Z
2020-01-27T23:48:52Z
<p>While working with nRF24LU1+, I discovered that the chip has lots and lots of odd corners that are sparsely or strangely documented, where the silicon doesn’t match documentation, etc.</p>
<!--more-->
<ul id="markdown-toc">
<li><a href="#info-page-size" id="markdown-toc-info-page-size">Info page size</a></li>
<li><a href="#reading-info-page-from-mcu" id="markdown-toc-reading-info-page-from-mcu">Reading info page from MCU</a></li>
<li><a href="#bootloader-and-reset-vector" id="markdown-toc-bootloader-and-reset-vector">Bootloader and reset vector</a></li>
</ul>
<h1 id="info-page-size">Info page size</h1>
<p>The datasheet describes the info page as 512 bytes long. However, setting <code>FSR.INFEN</code> and reading the entire address space reveals a 1024 byte periodic structure. The second half appears to be a ROM (it cannot be either programmed or erased), and on my chip it starts with <code>43 46 54 39 32 32 11 41 2c</code> (“CFT922”) and has an additional <code>00</code> at offset 0x21.</p>
<h1 id="reading-info-page-from-mcu">Reading info page from MCU</h1>
<p>The datasheet describes in detail how to read info page via SPI, and that works just fine. However, it doesn’t explain how to read it via MCU, which is strange because the info page contains an unique chip ID and it is avilable for user data. The <code>FSR.INFEN</code> bit is documented only to affect SPI accesses, for which a diagram implies the info page replaces the 0th page (0x0000..0x0100).</p>
<p>In practice, it works rather differently. Setting <code>FSR.INFEN</code> in firmware replaces all data accesses to flash address space (which is the entire lower half of it) with info page accesses, similar to the previous section. However, code accesses are not affected.</p>
<p>In terms of the 8051 instruction set, <code>MOVC</code> always performs a code access, and <code>MOVX</code> can perform either a data (0) or a code (1) access depending on the state of <code>PCON.PMW</code> bit. <code>PMW</code> stands for “program memory write”, but, in spite of that name, it affects reads too. Because nRF24LU1+ has non-overlapping code and data spaces (besides the info page quirk) there are no references to it in the documentation for that chip, but the documentation for nRF24LE1, which does have overlapping address spaces, is more suggestive.</p>
<p>This means that the info page may be read via the MCU by setting <code>PCON.PMW</code> and <code>FSR.INFEN</code> to make compiler-emitted <code>MOVX</code> instructions work as expected, and then temporarily clearing <code>PCON.PMW</code> each time a byte of the info page is read.</p>
<h1 id="bootloader-and-reset-vector">Bootloader and reset vector</h1>
<p>The nRF24LU1+ flash is clearly designed for atomic firmware updates: the population count of the 16 bytes at the top of the flash determines whether the bootloader (“protected area”) or application (“unprotected area”) is executed at reset. Thus, it is expected that the application and bootloader take turns programming single bits, which will succeed under worst case power failure.</p>
<p>However, it is useful to have a permanent bootloader mode, where the chip can be programmed via USB by using a strap pin, or a delay and a special USB request. To achieve this, a single bit may be programmed at the top of flash, with the bootloader running the application if the firmware update trigger is not present.</p>
<p>Unfortunately, branching to address 0 doesn’t work; it just causes the bootloader to be re-entered. The datasheet opaquely alludes to this:</p>
<blockquote>
<p><strong>Note:</strong> In a program running in a protected flash area, movc may not be used to access addresses 0x00 to 0x03.</p>
</blockquote>
<p>By using <code>MOVC</code> to access those bytes, it can be seen that, rather than changing the address of the reset vector of the MCU (or for that matter the interrupt vector table; interrupts may not be used in the bootloader), the designers of this chip decided to replace code accesses to the reset vector with a procedurally generated instruction; when <code>FSR.STP</code> is enabled, the first four bytes of the code address space are replaced with <code>02 XX YY 00</code>, where <code>XX YY</code> is the address of the first protected page. This decodes to <code>LJMP #XXYYh; NOP</code>.</p>
<p>Most 8051 MCUs use an interrupt vector table with 3 bytes per entry, which in practice means that it is usually densely filled with <code>LJMP</code> instructions. (However, there are other possibilities, e.g. sdcc may emit <code>AJMP</code>.) That means that it is practical for a bootloader to read out an instruction, recognize an <code>LJMP</code> (and perhaps a few others) and interpret it, i.e. jump to the address within.</p>
<p>Unfortunately, there is a problem with this approach, which is that even once the application is running, the <code>FSR.STP</code> bit is still set, and this corrupts the first byte of the first interrupt vector (timer 0 overflow). This may be worked around by placing a right-aligned <code>SJMP</code> or <code>AJMP</code> there. Similarly, if no interrupts are used, the linker may place CRT startup code there, which may be worked around with a dummy interrupt handler. In both cases it is necessary to modify the application firmware; there is nothing the bootloader can do here as <code>FSR.STP</code> can not be cleared.</p>
<p>While working with nRF24LU1+, I discovered that the chip has lots and lots of odd corners that are sparsely or strangely documented, where the silicon doesn’t match documentation, etc.</p>
tag:lab.whitequark.org,2020-01-25:/notes/2020-01-25/pixel-pawn-wireless-flash-trigger-on-air-protocol/
Pixel Pawn wireless flash trigger on-air protocol
2020-01-25T18:43:35Z
2020-01-25T18:43:35Z
<p>In this note I describe the on-air protocol of Pixel Pawn wireless flash trigger.</p>
<!-- more -->
<ul id="markdown-toc">
<li><a href="#tools-and-methods" id="markdown-toc-tools-and-methods">Tools and methods</a></li>
<li><a href="#framing" id="markdown-toc-framing">Framing</a></li>
<li><a href="#channels" id="markdown-toc-channels">Channels</a></li>
<li><a href="#commands" id="markdown-toc-commands">Commands</a></li>
<li><a href="#communication" id="markdown-toc-communication">Communication</a></li>
</ul>
<h1 id="tools-and-methods">Tools and methods</h1>
<p>To interact with the flash trigger, I used <a href="https://limemicro.com/products/boards/limesdr-mini/">Lime Microsystems LimeSDR Mini</a>. To determine the center frequency of transmissions, I used <a href="https://github.com/f4exb/sdrangel">SDRAngel</a> in spectrogram mode, knowing that the device is advertised to work in the 2.4 GHz range. To determine bit rate, modulation, and packet format, I used <a href="https://github.com/jopohl/urh">Universal Radio Hacker</a> as described in its documentation. To understand commands better, I used the <a href="https://github.com/GlasgowEmbedded/glasgow">Glasgow debug tool</a>’s <code>radio-nrf24l</code> applet in transmit and receive mode as described <a href="#communication">below</a>.</p>
<h1 id="framing">Framing</h1>
<p>Modulation GMSK, data rate 250 kbps, on-air time 356.00 µs, of which settling time 60.75 µs, transmission time 295.25 µs. Packet format:</p>
<pre><code>1010101010101010101010101010101010101100001101010111111001001110101-1100000
< 67 fixed bits > < CMD >
</code></pre>
<p>Each button press produces a burst of 5-10 packets. However, holding a button or being in an active mode only produces a single packet every few hundred ms.</p>
<h1 id="channels">Channels</h1>
<p>The channel to frequency mapping is as follows:</p>
<table style="width: 200px">
<thead>
<tr>
<th>Channel</th>
<th>Frequency</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>HHHH</code></td>
<td>2.4020</td>
</tr>
<tr>
<td><code>HHHL</code></td>
<td>2.4065</td>
</tr>
<tr>
<td><code>HHLH</code></td>
<td>2.4100</td>
</tr>
<tr>
<td><code>HHLL</code></td>
<td>2.4185</td>
</tr>
<tr>
<td><code>HLHH</code></td>
<td>2.4210</td>
</tr>
<tr>
<td><code>HLHL</code></td>
<td>2.4295</td>
</tr>
<tr>
<td><code>HLLH</code></td>
<td>2.4355</td>
</tr>
<tr>
<td><code>HLLL</code></td>
<td>2.4385</td>
</tr>
<tr>
<td><code>LHHH</code></td>
<td>2.4450</td>
</tr>
<tr>
<td><code>LHHL</code></td>
<td>2.4465</td>
</tr>
<tr>
<td><code>LHLH</code></td>
<td>2.4515</td>
</tr>
<tr>
<td><code>LHLL</code></td>
<td>2.4600</td>
</tr>
<tr>
<td><code>LLHH</code></td>
<td>2.4620</td>
</tr>
<tr>
<td><code>LLHL</code></td>
<td>2.4695</td>
</tr>
<tr>
<td><code>LLLH</code></td>
<td>2.4710</td>
</tr>
<tr>
<td><code>LLLL</code></td>
<td>2.4770</td>
</tr>
</tbody>
</table>
<h1 id="commands">Commands</h1>
<p>Commands are specified in 7 last bits of the packet.</p>
<table>
<thead>
<tr>
<th>Command</th>
<th>Encoding</th>
<th>Condition</th>
</tr>
</thead>
<tbody>
<tr>
<td>wakeup</td>
<td><code>1100000</code></td>
<td>power on, mode switch, other command prefix</td>
</tr>
<tr>
<td>autofocus</td>
<td><code>0001111</code></td>
<td>half press</td>
</tr>
<tr>
<td>normal release</td>
<td><code>0010100</code></td>
<td>mode 1 press</td>
</tr>
<tr>
<td>shutter open</td>
<td><code>0010000</code></td>
<td>mode 2 first press</td>
</tr>
<tr>
<td>shutter close</td>
<td><code>0011011</code></td>
<td>mode 2 second press</td>
</tr>
<tr>
<td>timer start</td>
<td><code>1100100</code></td>
<td>mode 3 first press</td>
</tr>
<tr>
<td>timer cancel</td>
<td><code>0011011</code></td>
<td>mode 3 second press</td>
</tr>
</tbody>
</table>
<p>The preamble can be quite short, as few as 2 octets long. However, commands that are not preceded by <code>1100000</code> will be often not recognized, regardless of preamble length.</p>
<h1 id="communication">Communication</h1>
<p>Considering the framing, any appropriately encoded octet sequence that includes the following one will trigger a command:</p>
<pre><code>aa aa aa aa b0 d5 f9 3a (80|CMD)
</code></pre>
<p>An elegant way to emulate a transmitter is to use an nRF24L01(+) in nRF2401 compatible mode with 4 address bytes set to <code>aa aa aa aa</code> and data bytes set to <code>b0 d5 f9 3a (80|CMD)</code>. However, in a pinch, just transmitting this as a kind of in-band signal in any other packet framing also works just fine.</p>
<p>Similarly, an elegant way to emulate a receiver is to use an nRF24L01(+) in nRF2401 compatible mode with 4 address bytes set to <code>b0 d5 f9 3a</code>. (It uses the same <code>aa</code> preamble, and synchronizes to address bytes.)</p>
<p>nRF24L01(+) only support channel frequencies of integer MHz; this flash trigger uses some channels of half-integer MHz, e.g. 2.4465. The nRF24L01(+) is promiscuous enough that it easily and reliably locks to transmissions on half-integer MHz channels. However, the flash trigger receiver configured to use such a channel ignores any transmissions half MHz apart. This means that an nRF24L01(+) can only transmit on half of the defined channels.</p>
<p>One could notice that nRF24L01(+) transmits with its PLL in open loop, and the frequency of said PLL drifts down. By issuing the <code>REUSE_TX_PL</code> command and pulsing CE for a few ms, it will transmit the last command in a loop while drifting down, and eventually hitting the right frequency. This, however, is a rather disgusting workaround.</p>
<p>In this note I describe the on-air protocol of Pixel Pawn wireless flash trigger.</p>
tag:lab.whitequark.org,2018-10-28:/notes/2018-10-28/patching-nvidia-gpu-driver-for-hot-unplug-on-linux/
Patching nVidia GPU driver for hot-unplug on Linux
2018-10-28T08:27:37Z
2018-10-28T08:27:37Z
<p>Recently, I’ve using an extremely cursed setup where my XPS 13 9360 laptop is connected to a Sonnet EchoExpress 2 box rewired for Thunderbolt 3 that has an nVidia Quadro 600 GPU, and Linux is set up for render offload to the eGPU and then frame transfer back to iGPU to be displayed on the laptop’s integrated display, which (to my sheer surprise) not only works quire reliably, but even gives me higher FPS in Team Fortress 2 than the iGPU.</p>
<p>There’s only really one downside: if the eGPU falls off the bus, either because someone™ pulled out the cable, or because the stars didn’t align quite right this morning and it decided to enumerate seemingly at random (sometimes this is preceeded by whining from PCIe AER, sometimes not, I <em>think</em> it’s some sort of hardware issue like a badly inserted PCIe card, but I’m not entirely sure), the nVidia driver… hangs. Hangs quite deliberately, as the sources to the kernel driver show. This leaves the Xorg instance bound to the eGPU hung forever (which confuses bumblebee, but is otherwise not especially bad), and also prevents any new ones from using the eGPU (which is bad).</p>
<p>Anyway, I was kind of annoyed of rebooting every time it happens, so I decided to reboot a few more dozen times instead while patching the driver. This has indeed worked, and left me with something similar to a functional hot-unplug, mildly crippled by the fact that nvidia-modeset is a completely opaque blob that keeps some internal state and tries to act on it, getting stuck when it tries to do something to the now-missing eGPU.</p>
<p>Turns out, there are only a few issues preventing functional hot-unplug.</p>
<ol>
<li>
<p>In <code>nvidia_remove</code>, the driver actually checks if anyone’s still trying to use it, and if yes, it tries to just hang the removal process. This doesn’t actually work, or rather, it mostly works by accident. It starts an infinite loop calling <code>os_schedule()</code> while having taken the <code>NV_LINUX_DEVICES</code> lock. While in the default configuration this indeed hangs any reentrant requests into the driver by virtue of <code>NV_CHECK_PCI_CONFIG_SPACE</code> taking the same lock (in <code>verify_pci_bars</code>, passing the <code>NVreg_CheckPCIConfigSpace=0</code> module option eliminates that accidental safety mechanism, and allows reentrant requests to proceed. They do not crash due to memory being deallocated in <code>nvidia_remove</code> (so you don’t get an unhandled kernel page fault), but they still crash due to being unable to access the GPU.</p>
</li>
<li>
<p>The NVKMS component (in the <code>nvidia-modeset</code> module) tries to maintain some state, and change it when e.g. the Xorg instance quits and closes the <code>/dev/nvidia-modeset</code> file. Unfortunately, it does not expect the GPU to go away, and first spews a few messages to <code>dmesg</code> similar to <code>nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000857d:0:0:0x0000000f</code>, after which it appears to hang somewhere inside the blob, which has been conveniently stripped of all symbols. This needs to be prevented, but…</p>
</li>
<li>
<p>The NVKMS component effectively only exposes a single opaque ioctl, and all the communication, including communication of the GPU bus ID, happens out of band with regards to the open source parts of the <code>nvidia-modeset</code> module. Fortunately, NVKMS calls back into NVRM, and this allows us to associate each <code>/dev/nvidia-modeset</code> fd with the GPU bus ID.</p>
</li>
<li>
<p>When unloading NVKMS, it also tries to act on its internal state and change the GPU state, which leads to the same hang.</p>
</li>
</ol>
<p>All in all, this allows a patch to be written that detects when a GPU goes away, ignores all further NVKMS requests related to that specific GPU (and returns <code>-ENOENT</code> in response to ioctls, which Xorg appropriately interprets as a fault condition), correctly releases the resources by requesting NVRM, and improperly unloads NVKMS so it doesn’t try to reset the GPU state. (All actual resources should be released by this point, and NVKMS doesn’t have any resource allocation callbacks other than those we already intercept, so <em>in theory</em> this doesn’t have any bad consequences. But I’m not working for nVidia, so this might be completely wrong.)</p>
<p>After the GPU is plugged back in, NVKMS will try to act on its internal state again; in this case, it doesn’t hang, but it doesn’t initialize the GPU correctly either, so the <code>nvidia-modeset</code> kernel module has to be (manually) reloaded. It’s not easy to do this automatically because in a hypothetical system with more than one nVidia GPU the module would still be in use when one of them dies, and so just hard reloading NVKMS would have unfortunate consequences. (Though, I don’t really know whether NVKMS would try to access the dead GPU in response to the request acting on the other GPU anyway. I decided to do it conservatively.) Once it’s reloaded you’re back in the game though!</p>
<p>Here’s the patch, written against the <code>nvidia-legacy-390xx-390.87</code> Debian source package:</p>
<figure><figcaption>nvidia-hot-gpu-on-gpu-unplug-action.patch (<a href="/files/nvidia-hot-gpu-on-gpu-unplug-action.patch">download</a>)</figcaption><pre><code class="language-diff"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
</pre></td>
<td class="rouge-code"><pre><span class="gh">diff -ur original/common/inc/nv-linux.h patchedl/common/inc/nv-linux.h
</span><span class="gd">--- original/common/inc/nv-linux.h 2018-09-23 12:20:02.000000000 +0000
</span><span class="gi">+++ patched/common/inc/nv-linux.h 2018-10-28 07:19:21.526566940 +0000
</span><span class="p">@@ -1465,6 +1465,7 @@</span>
typedef struct nv_linux_state_s {
nv_state_t nv_state;
atomic_t usage_count;
<span class="gi">+ atomic_t dead;
</span>
struct pci_dev *dev;
<span class="gh">diff -ur original/common/inc/nv-modeset-interface.h patched/common/inc/nv-modeset-interface.h
</span><span class="gd">--- original/common/inc/nv-modeset-interface.h 2018-08-22 00:55:23.000000000 +0000
</span><span class="gi">+++ patched/common/inc/nv-modeset-interface.h 2018-10-28 07:22:00.768238371 +0000
</span><span class="p">@@ -25,6 +25,8 @@</span>
#include "nv-gpu-info.h"
<span class="gi">+#include <asm/atomic.h>
+
</span> /*
* nvidia_modeset_rm_ops_t::op gets assigned a function pointer from
* core RM, which uses the calling convention of arguments on the
<span class="p">@@ -115,6 +117,8 @@</span>
int (*set_callbacks)(const nvidia_modeset_callbacks_t *cb);
<span class="gi">+ atomic_t * (*gpu_dead)(NvU32 gpu_id);
+
</span> } nvidia_modeset_rm_ops_t;
NV_STATUS nvidia_get_rm_ops(nvidia_modeset_rm_ops_t *rm_ops);
<span class="gh">diff -ur original/common/inc/nv-proto.h patched/common/inc/nv-proto.h
</span><span class="gd">--- original/common/inc/nv-proto.h 2018-08-22 00:55:23.000000000 +0000
</span><span class="gi">+++ patched/common/inc/nv-proto.h 2018-10-28 07:20:49.939494812 +0000
</span><span class="p">@@ -81,6 +81,7 @@</span>
NvBool nvidia_get_gpuid_list (NvU32 *gpu_ids, NvU32 *gpu_count);
int nvidia_dev_get (NvU32, nvidia_stack_t *);
void nvidia_dev_put (NvU32, nvidia_stack_t *);
<span class="gi">+atomic_t * nvidia_dev_dead (NvU32);
</span> int nvidia_dev_get_uuid (const NvU8 *, nvidia_stack_t *);
void nvidia_dev_put_uuid (const NvU8 *, nvidia_stack_t *);
int nvidia_dev_get_pci_info (const NvU8 *, struct pci_dev **, NvU64 *, NvU64 *);
<span class="gh">diff -ur original/nvidia/nv.c patched/nvidia/nv.c
</span><span class="gd">--- original/nvidia/nv.c 2018-09-23 12:20:02.000000000 +0000
</span><span class="gi">+++ patched/nvidia/nv.c 2018-10-28 07:48:05.895025112 +0000
</span><span class="p">@@ -1944,6 +1944,12 @@</span>
unsigned int i;
NvBool bRemove = NV_FALSE;
<span class="gi">+ if (NV_ATOMIC_READ(nvl->dead))
+ {
+ nv_printf(NV_DBG_ERRORS, "NVRM: nvidia_close called on dead device by pid %d!\n",
+ current->pid);
+ }
+
</span> NV_CHECK_PCI_CONFIG_SPACE(sp, nv, TRUE, TRUE, NV_MAY_SLEEP());
/* for control device, just jump to its open routine */
<span class="p">@@ -2106,6 +2112,12 @@</span>
size_t arg_size;
int arg_cmd;
<span class="gi">+ if (NV_ATOMIC_READ(nvl->dead))
+ {
+ nv_printf(NV_DBG_ERRORS, "NVRM: nvidia_ioctl called on dead device by pid %d!\n",
+ current->pid);
+ }
+
</span> nv_printf(NV_DBG_INFO, "NVRM: ioctl(0x%x, 0x%x, 0x%x)\n",
_IOC_NR(cmd), (unsigned int) i_arg, _IOC_SIZE(cmd));
<span class="p">@@ -3217,6 +3229,7 @@</span>
NV_INIT_MUTEX(&nvl->ldata_lock);
NV_ATOMIC_SET(nvl->usage_count, 0);
<span class="gi">+ NV_ATOMIC_SET(nvl->dead, 0);
</span>
if (!rm_init_event_locks(sp, nv))
return NV_FALSE;
<span class="p">@@ -4018,14 +4031,38 @@</span>
nv_printf(NV_DBG_ERRORS,
"NVRM: Attempting to remove minor device %u with non-zero usage count!\n",
nvl->minor_num);
<span class="gi">+ nv_printf(NV_DBG_ERRORS,
+ "NVRM: YOLO, waiting for usage count to drop to zero\n");
</span> WARN_ON(1);
<span class="gd">- /* We can't continue without corrupting state, so just hang to give the
- * user some chance to do something about this before reboot */
- while (1)
</span><span class="gi">+ NV_ATOMIC_SET(nvl->dead, 1);
+
+ /* Insanity check: wait until all clients die, then hope for the best. */
+ while (1) {
+ UNLOCK_NV_LINUX_DEVICES();
</span> os_schedule();
<span class="gd">- }
</span><span class="gi">+ LOCK_NV_LINUX_DEVICES();
+
+ nvl = pci_get_drvdata(dev);
+ if (!nvl || (nvl->dev != dev))
+ {
+ goto done;
+ }
+
+ if (NV_ATOMIC_READ(nvl->usage_count) == 0)
+ {
+ break;
+ }
+ }
</span>
<span class="gi">+ nv_printf(NV_DBG_ERRORS,
+ "NVRM: Usage count is now zero, proceeding to remove the GPU\n");
+ nv_printf(NV_DBG_ERRORS,
+ "NVRM: This is not actually supposed to work lol. Hope it does tho 👍\n");
+ nv_printf(NV_DBG_ERRORS,
+ "NVRM: You probably want to reload nvidia-modeset now if you want any "
+ "of this to ever start up again, but like, man, that's your choice entirely\n");
+ }
</span> nv = NV_STATE_PTR(nvl);
if (nvl == nv_linux_devices)
nv_linux_devices = nvl->next;
<span class="p">@@ -4712,6 +4749,22 @@</span>
up(&nvl->ldata_lock);
}
<span class="gi">+atomic_t *nvidia_dev_dead(NvU32 gpu_id)
+{
+ nv_linux_state_t *nvl;
+ atomic_t *ret;
+
+ /* Takes nvl->ldata_lock */
+ nvl = find_gpu_id(gpu_id);
+ if (!nvl)
+ return NV_FALSE;
+
+ ret = &nvl->dead;
+ up(&nvl->ldata_lock);
+
+ return ret;
+}
+
</span> /*
* Like nvidia_dev_get but uses UUID instead of gpu_id. Note that this may
* trigger initialization and teardown of unrelated devices to look up their
<span class="gh">diff -ur original/nvidia/nv-modeset-interface.c patched/nvidia/nv-modeset-interface.c
</span><span class="gd">--- original/nvidia/nv-modeset-interface.c 2018-08-22 00:55:22.000000000 +0000
</span><span class="gi">+++ patched/nvidia/nv-modeset-interface.c 2018-10-28 07:20:25.959243110 +0000
</span><span class="p">@@ -114,6 +114,7 @@</span>
.close_gpu = nvidia_dev_put,
.op = rm_kernel_rmapi_op, /* provided by nv-kernel.o */
.set_callbacks = nvidia_modeset_set_callbacks,
<span class="gi">+ .gpu_dead = nvidia_dev_dead,
</span> };
if (strcmp(rm_ops->version_string, NV_VERSION_STRING) != 0)
<span class="gh">diff -ur original/nvidia/nv-reg.h patched/nvidia/nv-reg.h
diff -ur original/nvidia-modeset/nvidia-modeset-linux.c patched/nvidia-modeset/nvidia-modeset-linux.c
</span><span class="gd">--- original/nvidia-modeset/nvidia-modeset-linux.c 2018-09-23 12:20:02.000000000 +0000
</span><span class="gi">+++ patched/nvidia-modeset/nvidia-modeset-linux.c 2018-10-28 07:47:14.738703417 +0000
</span><span class="p">@@ -75,6 +75,9 @@</span>
static struct semaphore nvkms_lock;
<span class="gi">+static NvU32 clopen_gpu_id;
+static NvBool leak_on_unload;
+
</span> /*************************************************************************
* NVKMS executes queued work items on a single kthread.
*************************************************************************/
<span class="p">@@ -89,6 +92,9 @@</span>
struct nvkms_per_open {
void *data;
<span class="gi">+ NvU32 gpu_id;
+ atomic_t *gpu_dead;
+
</span> enum NvKmsClientType type;
union {
<span class="p">@@ -711,6 +717,9 @@</span>
nvidia_modeset_stack_ptr stack = NULL;
NvBool ret;
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "nvkms_open_gpu called with %08x, pid %d\n",
+ gpuId, current->pid);
+
</span> if (__rm_ops.alloc_stack(&stack) != 0) {
return NV_FALSE;
}
<span class="p">@@ -719,6 +728,10 @@</span>
__rm_ops.free_stack(stack);
<span class="gi">+ if (ret) {
+ clopen_gpu_id = gpuId;
+ }
+
</span> return ret;
}
<span class="p">@@ -726,12 +739,17 @@</span>
{
nvidia_modeset_stack_ptr stack = NULL;
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "nvkms_close_gpu called with %08x, pid %d\n",
+ gpuId, current->pid);
+
</span> if (__rm_ops.alloc_stack(&stack) != 0) {
return;
}
__rm_ops.close_gpu(gpuId, stack);
<span class="gi">+ clopen_gpu_id = gpuId;
+
</span> __rm_ops.free_stack(stack);
}
<span class="p">@@ -771,8 +789,14 @@</span>
popen->type = type;
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "entering nvkms_open_common, pid %d\n",
+ current->pid);
+
</span> *status = down_interruptible(&nvkms_lock);
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "taken lock in nvkms_open_common, pid %d\n",
+ current->pid);
+
</span> if (*status != 0) {
goto failed;
}
<span class="p">@@ -781,6 +805,9 @@</span>
up(&nvkms_lock);
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "given up lock in nvkms_open_common, pid %d\n",
+ current->pid);
+
</span> if (popen->data == NULL) {
*status = -EPERM;
goto failed;
<span class="p">@@ -799,10 +826,16 @@</span>
*status = 0;
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "exiting in nvkms_open_common, pid %d\n",
+ current->pid);
+
</span> return popen;
failed:
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "error in nvkms_open_common, pid %d\n",
+ current->pid);
+
</span> nvkms_free(popen, sizeof(*popen));
return NULL;
<span class="p">@@ -816,14 +849,36 @@</span>
* mutex.
*/
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "entering nvkms_close_common, pid %d\n",
+ current->pid);
+
</span> down(&nvkms_lock);
<span class="gd">- nvKmsClose(popen->data);
</span><span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "taken lock in nvkms_close_common, pid %d\n",
+ current->pid);
+
+ if (popen->gpu_id != 0 && atomic_read(popen->gpu_dead) != 0) {
+ printk(KERN_ERR NVKMS_LOG_PREFIX "awwww u need cleanup :3 "
+ "in nvkms_close_common, pid %d\n",
+ current->pid);
+
+ nvkms_close_gpu(popen->gpu_id);
+
+ popen->gpu_id = 0;
+ popen->gpu_dead = NULL;
+
+ leak_on_unload = NV_TRUE;
+ } else {
+ nvKmsClose(popen->data);
+ }
</span>
popen->data = NULL;
up(&nvkms_lock);
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "given up lock in nvkms_close_common, pid %d\n",
+ current->pid);
+
</span> if (popen->type == NVKMS_CLIENT_KERNEL_SPACE) {
/*
* Flush any outstanding nvkms_kapi_event_kthread_q_callback() work
<span class="p">@@ -844,6 +899,9 @@</span>
}
nvkms_free(popen, sizeof(*popen));
<span class="gi">+
+ printk(KERN_INFO NVKMS_LOG_PREFIX "exiting nvkms_close_common, pid %d\n",
+ current->pid);
</span> }
int NVKMS_API_CALL nvkms_ioctl_common
<span class="p">@@ -855,20 +913,58 @@</span>
int status;
NvBool ret;
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "entering nvkms_ioctl_common, pid %d\n",
+ current->pid);
+
</span> status = down_interruptible(&nvkms_lock);
if (status != 0) {
return status;
}
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "taken lock in nvkms_ioctl_common, pid %d\n",
+ current->pid);
+
+ if (popen->gpu_id != 0 && atomic_read(popen->gpu_dead) != 0) {
+ goto dead;
+ }
+
+ clopen_gpu_id = 0;
+
</span> if (popen->data != NULL) {
ret = nvKmsIoctl(popen->data, cmd, address, size);
} else {
ret = NV_FALSE;
}
<span class="gi">+ if (clopen_gpu_id != 0) {
+ if (!popen->gpu_id) {
+ printk(KERN_INFO NVKMS_LOG_PREFIX "detected gpu %08x open in nvkms_ioctl_common, "
+ "pid %d\n", clopen_gpu_id, current->pid);
+ popen->gpu_id = clopen_gpu_id;
+ popen->gpu_dead = __rm_ops.gpu_dead(clopen_gpu_id);
+ } else {
+ printk(KERN_INFO NVKMS_LOG_PREFIX "detected gpu %08x close in nvkms_ioctl_common, "
+ "pid %d\n", clopen_gpu_id, current->pid);
+ popen->gpu_id = 0;
+ popen->gpu_dead = NULL;
+ }
+ }
+
</span> up(&nvkms_lock);
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "given up lock in nvkms_ioctl_common, pid %d\n",
+ current->pid);
+
</span> return ret ? 0 : -EPERM;
<span class="gi">+
+dead:
+ up(&nvkms_lock);
+
+ printk(KERN_ERR NVKMS_LOG_PREFIX "*notices ur gpu is dead* owo whats this "
+ "in nvkms_ioctl_common, pid %d\n",
+ current->pid);
+
+ return -ENOENT;
</span> }
/*************************************************************************
<span class="p">@@ -1239,9 +1335,14 @@</span>
nvkms_proc_exit();
<span class="gd">- down(&nvkms_lock);
- nvKmsModuleUnload();
- up(&nvkms_lock);
</span><span class="gi">+ if(leak_on_unload) {
+ printk(KERN_ERR NVKMS_LOG_PREFIX "im just gonna leak all the kms junk ok? "
+ "haha nvm wasnt a question. in nvkms_exit\n");
+ } else {
+ down(&nvkms_lock);
+ nvKmsModuleUnload();
+ up(&nvkms_lock);
+ }
</span>
/*
* At this point, any pending tasks should be marked canceled, but
</pre></td>
</tr></tbody></table></code></pre></figure>
<p>Here’s some handy scripts I was using while debugging it:</p>
<figure><figcaption>insmod.sh</figcaption><pre><code class="language-sh"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td>
<td class="rouge-code"><pre><span class="c">#!/bin/sh -ex</span>
modprobe acpi_ipmi
insmod nvidia.ko <span class="nv">NVreg_ResmanDebugLevel</span><span class="o">=</span><span class="nt">-1</span> <span class="nv">NVreg_CheckPCIConfigSpace</span><span class="o">=</span>0
insmod nvidia-modeset.ko
dmesg <span class="nt">-w</span>
</pre></td>
</tr></tbody></table></code></pre></figure>
<figure><figcaption>rmmod.sh</figcaption><pre><code class="language-sh"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td>
<td class="rouge-code"><pre><span class="c">#!/bin/sh</span>
rmmod nvidia-modeset
rmmod nvidia
</pre></td>
</tr></tbody></table></code></pre></figure>
<figure><figcaption>xorg.sh</figcaption><pre><code class="language-sh"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td>
<td class="rouge-code"><pre><span class="c">#!/bin/sh</span>
<span class="nb">exec </span>Xorg :8 <span class="nt">-config</span> /etc/bumblebee/xorg.conf.nvidia <span class="nt">-configdir</span> /etc/bumblebee/xorg.conf.d <span class="nt">-sharevts</span> <span class="nt">-nolisten</span> tcp <span class="nt">-noreset</span> <span class="nt">-verbose</span> 3 <span class="nt">-isolateDevice</span> PCI:06:00:0 <span class="nt">-modulepath</span> /usr/lib/nvidia/nvidia,/usr/lib/xorg/modules
</pre></td>
</tr></tbody></table></code></pre></figure>
<p>And finally, here are the relevant kernel and Xorg log messages, showing what happens when a GPU is unplugged:</p>
<figure><figcaption>dmesg.log</figcaption><pre><code class="language-text"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
</pre></td>
<td class="rouge-code"><pre>[ 219.524218] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 390.87 Tue Aug 21 12:33:05 PDT 2018 (using threaded interrupts)
[ 219.527409] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 390.87 Tue Aug 21 16:16:14 PDT 2018
[ 224.780721] nvidia-modeset: nvkms_open_gpu called with 00000600, pid 4560
[ 224.807370] nvidia-modeset: detected gpu 00000600 open in nvkms_ioctl_common, pid 4560
[ 239.061383] NVRM: GPU at PCI:0000:06:00: GPU-9fe1319c-8dd3-44e4-2b74-de93f8b02c6a
[ 239.061387] NVRM: Xid (PCI:0000:06:00): 79, GPU has fallen off the bus.
[ 239.061389] NVRM: GPU at 0000:06:00.0 has fallen off the bus.
[ 239.061398] NVRM: A GPU crash dump has been created. If possible, please run
NVRM: nvidia-bug-report.sh as root to collect this data before
NVRM: the NVIDIA kernel module is unloaded.
[ 240.209498] NVRM: Attempting to remove minor device 0 with non-zero usage count!
[ 240.209501] NVRM: YOLO, waiting for usage count to drop to zero
[ 241.433499] nvidia-modeset: *notices ur gpu is dead* owo whats this in nvkms_ioctl_common, pid 4560
[ 241.433851] nvidia-modeset: awwww u need cleanup :3 in nvkms_close_common, pid 4560
[ 241.433853] nvidia-modeset: nvkms_close_gpu called with 00000600, pid 4560
[ 250.440498] NVRM: Usage count is now zero, proceeding to remove the GPU
[ 250.440513] NVRM: This is not actually supposed to work lol. Hope it does tho 👍
[ 250.440520] NVRM: You probably want to reload nvidia-modeset now if you want any of this to ever start up again, but like, man, that's your choice entirely
[ 250.440870] pci 0000:06:00.1: Dropping the link to 0000:06:00.0
[ 250.440950] pci_bus 0000:06: busn_res: [bus 06] is released
[ 250.440982] pci_bus 0000:07: busn_res: [bus 07-38] is released
[ 250.441012] pci_bus 0000:05: busn_res: [bus 05-38] is released
[ 251.000794] pci_bus 0000:02: Allocating resources
[ 251.001324] pci_bus 0000:02: Allocating resources
[ 253.765953] pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
[ 253.765969] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 253.765976] pcieport 0000:00:1c.0: device [8086:9d10] error status/mask=00002001/00002000
[ 253.765982] pcieport 0000:00:1c.0: [ 0] Receiver Error (First)
[ 253.841064] pcieport 0000:02:02.0: Refused to change power state, currently in D3
[ 253.843882] pcieport 0000:02:00.0: Refused to change power state, currently in D3
[ 253.846177] pci_bus 0000:03: busn_res: [bus 03] is released
[ 253.846248] pci_bus 0000:04: busn_res: [bus 04-38] is released
[ 253.846300] pci_bus 0000:39: busn_res: [bus 39] is released
[ 253.846348] pci_bus 0000:02: busn_res: [bus 02-39] is released
[ 353.369487] nvidia-modeset: im just gonna leak all the kms junk ok? haha nvm wasnt a question. in nvkms_exit
[ 357.600350] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 390.87 Tue Aug 21 16:16:14 PDT 2018
</pre></td>
</tr></tbody></table></code></pre></figure>
<figure><figcaption>Xorg.8.log</figcaption><pre><code class="language-text"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
</pre></td>
<td class="rouge-code"><pre>[ 244.798] (EE) NVIDIA(GPU-0): WAIT (2, 8, 0x8000, 0x000011f4, 0x00001210)
</pre></td>
</tr></tbody></table></code></pre></figure>
<p>Recently, I’ve using an extremely cursed setup where my XPS 13 9360 laptop is connected to a Sonnet EchoExpress 2 box rewired for Thunderbolt 3 that has an nVidia Quadro 600 GPU, and Linux is set up for render offload to the eGPU and then frame transfer back to iGPU to be displayed on the laptop’s integrated display, which (to my sheer surprise) not only works quire reliably, but even gives me higher FPS in Team Fortress 2 than the iGPU.</p>
<p>There’s only really one downside: if the eGPU falls off the bus, either because someone™ pulled out the cable, or because the stars didn’t align quite right this morning and it decided to enumerate seemingly at random (sometimes this is preceeded by whining from PCIe AER, sometimes not, I <em>think</em> it’s some sort of hardware issue like a badly inserted PCIe card, but I’m not entirely sure), the nVidia driver… hangs. Hangs quite deliberately, as the sources to the kernel driver show. This leaves the Xorg instance bound to the eGPU hung forever (which confuses bumblebee, but is otherwise not especially bad), and also prevents any new ones from using the eGPU (which is bad).</p>
<p>Anyway, I was kind of annoyed of rebooting every time it happens, so I decided to reboot a few more dozen times instead while patching the driver. This has indeed worked, and left me with something similar to a functional hot-unplug, mildly crippled by the fact that nvidia-modeset is a completely opaque blob that keeps some internal state and tries to act on it, getting stuck when it tries to do something to the now-missing eGPU.</p>
<p>Turns out, there are only a few issues preventing functional hot-unplug.</p>
<ol>
<li>
<p>In <code>nvidia_remove</code>, the driver actually checks if anyone’s still trying to use it, and if yes, it tries to just hang the removal process. This doesn’t actually work, or rather, it mostly works by accident. It starts an infinite loop calling <code>os_schedule()</code> while having taken the <code>NV_LINUX_DEVICES</code> lock. While in the default configuration this indeed hangs any reentrant requests into the driver by virtue of <code>NV_CHECK_PCI_CONFIG_SPACE</code> taking the same lock (in <code>verify_pci_bars</code>, passing the <code>NVreg_CheckPCIConfigSpace=0</code> module option eliminates that accidental safety mechanism, and allows reentrant requests to proceed. They do not crash due to memory being deallocated in <code>nvidia_remove</code> (so you don’t get an unhandled kernel page fault), but they still crash due to being unable to access the GPU.</p>
</li>
<li>
<p>The NVKMS component (in the <code>nvidia-modeset</code> module) tries to maintain some state, and change it when e.g. the Xorg instance quits and closes the <code>/dev/nvidia-modeset</code> file. Unfortunately, it does not expect the GPU to go away, and first spews a few messages to <code>dmesg</code> similar to <code>nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000857d:0:0:0x0000000f</code>, after which it appears to hang somewhere inside the blob, which has been conveniently stripped of all symbols. This needs to be prevented, but…</p>
</li>
<li>
<p>The NVKMS component effectively only exposes a single opaque ioctl, and all the communication, including communication of the GPU bus ID, happens out of band with regards to the open source parts of the <code>nvidia-modeset</code> module. Fortunately, NVKMS calls back into NVRM, and this allows us to associate each <code>/dev/nvidia-modeset</code> fd with the GPU bus ID.</p>
</li>
<li>
<p>When unloading NVKMS, it also tries to act on its internal state and change the GPU state, which leads to the same hang.</p>
</li>
</ol>
<p>All in all, this allows a patch to be written that detects when a GPU goes away, ignores all further NVKMS requests related to that specific GPU (and returns <code>-ENOENT</code> in response to ioctls, which Xorg appropriately interprets as a fault condition), correctly releases the resources by requesting NVRM, and improperly unloads NVKMS so it doesn’t try to reset the GPU state. (All actual resources should be released by this point, and NVKMS doesn’t have any resource allocation callbacks other than those we already intercept, so <em>in theory</em> this doesn’t have any bad consequences. But I’m not working for nVidia, so this might be completely wrong.)</p>
<p>After the GPU is plugged back in, NVKMS will try to act on its internal state again; in this case, it doesn’t hang, but it doesn’t initialize the GPU correctly either, so the <code>nvidia-modeset</code> kernel module has to be (manually) reloaded. It’s not easy to do this automatically because in a hypothetical system with more than one nVidia GPU the module would still be in use when one of them dies, and so just hard reloading NVKMS would have unfortunate consequences. (Though, I don’t really know whether NVKMS would try to access the dead GPU in response to the request acting on the other GPU anyway. I decided to do it conservatively.) Once it’s reloaded you’re back in the game though!</p>
<p>Here’s the patch, written against the <code>nvidia-legacy-390xx-390.87</code> Debian source package:</p>
<figure><figcaption>nvidia-hot-gpu-on-gpu-unplug-action.patch (<a href="/files/nvidia-hot-gpu-on-gpu-unplug-action.patch">download</a>)</figcaption><pre><code class="language-diff"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
</pre></td>
<td class="rouge-code"><pre><span class="gh">diff -ur original/common/inc/nv-linux.h patchedl/common/inc/nv-linux.h
</span><span class="gd">--- original/common/inc/nv-linux.h 2018-09-23 12:20:02.000000000 +0000
</span><span class="gi">+++ patched/common/inc/nv-linux.h 2018-10-28 07:19:21.526566940 +0000
</span><span class="p">@@ -1465,6 +1465,7 @@</span>
typedef struct nv_linux_state_s {
nv_state_t nv_state;
atomic_t usage_count;
<span class="gi">+ atomic_t dead;
</span>
struct pci_dev *dev;
<span class="gh">diff -ur original/common/inc/nv-modeset-interface.h patched/common/inc/nv-modeset-interface.h
</span><span class="gd">--- original/common/inc/nv-modeset-interface.h 2018-08-22 00:55:23.000000000 +0000
</span><span class="gi">+++ patched/common/inc/nv-modeset-interface.h 2018-10-28 07:22:00.768238371 +0000
</span><span class="p">@@ -25,6 +25,8 @@</span>
#include "nv-gpu-info.h"
<span class="gi">+#include <asm/atomic.h>
+
</span> /*
* nvidia_modeset_rm_ops_t::op gets assigned a function pointer from
* core RM, which uses the calling convention of arguments on the
<span class="p">@@ -115,6 +117,8 @@</span>
int (*set_callbacks)(const nvidia_modeset_callbacks_t *cb);
<span class="gi">+ atomic_t * (*gpu_dead)(NvU32 gpu_id);
+
</span> } nvidia_modeset_rm_ops_t;
NV_STATUS nvidia_get_rm_ops(nvidia_modeset_rm_ops_t *rm_ops);
<span class="gh">diff -ur original/common/inc/nv-proto.h patched/common/inc/nv-proto.h
</span><span class="gd">--- original/common/inc/nv-proto.h 2018-08-22 00:55:23.000000000 +0000
</span><span class="gi">+++ patched/common/inc/nv-proto.h 2018-10-28 07:20:49.939494812 +0000
</span><span class="p">@@ -81,6 +81,7 @@</span>
NvBool nvidia_get_gpuid_list (NvU32 *gpu_ids, NvU32 *gpu_count);
int nvidia_dev_get (NvU32, nvidia_stack_t *);
void nvidia_dev_put (NvU32, nvidia_stack_t *);
<span class="gi">+atomic_t * nvidia_dev_dead (NvU32);
</span> int nvidia_dev_get_uuid (const NvU8 *, nvidia_stack_t *);
void nvidia_dev_put_uuid (const NvU8 *, nvidia_stack_t *);
int nvidia_dev_get_pci_info (const NvU8 *, struct pci_dev **, NvU64 *, NvU64 *);
<span class="gh">diff -ur original/nvidia/nv.c patched/nvidia/nv.c
</span><span class="gd">--- original/nvidia/nv.c 2018-09-23 12:20:02.000000000 +0000
</span><span class="gi">+++ patched/nvidia/nv.c 2018-10-28 07:48:05.895025112 +0000
</span><span class="p">@@ -1944,6 +1944,12 @@</span>
unsigned int i;
NvBool bRemove = NV_FALSE;
<span class="gi">+ if (NV_ATOMIC_READ(nvl->dead))
+ {
+ nv_printf(NV_DBG_ERRORS, "NVRM: nvidia_close called on dead device by pid %d!\n",
+ current->pid);
+ }
+
</span> NV_CHECK_PCI_CONFIG_SPACE(sp, nv, TRUE, TRUE, NV_MAY_SLEEP());
/* for control device, just jump to its open routine */
<span class="p">@@ -2106,6 +2112,12 @@</span>
size_t arg_size;
int arg_cmd;
<span class="gi">+ if (NV_ATOMIC_READ(nvl->dead))
+ {
+ nv_printf(NV_DBG_ERRORS, "NVRM: nvidia_ioctl called on dead device by pid %d!\n",
+ current->pid);
+ }
+
</span> nv_printf(NV_DBG_INFO, "NVRM: ioctl(0x%x, 0x%x, 0x%x)\n",
_IOC_NR(cmd), (unsigned int) i_arg, _IOC_SIZE(cmd));
<span class="p">@@ -3217,6 +3229,7 @@</span>
NV_INIT_MUTEX(&nvl->ldata_lock);
NV_ATOMIC_SET(nvl->usage_count, 0);
<span class="gi">+ NV_ATOMIC_SET(nvl->dead, 0);
</span>
if (!rm_init_event_locks(sp, nv))
return NV_FALSE;
<span class="p">@@ -4018,14 +4031,38 @@</span>
nv_printf(NV_DBG_ERRORS,
"NVRM: Attempting to remove minor device %u with non-zero usage count!\n",
nvl->minor_num);
<span class="gi">+ nv_printf(NV_DBG_ERRORS,
+ "NVRM: YOLO, waiting for usage count to drop to zero\n");
</span> WARN_ON(1);
<span class="gd">- /* We can't continue without corrupting state, so just hang to give the
- * user some chance to do something about this before reboot */
- while (1)
</span><span class="gi">+ NV_ATOMIC_SET(nvl->dead, 1);
+
+ /* Insanity check: wait until all clients die, then hope for the best. */
+ while (1) {
+ UNLOCK_NV_LINUX_DEVICES();
</span> os_schedule();
<span class="gd">- }
</span><span class="gi">+ LOCK_NV_LINUX_DEVICES();
+
+ nvl = pci_get_drvdata(dev);
+ if (!nvl || (nvl->dev != dev))
+ {
+ goto done;
+ }
+
+ if (NV_ATOMIC_READ(nvl->usage_count) == 0)
+ {
+ break;
+ }
+ }
</span>
<span class="gi">+ nv_printf(NV_DBG_ERRORS,
+ "NVRM: Usage count is now zero, proceeding to remove the GPU\n");
+ nv_printf(NV_DBG_ERRORS,
+ "NVRM: This is not actually supposed to work lol. Hope it does tho 👍\n");
+ nv_printf(NV_DBG_ERRORS,
+ "NVRM: You probably want to reload nvidia-modeset now if you want any "
+ "of this to ever start up again, but like, man, that's your choice entirely\n");
+ }
</span> nv = NV_STATE_PTR(nvl);
if (nvl == nv_linux_devices)
nv_linux_devices = nvl->next;
<span class="p">@@ -4712,6 +4749,22 @@</span>
up(&nvl->ldata_lock);
}
<span class="gi">+atomic_t *nvidia_dev_dead(NvU32 gpu_id)
+{
+ nv_linux_state_t *nvl;
+ atomic_t *ret;
+
+ /* Takes nvl->ldata_lock */
+ nvl = find_gpu_id(gpu_id);
+ if (!nvl)
+ return NV_FALSE;
+
+ ret = &nvl->dead;
+ up(&nvl->ldata_lock);
+
+ return ret;
+}
+
</span> /*
* Like nvidia_dev_get but uses UUID instead of gpu_id. Note that this may
* trigger initialization and teardown of unrelated devices to look up their
<span class="gh">diff -ur original/nvidia/nv-modeset-interface.c patched/nvidia/nv-modeset-interface.c
</span><span class="gd">--- original/nvidia/nv-modeset-interface.c 2018-08-22 00:55:22.000000000 +0000
</span><span class="gi">+++ patched/nvidia/nv-modeset-interface.c 2018-10-28 07:20:25.959243110 +0000
</span><span class="p">@@ -114,6 +114,7 @@</span>
.close_gpu = nvidia_dev_put,
.op = rm_kernel_rmapi_op, /* provided by nv-kernel.o */
.set_callbacks = nvidia_modeset_set_callbacks,
<span class="gi">+ .gpu_dead = nvidia_dev_dead,
</span> };
if (strcmp(rm_ops->version_string, NV_VERSION_STRING) != 0)
<span class="gh">diff -ur original/nvidia/nv-reg.h patched/nvidia/nv-reg.h
diff -ur original/nvidia-modeset/nvidia-modeset-linux.c patched/nvidia-modeset/nvidia-modeset-linux.c
</span><span class="gd">--- original/nvidia-modeset/nvidia-modeset-linux.c 2018-09-23 12:20:02.000000000 +0000
</span><span class="gi">+++ patched/nvidia-modeset/nvidia-modeset-linux.c 2018-10-28 07:47:14.738703417 +0000
</span><span class="p">@@ -75,6 +75,9 @@</span>
static struct semaphore nvkms_lock;
<span class="gi">+static NvU32 clopen_gpu_id;
+static NvBool leak_on_unload;
+
</span> /*************************************************************************
* NVKMS executes queued work items on a single kthread.
*************************************************************************/
<span class="p">@@ -89,6 +92,9 @@</span>
struct nvkms_per_open {
void *data;
<span class="gi">+ NvU32 gpu_id;
+ atomic_t *gpu_dead;
+
</span> enum NvKmsClientType type;
union {
<span class="p">@@ -711,6 +717,9 @@</span>
nvidia_modeset_stack_ptr stack = NULL;
NvBool ret;
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "nvkms_open_gpu called with %08x, pid %d\n",
+ gpuId, current->pid);
+
</span> if (__rm_ops.alloc_stack(&stack) != 0) {
return NV_FALSE;
}
<span class="p">@@ -719,6 +728,10 @@</span>
__rm_ops.free_stack(stack);
<span class="gi">+ if (ret) {
+ clopen_gpu_id = gpuId;
+ }
+
</span> return ret;
}
<span class="p">@@ -726,12 +739,17 @@</span>
{
nvidia_modeset_stack_ptr stack = NULL;
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "nvkms_close_gpu called with %08x, pid %d\n",
+ gpuId, current->pid);
+
</span> if (__rm_ops.alloc_stack(&stack) != 0) {
return;
}
__rm_ops.close_gpu(gpuId, stack);
<span class="gi">+ clopen_gpu_id = gpuId;
+
</span> __rm_ops.free_stack(stack);
}
<span class="p">@@ -771,8 +789,14 @@</span>
popen->type = type;
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "entering nvkms_open_common, pid %d\n",
+ current->pid);
+
</span> *status = down_interruptible(&nvkms_lock);
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "taken lock in nvkms_open_common, pid %d\n",
+ current->pid);
+
</span> if (*status != 0) {
goto failed;
}
<span class="p">@@ -781,6 +805,9 @@</span>
up(&nvkms_lock);
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "given up lock in nvkms_open_common, pid %d\n",
+ current->pid);
+
</span> if (popen->data == NULL) {
*status = -EPERM;
goto failed;
<span class="p">@@ -799,10 +826,16 @@</span>
*status = 0;
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "exiting in nvkms_open_common, pid %d\n",
+ current->pid);
+
</span> return popen;
failed:
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "error in nvkms_open_common, pid %d\n",
+ current->pid);
+
</span> nvkms_free(popen, sizeof(*popen));
return NULL;
<span class="p">@@ -816,14 +849,36 @@</span>
* mutex.
*/
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "entering nvkms_close_common, pid %d\n",
+ current->pid);
+
</span> down(&nvkms_lock);
<span class="gd">- nvKmsClose(popen->data);
</span><span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "taken lock in nvkms_close_common, pid %d\n",
+ current->pid);
+
+ if (popen->gpu_id != 0 && atomic_read(popen->gpu_dead) != 0) {
+ printk(KERN_ERR NVKMS_LOG_PREFIX "awwww u need cleanup :3 "
+ "in nvkms_close_common, pid %d\n",
+ current->pid);
+
+ nvkms_close_gpu(popen->gpu_id);
+
+ popen->gpu_id = 0;
+ popen->gpu_dead = NULL;
+
+ leak_on_unload = NV_TRUE;
+ } else {
+ nvKmsClose(popen->data);
+ }
</span>
popen->data = NULL;
up(&nvkms_lock);
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "given up lock in nvkms_close_common, pid %d\n",
+ current->pid);
+
</span> if (popen->type == NVKMS_CLIENT_KERNEL_SPACE) {
/*
* Flush any outstanding nvkms_kapi_event_kthread_q_callback() work
<span class="p">@@ -844,6 +899,9 @@</span>
}
nvkms_free(popen, sizeof(*popen));
<span class="gi">+
+ printk(KERN_INFO NVKMS_LOG_PREFIX "exiting nvkms_close_common, pid %d\n",
+ current->pid);
</span> }
int NVKMS_API_CALL nvkms_ioctl_common
<span class="p">@@ -855,20 +913,58 @@</span>
int status;
NvBool ret;
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "entering nvkms_ioctl_common, pid %d\n",
+ current->pid);
+
</span> status = down_interruptible(&nvkms_lock);
if (status != 0) {
return status;
}
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "taken lock in nvkms_ioctl_common, pid %d\n",
+ current->pid);
+
+ if (popen->gpu_id != 0 && atomic_read(popen->gpu_dead) != 0) {
+ goto dead;
+ }
+
+ clopen_gpu_id = 0;
+
</span> if (popen->data != NULL) {
ret = nvKmsIoctl(popen->data, cmd, address, size);
} else {
ret = NV_FALSE;
}
<span class="gi">+ if (clopen_gpu_id != 0) {
+ if (!popen->gpu_id) {
+ printk(KERN_INFO NVKMS_LOG_PREFIX "detected gpu %08x open in nvkms_ioctl_common, "
+ "pid %d\n", clopen_gpu_id, current->pid);
+ popen->gpu_id = clopen_gpu_id;
+ popen->gpu_dead = __rm_ops.gpu_dead(clopen_gpu_id);
+ } else {
+ printk(KERN_INFO NVKMS_LOG_PREFIX "detected gpu %08x close in nvkms_ioctl_common, "
+ "pid %d\n", clopen_gpu_id, current->pid);
+ popen->gpu_id = 0;
+ popen->gpu_dead = NULL;
+ }
+ }
+
</span> up(&nvkms_lock);
<span class="gi">+ printk(KERN_INFO NVKMS_LOG_PREFIX "given up lock in nvkms_ioctl_common, pid %d\n",
+ current->pid);
+
</span> return ret ? 0 : -EPERM;
<span class="gi">+
+dead:
+ up(&nvkms_lock);
+
+ printk(KERN_ERR NVKMS_LOG_PREFIX "*notices ur gpu is dead* owo whats this "
+ "in nvkms_ioctl_common, pid %d\n",
+ current->pid);
+
+ return -ENOENT;
</span> }
/*************************************************************************
<span class="p">@@ -1239,9 +1335,14 @@</span>
nvkms_proc_exit();
<span class="gd">- down(&nvkms_lock);
- nvKmsModuleUnload();
- up(&nvkms_lock);
</span><span class="gi">+ if(leak_on_unload) {
+ printk(KERN_ERR NVKMS_LOG_PREFIX "im just gonna leak all the kms junk ok? "
+ "haha nvm wasnt a question. in nvkms_exit\n");
+ } else {
+ down(&nvkms_lock);
+ nvKmsModuleUnload();
+ up(&nvkms_lock);
+ }
</span>
/*
* At this point, any pending tasks should be marked canceled, but
</pre></td>
</tr></tbody></table></code></pre></figure>
<p>Here’s some handy scripts I was using while debugging it:</p>
<figure><figcaption>insmod.sh</figcaption><pre><code class="language-sh"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td>
<td class="rouge-code"><pre><span class="c">#!/bin/sh -ex</span>
modprobe acpi_ipmi
insmod nvidia.ko <span class="nv">NVreg_ResmanDebugLevel</span><span class="o">=</span><span class="nt">-1</span> <span class="nv">NVreg_CheckPCIConfigSpace</span><span class="o">=</span>0
insmod nvidia-modeset.ko
dmesg <span class="nt">-w</span>
</pre></td>
</tr></tbody></table></code></pre></figure>
<figure><figcaption>rmmod.sh</figcaption><pre><code class="language-sh"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td>
<td class="rouge-code"><pre><span class="c">#!/bin/sh</span>
rmmod nvidia-modeset
rmmod nvidia
</pre></td>
</tr></tbody></table></code></pre></figure>
<figure><figcaption>xorg.sh</figcaption><pre><code class="language-sh"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td>
<td class="rouge-code"><pre><span class="c">#!/bin/sh</span>
<span class="nb">exec </span>Xorg :8 <span class="nt">-config</span> /etc/bumblebee/xorg.conf.nvidia <span class="nt">-configdir</span> /etc/bumblebee/xorg.conf.d <span class="nt">-sharevts</span> <span class="nt">-nolisten</span> tcp <span class="nt">-noreset</span> <span class="nt">-verbose</span> 3 <span class="nt">-isolateDevice</span> PCI:06:00:0 <span class="nt">-modulepath</span> /usr/lib/nvidia/nvidia,/usr/lib/xorg/modules
</pre></td>
</tr></tbody></table></code></pre></figure>
<p>And finally, here are the relevant kernel and Xorg log messages, showing what happens when a GPU is unplugged:</p>
<figure><figcaption>dmesg.log</figcaption><pre><code class="language-text"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
</pre></td>
<td class="rouge-code"><pre>[ 219.524218] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 390.87 Tue Aug 21 12:33:05 PDT 2018 (using threaded interrupts)
[ 219.527409] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 390.87 Tue Aug 21 16:16:14 PDT 2018
[ 224.780721] nvidia-modeset: nvkms_open_gpu called with 00000600, pid 4560
[ 224.807370] nvidia-modeset: detected gpu 00000600 open in nvkms_ioctl_common, pid 4560
[ 239.061383] NVRM: GPU at PCI:0000:06:00: GPU-9fe1319c-8dd3-44e4-2b74-de93f8b02c6a
[ 239.061387] NVRM: Xid (PCI:0000:06:00): 79, GPU has fallen off the bus.
[ 239.061389] NVRM: GPU at 0000:06:00.0 has fallen off the bus.
[ 239.061398] NVRM: A GPU crash dump has been created. If possible, please run
NVRM: nvidia-bug-report.sh as root to collect this data before
NVRM: the NVIDIA kernel module is unloaded.
[ 240.209498] NVRM: Attempting to remove minor device 0 with non-zero usage count!
[ 240.209501] NVRM: YOLO, waiting for usage count to drop to zero
[ 241.433499] nvidia-modeset: *notices ur gpu is dead* owo whats this in nvkms_ioctl_common, pid 4560
[ 241.433851] nvidia-modeset: awwww u need cleanup :3 in nvkms_close_common, pid 4560
[ 241.433853] nvidia-modeset: nvkms_close_gpu called with 00000600, pid 4560
[ 250.440498] NVRM: Usage count is now zero, proceeding to remove the GPU
[ 250.440513] NVRM: This is not actually supposed to work lol. Hope it does tho 👍
[ 250.440520] NVRM: You probably want to reload nvidia-modeset now if you want any of this to ever start up again, but like, man, that's your choice entirely
[ 250.440870] pci 0000:06:00.1: Dropping the link to 0000:06:00.0
[ 250.440950] pci_bus 0000:06: busn_res: [bus 06] is released
[ 250.440982] pci_bus 0000:07: busn_res: [bus 07-38] is released
[ 250.441012] pci_bus 0000:05: busn_res: [bus 05-38] is released
[ 251.000794] pci_bus 0000:02: Allocating resources
[ 251.001324] pci_bus 0000:02: Allocating resources
[ 253.765953] pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
[ 253.765969] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 253.765976] pcieport 0000:00:1c.0: device [8086:9d10] error status/mask=00002001/00002000
[ 253.765982] pcieport 0000:00:1c.0: [ 0] Receiver Error (First)
[ 253.841064] pcieport 0000:02:02.0: Refused to change power state, currently in D3
[ 253.843882] pcieport 0000:02:00.0: Refused to change power state, currently in D3
[ 253.846177] pci_bus 0000:03: busn_res: [bus 03] is released
[ 253.846248] pci_bus 0000:04: busn_res: [bus 04-38] is released
[ 253.846300] pci_bus 0000:39: busn_res: [bus 39] is released
[ 253.846348] pci_bus 0000:02: busn_res: [bus 02-39] is released
[ 353.369487] nvidia-modeset: im just gonna leak all the kms junk ok? haha nvm wasnt a question. in nvkms_exit
[ 357.600350] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 390.87 Tue Aug 21 16:16:14 PDT 2018
</pre></td>
</tr></tbody></table></code></pre></figure>
<figure><figcaption>Xorg.8.log</figcaption><pre><code class="language-text"><table class="rouge-table"><tbody><tr>
<td class="rouge-gutter gl"><pre class="lineno">1
</pre></td>
<td class="rouge-code"><pre>[ 244.798] (EE) NVIDIA(GPU-0): WAIT (2, 8, 0x8000, 0x000011f4, 0x00001210)
</pre></td>
</tr></tbody></table></code></pre></figure>
tag:lab.whitequark.org,2018-09-17:/notes/2018-09-17/game-boy-advance-cartridge-smc805-2/
Game Boy Advance cartridge "SMC805-2 VER:1.5"
2018-09-17T11:02:33Z
2018-09-17T11:02:33Z
<p>I have been asked to determine if a (pirate) Game Boy Advance cartridge with the PCB marked “SMC805-2 VER:1.5 2006.11.16” can be reflashed. The cartridge contains a battery, an ASIC in a chip-on-board package (epoxy blob), an unidentified Intel chip marked “RL0ZAA00” (possibly “RLOZAA00” or “RLOZAAOO”) “A5367952” “Z37LA59B” in a VFBGA-56 package, an ISSI (neé ICSI) static RAM <a href="/files/gba-cartridge/IS62LV1024.pdf">IS62LV1024LL-55H</a> in a TSOP-32 package, and miscellaneous passives.</p>
<p>Front side of the board with the cartridge pins annotated (<a href="/images/gba-cartridge/labelled.svg">zoomable version</a>):</p>
<object type="image/svg+xml" data="/images/gba-cartridge/labelled.svg"></object>
<p>It was immediately clear that the Intel chip has to be a Flash, it has a parallel interface, and by looking at it at a shallow angle to the PCB, it could be seen that the balls are laid out in a 7x8 grid. It was then a matter of a search query to discover that an Intel parallel flash in a BGA-56 package belongs to the 28F series, similar to <a href="/files/gba-cartridge/28F128.pdf">28F128Kxx</a>.</p>
<p>After desoldering the chip, I have traced the test points to the balls they are connected to, and the pinout matched the 28F series perfectly. Test point assignment (<a href="/images/gba-cartridge/labelled-mag.svg">zoomable version</a>):</p>
<object type="image/svg+xml" data="/images/gba-cartridge/labelled-mag.svg"></object>
<p>Note that the Flash interface is completely independent from the cartridge interface; every Flash interface data signal goes through the ASIC.</p>
<p>It is not clear if the Flash interface is tristated by the ASIC when not in use, or by some other method, such as the J1-J2 switch.</p>
<p>I have been asked to determine if a (pirate) Game Boy Advance cartridge with the PCB marked “SMC805-2 VER:1.5 2006.11.16” can be reflashed. The cartridge contains a battery, an ASIC in a chip-on-board package (epoxy blob), an unidentified Intel chip marked “RL0ZAA00” (possibly “RLOZAA00” or “RLOZAAOO”) “A5367952” “Z37LA59B” in a VFBGA-56 package, an ISSI (neé ICSI) static RAM <a href="/files/gba-cartridge/IS62LV1024.pdf">IS62LV1024LL-55H</a> in a TSOP-32 package, and miscellaneous passives.</p>
<p>Front side of the board with the cartridge pins annotated (<a href="/images/gba-cartridge/labelled.svg">zoomable version</a>):</p>
<object type="image/svg+xml" data="/images/gba-cartridge/labelled.svg"></object>
<p>It was immediately clear that the Intel chip has to be a Flash, it has a parallel interface, and by looking at it at a shallow angle to the PCB, it could be seen that the balls are laid out in a 7x8 grid. It was then a matter of a search query to discover that an Intel parallel flash in a BGA-56 package belongs to the 28F series, similar to <a href="/files/gba-cartridge/28F128.pdf">28F128Kxx</a>.</p>
<p>After desoldering the chip, I have traced the test points to the balls they are connected to, and the pinout matched the 28F series perfectly. Test point assignment (<a href="/images/gba-cartridge/labelled-mag.svg">zoomable version</a>):</p>
<object type="image/svg+xml" data="/images/gba-cartridge/labelled-mag.svg"></object>
<p>Note that the Flash interface is completely independent from the cartridge interface; every Flash interface data signal goes through the ASIC.</p>
<p>It is not clear if the Flash interface is tristated by the ASIC when not in use, or by some other method, such as the J1-J2 switch.</p>
tag:lab.whitequark.org,2018-08-02:/notes/2018-08-02/z144sn005-lcd-microphotography/
Z144SN005 LCD microphotography
2018-08-02T05:50:10Z
2018-08-02T05:50:10Z
<p>I’ve been provided a Z144SN005 LCD that has split in halves as a result of excess mechanical force. Z144SN005 is a ST7735S-based LCD organized as 128RGB×128; it has 384 sources and 128 gates. I took some microphotographs of it using an Amscope ME300TZ-2L-3M metallurgical microscope.</p>
<p>The following microphotographs of the front glass plate that contains color filters demonstrate the pixels lighting up:</p>
<p><a class="fancybox" rel="gal-front" href="/images/z144sn005-display/filters-trans-4x.png" title="4x magnification, transmitted light"><img src="/images/z144sn005-display/filters-trans-4x-thumb.png" title="4x magnification, transmitted light"></a>
<a class="fancybox" rel="gal-front" href="/images/z144sn005-display/filters-refl-4x.png" title="4x magnification, reflected light"><img src="/images/z144sn005-display/filters-refl-4x-thumb.png" title="4x magnification, reflected light"></a>
<a class="fancybox" rel="gal-front" href="/images/z144sn005-display/filters-trans-refl-4x.png" title="4x magnification, reflected and transmitted light"><img src="/images/z144sn005-display/filters-trans-refl-4x-thumb.png" title="4x magnification, reflected and transmitted light"></a>
<a class="fancybox" rel="gal-front" href="/images/z144sn005-display/filters-refl-10x.png" title="10x magnification, reflected light"><img src="/images/z144sn005-display/filters-refl-10x-thumb.png" title="10x magnification, reflected light"></a>
<a class="fancybox" rel="gal-front" href="/images/z144sn005-display/filters-trans-refl-10x.png" title="10x magnification, reflected and transmitted light"><img src="/images/z144sn005-display/filters-trans-refl-10x-thumb.png" title="10x magnification, reflected and transmitted light"></a></p>
<p>The following microphotographs (all taken in combined transmitted and reflected light for extra contrast) of the back glass plate show the circuitry driving the liquid crystals confined between the two glass plates.</p>
<p>First, row (gate) drivers. There’s 128 in total. Note the numbered LCD rows.</p>
<p><a class="fancybox" rel="gal-back-rows" href="/images/z144sn005-display/drivers-row-4x.png" title="4x magnification, row drivers"><img src="/images/z144sn005-display/drivers-row-4x-thumb.png" title="4x magnification, row drivers"></a>
<a class="fancybox" rel="gal-back-rows" href="/images/z144sn005-display/drivers-row-10x.png" title="10x magnification, row drivers"><img src="/images/z144sn005-display/drivers-row-10x-thumb.png" title="10x magnification, row drivers"></a></p>
<p>Second, column (source) drivers. There’s 384 in total, so the routing is far more dense. Note that the column drivers are located at the side of the display <em>far</em> from the controller, likely because of lack of space. Note the thick power distribution traces.</p>
<p><a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-1-4x.png" title="4x magnification, column structures in bottom right corner"><img src="/images/z144sn005-display/drivers-col-1-4x-thumb.png" title="4x magnification, column structures in bottom right corner"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-1-10x.png" title="10x magnification, column structures in bottom right corner"><img src="/images/z144sn005-display/drivers-col-1-10x-thumb.png" title="10x magnification, column structures in bottom right corner"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-2-10x.png" title="10x magnification, column structures in bottom center-right"><img src="/images/z144sn005-display/drivers-col-2-10x-thumb.png" title="10x magnification, column structures in bottom center-right"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-3-10x.png" title="10x magnification, column structures in bottom center"><img src="/images/z144sn005-display/drivers-col-3-10x-thumb.png" title="10x magnification, column structures in bottom center"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-4-10x.png" title="10x magnification, column structures in bottom left corner"><img src="/images/z144sn005-display/drivers-col-4-10x-thumb.png" title="10x magnification, column structures in bottom left corner"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-5-10x.png" title="10x magnification, column structures in top left corner"><img src="/images/z144sn005-display/drivers-col-5-10x-thumb.png" title="10x magnification, column structures in top left corner"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-6-4x.png" title="4x magnification, column structures in top right corner"><img src="/images/z144sn005-display/drivers-col-6-4x-thumb.png" title="4x magnification, column structures in top right corner"></a></p>
<p>At last, here’s a magnified shot of one of the cracks in the glass that killed the LCD:</p>
<p><a class="fancybox" href="/images/z144sn005-display/crack-10x.png" title="10x magnification, glass crack"><img src="/images/z144sn005-display/crack-10x-thumb.png" title="10x magnification, glass crack"></a></p>
<p>The cleanest pictures were taken by wetting a cotton bud with IPA, then swabbing the glass with the cotton bud such as to leave a thin film but also remove all junk somewhere out of the visual field, and photographing while the field has not evaporated. IPA appears to be nearly index-matched with the glass used in this LCD, and so it hides most if not all imperfections in the glass, apart from any cleaning that may happen as a result of this procedure.</p>
<p>I’ve been provided a Z144SN005 LCD that has split in halves as a result of excess mechanical force. Z144SN005 is a ST7735S-based LCD organized as 128RGB×128; it has 384 sources and 128 gates. I took some microphotographs of it using an Amscope ME300TZ-2L-3M metallurgical microscope.</p>
<p>The following microphotographs of the front glass plate that contains color filters demonstrate the pixels lighting up:</p>
<p><a class="fancybox" rel="gal-front" href="/images/z144sn005-display/filters-trans-4x.png" title="4x magnification, transmitted light"><img src="/images/z144sn005-display/filters-trans-4x-thumb.png" title="4x magnification, transmitted light"></a>
<a class="fancybox" rel="gal-front" href="/images/z144sn005-display/filters-refl-4x.png" title="4x magnification, reflected light"><img src="/images/z144sn005-display/filters-refl-4x-thumb.png" title="4x magnification, reflected light"></a>
<a class="fancybox" rel="gal-front" href="/images/z144sn005-display/filters-trans-refl-4x.png" title="4x magnification, reflected and transmitted light"><img src="/images/z144sn005-display/filters-trans-refl-4x-thumb.png" title="4x magnification, reflected and transmitted light"></a>
<a class="fancybox" rel="gal-front" href="/images/z144sn005-display/filters-refl-10x.png" title="10x magnification, reflected light"><img src="/images/z144sn005-display/filters-refl-10x-thumb.png" title="10x magnification, reflected light"></a>
<a class="fancybox" rel="gal-front" href="/images/z144sn005-display/filters-trans-refl-10x.png" title="10x magnification, reflected and transmitted light"><img src="/images/z144sn005-display/filters-trans-refl-10x-thumb.png" title="10x magnification, reflected and transmitted light"></a></p>
<p>The following microphotographs (all taken in combined transmitted and reflected light for extra contrast) of the back glass plate show the circuitry driving the liquid crystals confined between the two glass plates.</p>
<p>First, row (gate) drivers. There’s 128 in total. Note the numbered LCD rows.</p>
<p><a class="fancybox" rel="gal-back-rows" href="/images/z144sn005-display/drivers-row-4x.png" title="4x magnification, row drivers"><img src="/images/z144sn005-display/drivers-row-4x-thumb.png" title="4x magnification, row drivers"></a>
<a class="fancybox" rel="gal-back-rows" href="/images/z144sn005-display/drivers-row-10x.png" title="10x magnification, row drivers"><img src="/images/z144sn005-display/drivers-row-10x-thumb.png" title="10x magnification, row drivers"></a></p>
<p>Second, column (source) drivers. There’s 384 in total, so the routing is far more dense. Note that the column drivers are located at the side of the display <em>far</em> from the controller, likely because of lack of space. Note the thick power distribution traces.</p>
<p><a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-1-4x.png" title="4x magnification, column structures in bottom right corner"><img src="/images/z144sn005-display/drivers-col-1-4x-thumb.png" title="4x magnification, column structures in bottom right corner"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-1-10x.png" title="10x magnification, column structures in bottom right corner"><img src="/images/z144sn005-display/drivers-col-1-10x-thumb.png" title="10x magnification, column structures in bottom right corner"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-2-10x.png" title="10x magnification, column structures in bottom center-right"><img src="/images/z144sn005-display/drivers-col-2-10x-thumb.png" title="10x magnification, column structures in bottom center-right"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-3-10x.png" title="10x magnification, column structures in bottom center"><img src="/images/z144sn005-display/drivers-col-3-10x-thumb.png" title="10x magnification, column structures in bottom center"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-4-10x.png" title="10x magnification, column structures in bottom left corner"><img src="/images/z144sn005-display/drivers-col-4-10x-thumb.png" title="10x magnification, column structures in bottom left corner"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-5-10x.png" title="10x magnification, column structures in top left corner"><img src="/images/z144sn005-display/drivers-col-5-10x-thumb.png" title="10x magnification, column structures in top left corner"></a>
<a class="fancybox" rel="gal-back-columns" href="/images/z144sn005-display/drivers-col-6-4x.png" title="4x magnification, column structures in top right corner"><img src="/images/z144sn005-display/drivers-col-6-4x-thumb.png" title="4x magnification, column structures in top right corner"></a></p>
<p>At last, here’s a magnified shot of one of the cracks in the glass that killed the LCD:</p>
<p><a class="fancybox" href="/images/z144sn005-display/crack-10x.png" title="10x magnification, glass crack"><img src="/images/z144sn005-display/crack-10x-thumb.png" title="10x magnification, glass crack"></a></p>
<p>The cleanest pictures were taken by wetting a cotton bud with IPA, then swabbing the glass with the cotton bud such as to leave a thin film but also remove all junk somewhere out of the visual field, and photographing while the field has not evaporated. IPA appears to be nearly index-matched with the glass used in this LCD, and so it hides most if not all imperfections in the glass, apart from any cleaning that may happen as a result of this procedure.</p>
tag:lab.whitequark.org,2018-08-02:/notes/2018-08-02/nokia-3220-transflective-lcd-microphotography/
Nokia 3220 transflective LCD microphotography
2018-08-02T01:56:23Z
2018-08-02T01:56:23Z
<p>I took some microphotographs of a Nokia 3220 (RH-37) transflective LCD using an Amscope ME300TZ-2L-3M metallurgical microscope. These microphotographs were taken on an intact LCD, i.e. transmitted light images were acquired by shining a flashlight through the entire LCD stackup (the built-in diascopic light source is not powerful enough); the 90° reflected light images were acquired using the built-in episcopic light sources, and the 45° reflected light images were acquired using a similar flashlight.</p>
<p><img src="/images/nokia-3220-display/reflected-90deg.png" alt="reflected light incident at 90°">
<img src="/images/nokia-3220-display/reflected-90deg-45deg.png" alt="reflected light incident at 90° and 45°">
<img src="/images/nokia-3220-display/transmitted.png" alt="transmitted light">
<img src="/images/nokia-3220-display/combined.png" alt="transmitted light and reflected light incident at 90° and 45°">
<img src="/images/nokia-3220-display/magnified.png" alt="transmitted light and reflected light incident at 90°, at larger magnification"></p>
<p>I took some microphotographs of a Nokia 3220 (RH-37) transflective LCD using an Amscope ME300TZ-2L-3M metallurgical microscope. These microphotographs were taken on an intact LCD, i.e. transmitted light images were acquired by shining a flashlight through the entire LCD stackup (the built-in diascopic light source is not powerful enough); the 90° reflected light images were acquired using the built-in episcopic light sources, and the 45° reflected light images were acquired using a similar flashlight.</p>
<p><img src="/images/nokia-3220-display/reflected-90deg.png" alt="reflected light incident at 90°">
<img src="/images/nokia-3220-display/reflected-90deg-45deg.png" alt="reflected light incident at 90° and 45°">
<img src="/images/nokia-3220-display/transmitted.png" alt="transmitted light">
<img src="/images/nokia-3220-display/combined.png" alt="transmitted light and reflected light incident at 90° and 45°">
<img src="/images/nokia-3220-display/magnified.png" alt="transmitted light and reflected light incident at 90°, at larger magnification"></p>
tag:lab.whitequark.org,2018-08-02:/notes/2018-08-02/sony-xperia-z2-lcd-microphotography/
Sony Xperia Z2 LCD microphotography
2018-08-02T00:33:42Z
2018-08-02T00:33:42Z
<p>I took some microphotographs of a counterfeit Sony Xperia Z2 LCD display, bought on Taobao circa 2017, using an Amscope ME300TZ-2L-3M metallurgical microscope. (This display would not initialize with an unmodified firmware, and was factory-programmed to match the never used, due to a bug, DeviceTree display configuration present in the Xperia Z2 kernel image.)</p>
<p>The combined light photo is not digitally manipulated; this is just how it looks with both episcopic and diascopic light sources on.</p>
<p><img src="/images/xperia-z2-display/reflected.png" alt="reflected light">
<img src="/images/xperia-z2-display/transmitted.png" alt="transmitted light">
<img src="/images/xperia-z2-display/combined.png" alt="combined light"></p>
<p>I took some microphotographs of a counterfeit Sony Xperia Z2 LCD display, bought on Taobao circa 2017, using an Amscope ME300TZ-2L-3M metallurgical microscope. (This display would not initialize with an unmodified firmware, and was factory-programmed to match the never used, due to a bug, DeviceTree display configuration present in the Xperia Z2 kernel image.)</p>
<p>The combined light photo is not digitally manipulated; this is just how it looks with both episcopic and diascopic light sources on.</p>
<p><img src="/images/xperia-z2-display/reflected.png" alt="reflected light">
<img src="/images/xperia-z2-display/transmitted.png" alt="transmitted light">
<img src="/images/xperia-z2-display/combined.png" alt="combined light"></p>