2016-08-05

Ring oscillators on Silego GreenPAK 4

Silego GreenPAK 4 is a very nice programmable mixed-signal matrix that has a fully open-source toolchain consisting of Yosys and openfpga. In this note I look at fun and completely inappropriate uses of its logic primitives.

Tools
GP_RINGOSC
Frequency counter
- Aside: UART is just a pipe, right?
Behavior of GP_RINGOSC
- Quantifying noise
- Varying voltage
Rolling our own
Results

A ring oscillator consists of an odd number of inverters in a row:

ring oscillator

It is, of course, unstable, and flips with a frequency that depends mostly on temperature, supply voltage, and the circuit parameters set by the manufacturing process.

Ring oscillators are quite useful because they are small and self-contained. GreenPAK 4 already includes a ring oscillator primitive, GP_RINGOSC. Let’s take a look at its properties.

Tools

GreenPAK4 Universal Developer Board: Silego
SLG46620V: Silego
iCE40HX8K-B-EVM: Lattice Semi
1GSa/s 100MHz oscilloscope: Rigol DS1104Z

GP_RINGOSC

The datasheet states that the primitive oscillates at 27 MHz. Let’s verify.

gp_ringosc.v


module top( (*LOC="P4"*) output q );
    GP_RINGOSC #(
        .AUTO_PWRDN(0)
    ) ringosc (
        .CLKOUT_FABRIC(q)
    );
endmodule

Build it the usual way:


$ yosys -p "read_verilog gp_ringosc.v" \
        -p "synth_greenpak4 -json gp_ringosc.json"
$ gp4par gp_ringosc.json -o gp_ringosc.txt
$ gp4prog -e gp_ringosc.txt -v 3.3 -n 4

And take a look with the oscilloscope:

Note that I’m measuring it using the hardware frequency counter, not calculating it via the waveform shape, which would be far less precise.

Sure enough, 26.760MHz… or is it? While I was writing this part of the note, it drifted to 26.750MHz, and then after a while to 26.780MHz, and so on. Let’s look closer.

Frequency counter

There’s a fair amount of variability in the measured frequency, and also the oscilloscope doesn’t let me extract the pulse counter value via the network, so let’s use another FPGA (an iCE40-HX8K) as a frequency counter. It’s also supported by FOSS tooling, Yosys and arachne-pnr.

This is a fairly trivial design. It has a counter clocked directly by the device under test. Every 1s (according to the on-board crystal oscillator) the counter is gated for one system clock cycle, its value is read out and transmitted over UART. The readout strobe is only emitted every 12_000_001 system clock cycles to compensate for the readout lapse.

This design is good for input clocks up to just over 120 MHz according to the timing estimate, with the critical path being the carry chain through the counter.

FreqCounter.v


module FreqCounter(
        input  clk_12MHz,
        output uart_tx,
        input  clk_in,
        output clk_12MHz_tp,
        output uart_tx_tp
    );

    // Test points
    assign clk_12MHz_tp = clk_12MHz;
    assign uart_tx_tp   = uart_tx;

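    // Counter, clocked directly by the device under test.
    // `active` gates the DUT clock so that the count holds still while
    // it is sampled; the counter itself is never reset.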
    reg         active;
    wire        clk_gated = clk_in & active;
    reg  [31:0] counter;
    always @(posedge clk_gated)
        counter <= counter + 1;

    // Readout timer
    reg  [31:0] timeout;
    wire        timeout_strobe = (timeout == 0);
    always @(posedge clk_12MHz)
        if(timeout == 0)
            timeout <= 12_000_001;
        else
            timeout <= timeout - 1;

    // UART and readout FSM
    reg   [7:0] tx_data;
    reg         tx_ready;
    wire        tx_ack;
    UART #(
        .FREQ(12_000_000),
        .BAUD(115_200)
    ) uart (
        .reset(1'b1),
        .clk(clk_12MHz),
        .tx_o(uart_tx),
        .tx_data_i(tx_data),
        .tx_ready_i(tx_ready),
        .tx_ack_o(tx_ack)
    );

    localparam S_IDLE  = 0;
    localparam S_SYNC  = 1;
    localparam S_SEND  = 2;
    localparam S_WAIT  = 3;

    reg   [3:0] state  = S_IDLE;
    reg   [1:0] byteno = 2'd0;
    reg  [31:0] buffer;
    always @(posedge clk_12MHz)
        case(state)
            S_IDLE: begin
                if(timeout_strobe) begin
                    active <= 0;
                    state <= S_SYNC;
                end
            end
            S_SYNC: begin
                buffer <= counter;
                active <= 1;
                state <= S_SEND;
            end
            S_SEND: begin
                tx_data <= buffer >> (byteno * 8);
                tx_ready <= 1;
                if(!tx_ack)
                    state <= S_WAIT;
            end
            S_WAIT: begin
                tx_ready <= 0;
                if(tx_ack) begin
                    byteno <= byteno + 1;
                    if(byteno == 3)
                        state <= S_IDLE;
                    else
                        state <= S_SEND;
                end
            end
        endcase

endmodule

See the complete archive for the rest of the design. ./build.sh builds and uploads the gateware, and ./measure.rb /dev/ttyUSB1 reads out the frequency.
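measure.rb lives in the archive and is not reproduced here. As a rough sketch of what the host side has to do, assuming only what the Verilog above shows (four bytes per readout, least significant byte first, taken from a 32-bit counter that is never reset), decoding could look something like this. The helper names `decode_word` and `count_deltas` are hypothetical and are not taken from measure.rb.

```ruby
# Hypothetical host-side decoding sketch; this is not the actual measure.rb.

# Reassemble one 32-bit counter snapshot from the four bytes of a readout.
# The FSM sends buffer >> (byteno * 8) for byteno = 0..3, i.e. LSB first.
def decode_word(bytes)
  raise ArgumentError, "expected 4 bytes" unless bytes.length == 4
  bytes.pack("C4").unpack1("V") # "V" = 32-bit unsigned, little-endian
end

# The counter free-runs, so the number of DUT clock edges seen in one gate
# window is the difference between consecutive snapshots, modulo 2**32.
def count_deltas(snapshots)
  snapshots.each_cons(2).map { |prev, cur| (cur - prev) % 2**32 }
end
```

Since the gate window is roughly one second long, each delta can be read more or less directly as a frequency in Hz.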

The clk_12MHz_tp test point is intended for measuring the on-board oscillator frequency. My oscilloscope indicates that it’s 12.0004 MHz, with no change over 24 hours.

Ultimately, I decided not to correct for deviation from 12.000000 MHz, since my oscilloscope doesn’t make any claims about the trueness of its reference clock, nor do I care about trueness of my frequency measurements. On the other hand, both the scope and the on-board oscillator seem to be precise enough relative to the DUT, which is what I need.

Aside: UART is just a pipe, right?

I spent a few hours figuring out why I had intermittent framing errors: some of the bytes would be skipped when reading, seemingly at random. I hacked and slashed the gateware, thinking I had screwed it up somewhere, because the UART is just a dumb pipe, right?

Well, no. Turns out (and by “turns out” I mean “discovered with sigrok”; see the ./debug.sh script) the skipped byte was always 0x13… also known as XOFF. Linux has software flow control enabled by default on all TTYs, and that makes the UART not agnostic to the binary data going through it. That was irritating.
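To see how easily raw counter data collides with flow control, here is a small sketch that checks whether a 32-bit reading contains XON (0x11) or XOFF (0x13) when sent as four raw bytes. The constant and helper names are made up for this illustration and are not part of the archive.

```ruby
# Illustrative sketch; all names here are hypothetical.
XON  = 0x11 # DC1, resumes transmission under software flow control
XOFF = 0x13 # DC3, pauses transmission under software flow control

# Bytes of a 32-bit reading, as sent on the wire (LSB first), that a TTY
# with IXON set would treat as flow control instead of passing through.
def flow_control_bytes(reading)
  [reading].pack("V").bytes.select { |b| b == XON || b == XOFF }
end

# Count how many readings in a range would be affected.
def affected(range)
  range.count { |reading| !flow_control_bytes(reading).empty? }
end
```

One common way to avoid this is to put the port into raw mode with software flow control turned off before reading, for example with GNU stty: `stty -F /dev/ttyUSB1 raw -ixon -ixoff`.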

Behavior of GP_RINGOSC

Quantifying noise

If we take a look at the raw measurements, it’s pretty clear that the frequency jumps around a lot:

Let’s get some more data!


$ ./measure.rb | head -n512 >gp_ringosc-N512-3V3-29C.txt

There’s a lot of noise here:

Can we quantify this instability usefully, though? @bofh453 tells me I should use Allan deviation, which is designed exactly for this purpose. The implementation is simple enough:

allan.rb (download)


#!/usr/bin/env ruby

class Array
  def mean; reduce(:+) / length; end
end

samples = File.read(ARGV[0]).split.map(&:to_f)
puts("n\t%d" % samples.count)

nominal_freq = samples.mean
puts("f\t%f" % nominal_freq)

puts("τ\tσ(τ)")
freq_diff =
  samples.map do |sample|
    sample / nominal_freq - 1
  end
(0..8).map { |n| 2**n }.each do |tau|
  variance =
    freq_diff
      .each_slice(tau).map { |slice| slice.mean }
      .each_cons(2).map do |y_n, y_np1|
        (y_np1 - y_n) ** 2
      end.mean / 2
  puts("#{tau}\t%.2g" % Math.sqrt(variance))
end


$ ./allan.rb gp_ringosc-N512-3V3-29C.txt
n       512
f       26837964.519531
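
In formula form, allan.rb computes the non-overlapping Allan deviation estimate. The symbols below are my own notation rather than anything the script prints: f_k are the N raw samples, f̄ is their mean, τ is the averaging window in samples (about one second each), and M = N/τ is the number of windows.

```latex
y_k = \frac{f_k}{\bar{f}} - 1,
\qquad
\bar{y}_i = \frac{1}{\tau} \sum_{k=(i-1)\tau+1}^{i\tau} y_k,
\qquad
\sigma_y(\tau) = \sqrt{\frac{1}{2(M-1)} \sum_{i=1}^{M-1} \left(\bar{y}_{i+1} - \bar{y}_i\right)^2}
```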

What does this plot indicate? Mostly the fact that the noise in GP_RINGOSC is a combination of pink noise (inherent in all semiconductors) and random walk noise (due to some environmental factor I can’t identify), as illustrated with the plot from the NIST website above:
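To get a feel for how the slope of the Allan deviation distinguishes noise types, the same estimator can be run on synthetic data. This is only a sketch with made-up noise, not measurements: white frequency noise should give a deviation that falls as τ grows, while a random walk should give one that rises. `allan_deviation` and `gaussian` are names made up for the illustration.

```ruby
# Synthetic illustration only; none of these numbers are measurements.

# Non-overlapping Allan deviation of fractional frequency samples y for an
# averaging window of tau samples (the same estimator allan.rb uses).
def allan_deviation(y, tau)
  means = y.each_slice(tau).map { |s| s.sum / s.length }
  diffs = means.each_cons(2).map { |a, b| (b - a)**2 }
  Math.sqrt(diffs.sum / diffs.length / 2)
end

# Standard normal sample via the Box-Muller transform.
def gaussian(rng)
  Math.sqrt(-2 * Math.log(1 - rng.rand)) * Math.cos(2 * Math::PI * rng.rand)
end

rng   = Random.new(1)                     # fixed seed for repeatability
white = Array.new(4096) { gaussian(rng) } # white frequency noise
# Random walk: running sum of the white noise samples.
walk  = white.each_with_object([]) { |x, acc| acc << (acc.last || 0.0) + x }

puts "tau\twhite\twalk"
[1, 4, 16, 64].each do |tau|
  puts format("%d\t%.3g\t%.3g", tau,
              allan_deviation(white, tau), allan_deviation(walk, tau))
end
```

Flicker (pink) noise sits between the two and produces a roughly flat curve, which is harder to synthesize in a few lines and is left out here.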

Varying voltage

What’s the most obvious environmental factor? Voltage. Let’s see how the mean frequency changes if we sweep voltage. The sweep is from 3.3 V to 5.5 V only because the iCE40 developer board has all I/O banks at 3.3 V and anything less causes it to lose pulses… “are they 5 V tolerant,” you ask? No, but I realized that only after gathering the data. Ah well, it didn’t seem to do it any harm.

sweep_vcc.rb (download)


#!/usr/bin/env ruby
require './measure'

class Array
  def mean; reduce(:+) / length; end
end

puts("Vcc\tfosc")
(33..55).map { |v| v/10.0 }.each do |vcc|
  system("gp4prog -q -v #{vcc}")
  freq = to_enum(:acquire).take(5).mean
  puts("#{vcc}\t#{freq}")
end


$ ./sweep_vcc.rb >gp_ringosc-3V3-to-5V5-29C.txt

The oscillator is more stable than I expected, deviating by less than 1.5% of its initial frequency over most of the operating range of SLG46620V (raw data):

Okay, what about the noise?


$ gp4prog -q -v 4.5
$ ./measure.rb | head -n512 >gp_ringosc-N512-4V5-29C.txt
$ ./allan.rb gp_ringosc-N512-4V5-29C.txt
$ gp4prog -q -v 5.4
$ ./measure.rb | head -n512 >gp_ringosc-N512-5V4-29C.txt
$ ./allan.rb gp_ringosc-N512-5V4-29C.txt

It seems to grow linearly with the supply voltage, with the random walk component, whatever it is, becoming more prominent (raw data: 3.3 V, 4.5 V, 5.4 V).

Rolling our own

Using specialized primitives is boring; let’s just make our own ring oscillator. There are two kinds of primitives in GreenPAK 4 that can be used for this: dedicated inverters and LUTs. Let’s see what properties they have and whether they all work the same.

gp_inv.v


module top( (*LOC="P4"*) output q );
    (* LOC="INV_N" *) // replace N
    GP_INV inv(.IN(q), .OUT(q));
endmodule

gp_2lut.v


module top( (*LOC="P4"*) output q );
    (* LOC="LUT2_N" *) // replace N
    GP_2LUT #(.INIT(4'b0001)) lut(.IN0(q), .OUT(q));
endmodule

gp_3lut.v


module top( (*LOC="P4"*) output q );
    (* LOC="LUT3_N" *) // replace N
    GP_3LUT #(.INIT(8'b00000001)) lut(.IN0(q), .OUT(q));
endmodule

gp_4lut.v


module top( (*LOC="P4"*) output q );
    (* LOC="LUT4_N" *) // replace N
    GP_4LUT #(.INIT(16'b0000000000000001)) lut(.IN0(q), .OUT(q));
endmodule

Frequency

First, let’s take a look at mean frequency at 5V supply voltage:

Location	Matrix	Frequency (MHz)	Period (ns)
INV_0	0	125.3	7.98
INV_1	1	121.3	8.24
LUT2_0	0	108.7	9.20
LUT2_1	0	108.5	9.22
LUT2_2	0	108.5	9.22
LUT2_4	1	109.7	9.12
LUT2_5	1	110.3	9.07
LUT3_0	0	102.4	9.77
LUT3_8	1	103.1	9.70
LUT4_0	0	69.1	14.5
LUT4_1	1	89.7	11.1

Nothing surprising here: the larger the primitive, the slower it is; within the same matrix there is almost no difference; and the difference between matrices is minor. (LUT4_0 is an exception, as it is not identical to LUT4_1: it is a programmable function generator.) I suspect that the difference between matrices is caused by the oscillator driving a longer net towards pin P4 (which is in matrix 0), so let’s verify that:

gp_inv.v


module top( (*LOC="P18"*) output q );
    (* LOC="INV_N" *) // replace N
    GP_INV inv(.IN(q), .OUT(q));
endmodule

Location	Matrix	Frequency (MHz)	Period (ns)
INV_0	0	125.2	7.99
INV_1	1	122	8.19

Well, apparently not.

In principle, connecting several LUTs in series should result in a total period as a sum of the parts. Let’s try that:

gp_2lut_x3.v


module top( (*LOC="P4"*) output q0 );
    wire q1, q2;
    (* LOC="LUT2_0" *)
    GP_2LUT #(.INIT(4'b0001)) lut1(.IN0(q0), .OUT(q1));
    (* LOC="LUT2_1" *)
    GP_2LUT #(.INIT(4'b0001)) lut2(.IN0(q1), .OUT(q2));
    (* LOC="LUT2_2" *)
    GP_2LUT #(.INIT(4'b0001)) lut3(.IN0(q2), .OUT(q0));
endmodule

In theory, this oscillator should have a frequency of 36.2 MHz at 5 V. In practice, it turns out to be… 112 MHz, which is higher than an individual GP_2LUT (bitstream for this circuit). Say what?! (Update: I figured it out)

What if we replace one of those with an inverter?

gp_2lut_x2_gp_inv.v


module top( (*LOC="P4"*) output q0 );
    wire q1, q2;
    (* LOC="LUT2_0" *)
    GP_2LUT #(.INIT(4'b0001)) lut1(.IN0(q0), .OUT(q1));
    (* LOC="LUT2_1" *)
    GP_2LUT #(.INIT(4'b0001)) lut2(.IN0(q1), .OUT(q2));
    (* LOC="INV_0" *)
    GP_INV inv(.IN(q2), .OUT(q0));
endmodule

This one should have a frequency of 37.9 MHz at 5 V, which is also exactly what I measure.
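Both predictions come from summing the single-stage periods in the frequency table above and inverting the total. A quick sketch of that arithmetic, using the measured 5 V periods (the hash and function names are just for illustration):

```ruby
# Single-primitive ring oscillator periods at 5 V, in ns, from the table above.
PERIOD_NS = {
  "INV_0"  => 7.98,
  "LUT2_0" => 9.20,
  "LUT2_1" => 9.22,
  "LUT2_2" => 9.22,
}.freeze

# Predicted frequency in MHz of a ring built from the given stages, assuming
# the loop period is simply the sum of the individual periods.
def predicted_mhz(*stages)
  1000.0 / stages.sum { |s| PERIOD_NS.fetch(s) }
end

puts predicted_mhz("LUT2_0", "LUT2_1", "LUT2_2").round(1) # prints 36.2
puts predicted_mhz("LUT2_0", "LUT2_1", "INV_0").round(1)  # prints 37.9
```

This model ignores routing delay and any interaction between stages, which is exactly where the three-LUT measurement parts ways with it.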

What if instead of an inverter we use a LUT from a different matrix?

gp_2lut_x3_xconn.v


module top( (*LOC="P4"*) output q0 );
    wire q1, q2;
    (* LOC="LUT2_0" *)
    GP_2LUT #(.INIT(4'b0001)) lut1(.IN0(q0), .OUT(q1));
    (* LOC="LUT2_1" *)
    GP_2LUT #(.INIT(4'b0001)) lut2(.IN0(q1), .OUT(q2));
    (* LOC="LUT2_4" *)
    GP_2LUT #(.INIT(4'b0001)) lut3(.IN0(q2), .OUT(q0));
endmodule

This circuit oscillates at 23.4 MHz. This is lower than the prediction of 36.2 MHz due to propagation delay through the cross-connection, which I won’t bother to measure, but it’s close enough.

I think that the LUT2_0–LUT2_1–LUT2_2 loop exhibits this behavior because of some resonance effect, since the self-oscillation frequencies of the three LUTs are so close. I’m not sure how that works exactly, though.

Voltage

Something interesting I also noticed is the average voltage on the output pins; it differs far more than frequency. The supply voltage is 5V.

Primitive	Output pin	Connection	Average voltage (V)
INV_1	P4	Cross	1.97
LUT2_4	P4	Cross	2.55
LUT3_8	P4	Cross	2.32
INV_0	P18	Cross	2.35
LUT2_0	P18	Cross	2.60
LUT3_0	P18	Cross	2.79
INV_0	P4	Direct	2.80
LUT2_0	P4	Direct	2.78
LUT3_0	P4	Direct	2.84
INV_1	P18	Direct	3.07
LUT2_4	P18	Direct	3.21
LUT3_8	P18	Direct	3.11

Frequency drift

Let’s see how frequency changes with supply voltage:

Not only do these oscillators drift a lot more than GP_RINGOSC does, but their frequency also increases with supply voltage, whereas for GP_RINGOSC it decreases. Clearly the latter is compensated for supply voltage.

Noise

Now let’s compare all our oscillators: GP_RINGOSC, GP_INV, GP_2LUT and the LUT2_0–LUT2_1–LUT2_4 ring, all at 5.4 V (raw data for INV_1, LUT2_0, LUT2_[014]):

This is quite interesting. The amount of noise doesn’t depend on frequency: the single LUT and the three-LUT loop differ in frequency by a factor of 5, yet they have almost identical amounts of noise. Otherwise, it seems that the more programmable logic is involved, the more noise it creates, but I’m not sure.

Results

To summarize:

- Ring oscillators can be constructed from programmable logic, and the highest frequency possible to achieve this way is ~130 MHz.
- Ring oscillators made from a single primitive drift with changing supply voltage by ~20 MHz/V; GP_RINGOSC is compensated for that.
- Ring oscillators made from programmable logic produce the same kind of noise as GP_RINGOSC; noise is independent of frequency; GP_2LUT is an order of magnitude noisier than GP_INV or GP_RINGOSC.
- The output of high-frequency ring oscillators made from programmable logic has a significant DC bias and they are probably not useful for much. Update: the Hacker News thread mentions that XOR between two ring oscillators could serve as a useful hardware random number generator.
- A loop of three closely located and matched LUTs configured as inverters oscillates at a frequency higher than that of the individual LUTs, for an unknown reason. Why? (Update: I figured it out)

I also now have a newfound appreciation of crystal oscillators.