But can you trust a canary
not to tweet?
Belgian boffins have proposed adding what they call “Canary Numbers” to random number generators (RNGs), in the hope and expectation they will fall off the twig if something goes wrong. In this International Association for Cryptologic Research (IACR) paper, Vladimir Rozic, Bohan Yang, Nele Mentens and Ingrid Verbauwhede write …
I don't think the idea was to expose these numbers anywhere near end-user code; rather, you'd have two modules - one HWRNG and one verifier. That way you can have an independently designed RNG and a separate module that performs the runtime checking.
This is a rather good idea, as we know how to design whitening functions that pass all statistical checks on the output while being fed no entropy at all. In other words, an RNG passing statistical tests doesn't mean it's a good RNG, it just means it's not horribly broken.
There are two sets of random numbers generated, one with high entropy and one with low. The scheme checks the statistical distribution of the low-entropy set (the canary numbers). They should be evenly distributed, like you'd expect repeatedly flipping a coin to come out roughly 50/50. If they're not, then something's up. In theory the canary numbers will be affected first if there's something wrong with the RNG, because their entropy is lower.
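The coin-flip comparison above can be sketched as a simple bit-balance check. This is just an illustration of the idea, not the paper's actual test; the `tolerance` value is an assumption picked for demonstration.

```python
import os

def monobit_check(data: bytes, tolerance: float = 0.01) -> bool:
    """Return True if the fraction of 1 bits is within `tolerance` of 0.5,
    i.e. the stream looks like a fair coin at this coarse level."""
    total_bits = len(data) * 8
    ones = sum(bin(b).count("1") for b in data)
    return abs(ones / total_bits - 0.5) <= tolerance

print(monobit_check(os.urandom(100_000)))  # healthy source: True
print(monobit_check(bytes(100_000)))       # stuck-at-zero source: False
```

Of course, passing this check proves very little (a counter passes it too); failing it is what matters, which is exactly the canary's job.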
"There are two sets of random numbers generated, one with high entropy and one with low."
I get that. But why should the latter tell you anything about the former? Or, given that they're talking about TRNGs based on noise sources, how do you get two sets of numbers out of them other than by diverting every other number (or some other percentage) from one set to the other? And in that case, how do you end up with different entropies for the two sets?
In short, this seems like a remarkably low content article.
Probably by way of skimming only a low bit count out of the generator to get your low-entropy set (say your standard set uses 32 or 64 bits, but your canary set only uses 8 bits). A 32-bit number has a numerical range of over 4 billion, while an 8-bit one only has 256. Analyzing a stream of 8-bit numbers should be much quicker than scanning a stream of 32-bit ones. You use the canary set as an early warning system.
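That skimming idea might look like the sketch below - splitting each 32-bit sample into the full word for the main set and its low byte for the canary set. This is my reading of the comment, not anything from the paper itself.

```python
import os
import struct

def split_samples(raw: bytes):
    """Split a stream of little-endian 32-bit samples into a main set
    (full words) and a canary set (low 8 bits of every word)."""
    n_words = len(raw) // 4
    words = struct.unpack(f"<{n_words}I", raw[: n_words * 4])
    main = list(words)
    canary = [w & 0xFF for w in words]  # skim only the low byte
    return main, canary

main, canary = split_samples(os.urandom(40))
print(len(main), len(canary))  # 10 10
```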
"Probably by way of skimming only a low bit count out of the generator"
You still have the problem of ensuring that the low-bit-count numbers reflect any patterns in the high bits. To take an extreme example: you take the low 8 bits and they look random, but the top 8 bits are cycling through a short repeated sequence.
@Dan 55: Ermm, I'm pretty sure this isn't what they're doing. (See my post below.)
The second (low-entropy) set of numbers is the raw thermal noise data, or the timing of radioactive decay events, or the timing of key presses/mouse movements for a desktop computer. That data is semi-random, but will have patterns in it, including (possibly) not having bits set 50/50, or correlations between bits. This raw, low-entropy data is "stirred" (I think the Linux kernel used to do it with a primitive polynomial) and hashed to create a smaller amount of high-entropy, "really random" output.
The idea these folks are proposing is that you send, say, 90% of your raw data to be stirred and hashed, but send the remaining 10% off still in raw form. That way, if your radioactive event detector is broken and sending all zeroes, you'll see it. (Stirring and hashing can make even a stream of zeroes look random. Keep in mind that the stirring will include the previous state.)
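The 90/10 routing described above can be sketched as follows. The ratio and the use of SHA-256 as the conditioning step are illustrative assumptions; a real pool would also mix in previous state, as noted above.

```python
import hashlib

def route(raw_blocks):
    """Send 9 of every 10 raw noise blocks to the conditioning hash;
    keep every tenth block raw so its health can be inspected directly."""
    h = hashlib.sha256()
    canary = []
    for i, block in enumerate(raw_blocks):
        if i % 10 == 9:
            canary.append(block)  # kept raw, for statistical checks
        else:
            h.update(block)       # conditioned into the output pool
    return h.digest(), canary

digest, canary = route([bytes([i]) * 4 for i in range(20)])
print(len(digest), len(canary))  # 32 2
```

The point of the split is that a dead detector sending all zeroes is obvious in the raw canary blocks, but invisible in the 32-byte digest.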
Your average TRNG may work by taking, say, images of a lava lamp (see http://lavarnd.org/index.html; note that they now take images of the noise from the camera). This provides correlated bits that are not entirely random. But if you take, say, a 10 KByte sample and hash it, 100 bytes taken from it will be "random" (in the sense that an outsider, without access to the camera, can have no way to say that a given 100 bytes is more likely than any other 100 bytes.)
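That 10 KByte-to-100-bytes conditioning step can be sketched with an extendable-output hash. SHAKE-256 here is my choice of extractor, not necessarily what lavarnd uses, and the usual caveat applies: hashing can't create entropy, so the output is only as good as the min-entropy actually present in the sample.

```python
import hashlib
import os

# Stand-in for a 10 KByte noisy camera sample (assumption: the real
# sample carries well over 800 bits of min-entropy).
sample = os.urandom(10 * 1024)

# Condition it down to 100 output bytes with an extendable-output hash.
out = hashlib.shake_256(sample).digest(100)
print(len(out))  # 100
```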
The problem is that if the camera died and started providing all zeroes, or zeroes alternating with ones, or something like that, the hashed output still would look pretty darn random. A while back, I made a TRNG by taking a small motor, attaching an (empty) peanut butter jar to the end of it, and starting it up with some dice in it. I pointed a cheap Webcam at it, and set it up to take the stream of image data and scramble/hash it in the above manner. (Note that I did this out of curiosity. I'm not currently doing anything for which I'd need that level of security anyway.)
However, I had it send nine out of ten images to the scrambling code, and the tenth to a plain old image file. As long as that image was of a peanut butter jar tumbling dice, I figured things were probably okay. If it was blank or maybe the motor had died and the dice weren't tumbling, I'd know I was getting non-random numbers. I won't swear to it that this is what the authors are contemplating, but it sounds like it.
In theory, you may be revealing some data if people can interpolate from images 1 and 11 to say something about images 2 through 10, but your sampling rate really ought to be low enough that consecutive images won't be correlated anyway.
I once tried a video entropy daemon. Thing was, it's hard to get the noise right even with whitening (try running your stuff through a chi-squared analysis, for example). I also have to wonder how fast such a rig can provide truly random data, which is important for an enterprise setting, especially one dealing with a lot of encrypted data or connections, which is why many have to rely on expensive dedicated devices. Thing is, such devices are difficult to get into to see if they're not malfunctioning, so the only thing you have to work with is the data stream. Doing a controlled sampling seems to be the best solution in that circumstance.
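The chi-squared analysis mentioned above can be run with nothing but the standard library - a byte-value histogram compared against a uniform distribution. This is a generic sketch of the test, not the commenter's actual setup; for 255 degrees of freedom a healthy stream should produce a statistic near 255, and values far above the ~311 critical point (1% level) suggest the stream isn't uniform.

```python
import os

def chi_squared_bytes(data: bytes) -> float:
    """Chi-squared statistic of the byte-value histogram against a
    uniform distribution (255 degrees of freedom)."""
    expected = len(data) / 256
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    return sum((c - expected) ** 2 / expected for c in counts)

print(chi_squared_bytes(os.urandom(1 << 16)))  # near 255 for a good stream
print(chi_squared_bytes(bytes(10_000)))        # enormous for a stuck source
```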