Skip to content

periph/hwrng: Introduce generic hwrng_read(), allow for leaner HWRNG implementations#11924

Closed
benpicco wants to merge 28 commits intoRIOT-OS:masterfrom
benpicco:hwrng-generic_read
Closed

periph/hwrng: Introduce generic hwrng_read(), allow for leaner HWRNG implementations#11924
benpicco wants to merge 28 commits intoRIOT-OS:masterfrom
benpicco:hwrng-generic_read

Conversation

@benpicco
Copy link
Contributor

@benpicco benpicco commented Jul 26, 2019

Contribution description

Most HWRNGs are pretty simple. They provide a register that contains random data. Some HWRNG implementations already expose this single register by providing a hwrand(void) function that returns a 32 bit value.

The hwrng_read() API function expects a buffer, so currently many implementations invent (or copy & paste) their way to fill a buffer of arbitrary size with 32 bit values.

This PR does the following things:

  • add the hwrand(void) function to the hwrng API. Not only is it often more convenient, it also allows for a generic hwrng_read() implementation.
  • implement hwrand() for all drivers that didn't have it yet
  • implement a generic hwrng_read() that uses hwrand() and is a bit more efficient than the current implementation
  • Introduce HWRNG_HAS_INIT & HWRNG_HAS_POWERONOFF analogous to the watchdog API so we don't have to call empty functions if they are not needed.

Testing procedure

All platforms should still build & run tests/rng and tests/periph_hwrng as they did before.

On same54 a small speedup can be observed using the speed test in tests/rng (set source 2 to select the HWRNG):

before: Collected 11009147 samples in 10.000008 seconds (4300 KiB/s).
after: Collected 12244868 samples in 10.000009 seconds (4783 KiB/s).

Issues/PRs references

@benpicco benpicco added Area: drivers Area: Device drivers Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation labels Jul 26, 2019
@miri64 miri64 added the Process: API change Integration Process: PR contains or issue proposes an API change. Should be handled with care. label Jul 26, 2019
@miri64 miri64 requested a review from PeterKietzmann July 26, 2019 11:53
@benpicco benpicco force-pushed the hwrng-generic_read branch from d195a14 to 1a2f6d5 Compare July 26, 2019 12:37
@miri64
Copy link
Member

miri64 commented Jul 26, 2019

I think ae02fbe belongs into its own PR that depends on this one ;-).

@benpicco benpicco force-pushed the hwrng-generic_read branch from 13d65ab to a32e2c9 Compare July 26, 2019 13:14
@benpicco benpicco force-pushed the hwrng-generic_read branch 2 times, most recently from bbc2cf3 to d31cda6 Compare August 3, 2019 13:23
@benpicco
Copy link
Contributor Author

benpicco commented Aug 3, 2019

Rebased to include the conversion of #11647

@benpicco benpicco force-pushed the hwrng-generic_read branch from d31cda6 to b35365d Compare August 26, 2019 14:51
@benpicco benpicco added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Aug 26, 2019
@benpicco benpicco force-pushed the hwrng-generic_read branch from b35365d to d38854a Compare August 26, 2019 15:19
@benpicco benpicco force-pushed the hwrng-generic_read branch from 8bc308d to 5ae99b3 Compare August 26, 2019 18:43
@benpicco benpicco requested review from dylad and kaspar030 September 29, 2019 14:13
@benpicco benpicco requested a review from maribu October 22, 2019 09:09
#include "periph/hwrng.h"

static const uint32_t* RNG_DATA_REG = (uint32_t*)0x3ff75144;
#define RNG_DATA_REG (*(volatile uint32_t *)0x3ff75144)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes by the way a bug. In the hope that others point me to bugs introduced by me so that I can learn from that, I try to be a good example:

@gschorcht: Without the volatile the compiler assumes the memory location is regular memory. The compiler is therefore allowed to assume that two successive reads from that location will return the same value, like it would on normal memory. Likely, the compiler would generate code that reads the value into a register and fills the buffer in buf with that exact same value over and over again. So at most 32 bit of randomness are copied to buf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maribu Thanks for pointing that. In case of ESP32 it seems not to be a problem. hwrng_read always returns different values so that successive reads seem to return different values.

@maribu
Copy link
Member

maribu commented Oct 22, 2019

@haukepetersen and @mehlis were the authors of the API, so maybe they want to join the discussion.

I personally do not like the idea of exposing hardware random number generators directly to the application programmer. I'm pretty sure that the use case is not "talk to a HWRNG" but "get true random". And "talk to a HWRNG" is one (very nice) way to implement "get true random", but not the only way.

With that in mind, I would personally prefer a user facing API like this:

typedef enum {
    GETENTROPY_MODE_BLOCK, /**< Block until the required amount of entropy was obtained */
    GETENTROPY_MODE_DO_NOT_BLOCK, /**< Get as much entropy as is available */
    /**
     * @brief Try to give the required amount of entropy with first level entroy for some
     *        time. If this fails, return pseudo-random values to avoid blocking
     */
    GETENTROPY_MODE_UNSECURE_FALLBACK,
} getentropy_mode_t;

/**
 * @brief Retrieve entropy gathered and mixed from all available entropy sources
 *
 * @param dest Store the entropy in this buffer
 * @param size_t The number of entropy bytes to retrieve
 * @param mode What to do if the required amount of entropy is not available
 *
 * @retval -1 Failed to get entropy
 * @return The number of entropy bytes actually retrieved, which can be less than the
 *         requested number depending on the value of @p mode
 *
 * This function is safe to call concurrently.
 *
 * @details Internally a mutex must be used to ensure that the same entropy is only
 *          delivered to a single context. Calls with @p mode set to
 *          @ref ENTROPY_MODE_DO_NOT_BLOCK will try to lock that mutex and return no
 *          entropy, if the mutex is already locked.
 */
ssize_t getentropy(void *dest, size_t length, getentroopy_mode_t mode);

/**
 * @brief Initialize the entropy backend
 *
 * @pre   This function is only called once. If the module `auto_init` is used, it will
 *        be called from there
 */
void entropy_init(void);

(A set of convenience functions to get a truly random uint32_t value etc. could be provided, if there is interest in having them. But I personally can only see two use cases for true randomness: Generating key or IVs for crypto stuff, or seeding a PRNG. In both cases those convenience functions are not really useful.)

This API could than be implemented in the most ROM and RAM efficient way by only having a single HWRNG as entropy source. As this device will never block indefinitely, getentropy() could just ignore the mode parameter and always return the amount of requested entropy. (And this should definitely a choice the user can pick from, when the device has a periph HWRNG.)

If however the user decides to not trust the HWRNG to be fully random or does not have any HWRNG , a more advanced implementation of that API could be used that mixes many sources for entropy in. That way the damage of a faulty (or backdoored) entropy source would be limited. I see two classes of providers for this entropy:

  1. On-demand sources like HWRNGs or transceivers that can read the RSSI of background noise. Those can be called from getentropy().
  2. Occasional entropy sources. E.g. the timestamp of any external interrupt will have some entropy, as even for periodic interrupts some jitter will occur. (I don't this that timer interrupts of timers fed by the same clock as the CPU will add entropy, but every other interrupt should do, I guess.) Other sources might by transceivers to only provide the RSSI of received frames. Those sources will have to call an internal API to feed their entropy into the mix whenever an entropy generating event takes places.

The advantages of the API I proposed above that I see are:

  1. Multiple sources for entropy can (optionally!) be used and mixed in to increase the "robustness" against hardware failures and backdoored entropy sources
  2. Not only "on-demand" entropy sources like hardware random number generators could be accessed in the same API. E.g. if we have a RIOT node without HWRNGs, ADCs (that can be used to sample noise), transceivers (that allows harvesting RSSI background noise) but still needs to generate crypto keys - lets say a KNX connected smart home device: The new API could still cater this use case with occasional entropy sources like external interrupts.

Copy link
Contributor

@haukepetersen haukepetersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a first look, I really don't like the proposed changes:
a) implications for non 32-bit platforms
So why not have hwrng_uint8() or hwrng_uin16() etc.? After all, the difficulty of designing an API for multi-bit-width systems is to make it 'fair' and usable for all of them

b) en/disabling of selected API functions
This is the point I struggle with most: IMHO this is very bad design practice to dis/enable certain parts of a generic API based on the used platform. I guess the practice of choice here would be to always include all API functions, but map them to shared empty implementations if not needed, as done e.g. with the typedef in the periph APIs. This way the platform dependent configuration is hidden from the API user.

c) no clear doc on power implications
In the proposed code (if I understood correctly), hwrng_uint32() does expect the user to turn on power manually before and after using it, while the existing hwrng_get() does take care of this. This must be clearly documented....

d) and lastly, I am missing the overall motivation for this API change
Is the main motivation just to enable streamlining of implementations? Or better run-time performance? I am guessing the latter, right?

If we want to streamline this API for performance, there is IMHO more to it:

  • allow for asynchronous HWRNG usage (especially for larger chunks of entropy data) -> many hardware unints do provide interrupts, DMA, etc capabilites, so the API should allow for exploiting them. This would probably also the path to follow when optimizing for run-time performance!
  • thread safety: so far the API is not thread safe, but if we start to encourage modules to use it, this should be thought through

@haukepetersen
Copy link
Contributor

@maribu this periph interface is not really meant to be a generic user facing API that is able to cover multiple different types of entropy sources (rssi, ...), but a slim API that only wraps the on-cpu hardware random number peripherals on CPUs (if present). The 'multiplexing' of different entropy sources etc is/should be actually part of the higher level random API.

@maribu
Copy link
Member

maribu commented Oct 22, 2019

@maribu this periph interface is not really meant to be a generic user facing API

OK, in that case I'm fine. Sorry for making the noise.

The 'multiplexing' of different entropy sources etc is/should be actually part of the higher level random API.

That API exists, but "only" provides PRNGs. To me it makes sense to add a second higher level API for true random (as needed for cryptographic keys). Keeping those two strictly apart makes sense, as they have totally different requirements. E.g. I could use a PRNG for visual or sound effects where actual randomness doesn't matter, but speed a lot.

@miri64
Copy link
Member

miri64 commented Oct 22, 2019

That API exists, but "only" provides PRNGs. To me it makes sense to add a second higher level API for true random (as needed for cryptographic keys). Keeping those two strictly apart makes sense, as they have totally different requirements. E.g. I could use a PRNG for visual or sound effects where actual randomness doesn't matter, but speed a lot.

#scnr the random API got so often (re-)discussed in the past ^^

@kaspar030
Copy link
Contributor

#scnr the random API got so often (re-)discussed in the past ^^

I thought the same. 😉 But I don't find the discussion with the actual API proposals that got some consensus but never got realized.

@benpicco
Copy link
Contributor Author

@haukepetersen my main motivation was to streamline the implementation because it irked me there was so much duplicated code for all the hwrng implementations.
Then I discovered tests/rng has a benchmark mode…

Adding hwrng_uint16() and hwrng_uint8() would be worthwhile too. Because I disagree with

But I personally can only see two use cases for true randomness: Generating key or IVs for crypto stuff, or seeding a PRNG. In both cases those convenience functions are not really useful.)

A PRNG just adds unnecessary code size, if a HWRNG exists why not just use that for all random needs? That's why I wanted it to have a convenient API and why I wanted to avoid having the overhead of enabling / disabling the peripheral for each hwrng_uint32() call.

@maribu
Copy link
Member

maribu commented Oct 22, 2019

A PRNG just adds unnecessary code size, if a HWRNG exists why not just use that for all random needs?

That is indeed a good point. Still, you don't want to use the periph API for that in order to have a fall back. But keep in mind that the output of PRNGs have some interesting properties that people could rely on, e.g.:

  1. When initialized with the same seed, the exact same sequence can be reproduced
  2. Most PRNGs have a full period. (E.g. when I request 2^32 32 bit values, the PRNG will return every possible 32 bit number exactly once.)

People exploiting one of theses properties and now getting "proper" randomness instead, might have issues.

@maribu
Copy link
Member

maribu commented Oct 22, 2019

I thought the same. wink But I don't find the discussion with the actual API proposals that got some consensus but never got realized.

Please share if you can find the discussion. I'd like to finally create a PR for RIOT that is actually related to my field of research 😆

@benpicco
Copy link
Contributor Author

benpicco commented Oct 22, 2019

I don't think a 'one size fits all' approach works here.

At a previous project (that didn't use RIOT), we had two needs for randomness:

  • a system RNG that is used for common randomness. Think CSMA backoff times.
  • a seed-able, instantiable PRNG. This was to generate random sequences that should be the same on all nodes.

The first one might as well be a HWRNG, but we didn't have that so we used tinymt32 seeded with the CPU ID, but hidden behind a simple API so you could just get a random number of any desirable size. (random_32(), random_16(), random_8(), random_float())

The second was used by the application, those were multiple instances of tinymt32.

So my conclusion would be, if a user wants predictable randomness, the tinymt32 API is good enough. On a system level predictable randomness can not be guaranteed even with a PRNG unless you also control outside events that might trigger a PRNG read.
And for unit tests you can just mock it.

@PeterKietzmann
Copy link
Member

I would really like to keep this discussion as one about a peripheral interface. Dealing with our random API, entropy of different sources or PRNGs is an other (and possibly larger) discussion that I'd like to hold soon, but separated from this one.

@stale
Copy link

stale bot commented Apr 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Apr 25, 2020
@stale stale bot closed this May 26, 2020
@maribu
Copy link
Member

maribu commented May 29, 2020

I like the idea of the PR. However, this idea is not exploited: Most HWRNG implementations in this PR now end up implementing both hwrng_read() and hwrng_uint32().

The maintainability would be increased if only hwrng_uint32() would be provided, as this is usually simpler to implement compared to hwrng_read(). This would however come for the price that hwrn_read() gets slower with the generic implementation, as accessing the HWRNG usually is like one or two instructions - so cheaper than a function call. Maybe the following driver API could be used instead:

#if HWRNG_SIZE == 8
typedef uint8_t hwrng_val_t;
#elif HWRNG_SIZE == 16
typedef uint16_t hwrng_val_t;
#else
typedef uint32_t hwrng_val_t;
#endif

#if defined(HWRNG_HAS_ON) || defined(DOXYGEN)
void hwrng_on(void) {}
#endif

#if defined(HWRNG_HAS_ON) || defined(DOXYGEN)
static inline void hwrng_off(void) {}
#endif

static inline hwrng_val_t _hwrng_read(void);

And based on that generic implementations of hwrng_read(), hwrng_uint32(), hwrng_uint16(), and hwrng_uint8() are provided. This allows the specific driver to expose _hwrng_read() in an inline-able fashion, so that the generic hwrng_read() is not slower than an architecture specific one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: drivers Area: Device drivers CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Process: API change Integration Process: PR contains or issue proposes an API change. Should be handled with care. State: stale State: The issue / PR has no activity for >185 days Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants