How can I generate different random values in Haskell?

Question

Suppose that I have a list like this:

let list = ["random", "foo", "random", "bar", "random", "boo"]

I want to iterate over the list and map each "random" element to a different random string:

let newList = fmap randomize list
print newList
-- ["dasidias", "foo", "gasekir", "bar", "nabblip", "boo"]

My randomize function looks like this:

randomize :: String -> String
randomize str = 
  case str of
    "random" -> randStr
    _        -> str
  where
    randStr = take 10 $ randomRs ('a','z') $ unsafePerformIO newStdGen

But I get the same random string for every "random" element:

["abshasb", "foo", "abshasb", "bar", "abshasb", "boo"]

I can't figure out why this is happening or how to get a different random value for each occurrence of "random".

Just a thought, and bear in mind that I don't know Haskell at all. In .NET, if you construct new instances of the Random class quickly in a loop and ask each one for a random number, you'll observe that you get the same result for quite a while. The reason is that the class is seeded from the computer's clock, which has a resolution of about 16 ms, so two Random instances seeded during the same 16 ms interval produce the same sequence of "random" values. Could something similar be the case here? If not, please ignore me.
– Lasse V. Karlsen, Commented Sep 7, 2019 at 19:16

unsafe functions are really unsafe, and can easily break the language. You should pretend these functions are not there. Beginners should never be informed of their existence.
– chi, Commented Sep 7, 2019 at 19:51

Daniel Wagner · Accepted Answer · 2019-09-07 19:24:57Z

There are two problems with your code:

1. You are calling unsafePerformIO, but explicitly violating the contract of that function. It is on you to prove that the thing you provide to unsafePerformIO is actually pure, and the compiler is within its rights to act as if that's the case, and here it is definitely not.
2. You are not carefully tracking the updated random number generator state after using it. Indeed, it is not possible to do this correctly with randomRs; if you use randomRs, then to a first approximation, that must be the last randomness your program needs.

The simplest fix to both of these is to admit that you really, truly are doing IO. So:

import Control.Monad
import System.Random

randomize :: String -> IO String
randomize "random" = replicateM 10 (randomRIO ('a', 'z'))
randomize other = pure other

Try it out in ghci:

> traverse randomize ["random", "foo", "random", "bar", "random", "boo"]
["xytuowzanb","foo","lzhasynexf","bar","dceuvoxkyh","boo"]

There is no call to unsafePerformIO, and so no proof burden to shirk; and randomRIO tracks the updated generator state for you in a hidden IORef, and so you correctly continue advancing it on each call.
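As a minimal sketch of how this fits into a compiled program rather than ghci: binding the result of traverse inside main gives an ordinary [String] that pure functions can consume. The use of traverse_ and reverse here is only for illustration and is not part of the original answer.

```haskell
import Control.Monad (replicateM)
import Data.Foldable (traverse_)
import System.Random (randomRIO)

-- Same definition as in the answer above.
randomize :: String -> IO String
randomize "random" = replicateM 10 (randomRIO ('a', 'z'))
randomize other    = pure other

main :: IO ()
main = do
  -- Bind the result once; newList is a plain [String] from here on.
  newList <- traverse randomize ["random", "foo", "random", "bar", "random", "boo"]
  print newList
  -- Pure functions apply to it as usual (reverse is just an example).
  traverse_ putStrLn (map reverse newList)
```

The random strings differ on every run, so no expected output is shown.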

edited Sep 7, 2019 at 19:24

answered Sep 7, 2019 at 19:17

Daniel Wagner


7 Comments

Augusto Dias Over a year ago

But now I have IO [String], not [String]. Is there a way to do it and end up with [String] or [Data.Text]?

Daniel Wagner Over a year ago

@AugustoDias No, there is no correct way to end with a [String] and no other context. You may end with a StdGen -> (StdGen, [String]) or something isomorphic; that is the closest you can get to pure for this.
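A rough sketch of the `StdGen -> (StdGen, [String])` shape mentioned in this comment, using mapAccumL to thread the generator. The names randomChars, randomizeWith and randomizeAll and the seed 42 are made up for this illustration.

```haskell
import Data.List (mapAccumL)
import System.Random (StdGen, mkStdGen, randomR)

-- Draw n lowercase letters, threading the generator through each draw.
randomChars :: Int -> StdGen -> (StdGen, String)
randomChars n gen0 = mapAccumL step gen0 (replicate n ())
  where
    step gen () = let (c, gen') = randomR ('a', 'z') gen in (gen', c)

-- Replace "random" with fresh letters; leave other strings and the generator alone.
randomizeWith :: StdGen -> String -> (StdGen, String)
randomizeWith gen "random" = randomChars 10 gen
randomizeWith gen other    = (gen, other)

-- The pure shape from the comment: generator in, generator and list out.
randomizeAll :: [String] -> StdGen -> (StdGen, [String])
randomizeAll strs gen = mapAccumL randomizeWith gen strs

main :: IO ()
main = print (snd (randomizeAll ["random", "foo", "random", "bar"] (mkStdGen 42)))
```

The caller still has to obtain an initial StdGen from somewhere, which is the "other context" the comment refers to.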

chi Over a year ago

@AugustoDias An IO [String] is allowed to generate different strings each time. A [String] is not -- it is a given, immutable list of strings. If you want to print your x :: IO [String], you can use something like x >>= traverse putStrLn (or, even better, use traverse_ after the right import). I would recommend reading about how IO works in Haskell, there should be many tutorials around the 'net.

Augusto Dias Over a year ago

I'm trying to learn Haskell, but it seems quite impractical that values cannot get out of IO once they are in it.

Daniel Wagner Over a year ago

@AugustoDias Strong disagree: I find it very practical indeed for things to advertise clearly when their behavior depends on values I, the caller, do not control. I use it approximately daily at my job; I can think of very few things that qualify more as "practice" than that for the purposes of assessing practicality.


jpmarinier · Accepted Answer · 2019-10-24 00:05:07Z

How not to involve IO in random number generation:

This question has received excellent answers. However, it might leave some readers under the impression that pseudo-random number generation (PRNG) within Haskell is necessarily linked to IO.

Well, it's not. It is just that in Haskell, the default random number generator happens to be "hosted" in the IO type. But this is by choice, not by necessity.

For reference, here is a recent review paper on the subject of PRNGs. PRNGs are deterministic mathematical automata. They do not involve IO. Using PRNGs in Haskell does not need to involve the IO type. At the bottom of this answer, I provide code that solves the problem at hand without involving the IO type, except for printing the result.

The Haskell libraries provide functions such as mkStdGen that take an integer seed and return a pseudo-random number generator, that is, a value whose type is an instance of the RandomGen class and whose state depends on the seed. Note that there is nothing magic about mkStdGen. If for some reason you do not like it, there are alternatives, such as mkTFGen, which is based on the Threefish block cipher.

Now, pseudo-random number generation is not managed in the same way in imperative languages such as C++ and in Haskell. In C++, you would extract a random value like this: rval = rng.nextVal();. On top of just returning the value, calling nextVal() has the side effect of altering the state of the rng object, ensuring that next time it will return a different random number.

But in Haskell, functions have no side effects. So you need to have something like this:

(rval, rng2) = nextVal rng1

That is, the evaluation function needs to return both the pseudo-random value and the updated state of the generator. A minor consequence is that, if the state is large (such as for the common Mersenne Twister generator), Haskell might need a bit more memory than C++.

So, we expect that solving the problem at hand, that is randomly transforming a list of strings, will involve a function with the following type signature: RandomGen tg => [String] -> tg -> ([String], tg).

For illustration purposes, let's get a generator and use it to generate a couple of "random" integers between 0 and 100. For this, we need the randomR function:

$ ghci
Prelude> import System.Random
Prelude System.Random> :t randomR
randomR :: (RandomGen g, Random a) => (a, a) -> g -> (a, g)
Prelude System.Random> 
Prelude System.Random> let rng1 = mkStdGen 544
Prelude System.Random> let (v, rng2) = randomR (0,100) rng1
Prelude System.Random> v
23
Prelude System.Random> let (v, rng2) = randomR (0,100) rng1
Prelude System.Random> v
23
Prelude System.Random> let (w, rng3) = randomR (0,100) rng2
Prelude System.Random> w
61
Prelude System.Random>

Note that above, when we forget to feed the updated state of the generator, rng2, into the next computation, we get the same "random" number 23 a second time. This is a very common mistake and a very common complaint. Function randomR is a pure Haskell function that does not involve IO. Hence it has referential transparency, that is when given the same arguments, it returns the same output value.

A possible way to deal with this situation is to pass the updated state around manually within the source code. This is cumbersome and error prone, but can be managed. That gives this style of code:

-- stateful map of randomize function for a list of strings;
-- assumes a generator-threading randomize :: RandomGen tg => String -> tg -> (String, tg)
fmapRandomize :: RandomGen tg => [String] -> tg -> ([String], tg)
fmapRandomize [] rng = ([], rng)
fmapRandomize (str:rest) rng = let (str1, rng1)  = randomize str rng
                                   (rest1, rng2) = fmapRandomize rest rng1
                               in  (str1:rest1, rng2)
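fmapRandomize relies on a randomize that accepts and returns a generator, which the answer does not spell out. One possible definition, as a sketch only (the helper randomChars is hypothetical and not part of the original answer):

```haskell
import System.Random (RandomGen, mkStdGen, randomR)

-- Hypothetical helper: draw n lowercase letters and return the final generator.
randomChars :: RandomGen tg => Int -> tg -> (String, tg)
randomChars n rng
  | n <= 0    = ([], rng)
  | otherwise = let (c, rng1)  = randomR ('a', 'z') rng
                    (cs, rng2) = randomChars (n - 1) rng1
                in  (c : cs, rng2)

-- Generator-threading variant of randomize, with the type fmapRandomize expects.
randomize :: RandomGen tg => String -> tg -> (String, tg)
randomize "random" rng = randomChars 10 rng
randomize other    rng = (other, rng)

main :: IO ()
main = print (fst (randomize "random" (mkStdGen 544)))
```

Every intermediate generator has to be named and passed on by hand, which is exactly the bookkeeping that makes this style error prone.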

Thankfully, there is a better way, which involves the runRand function or its evalRand sibling. Function runRand takes a monadic computation plus (an initial state of) a generator. It returns the pseudo-random value and the updated state of the generator. It is much easier to write the code for monadic computations than to pass the generator state manually around.

This is a possible way to solve the random string substitution problem from the question text:

import  System.Random
import  Control.Monad.Random


-- generic monadic computation to get a sequence of "count" random items:
mkRandSeqM :: (RandomGen tg, Random tv) => (tv,tv) -> Int -> Rand tg [tv]
mkRandSeqM range count = sequence (replicate count (getRandomR range))

-- monadic computation to get our sort of random string:
mkRandStrM :: RandomGen tg => Rand tg String
mkRandStrM = mkRandSeqM  ('a', 'z')  10

-- monadic single string transformation:
randomizeM :: RandomGen tg => String -> Rand tg String
randomizeM str =  if (str == "random")  then  mkRandStrM  else  (pure str)

-- monadic list-of-strings transformation:
mapRandomizeM :: RandomGen tg => [String] -> Rand tg [String]
mapRandomizeM = mapM randomizeM

-- non-monadic function returning the altered string list and generator:
mapRandomize :: RandomGen tg => [String] -> tg -> ([String], tg)
mapRandomize lstr rng = runRand  (mapRandomizeM lstr)  rng


main :: IO ()
main = do
    let inpList  = ["random", "foo", "random", "bar", "random", "boo", "qux"]
    -- get a random number generator:
    let mySeed  = 54321
    let rng1    = mkStdGen mySeed  

    -- execute the string substitutions:
    let (outList, rng2) = mapRandomize inpList rng1

    -- display results:
    putStrLn $ "inpList = " ++ (show inpList)
    putStrLn $ "outList = " ++ (show outList)

Note that above, RandomGen is the class of the generator, while Random is just the class of the generated value.

Program output:

$ random1.x
inpList = ["random","foo","random","bar","random","boo","qux"]
outList = ["gahuwkxant","foo","swuxjgapni","bar","zdjqwgpgqa","boo","qux"]
$
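The program above uses runRand and then ignores the returned generator rng2. When the final generator is not needed, the evalRand sibling mentioned earlier returns only the value. A small sketch, assuming the MonadRandom package as above; this randomizeM is a condensed variant of the one in the answer:

```haskell
import Control.Monad (replicateM)
import Control.Monad.Random (Rand, evalRand, getRandomR)
import System.Random (RandomGen, mkStdGen)

-- Condensed variant of randomizeM from the answer.
randomizeM :: RandomGen tg => String -> Rand tg String
randomizeM "random" = replicateM 10 (getRandomR ('a', 'z'))
randomizeM other    = pure other

main :: IO ()
main = do
  let inpList  = ["random", "foo", "random", "bar"]
      run seed = evalRand (mapM randomizeM inpList) (mkStdGen seed)
  print (run 54321)
  -- Same seed, same result: the whole computation is pure.
  print (run 54321 == run 54321)
```

The second line printed is True, since evaluating a pure expression twice with the same seed cannot give different results.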

Robin Zigmond · Accepted Answer · 2019-09-07 19:39:13Z

The fundamental problem with your approach is that Haskell is a pure language, and you're trying to use it as if it's not. In fact, this isn't the only fundamental misunderstanding of the language that your code displays.

In your randomize function:

randomize :: String -> String
randomize str = 
  case str of
    "random" -> randStr
    _        -> str
  where
    randStr = take 10 $ randomRs ('a','z') $ unsafePerformIO newStdGen

you clearly intend that randStr takes a different value each time it is used. But in Haskell, when you use the = sign, you are not "assigning a value to a variable", as would be the case in an imperative language. You are saying that these two values are equal. Since all "variables" in Haskell are actually "constant" and immutable, the compiler is perfectly entitled to assume that every occurrence of randStr in your program can be replaced by whatever value it first calculates for it.

Unlike programs in an imperative language, Haskell programs are not a sequence of statements to execute, which perform side effects such as updating state. Haskell programs consist of expressions, which are evaluated more or less in whatever order the compiler deems best. (In particular there is the main expression, which describes what your entire program will do; the compiler and runtime then convert it into executable machine code.) So when you bind a complex expression to a variable, you are not saying "at this point in the execution flow, do this calculation and assign the result to this variable". You are saying "this is the value of the variable", for "all time"; that value isn't allowed to change.

Indeed the only reason that it seems to change here is because you have used unsafePerformIO. As the name itself says, this function is "unsafe" - it should basically never be used, at least unless you really know exactly what you're doing. It is not supposed to be a way of "cheating", as you use it here, to use IO, and thereby generate an "impure" result that may be different in different parts of the program, but pretend the result is pure. It's hardly surprising that this doesn't work.

Since generating random values is inherently impure, you need to do the whole thing in the IO monad; @DanielWagner's answer shows one approach.

(There is actually another way, involving taking a random generator and functions like randomR to generate a random value together with a new generator. This allows you to do more in pure code, which is generally preferable - but it takes more effort, likely including using the State monad to simplify the threading through of the generator values, and you'll still need IO in the end to make sure you get a new random sequence each time you run the program.)
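A minimal sketch of that alternative, assuming the State monad from the mtl package (Control.Monad.State). The names randomLetter and randomizeS are invented for this example. The generator is threaded by State, and IO appears only to fetch a fresh generator and to print.

```haskell
import Control.Monad (replicateM)
import Control.Monad.State (State, evalState, state)
import System.Random (StdGen, newStdGen, randomR)

-- One random lowercase letter; `state` wraps randomR's generator threading.
randomLetter :: State StdGen Char
randomLetter = state (randomR ('a', 'z'))

-- Replace "random" with ten fresh letters; leave other strings unchanged.
randomizeS :: String -> State StdGen String
randomizeS "random" = replicateM 10 randomLetter
randomizeS other    = pure other

main :: IO ()
main = do
  gen <- newStdGen  -- the only IO needed for randomness
  let list = ["random", "foo", "random", "bar", "random", "boo"]
  print (evalState (traverse randomizeS list) gen)
```

Swapping newStdGen for a fixed mkStdGen seed would make the output reproducible, which can be handy in tests.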
