How not to involve IO in random number generation:
This question has received excellent answers. However, it might leave some readers under the impression that pseudo-random number generation (PRNG) within Haskell is necessarily linked to IO.
Well, it's not. It is just that in Haskell, the default random number generator happens to be "hosted" in the IO type. But this is by choice, not by necessity.
For reference, here is a recent review paper on the subject of PRNGs.
PRNGs are deterministic mathematical automata. They do not involve IO. Using PRNGs in Haskell does not need to involve the IO type. At the bottom of this answer, I provide code that solves the problem at hand without involving the IO type, except for printing the result.
The Haskell libraries provide functions such as mkStdGen that take an integer seed and return a pseudo-random number generator, that is, a value whose type is an instance of the RandomGen class and whose state depends on the seed. Note that there is nothing magic about mkStdGen. If for some reason you do not like it, there are alternatives, such as mkTFGen, which is based on the Threefish block cipher.
Now, pseudo-random number generation is handled differently in Haskell than in imperative languages such as C++. In C++, you would extract a random value like this: rval = rng.nextVal();. Besides returning the value, calling nextVal() has the side effect of altering the state of the rng object, ensuring that the next call returns a different random number.
But in Haskell, functions have no side effects. So you need to have something like this:
(rval, rng2) = nextVal rng1
That is, the evaluation function needs to return both the pseudo-random value and the updated state of the generator. A minor consequence is that, if the state is large (such as for the common Mersenne Twister generator), Haskell might need a bit more memory than C++.
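To make the "value plus updated state" idea concrete, here is a minimal sketch of a toy generator written with base only. The names Lcg, mkLcg, nextVal and twoVals are hypothetical and are not part of System.Random; the constants are those of a classic linear congruential generator and are used purely for illustration, not as a recommendation.

```haskell
import Data.Word (Word32)

-- Hypothetical generator state: a single 32-bit word.
newtype Lcg = Lcg Word32

-- Hypothetical seeding function, playing the role of mkStdGen.
mkLcg :: Word32 -> Lcg
mkLcg = Lcg

-- Pure "next value" step: returns the value together with the new state.
-- Word32 arithmetic wraps around, so the modulus 2^32 is implicit.
nextVal :: Lcg -> (Word32, Lcg)
nextVal (Lcg s) = (s', Lcg s')
  where
    s' = 1664525 * s + 1013904223

-- Draw two values by threading the state explicitly.
twoVals :: Lcg -> ((Word32, Word32), Lcg)
twoVals g0 =
  let (a, g1) = nextVal g0
      (b, g2) = nextVal g1
  in  ((a, b), g2)
```

Nothing here touches IO: calling twoVals (mkLcg 0) always yields the same pair, and the only way to get a fresh value is to use the returned state.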
So, we expect that solving the problem at hand, that is randomly transforming a list of strings, will involve a function with the following type signature: RandomGen tg => [String] -> tg -> ([String], tg).
For illustration purposes, let's get a generator and use it to generate a couple of "random" integers between 0 and 100. For this, we need the randomR function:
$ ghci
Prelude> import System.Random
Prelude System.Random> :t randomR
randomR :: (RandomGen g, Random a) => (a, a) -> g -> (a, g)
Prelude System.Random>
Prelude System.Random> let rng1 = mkStdGen 544
Prelude System.Random> let (v, rng2) = randomR (0,100) rng1
Prelude System.Random> v
23
Prelude System.Random> let (v, rng2) = randomR (0,100) rng1
Prelude System.Random> v
23
Prelude System.Random> let (w, rng3) = randomR (0,100) rng2
Prelude System.Random> w
61
Prelude System.Random>
Note that above, when we forget to feed the updated state of the generator, rng2, into the next computation, we get the same "random" number 23 a second time. This is a very common mistake and a very common complaint. Function randomR is a pure Haskell function that does not involve IO. It is therefore referentially transparent: given the same arguments, it returns the same result.
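A possible way to deal with this situation is to pass the updated state around manually within the source code. This is cumbersome and error-prone, but can be managed. Assuming a pure single-string function randomize of type RandomGen tg => String -> tg -> (String, tg), that gives this style of code: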
-- stateful map of randomize function for a list of strings:
fmapRandomize :: RandomGen tg => [String] -> tg -> ([String], tg)
fmapRandomize [] rng = ([], rng)
fmapRandomize (str:rest) rng =
    let (str1, rng1)  = randomize str rng
        (rest1, rng2) = fmapRandomize rest rng1
    in  (str1:rest1, rng2)
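The snippet above relies on a randomize function that is not shown. A minimal sketch of what it could look like in the same manual style follows, assuming (as in the rest of this answer) that the string "random" is replaced by 10 lowercase letters. The helper randomLetters is a hypothetical name introduced here for illustration; the code needs the random package.

```haskell
import System.Random (RandomGen, mkStdGen, randomR)

-- Hypothetical helper: draw n lowercase letters, threading the generator.
randomLetters :: RandomGen tg => Int -> tg -> (String, tg)
randomLetters n rng
  | n <= 0    = ([], rng)
  | otherwise =
      let (c, rng1)    = randomR ('a', 'z') rng
          (rest, rng2) = randomLetters (n - 1) rng1
      in  (c : rest, rng2)

-- Pure single-string transformation: only "random" consumes the generator;
-- any other string is returned unchanged together with the untouched state.
randomize :: RandomGen tg => String -> tg -> (String, tg)
randomize str rng
  | str == "random" = randomLetters 10 rng
  | otherwise       = (str, rng)

-- Example value: the replacement obtained from a fixed seed.
demo :: String
demo = fst (randomize "random" (mkStdGen 54321))
```

Every intermediate generator (rng1, rng2) has to be named and passed on by hand, which is exactly where mistakes tend to creep in.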
Thankfully, there is a better way, which involves the runRand function or its evalRand sibling, both from Control.Monad.Random (MonadRandom package). Function runRand takes a monadic computation plus (an initial state of) a generator. It returns the pseudo-random value and the updated state of the generator, whereas evalRand returns only the value. It is much easier to write monadic computations than to pass the generator state around manually.
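As a small warm-up, here is a sketch of the difference between the two functions, assuming the MonadRandom and random packages are installed. The names twoDice, rollWithState and rollOnly are hypothetical and chosen only for this illustration.

```haskell
import Control.Monad.Random (Rand, evalRand, getRandomR, runRand)
import System.Random (RandomGen, StdGen, mkStdGen)

-- Monadic computation: two dice rolls; the generator is threaded implicitly.
twoDice :: RandomGen tg => Rand tg (Int, Int)
twoDice = do
  a <- getRandomR (1, 6)
  b <- getRandomR (1, 6)
  pure (a, b)

-- runRand returns the result together with the updated generator.
rollWithState :: Int -> ((Int, Int), StdGen)
rollWithState seed = runRand twoDice (mkStdGen seed)

-- evalRand returns only the result and discards the final generator.
rollOnly :: Int -> (Int, Int)
rollOnly seed = evalRand twoDice (mkStdGen seed)
```

Both wrappers are ordinary pure functions: for a given seed, fst (rollWithState seed) and rollOnly seed are the same pair.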
This is a possible way to solve the random string substitution problem from the question text:
import System.Random
import Control.Monad.Random
-- generic monadic computation to get a sequence of "count" random items:
mkRandSeqM :: (RandomGen tg, Random tv) => (tv,tv) -> Int -> Rand tg [tv]
mkRandSeqM range count = sequence (replicate count (getRandomR range))
-- monadic computation to get our sort of random string:
mkRandStrM :: RandomGen tg => Rand tg String
mkRandStrM = mkRandSeqM ('a', 'z') 10
-- monadic single string transformation:
randomizeM :: RandomGen tg => String -> Rand tg String
randomizeM str = if (str == "random") then mkRandStrM else (pure str)
-- monadic list-of-strings transformation:
mapRandomizeM :: RandomGen tg => [String] -> Rand tg [String]
mapRandomizeM = mapM randomizeM
-- non-monadic function returning the altered string list and generator:
mapRandomize :: RandomGen tg => [String] -> tg -> ([String], tg)
mapRandomize lstr rng = runRand (mapRandomizeM lstr) rng
main = do
    let inpList = ["random", "foo", "random", "bar", "random", "boo", "qux"]
    -- get a random number generator:
    let mySeed = 54321
    let rng1 = mkStdGen mySeed
    -- execute the string substitutions:
    let (outList, rng2) = mapRandomize inpList rng1
    -- display results:
    putStrLn $ "inpList = " ++ (show inpList)
    putStrLn $ "outList = " ++ (show outList)
Note that above, RandomGen is the type class of generators, while Random is the type class of values that can be generated.
Program output:
$ random1.x
inpList = ["random","foo","random","bar","random","boo","qux"]
outList = ["gahuwkxant","foo","swuxjgapni","bar","zdjqwgpgqa","boo","qux"]
$
Random class and ask it for a random number, if you do this quickly in a loop, you'll observe that you get the same result for quite a while. The reason for this is that the class is seeded by the clock of the computer, but this clock value has a resolution of about 16ms, which means that if you seed 2 Random instances during the same 16ms interval, they'll produce the same sequence of "random" values. Could something similar be the case here? If not, please ignore me.
unsafe functions are really unsafe, and can easily break the language. You should pretend these functions are not there. Beginners should never be informed of their existence.