This is obv waaay late but I thought I'd post anyways for future readers' benefit (I imagine OP is long done with this problem).
TL;DR:
I think we probably want to use the Data.Vector package for this problem (and similar types of problems).
Longer version:
According to the Haskell docs, a Map (from Data.Map) has O(log N) access time whereas a Vector (from Data.Vector) has O(1) access; we can see the difference in the results below: the vector implementation runs ~3x faster. (Both are way better than lists which have O(N) access time.)
A couple of benchmarks are included below. The tests were intentionally not run one after another so as to prevent any cache-based optimization.
A couple of observations:
The largest absolute improvement (from the code in the original post) was due to the addition of type signatures; without being explicitly told that the data was of type Int, Haskell's type system was inferring that the data was of type Integer (which is obv bigger and slower)
A bit counterintuitive but, results are virtually indistinguishable between foldl1' and foldl1. (I double checked the code and ran these a few times just to make sure.)
Vector and Array (and, to a certain extent, Map) allow for decent improvement primarily as a result of memoization. (Note that OP's solution is likely a lot faster than a list-based solution that tried to use memoization given lists' O(N) access time.)
Here are a couple of benchmarks (all compiled using O2):
Probably want to look
at these numbers
|
V
Data.Vector 0.35s user 0.10s system 97% cpu 0.468 total
Data.Array (Haskell.org) 0.31s user 0.21s system 98% cpu 0.530 total
Data.Map (above answer) 1.31s user 0.46s system 99% cpu 1.772 total
Control.Parallel (Haskell.org) 1.75s user 0.05s system 99% cpu 1.799 total
OP (`Int` type sigs + foldl') 3.60s user 0.06s system 99% cpu 3.674 total
OP (`Int` type sigs) 3.53s user 0.16s system 99% cpu 3.709 total
OP (verbatim) 3.36s user 4.77s system 99% cpu 8.146 total
Source of figures from Haskell.org: https://www.haskell.org/haskellwiki/Euler_problems/11_to_20#Problem_14
The Data.Vector implementation used to generate the above results:
import Data.Vector ( Vector, fromList, maxIndex, (!) )
main :: IO ()
main = putStrLn $ show $ largestCollatz 1000000
largestCollatz :: Int -> Int
largestCollatz n = maxIndex vector
where
vector :: Vector Int
vector = fromList $ 0 : 1 : [collatz x x 0 | x <- [2..n]]
collatz m i c =
case i < m of
True -> c + vector ! i
False -> let j = if even i then i `div` 2 else 3*i + 1
in collatz m j (c+1)
foldl1->foldl1', usequotReminstead ofevenanddiv, specialize toIntif possible, and add memoization. I feel like we've answered this question a couple times before, though, so searching for a duplicate before fleshing this out to an answer...ghci(interpreter), do you? Because when compiled withghc -O2your code outputs the answer in 0.14s for 20000 and in 10s for 1000000 on my machine. It still can probably be optimised but just to make sure we are doing it right from the start.Int(on a 64-bit GHC), and usingquotinstead ofdiv, compiling withghc -O2 -fllvm, it takes less than 1 second here. Note, however, that you don't need the length of the longest chain, but the starting number.