-1

I have a problem to solve. Details of it basically irrelevant & I have two valid solutions for it: in python and Haskell.

Python code:

import heapq

_, volume, *ppl = map(int, open("input.txt").read().split())
ppl = [-x for x in ppl]
heapq.heapify(ppl)
for i in range(volume):
    current = (- heapq.heappop(ppl)) // 10
    heapq.heappush(ppl, -current)

print(sum(-x for x in ppl), file=open("output.txt", 'w'))

Haskell code:

import Data.List hiding (insert, singleton)

type Rank = Int

data Heap a = Tip | Node Rank Int (Heap Int) (Heap Int)

rank Tip = 0
rank (Node r _ _ _) = r

fromList :: [Int] -> Heap Int
fromList [] = Tip
fromList (x:xs) = foldl' (flip insert) (singleton x) xs

makeHeap :: Int -> Heap Int -> Heap Int -> Heap Int
makeHeap x a b = if rank a >= rank b then Node (rank b + 1) x a b
                                     else Node (rank a + 1) x b a

empty :: Heap a
empty = Tip

singleton :: Int -> Heap Int
singleton x = Node 1 x Tip Tip

insert :: Int -> Heap Int -> Heap Int
insert x = merge (singleton x)

merge :: Heap Int -> Heap Int -> Heap Int
merge l Tip = l
merge Tip r = r
merge h1@(Node _ x l1 r1) h2@(Node _ y l2 r2) =
  if x > y then makeHeap x l1 (merge r1 h2)
            else makeHeap y l2 (merge h1 r2)

-- | O(1).
peek :: Heap Int -> Int
peek Tip = 0
peek (Node _ x _ _) = x

-- | O(1), but evaluating the second element of the tuple has same complexity
-- of `merge`.
extract :: Heap Int -> (Int, Heap Int)
extract (Node _ x a b) = (x, merge a b)

toSum :: Heap Int -> Int
toSum Tip            = 0
toSum (Node _ x a b) = x + toSum a + toSum b 

solve :: Int -> Heap Int -> Int
solve 0 heap = toSum heap
solve _ (Node _ 0 _ _) = 0
solve sips ppl = solve (sips - 1) (insert (val `div` 10) h)
    where (val, h) = extract ppl
    


main :: IO()
main = do
    line <- readFile "input.txt"
    let _:volume:ppl = map(read :: String -> Int) $ words line
    let heap = fromList ppl
    writeFile "output.txt" $ show $ solve volume heap

The question borrowed is in the following slice of the results table:

N of test Verdict Time (sec) python Time (sec) Haskell
19 OK 0.0312 0.0156
20 OK 0.0624 0.172
21 OK 0.078 0.733
22 OK 0.234 1.34
23 OK 0.218 1.34

Tests itself is unavailable to me.

So... Seems like my Haskell code for some reason is significantly slower with high amount of values (~10^5). Could someone explain me why and how to fix it?

12
  • 3
    probably it's because of your so-called heap implementation in Haskell is not how a normal heap is implemented and therefore not being very efficient? Have you tried with hackage.haskell.org/package/heap-1.0.4/docs/Data-Heap.html ? Or is this homework? if so, please say so. Commented Oct 26, 2024 at 14:53
  • I cannot "try" it in the test system. If my implementation is inefficient (what's obviously the case) I want to know why. Commented Oct 26, 2024 at 15:05
  • 1
    Why? It seems that your makeHeap and merge are both O(n). Have a look at en.wikipedia.org/wiki/Heap_(data_structure)#Implementation In Haskell, hackage.haskell.org/package/vec is the normal go-to lib if you need an array (which is not just a linked list). Commented Oct 26, 2024 at 15:40
  • 2
    extract should be O(lg n), and it's not obvious that merge is O(lg n) as it needs to be. Commented Oct 26, 2024 at 16:22
  • 1
    You might find RosettaCode.org interesting and useful: rosettacode.org/wiki/Priority_queue#Haskell Commented Oct 26, 2024 at 17:42

1 Answer 1

2

Most likely, the issue is with reading the input file rather than your algorithm.

When I benchmark reading a million integers and summing them with Python:

ppl = map(int, open("random.dat").read().split())
print(sum(ppl))

I get a baseline time of about 100ms.

If I compare that with a version patterned after your Haskell code:

main :: IO()
main = do
    line <- readFile "random.dat"
    let ppl = map (read :: String -> Int) $ words line
    print $ sum ppl

That takes about 950ms. Oof!

Switching from this String implementation to a lazy ByteString:

import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BL

main :: IO()
main = do
    line <- BL.readFile "random.dat"
    let ppl = [n | w <- BL.words line, let Just (n, _) = BL.readInt w]
    print $ sum ppl

gets it down to 43ms.

So, give that a shot.

Sign up to request clarification or add additional context in comments.

4 Comments

They said 10^5, though, not a million. If that takes 95 ms, that's only a small part of the 1.34 seconds.
link: 19 -- 0 sec | 20 -- 0.062 sec | 21 -- 0.172 sec | 22 -- 0.702 sec | 23 -- 0.718 sec | Very interesting! thanks! Still 3x slower then python though :)
You may find that adding strictness with some well placed exclamation marks helps. Writing data Heap a = Tip | Node !Rank !Int !(Heap Int) !(Heap Int) gave a significant improvement in some of my testing.
@K.A.Buhr another 50 ms win! Thank you! 22: 0.686, 23: 0.655.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.