我需要预填充一个可变的IOVector(具有给定的值)。我正在使用的Haskell代码是
-- use Control.Monad, Data.Vector.Unboxed.Mutable, Data.Word, and run in IO monad
buff <- new buffsize::IO (IOVector Word8)
forM_ [0..buffsize-1] $ \p -> write buff p (100::Word8)
这比可比的C代码慢1-2个数量级
char *buff = (char *) malloc(BUFFERSIZE);
char *maxbuff = buff + BUFFERSIZE;
for(char *p = buff; p < maxbuff; p++) *p = 0;
例如,对于buffsize = 4000000000,在c中大约需要7秒,而在Haskell中大约需要3分钟。
(仅供参考,我正在使用在3.40GHz @ GHC版本7.8.4上的Intel®Core™i7-4770 CPU上运行的Ubuntu,但这些细节可能无关紧要)
有没有人看到我可以对Haskell代码进行任何更改以达到可比的速度?
三件事:
write
是经过边界检查的,而不是直接用C写的指针。unsafeWrite
如果要保持C的安全性,请将其更改为。
forM_ [0..buffsize-1]
由于针对通用性进行了优化,因此产生了开销。如果要删除所有通用性,例如C循环,请将循环写为直接递归的东西。
将llvm后端用于需要优化紧密循环的代码。
我编写了一个标准基准测试来测试大量的变体:
import Control.Monad
import Data.Vector.Unboxed.Mutable
import Data.Word
import Criterion.Main
buffsize :: Int
buffsize = 1000000
fillBuff1 :: IOVector Word8 -> IO ()
fillBuff1 buff = do
forM_ [0..buffsize-1] $ \p -> write buff p 100
fillBuff2 :: IOVector Word8 -> IO ()
fillBuff2 buff = do
forM_ [0..buffsize-1] $ \p -> unsafeWrite buff p 100
fillBuff3 :: IOVector Word8 -> IO ()
fillBuff3 buff = do
let fill n | n < buffsize = unsafeWrite buff n 100 >> fill (n + 1)
| otherwise = return ()
fill 0
fillBuff4 :: IOVector Word8 -> IO ()
fillBuff4 buff = do
let fill n | n < buffsize = write buff n 100 >> fill (n + 1)
| otherwise = return ()
fill 0
main = do
buff <- new buffsize
let b n f = bench n . whnfIO . f $ buff
defaultMain [ b "original" fillBuff1
, b "unsafeWrite" fillBuff2
, b "unsafeWrite + recursive" fillBuff3
, b "recursive" fillBuff4
]
请注意,我只是基准填充,不是分配+填充。
这是一个没有llvm的典型会话:
carl@debian:~/hask$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.8.4
carl@debian:~/hask$ ghc -O2 mutvectorwrite
[1 of 1] Compiling Main ( mutvectorwrite.hs, mutvectorwrite.o )
Linking mutvectorwrite ...
carl@debian:~/hask$ ./mutvectorwrite
benchmarking original
time 6.659 ms (6.599 ms .. 6.728 ms)
0.999 R² (0.998 R² .. 1.000 R²)
mean 6.638 ms (6.599 ms .. 6.683 ms)
std dev 120.7 μs (97.36 μs .. 165.9 μs)
benchmarking unsafeWrite
time 5.413 ms (5.319 ms .. 5.524 ms)
0.998 R² (0.995 R² .. 0.999 R²)
mean 5.346 ms (5.309 ms .. 5.394 ms)
std dev 127.4 μs (85.00 μs .. 220.2 μs)
benchmarking unsafeWrite + recursive
time 3.363 ms (3.323 ms .. 3.409 ms)
0.999 R² (0.998 R² .. 0.999 R²)
mean 3.371 ms (3.343 ms .. 3.411 ms)
std dev 104.6 μs (65.11 μs .. 187.1 μs)
variance introduced by outliers: 16% (moderately inflated)
benchmarking recursive
time 3.389 ms (3.330 ms .. 3.438 ms)
0.998 R² (0.996 R² .. 1.000 R²)
mean 3.435 ms (3.424 ms .. 3.451 ms)
std dev 43.38 μs (34.49 μs .. 67.38 μs)
和llvm的典型会话:
carl@debian:~/hask$ ghc -O2 -fllvm mutvectorwrite
[1 of 1] Compiling Main ( mutvectorwrite.hs, mutvectorwrite.o )
Linking mutvectorwrite ...
carl@debian:~/hask$ ./mutvectorwrite
benchmarking original
time 5.302 ms (5.251 ms .. 5.365 ms)
0.999 R² (0.999 R² .. 1.000 R²)
mean 5.286 ms (5.262 ms .. 5.322 ms)
std dev 87.47 μs (63.29 μs .. 115.0 μs)
benchmarking unsafeWrite
time 3.929 ms (3.867 ms .. 4.001 ms)
0.998 R² (0.996 R² .. 0.999 R²)
mean 4.039 ms (3.994 ms .. 4.131 ms)
std dev 204.2 μs (114.6 μs .. 378.5 μs)
variance introduced by outliers: 30% (moderately inflated)
benchmarking unsafeWrite + recursive
time 496.4 μs (492.8 μs .. 500.8 μs)
0.999 R² (0.998 R² .. 1.000 R²)
mean 496.6 μs (492.8 μs .. 503.9 μs)
std dev 17.46 μs (9.971 μs .. 31.42 μs)
variance introduced by outliers: 27% (moderately inflated)
benchmarking recursive
time 556.6 μs (548.4 μs .. 563.8 μs)
0.998 R² (0.996 R² .. 0.999 R²)
mean 565.4 μs (559.7 μs .. 574.3 μs)
std dev 23.95 μs (16.41 μs .. 33.78 μs)
variance introduced by outliers: 35% (moderately inflated)
当您将所有内容组合在一起时,性能会下降到相当合理的水平。
本文收集自互联网,转载请注明来源。
如有侵权,请联系[email protected] 删除。
我来说两句