I am building an office PC whose heaviest task will be running Markov chain Monte Carlo (MCMC) estimations in R. This is clearly a CPU-intensive task, and R is known for being limited to a single core.
Given that the budget is limited, which CPU is better suited for the task, AMD or Intel? In particular, I am looking at the Intel Core i3-4130, with two cores clocked at 3.4 GHz, and the AMD FX-6300, with six cores clocked at 3.5 GHz and an unlocked multiplier. Which one performs better in heavy computational tasks, given the limitations of R?
Thank you.
EDIT: The only bit of information I could find on the issue is this FAQ item:
2.23 Why does R never use more than 50% of my CPU? This is a misreading of Windows' confusing Task Manager. R's computation is single-threaded, and so it cannot use more than one CPU. What the task manager shows is not the usage in CPUs but the usage as a percentage of the apparent total number of CPUs. We say "apparent" as it treats so-called "hyper-threaded" CPUs such as two CPUs per core, and most modern CPUs have at least two cores.
Am I to understand that if R can only use a single CPU thread, Intel's hyper-threading is of no consequence, and therefore AMD's higher clock will have the advantage?
EDIT2: And here are a couple of somewhat relevant benchmarks, though they don't compare CPUs, just the speed with and without hyper-threading.
EDIT3: this topic is about the same problem, albeit expressed in different terms. Without much resolution though.
EDIT4: I have found the answer, but since I doubt the question will be reopened, I'll reference it here. Please see my comment to the main post below.
The proper answer would be AMD (given this particular choice). If there are computational difficulties, e.g. the need to run very long MCMC chains, one can use JAGS or BUGS with R packages such as BRugs, R2WinBUGS, runjags, and rjags. In this case it is possible to run multiple chains for the same parameter on different cores and combine them afterwards. This video explains it. The more cores, the more chains, hence the AMD with six cores is preferable to the Intel with two physical cores (four logical cores with hyper-threading).
For example, on a six core AMD I would run six chains simultaneously:
    library(R2WinBUGS)
    re.sim <- bugs(data, inits, parameters, "model.bug",
                   n.chains = 6, n.iter = 100000, n.burnin = 3000,
                   n.thin = 2, debug = FALSE, program = "openbugs")
On the Intel CPU I would be able to run only four chains simultaneously, and at a lower clock speed. It might be interesting to note that the runjags package allows parallel execution, including on multi-machine clusters.
I believe that people who marked this post as off-topic perceived it as a very broad question, while in fact it is an extremely narrow one, requiring knowledge of R, of the software R interfaces with, of what MCMC is and does, and of how all of this combines with regard to utilizing CPU power. The answer I have provided is not subjective at all, and it is directly related to programming complex Bayesian models in R. Voted to reopen; marking it as off-topic is likely due to ignorance of what MCMC entails, focusing instead on the "AMD vs Intel" red herring.
According to CPU benchmarks, you are better off with the AMD of the two, regardless of whether or not you use R's parallel computing packages, such as parallel.
EDIT (regarding the comments etc.; doing it here because I cannot post links long enough in a comment):
Hyper-threading should in theory help you somewhat, but as the graphs in the article you hint at show, it is unlikely to help as much as actually running R in a truly parallel multicore mode (Revolution R does some of that for you whenever it can, which is probably why the CART is so much faster). You can use R across multiple cores with MCMC, as indicated in an example in this pdf, and some packages seem to do it in the background for you. So more cores looks better here, ceteris paribus.
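To illustrate the "one chain per core" idea, here is a minimal sketch using only the base parallel package. The toy random-walk Metropolis sampler, the run_chain function, and the n_chains value are my own assumptions for illustration, not taken from the linked pdf; a real model would go through JAGS/BUGS or your own sampler.

```r
library(parallel)

# Hypothetical toy sampler: random-walk Metropolis targeting a standard normal.
# It only stands in for a real model so that the example is self-contained.
run_chain <- function(seed, n_iter = 5000, n_burnin = 500) {
  set.seed(seed)                      # a different seed for every chain
  x <- numeric(n_iter)                # chain starts at 0
  for (i in 2:n_iter) {
    proposal  <- x[i - 1] + rnorm(1)
    log_ratio <- dnorm(proposal, log = TRUE) - dnorm(x[i - 1], log = TRUE)
    # accept the proposal with probability min(1, ratio)
    x[i] <- if (log(runif(1)) < log_ratio) proposal else x[i - 1]
  }
  x[-seq_len(n_burnin)]               # discard the burn-in draws
}

n_chains <- 2                         # assumption: use 6 on a six-core CPU
cl <- makeCluster(n_chains)           # one worker process per chain
chains <- parLapply(cl, seq_len(n_chains), run_chain)
stopCluster(cl)

pooled <- unlist(chains)              # combine the chains afterwards
```

Each worker is a separate single-threaded R process, so the number of chains that actually run at the same time is bounded by the number of cores. For serious work, parallel::clusterSetRNGStream is probably a safer way to get independent random streams than hand-picked seeds.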