I want to use a Cassandra database to store time series data from a test site. I am using Pattern 2 from the "Getting started with Time Series Data Modeling" tutorial but am not storing the date to limit the row size as a date, but as an int
counting the number of days elapsed since 1970-01-01, and the timestamp of the value is the number of nanoseconds since the epoch (some of our measuring devices are that precise and the precision is needed). My table for the values looks like this:
CREATE TABLE values (channel_id INT, day INT, time BIGINT, value DOUBLE, PRIMARY KEY ((channel_id, day), time))
I created a simple benchmark, taking into account using asynchronity and prepared statements for batch loading instead of batches:
def valueBenchmark(numVals: Int): Unit = {
val vs = session.prepare(
"insert into values (channel_id, day, time, " +
"value) values (?, ?, ?, ?)")
val currentFutures = mutable.MutableList[ResultSetFuture]()
for(i <- 0 until numVals) {
currentFutures += session.executeAsync(vs.bind(-1: JInt,
i / 100000: JInt, i.toLong: JLong, 0.0: JDouble))
if(currentFutures.length >= 10000) {
currentFutures.foreach(_.getUninterruptibly)
currentFutures.clear()
}
}
if(currentFutures.nonEmpty) {
currentFutures.foreach(_.getUninterruptibly)
}
}
JInt
, JLong
and JDouble
are simply java.lang.Integer
, java.lang.Long
and java.lang.Double
, respectively.
When I run this benchmark for 10 million values, this needs about two minutes for a locally installed single-node Cassandra. My computer is equipped with 16 GiB of RAM and a quad-core i7 CPU. I find this quite slow. Is this normal performance for inserts with Cassandra?
I already read these:
Are there any other things I could check?
Simple maths:
10 millions inserts/2 minutes ≈ 83 333,33333 inserts/sec which is great for a single machine, did you expect something faster?
By the way, what are the specs of your hard-drives ? SSD or spinning disks ?
You should know that massive insert scenarios are more CPU bound than I/O bound. Try to execute the same test on a machine with 8 physical cores (so 16 vcores with Hyper Threading) and compare the results.
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments