We are using Kafka 0.10 with Spark 2.1 and I found our producer publish messages was always slow. I can only reach around 1k/s after give 8 cores to Spark executors while other post said they car reach millions/sec easily. I tried to tune the linger.ms and batch.size to find out. However I found linger.ms=0 looks like optimal for me and the batch.size doesn't take much effect. And I was sending 160k events per iteration. Looks like I have to enable the Kafka Producer Metrics to know what exactly happen. But looks like it is not very easy to enable it in Spark Executor.
Could any one share me some lights?
My codes are like this:
private def publishMessagesAttempt(producer: KafkaProducer[String, String], topic: String, messages: Iterable[(String, String)], producerMaxDelay: Long,
individualMessageMaxDelay: Long, logger: (String, Boolean) => Unit = KafkaClusterUtils.DEFAULT_LOGGER): Iterable[(String, String)] = {
val futureMessages = messages.map(message => (message, producer.send(new ProducerRecord[String, String](topic, message._1, message._2))))
val messageSentTime = System.currentTimeMillis
val awaitedResults = futureMessages.map { case (message, future) =>
val waitFor = Math.max(producerMaxDelay - (System.currentTimeMillis - messageSentTime), individualMessageMaxDelay)
val failed = Try(future.get(waitFor, TimeUnit.MILLISECONDS)) match {
case Success(_) => false
case Failure(f) =>
logger(s"Error happened when publish to Kafka: ${f.getStackTraceString}", true)
true
}
(message, failed)
}
awaitedResults.filter(_._2).map(_._1)
}
I finally find the answer. 1. KafkaProducer has a metrics() function which can get the metrics of the producer. Just simply print it should be enough.
Some codes like this should work:
public class MetricsProducerReporter implements Runnable {
private final Producer<String, StockPrice> producer;
private final Logger logger =
LoggerFactory.getLogger(MetricsProducerReporter.class);
//Used to Filter just the metrics we want
private final Set<String> metricsNameFilter = Sets.set(
"record-queue-time-avg", "record-send-rate", "records-per-request-avg",
"request-size-max", "network-io-rate", "record-queue-time-avg",
"incoming-byte-rate", "batch-size-avg", "response-rate", "requests-in-flight"
);
public MetricsProducerReporter(
final Producer<String, StockPrice> producer) {
this.producer = producer;
}
@Override
public void run() {
while (true) {
final Map<MetricName, ? extends Metric> metrics
= producer.metrics();
displayMetrics(metrics);
try {
Thread.sleep(3_000);
} catch (InterruptedException e) {
logger.warn("metrics interrupted");
Thread.interrupted();
break;
}
}
}
この記事はインターネットから収集されたものであり、転載の際にはソースを示してください。
侵害の場合は、連絡してください[email protected]
コメントを追加