Partition By Multiple Nested Fields in Kafka Connect HDFS Sink

rookie

We are running the Kafka HDFS sink connector (version 5.2.1) and need the HDFS data to be partitioned by multiple nested fields. The data in the topics is stored as Avro and has nested elements. However, Connect cannot recognize the nested fields and throws an error saying the field cannot be found. Below is the connector configuration we are using. Doesn't the HDFS sink connector support partitioning by nested fields? I can partition by non-nested fields.

{
  "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
  "topics.dir": "/projects/test/kafka/logdata/coss",
  "avro.codec": "snappy",
  "flush.size": "200",
  "connect.hdfs.principal": "[email protected]",
  "rotate.interval.ms": "500000",
  "logs.dir": "/projects/test/kafka/tmp/wal/coss4",
  "hdfs.namenode.principal": "hdfs/[email protected]",
  "hadoop.conf.dir": "/etc/hdfs",
  "topics": "test1",
  "connect.hdfs.keytab": "/etc/hdfs-qa/test.keytab",
  "hdfs.url": "hdfs://nameservice1:8020",
  "hdfs.authentication.kerberos": "true",
  "name": "hdfs_connector_v1",
  "key.converter": "org.apache.kafka.connect.storage.StringConverter",
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "value.converter.schema.registry.url": "http://myschema:8081",
  "partition.field.name": "meta.ID,meta.source,meta.HH",
  "partitioner.class": "io.confluent.connect.storage.partitioner.FieldPartitioner"
}

OneCricketeer

I added nested field support for the TimestampPartitioner, but nested field support for the FieldPartitioner is still in an outstanding PR:

https://github.com/confluentinc/kafka-connect-storage-common/pull/67
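Until that PR is available, one possible workaround is to flatten the record value with Kafka's built-in `Flatten` single message transform, so that the nested fields become top-level fields the FieldPartitioner can find. The fragment below is only a sketch of the properties that would be added to or changed in the configuration from the question. The transform alias `flatten` is an arbitrary name, and the `_` delimiter is an assumption, chosen because Avro field names cannot contain dots. Note that this also changes what gets written to HDFS: the stored records will have the flattened schema (`meta_ID`, `meta_source`, `meta_HH`, and so on) rather than the original nested one, so it only fits if that is acceptable downstream.

```json
{
  "transforms": "flatten",
  "transforms.flatten.type": "org.apache.kafka.connect.transforms.Flatten$Value",
  "transforms.flatten.delimiter": "_",
  "partition.field.name": "meta_ID,meta_source,meta_HH",
  "partitioner.class": "io.confluent.connect.storage.partitioner.FieldPartitioner"
}
```

With these settings the output directories would be expected to follow the usual FieldPartitioner layout, one `field=value` level per listed field, but this has not been verified against version 5.2.1.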
