Announcing Gafkalo

Announcing the release of Gafkalo, a tool to manage a Confluent Kafka platform.

While there are a few tools that manage Kafka resources, none of the current solutions quite fit my needs, and, as the saying goes, I scratched a personal itch.

What is Gafkalo?

It is a CLI tool that is primarily used to manage resources in the Confluent Platform, using RBAC.

You can provide it with a YAML input definition of Topics, their Key and Value schemas, permissions for any principals and the tool will make the required changes to your cluster.

An example YAML:

topics:
  - name: SKATA.VROMIA.POLY
    partitions: 6
    replication_factor: 1
    # Any topic configs can be added to this key
    configs:
      cleanup.policy: delete
      min.insync.replicas: 1
      retention.ms: 10000000
    key:
      schema: "schema-key.json"
      compatibility: BACKWARD
    value:
      schema: "schema.json"
      compatibility: NONE
  - name: SKATA.VROMIA.LIGO
    partitions: 6
    replication_factor: 3
    configs:
      cleanup.policy: delete
      min.insync.replicas: 1
    key:
      schema: "schema-key.json"
  - name: SKATA1
    partitions: 1
    replication_factor: 1

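The schema files referenced above (schema.json, schema-key.json) are ordinary Schema Registry schema definitions. As an illustration only, a minimal Avro schema that a file like schema.json could contain (the record and field names here are made up):

```json
{
  "type": "record",
  "name": "ExampleValue",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "amount", "type": ["null", "double"], "default": null}
  ]
}
```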
Having a nice set of topics and schemas is not much use if nobody can access them, so let's assign some permissions.

Gafkalo currently operates on the idea of assigning a set of roles that match a usage pattern: namely, being a consumer, a producer, or a resource owner.

For example, when assigning consumer_for to a topic, the tool will also grant read permissions on the corresponding Schema Registry subjects and, optionally, the consumer group.

Example:

clients:
  # principals must be in the form User:name or Group:name
  # For each principal you can have a consumer_for, producer_for or resourceowner_for
  # and the topics for each of these categories
  - principal: User:poutanaola
    consumer_for:
      # By default we will use PREFIXED.
      # set prefixed: false to set it to LITERAL
      - topic: TOPIC1.
      - topic: TOPIC2.
        prefixed: false
    producer_for:
      - topic: TOPIC1.
    resourceowner_for:
      - topic: TOPIC4.
  - principal: Group:malakes
    consumer_for:
      - topic: TOPIC1.
      - topic: TOPIC2.
    producer_for:
      - topic: TOPIC1.
        strict: false
    groups:
      - name: consumer-producer-
        # if not specified, roles is [DeveloperRead]
        # roles: ["ResourceOwner"]
        # prefixed is true by default but can be disabled like below
        prefixed: false

After configuring gafkalo with the required config file (pointing it to the bootstrap brokers, the Schema Registry, and all required authentication), you can see a plan of what it would do:

gafkalo plan --config myconfig.yaml

This will produce an output of the operations that would take place if you ran in apply mode.

Once you are satisfied that it's going to do the right thing, run in apply mode (yes, obviously inspired by terrafor..):

gafkalo apply --config myconfig.yaml

You will, again, get a report of what actions were taken.

Increasing the replication_factor of existing topics is not yet supported, but it should be easy to implement since the re-assignment strategy code is already present.
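For reference, raising a topic's replication factor in Kafka boils down to a partition reassignment. A minimal sketch in Python of the JSON that Kafka's own kafka-reassign-partitions.sh tool accepts (the broker IDs below are assumptions for illustration):

```python
import json

def reassignment_plan(topic, partitions, replicas):
    """Build a kafka-reassign-partitions.sh style reassignment JSON.

    `replicas` is the desired broker-id list per partition; listing more
    brokers than the current replication factor raises it.
    """
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": p, "replicas": replicas}
            for p in range(partitions)
        ],
    }

# Example: take a 1-partition topic from RF 1 to RF 3 on brokers 1, 2, 3
plan = reassignment_plan("SKATA1", 1, [1, 2, 3])
print(json.dumps(plan, indent=2))
```

Feeding the resulting file to kafka-reassign-partitions.sh with --execute performs the actual replica move.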

Debugging tool

Apart from maintaining the state of your cluster, Gafkalo can be a nice debugging tool.

Some functions are:

  • consumer
  • producer
  • schema checker
  • Get a diff between a registered schema and a provided JSON file.
    • Check if a schema is already registered under a subject
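Independent of Gafkalo, the "is this schema already registered" check maps to a single Schema Registry REST call: POST /subjects/<subject> with the schema in the body returns the matching id and version, or a 404 if nothing matches. A rough Python sketch (the registry URL and subject name are placeholders):

```python
import json
import urllib.error
import urllib.request

def check_schema_registered(registry_url, subject, schema_str):
    """Ask Schema Registry whether `schema_str` is already registered
    under `subject`. Returns the matching version info dict, or None."""
    payload = json.dumps({"schema": schema_str}).encode()
    req = urllib.request.Request(
        f"{registry_url}/subjects/{subject}",
        data=payload,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)  # contains subject, id, version, schema
    except urllib.error.HTTPError as err:
        if err.code == 404:  # schema not found under this subject
            return None
        raise
```

A tool like Gafkalo can use this answer to decide whether registering a schema would create a new version or be a no-op.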

As a debugging tool, there are plenty of features still to be added. For example, it would be quite nice to send tombstones manually. Especially when managing connectors like Debezium, it is often required to drop recorded offsets from connectors.

Consumer

Gafkalo can be used as a consumer. It supports reading from multiple topics, setting the consumer group, idempotence, and resetting partition:offset.

Additionally, it supports pointing it to a Go template file to format records any way you want!
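As an illustration only (the exact field names Gafkalo exposes to templates are not documented here, so .Topic, .Partition, .Offset, .Key, and .Value below are assumptions), a record-formatting Go template might look like:

```
{{ .Topic }}@{{ .Partition }}:{{ .Offset }} key={{ .Key }} value={{ .Value }}
```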

More details can be found in the documentation.