Apache Kafka has had a major impact in a short time.
With the project reporting more than 60% of Fortune 100 companies using it today to modernize their data architecture, Kafka has proven to be a popular event streaming platform across a range of industries.
Apache Kafka is an open source software platform developed by the Apache Software Foundation that can publish, subscribe to, store, and process streams of records in real time. Some use cases of Apache Kafka include messaging, website activity tracking, metrics, and log aggregation.
Basically, if you want to move massive amounts of data quickly and at scale, you want Apache Kafka. But that’s not to say you won’t encounter some challenges when using it.
In this article, let’s take a look at some of the common challenges of using Apache Kafka, and at the options available to help you deal with them.
1) Apache Kafka is hard to set up (and learn)
Setting up and managing Apache Kafka is no cakewalk.
Are you going to run it on a physical machine? In a cloud? What do you need to consider for each? You have to figure out the networking requirements, set up the right interfaces, and work out how to segregate components for security.
And then, when you have it all up and running, you still have to tackle Day 2 operations and know how to diagnose and resolve problems as they arise. It’s complicated.
2) Apache Kafka isn’t super developer-friendly
Developers new to Apache Kafka might find it difficult to grasp concepts such as Kafka brokers, clusters, partitions, topics, and logs. The learning curve is steep. You’ll need extensive training to learn Kafka’s foundations and the core elements of an event streaming architecture.
To better understand this particular challenge, we first need to familiarize ourselves with how Kafka works.
At a high level, Apache Kafka streams data from sources (called producers) to targets (called consumers). Producers can push data into (and consumers can pull data out of) what we call Kafka topics, where our data is stored and published.
Each piece of data in a Kafka topic is a key-value pair, where the value can be serialized into data formats like Avro, JSON, or Protobuf. The structure of the data format is what we call a schema.
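To make the producer, topic, and consumer relationship concrete, here is a minimal in-memory sketch. It does not use a real Kafka client or broker: `InMemoryTopic`, `produce`, and `consume` are hypothetical names invented for illustration, and the topic name and payloads are made up. It only shows the shape of the idea, where a topic holds key-value records and the value is serialized (JSON here) before it is stored.

```python
import json


class InMemoryTopic:
    """Toy stand-in for a Kafka topic: an append-only list of key-value records."""

    def __init__(self, name):
        self.name = name
        self.log = []  # records are only ever appended

    def append(self, key, value):
        self.log.append((key, value))
        return len(self.log) - 1  # position (offset) of the new record


def produce(topic, key, payload):
    """Producer side: serialize the payload to JSON bytes and push it to the topic."""
    return topic.append(key.encode("utf-8"), json.dumps(payload).encode("utf-8"))


def consume(topic, offset):
    """Consumer side: pull every record from `offset` onward and deserialize it."""
    return [(k.decode("utf-8"), json.loads(v)) for k, v in topic.log[offset:]]


topic = InMemoryTopic("page-views")  # hypothetical topic name
produce(topic, "user-1", {"page": "/home"})
produce(topic, "user-2", {"page": "/pricing"})
records = consume(topic, 0)
```

Because the consumer pulls from a position in the log rather than having records handed to it, it can start reading from a later offset, for example `consume(topic, 1)`, without affecting anything the producer does.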
A problem developers might run into when producers and consumers use different schemas is that consumers can no longer understand what producers send: your downstream consumers will start breaking if the producer schema changes. In other words, Kafka itself does not verify the data written to a topic.
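The breakage described above can be sketched in a few lines. This is an illustration only, not how Kafka or any schema registry works internally: `EXPECTED_SCHEMA`, `validate`, and the field names are assumptions invented for the example. The point is that both messages are perfectly valid bytes, so nothing stops the second one from being written; the mismatch only surfaces when a consumer checks it.

```python
import json

# Hypothetical schema the consumer was written against: field name -> Python type.
EXPECTED_SCHEMA = {"user_id": int, "page": str}


def validate(record, schema):
    """Return a list of problems; an empty list means the record matches the schema."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems


# The producer originally sent this shape...
old_message = json.dumps({"user_id": 42, "page": "/home"}).encode("utf-8")
# ...then renamed a field and changed its type without telling consumers.
new_message = json.dumps({"userId": "42", "page": "/home"}).encode("utf-8")

old_problems = validate(json.loads(old_message), EXPECTED_SCHEMA)
new_problems = validate(json.loads(new_message), EXPECTED_SCHEMA)
```

Here `old_problems` is empty, while `new_problems` reports that `user_id` is missing. A consumer without a check like this would typically fail later, at the point where it tries to use the field.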
Another area where developers might have trouble working with Kafka is in protocol support. Because Kafka runs in a Java Virtual Machine (JVM) ecosystem, the main programming language of the client is Java. This could be a problem if your preferred language is Python or C, for example.
While there are open source clients available in other languages, these don’t come with Kafka itself. You’ll have to install and update these clients yourself to get full protocol support.
3) Spinning up Kafka connectors takes time and energy
To move large collections of data into and out of Apache Kafka, there’s a tool called Kafka Connect. Kafka Connect lets you build and run connectors (which define where data should be copied to and from) for your Kafka cluster.
Sounds great, right?
The problem is, manually creating or running connectors for Kafka Connect clusters can take up a lot of operational bandwidth that you might not have. Your team will have to spin up these connectors, provision the right infrastructure for them, and generally deal with the day-to-day operations of the cluster. Focusing on larger business challenges becomes difficult when you and your teams have to deal with running and operating Kafka Connect on a rudimentary level.
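For a sense of what creating a single connector involves, here is a rough sketch that builds (but does not send) the request a Kafka Connect worker accepts on its REST interface. Several things are assumptions for illustration: the worker address `http://localhost:8083`, the connector name, the file path, and the topic name are all made up, and `build_connector_request` is a hypothetical helper. The connector class is the simple file source example that ships with Apache Kafka, which may need to be added to the worker’s plugin path depending on the Kafka version.

```python
import json
import urllib.request


def build_connector_request(connect_url, name, config):
    """Build a POST request that asks a Kafka Connect worker to create a connector."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_connector_request(
    "http://localhost:8083",  # assumed address of a Connect worker
    "local-file-source",      # hypothetical connector name
    {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": "/tmp/input.txt",  # hypothetical source file
        "topic": "file-lines",     # hypothetical destination topic
    },
)
# urllib.request.urlopen(request) would submit it to a running Connect worker.
```

The request itself is small. The operational cost described above comes from everything around it: running the Connect workers, installing connector plugins, and monitoring and restarting tasks when they fail.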
What can help solve these challenges?
Red Hat OpenShift Streams for Apache Kafka is designed to alleviate the pain of doing all the administration work for Kafka.
With OpenShift Streams for Apache Kafka, you don’t have to worry about taking the time to get Kafka up and running. You can get started with Kafka right away with a free trial (no strings attached or credit card required) and connect to it from any application.
And if you decide that you want to stick with OpenShift Streams for Apache Kafka, you also don’t need to worry about maintaining it all day. Red Hat’s 24/7 global Site Reliability Engineering team fully manages daily operations, including:
- Monitoring
- Logging
- Upgrades
- Patching, to proactively address issues and quickly solve problems
Red Hat Cloud Services delivers a streamlined developer experience and ensures consistency in data handling across applications. Red Hat OpenShift Service Registry, for example, enables development teams to publish, discover, and communicate using well-defined data schemas with Apache Kafka.
Lastly, while OpenShift Streams for Apache Kafka does not yet handle running and operating Kafka connectors, that capability is on the roadmap. Be sure to check the OpenShift Streams for Apache Kafka homepage for updates and more resources.
About the author
Bill Cozens is a recent UNC-Chapel Hill grad interning as an Associate Blog Editor for the Red Hat Blog.