
Schema Registry

Confluent Schema Registry provides a serving layer for your metadata. It exposes a RESTful interface for storing and retrieving Avro®, JSON Schema, and Protobuf schemas. It stores a versioned history of all schemas based on a specified subject name strategy, provides multiple compatibility settings, and allows schemas to evolve according to the configured compatibility setting, with expanded support across these schema types. It also provides serializers that plug into Apache Kafka® clients and handle schema storage and retrieval for Kafka messages sent in any of the supported formats.
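
For example, a producer only needs to point the Confluent serializers at the registry; schema registration and lookup then happen transparently on send. A minimal sketch, assuming a broker on localhost:9092 and a registry on localhost:8081 (the topic name is illustrative):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // The Confluent serializers handle schema registration and lookup.
        props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        try (KafkaProducer<Object, Object> producer = new KafkaProducer<>(props)) {
            // Java Strings map to the Avro "string" primitive schema.
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}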

This README includes the following sections:

  • Documentation
  • Quickstart API Usage examples
  • Installation
  • Deployment
  • Development
  • OpenAPI Spec
  • Contribute
  • License

Documentation

See the Schema Registry pages in the Confluent Documentation at docs.confluent.io.

Quickstart API Usage examples

The following assumes you have Kafka and an instance of the Schema Registry running using the default settings. These examples, and more, are also available at API Usage examples on docs.confluent.io.

# Register a new version of a schema under the subject "Kafka-key"
$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"schema": "{\"type\": \"string\"}"}' \
    http://localhost:8081/subjects/Kafka-key/versions
  {"id":1}

# Register a new version of a schema under the subject "Kafka-value"
$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"schema": "{\"type\": \"string\"}"}' \
    http://localhost:8081/subjects/Kafka-value/versions
  {"id":1}

# List all subjects
$ curl -X GET http://localhost:8081/subjects
  ["Kafka-value","Kafka-key"]

# List all schema versions registered under the subject "Kafka-value"
$ curl -X GET http://localhost:8081/subjects/Kafka-value/versions
  [1]

# Fetch a schema by globally unique id 1
$ curl -X GET http://localhost:8081/schemas/ids/1
  {"schema":"\"string\""}

# Fetch version 1 of the schema registered under subject "Kafka-value"
$ curl -X GET http://localhost:8081/subjects/Kafka-value/versions/1
  {"subject":"Kafka-value","version":1,"id":1,"schema":"\"string\""}

# Fetch the most recently registered schema under subject "Kafka-value"
$ curl -X GET http://localhost:8081/subjects/Kafka-value/versions/latest
  {"subject":"Kafka-value","version":1,"id":1,"schema":"\"string\""}

# Delete version 3 of the schema registered under subject "Kafka-value"
$ curl -X DELETE http://localhost:8081/subjects/Kafka-value/versions/3
  3

# Delete all versions of the schema registered under subject "Kafka-value"
$ curl -X DELETE http://localhost:8081/subjects/Kafka-value
  [1, 2, 3, 4, 5]

# Check whether a schema has been registered under subject "Kafka-key"
$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"schema": "{\"type\": \"string\"}"}' \
    http://localhost:8081/subjects/Kafka-key
  {"subject":"Kafka-key","version":1,"id":1,"schema":"\"string\""}

# Test compatibility of a schema with the latest schema under subject "Kafka-value"
$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"schema": "{\"type\": \"string\"}"}' \
    http://localhost:8081/compatibility/subjects/Kafka-value/versions/latest
  {"is_compatible":true}

# Get top level config
$ curl -X GET http://localhost:8081/config
  {"compatibilityLevel":"BACKWARD"}

# Update compatibility requirements globally
$ curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"compatibility": "NONE"}' \
    http://localhost:8081/config
  {"compatibility":"NONE"}

# Update compatibility requirements under the subject "Kafka-value"
$ curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"compatibility": "BACKWARD"}' \
    http://localhost:8081/config/Kafka-value
  {"compatibility":"BACKWARD"}

Installation

You can download prebuilt versions of the schema registry as part of the Confluent Platform. To install from source, follow the instructions in the Development section.

Deployment

The REST interface to the schema registry includes a built-in Jetty server. The wrapper scripts bin/schema-registry-start and bin/schema-registry-stop are the recommended way to start and stop the service.

Development

To build a development version, you may need development versions of common and rest-utils. After installing those, you can build the Schema Registry with Maven.

This project uses the Google Java code style to keep code clean and consistent.

To build:

mvn compile

To run the unit and integration tests:

mvn test

To run an instance of Schema Registry against a local Kafka cluster (using the default configuration included with Kafka):

mvn exec:java -pl :kafka-schema-registry -Dexec.args="config/schema-registry.properties"

To create a packaged version, optionally skipping the tests:

mvn package [-DskipTests]

It produces:

  • Schema registry in package-schema-registry/target/kafka-schema-registry-package-$VERSION-package
  • Serde tools for avro/json/protobuf in package-kafka-serde-tools/target/kafka-serde-tools-package-$VERSION-package

Each of the produced packages contains a directory layout similar to the packaged binary versions.

You can also produce a standalone fat JAR of schema registry using the standalone profile:

mvn package -P standalone [-DskipTests]

This generates package-schema-registry/target/kafka-schema-registry-package-$VERSION-standalone.jar, which includes all the dependencies as well.

OpenAPI Spec

OpenAPI (formerly known as Swagger) specifications are generated automatically by swagger-maven-plugin during the compile phase.

Contribute

Thanks for helping us to make Schema Registry even better!

License

The project is licensed under the Confluent Community License, except for the client-* and avro-* libs, which are under the Apache 2.0 license. See the LICENSE file in each subfolder for the detailed license agreement.

Issues

Add multi colo setup support

Multi colo setup for the schema registry includes setting up MirrorMaker for the schema registry log topic from the schema registry cluster in the master colo to the ones in the slave colos. Before setting up MirrorMaker, the admin also needs to delete the schema registry log topic from the slave schema registry cluster, if one exists. It would be good to have scripts and documentation for this.

Qualify 500 error codes with more details about the type of server side error

As per @ewencp's suggestion:

More specific error codes to qualify 500 errors are useful. For example, we could have a 50001 - Kafka error, which generically covers underlying issues with communicating with Kafka. That's generic, doesn't require defining hundreds of error codes, but still gives more useful information than just a plain 500 (which many people interpret as a buggy web service rather than a legitimate issue with another service).

Evaluate and improve performance of de/serializers

The AbstractKafkaAvroSerializer and AbstractKafkaAvroDeserializer both use the straightforward implementation of Avro serialization that allocates a new encoder and DatumWriter for every call.

We should start by adding a small microbenchmark tool to evaluate performance, then add some caching. Any cache implementation needs to be careful to handle multiple threads well. Additionally, to support applications like kafka-rest that may be long-running, use a lot of schemas, and survive across schema upgrades, it should be able to expire and dispose of cached DatumWriters.
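
A minimal sketch of the caching idea, keyed by schema with a ConcurrentHashMap (a production version would also need to bound the cache and expire entries, as noted above):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.EncoderFactory;

public class CachingAvroSerializer {
    // Avro's Schema implements equals/hashCode, so it can key the cache directly.
    private final Map<Schema, DatumWriter<Object>> writerCache = new ConcurrentHashMap<>();

    public byte[] serialize(Schema schema, Object datum) throws IOException {
        // Reuse one DatumWriter per schema instead of allocating on every call.
        DatumWriter<Object> writer =
            writerCache.computeIfAbsent(schema, s -> new GenericDatumWriter<>(s));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(datum, encoder);
        encoder.flush();
        return out.toByteArray();
    }
}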

Update API

Update API to reflect today's discussion. We decided that the HTTP POST method is legitimately used to make queries when data in the message body is required. The argument against using GET was that passing message-body data is uncommon and may not be supported by some HTTP libraries.

Change compatibility check:

  • Add support for POST /compatibility/subjects/(subject)/versions/(version)
    the schema string in the message body is tested against the schema registered under (subject)/(version)
    return true, or false with incompatibility metadata produced by Avro
    this includes adding support for (version) == 'latest'
  • Remove dry_run support

Add the ability to check whether a schema is registered under a subject:

  • POST /subjects/(subject)
    schema string in the message body
    return the matching schema's subject, version, id, and schema, if it is registered (as in the quickstart example above)

Change /schemas/(id) to /schemas/ids/(id)

We also concluded that a POST to a collection resource should canonically be considered a create request, but that POST used elsewhere can be more flexible. E.g. POST /subjects/(subject)/versions creates a version under the collection of versions.

Clean up dependencies

It looks like we have some extra dependencies that aren't actually required. At a minimum, I see Jersey, which should be pulled in transitively via rest-utils. In order to generate binary packages as efficiently as possible we should get rid of any dependencies that aren't strictly required and aren't provided transitively by other dependencies.

schema registration - check local caches before forwarding request

Currently a register request is forwarded immediately to the master if the given process is not the master. The cache check to see whether the schema has already been registered should take place locally before forwarding the request.

This should help latency and decrease sensitivity to partitions, particularly in a multicolo setup.
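
A hypothetical sketch of the proposed flow (the names below are illustrative and do not match the real KafkaSchemaRegistry internals):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RegisterRequestHandler {
    // Stand-in for the locally replicated schema store.
    private final Map<String, Integer> localCache = new ConcurrentHashMap<>();
    private final boolean isMaster;

    public RegisterRequestHandler(boolean isMaster) {
        this.isMaster = isMaster;
    }

    public int register(String subject, String schema) {
        // 1. Check the local cache first: if this exact schema is already
        //    registered under the subject, answer without any forwarding.
        Integer id = localCache.get(subject + "|" + schema);
        if (id != null) {
            return id;
        }
        // 2. Only take the expensive path for genuinely new schemas.
        return isMaster ? registerLocally(subject, schema) : forwardToMaster(subject, schema);
    }

    private int registerLocally(String subject, String schema) {
        int id = localCache.size() + 1; // placeholder id allocation
        localCache.put(subject + "|" + schema, id);
        return id;
    }

    private int forwardToMaster(String subject, String schema) {
        throw new UnsupportedOperationException("forward the request to the master over HTTP");
    }
}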

Change schema registry APIs to use subjects

Changed the APIs to refer to subjects as discussed last week. Also added schema retrieval using a globally unique id. The id is allocated on the master by allocating a batch of ids from zookeeper and using those to assign to schemas during registration.

Simplifying the request in registering schemas

Currently, we use the following HTTP body in the schema registration request:
'{"schema": "{\"type\": \"string\"}"}'

Now that we don't support deprecation, we can probably just simplify it to a string (that contains the schema).

Clean up flags in Schema.java

The "compatible" and "latest" boolean flags in Schema class should probably be removed - these values can be derived.

KAFKASTORE_TIMEOUT_CONFIG is too low in VMs

Tests are failing when run in VMs because this value is too low for some operations. Here's the stacktrace:

Running io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClientTest
[2015-02-03 21:47:55,533] INFO SchemaRegistryConfig values:
        port = 41077
        kafkastore.timeout.ms = 500
        kafkastore.write.max.retries = 5
        debug = true
        request.logger.name = io.confluent.rest-utils.requests
        metrics.sample.window.ms = 30000
        kafkastore.zk.session.timeout.ms = 10000
        kafkastore.topic = _schemas
        kafkastore.write.retry.backoff.ms = 100
        avro.compatibility.level = none
        shutdown.graceful.ms = 1000
        response.mediatype.preferred = [application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json]
        metrics.jmx.prefix = kafka.schema.registry
        host.name = localhost
        metric.reporters = []
        kafkastore.commit.interval.ms = -1
        kafkastore.connection.url = 127.0.0.1:60125
        metrics.num.samples = 2
        response.mediatype.default = application/vnd.schemaregistry.v1+json
        kafkastore.topic.replication.factor = 3
 (io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig:135)
[2015-02-03 21:47:56,578] INFO Initialized the consumer offset to -1 (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread:80)
[2015-02-03 21:47:56,955] WARN Creating the schema topic _schemas using a replication factor of 1, which is less than the desired one of 3. If this is a production environment, it's crucial to add more brokers and increase the replication factor of the topic. (io.confluent.kafka.schemaregistry.storage.KafkaStore:169)
[2015-02-03 21:47:57,489] INFO [kafka-store-reader-thread-_schemas], Starting  (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread:68)
[2015-02-03 21:47:58,009] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication:57)
io.confluent.kafka.schemaregistry.storage.exceptions.SchemaRegistryException: Error while initializing the datastore
        at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:128)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:55)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:37)
        at io.confluent.rest.Application.createServer(Application.java:104)
        at io.confluent.kafka.schemaregistry.RestApp.start(RestApp.java:51)
        at io.confluent.kafka.schemaregistry.ClusterTestHarness.setUp(ClusterTestHarness.java:135)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
        at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
        at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreInitializationException: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Can't fetch latest offset of Kafka topic _schemas after 500 ms
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:148)
        at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:126)
        ... 35 more
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Can't fetch latest offset of Kafka topic _schemas after 500 ms
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.getLatestOffsetOfKafkaTopic(KafkaStore.java:347)
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.waitUntilBootstrapCompletes(KafkaStore.java:217)
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:146)
        ... 36 more

I was going to just submit a patch to increase it, but it looks like this timeout is used in quite a few places. I suspect the issue here is that the tests rely on topic auto creation and in the VM that is taking longer for some reason.

I think the right solution is probably to add a second timeout option which is used in certain situations, like during KafkaStore.init(), which should be more lax since we sometimes expect it to take longer. Thoughts from people who are more familiar with this code?
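
One way to express the proposed second timeout, sketched with Kafka's ConfigDef API (the config name and default below are illustrative, not the shipped configuration):

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigDef.Importance;
import org.apache.kafka.common.config.ConfigDef.Type;

public class TimeoutConfigSketch {
    public static final ConfigDef CONFIG = new ConfigDef()
        // Existing tight timeout for regular store operations.
        .define("kafkastore.timeout.ms", Type.INT, 500,
                Importance.MEDIUM, "Timeout for regular Kafka store operations.")
        // Hypothetical laxer timeout used only during KafkaStore.init(),
        // where topic auto-creation can legitimately take much longer.
        .define("kafkastore.init.timeout.ms", Type.INT, 60000,
                Importance.MEDIUM, "Timeout for bootstrapping the Kafka store.");
}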

fix /topics

Currently, /topics returns a list of the form
["topic/subtype", ...]

instead of listing all Kafka topics.

Use a key serializer in KafkaSchemaRegistry

Methods such as register, get, getAll, getAllVersions manually serialize keys with logic like:
String keyEarliestVersion = topic + "," + earliestVersion;
String keyLatestVersion = topic + "," + latestVersion;

This should probably be done with a key serializer.
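
An illustrative sketch of what a typed key with a single serialization point might look like (the names are hypothetical):

import java.nio.charset.StandardCharsets;

public class SchemaKey {
    public final String subject;
    public final int version;

    public SchemaKey(String subject, int version) {
        this.subject = subject;
        this.version = version;
    }

    // One place that knows the wire format, replacing the scattered
    // topic + "," + version concatenations.
    public byte[] serialize() {
        return (subject + "," + version).getBytes(StandardCharsets.UTF_8);
    }
}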

Get canonical avro string from an Avro schema

The canonical string form of an Avro schema strips off docs and aliases. schema.toString() doesn't seem to be canonical (e.g., reordering fields gives different strings). We may need to implement a full canonical string representation of an Avro schema ourselves.
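
One option worth evaluating is Avro's built-in SchemaNormalization, whose Parsing Canonical Form strips doc attributes and aliases and fixes the order of JSON attributes; whether it satisfies all of the registry's requirements is an open question. A small example:

import org.apache.avro.Schema;
import org.apache.avro.SchemaNormalization;

public class CanonicalFormExample {
    public static void main(String[] args) {
        String def = "{\"type\":\"record\",\"name\":\"myrecord\","
            + "\"doc\":\"dropped in canonical form\","
            + "\"fields\":[{\"name\":\"f1\",\"type\":\"string\"}]}";
        Schema schema = new Schema.Parser().parse(def);
        // Prints the schema in Parsing Canonical Form, without the doc.
        System.out.println(SchemaNormalization.toParsingForm(schema));
    }
}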

schema request result format

When fetching a schema by id, it doesn't make sense to return 'version'.

Also note that the 'name' returned when fetching by id will be the name of one of the subjects used when registering the schema.

Fetching a schema by subject/version gives you data in this format:

e.g. GET /subjects/geoff/versions/1
{"id": 0,
 "name": "geoff",
 "schema": "{\"type\":\"record\",\"name\":\"myrecord\",\"fields\":[{\"name\":\"f11\",\"type\":\"string\"}]}",
 "version": 1}

Fetching a schema by id gives the same format:

e.g. GET /schemas/0
{"id": 0,
 "name": "geoff",
 "schema": "{\"type\":\"record\",\"name\":\"myrecord\",\"fields\":[{\"name\":\"f11\",\"type\":\"string\"}]}",
 "version": 1}

Byte, Character, and Short not handled by Avro serializer

These primitive Java types aren't handled by AbstractKafkaAvroSerializer.getSchema() because they don't have direct equivalents in Avro. I think there are two possible solutions:

  1. Don't support them, but document it very clearly in the user documentation. In particular, these may warrant specialized exception messages since it would probably be easy to pass them in by accident, especially if you're using Producer<Object,Object>.
  2. Support them via conversion to Avro types. Maybe Byte -> Bytes, Character -> String and Short -> Integer. These would require additional logic in serializeImpl and clear documentation about the conversions. Ideally in that case the conversion back could be done on the other side, although I'm now realizing the deserializer interface probably can't provide enough type information to accomplish that.

At first I was thinking option 2 would be better, but I think it may end up being too messy and isn't symmetrical on producer vs. consumer, so I think option 1 may be better.
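
For reference, option 2 would amount to a mapping along these lines (illustrative only, not part of the real serializer):

public class PrimitiveWidening {
    // Widen Java primitives that have no direct Avro equivalent.
    public static Object widenForAvro(Object value) {
        if (value instanceof Byte) {
            return new byte[] { (Byte) value };   // Byte      -> Avro bytes
        } else if (value instanceof Character) {
            return value.toString();              // Character -> Avro string
        } else if (value instanceof Short) {
            return ((Short) value).intValue();    // Short     -> Avro int
        }
        return value; // everything else already maps directly
    }
}

Note that the deserializer could not recover the original Java type from the widened form, which is the asymmetry mentioned above.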

Add proper log4j logging

The schema registry prototype currently doesn't have proper log4j logging. This issue covers fixing that.

Embedded storage engine

The current schema registry prototype has a limited-capability RocksDB engine that hasn't been unit tested. Work stopped at a point where I ran into core dump issues, possibly due to gcc issues on my Mac, so this needs testing on Linux. This issue covers getting the RocksDB storage engine to work with unit tests.

ClientErrorException in *Resource classes

Several *Resource.java files throw exceptions of the form:
throw new ClientErrorException(Response.Status.INTERNAL_SERVER_ERROR, e);

ClientErrorException performs additional validation to make sure the response status is in the 4xx family. In our case this validation fails and causes a second exception to be thrown, because we're instantiating a ClientErrorException with a server error response code.
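
One possible fix, assuming JAX-RS 2.0 is available: use the dedicated server-error exception types, which carry a 5xx status without tripping that validation:

import javax.ws.rs.InternalServerErrorException;
import javax.ws.rs.ServerErrorException;
import javax.ws.rs.core.Response;

public class ErrorHandlingSketch {
    static void rethrow(Exception e) {
        // Either of these is valid for a 500, unlike ClientErrorException:
        // throw new ServerErrorException(Response.Status.INTERNAL_SERVER_ERROR, e);
        throw new InternalServerErrorException(e);
    }
}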

Avro compatibility

The schema registry prototype currently accepts arbitrary string schemas and stores them. This issue covers accepting only Avro JSON schemas and registering a schema only if it is backward compatible with previously registered versions. Custom compatibility is tracked in a separate issue.

id - change long to int

Using int for id gives enough possible schemas without the extra 4 bytes of overhead incurred by using long.

Make keys comparable

Keys in the Store interface are implicitly comparable (the getAll method promises to return keys within a specific range), but there is currently no mechanism for ordering keys.
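
A minimal sketch of making the ordering explicit in the key type, so that getAll's range contract has a defined meaning:

public interface StoreKey extends Comparable<StoreKey> {
    // Keys must serialize for storage and sort consistently for range scans.
    byte[] serialize();
}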

Add key/value schema awareness

Every topic can have a key schema and a value schema registered under it. Currently the schema registry prototype doesn't differentiate between key and value schemas; for example, POST /topics/(topic)/value/versions and POST /topics/(topic)/key/versions both do the same thing.
