DataEnum

DataEnum allows you to work with algebraic data types in Java.

You can think of it as an enum where every individual value can have different data associated with it.

What problem does it solve?

The idea of algebraic data types is not new and already exists in many other programming languages.

It is possible to represent such algebraic data types using subclasses: the parent class is the "enumeration" type, and each child class represents a case of the enumeration with its associated parameters. However, this requires you either to spread your business logic across all the subclasses, or to cast to the child class manually to access the parameters, taking great care to cast only when you know for sure that the object is of the right type.

The goal of DataEnum is to help you generate all these classes and give you a fluent API for easily accessing their data in a type-safe manner.

The primary use-case we had when designing DataEnum was to execute different business logic depending on an incoming message. And as mentioned above, we wanted to keep all that business logic in one place, and not spread it out in different classes. With plain Java, you’d have to write something like this:

if (message instanceof Login) {
    Login login = (Login) message;
    // login logic here
} else if (message instanceof Logout) {
    Logout logout = (Logout) message;
    // logout logic here
}

There are a number of things here that developers tend to not like: repeated if-else statements, manual instanceof checks and safe-but-noisy typecasting. On top of that it doesn't look very idiomatic and there's a high risk that mistakes get introduced over time. If you use DataEnum, you can instead write the same expression like this:

message.match(
   login -> { /* login logic; the 'login' parameter is 'message' but cast to the type Login. */ },
   logout -> { /* logout logic; the 'logout' parameter is 'message' but cast to the type Logout. */ }
);

In this example only one of the two lambdas will be executed depending on the message type, just like with the if-statements. match is just a method that takes functions as arguments, but if you write expressions with linebreaks like in the example above it looks quite similar to a switch-statement, a match-expression in Scala, or a when-expression in Kotlin. DataEnum makes use of this similarity to make match-statements look and feel like a language construct.
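To make "match is just a method that takes functions as arguments" concrete, here is a minimal hand-written sketch of the pattern. It is not the code DataEnum generates: the class layout, the single userName field and the MatchSketch helper are assumptions chosen for illustration, and it uses java.util.function.Consumer for brevity.

```java
import java.util.function.Consumer;

// Hand-written stand-in for a generated type; all names here are illustrative.
abstract class Message {
  private Message() {}

  static Message login(String userName) { return new Login(userName); }
  static Message logout() { return new Logout(); }

  // Exactly one of the two consumers is invoked, chosen by the runtime case.
  abstract void match(Consumer<Login> login, Consumer<Logout> logout);

  static final class Login extends Message {
    private final String userName;

    private Login(String userName) { this.userName = userName; }

    String userName() { return userName; }

    @Override
    void match(Consumer<Login> login, Consumer<Logout> logout) {
      login.accept(this); // 'this' is already statically typed as Login
    }
  }

  static final class Logout extends Message {
    private Logout() {}

    @Override
    void match(Consumer<Login> login, Consumer<Logout> logout) {
      logout.accept(this);
    }
  }
}

final class MatchSketch {
  // Builds a description of the message without instanceof checks or casts.
  static String describe(Message message) {
    StringBuilder out = new StringBuilder();
    message.match(
        login -> out.append("login by ").append(login.userName()),
        logout -> out.append("logout"));
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(describe(Message.login("petter")));
    System.out.println(describe(Message.logout()));
  }
}
```

Because each case class overrides match and passes itself to its own consumer, the cast is replaced by ordinary dynamic dispatch.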

There are many compelling use-cases for using an algebraic data type to represent values. To name a few:

  • Create a vocabulary of possible actions. List all the actions that can be performed in a certain part of your application, for example on a login/logout page. Each action can have different data associated with it, for example the login action would have a username and password, while a logout action doesn't have any data.

  • Representing states of a state machine. This allows you to only keep the data that actually is available in each state, making it impossible to even reference data that isn't available in a particular state.

  • Rich, type-safe error handling. Instead of having just an error code as a result of a network request, you can have different types for different errors, each with relevant information attached: ConnectivityLost, NoRouteToHost(String host), TooManyRetries(int retryCount).

  • Metadata in RxJava streams. It is often useful to wrap data in RxJava in order to provide metadata about what's happening. One common example is to represent different kinds of success and failure: InProgress(T placeholder), Success(T data), Error(String reason).
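As an illustration of the error-handling use case, the sketch below models the three errors named above by hand and turns each into a user-facing message with a map-style method. The class layout, the map signature and the message texts are assumptions made for this example, not output of the annotation processor.

```java
import java.util.function.Function;

// Hand-written stand-in for a generated type; names and structure are assumptions.
abstract class NetworkError {
  private NetworkError() {}

  static NetworkError connectivityLost() { return new ConnectivityLost(); }
  static NetworkError noRouteToHost(String host) { return new NoRouteToHost(host); }
  static NetworkError tooManyRetries(int retryCount) { return new TooManyRetries(retryCount); }

  // Transforms the error into an R by applying the function for the actual case.
  abstract <R> R map(
      Function<ConnectivityLost, R> connectivityLost,
      Function<NoRouteToHost, R> noRouteToHost,
      Function<TooManyRetries, R> tooManyRetries);

  static final class ConnectivityLost extends NetworkError {
    private ConnectivityLost() {}

    @Override
    <R> R map(Function<ConnectivityLost, R> connectivityLost,
        Function<NoRouteToHost, R> noRouteToHost,
        Function<TooManyRetries, R> tooManyRetries) {
      return connectivityLost.apply(this);
    }
  }

  static final class NoRouteToHost extends NetworkError {
    private final String host;

    private NoRouteToHost(String host) { this.host = host; }

    String host() { return host; }

    @Override
    <R> R map(Function<ConnectivityLost, R> connectivityLost,
        Function<NoRouteToHost, R> noRouteToHost,
        Function<TooManyRetries, R> tooManyRetries) {
      return noRouteToHost.apply(this);
    }
  }

  static final class TooManyRetries extends NetworkError {
    private final int retryCount;

    private TooManyRetries(int retryCount) { this.retryCount = retryCount; }

    int retryCount() { return retryCount; }

    @Override
    <R> R map(Function<ConnectivityLost, R> connectivityLost,
        Function<NoRouteToHost, R> noRouteToHost,
        Function<TooManyRetries, R> tooManyRetries) {
      return tooManyRetries.apply(this);
    }
  }
}

final class ErrorMessages {
  // Each branch can only see the data that exists for its own case.
  static String userMessage(NetworkError error) {
    return error.map(
        connectivityLost -> "You appear to be offline",
        noRouteToHost -> "Cannot reach " + noRouteToHost.host(),
        tooManyRetries -> "Gave up after " + tooManyRetries.retryCount() + " attempts");
  }
}
```

Note that there is no way to ask a ConnectivityLost for a host or a retry count, which is the point of attaching data per case rather than to a shared error class.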

Status

DataEnum is in Beta status, meaning it is used in production in Spotify Android applications, but we may keep making changes relatively quickly.

It is currently built for Java 7 (because Android doesn't support Java 8 well yet), hence the duplication of some concepts defined in java.util.function (Consumer, Function, Supplier).

Using it in your project

The latest version of DataEnum is available through Maven Central (replace LATEST_RELEASE below with the latest released version number):

Gradle

implementation 'com.spotify.dataenum:dataenum:LATEST_RELEASE'                
annotationProcessor 'com.spotify.dataenum:dataenum-processor:LATEST_RELEASE' 

Maven

<dependencies>
  <dependency>
    <groupId>com.spotify.dataenum</groupId>
    <artifactId>dataenum</artifactId>
    <version>LATEST_RELEASE</version>
  </dependency>
  <dependency>
    <groupId>com.spotify.dataenum</groupId>
    <artifactId>dataenum-processor</artifactId>
    <version>LATEST_RELEASE</version>
    <scope>provided</scope>
  </dependency>
</dependencies>

Alternatively, you may be able to use the annotationProcessorPaths configuration option of the maven-compiler-plugin instead of declaring the processor as a provided-scope dependency.

How do I create a DataEnum type?

First, you define all the cases and their parameters in an interface like this:

@DataEnum
interface MyMessages_dataenum {
    dataenum_case Login(String userName, String password);
    dataenum_case Logout();
    dataenum_case ResetPassword(String userName);
}

Then, you apply the dataenum-processor annotation processor to that code, and your DataEnum case classes will be generated for you.

Some things to note:

  • We use a Java interface for the specification. The rationale is that it allows the IDE to help you find and import types correctly. We deliberately made it look weird, so nobody would think it’s a normal class. This is abusing Java a bit, but we’re OK with that.

  • The interface will never be used for anything other than code generation, so you should normally make the interface package-private. The one exception is when one _dataenum spec needs to reference another as described below.

  • The interface name has to end with _dataenum. This is to make the interface stick out and make it easier to filter out from artifacts and exclude from static analysis.

  • The methods in the interface have to be declared as returning a dataenum_case. Each method corresponds to one of the possible cases of the enum, and the parameters of the method become the member fields of that case. Note that the method names from the interface will be used as class names for the cases, so you'll want to name them using CamelCase as in the example above. The methods in the _dataenum interface will never be implemented, and there is no way to create a dataenum_case instance. The type is only used as a marker.

  • The prefix of the @DataEnum annotated interface will be used as the name of a generated super-class (MyMessages in the example above). This class will have factory methods for all the cases.

  • For each method in the interface, an inner class will be generated (in this example MyMessages.Login, MyMessages.Logout and MyMessages.ResetPassword). These classes will extend the outer class MyMessages.

Using the generated DataEnum class

Some usage examples, based on the @DataEnum specification above:

// Instantiate by passing in the required parameters. 
// You’ll get something that is of the super type - this is to help Java’s 
// not-always-great type inference do the right thing in many common cases.
MyMessages message = MyMessages.login("petter", "s3cr3t");

// If you actually needed the subtype you can easily cast it using the as-methods.
Logout logout = MyMessages.logout().asLogout();

// For every as-method there is also an is-method to check the type of the message.
assertThat(message.isLogin(), is(true));

// Apply different business logic to different message types. Note how getters are generated (but not
// setters, DataEnum case types should be considered immutable).
message.match(
    login -> Logger.debug("got a login request from user: {}", login.userName()),
    logout -> Logger.debug("user logged out"),
    resetPassword -> Logger.debug("password reset requested for user: {}", resetPassword.userName())
);

// So far we've been looking at 'match', but there is also the very useful 'map' which is used to
// transform values. When using 'map' you define how the message should be transformed in each case.
int passwordLength = message.map(
    login -> login.password().length(),
    logout -> 0,
    resetPassword -> -1);

// There are some utility methods provided that allow you to deal with unimplemented or illegal cases:
int passwordLength = message.map(
    login -> login.password().length(),
    logout -> Cases.illegal("logout message does not contain a password"), // throws IllegalStateException
    resetPassword -> Cases.todo()); // throws UnsupportedOperationException

// Sometimes, only a minority of cases are handled differently, in which case a 'map' or 'match'
// can lead to duplication:
int passwordLength = message.map(
    login -> handleLogin(login),
    logout -> Cases.illegal("only login is allowed"),
    resetPassword -> Cases.illegal("only login is allowed")
    // This could really get bad if there are many cases here
);

// For those scenarios you can just use regular language control structures (like if-else):
if (message.isLogin()) {
  return handleLogin(message.asLogin()); // Technically just a cast but easier to read than manual casting.
} else {
  throw new IllegalStateException("only login is allowed");
}

Features

  • Case types are immutable. All generated classes are value types and cannot be modified after being created. Of course this assumes that all the parameters of your cases are immutable too, since an object only is immutable if all its fields also are immutable.

  • Everything is non-null by default. Passing in a null will cause an exception to be thrown unless you explicitly annotate the parameters as @Nullable. Any annotation with the name 'Nullable' can be used.

  • toString, hashCode, and equals are generated for all case classes.

  • isFoo/asFoo methods are provided, as a more high level alternative to manually doing instanceof and casting.

  • Generic type support. The DataEnum interfaces can be type parameterized, which makes it possible to create reusable data types.

  • Recursive data type support. A generated DataEnum type may refer to itself recursively, even with type parameters. When doing so you must use the _dataenum-suffixed name to avoid any chicken-and-egg problems with the generated classes.

    The recursive data type support allows you to do things like this:

    @DataEnum
    interface Tree_dataenum<T> {
      dataenum_case Branch(Tree_dataenum<T> left, Tree_dataenum<T> right);
      dataenum_case Leaf(T value);
    }

  • Sometimes, you want to reference a dataenum from another one. You can do that using this slightly clunky syntax:

    @DataEnum
    interface First_dataenum {
      dataenum_case SomeCase();
    }

    @DataEnum
    interface Second_dataenum {
      dataenum_case NeedsFirst(First_dataenum first);
    }

    The generated NeedsFirst class will have a member field of the type First. Again, because the First class doesn't exist until the annotation processor has run, the Second_dataenum spec must reference the First_dataenum spec. If First_dataenum is in a different package than Second_dataenum, it must of course be public.

  • If you have sensitive information in a field and don't want the generated toString method to print that information, you can use the @Redacted annotation:

    dataenum_case UserInfo(String name, @Redacted String password);

    We provide an annotation in the runtime dependencies, but any annotation named Redacted will work.
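To give a feel for several of the features above at once (non-null by default, generated equals/hashCode/toString, and redaction), here is a hand-written sketch of what a case class with those properties could look like. It is not the generated code: the exception type thrown for null (NullPointerException via Objects.requireNonNull) and the "***" placeholder for the redacted field are assumptions made for this example.

```java
import java.util.Objects;

// Hand-written illustration of an immutable case with a redacted field.
final class UserInfo {
  private final String name;
  private final String password; // treated as if annotated with @Redacted

  UserInfo(String name, String password) {
    // Assumption: nulls are rejected eagerly with a NullPointerException.
    this.name = Objects.requireNonNull(name, "name");
    this.password = Objects.requireNonNull(password, "password");
  }

  // Getters only; there are no setters, so instances cannot change.
  String name() { return name; }

  String password() { return password; }

  @Override
  public boolean equals(Object other) {
    if (other == this) return true;
    if (!(other instanceof UserInfo)) return false;
    UserInfo o = (UserInfo) other;
    return o.name.equals(name) && o.password.equals(password);
  }

  @Override
  public int hashCode() {
    return Objects.hash(name, password);
  }

  @Override
  public String toString() {
    // The redacted value never reaches the string; "***" is an assumed placeholder.
    return "UserInfo{name=" + name + ", password=***}";
  }
}
```

The redacted field still takes part in equals and hashCode in this sketch; only the string representation hides it.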

Configuration

DataEnum currently has a single configurable setting determining the visibility of constructors in generated code. Generally speaking, private is best as it ensures there is a single way of creating case instances (the generated static factory methods like MyMessages.login(String, String) above). However, for Android development, you want to keep the method count down to a minimum, and private constructors lead to synthetic constructors being generated, increasing the method count. Since that is an important use case for us, we've chosen package-private as the default. This is configurable by adding a @ConstructorAccess annotation to a package-info.java file. See the javadocs for more information.

Known weaknesses of DataEnum

  • While the generated classes are immutable, they do not enforce that parameters are immutable. It is up to users of DataEnum to, e.g., use ImmutableList for lists instead of List.

  • The names of the arguments to the lambdas when using match/map only indicate the type of the object by convention, so some discipline is required to make sure you manually update lambda argument names if a case is renamed.

  • Renaming cases of a dataenum can be painful since the generated class doesn't have a connection to the interface.

  • Reordering cases can be dangerous if you only use lambdas with type-inference. If you swap the order of two cases with the same parameter names then usages of map/match will still compile even though they are now incorrect. This can be mitigated using method references instead of lambdas, lambdas with explicit type parameters, and good test coverage of code using DataEnum.

  • The _dataenum-suffixed interface is only used as an input to code generation, and it breaks certain conventions around naming. You might need to suppress some static analysis when you use DataEnum, and you probably want to strip the _dataenum classes from artifacts.
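The reordering weakness and its mitigations can be sketched as follows. Shape, Circle and Square are a hand-written stand-in invented for this example (not generated code); both cases expose a size() accessor, which is what makes inferred lambdas fragile.

```java
import java.util.function.Function;

// Hand-written stand-in for a generated type with two structurally similar cases.
abstract class Shape {
  private Shape() {}

  static Shape circle(double size) { return new Circle(size); }
  static Shape square(double size) { return new Square(size); }

  abstract <R> R map(Function<Circle, R> circle, Function<Square, R> square);

  static final class Circle extends Shape {
    private final double size;

    private Circle(double size) { this.size = size; }

    double size() { return size; }

    @Override
    <R> R map(Function<Circle, R> circle, Function<Square, R> square) {
      return circle.apply(this);
    }
  }

  static final class Square extends Shape {
    private final double size;

    private Square(double size) { this.size = size; }

    double size() { return size; }

    @Override
    <R> R map(Function<Circle, R> circle, Function<Square, R> square) {
      return square.apply(this);
    }
  }
}

final class Areas {
  // Fragile: both lambdas only call size(), so this would still compile
  // (with wrong results) if the order of the cases in map were swapped.
  static double areaInferred(Shape shape) {
    return shape.map(
        circle -> Math.PI * circle.size() * circle.size(),
        square -> square.size() * square.size());
  }

  // Sturdier: explicit parameter types stop compiling if the order changes.
  static double areaExplicit(Shape shape) {
    return shape.map(
        (Shape.Circle circle) -> Math.PI * circle.size() * circle.size(),
        (Shape.Square square) -> square.size() * square.size());
  }

  // Method references are tied to the case type in the same way.
  static double areaByReference(Shape shape) {
    return shape.map(Areas::circleArea, Areas::squareArea);
  }

  private static double circleArea(Shape.Circle circle) {
    return Math.PI * circle.size() * circle.size();
  }

  private static double squareArea(Shape.Square square) {
    return square.size() * square.size();
  }
}
```

All three variants behave identically today; the difference only shows when someone reorders the cases, at which point the second and third variants fail to compile instead of silently computing the wrong area.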

Alternatives

An alternative implementation of algebraic data types for Java is ADT4J. We feel DataEnum has the advantage of being less verbose than ADT4J, although ADT4J is more flexible in terms of customising your generated types.

Features that might be added in the future

  • Generating builders for case types with many parameters.
  • Generating mutator functions for case types to create modified versions of them.
  • Support for writing extensions, e.g. to allow adding support for serialization.
  • IntelliJ plugin for refactoring and for generating map/match statements.

Why is it called DataEnum?

The name ‘DataEnum’ comes from the fact that it’s used similarly to an enum, but you can easily and type-safely have different data attached to each enum value.

Code of Conduct

This project adheres to the Open Code of Conduct. By participating, you are expected to honor this code.

dataenum's Issues

Remove synthetic accessor from generated code

The generated code currently produces synthetic accessors for the private constructors of the inner classes. It can be a big saving if the access modifier is changed from private to package-private.

import java.lang.Boolean;
import java.lang.Integer;
import java.lang.Object;
import java.lang.Override;
import java.lang.String;
import java.lang.StringBuilder;
import java.util.function.Consumer;
import java.util.function.Function;
import javax.annotation.Generated;
import javax.annotation.Nonnull;

public abstract class MultipleValues {
  private MultipleValues() {
  }

  public static MultipleValues value1(int param1, boolean param2) {
    return new Value1(param1, param2);
  }

  public static MultipleValues value2(int param1, boolean param2) {
    return new Value2(param1, param2);
  }

  public final boolean isValue1() {
    return (this instanceof Value1);
  }

  public final boolean isValue2() {
    return (this instanceof Value2);
  }

  public final Value1 asValue1() {
    return (Value1) this;
  }

  public final Value2 asValue2() {
    return (Value2) this;
  }

  public abstract void match(@Nonnull Consumer<Value1> value1, @Nonnull Consumer<Value2> value2);

  public abstract <R_> R_ map(@Nonnull Function<Value1, R_> value1,
      @Nonnull Function<Value2, R_> value2);

  public static final class Value1 extends MultipleValues {
    private final int param1;

    private final boolean param2;

    private Value1(int param1, boolean param2) {
      this.param1 = param1;
      this.param2 = param2;
    }

    public final int param1() {
      return param1;
    }

    public final boolean param2() {
      return param2;
    }

    @Override
    public boolean equals(Object other) {
      if (other == this) return true;
      if (!(other instanceof Value1)) return false;
      Value1 o = (Value1) other;
      return o.param1 == param1
          && o.param2 == param2;
    }

    @Override
    public int hashCode() {
      int result = 0;
      result = result * 31 + Integer.valueOf(param1).hashCode();
      result = result * 31 + Boolean.valueOf(param2).hashCode();
      return result;
    }

    @Override
    public String toString() {
      StringBuilder builder = new StringBuilder();
      builder.append("Value1{param1=").append(param1);
      builder.append(", param2=").append(param2);
      return builder.append('}').toString();
    }

    @Override
    public final void match(@Nonnull Consumer<Value1> value1, @Nonnull Consumer<Value2> value2) {
      value1.accept(this);
    }

    @Override
    public final <R_> R_ map(@Nonnull Function<Value1, R_> value1,
        @Nonnull Function<Value2, R_> value2) {
      return value1.apply(this);
    }
  }

  public static final class Value2 extends MultipleValues {
    private final int param1;

    private final boolean param2;

    private Value2(int param1, boolean param2) {
      this.param1 = param1;
      this.param2 = param2;
    }

    public final int param1() {
      return param1;
    }

    public final boolean param2() {
      return param2;
    }

    @Override
    public boolean equals(Object other) {
      if (other == this) return true;
      if (!(other instanceof Value2)) return false;
      Value2 o = (Value2) other;
      return o.param1 == param1
          && o.param2 == param2;
    }

    @Override
    public int hashCode() {
      int result = 0;
      result = result * 31 + Integer.valueOf(param1).hashCode();
      result = result * 31 + Boolean.valueOf(param2).hashCode();
      return result;
    }

    @Override
    public String toString() {
      StringBuilder builder = new StringBuilder();
      builder.append("Value2{param1=").append(param1);
      builder.append(", param2=").append(param2);
      return builder.append('}').toString();
    }

    @Override
    public final void match(@Nonnull Consumer<Value1> value1, @Nonnull Consumer<Value2> value2) {
      value2.accept(this);
    }

    @Override
    public final <R_> R_ map(@Nonnull Function<Value1, R_> value1,
        @Nonnull Function<Value2, R_> value2) {
      return value2.apply(this);
    }
  }
}

add safeAsX(): Optional<X> method

Hi there,
coming from languages with sum types, it is such a punishment to use Java. We actually wrote handmade ADTs like the ones your library generates. I really like how much less you have to type and what is generated for you. ❤️

This library already generates the

isX(): Boolean and asX(): X

In our hand-written code we added an asSafeX(): Optional<X> method. This is more type-safe and could actually replace both methods:

isX() is asSafeX().isPresent() and
asX() is asSafeX().get()

but you could also do type-safe stuff like asSafeX().map(x -> 2 * x).orElse(42)

Could you imagine adding this method?

Cheers Thomas
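For readers following this thread, the proposal could look roughly like the sketch below. Everything in it is hypothetical: asSafeCount is the requested method, not something DataEnum is known to generate, and Result, Count and Missing are made-up names for a hand-written type. It also needs Java 8 for java.util.Optional.

```java
import java.util.Optional;

// Hand-written type with the existing is/as pair plus the proposed Optional accessor.
abstract class Result {
  private Result() {}

  static Result count(int value) { return new Count(value); }
  static Result missing() { return new Missing(); }

  final boolean isCount() { return this instanceof Count; }

  // Throws ClassCastException when the value is not a Count.
  final Count asCount() { return (Count) this; }

  // Hypothetical addition: combines the check and the cast in one safe call.
  final Optional<Count> asSafeCount() {
    if (isCount()) {
      return Optional.of(asCount());
    }
    return Optional.empty();
  }

  static final class Count extends Result {
    private final int value;

    private Count(int value) { this.value = value; }

    int value() { return value; }
  }

  static final class Missing extends Result {
    private Missing() {}
  }
}

final class SafeAsSketch {
  // Doubles the count when present and falls back to 42 otherwise.
  static int doubledOrDefault(Result result) {
    return result.asSafeCount().map(count -> 2 * count.value()).orElse(42);
  }
}
```

With this shape, isCount() is equivalent to asSafeCount().isPresent(), and asCount() corresponds to asSafeCount().get() apart from the exception type thrown on a mismatch.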

Using byte[] as a field results in spotbugs warnings

Generated class for a dataenum_case with a field of type byte[] produces 3 spotbugs warnings:

  • DMI_INVOKING_TOSTRING_ON_ARRAY

(The code invokes toString on an array, which will generate a fairly useless result such as [C@16f0472. Consider using Arrays.toString to convert the array into a readable String that gives the contents of the array.)

  • EC_BAD_ARRAY_COMPARE

(This method invokes the .equals(Object o) method on an array. Since arrays do not override the equals method of Object, calling equals on an array is the same as comparing their addresses. To compare the contents of the arrays, use java.util.Arrays.equals(Object[], Object[]). To compare the addresses of the arrays, it would be less confusing to explicitly check pointer equality using ==.)

  • DMI_INVOKING_HASHCODE_ON_ARRAY

(The code invokes hashCode on an array. Calling hashCode on an array returns the same value as System.identityHashCode, and ignores the contents and length of the array. If you need a hashCode that depends on the contents of an array a, use java.util.Arrays.hashCode(a).)

Example:

@DataEnum
public interface Credentials_dataenum {
    dataenum_case Token(byte[] blob);
    ...
}

@Generated("com.spotify.dataenum.processor.DataEnumProcessor")
public abstract class Credentials {

  ...

  public static final class Token extends Credentials {
    private final byte[] blob;

    @Override
    public boolean equals(Object other) {
      if (other == this) return true;
      if (!(other instanceof Token)) return false;
      Token o = (Token) other;
      return o.blob.equals(this.blob); // <- EC_BAD_ARRAY_COMPARE
    }

    @Override
    public int hashCode() {
      return blob.hashCode(); // <- DMI_INVOKING_HASHCODE_ON_ARRAY
    }

    @Override
    public String toString() {
      StringBuilder builder = new StringBuilder();
      builder.append("Token{blob=").append(blob); // <- DMI_INVOKING_TOSTRING_ON_ARRAY
      return builder.append('}').toString();
    }
   }
}
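One way the three warnings could be addressed is to route array fields through java.util.Arrays, as the SpotBugs messages suggest. The class below is a hand-written sketch of that idea, not a patch to the processor; the standalone Token class and the defensive copy in the constructor are assumptions added for illustration.

```java
import java.util.Arrays;

// Hand-written sketch of a case class whose array field is compared by content.
final class Token {
  private final byte[] blob;

  Token(byte[] blob) {
    // Assumption: copy the array so later changes by the caller are not visible.
    this.blob = blob.clone();
  }

  @Override
  public boolean equals(Object other) {
    if (other == this) return true;
    if (!(other instanceof Token)) return false;
    Token o = (Token) other;
    return Arrays.equals(o.blob, blob); // compares contents, not references
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(blob); // depends on contents and length
  }

  @Override
  public String toString() {
    return "Token{blob=" + Arrays.toString(blob) + "}"; // readable contents
  }
}
```

With these changes two tokens built from equal byte sequences are equal, hash alike, and print their contents.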

Duplicate field names in @DataEnum specification interfaces

Currently the following is allowed by the annotation processor:

@DataEnum
interface Text_dataenum {
    dataenum_case Foo(int bar);
    dataenum_case Foo(long bar);
}

However this will cause two inner classes to be generated with the same name, so the resulting class does not compile.

The annotation processor should detect duplicate case names and abort compilation with an error explaining the problem.
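The core of such a check is small. The sketch below only shows duplicate detection over a list of case names; the DuplicateCaseCheck class is hypothetical, and how the names are collected from the interface's methods and how the error is reported (for instance through the standard javax.annotation.processing.Messager) is left out.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: finds case names that occur more than once in a spec.
final class DuplicateCaseCheck {
  private DuplicateCaseCheck() {}

  // Returns the duplicated names in the order their second occurrence is seen.
  static Set<String> findDuplicates(List<String> caseNames) {
    Set<String> seen = new HashSet<>();
    Set<String> duplicates = new LinkedHashSet<>();
    for (String name : caseNames) {
      if (!seen.add(name)) { // add returns false when the name was already present
        duplicates.add(name);
      }
    }
    return duplicates;
  }
}
```

For the spec above, the input would be the names Foo and Foo, and the non-empty result would be the trigger for aborting compilation with an explanatory error.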

dataenum:1.5.0 is not compiling

Hi there,

I tried to add your library at v1.5.0, but unfortunately it didn't work. We also use auto-factory. When I added dataenum, auto-factory still generated code, but Java couldn't find it anymore and failed with cannot find symbol X.

I am no Gradle expert at all, but here are the details:
We use GWT and compile with Java 1.7.20 but have the SDK 1.17.

Downgrading to v 1.4.0 worked for me.

Do you have any idea?
Cheers Thomas

Referenced DataEnums in Sets don't build properly.

If I create a DataEnum which references another DataEnum in a Set, the DataEnum doesn't build properly.

Example:
dataenum_case Initializing(Set<SessionEffect_dataenum> pendingEffects);

Builds to:
Initializing(Set<SessionEffect_dataenum> pendingEffects) {}

Expected is that it builds to:
Initializing(Set<SessionEffect> pendingEffects) {}
