
dagli's Issues

Be able to output the dl4j computationGraph.summary() for debugging purposes

I had some trouble outputting the dl4j computationGraph.summary(). After configuring the logging correctly, I managed to output it during training.

But still, after deserializing the prepared DAG, something like this is not possible:

NeuralNetwork.Prepared p = prepared.producers(NeuralNetwork.Prepared.class).findFirst().get().peek();
System.out.println(p.getComputationGraph().summary());

Perhaps add some additional logging, or allow code like the above (by making more things public)?

It would also be great to be able to add handlers/listeners for training, as is possible in dl4j (for example, to save the model after every epoch, or to run additional evaluation each epoch).
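For reference, here is what direct graph access would enable in plain DL4J (a hedged sketch: p.getComputationGraph() is the accessor from the snippet above, ScoreIterationListener and CheckpointListener are standard DL4J classes, and whether dagli can safely expose the graph this way is exactly the open question):

import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.optimize.listeners.CheckpointListener;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import java.io.File;

// With access to the underlying graph, the summary is one call away...
ComputationGraph graph = p.getComputationGraph();
System.out.println(graph.summary());

// ...and standard DL4J listeners could log the score every 100 iterations
// and checkpoint the model after every epoch.
graph.setListeners(
    new ScoreIterationListener(100),
    new CheckpointListener.Builder(new File("checkpoints")).saveEveryNEpochs(1).build());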

Add some xgboost internals access

I'd like to get more information about the trained xgboost model (Booster), such as:

Map<String, Integer> getFeatureScore(String[] featureNames): Get importance of each feature with specified feature names.

Could you add an API to access the Booster directly, or delegate some more of its methods?
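A hedged sketch of the kind of access being requested; the getBooster() accessor is hypothetical (it does not exist today), while getFeatureScore is the existing xgboost4j Booster method quoted above:

import ml.dmlc.xgboost4j.java.Booster;
import java.util.Map;

// Hypothetical: if XGBoostClassification.Prepared exposed its Booster...
Booster booster = preparedClassification.getBooster();

// ...feature importance could be read via the existing xgboost4j API
// (getFeatureScore throws XGBoostError, omitted here for brevity).
Map<String, Integer> importance = booster.getFeatureScore(featureNames);
importance.forEach((feature, score) -> System.out.println(feature + ": " + score));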

How to: feed the output of the final dense layer to an LSTM, using the examples/placeholders as time-series data?

I'm trying to do feature matching on HTML document nodes, using features of these nodes (tag name, text, class, length, ...) as the placeholder (a struct). Currently the network consists of some FastText and LSTM layers, the final layer is a dense layer, and all nodes are processed independently of each other. The network lacks a time-series-like connection between the nodes of a document (the classification of a node may depend on the classification of the previous node).

Do I understand correctly that something like this would be possible in dl4j?
https://deeplearning4j.konduit.ai/deeplearning4j/reference/recurrent-layers#inference-predictions-one-step-at-a-time
So the recurrent state could be cleared before each document?
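The pattern from the linked DL4J page would look roughly like this (a sketch: net is a DL4J MultiLayerNetwork containing an LSTM, and Document/documents/nodes() are hypothetical stand-ins for the HTML-node features):

import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.ndarray.INDArray;

// One time step per document node, clearing the recurrent state between
// documents so state never leaks across document boundaries.
for (Document doc : documents) {
  net.rnnClearPreviousState();
  for (INDArray nodeFeatures : doc.nodes()) {
    INDArray out = net.rnnTimeStep(nodeFeatures);  // one step per node
    // ... 'out' is the prediction for this node ...
  }
}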

How would I do that with dagli?

Btw, I noticed a strong performance (accuracy) drop when using FastText with multi-label classification instead of multiple FastText instances with single-boolean classification.

Stacking LSTMs

This is the code for LSTM stacking:

 public NNLSTMLayer stack(int... unitCounts) {
    Arguments.check(unitCounts.length > 0, "At least one unit count for at least one new layer must be provided");
    NNLSTMLayer previous = this.withUnitCount(unitCounts[0]).withInput(this.getInputLayer());

    for (int i = 1; i < unitCounts.length; i++) {
      previous = previous.withInput(previous).withUnitCount(unitCounts[i]);
    }

    return previous;
  }

I wonder why the first stacked layer does not get this as its input, but this.getInputLayer()?
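For context, a usage sketch (the embedding input is hypothetical). Per the quoted code, the first new layer is wired to the original layer's input, so the original layer's own unit count does not appear in the resulting chain; the original layer effectively acts as a template:

// Produces embedding -> LSTM(128) -> LSTM(64), not
// embedding -> LSTM(256) -> LSTM(128) -> LSTM(64).
NNLSTMLayer lstm = new NNLSTMLayer().withUnitCount(256).withInput(embedding);
NNLSTMLayer stacked = lstm.stack(128, 64);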

Logging frequency has no effect on DL4J logging intervals

No matter what I set as the logging frequency in "NeuralNetwork" (for example TrainingAmount.minibatches(1) or TrainingAmount.minibatches(300)), every iteration is logged. I have to admit that I got new hardware and compiled against the latest dagli combined with a dl4j SNAPSHOT to get it running with CUDA 11.2, so I am not sure whether this is reproducible.

BinaryEvaluation often results in NaN for best F1-Scores

I printed out all F1 scores of the ConfusionMatrices list, which looks as expected but has some NaNs at the end. Something like:

Threshold: F1-Score
0.1: 0.1
0.2: 0.2
0.4: 0.3
0.5: 0.4
0.6: 0.3
0.8: 0.2
0.9: 0.1
0.99: NaN
0.999: NaN

It should pick 0.5 as best threshold.

The problem seems to be that NaNs take part in the comparison: Double.compare orders NaN above every other value, so Double.compare(someNumber, NaN) == -1 and Double.compare(NaN, someNumber) == 1. Perhaps something like this would be better:

BinaryConfusionMatrix highestF1CM = eval.getConfusionMatrices().stream()
    .filter(x -> Double.isFinite(x.getF1Score()))
    .max(Comparator.comparingDouble(BinaryConfusionMatrix::getF1Score))
    .get();
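The underlying Double.compare behavior is easy to confirm (per its javadoc, NaN is ordered greater than every other double value, which is why max() can land on a NaN F1 score):

System.out.println(Double.compare(0.4, Double.NaN));  // -1: 0.4 ordered below NaN
System.out.println(Double.compare(Double.NaN, 0.4));  //  1: NaN ordered above 0.4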

Execution failed for task ':examples:fasttext-and-avro:FastTextExample.main()'.

Task :examples:fasttext-and-avro:FastTextExample.main() FAILED
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.lang.IllegalArgumentException: capacity < 0: (-4698104 < 0)
at java.base/java.nio.Buffer.createCapacityException(Buffer.java:251)
at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:340)
at org.apache.avro.io.BinaryDecoder.readBytes(BinaryDecoder.java:288)
at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:112)
at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:89)
at com.linkedin.dagli.objectio.avro.AvroReader.lambda$size64$2(AvroReader.java:98)
at java.base/java.util.stream.ReferencePipeline$5$1.accept(ReferencePipeline.java:229)
at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1494)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.LongPipeline.reduce(LongPipeline.java:452)
at java.base/java.util.stream.LongPipeline.sum(LongPipeline.java:410)
at com.linkedin.dagli.objectio.avro.AvroReader.size64(AvroReader.java:116)
at com.linkedin.dagli.objectio.avro.AvroReader.size64(AvroReader.java:77)
at com.linkedin.dagli.objectio.SampleReader.size64(SampleReader.java:39)
at com.linkedin.dagli.dag.MultithreadedDAGExecutor.executeUnsafe(MultithreadedDAGExecutor.java:1509)
at com.linkedin.dagli.dag.MultithreadedDAGExecutor.prepareUnsafeImpl(MultithreadedDAGExecutor.java:1497)
at com.linkedin.dagli.dag.LocalDAGExecutor.prepareUnsafeImpl(LocalDAGExecutor.java:71)
at com.linkedin.dagli.dag.AbstractDAGExecutor.prepareUnsafe(AbstractDAGExecutor.java:99)
at com.linkedin.dagli.dag.DAG1x1.prepare(DAG1x1.java:253)
at com.linkedin.dagli.examples.fasttextandavro.FastTextExample.main(FastTextExample.java:89)

Execution failed for task ':examples:fasttext-and-avro:FastTextExample.main()'.

Process 'command 'C:/Program Files/Java/jdk-9.0.1/bin/java.exe'' finished with non-zero exit value 1

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

Example for NNClassification.withMultilabelLabelsInput

I have defined a Placeholder with a label like this:

public static enum Label { some, values, .. };
EnumSet<Label> labels;

So that a record can have multiple labels.

This won't work:

ExtendedNode.Placeholder p = new ExtendedNode.Placeholder();
...
NNClassification<Label> myClassification = new NNClassification<Label>()
  .withFeaturesInput(denseLayers)
  .withMultilabelLabelsInput(p.asLabels());

How should I use the NNClassification layer now?

XGBoostClassification.Prepared has no "asLeafFeatures()" method

I trained an XGBoostClassification model intensively and saved the prepared DAG instance to a file.

Now, after loading the model from the file, I would like to examine some examples with the "asLeafFeatures" producer, but I see no way to do this without the XGBoostClassification instance; I can only find the "XGBoostClassification.Prepared" instance in the prepared model. Would it be possible to add these methods to the Prepared instance as well?

Layer names missing in summary

I see this line in AbstractNeuralNetwork.uniqueLayerNames (line 534):

      String baseName = layer.internalAPI().hasName() ? layer.getClass().getSimpleName() : layer.getName();

This makes no sense to me. Should it be the other way round?
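If so, the fix would presumably just swap the two branches (a sketch of the suspected intent, not a confirmed patch):

      // Use the explicit name when the layer has one; otherwise fall back
      // to the class name.
      String baseName = layer.internalAPI().hasName() ? layer.getName() : layer.getClass().getSimpleName();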

Generated code is checked in to the repository

In the examples, e.g. fasttext-and-avro, the generated code is checked into the repository. Common practice is not to check in generated code. Would it be possible to configure the build so that the code is generated in the build directory instead of the src directory? This would keep the source tree cleaner and avoid checking generated code into the repository.
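A hedged sketch of one way to do this in a module's build.gradle (Gradle 6.3+; the property comes from the standard Gradle JavaCompile API, but how it interacts with dagli's existing annotation-processing setup is untested):

tasks.withType(JavaCompile).configureEach {
    // Send annotation-processor output to the build directory instead of src/.
    options.generatedSourceOutputDirectory.set(
        layout.buildDirectory.dir("generated/sources/annotationProcessor/java/main"))
}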
