jihunhamm / crowd-ml

Framework for Crowd-sourced Machine Learning

License: Apache License 2.0

Objective-C 68.16% Ruby 0.36% Python 2.24% JavaScript 1.16% Java 8.92% Swift 14.58% Objective-C++ 0.24% HTML 0.75% Shell 3.60%

crowd-ml's People

Contributors

Stargazers

Watchers

Forkers

frankhyb sayamganguly 3ygun lalalajiangbiyaosi darg0001 zhangmingyan1996 hxsylzpf weihongli jding0 shawndegroot

crowd-ml's Issues

Work Update, May 22nd: TF Improvements

TensorFlow

Training w/ variable learning rates
Send bias as well as weights
Generalize client TF for any number of parameters (weights and biases) [get_collections in python] #29
- 1 layer NN #32
- 2 layer NN, ReLU activations
Adam optimizer
- ~~@3ygun in #32 had a model running with the Adam optimizer but wasn't seeing improved results, possibly because of an implementation bug; e.g. accuracy wasn't going >13% (see #33 for the issue).~~ @3ygun didn't have a proper server config, and that resulted in the <20% accuracy 🤕
Reinit user on parameter update, regardless of state

Refactoring

See the david-tensorflow branch for #27 & the tensorflow branch changes. Will merge into tensorflow when it works. Current problems: (All fixed in #32)
- Believe my server configuration is incorrect
- Is the trainer correctly cycled between?

WiFi Down - Too Sensitive

Problem

WiFi on a physical device goes down for a few seconds, and weightIter resets to 0 before the weight update is ever reached. A new checkForWifi method should be implemented.

Parts

Implement new wifi down method
Test the changes on real devices
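One way to make the check less sensitive is to debounce it: only treat WiFi as down once it has been unreachable for a full grace period. The sketch below is in Python rather than the app's Java, and the class name, the `grace_seconds` parameter and the 5-second default are all assumptions for illustration, not the project's checkForWifi implementation.

```python
class WifiDebouncer:
    """Report WiFi as down only after it has been unreachable for a grace period.

    Hypothetical helper: the name and the default grace period are assumptions.
    """

    def __init__(self, grace_seconds=5.0):
        self.grace_seconds = grace_seconds
        self._down_since = None  # timestamp of the first "disconnected" sample

    def update(self, connected, now):
        """Feed one connectivity sample taken at time `now` (in seconds).

        Returns True only when the outage has lasted at least grace_seconds.
        """
        if connected:
            self._down_since = None  # any reconnect clears the outage
            return False
        if self._down_since is None:
            self._down_since = now
        return (now - self._down_since) >= self.grace_seconds


debouncer = WifiDebouncer(grace_seconds=5.0)
samples = [(False, 0.0), (False, 3.0), (True, 4.0), (False, 10.0), (False, 15.0)]
for connected, now in samples:
    print(now, debouncer.update(connected, now))
```

With a check like this, a drop of a few seconds would not reset weightIter; only the final sample above, five seconds into an uninterrupted outage, is reported as down.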

Tensorflow on Android[status-in progress]

Currently only inference is supported from Tensorflow on Android devices. The Android demo uses the very basic Tensorflow Java API which does not support training (only the python API supports sophisticated training). However, this C++ example shows that training in languages other than python is not impossible. I think using this reference for manual backpropagation will be useful in the future.

Edit March 10, '17:
After some continued research by me and David Soller (@3ygun) we have found a great resource about training models in C++. The great part about this article is that it has convinced me that training is possible simply from the Tensorflow Java API with slight modification (see link above). The models will need to be pre-generated from python code, and then loaded onto the device (or downloaded from the server). The only requirement is that there should be a named operation for training in the model.

Line 14 in training models in C++ shows this named operation:

optimizer = tf.train.AdamOptimizer().minimize(cost, name="train")

He runs this operation on line 41 in the C++ code block:

TF_CHECK_OK(session->Run({{"x", x}, {"y", y}}, {}, {"train"}, nullptr)); // Train

I believe the above line can be achieved purely from the Java API, so I'm changing the status to 'plausible'.

Synchronization error between client and server.

There is an issue when restarting either the client or the server while the other is running.

Weight iteration mismatch (working in issue13 branch):
- Run client and server for any amount of time so that the weight iteration increases by some multiple of localUpdateNum.
- Close down the client completely (red stop button in Android Studio).
- Reopen the client, and view the mismatched weight iteration between the server and the client.
Client can't restart after a server restart (working in issue13 branch):
- Run client and server for any amount of time.
- Close down the server.
- Restart the server and view the client never continuing to run iterations.

EDIT: I'm currently doing more testing to confirm that my solutions are stable. A PR will be made soon.

Possible memory leak when service is running

Steps to reproduce:

Open up client, and make sure service is running
Make sure server is not running. If it is, close it down to ensure no processing on the client is happening.
Open the Android monitor and select the crowd-ml:datasend process. View the allocated memory continuously increasing in the performance profiler.

This may be due to the wifi listener, because that is the only running thread in the :datasend process.

Update the service's WiFi Listening Interface

Goal

Make the WiFi handling more robust and fix #24 and #15

Parts

Use more of the connectivity manager listeners
- Use the CONNECTIVITY_ACTION
- Use the isActiveNetworkMetered

Links to ideas

NOT A BUG : Problem with the server config file

It's not training.

In the TensorFlowTrainer.java replace:

// Copy the weights Tensor into the weights array.
Trace.beginSection("fetch");
trainingInterface.fetch(weightsOp, w);
Trace.endSection();

with:

float[] w_new = new float[D * K];
// Copy the weights Tensor into the weights array.
Trace.beginSection("fetch");
trainingInterface.fetch(weightsOp, w_new);
Trace.endSection();

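// Find the first weight that moved by more than the tolerance.
int z = -1;
boolean equal = true;
for (int x = 0; x < D * K && equal; x++) {
    // Compare the difference itself; abs(a) - abs(b) misses sign flips and decreases.
    if (Math.abs(w[x] - w_new[x]) > 0.0005) {
        z = x;
        equal = false;
    }
}

Log.d(" ", " ");
Log.d("Weights Equal", "" + equal + " (first difference at index " + z + ")");
Log.d("Original Weights", "" + w[0] + " " + w[1] + " " + w[2]);
Log.d("New Weights", "" + w_new[0] + " " + w_new[1] + " " + w_new[2]);
Log.d(" ", "");
w = w_new;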

Then look at the debug log: the weights aren't changing between runs, only the data that we run the initial random weights against. That explains why we're not getting >20% accuracy.
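For checking the same thing on a desktop, here is a minimal element-wise tolerance comparison, sketched in Python rather than Java. The function name is hypothetical; the 0.0005 tolerance is the one used in the loop above. It takes the absolute value of the difference, so a sign flip or a decrease counts as a change.

```python
def first_changed_index(old, new, tol=0.0005):
    """Return the index of the first weight that moved by more than tol, or None."""
    if len(old) != len(new):
        raise ValueError("weight arrays must have the same length")
    for i, (a, b) in enumerate(zip(old, new)):
        if abs(a - b) > tol:
            return i
    return None


w = [0.10, -0.20, 0.30]
w_new = [0.10, 0.20, 0.30]  # only the sign of the second weight flipped
print(first_changed_index(w, w_new))    # 1
print(first_changed_index(w, list(w)))  # None
```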

Missing parameter in Constants.js while running Android server script

When I try to run crowdML-server-Android.js (master branch), I get the following error:

Error: Firebase.update failed: First argument contains undefined in property 'parameters.maxIter

If you look at line 30 in crowdML-server-Android.js it is looking for the maxIter variable

30| var maxIter = constants.maxIter;

However this variable is not present in any of the Constants.js files. Should this variable be removed from the server file or added to the constants file?

IndexOutOfBoundsException for certain localUpdateNum values

Setting localUpdateNum to a value such as 10 can greatly increase computation speed, and I would highly recommend this for testing purposes. However, setting localUpdateNum to higher values (I tried 100; EDIT: it also fails for values >=15) causes an IndexOutOfBoundsException around line 433. See below for the code block.

From the internalWeightCalc method:

while(dataCount > 0 && batchSlot < batchSize) {
    // This is the problem line
    batchSamples.add(order.get((batchSize*localUpdateNum*(dataCount-1) + batchSlot*(localUpdateIter+1))));
    batchSlot++;
}

If anyone could help explain the calculation on line 3 above that would be much appreciated. Does this try to get random samples from the batchSamples list?

The error I got (for 100) was:

java.lang.IndexOutOfBoundsException: Index: 99900, Size: 12665

Note that 12665 is the size of the binary classification data.
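To see how the index can run past the end of the list, the expression from the problem line can be evaluated on its own. This is a Python sketch of the arithmetic only; the values of batchSize, dataCount, batchSlot and localUpdateIter below are hypothetical, chosen just to reproduce the reported index of 99900, since the actual values are not given in this issue.

```python
ORDER_SIZE = 12665  # size reported in the exception


def sample_index(batch_size, local_update_num, data_count, batch_slot, local_update_iter):
    """Same arithmetic as the index passed to order.get(...) on the problem line."""
    return (batch_size * local_update_num * (data_count - 1)
            + batch_slot * (local_update_iter + 1))


def in_bounds(index, size):
    return 0 <= index < size


# Hypothetical inputs: batch_size=1, data_count=1000, first slot, first local iteration.
for local_update_num in (10, 100):
    idx = sample_index(1, local_update_num, 1000, 0, 0)
    print(local_update_num, idx, in_bounds(idx, ORDER_SIZE))
```

The first term scales linearly with localUpdateNum, so unless dataCount shrinks to compensate, a large enough localUpdateNum pushes the index past the 12665 available entries.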

Please checkout the issue8 branch before testing.

Server ran with:

node crowdML-server-Android.js Constants1.js

Constants1.js is the constants file for binary classification.

Is "trainingInterface.fetch(weightsOp, w)" working?

trainingInterface.fetch(weightsOp, w);
Is this line of code working?
I got an error like "Node weightsOp was not provided to run(), so it cannot be read."

Also, I have one more question: where can I get TensorflowTrainingInterface.java?
I got the file from another project, but I want to know which one is the right one.

Maintainability Updates

Goal

Make the project more maintainable.

Parts

Consistent placement of the Copyright notice
Android app
- Styling guide
- Fix comments
  - Remember comments rot, so try to make sure the code speaks for itself
  - Mainly only comment what the parameters, returns and main parts do
- Remove outdated code
Resolve what data is checked into the project or not...
- Data is committed as a single .zip of the directory containing all of the .dat/similar files (e6d8ed3)
Tests... Tests on everything!
- Android app
- Server

Fully Configurable TensorFlow Models

Goal

Allow for the Android app to use any TensorFlow model with a supplied configuration file for retrieving the weights and biases. (Most of the big points on #28)

Parts

Complete #32 basic TensorFlow in the app
Add ability for server to supply the fields to get the ~~weight values~~ trainable parameters
Add ability for app to get & send all the fields supplied by the server
Test that everything is happening
Supply a program (set of python methods) to CrowdML supervisors that will populate the .json configuration data needed
- Should give named parameters from the graph/model and populate the feature sizes etc.
- Something like call:
```
def generateCrowdMLTFConfig(session):
    ...
```
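As a rough illustration of what such a helper might produce, the sketch below builds the .json from a plain list of (name, shape) pairs instead of a live session, so it runs without TensorFlow. The function name, its arguments and every JSON key here are assumptions made for illustration; the real configuration format is not specified in this issue.

```python
import json


def generate_crowdml_tf_config(trainable, feature_size, num_classes):
    """Build a JSON config string from (name, shape) pairs of trainable parameters.

    Hypothetical format: the keys below are placeholders, not the project's schema.
    """
    params = []
    for name, shape in trainable:
        length = 1
        for dim in shape:
            length *= dim  # flattened number of values the client must send
        params.append({"name": name, "shape": list(shape), "length": length})
    config = {
        "featureSize": feature_size,
        "numClasses": num_classes,
        "trainableParameters": params,
    }
    return json.dumps(config, indent=2)


# Example: a single-layer model with one weight matrix and one bias vector.
print(generate_crowdml_tf_config([("W", (784, 10)), ("b", (10,))], 784, 10))
```

In practice the (name, shape) pairs would be read from the graph's trainable variables, which is where the session argument in the stub above would come in.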

TensorFlow Implementation

Goals

Use TensorFlow for model back-ends to enable better extensibility. This issue will serve as a base of operations regarding the progress and/or discussion around the implementation.

Aspects

@tylermzeller and I believe the following would be some of the required implementation plan to enable the above goal:

Rewrite Server to use TensorFlow as its testing interface
- A simple extension of the existing NodeJS server application wouldn't be possible due to TensorFlow's lack of support for JavaScript bindings.
- In light of this, a rewrite in Python would probably be most appropriate. It supports the largest part of TensorFlow's implemented models and documentation while being accessible and performant.
  - Could use pyrebase
Rewrite of the Android app to use the TensorFlow bindings and TrainingInterface explored and developed in TensorFlow on Android
- Lots of set-up, but we've mostly been good about documenting it
- The Java bindings are currently not stable BUT have been made more available for public comment; see api_docs
- Single model #32
- Any model

Pre-Requisites

NOTE: most of these can be explored through desktop TensorFlow applications

Validate that weights can be changed after a model is loaded
- If this is not the case a work around must be found (such as a model reload)
- EDIT: view the TF documentation on variables.

Structure Project as

Single app vs Library

Compilation errors with Android client on Master

The android server is only present in the master branch, and it works correctly as far as I know. However, to test the Android client, you need to first switch to Jackson's branch because that is the only branch whose Android code will compile.

Things I will be working on:

Merging Jackson's client with master

adagrad and rmsProp not implemented in Android weight calculation

See InternalServer.java. There is no implementation for the adagrad or rmsProp descent algorithms. This will crash the app if these options are chosen in the Constants.js files.

// TODO: Why are learningRateDenom and eps not used?
    public static List<Double> calcWeight(List<Double> oldWeights, List<Double> learningRateDenom, List<Double> grad, float t, String descentAlg, double c, double eps){

        List<Double> newWeight = new ArrayList<>(oldWeights.size());

        if(descentAlg.equals("constant")){
            for(int i = 0; i < oldWeights.size(); i ++){
                newWeight.add(oldWeights.get(i) - (c)*grad.get(i));
            }
        }

        if(descentAlg.equals("simple")){
            for(int i = 0; i < oldWeights.size(); i ++){
                newWeight.add(oldWeights.get(i) - (c/t)*grad.get(i));
            }
        }

        if(descentAlg.equals("sqrt")){
            for(int i = 0; i < oldWeights.size(); i ++){
                newWeight.add(oldWeights.get(i) - (c/Math.sqrt(t))*grad.get(i));
            }
        }

        // No support for adagrad or rmsProp

        return newWeight;

    }
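For reference, the two missing rules can be sketched next to the three existing ones. This is a Python sketch of the standard AdaGrad and RMSProp updates, not the project's implementation: it assumes learningRateDenom is meant to hold the running sum (AdaGrad) or running average (RMSProp) of squared gradients, and the `decay` parameter with its 0.9 default is an assumption that has no counterpart in the Java signature.

```python
import math


def calc_weight(old_weights, denom, grad, t, descent_alg, c, eps, decay=0.9):
    """Return (new_weights, new_denom) for one gradient step.

    denom plays the role of learningRateDenom (assumed meaning: accumulated
    squared gradients). decay is a hypothetical RMSProp parameter.
    """
    new_denom = list(denom)
    if descent_alg == "constant":
        rates = [c] * len(grad)
    elif descent_alg == "simple":
        rates = [c / t] * len(grad)
    elif descent_alg == "sqrt":
        rates = [c / math.sqrt(t)] * len(grad)
    elif descent_alg == "adagrad":
        # Accumulate the sum of squared gradients per coordinate.
        new_denom = [d + g * g for d, g in zip(denom, grad)]
        rates = [c / (math.sqrt(d) + eps) for d in new_denom]
    elif descent_alg == "rmsProp":
        # Exponentially decaying average of squared gradients.
        new_denom = [decay * d + (1.0 - decay) * g * g for d, g in zip(denom, grad)]
        rates = [c / (math.sqrt(d) + eps) for d in new_denom]
    else:
        raise ValueError("unknown descent algorithm: " + descent_alg)
    new_weights = [w - r * g for w, r, g in zip(old_weights, rates, grad)]
    return new_weights, new_denom


weights, denom = calc_weight([1.0], [0.0], [2.0], t=1, descent_alg="adagrad", c=0.1, eps=1e-8)
print(weights, denom)
```

Returning the updated accumulator alongside the weights keeps the function stateless; the caller would need to store it between iterations. Raising on an unknown algorithm name also makes a bad Constants.js value fail at the update step with a clear message.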

Runtime Error on Android client in JacksonBranch

Whenever I log in to the Android client, there is a crash. @FrankHYB has pointed out that the crash originates from the DataSend.java class. I investigated the problem, and it appears that when the client attempts to receive the weights from the Firebase db, the data the Java code expects is different from the actual data stored in Firebase.

This is what the weights tree looks like in my Firebase db. If you expand the first child, thousands of weight values appear. If you expand the second branch, a 1 and a -1 appear.

Continued Server Cleanup

Goals

Continued server building on #19

Parts

Need to do some further testing to affirm I didn't break anything
Extract Firebase config to allow ENV's instead of committing credential information #21
Remove duplicate loading of configuration in tests
Remove old configuration files Constants.js #21

How can the server support more than a single client?

Looking in the Android server, starting on line 238:

if(localUpdateNum > 0){
	iter += localUpdateNum;
} else {
	iter++;
}

My question is how the server can support multiple devices if the server only keeps track of a single weight iteration?
