
Comments (4)

hmmlillian avatar hmmlillian commented on July 17, 2024

@dorpxam Hello, do you mean the gradient did not update in the deconv function with your CPU-based L-BFGS?
In the previous GPU-based implementation (deconv.cpp), diff (float* diff = m_classifier->net_->blob_by_name(m_layer1)->mutable_gpu_diff();) is a GPU pointer to the gradient blob, whose data is updated by the subsequent subtraction (caffe_gpu_sub(m_num1, src, m_dy, diff);).
In your code, before calling m_classifier->net_->BackwardFromTo(m_id1, m_id2 + 1), have you updated the gradient values diff on the GPU?

from deep-exemplar-based-colorization.

dorpxam avatar commented on July 17, 2024

With my change, the cost function becomes:

void my_cost_function::cpu_f_gradf(const floatdouble *h_x, floatdouble *h_f, floatdouble *h_gradf)
{
	// Forward pass up to the target layer.
	m_classifier->net_->ForwardFromTo(m_id2 + 1, m_id1);

	// diff = src - m_dy: the gradient of the L2 loss at m_layer1.
	const float* src = m_classifier->net_->blob_by_name(m_layer1)->cpu_data();
	float* diff = m_classifier->net_->blob_by_name(m_layer1)->mutable_cpu_diff();
	caffe_sub(m_num1, src, m_dy, diff);

	// Loss value: sum of the squared differences.
	float* diff2 = (float*)malloc(m_num1 * sizeof(float));
	caffe_mul(m_num1, diff, diff, diff2);
	float total = caffe_cpu_asum(m_num1, diff2);

	// Backward pass to propagate the gradient down to m_layer2.
	m_classifier->net_->BackwardFromTo(m_id1, m_id2 + 1);

	const float* diff3 = m_classifier->net_->blob_by_name(m_layer2)->cpu_diff();
	memcpy(h_gradf, diff3, m_num2 * sizeof(float));
	memcpy(h_f, &total, sizeof(float));

	free(diff2);
}
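As a sanity check on the math in cpu_f_gradf: the three BLAS calls compute diff = src - dy and f = Σ (src - dy)², since caffe_cpu_asum over the squared (hence non-negative) vector is simply its sum. A minimal stand-alone sketch of that computation, with the Caffe wrappers replaced by plain loops (the names src, dy, diff here are illustrative, not Caffe API):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain-loop equivalent of caffe_sub + caffe_mul + caffe_cpu_asum:
// writes diff = src - dy and returns the sum of squared differences.
float sum_squared_diff(const std::vector<float>& src,
                       const std::vector<float>& dy,
                       std::vector<float>& diff) {
    float total = 0.0f;
    for (std::size_t i = 0; i < src.size(); ++i) {
        diff[i] = src[i] - dy[i];     // caffe_sub: diff = src - dy
        total  += diff[i] * diff[i];  // caffe_mul then caffe_cpu_asum
    }
    return total;
}
```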

It is launched in the deconv method by:

my_cost_function func (classifier, m_layer1, d_y, num1, m_layer2, num2, id1, id2);

lbfgs solver (func);

lbfgs::status s = solver.cpu_lbfgs(d_x);

std::cout << solver.statusToString(s) << std::endl;

diff is correctly modified, but (as you can see in the log) the Forward/Backward pass seems broken somewhere and does not propagate the change. I don't think it is a model-loading problem, because apart from the hack in the math function files (for cuBLAS v1), Caffe is the version cloned from your hub with all the NuGet dependencies.

Thank you for your help.


hmmlillian avatar hmmlillian commented on July 17, 2024

@dorpxam Thanks for sharing your code.
I think the major problem in your code is caused by unsynchronized CPU and GPU memory. Caffe tracks one of four states for each blob's memory: UNINITIALIZED, HEAD_AT_CPU, HEAD_AT_GPU, and SYNCED. In your situation the state should be SYNCED to ensure data consistency between CPU and GPU, but calling mutable_cpu_diff() switches it to HEAD_AT_CPU. As a result, only the CPU data is updated and the GPU copy becomes stale. You may check the state before and after. Maybe async_gpu_push can be used to copy data from CPU to GPU before calling BackwardFromTo(m_id1, m_id2 + 1). More details can be found in the Caffe source code (https://github.com/BVLC/caffe/blob/master/src/caffe/syncedmem.cpp).
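For intuition, the head-state bookkeeping can be mocked in a few lines. This is a toy model of the state machine only, not Caffe's actual SyncedMemory class; the names and methods below are illustrative:

```cpp
#include <cassert>

// Toy model of Caffe's SyncedMemory head states (see syncedmem.cpp).
// Only the state transitions are modeled; there are no real buffers.
enum Head { UNINITIALIZED, HEAD_AT_CPU, HEAD_AT_GPU, SYNCED };

struct ToySyncedMem {
    Head head = UNINITIALIZED;

    // mutable_cpu_diff()-style access: the caller may write the CPU side,
    // so the GPU copy must be treated as stale.
    void mutable_cpu() { head = HEAD_AT_CPU; }

    // gpu_data()-style read access: if the CPU side is newer (or nothing
    // is initialized), a CPU->GPU copy would run and the state syncs.
    void to_gpu() {
        if (head == HEAD_AT_CPU || head == UNINITIALIZED) head = SYNCED;
    }
};
```

The failure mode in the thread is visible here: after mutable_cpu() the state is HEAD_AT_CPU, and unless something triggers the CPU-to-GPU copy before the backward pass, the GPU kernels keep reading the stale copy.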
BTW, the input data type should be float instead of floatdouble.
Hope this could be helpful.


dorpxam avatar commented on July 17, 2024

@hmmlillian Thanks for your answer.
I understand. As I said before, I initialize Caffe in CPU mode, so the trained data is loaded into cpu_ptr memory blobs. The classifiers were modified to load the different memory vectors (data_A / data_AP / data_B / data_BP) in CPU memory, mutable or copied. Because of my limited GPU memory, I only move the CPU memory to the GPU when the block/thread CUDA methods in the deep_image_analogy functions need it. The DeepAnalogy.cu file, in pseudo code:

classifier_A -> load all layers on CPU from trained data into vectors of A / AP
classifier_B -> load all layers on CPU from trained data into vectors of B / BP

No change to ANN/ANND -> host on CPU / device on GPU

for each layer // from 32 to 512
{
    memcpy current layer (A/AP/B/BP) to GPU

    no change in the GPU process for:
    init/upsample -> norm -> blend -> norm -> patchmatch

    process avg_vote on GPU too and move the target result to CPU

    copy A/AP/B/BP back to CPU

    launch deconv on CPU twice, for data_AP[next_layer] and data_B[next_layer]

    free CUDA memory for the current layer of A/AP/B/BP
}
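The staging loop above can be sketched host-only as follows; a std::vector stands in for cudaMalloc'd device memory, plain copies stand in for cudaMemcpy, and process() stands in for the GPU pipeline (norm -> blend -> patchmatch -> avg_vote). All names here are illustrative, not from the project:

```cpp
#include <cassert>
#include <vector>

// Host-only sketch of the per-layer upload / process / download pattern.
struct LayerStager {
    std::vector<float> device;  // stand-in for GPU memory

    void stage_up(const std::vector<float>& host) {
        device = host;          // stand-in for cudaMemcpyHostToDevice
    }
    void process() {
        // GPU kernels would run here; placeholder transform instead.
        for (float& v : device) v *= 2.0f;
    }
    void stage_down(std::vector<float>& host) {
        host = device;          // stand-in for cudaMemcpyDeviceToHost
        device.clear();         // stand-in for cudaFree on this layer
    }
};
```

The point of the pattern is that device memory only ever holds one layer at a time, which is what keeps the footprint within a small GPU.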

So the memory only goes to the GPU temporarily, just to process the different layers and run the code in "GeneralizedPatchMatch.cu" without any change.

Note that floatdouble is not typedef'd as double but as float, of course ;)

I will investigate the question of async memory and the states as you said.

In any case, thanks a lot for your help.

