densenet.pytorch's Issues

Help needed on reproducing the performance on Cifar-100

I used the default settings (which I believe correspond to DenseNet-BC with k = 12 and data augmentation) on CIFAR-100, changing only the dataset class name and the nClasses variable. The training curve looks like this:
[training curve plot]
Although training has not finished yet, from the training curves of other networks on CIFAR-100 I can tell there will be no further major changes in accuracy. The highest accuracy so far is 75.59%, which only matches the reported performance of the depth-40, k = 12 DenseNet with data augmentation.
Has anyone tested this repo on CIFAR-100 yet?

How did you create the header.png?

I'm quite impressed with how you've presented your densenet implementation.

V-Net is a bit messier in terms of needing substantial preprocessing of the data set, a custom loader, and a custom loss function. Nonetheless, I'm patterning the presentation of my implementation https://github.com/mattmacy/vnet.pytorch after yours, and I'm wondering how you created the header.png image.

Thanks in advance.

Error when loading the saved model

Hi,

I modified your code to train a model with my own dataset, and I am trying to load the model saved as "latest.pth" to do some tests. However, I am getting this error:

AttributeError: 'DenseNet' object has no attribute 'copy'

The code I use to load the model is:

    net.load_state_dict(torch.load(checkpoint, map_location=lambda storage, loc: storage))

where checkpoint is the path to "latest.pth".

Any help would be appreciated.

Thanks
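
For what it's worth, the train.py in this repo saves the entire module with torch.save(net, ...) rather than a state dict, so torch.load returns a DenseNet instance and calling load_state_dict on it fails with exactly this AttributeError. A minimal sketch of two ways around it, assuming the checkpoint was written by the unmodified training script:

    # Option 1: the checkpoint already holds the whole module; load it directly.
    net = torch.load(checkpoint, map_location=lambda storage, loc: storage)

    # Option 2: change the saving side to write a state dict instead:
    #   torch.save(net.state_dict(), os.path.join(args.save, 'latest.pth'))
    # then the original loading line works unchanged:
    net.load_state_dict(torch.load(checkpoint, map_location=lambda storage, loc: storage))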

Multi-GPU implementation

Hi author

Thanks for sharing your code. I notice in the README you said "Multi-gpu help wanted". If you mean data parallelism, it can be implemented in a few lines in PyTorch using nn.DataParallel.

In your train.py (line 82), simply change

    if args.cuda:
        net = net.cuda()

to

    if args.cuda:
        net = net.cuda()
        net = nn.DataParallel(net, device_ids=[0, 1, 2, 3])

to make the whole model data-parallel; nn.DataParallel splits each input batch across the listed GPUs.

Cat vs Dog

I have slightly modified your algorithm and adapted it for two classes (k = 12, reduction = 0.5, bottleneck = True). When I train it on cat and dog images from CIFAR-10, I only reach 82% validation accuracy. Is that what you get as well? Or do you get something closer to the accuracy for all 10 classes, i.e. > 95%?
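
For anyone trying to reproduce this, here is a minimal sketch of one way to carve the two-class subset out of torchvision's CIFAR-10. The class indices 3 = cat and 5 = dog are the standard CIFAR-10 labels, but the in-place filtering below is my own approach and assumes a torchvision version that exposes .data and .targets:

    import torchvision

    ds = torchvision.datasets.CIFAR10(root='cifar', train=True, download=True)
    labels = ds.targets
    keep = [i for i, y in enumerate(labels) if y in (3, 5)]  # 3 = cat, 5 = dog
    ds.data = ds.data[keep]                                  # numpy fancy indexing
    ds.targets = [0 if labels[i] == 3 else 1 for i in keep]  # remap labels to 0/1

The network would then be built with nClasses=2, as described above.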

Question: What is the purpose of this piece of code in densenet.py?

I'm learning CNN architectures. Can you please tell me what this piece of code does and why it is needed? I could not relate it to the paper.

    for m in self.modules():
        if isinstance(m, nn.Conv2d):
            # He (Kaiming) initialization: zero-mean Gaussian with
            # std = sqrt(2 / n), where n is the conv layer's fan-out
            n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
            m.weight.data.normal_(0, math.sqrt(2. / n))
        elif isinstance(m, nn.BatchNorm2d):
            # BatchNorm starts as the identity: scale 1, shift 0
            m.weight.data.fill_(1)
            m.bias.data.zero_()
        elif isinstance(m, nn.Linear):
            # the classifier bias starts at zero
            m.bias.data.zero_()
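
For context, this is the He et al. (2015) weight initialization scheme: convolution weights are drawn from a zero-mean Gaussian with std sqrt(2/n) where n is the fan-out, BatchNorm layers start as the identity, and biases start at zero. A sketch of the equivalent written with torch.nn.init, assuming a reasonably recent PyTorch (kaiming_normal_ with mode='fan_out' computes the same standard deviation):

    import torch.nn as nn

    for m in self.modules():
        if isinstance(m, nn.Conv2d):
            # same as normal_(0, sqrt(2 / (k_h * k_w * out_channels)))
            nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
        elif isinstance(m, nn.BatchNorm2d):
            nn.init.constant_(m.weight, 1)
            nn.init.constant_(m.bias, 0)
        elif isinstance(m, nn.Linear):
            nn.init.constant_(m.bias, 0)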

Add bug solution/fix to bug discussion page

Hey,

I think it would be good to include how your CIFAR-10 convergence problem was solved. At the moment the discussion page only includes the problem details.

Good to hear you got it working.

There is a size mismatch due to these lines:

    out = self.dense3(out)
    out = torch.squeeze(F.avg_pool2d(F.relu(self.bn1(out)), 8))

I solved the issue by changing those lines to

    out = self.dense3(out)
    out = self.relu(self.bn1(out))
    out = F.avg_pool2d(out, 8)
    out = out.view(-1, self.nChannels)

where self.relu has been initialised as self.relu = nn.ReLU(inplace=True). Using view(-1, self.nChannels) instead of torch.squeeze also avoids accidentally dropping the batch dimension when the batch size is 1.

Why is there a PID on device 0 when I call cuda(1) everywhere?

[screenshot: 2017-05-17-160814_1914x1005_scrot]

I have changed train.py so that everything is moved with cuda(1), as you can see below. But why does the same PID also appear on device 0? Am I missing something?

#!/usr/bin/env python3

import argparse
import os
import setproctitle
import shutil

import densenet
import torch
from torch import optim
from torch.autograd import Variable
from torch.nn import functional as F
from torch.utils.data import DataLoader
import torchvision
from torchvision import transforms


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--batchSz', type=int, default=64)
    parser.add_argument('--nEpochs', type=int, default=300)
    parser.add_argument('--no-cuda', action='store_false')
    parser.add_argument('--save')
    parser.add_argument('--seed', type=int, default=1)
    parser.add_argument(
            '--opt', type=str, default='sgd',
            choices=('sgd', 'adam', 'rmsprop'))
    args = parser.parse_args()

    args.cuda = args.no_cuda and torch.cuda.is_available()
    if args.cuda:
        torch.cuda.manual_seed(args.seed)

    args.save = args.save or 'work/densenet.base'
    setproctitle.setproctitle(args.save)
    if os.path.exists(args.save):
        shutil.rmtree(args.save)
    os.makedirs(args.save, exist_ok=True)

    torch.manual_seed(args.seed)

    normMean = [0.49139968, 0.48215827, 0.44653124]
    normStd = [0.24703233, 0.24348505, 0.26158768]
    normTransform = transforms.Normalize(normMean, normStd)
    trainTransform = transforms.Compose([
            transforms.RandomCrop(32, padding=4),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            normTransform
    ])
    testTransform = transforms.Compose([
            transforms.ToTensor(),
            normTransform
    ])

    kwargs = {'num_workers': 1, 'pin_memory': True} if args.cuda else {}
    trainLoader = DataLoader(
            torchvision.datasets.CIFAR10(
                    root='cifar',
                    train=True,
                    download=True,
                    transform=trainTransform),
            batch_size=args.batchSz, shuffle=True, **kwargs)
    testLoader = DataLoader(
            torchvision.datasets.CIFAR10(
                    root='cifar',
                    train=False,
                    download=True,
                    transform=testTransform),
            batch_size=args.batchSz, shuffle=False, **kwargs)

    net = densenet.DenseNet(
            growthRate=12,
            depth=100,
            reduction=0.5,
            bottleneck=True,
            nClasses=10)

    print('  + Number of params: {}'.format(
            sum([p.data.nelement() for p in net.parameters()])))
    if args.cuda:
        net = net.cuda(1)

    if args.opt == 'sgd':
        optimizer = optim.SGD(
                net.parameters(), lr=1e-1, momentum=0.9, weight_decay=1e-4)
    elif args.opt == 'adam':
        optimizer = optim.Adam(net.parameters(), weight_decay=1e-4)
    elif args.opt == 'rmsprop':
        optimizer = optim.RMSprop(net.parameters(), weight_decay=1e-4)

    trainF = open(os.path.join(args.save, 'train.csv'), 'w')
    testF = open(os.path.join(args.save, 'test.csv'), 'w')

    for epoch in range(1, args.nEpochs + 1):
        adjust_opt(args.opt, optimizer, epoch)
        train(args, epoch, net, trainLoader, optimizer, trainF)
        test(args, epoch, net, testLoader, optimizer, testF)
        torch.save(net, os.path.join(args.save, 'latest.pth'))
        os.system('./plot.py {} &'.format(args.save))

    trainF.close()
    testF.close()


def train(args, epoch, net, trainLoader, optimizer, trainF):
    net.train()
    nProcessed = 0
    nTrain = len(trainLoader.dataset)
    for batch_idx, (data, target) in enumerate(trainLoader):
        if args.cuda:
            data, target = data.cuda(1), target.cuda(1)
        data, target = Variable(data), Variable(target)
        optimizer.zero_grad()
        output = net(data)
        loss = F.nll_loss(output, target)
        # make_graph.save('/tmp/t.dot', loss.creator); assert(False)
        loss.backward()
        optimizer.step()
        nProcessed += len(data)
        pred = output.data.max(1)[1]  # index of the max log-probability
        incorrect = pred.ne(target.data).cpu().sum()
        err = 100.0 * incorrect / len(data)
        partialEpoch = epoch + batch_idx / len(trainLoader) - 1
        print('Train Epoch: {:.2f} [{}/{} ({:.0f}%)]\t'
              'Loss: {:.6f}\tError: {:.6f}'.format(
                  partialEpoch, nProcessed, nTrain,
                  100. * batch_idx / len(trainLoader),
                  loss.data[0], err))
        trainF.write('{},{},{}\n'.format(partialEpoch, loss.data[0], err))
        trainF.flush()


def test(args, epoch, net, testLoader, optimizer, testF):
    net.eval()
    test_loss = 0
    incorrect = 0
    for data, target in testLoader:
        if args.cuda:
            data, target = data.cuda(1), target.cuda(1)
        data, target = Variable(data, volatile=True), Variable(target)
        output = net(data)
        test_loss += F.nll_loss(output, target).data[0]
        pred = output.data.max(1)[1]  # index of the max log-probability
        incorrect += pred.ne(target.data).cpu().sum()
    test_loss /= len(testLoader)  # loss function already averages over batch size
    nTotal = len(testLoader.dataset)
    err = 100.0 * incorrect / nTotal
    print()
    print(
            'Test set: Average loss: {:.4f}\n'
            'Error: {}/{} ({:.0f}%)\n'.format(
                    test_loss,
                    incorrect, nTotal, err))

    testF.write('{},{},{}\n'.format(epoch, test_loss, err))
    testF.flush()


def adjust_opt(optAlg, optimizer, epoch):
    if optAlg == 'sgd':
        if epoch < 150:
            lr = 1e-1
        elif epoch == 150:
            lr = 1e-2
        elif epoch == 225:
            lr = 1e-3
        else:
            return

        for param_group in optimizer.param_groups:
            param_group['lr'] = lr

if __name__ == '__main__':
    main()
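
A likely explanation, though I haven't confirmed it on this exact setup: even with the model and tensors on GPU 1, operations that touch the current device (torch.cuda.manual_seed, pinned-memory staging from pin_memory=True) initialize a CUDA context on device 0, so nvidia-smi lists the process there as well. A sketch of two common workarounds:

    # Workaround 1 (shell): hide device 0 from the process entirely; the
    # remaining GPU then becomes device 0 inside PyTorch, so plain .cuda()
    # works everywhere:
    #   CUDA_VISIBLE_DEVICES=1 python train.py

    # Workaround 2 (Python): make device 1 current before any CUDA work, so
    # the context is created there instead of on device 0.
    import torch
    torch.cuda.set_device(1)
    torch.cuda.manual_seed(args.seed)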

Is the DenseBlock Implementation correct?

Looking at your DenseBlock implementation, I don't see how the activations of layers earlier than the immediately preceding one are propagated to the later layers. Is the implementation really the same as in the DenseNet paper?
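
If I'm reading the implementation right, the dense connectivity is implicit rather than explicit: each layer ends with torch.cat((x, out), 1), so its output carries its own new feature maps plus everything it received, and the next layer's input is therefore the concatenation of all earlier layers' outputs. A toy sketch of the pattern (the class below is illustrative, not the repo's code):

    import torch
    import torch.nn as nn

    class ToyDenseLayer(nn.Module):
        def __init__(self, nIn, growthRate):
            super().__init__()
            self.conv = nn.Conv2d(nIn, growthRate, kernel_size=3, padding=1)

        def forward(self, x):
            out = self.conv(x)
            # concatenating the input keeps every earlier layer's feature
            # maps flowing forward -- this is the "dense" connection
            return torch.cat((x, out), 1)

    # layer i sees nIn + i * growthRate input channels
    block = nn.Sequential(ToyDenseLayer(16, 12), ToyDenseLayer(28, 12))
    y = block(torch.randn(1, 16, 8, 8))  # shape: 1 x 40 x 8 x 8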

How can it be run on CIFAR-100?

I changed the dataset class name in the code from CIFAR10 to CIFAR100 but got several errors during loss.backward(), such as CUDNN_STATUS_MAPPING_ERROR or "cublas runtime error: the gpu program failed to execute". So I guess there must be something specific to CIFAR-10 in this code, but I can't find it.
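
A likely cause, though it's an assumption since I can't reproduce the setup: if the network is still constructed with nClasses=10, CIFAR-100 labels 10-99 index past the classifier output and trigger a device-side assert, which surfaces as opaque cuDNN/cuBLAS errors like the ones above. A sketch of the two changes that should be needed in train.py:

    trainLoader = DataLoader(
            torchvision.datasets.CIFAR100(   # was CIFAR10; same for testLoader
                    root='cifar', train=True,
                    download=True, transform=trainTransform),
            batch_size=args.batchSz, shuffle=True, **kwargs)

    net = densenet.DenseNet(growthRate=12, depth=100, reduction=0.5,
                            bottleneck=True, nClasses=100)  # was nClasses=10

The normalization constants in the script are CIFAR-10's; CIFAR-100's statistics differ slightly, but that alone would not cause a crash.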
