deephash-pytorch's People

Contributors

fuchun-wang, riesling00, swuxyj

deephash-pytorch's Issues

colab

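Have you considered adapting the code so that it can run on Colab TPUs? That would seemingly save a lot of time. Just a suggestion, nothing more; I really admire your work.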

A small mistake about the implementation of "Unsupervised_BiHalf.py"

Hey,

I am one of the authors of the paper "Unsupervised_BiHalf", and thanks a lot for including our paper in this nice implementation.

I noticed a small mistake in the test-time implementation of the model, at lines 77-79.
During training, we align the continuous distribution p(u) with the half-half distribution of +1 and −1 (the bi-half layer). This allows us at test time to simply use the sign function as a deterministic function for quantization.

Could you help us change bi-half to sign at test time? Thanks a lot!
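
To make the distinction concrete, here is a minimal pure-Python sketch of the two quantization rules described above. It is a toy illustration, not the code from Unsupervised_BiHalf.py; the function names `sign_quantize` and `bihalf_quantize` and the sample values are hypothetical.

```python
def sign_quantize(codes):
    """Deterministic quantization: +1 for non-negative entries, -1 otherwise."""
    return [[1 if v >= 0 else -1 for v in row] for row in codes]


def bihalf_quantize(codes):
    """Batch-dependent rule: for each bit, the larger half of the batch gets +1."""
    n, bits = len(codes), len(codes[0])
    out = [[-1] * bits for _ in range(n)]
    for j in range(bits):
        # Rank the samples by their value on bit j, largest first.
        order = sorted(range(n), key=lambda i: codes[i][j], reverse=True)
        for i in order[: n // 2]:
            out[i][j] = 1
    return out


# Hypothetical continuous codes: 4 samples, 2 bits.
u = [[0.9, 0.2], [0.4, 0.1], [0.3, -0.5], [-0.7, -0.8]]
sign_codes = sign_quantize(u)
bihalf_codes = bihalf_quantize(u)
print(sign_codes)
print(bihalf_codes)
```

In this toy batch the third sample has a positive first bit (0.3), so sign maps it to +1, while the bi-half rule maps it to -1 because two other samples rank higher. The sign result for a sample does not depend on which other samples share its batch, which is the property wanted at test time.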

DSH runs very slowly on my own dataset

`class DSHLoss(torch.nn.Module):
    def __init__(self, config, bit):
        super(DSHLoss, self).__init__()
        self.m = 2 * bit
        self.U = torch.zeros(config["num_train"], bit).float().to(config["device"])
        self.Y = torch.zeros(config["num_train"], config["n_class"]).float().to(config["device"])

    def forward(self, u, y, ind, config):
        self.U[ind, :] = u.data
        self.Y[ind, :] = y.float()

        dist = (u.unsqueeze(1).half() - self.U.unsqueeze(0).half()).pow(2).sum(dim=2)
        y = (y @ self.Y.t() == 0).float()

        loss = (1 - y) / 2 * dist + y / 2 * (self.m - dist).clamp(min=0)
        loss1 = loss.mean()
        loss2 = config["alpha"] * (1 - u.abs()).abs().mean()

        return loss1 + loss2


def hashing_loss(b, cls, m, alpha):
    """
    compute hashing loss
    automatically consider all n^2 pairs
    """
    y = (cls.unsqueeze(0) != cls.unsqueeze(1)).float().view(-1)
    dist = ((b.unsqueeze(0) - b.unsqueeze(1)) ** 2).sum(dim=2).view(-1)
    loss = (1 - y) / 2 * dist + y / 2 * (m - dist).clamp(min=0)

    loss = loss.mean() + alpha * (b.abs() - 1).abs().sum(dim=1).mean() * 2

    return loss`

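The first is the loss in your code; the second is the DSH-pytorch implementation. The second does not build tensors of shape [num_train, hashbit]. Could building them be what slows training down?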
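
A rough way to compare the two losses is to count the elements of the broadcast difference tensor each one builds: the first subtracts the batch from all stored training codes, giving shape (batch, num_train, bit), while the second only compares the batch with itself, giving (batch, batch, bit). The sketch below does this arithmetic; the batch size, training-set size, and bit length are hypothetical values chosen for illustration, not settings taken from the repository.

```python
def pairwise_tensor_elements(n_rows, n_cols, bit):
    """Number of elements in a broadcast (n_rows, n_cols, bit) difference tensor."""
    return n_rows * n_cols * bit


# Hypothetical sizes, for illustration only.
batch_size, num_train, bit = 64, 10000, 48

against_all = pairwise_tensor_elements(batch_size, num_train, bit)    # first loss
within_batch = pairwise_tensor_elements(batch_size, batch_size, bit)  # second loss
ratio = against_all // within_batch

print(against_all, within_batch, ratio)
```

Under these assumed sizes the first formulation touches roughly num_train / batch_size times more elements per step, so its cost grows with the size of the training set, whereas the in-batch version does not.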

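The COCO training set differs from the original HashNet

Hello, thank you very much for your contribution!
While running experiments, I found that the training set of the COCO dataset is inconsistent with that of the original HashNet. Specifically, COCO_val2014_000000480663.jpg, for example, does not appear in the original HashNet training set. Could you help check this? Thank you very much!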

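A question about computing mAP@K

Take computing mAP@3 as an example:

case1: [+ - +] AP@3 = (1 + 2/3) / 2 = 5/6
case2: [+ - -] AP@3 = (1) / 1 = 1

There are two cases: case 1 returns two positive samples and case 2 returns one, yet the computed AP@3 of case 1 is lower than that of case 2.
Is this reasonable?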
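
The arithmetic in the two cases can be reproduced with a short sketch. It assumes the AP@K definition implied by the numbers above, namely the mean of precision@i over the positions i ≤ K that hold a relevant item; whether the repository uses exactly this normalization is an assumption, and `average_precision_at_k` is a hypothetical name.

```python
def average_precision_at_k(relevance, k):
    """Mean of precision@i over the top-k positions i that hold a relevant item."""
    hits = 0
    precisions = []
    for i, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            precisions.append(hits / i)
    # With no relevant item in the top k, AP@K is taken as 0 here.
    return sum(precisions) / len(precisions) if precisions else 0.0


ap_case1 = average_precision_at_k([1, 0, 1], k=3)  # [+ - +]
ap_case2 = average_precision_at_k([1, 0, 0], k=3)  # [+ - -]
print(ap_case1, ap_case2)
```

Because the sum is divided by the number of relevant items actually retrieved, this form of AP@K rewards placing the retrieved positives early and does not reward retrieving more of them, which is why case 2 scores higher than case 1.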

bug report

In GreedyHash.py, line 66, y_pre is not defined.
I looked through the commit history and think we should add y_pre = self.fc(b) before that line.

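How to train Greedy Hash on the multi-label datasets NUS-WIDE and COCO

Thank you very much for your project; it is really great. I have a question: Greedy Hash is based on a cross-entropy classification loss, but NUS and COCO contain multi-label images. How can Greedy Hash be trained on them? I see that you reported results, but I do not know how the training was done.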

How to speed up the loss computation in DTSH

Hi, swuxyj. Nice work for this community.
The training loss of DTSH contains a for loop, which is somewhat time-consuming. Is there any chance to speed up this op? It seems that the for loop could be parallelized.

pairwise and triplet data preparation

Hi, thank you for your awesome work. I'm learning PyTorch, so it's a little hard for me to understand your code. How do you prepare pairwise or triplet data and feed it into the model in the training phase?
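
One common approach, and the one suggested by the `y @ self.Y.t()` line in the DSHLoss snippet quoted in an earlier issue on this page, is to feed ordinary labeled batches and derive the pair labels from the label vectors inside the loss, instead of preparing pair files in advance. The sketch below shows that idea in plain Python; `pairwise_similarity` is a hypothetical helper, not a function from the repository.

```python
def pairwise_similarity(labels_a, labels_b):
    """S[i][j] = 1 if sample i of labels_a and sample j of labels_b share a label."""
    return [
        [1 if sum(x * y for x, y in zip(a, b)) > 0 else 0 for b in labels_b]
        for a in labels_a
    ]


# Hypothetical batch of three multi-hot label vectors over three classes.
batch_labels = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 1],
]
similarity = pairwise_similarity(batch_labels, batch_labels)
print(similarity)
```

Triplets can be derived in the same spirit: for an anchor i, any j with S[i][j] = 1 is a candidate positive and any k with S[i][k] = 0 is a candidate negative.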

Would anyone be willing to share their pretrained model?

Hi! I am looking to do just inference and would love to avoid retraining on something like ImageNet (I don't have enough GPUs). Would anyone be willing to provide a pretrained model for any of the deep hashing methods (DHN, DSH, etc.), trained on ImageNet, NUS-WIDE, or similar?

I would really appreciate it!! Thanks.

Explanation why to use cifar10-2

image

(Database0 represents unique images that are not in the train set)

When we test our model, we assume a real-world situation in which we can train on the already-seen database before making unseen queries. Therefore we can include the train set in the database set. However, for NUS and COCO the classes are not balanced, so we have no guarantee of 193734//21 = 9225 images for every class. Empirically, I have found that in the nuswide_21 code.py we can set train_num up to ~2100. I do not know why someone decided to use only 500 per class, but I do know that Train <= Database.
Thus, we have 3 scenarios:

  1. CIFAR is balanced, so we can use Train = Database
  2. NUS: we try to balance, so Train <= Database
  3. COCO: we do not balance (too hard?), so Train <= Database

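tools.py cifar_dataset_root path problem

At runtime I get an error saying there is no permission to create the directory. Tracing it shows that the path is wrong; I suggest the following change.
Original code:
cifar_dataset_root = '/dataset/cifar/'
Suggested change: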
cifar_dataset_root = os.getcwd()+'/dataset/cifar/'
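
A path that starts with "/" points at the filesystem root, which an unprivileged user usually cannot write to, which would explain the permission error. As a slightly more portable variant of the suggested fix, the sketch below builds the path with `os.path.join`; the helper name `resolve_dataset_root` is hypothetical and not part of tools.py.

```python
import os


def resolve_dataset_root(*parts):
    """Join path components onto the current working directory."""
    return os.path.join(os.getcwd(), *parts)


# Resolves to <current working directory>/dataset/cifar
cifar_dataset_root = resolve_dataset_root("dataset", "cifar")
print(cifar_dataset_root)
```

Note that this variant has no trailing separator, so any code that concatenates file names onto the string directly would need `os.path.join` as well.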

HashNet result discrepancy

Running python HashNet.py directly on the CIFAR dataset, I get a metric of 80.4%, which differs from the number in your documentation.

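Question about hyperparameters

Hello, could you share the hyperparameters used for ImageNet training (and, if possible, for all datasets)?
With these three methods on ImageNet I cannot reach the scores in the original papers; mine all hover around 0.3~0.5, far from the originals, and training with the original papers' hyperparameters does not work either.
Thank you very much.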

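About failing to reproduce the paper results

Thank you very much for your work. I would like to ask why the Bayesian-network family of methods (HashNet, DCH, etc.) all fail to reach the reported results. Does it depend on which GPU is used? Your reproduced results are close to the papers'; I used your code as well, but it still does not work. Is there some detail I am missing?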

Question about dataset

If I have an image classification dataset, how could I get the onehot label of each image for retrieval?
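
For a single-label classification dataset, the one-hot label is simply a vector with a 1 at the class index. The sketch below builds such vectors and formats them as list-file lines. The "image path followed by space-separated 0/1 values" layout is an assumption based on the train/test/database .txt files mentioned in other issues on this page, so please check it against the repository's data loader; `one_hot` and `to_list_line` are hypothetical helpers.

```python
def one_hot(class_index, n_class):
    """Return a list of length n_class with a single 1 at class_index."""
    if not 0 <= class_index < n_class:
        raise ValueError("class_index out of range")
    return [1 if i == class_index else 0 for i in range(n_class)]


def to_list_line(image_path, class_index, n_class):
    """Format one sample as '<path> <l_0> <l_1> ...' (assumed file layout)."""
    label = " ".join(str(v) for v in one_hot(class_index, n_class))
    return image_path + " " + label


# Hypothetical sample: class 2 out of 4 classes.
line = to_list_line("img/cat_001.jpg", 2, 4)
print(line)
```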

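Experimental results

Did you reproduce all of these yourself? I think that is really impressive. Are the parameters used in the code the same as those used in the papers, and did you obtain the mAP values from your own experiments?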

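COCO dataset

Hello, thank you very much for sharing the code. Is there a download link for the COCO data that is accessible from within China? When I download from the Google Drive link you provided, the download fails partway through every time and cannot be resumed. Thank you very much.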

CSQ: big class num

Hi~ I want to train CSQ on a person re-ID task.
The number of classes is more than 40,000.
The code used to set the hash target centers can be extremely time-consuming.
Do you have any suggestions?
@swuxyj

`

if H_2K.shape[0] < n_class:
    hash_targets.resize_(n_class, bit)
    for k in range(20):
        for index in range(H_2K.shape[0], n_class):
            ones = torch.ones(bit)
            # Bernoulli distribution
            sa = random.sample(list(range(bit)), bit // 2)
            ones[sa] = -1
            hash_targets[index] = ones
        # to find average/min  pairwise distance
        c = []
        for i in range(n_class):
            for j in range(n_class):
                if i < j:
                    TF = sum(hash_targets[i] != hash_targets[j])
                    c.append(TF)
        c = np.array(c)

        # choose min(c) in the range of K/4 to K/3
        # see in https://github.com/yuanli2333/Hadamard-Matrix-for-hashing/issues/1
        # but it is hard when bit is  small
        if c.min() > bit / 4 and c.mean() >= bit / 2:
            print(c.min(), c.mean())
            break

`
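
One possible direction, offered as a sketch and not as the maintainer's answer: for codes with entries in {+1, -1}, the Hamming distance follows from the inner product as (bit - a·b) / 2, so all pairwise distances can be read off a single matrix product of the code matrix with its transpose instead of the Python double loop. The pure-Python check below only verifies that identity on a few random balanced codes; the helper names are hypothetical.

```python
import random


def random_balanced_code(bit, rng):
    """A +/-1 code with exactly bit // 2 entries set to -1, as in the loop above."""
    code = [1] * bit
    for idx in rng.sample(range(bit), bit // 2):
        code[idx] = -1
    return code


def hamming_direct(a, b):
    """Count the positions where the two codes differ."""
    return sum(x != y for x, y in zip(a, b))


def hamming_from_dot(a, b):
    """For +/-1 codes: dot = equal - differing = bit - 2 * hamming."""
    dot = sum(x * y for x, y in zip(a, b))
    return (len(a) - dot) // 2


rng = random.Random(0)
codes = [random_balanced_code(64, rng) for _ in range(8)]
identity_holds = all(
    hamming_direct(a, b) == hamming_from_dot(a, b) for a in codes for b in codes
)
print(identity_holds)
```

With more than 40,000 classes the full distance matrix has over 1.6 billion entries, so in practice the product would probably need to be computed in row chunks while tracking only the running minimum and mean.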

Include Deep Perceptual Hash Based on Hash Center

Anyone keen to build this one?
It's supposed to perform pretty well: https://ieeexplore.ieee.org/document/9950236

I had a crack at it but keep getting FFFFF hashes. I think I'm way off...

import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array
from tensorflow.keras.applications.resnet50 import preprocess_input, ResNet50
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, CenterCrop
from scipy.linalg import hadamard
import random
import os

# Function to generate hash centers using a Hadamard matrix
def generate_hash_centers(hash_size):
    assert (hash_size & (hash_size - 1) == 0) and hash_size != 0, "Hash size must be a power of 2"
    H = hadamard(hash_size)
    return np.where(H > 0, 1, 0)

# Custom loss function
import tensorflow as tf

def hamming_distance(tensor):
    """Compute pairwise Hamming distance for a batch of binary vectors."""
    x = tf.cast(tensor, dtype=tf.int32)
    x_expand = tf.expand_dims(x, 2)  # Expand to make it a 3D tensor
    x_t = tf.transpose(x_expand, [1, 0, 2])
    distances = tf.math.reduce_sum(tf.math.abs(x_expand - x_t), axis=-1)
    return distances

def custom_loss(hash_centers, margin=0.5, lambda_dist=0.1):
    """
    Custom loss function incorporating distinct quantization with central similarity.
    - hash_centers: Predefined hash centers for each class
    - margin: Minimum desired Hamming distance between different class hash outputs
    - lambda_dist: Weighting factor for the distinct quantization component of the loss
    """
    hash_centers_tensor = tf.constant(hash_centers, dtype=tf.float32)

    def loss(y_true, y_pred):
        # Convert predictions to binary
        y_pred_binary = tf.round(y_pred)  # Threshold predictions to 0 or 1

        # Central similarity loss
        centers = tf.gather(hash_centers_tensor, tf.cast(y_true, tf.int32))
        central_similarity_loss = tf.reduce_mean(tf.keras.losses.binary_crossentropy(centers, y_pred))

        # Calculate pairwise Hamming distances for binary predictions
        # Expanded predictions to compare each pair
        expanded_pred = tf.expand_dims(y_pred_binary, 0)
        transposed_pred = tf.expand_dims(y_pred_binary, 1)
        # Calculate Hamming distance
        hamming_distances = tf.reduce_sum(tf.abs(expanded_pred - transposed_pred), axis=2)

        # Mask for distinct quantization: exclude self and same-class comparisons
        batch_size = tf.shape(y_pred)[0]
        mask_self = 1 - tf.eye(batch_size)
        labels_equal = tf.equal(tf.expand_dims(y_true, 0), tf.expand_dims(y_true, 1))
        mask_class = 1 - tf.cast(labels_equal, dtype=tf.float32)
        mask = mask_self * mask_class

        # Distinct quantization loss
        penalties = tf.maximum(0., margin - tf.cast(hamming_distances, tf.float32))
        distinct_loss = tf.reduce_sum(penalties * mask) / (tf.reduce_sum(mask) + 1e-8)

        # Combine losses
        return central_similarity_loss + lambda_dist * distinct_loss

    return loss

# Model creation function
def create_model(hash_size=64):
    base_model = ResNet50(include_top=False, input_shape=(224, 224, 3), pooling='avg')
    hash_layer = Dense(hash_size, activation='sigmoid')
    model = Sequential([base_model, hash_layer])
    return model

def add_noise(img):
    VARIABILITY = 25
    deviation = VARIABILITY * random.random()
    noise = np.random.normal(0, deviation, img.shape)
    # np.clip returns a new array, so its result must be kept
    return np.clip(img + noise, 0., 255.)

def preprocess_image(img):
    img = add_noise(img)
    img = preprocess_input(img)
    return img

# Function to preprocess images for training and generate augmented images
def preprocess_images(image_directory, batch_size=32):
    datagen = ImageDataGenerator(
        preprocessing_function=preprocess_image,
        rotation_range=10,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,

        horizontal_flip=True,
        fill_mode='nearest'
    )
    generator = datagen.flow_from_directory(
        image_directory,
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode='sparse'  # Assuming sparse labels for hash centers
    )
    if generator.samples == 0:
        print("No images found in specified directory.")
    else:
        print(f"Found {generator.samples} images belonging to {generator.num_classes} classes.")
    return generator

# Load or train model
def get_model(hash_size, hash_centers, dataset_path=None):
    model_path = f'resnet50_hash_model_{hash_size}.keras'
    if os.path.exists(model_path):
        # Load model without specifying custom loss
        model = tf.keras.models.load_model(model_path, compile=False)
        # After loading, recompile the model with the custom loss
        model.compile(optimizer='adam', loss=custom_loss(hash_centers))
    else:
        model = create_model(hash_size)
        model.compile(optimizer='adam', loss=custom_loss(hash_centers))
        if dataset_path:
            train_generator = preprocess_images(dataset_path)
            # print(train_generator)
            model.fit(train_generator, epochs=2)
            model.save(model_path, save_format='tf')
    return model

# Function to generate hash for a single image
def generate_hash(model, preprocessed_img):
    predictions = model.predict(preprocessed_img)
    binary_hash = np.where(predictions > 0.5, 1, 0)
    return binary_hash

# Function to convert binary hash to hexadecimal
def binary_to_hex(binary_hash):
    return ''.join(format(x, '02x') for x in np.packbits(binary_hash[0]))

# Main execution setup
if __name__ == "__main__":
    hash_size = 32  # Using a 32-bit hash
    hash_centers = generate_hash_centers(hash_size)
    print("Type of hash_centers:", type(hash_centers))
    print("Shape of hash_centers:", hash_centers.shape)

    # Assuming an image path for demonstration; replace with your actual image path
    img_path = 'image.jpg'
    img = load_img(img_path, target_size=(224, 224))

    # Preprocess the image for the model
    preprocessed_img = preprocess_input(img_to_array(img))  # Shape (224, 224, 3), no batch dimension yet
    print("Type of preprocessed_img:", type(preprocessed_img))
    print("Shape of preprocessed_img:", preprocessed_img.shape)
    
    # Add batch dimension
    preprocessed_img = np.expand_dims(preprocessed_img, axis=0)

    # Get or train the model
    model = get_model(hash_size, hash_centers, dataset_path='./test/')

    # predict
    predictions = model.predict(preprocessed_img)
    print("Raw predictions:", predictions)

    # Generate hash for the provided image
    hash_code = generate_hash(model, preprocessed_img)
    hex_hash = binary_to_hex(hash_code)
    print("Generated Hash for the Image (Hex):", hex_hash)
    

Pretrained models

Could you release pretrained models corresponding to your result table?

About the bit_list in CSQ.py

Lines 163-165:
for bit in config["bit_list"]:
    train_val(config, bit)

Error:
for obj in iterable:
TypeError: 'int' object is not iterable
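
This error is what Python raises when the loop receives a bare integer, which would happen if config["bit_list"] were set to something like 64 and not [64]. That reading of the report is an assumption, since the config itself is not shown. A minimal sketch of a defensive wrapper (the name `as_bit_list` is hypothetical):

```python
def as_bit_list(value):
    """Accept either a single int or an iterable of ints and return a list."""
    if isinstance(value, int):
        return [value]
    return list(value)


config = {"bit_list": 64}  # hypothetical config with a bare int
visited = []
for bit in as_bit_list(config["bit_list"]):
    visited.append(bit)  # train_val(config, bit) would be called here
print(visited)
```

The simpler fix is to write the value as a list in the config, for example "bit_list": [64].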

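Question about the cifar10 dataset

Hello, I loaded the cifar10 dataset but cannot find the train, test, and database .txt files.
I would also like to ask whether I can load cifar10 myself and generate the train, test, and database .txt files for it in order to run the code.
One more question: why is the cifar10 dataset handled separately here? Thank you!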

Changing the Net to Resnet50

Thanks for sharing the code, beautiful work!

I simply changed the net to PyTorch's pretrained ResNet50 and ran DPSH and DHN on CUB-200-2011, but the mAP is poor. I tried different learning rates, without much change.

Any idea what the problem is? I modified the fc layer of ResNet50 to output the hash code length and use that output to compute the loss. Are there any other specific operations that should be applied?

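Questions about the ImageNet dataset

Hello, thank you very much for your contribution. What is the relationship between train.txt, test.txt, and database.txt for the ImageNet dataset, and what are their size ratios? I want to build my own dataset in the ImageNet format, but the program reports a dimension error; it only runs without error when the training, test, and database sets all use the same data. I look forward to your reply. Thank you!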

Problem with DSDH

I personally think there is a problem with the DSDH model: all of B should be updated once per epoch, rather than once per batch.
