kaggle_humpback_whale_identification's Introduction

Some of the best solutions

rank  solution             github       author        keyword
1st   1st Place Solution   Github code  earhian       classification
3rd   3rd Place Solution   Github       pudae         ArcFace
4th   4th Place Solution   Github code  David         SIFT+Siamese
7th   7th Place Solution   Github code  old-ufo       classification
9th   9th Place Solution   Github code  Ivan Sosin    GapNet
25th  25th Place Solution  Github code  Bartek        CosFace+ProtoNets
31st  31st Place Solution  Github code  Khoi Nguyen   RGB
57th  57th Place Solution  Github code  Miguel Pinto  SoftTripletLoss

My solution

Heavily based on Whale Recognition Model with score 0.78563

Training

  • Framework: Keras (backend: TensorFlow)
  • Model: Siamese (CNN + metric learning)
  • Augmentation: slight (rotation, shear, height_zoom, width_zoom, height_shift, width_shift)
  • Preprocessing: rotate some special images, convert to grayscale, get bounding boxes, apply an affine transformation
  • Optimizer: Adam
  • Learning rate: starts at 64e-5 and is reduced by a factor of 4 for each subsequent group of epochs
  • Image size: 512*512
  • Epochs: 400 or more
  • Batch size: 32
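
The learning-rate rule in the list above can be sketched as a simple step schedule. This is a minimal sketch, assuming the rate is divided by 4 from one epoch group to the next; the helper name `lr_for_group` is hypothetical and does not come from the repository, and the number of epochs per group is not specified here.

```python
BASE_LR = 64e-5  # starting learning rate from the training settings


def lr_for_group(group_index, base_lr=BASE_LR, factor=4.0):
    """Return the learning rate for a 0-based epoch group.

    Assumption: every successive group of epochs uses a rate that is
    `factor` times smaller than the previous group.
    """
    if group_index < 0:
        raise ValueError("group_index must be non-negative")
    return base_lr / (factor ** group_index)


# Print the rate used in the first four groups.
for g in range(4):
    print("group {}: lr = {:.2e}".format(g, lr_for_group(g)))
```

With a Keras model, a value like this would typically be written to the optimizer before training each group of epochs.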

Prediction

  • Threshold: 0.99 and 0.94 with bootstrapping
  • TTA number: 4
  • TTA augmentation: random slight (rotation, shear, height_zoom, width_zoom, height_shift, width_shift)
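
The role of the threshold can be illustrated with a small sketch. It assumes the threshold is used as in the referenced kernel: known whales are ranked by similarity score, and the label `new_whale` is inserted at the first position where the score drops below the threshold. The function name `top5_with_threshold`, the whale ids, and the scores are hypothetical, made up for illustration only.

```python
NEW_WHALE = "new_whale"  # competition label for an unseen whale (assumed usage)


def top5_with_threshold(scores, threshold, k=5):
    """Rank whale ids by score, inserting NEW_WHALE once scores fall below threshold.

    scores: dict mapping whale id -> similarity score in [0, 1].
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    result = []
    for whale_id, score in ranked:
        if score < threshold and NEW_WHALE not in result:
            result.append(NEW_WHALE)
            if len(result) == k:
                break
        result.append(whale_id)
        if len(result) == k:
            break
    # If every score cleared the threshold and there is room left, add NEW_WHALE last.
    if NEW_WHALE not in result and len(result) < k:
        result.append(NEW_WHALE)
    return result


example = {"w_a": 0.995, "w_b": 0.97, "w_c": 0.50}  # made-up scores
print(top5_with_threshold(example, threshold=0.99))
print(top5_with_threshold(example, threshold=0.94))
```

A lower threshold pushes `new_whale` further down the list, so fewer images are predicted as unseen whales first.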

Result

  • Training takes more than 80 hours on a GTX 1080 Ti, without using a pretrained state-of-the-art model
  • Public LB: 0.92248
  • Private LB: 0.92761

Model result ensemble:

  • An ensemble of ensembles is not feasible, but ensembling itself is very effective
  • Fusing single-model results wherever possible works better, and the more the models differ, the better the fusion. Fusing models trained for a similar number of epochs is less effective than fusing models whose epoch counts differ widely
  • The ensemble of tta*4 + the original result is effective

Ensemble code

# coding:utf-8
# filename:ensemble.py
# function: fusion script for model prediction results; fuses the best submissions listed below

import csv
sub_files = [
            './submissions/submission_Simaese_Epochs220_multithreads_lapjv_512size_0.883.csv',
            './submissions/submission_Simaese_Epochs210_multithreads_lapjv_384size_0.884.csv',
            './submissions/submission_ensemble_(Epoch250_tta*4+original)_0.901.csv',
            './submissions/submission_Simaese_Epochs390_multithreads_lapjv_512size_0.905.csv',
            './submissions/submission_ensemble_(Boot_Epoch350_tta*4+original)_0.908.csv',
            './submissions/submission_ensemble_(Epoch400_tta*4+original)_0.912.csv']

print(len(sub_files))

# Weights of the individual subs
sub_weight = [
            0.883 ** 2,
            0.884 ** 2,
            0.901 ** 2,
            0.905 ** 2,
            0.908 ** 2,
            0.912 ** 2]
Hlabel = 'Image'
Htarget = 'Id'
npt = 5 # number of places in target
place_weights = {}
for i in range(npt):
    place_weights[i] = (1 / (i + 1))
print(place_weights)
lg = len(sub_files)
sub = [None] * lg
for i, file in enumerate(sub_files):
    ## input files ##
    print("Reading {}: w={} - {}".format(i, sub_weight[i], file))
    with open(file, "r", newline='') as f:
        reader = csv.DictReader(f)  # read each CSV row into a dictionary
        # Sort by image name so that row p refers to the same image in every submission
        sub[i] = sorted(reader, key=lambda d: str(d[Hlabel]))
## output file ##
with open("./submissions/submission_ensemble_zh.csv", "w", newline='') as out:
    writer = csv.writer(out)
    writer.writerow([Hlabel, Htarget])
    for p, row in enumerate(sub[0]):
        target_weight = {}
        for s in range(lg):
            row1 = sub[s][p]
            # Each predicted id gets (rank weight * submission weight), summed over all submissions
            for ind, trgt in enumerate(row1[Htarget].split(' ')):
                target_weight[trgt] = target_weight.get(trgt, 0) + (place_weights[ind] * sub_weight[s])
        # Keep the npt ids with the highest accumulated weight
        tops_trgt = sorted(target_weight, key=target_weight.get, reverse=True)[:npt]
        writer.writerow([row[Hlabel], " ".join(tops_trgt)])
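
To see how the weighting plays out, here is a small self-contained sketch of the same voting rule applied to in-memory predictions instead of CSV files. The function name `fuse_predictions` and the whale ids are illustrative assumptions; the scores reuse three of the leaderboard values from the file names above.

```python
def fuse_predictions(predictions, scores, npt=5):
    """Fuse ranked predictions for a single image.

    predictions: list of space-separated id strings, one per submission.
    scores: leaderboard score of each submission (squared to get its weight).
    """
    target_weight = {}
    for pred, score in zip(predictions, scores):
        sub_weight = score ** 2
        for place, target in enumerate(pred.split(" ")[:npt]):
            place_weight = 1.0 / (place + 1)  # 1, 1/2, 1/3, ...
            target_weight[target] = target_weight.get(target, 0.0) + place_weight * sub_weight
    return sorted(target_weight, key=target_weight.get, reverse=True)[:npt]


preds = [
    "w_1 w_2 w_3",  # made-up prediction from a 0.883 submission
    "w_2 w_1 w_4",  # made-up prediction from a 0.905 submission
    "w_2 w_3 w_1",  # made-up prediction from a 0.912 submission
]
fused = fuse_predictions(preds, [0.883, 0.905, 0.912])
print(fused)
```

Here `w_2` ends up first because two of the three submissions rank it at the top, even though the first submission prefers `w_1`.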

My conclusion

Work

  • A large image size helps a lot
  • Ensembling is useful, but choosing the right ensemble strategy matters more
  • TTA alone may help, but an ensemble of TTA results definitely helps
  • Storing all images on an SSD makes training faster than reading them from an HDD
  • Training for more epochs helps a lot
  • Bootstrapping helps, but it needs more training time

Doesn't work

  • Pure classification doesn't work, but with some extra work classification can be very useful, as in the 1st place solution
  • n-fold CV: my teammates tried 5-fold CV, but it didn't work. Our approach may have been flawed, but I also did not see n-fold CV used as a solution in the Kaggle Discussion forum

Uncertain

  • Grayscale images are not necessarily more effective than RGB

Usage

Environments

Hardware requirements
  • GPU: GTX 1060 or better (a GTX 1080 Ti is preferable)
  • 32GB Memory
  • SSD

Software requirements

  • Ubuntu 18.04
  • Anaconda3/Python3
  • Keras (backend: TensorFlow)

Steps for usage

  • 1. Clone the repository
git clone https://github.com/HarleysZhang/kaggle_humpback_whale_identification.git
cd kaggle_humpback_whale_identification
  • 2. Install the requirements
pip3 install -r requirements.txt
  • 3. Download the data and copy it to the data folder
kaggle competitions download -c humpback-whale-identification
cp -r train ./data/
cp -r test ./data/
cp train.csv ./data/
cp sample_submission.csv ./data/
  • 4. Train your model
Without bootstrapping:
python3 main_all.py
With bootstrapping:
python3 main_with_bootstrapping.py
  • 5. Generate the submission file
python3 test.py
# python3 test_tta.py    # with TTA

Some Code Interpretation

Build a transformation matrix with the specified characteristics.

import numpy as np

def build_transform(rotation, shear, height_zoom, width_zoom, height_shift, width_shift):
    """
    Build a transformation matrix with the specified characteristics.
    Angles are given in degrees; the result is a 3x3 homogeneous matrix.
    """
    rotation = np.deg2rad(rotation)
    shear = np.deg2rad(shear)
    rotation_matrix = np.array(
        [[np.cos(rotation), np.sin(rotation), 0], [-np.sin(rotation), np.cos(rotation), 0], [0, 0, 1]])
    shear_matrix = np.array([[1, np.sin(shear), 0], [0, np.cos(shear), 0], [0, 0, 1]])
    zoom_matrix = np.array([[1.0 / height_zoom, 0, 0], [0, 1.0 / width_zoom, 0], [0, 0, 1]])
    # The original code first built a shift matrix with positive offsets and then
    # overwrote it with this one, so only the negated offsets were ever used.
    shift_matrix = np.array([[1, 0, -height_shift], [0, 1, -width_shift], [0, 0, 1]])
    return np.dot(np.dot(rotation_matrix, shear_matrix), np.dot(zoom_matrix, shift_matrix))
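
A quick way to check what the matrix does is to feed it neutral parameters and a pure shift. The sketch below repeats a compact copy of `build_transform` so it runs on its own; the sample point and the offsets are arbitrary values chosen for illustration.

```python
import numpy as np


def build_transform(rotation, shear, height_zoom, width_zoom, height_shift, width_shift):
    """Compact copy of the function above so that this example is self-contained."""
    rotation = np.deg2rad(rotation)
    shear = np.deg2rad(shear)
    rotation_matrix = np.array([[np.cos(rotation), np.sin(rotation), 0],
                                [-np.sin(rotation), np.cos(rotation), 0],
                                [0, 0, 1]])
    shear_matrix = np.array([[1, np.sin(shear), 0], [0, np.cos(shear), 0], [0, 0, 1]])
    zoom_matrix = np.array([[1.0 / height_zoom, 0, 0], [0, 1.0 / width_zoom, 0], [0, 0, 1]])
    shift_matrix = np.array([[1, 0, -height_shift], [0, 1, -width_shift], [0, 0, 1]])
    return np.dot(np.dot(rotation_matrix, shear_matrix), np.dot(zoom_matrix, shift_matrix))


# Neutral parameters (no rotation, shear or shift, zoom of 1) give the identity matrix.
identity = build_transform(0, 0, 1, 1, 0, 0)
print(np.allclose(identity, np.eye(3)))

# A pure shift moves a homogeneous point (row, column, 1) by the negated offsets.
shifted = build_transform(0, 0, 1, 1, 10, -5)
point = np.array([100.0, 200.0, 1.0])
moved = shifted.dot(point)
print(moved)
```

Because the matrix maps output coordinates back to input coordinates, the zoom entries are reciprocals and the shift offsets are negated.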

Compute the score matrix by scoring every picture from the training set against every other picture, O(n^2), processed in blocks with multiple workers.

def compute_score(verbose=1):
    """
    Compute the score matrix by scoring every picture from the training set against every other picture, O(n^2).
    """
    features = branch_model.predict_generator(FeatureGen(train, batch_size=64, verbose=verbose), max_queue_size=12, workers=6, verbose=0)
    # Split the pictures into 5 or 6 consecutive blocks of `batch` rows.
    num_threads = 6
    batch = features.shape[0] // (num_threads - 1)
    if features.shape[0] % batch <= 3:
        # Small remainder: use 5 blocks, enlarged by one row if needed to absorb it.
        num_threads = 5
        if features.shape[0] % batch != 0:
            batch += 1
    all_score = []
    for start in range(0, features.shape[0], batch):
        end = min(features.shape[0], start + batch)
        temp_features = features[start:end, :]
        temp_score = head_model.predict_generator(ScoreGen(temp_features, batch_size=4096, verbose=verbose), max_queue_size=12, workers=6, verbose=0)
        temp_score = score_reshape(temp_score, temp_features)
        all_score.append(temp_score)
    # Only the diagonal blocks are filled; scores between pictures in different blocks stay 0.
    score = np.zeros((features.shape[0], features.shape[0]), dtype=K.floatx())
    for i, start in enumerate(range(0, features.shape[0], batch)):
        end = min(features.shape[0], start + batch)
        score[start:end, start:end] = all_score[i]
    return features, score
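
The block-splitting rule used above can be isolated and tried on a few sizes. This is a minimal sketch: `block_ranges` is a hypothetical helper, not a function from the repository, and it reproduces only the index arithmetic, not the model calls.

```python
def block_ranges(n, num_threads=6):
    """Return the (start, end) row ranges produced by the splitting rule.

    n is the number of pictures; it should be at least num_threads - 1,
    otherwise the block size would be zero.
    """
    batch = n // (num_threads - 1)
    if n % batch <= 3 and n % batch != 0:
        # A small remainder is absorbed by making every block one row larger.
        batch += 1
    return [(start, min(n, start + batch)) for start in range(0, n, batch)]


for n in (100, 103, 109):
    ranges = block_ranges(n)
    print(n, len(ranges), ranges[-1])
```

A remainder of at most 3 rows is folded into 5 blocks, while a larger remainder becomes a sixth, shorter block.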

Solve the linear assignment problem with multiple threads.

import threading

import numpy as np

def my_lapjv(score):
    # Split the score matrix into 5 or 6 diagonal blocks, one per thread.
    num_threads = 6
    batch = score.shape[0] // (num_threads - 1)
    if score.shape[0] % batch <= 3:
        num_threads = 5
        if score.shape[0] % batch != 0:
            batch += 1
    tmp = num_threads * [None]
    threads = []
    thread_input = num_threads * [None]
    for thread_idx, start in enumerate(range(0, score.shape[0], batch)):
        end = min(score.shape[0], start + batch)
        thread_input[thread_idx] = score[start:end, start:end]

    def worker(data_idx):
        # lapjv is not defined in this snippet and must be imported from the LAP solver used by the project.
        x, _, _ = lapjv(thread_input[data_idx])
        # Shift block-local indices back to positions in the full matrix.
        tmp[data_idx] = x + data_idx * batch

    for i in range(num_threads):
        t = threading.Thread(target=worker, args=(i,), daemon=True)
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    x = np.concatenate(tmp)
    return x
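
For readers unfamiliar with the problem that `lapjv` solves: given a square cost matrix, choose one column per row, using each column exactly once, so that the total cost is minimal. The sketch below illustrates this with a brute-force search over all permutations, which is not the Jonker-Volgenant algorithm and is only practical for tiny matrices. The function name `brute_force_assignment` and the cost values are made up for illustration.

```python
from itertools import permutations


def brute_force_assignment(cost):
    """Return the column chosen for each row so that the total cost is minimal.

    Tries every permutation of columns, so the run time grows factorially.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda cols: sum(cost[r][c] for r, c in enumerate(cols)),
    )
    return list(best)


cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment = brute_force_assignment(cost)
total = sum(cost[r][c] for r, c in enumerate(assignment))
print(assignment, total)
```

A dedicated solver reaches the same kind of answer in polynomial time, which is why splitting the matrix into smaller blocks, as `my_lapjv` does, reduces the work further at the price of ignoring matches across blocks.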

Reference

Whale Recognition Model with score 0.78563
