
Hi there 👋

I am a PhD student in Health Data Science at Oxford, supervised by Professor Jens Rittscher and funded by Professor Fergus Gleeson. I focus on applications of Computer Vision 👀💻 to improving diagnostics and treatment of patients with lung cancer as part of the DART lung health project (see my role in the project).

February 2024: my first main conference paper (pre-print 📝, code 💻) was accepted to the ISBI 2024 conference! 🚀 In our work "Accurate Subtyping of Lung Cancers by Modelling Class Dependencies", we (1) construct a weakly-supervised multi-label lung cancer histology dataset from three public datasets (TCGA, TCIA-CPTAC, DHMC) and one in-house dataset (DART), and (2) propose a class-dependency injection method that allows learning robust bag representations suitable for multi-label problems under weakly-supervised settings. The dataset creation, model building, and training code is available in the dependency-mil repository.

September 2022: my first workshop paper 📝 (pre-print 📝, code 💻) was published at the MICCAI 2022 CaPTion workshop! 🚀 In our work "Active Data Enrichment by Learning What to Annotate in Digital Pathology", we (1) proposed a new comprehensive annotation protocol for lung cancer pathology, (2) proposed a new metric for comparing how well retrieval methods can prioritize examples from underrepresented classes, and (3) demonstrated that annotating the top-ranked examples and adding them to the training set improves algorithm performance more than annotating and adding random examples. Links: published paper, open-access paper, code.

December 2020: my first mini-conference working notes paper 📝 (code 💻) was published at the MediaEval 2020 Multimedia Benchmark workshop 🚀. In our work "Real-Time Polyp Segmentation Using U-Net with IoU Loss", we explored how using a combination of differentiable IoU and BCE losses affects the segmentation performance, measured by meanIoU and DiceScore, when training a simple U-Net. Links: published open-access paper, code.
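
To illustrate the idea of combining a differentiable IoU loss with BCE, here is a minimal pure-Python sketch. It is not the code from the paper: the function names, the `iou_weight` parameter, and the equal default weighting are assumptions made for illustration, and a real training setup would use tensor operations in a deep learning framework instead of Python lists.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(probs)


def soft_iou_loss(probs, targets, eps=1e-7):
    """1 - soft IoU: products and sums stand in for set intersection and union."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    union = sum(p + t - p * t for p, t in zip(probs, targets))
    return 1.0 - (intersection + eps) / (union + eps)


def combined_loss(probs, targets, iou_weight=1.0):
    """BCE plus weighted soft IoU loss (the weighting scheme is an assumption)."""
    return bce_loss(probs, targets) + iou_weight * soft_iou_loss(probs, targets)


# Toy example: two pixels, the first is foreground and the second is background.
confident = combined_loss([0.9, 0.1], [1, 0])
uncertain = combined_loss([0.5, 0.5], [1, 0])
print(confident < uncertain)
```

Because the soft IoU term is built from sums and products of probabilities rather than thresholded masks, it stays differentiable with respect to the predictions, which is what makes it usable as a training loss alongside BCE.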


Public histology data sources. If you also want to start working with histopathology images, but do not have your own data or are still waiting for it, consider starting with the "Dartmouth Lung Cancer Histology Dataset" (DHMC), "The Cancer Genome Atlas" (TCGA), and "The Cancer Imaging Archive" (TCIA-CPTAC). Downloading large volumes of data is not a trivial task, so I documented my process in TCGA-lung-histology-download and TCIA-CPTAC-lung-histology-download.

Public natural image sources. Another thing you can do if you lack medical data is to simulate parts of your future workflow on natural images. For example, classifying medical images for the presence or absence of particular patterns can be similar to classifying natural images for the presence or absence of particular objects. I used images from the COCO dataset. You can see my work here: GeorgeBatch/cocoapi.
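
As a rough illustration of the presence/absence framing, the sketch below turns COCO-style instance annotations into one multi-hot label vector per image. It is not taken from GeorgeBatch/cocoapi: the `multi_hot_labels` helper and the toy data are hypothetical, and only the `images`, `annotations`, `image_id`, and `category_id` fields of the COCO annotation format are assumed.

```python
def multi_hot_labels(coco, category_ids):
    """Map each image id to a 0/1 vector marking which categories appear in it."""
    index = {cat_id: i for i, cat_id in enumerate(category_ids)}
    labels = {img["id"]: [0] * len(category_ids) for img in coco["images"]}
    for ann in coco["annotations"]:
        if ann["category_id"] in index:
            labels[ann["image_id"]][index[ann["category_id"]]] = 1
    return labels


# Hypothetical toy annotations in the COCO layout (not real COCO data).
toy_coco = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"image_id": 1, "category_id": 17},
        {"image_id": 1, "category_id": 18},
        {"image_id": 1, "category_id": 17},  # a second instance of the same class
        {"image_id": 2, "category_id": 18},
    ],
}

labels = multi_hot_labels(toy_coco, category_ids=[17, 18])
print(labels)
```

Each position in a vector is then an independent binary target, so an image can be positive for several classes at once, or for none, as image 3 is here.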


Education


Here are some of the best free online resources to boost your ML/DL knowledge 🚀 I am currently working through them, skipping the repetitive parts ⏰


George Batchkala's Projects

ab-testing

A/B Testing — A complete guide to statistical testing

active-data-enrichment

[MICCAI'2022] CaPTion workshop: Active Data Enrichment by Learning What to Annotate in Digital Pathology

arch-pre-training

ARCH pre-training: reproducing results from Multiple Instance Captioning: Learning Representations From Histopathology Textbooks and Articles

attentiondeepmil

Implementation of Attention-based Deep Multiple Instance Learning in PyTorch

cocoapi

Multi-label Binary Classification on COCO Dataset @ http://cocodataset.org/

dependency-mil

[ISBI 2024] Accurate Subtyping of Lung Cancers by Modelling Class Dependencies

dsmil-wsi-public-fork

DSMIL: Dual-stream multiple instance learning networks for tumor detection in Whole Slide Image

gpytorch

A highly efficient and modular implementation of Gaussian Processes in PyTorch

introduction_to_ml_with_python

Notebooks and code for the book "Introduction to Machine Learning with Python". GeorgeBatch: scripts used while going through the book during summer 2018.

kvasir-seg

[MediaEval Medico Challenge'2020]: Polyp Segmentation

moleculenet

MSc Dissertation: Estimating Uncertainty in Machine Learning Models for Drug Discovery

pythonintro2020

This is an introductory Python course I am running for the 2019 Health and Data Science CDT at Oxford University.

pytorch-studiogan

StudioGAN is a Pytorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.

radpathfusionlung

Repository contains code that allows the registration of histology slices and CT in the context of lung cancer.

resnet

Clean, scalable and easy to use ResNet implementation in Pytorch

tiatoolbox

Computational Pathology Toolbox developed by TIA Centre, University of Warwick.
