| Debiasing concepts | Debiasing Concept Bottleneck Models with Instrumental Variables | ICLR 2021 submissions page - Accepted as Poster | | causality |
| Prototype Trajectory | Interpretable Sequence Classification Via Prototype Trajectory | ICLR 2021 submissions page | | this looks like that styled RNN |
| Shapley dependence assumption | Shapley explainability on the data manifold | ICLR 2021 submissions page | | |
| High dimension Shapley | Human-interpretable model explainability on high-dimensional data | ICLR 2021 submissions page | | |
| L2X like paper | A Learning Theoretic Perspective on Local Explainability | ICLR 2021 submissions page | | |
| Evaluation | Evaluation of Similarity-based Explanations | ICLR 2021 submissions page | | like the Adebayo paper, for "this looks like that" styled methods |
| Model correction | Defuse: Debugging Classifiers Through Distilling Unrestricted Adversarial Examples | ICLR 2021 submissions page | | |
| Subspace explanation | Constraint-Driven Explanations of Black-Box ML Models | ICLR 2021 submissions page | | to see how close to MUSE by Hima Lakkaraju 2019 |
| Catastrophic forgetting | Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting | ICLR 2021 submissions page | Code available in their Supplementary zip file | |
| Non trivial counterfactual explanations | Beyond Trivial Counterfactual Generations with Diverse Valuable Explanations | ICLR 2021 submissions page | | |
| Explainable by Design | Interpretability Through Invertibility: A Deep Convolutional Network With Ideal Counterfactuals And Isosurfaces | ICLR 2021 submissions page | | |
| Gradient attribution | Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability | ICLR 2021 submissions page | | looks like an extension of the Sixt et al. paper |
| Mask based Explainable by Design | Investigating and Simplifying Masking-based Saliency Methods for Model Interpretability | ICLR 2021 submissions page | | |
| NBDT - Explainable by Design | NBDT: Neural-Backed Decision Trees | ICLR 2021 submissions page | | |
| Variational Saliency Maps | Variational saliency maps for explaining model's behavior | ICLR 2021 submissions page | | |
| Network dissection with coherency or stability metric | Importance and Coherence: Methods for Evaluating Modularity in Neural Networks | ICLR 2021 submissions page | | |
| Modularity | Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks | ICLR 2021 submissions page | Code made anonymous for review, link given in paper | |
| Explainable by design | A self-explanatory method for the black box problem on discrimination part of CNN | ICLR 2021 submissions page | | seems like concepts of game theory are applied |
| Attention not Explanation | Why is Attention Not So Interpretable? | ICLR 2021 submissions page | | |
| Ablation Saliency | Ablation Path Saliency | ICLR 2021 submissions page | | |
| Explainable Outlier Detection | Explainable Deep One-Class Classification | ICLR 2021 submissions page | | |
| XAI without approximation | Explainable AI Without Interpretable Model | Arxiv | | |
| Learning theoretic Local Interpretability | A LEARNING THEORETIC PERSPECTIVE ON LOCAL EXPLAINABILITY | Arxiv | | |
| GANMEX | GANMEX: ONE-VS-ONE ATTRIBUTIONS USING GAN-BASED MODEL EXPLAINABILITY | Arxiv | | |
| Evaluating Local Explanations | Evaluating local explanation methods on ground truth | Artificial Intelligence Journal Elsevier | sklearn | |
| Structured Attention Graphs | Structured Attention Graphs for Understanding Deep Image Classifications | AAAI 2021 | PyTorch | see how close to MACE |
| Ground truth explanations | Data Representing Ground-Truth Explanations to Evaluate XAI Methods | AAAI 2021 | sklearn | trained models available in their github repository |
| AGF | Visualization of Supervised and Self-Supervised Neural Networks via Attribution Guided Factorization | AAAI 2021 | PyTorch | |
| RSP | Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing Comparative Gradients and Hostile Activations | AAAI 2021 | | |
| HyDRA | HYDRA: Hypergradient Data Relevance Analysis for Interpreting Deep Neural Networks | AAAI 2021 | PyTorch | |
| SWAG | SWAG: Superpixels Weighted by Average Gradients for Explanations of CNNs | WACV 2021 | | |
| FastIF | FASTIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging | Arxiv | PyTorch | |
| EVET | EVET: Enhancing Visual Explanations of Deep Neural Networks Using Image Transformations | WACV 2021 | | |
| Local Attribution Baselines | On Baselines for Local Feature Attributions | AAAI 2021 | PyTorch | |
| Differentiated Explanations | Differentiated Explanation of Deep Neural Networks with Skewed Distributions | IEEE - TPAMI journal | PyTorch | |
| Human game based survey | Explainable AI and Adoption of Algorithmic Advisors: an Experimental Study | Arxiv | | |
| Explainable by design | Learning Semantically Meaningful Features for Interpretable Classifications | Arxiv | | |
| Expred | Explain and Predict, and then Predict again | ACM WSDM 2021 | PyTorch | |
| Progressive Interpretation | An Information-theoretic Progressive Framework for Interpretation | Arxiv | PyTorch | |
| UCAM | Uncertainty Class Activation Map (U-CAM) using Gradient Certainty method | IEEE - TIP | Project Page | PyTorch |
| progressive GAN explainability - smiling dataset - ICLR 2020 group | Explaining the Black-box Smoothly - A Counterfactual Approach | Arxiv | | |
| Head pasted in another image - experimented | WHAT DO DEEP NETS LEARN? CLASS-WISE PATTERNS REVEALED IN THE INPUT SPACE | Arxiv | | |
| Model correction | ExplOrs: Explanation Oracles and the architecture of explainability | Paper | | |
| Explanations - Knowledge Representation | A Basic Framework for Explanations in Argumentation | IEEE | | |
| Eigen CAM | Eigen-CAM: Visual Explanations for Deep Convolutional Neural Networks | Springer | | |
| Evaluation of Posthoc | How can I choose an explainer? An Application-grounded Evaluation of Post-hoc Explanations | ACM | | |
| GLocalX | GLocalX - From Local to Global Explanations of Black Box AI Models | Arxiv | | |
| Consistent Interpretations | Explainable Models with Consistent Interpretations | AAAI 2021 | | |
| SIDU | Introducing and assessing the explainable AI (XAI) method: SIDU | Arxiv | | |
| cites This looks like that | Explaining black-box classifiers using post-hoc explanations-by-example: The effect of explanations and error-rates in XAI user studies | AIJ | | |
| i-Algebra | i-Algebra: Towards Interactive Interpretability of Deep Neural Networks | AAAI 2021 | | |
| Shape texture bias | SHAPE OR TEXTURE: UNDERSTANDING DISCRIMINATIVE FEATURES IN CNNS | ICLR 2021 | | |
| Class agnostic features | THE MIND’S EYE: VISUALIZING CLASS-AGNOSTIC FEATURES OF CNNS | Arxiv | | |
| IBEX | A Multi-layered Approach for Tailored Black-box Explanations | Paper | Code | |
| Relevant explanations | Learning Relevant Explanations | Paper | | |
| Guided Zoom | Guided Zoom: Zooming into Network Evidence to Refine Fine-grained Model Decisions | IEEE | | |
| XAI survey | A Survey on Understanding, Visualizations, and Explanation of Deep Neural Networks | Arxiv | | |
| Pattern theory | Convolutional Neural Network Interpretability with General Pattern Theory | Arxiv | PyTorch | |
| Gaussian Process based explanations | Bandits for Learning to Explain from Explanations | AAAI 2021 | sklearn | |
| LIFT CAM | LIFT-CAM: Towards Better Explanations for Class Activation Mapping | Arxiv | | |
| ObAIEx | Right for the Right Reasons: Making Image Classification Intuitively Explainable | Paper | tensorflow | |
| VAE based explainer | Combining an Autoencoder and a Variational Autoencoder for Explaining the Machine Learning Model Predictions | IEEE | | |
| Segmentation based explanation | Deep Co-Attention Network for Multi-View Subspace Learning | Arxiv | PyTorch | |
| Integrated CAM | INTEGRATED GRAD-CAM: SENSITIVITY-AWARE VISUAL EXPLANATION OF DEEP CONVOLUTIONAL NETWORKS VIA INTEGRATED GRADIENT-BASED SCORING | ICASSP 2021 | PyTorch | |
| Human study | VitrAI - Applying Explainable AI in the Real World | Arxiv | | |
| Attribution Mask | Attribution Mask: Filtering Out Irrelevant Features By Recursively Focusing Attention on Inputs of DNNs | Arxiv | PyTorch | |
| LIME faithfulness | What does LIME really see in images? | Arxiv | Tensorflow 1.x | |
| Assess model reliability | Intuitively Assessing ML Model Reliability through Example-Based Explanations and Editing Model Inputs | Arxiv | | |
| Perturbation + Gradient unification | Towards the Unification and Robustness of Perturbation and Gradient Based Explanations | Arxiv | | Hima Lakkaraju |
| Gradients faithful? | Do Input Gradients Highlight Discriminative Features? | Arxiv | PyTorch | |
| Untrustworthy predictions | Identifying Untrustworthy Predictions in Neural Networks by Geometric Gradient Analysis | Arxiv | | |
| Explaining misclassification | Explaining Inaccurate Predictions of Models through k-Nearest Neighbors | Paper | | cites Oscar Li AAAI 2018 prototypes paper |
| Explanations inside predictions | Have We Learned to Explain?: How Interpretability Methods Can Learn to Encode Predictions in their Interpretations | AISTATS 2021 | | |
| Layerwise interpretation | LAYER-WISE INTERPRETATION OF DEEP NEURAL NETWORKS USING IDENTITY INITIALIZATION | Arxiv | | |
| Visualizing Rule Sets | Visualizing Rule Sets: Exploration and Validation of a Design Space | Arxiv | PyTorch | |
| Human experiments | Are Explanations Helpful? A Comparative Study of the Effects of Explanations in AI-Assisted Decision-Making | IUI 2021 | | |
| Attention fine-grained classification | Interpretable Attention Guided Network for Fine-grained Visual Classification | Arxiv | | |
| Concept construction | Explaining Classifiers by Constructing Familiar Concepts | Paper | PyTorch | |
| EbD | Human-Understandable Decision Making for Visual Recognition | Arxiv | | |
| Bridging XAI algorithm, Human needs | Towards Connecting Use Cases and Methods in Interpretable Machine Learning | Arxiv | | |
| Generative trustworthy classifiers | Generative Classifiers as a Basis for Trustworthy Image Classification | Paper | Github | |
| Counterfactual explanations | Generating Interpretable Counterfactual Explanations By Implicit Minimisation of Epistemic and Aleatoric Uncertainties | AISTATS 2021 | PyTorch | |
| Role categorization of CNN units | Quantitative Effectiveness Assessment and Role Categorization of Individual Units in Convolutional Neural Networks | ICML 2021 | | |
| Non-trivial counterfactual explanations | Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations | Arxiv | | |
| NP-ProtoPNet | These do not Look Like Those: An Interpretable Deep Learning Model for Image Recognition | IEEE | | |
| Correcting neural networks based on explanations | Refining Neural Networks with Compositional Explanations | Arxiv | Code link given in paper, but page not found | |
| Contrastive reasoning | Contrastive Reasoning in Neural Networks | Arxiv | | |
| Concept based | Intersection Regularization for Extracting Semantic Attributes | Arxiv | | |
| Boundary explanations | Boundary Attributions Provide Normal (Vector) Explanations | Arxiv | PyTorch | |
| Generative Counterfactuals | ECINN: Efficient Counterfactuals from Invertible Neural Networks | Arxiv | | |
| ICE | Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors | AAAI 2021 | | |
| Group CAM | Group-CAM: Group Score-Weighted Visual Explanations for Deep Convolutional Networks | Arxiv | PyTorch | |
| HMM interpretability | Towards interpretability of Mixtures of Hidden Markov Models | AAAI 2021 | sklearn | |
| Empirical Explainers | Efficient Explanations from Empirical Explainers | Arxiv | PyTorch | |
| FixNorm | FIXNORM: DISSECTING WEIGHT DECAY FOR TRAINING DEEP NEURAL NETWORKS | Arxiv | | |
| CoDA-Net | Convolutional Dynamic Alignment Networks for Interpretable Classifications | CVPR 2021 | Code link given in paper. Repository not yet created | |
| Like Dr. Chandru sir's (IITPKD) XAI work | Neural Response Interpretation through the Lens of Critical Pathways | Arxiv | PyTorch - Pathway Grad, PyTorch - ROAR | |
| InAugment | InAugment: Improving Classifiers via Internal Augmentation | Arxiv | Code yet to be updated | |
| Gradual Grad CAM | Enhancing Deep Neural Network Saliency Visualizations with Gradual Extrapolation | Arxiv | PyTorch | |
| A-FMI | A-FMI: LEARNING ATTRIBUTIONS FROM DEEP NETWORKS VIA FEATURE MAP IMPORTANCE | Arxiv | | |
| Trust - Regression | To Trust or Not to Trust a Regressor: Estimating and Explaining Trustworthiness of Regression Predictions | AAAI 2021 | sklearn | |
| Concept based explanations - study | IS DISENTANGLEMENT ALL YOU NEED? COMPARING CONCEPT-BASED & DISENTANGLEMENT APPROACHES | ICLR 2021 workshop | tensorflow 2.3 | |
| Faithful attribution | Mutual Information Preserving Back-propagation: Learn to Invert for Faithful Attribution | Arxiv | | |
| Counterfactual explanation | Counterfactual attribute-based visual explanations for classification | Springer | | |
| User based explanations | “That's (not) the output I expected!” On the role of end user expectations in creating explanations of AI systems | AIJ | | |
| Human understandable concept based explanations | Towards Human-Understandable Visual Explanations: Imperceptible High-frequency Cues Can Better Be Removed | Arxiv | | |
| Improved attribution | Improving Attribution Methods by Learning Submodular Functions | Arxiv | | |
| SHAP tractability | On the Complexity of SHAP-Score-Based Explanations: Tractability via Knowledge Compilation and Non-Approximability Results | Arxiv | | |
| SHAP explanation network | SHAPLEY EXPLANATION NETWORKS | ICLR 2021 | PyTorch | |
| Concept based dataset shift explanation | FAILING CONCEPTUALLY: CONCEPT-BASED EXPLANATIONS OF DATASET SHIFT | ICLR 2021 workshop | tensorflow 2 | |
| EbD | Towards Human-Understandable Visual Explanations: Imperceptible High-frequency Cues Can Better Be Removed | Arxiv | | |
| Evaluating CAM | Revisiting The Evaluation of Class Activation Mapping for Explainability: A Novel Metric and Experimental Analysis | Arxiv | | |
| EFC-CAM | Exclusive Feature Constrained Class Activation Mapping for Better Visual Explanation | IEEE | | |
| Causal Interpretation | Instance-wise Causal Feature Selection for Model Interpretation | Arxiv | PyTorch | |
| Fairness in Learning | Learning to Learn to be Right for the Right Reasons | Arxiv | | |
| Feature attribution correctness | Do Feature Attribution Methods Correctly Attribute Features? | Arxiv | Code not yet updated | |
| NICE | NICE: AN ALGORITHM FOR NEAREST INSTANCE COUNTERFACTUAL EXPLANATIONS | Arxiv | Own Python Package | |
| SCG | A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts | Arxiv | | |
| Visual Concepts | A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts | Arxiv | | |
| This looks like that - drawback | This Looks Like That... Does it? Shortcomings of Latent Space Prototype Interpretability in Deep Networks | Arxiv | PyTorch | |
| Exemplar based classification | Visualizing Association in Exemplar-Based Classification | ICASSP 2021 | | |
| Correcting classification | CORRECTING CLASSIFICATION: A BAYESIAN FRAMEWORK USING EXPLANATION FEEDBACK TO IMPROVE CLASSIFICATION ABILITIES | Arxiv | | |
| Concept Bottleneck Networks | DO CONCEPT BOTTLENECK MODELS LEARN AS INTENDED? | ICLR workshop 2021 | | |
| Sanity for saliency | Sanity Simulations for Saliency Methods | Arxiv | | |
| Concept based explanations | Cause and Effect: Concept-based Explanation of Neural Networks | Arxiv | | |
| CLIMEP | How to Explain Neural Networks: A perspective of data space division | Arxiv | | |
| Sufficient explanations | Probabilistic Sufficient Explanations | Arxiv | Empty Repository | |
| SHAP baseline | Learning Baseline Values for Shapley Values | Arxiv | | |
| Explainable by Design | EXoN: EXplainable encoder Network | Arxiv | tensorflow 2.4.0 | explainable VAE |
| Concept based explanations | Aligning Artificial Neural Networks and Ontologies towards Explainable AI | AAAI 2021 | | |
| XAI via Bayesian teaching | ABSTRACTION, VALIDATION, AND GENERALIZATION FOR EXPLAINABLE ARTIFICIAL INTELLIGENCE | Arxiv | | |
| Explanation blind spots | DO NOT EXPLAIN WITHOUT CONTEXT: ADDRESSING THE BLIND SPOT OF MODEL EXPLANATIONS | Arxiv | | |
| BLA | Bounded logit attention: Learning to explain image classifiers | Arxiv | tensorflow | L2X++ |
| Interpretability - mathematical model | The Definitions of Interpretability and Learning of Interpretable Models | Arxiv | | |
| Similar to our ICML workshop 2021 work | The effectiveness of feature attribution methods and its correlation with automatic evaluation scores | Arxiv | | |
| EDDA | EDDA: Explanation-driven Data Augmentation to Improve Model and Explanation Alignment | Arxiv | | |
| Relevant set explanations | Efficient Explanations With Relevant Sets | Arxiv | | |
| Model transfer | Making CNNs Interpretable by Building Dynamic Sequential Decision Forests with Top-down Hierarchy Learning | Arxiv | | |
| Model correction | Finding and Fixing Spurious Patterns with Explanations | Arxiv | | |
| Neuron graph communities | On the Evolution of Neuron Communities in a Deep Learning Architecture | Arxiv | | |
| Mid level features explanations | A general approach for Explanations in terms of Middle Level Features | Arxiv | | see how different from MUSE by Hima Lakkaraju group |
| Concept based knowledge distillation | Towards Black-Box Explainability with Gaussian Discriminant Knowledge Distillation | CVPR 2021 workshop | | compare and contrast with network dissection |
| CNN high frequency bias | Dissecting the High-Frequency Bias in Convolutional Neural Networks | CVPR 2021 workshop | Tensorflow | |
| Explainable by design | Entropy-based Logic Explanations of Neural Networks | Arxiv | PyTorch | concept based |
| CALM | Keep CALM and Improve Visual Feature Attribution | Arxiv | PyTorch | |
| Relevance CAM | Relevance-CAM: Your Model Already Knows Where to Look | CVPR 2021 | PyTorch | |
| S-LIME | S-LIME: Stabilized-LIME for Model Explanation | Arxiv | sklearn | |
| Local + Global | Best of both worlds: local and global explanations with human-understandable concepts | Arxiv | | Been Kim's group |
| Guided integrated gradients | Guided Integrated Gradients: an Adaptive Path Method for Removing Noise | CVPR 2021 | | |
| Concept based | Meaningfully Explaining a Model’s Mistakes | Arxiv | | |
| Explainable by design | It’s FLAN time! Summing feature-wise latent representations for interpretability | Arxiv | | |
| SimAM | SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks | ICML 2021 | PyTorch | |
| DANCE | DANCE: Enhancing saliency maps using decoys | ICML 2021 | Tensorflow 1.x | |
| EbD Concept formation | Explore Visual Concept Formation for Image Classification | ICML 2021 | PyTorch | |
| Explainable by design | Interpretable Compositional Convolutional Neural Networks | Arxiv | | |
| Attribution aggregation | Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation | AAAI 2021 - pdf | | |
| Perturbation based activation | A Novel Visual Interpretability for Deep Neural Networks by Optimizing Activation Maps with Perturbation | AAAI 2021 | | |
| Global explanations | Feature Synergy, Redundancy, and Independence in Global Model Explanations using SHAP Vector Decomposition | Arxiv | Github package | |
| L2E | Learning to Explain: Generating Stable Explanations Fast | ACL 2021 | PyTorch | NLE |
| Joint Shapley | Joint Shapley values: a measure of joint feature importance | Arxiv | | |
| Explainable by design | Align Yourself: Self-supervised Pre-training for Fine-grained Recognition via Saliency Alignment | Arxiv | | |
| Explainable by design | SONG: SELF-ORGANIZING NEURAL GRAPHS | Arxiv | | |
| Explainable by design | Designing Shapelets for Interpretable Data-Agnostic Classification | AIES 2021 | sklearn | Interpretable block of time series, extended to other data modalities like image, text, tabular |
| Global explanations + Model correction | Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability | Arxiv | PyTorch | |
| HIL - Model correction | Human-in-the-loop Extraction of Interpretable Concepts in Deep Learning Models | Arxiv | | |
| Activation based Cause Analysis | Activation-Based Cause Analysis Method for Neural Networks | IEEE Access 2021 | | |
| Local explanations | Leveraging Latent Features for Local Explanations | ACM SIGKDD 2021 | | Amit Dhurandhar group |
| Fairness | Adequate and fair explanations | Arxiv - Accepted in CD-MAKE 2021 | | |
| Global explanations | Finding Representative Interpretations on Convolutional Neural Networks | ICCV 2021 | | |
| Groupwise explanations | Learning Groupwise Explanations for Black-Box Models | IJCAI 2021 | PyTorch | |
| Mathematical | On Smoother Attributions using Neural Stochastic Differential Equations | IJCAI 2021 | | |
| AGI | Explaining Deep Neural Network Models with Adversarial Gradient Integration | IJCAI 2021 | PyTorch | |
| Accountable attribution | Longitudinal Distance: Towards Accountable Instance Attribution | Arxiv | Tensorflow Keras | |
| Global explanation | Understanding of Kernels in CNN Models by Suppressing Irrelevant Visual Features in Images | Arxiv | | |
| Concepts based - Explainable by design | Inducing Semantic Grouping of Latent Concepts for Explanations: An Ante-Hoc Approach | Arxiv | | IITH Vineeth sir group |
| Explainable by design | This looks more like that: Enhancing Self-Explaining Models by Prototypical Relevance Propagation | Arxiv | | |
| MIL | ProtoMIL: Multiple Instance Learning with Prototypical Parts for Fine-Grained Interpretability | Arxiv | | |
| Concept based explanations | Instance-wise or Class-wise? A Tale of Neighbor Shapley for Concept-based Explanation | Arxiv | | |
| Counterfactual explanation + Theory of Mind | CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models | Arxiv | | |
| Evaluation metric | Counterfactual Evaluation for Explainable AI | Arxiv | | |
| CIM - FSC | CIM: Class-Irrelevant Mapping for Few-Shot Classification | Arxiv | | |
| Causal Concepts | Unsupervised Causal Binary Concepts Discovery with VAE for Black-box Model Explanation | Arxiv | | |
| ECE | Ensemble of Counterfactual Explainers | Paper | Code - seems hybrid of tf and torch | |
| Structured Explanations | From Heatmaps to Structured Explanations of Image Classifiers | Arxiv | | |
| XAI metric | An Objective Metric for Explainable AI - How and Why to Estimate the Degree of Explainability | Arxiv | | |
| DisCERN | DisCERN: Discovering Counterfactual Explanations using Relevance Features from Neighbourhoods | Arxiv | | |
| PSEM | Towards Better Model Understanding with Path-Sufficient Explanations | Arxiv | | Amit Dhurandhar sir group |
| Evaluation traps | The Logic Traps in Evaluating Post-hoc Interpretations | Arxiv | | |
| Interactive explanations | Explainability Requires Interactivity | Arxiv | PyTorch | |
| CounterNet | CounterNet: End-to-End Training of Counterfactual Aware Predictions | Arxiv | PyTorch | |
| Evaluation metric - Concept based explanation | Detection Accuracy for Evaluating Compositional Explanations of Units | Arxiv | | |
| Explanation - Uncertainty | Effects of Uncertainty on the Quality of Feature Importance Explanations | Arxiv | | |
| Survey Paper | TOWARDS USER-CENTRIC EXPLANATIONS FOR EXPLAINABLE MODELS: A REVIEW | JISTM Journal Paper | | |
| Feature attribution | The Struggles and Subjectivity of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets | AAAI 2021 workshop | | |
| Contextual explanation | Context-based image explanations for deep neural networks | Image and Vision Computing Journal | | |
| Causal + Counterfactual | Counterfactual Instances Explain Little | Arxiv | | |
| Case based Posthoc | Explaining Deep Learning using examples: Optimal feature weighting methods for twin systems using post-hoc, explanation-by-example in XAI | Elsevier | | |
| Debugging gray box model | Toward a Unified Framework for Debugging Gray-box Models | Arxiv | | |
| Explainable by design | Optimising for Interpretability: Convolutional Dynamic Alignment Networks | Arxiv | | |
| XAI negative effect | Explainability Pitfalls: Beyond Dark Patterns in Explainable AI | Arxiv | | |
| Evaluate attributions | WHO EXPLAINS THE EXPLANATION? QUANTITATIVELY ASSESSING FEATURE ATTRIBUTION METHODS | Arxiv | | |
| Counterfactual explanations | Designing Counterfactual Generators using Deep Model Inversion | Arxiv | | |
| Model correction using explanation | Consistent Explanations by Contrastive Learning | Arxiv | | |
| Visualize feature maps | Visualizing Feature Maps for Model Selection in Convolutional Neural Networks | ICCV 2021 Workshop | Tensorflow 1.15 | |
| SPS | Stochastic Partial Swap: Enhanced Model Generalization and Interpretability for Fine-grained Recognition | ICCV 2021 | PyTorch | |
| DMBP | Generating Attribution Maps with Disentangled Masked Backpropagation | ICCV 2021 | | |
| Better CAM | Towards Better Explanations of Class Activation Mapping | ICCV 2021 | | |
| LEG | Statistically Consistent Saliency Estimation | ICCV 2021 | Keras | |
| IBA | Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information | NeurIPS 2021 | PyTorch | |
| Looks similar to This Looks Like That | Interpretable Image Recognition by Constructing Transparent Embedding Space | ICCV 2021 | Code not yet publicly released | |
| Causal Imagenet | CAUSAL IMAGENET: HOW TO DISCOVER SPURIOUS FEATURES IN DEEP LEARNING? | Arxiv | | |
| Model correction | Logic Constraints to Feature Importances | Arxiv | | |
| Receptive field Misalignment CAM | On the Receptive Field Misalignment in CAM-based Visual Explanations | Pattern Recognition Letters | PyTorch | |
| Simplex | Explaining Latent Representations with a Corpus of Examples | Arxiv | PyTorch | |
| Sanity checks | Revisiting Sanity Checks for Saliency Maps | Arxiv - NeurIPS 2021 workshop | | |
| Model correction | Debugging the Internals of Convolutional Networks | PDF | | |
| SITE | Self-Interpretable Model with Transformation Equivariant Interpretation | Arxiv - Accepted at NeurIPS 2021 | | EbD |
| Influential examples | Revisiting Methods for Finding Influential Examples | Arxiv | | |
| SOBOL | Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis | NeurIPS 2021 | Tensorflow and PyTorch | |
| Feature vectors | Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics | Arxiv | | global interpretability |
| OOD in explainability | The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations | NeurIPS 2021 | sklearn | |
| RPS LJE | Representer Point Selection via Local Jacobian Expansion for Post-hoc Classifier Explanation of Deep Neural Networks and Ensemble Models | NeurIPS 2021 | PyTorch | |
| Model correction | Editing a Classifier by Rewriting Its Prediction Rules | NeurIPS 2021 | Code | |
| Suppressor variable litmus test | Scrutinizing XAI using linear ground-truth data with suppressor variables | Arxiv | | |
| Explainable knowledge distillation | Learning Interpretation with Explainable Knowledge Distillation | Arxiv | | |
| STEEX | STEEX: Steering Counterfactual Explanations with Semantics | Arxiv | Code | |
| Binary counterfactual explanation | Counterfactual Explanations via Latent Space Projection and Interpolation | Arxiv | | |
| ECLAIRE | Efficient Decompositional Rule Extraction for Deep Neural Networks | Arxiv | R | |
| CartoonX | Cartoon Explanations of Image Classifiers | Researchgate | | |
| Concept based explanation | Explanations in terms of Hierarchically organised Middle Level Features | Paper | | see how close to MACE and PACE |
| Concept ball | Ontology-based n-ball Concept Embeddings Informing Few-shot Image Classification | Paper | | |
| SPARROW | SPARROW: Semantically Coherent Prototypes for Image Classification | BMVC 2021 | | |
| XAI evaluation criteria | Objective criteria for explanations of machine learning models | Paper | | |
| Code inversion with human perception | EXPLORING ALIGNMENT OF REPRESENTATIONS WITH HUMAN PERCEPTION | Arxiv | | |
| Deformable ProtoPNet | Deformable ProtoPNet: An Interpretable Image Classifier Using Deformable Prototypes | Arxiv | | |
| ICSN | Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations | Arxiv | | |
| HIVE | HIVE: Evaluating the Human Interpretability of Visual Explanations | Arxiv | Project Page | |
| Jitter CAM | Jitter-CAM: Improving the Spatial Resolution of CAM-Based Explanations | BMVC 2021 | PyTorch | |
| Interpreting last layer | Identifying Class Specific Filters with L1 Norm Frequency Histograms in Deep CNNs | Arxiv | | |
| FCP | Forward Composition Propagation for Explainable Neural Reasoning | Arxiv | | |
| ProtoPool | Interpretable Image Classification with Differentiable Prototypes Assignment | Arxiv | | |
| PRELIM | Pedagogical Rule Extraction for Learning Interpretable Models | Arxiv | | |
| Fair correction vectors | FAIR INTERPRETABLE LEARNING VIA CORRECTION VECTORS | ICLR 2021 | | |
| Smooth LRP | SmoothLRP: Smoothing LRP by Averaging over Stochastic Input Variations | ESANN 2021 | | |
| Causal CAM | EXTRACTING CAUSAL VISUAL FEATURES FOR LIMITED LABEL CLASSIFICATION | ICIP 2021 | | |