Comments (8)
Yep, your intuition is correct - ideally one would quantize during training (or emulate it). I would surmise that post-training quantization would perform poorly on a model that's been compressed via structured multi-hashing (SMH), due to the nonlinear mapping of weights to the reduced weight matrices (V1 and V2).
In theory, there shouldn't be any issue doing both SMH and quantization - however, I haven't dug into the implementation internals of QAT to confirm whether it's compatible with FeatherMap out of the box. Let me know how it goes!
from feathermap.
On further thought, I think one should apply QAT after wrapping in FeatherMap. FeatherNet as a layer will just expose self.V1 and self.V2 as the weights to be updated, which would then be quantized (or have quantization emulated) and then trained. E.g., something like:
```python
base_model = ResNet50()
f_model = FeatherNet(base_model, compress=0.10)
# now apply quantization awareness to f_model, and thus to V1 and V2
# train
# evaluate and convert
```
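The "emulated quantization" step above is the core of QAT: in the forward pass, weights are quantized and immediately dequantized, so the loss sees quantization error while gradients still flow in float. A minimal, dependency-free sketch of that quantize-dequantize step as it would apply to entries of a reduced matrix like V1 (the symmetric 8-bit scheme and the sample values are illustrative assumptions, not FeatherMap or PyTorch internals):

```python
def fake_quantize(weights, num_bits=8):
    """Emulate quantization: snap weights to a signed integer grid, then
    map them back to floats, so training sees the rounding error."""
    qmax = 2 ** (num_bits - 1) - 1                      # e.g. 127 for int8
    scale = max(abs(w) for w in weights) / qmax or 1.0  # symmetric scale
    return [round(w / scale) * scale for w in weights]

# Hypothetical flattened entries of a reduced weight matrix like V1.
v1 = [0.731, -0.542, 0.003, 1.204, -1.198]
v1_q = fake_quantize(v1)

# The emulated values sit on the int8 grid but remain floats,
# and the per-entry error is bounded by half the quantization step.
print(v1_q)
print(max(abs(a - b) for a, b in zip(v1, v1_q)))
```

In real QAT this happens inside the training loop on every forward pass, so V1 and V2 learn values that survive the eventual conversion to true integer weights.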
Thank you for your reply!
How about evaluation: what order would it follow?
I'll also try comparing this method to iterative pruning ("To prune or not to prune", Zhu et al.) and some dynamic sparse training techniques (RigL, Evci et al., 2020).
> Thank you for your reply!
> How about evaluation: what order would it follow?
For accuracy and other metric evaluation you can make use of the GPU if you keep it in `f_model.eval()` mode. However, if you want to benchmark inference time, then you'd want to use `f_model.deploy()`. Presumably one would only need to actually go to reduced precision when deploying - if QAT can continue to emulate quantization during evaluation, I'd do that.
> I'll also try comparing this method to iterative pruning ("To prune or not to prune", Zhu et al.) and some dynamic sparse training techniques (RigL, Evci et al., 2020).
Awesome. One of the cool things about FeatherMap is the ability to compound other compression methods. I'm very curious to see what kind of performance you might get compared to 'unstacked' compression methods.
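As a back-of-the-envelope illustration of how compounding pays off: the 0.10 ratio comes from the `compress=0.10` snippet earlier in the thread, while the 4x factor and the ~25.6M ResNet-50 parameter count are my own assumptions for a float32-to-int8 quantization on top:

```python
# Size multipliers compound multiplicatively when compression methods stack.
smh_ratio = 0.10      # FeatherNet(compress=0.10): ~10% of the parameters
quant_ratio = 1 / 4   # float32 -> int8: assumed 4x fewer bits per weight
combined = smh_ratio * quant_ratio

print(f"combined model size: {combined:.1%} of the dense float32 baseline")

# With ~25.6M ResNet-50 parameters at 4 bytes each (~102.4 MB dense):
dense_mb = 25.6e6 * 4 / 1e6
print(f"{dense_mb:.1f} MB -> {dense_mb * combined:.2f} MB")
```

The caveat, of course, is that accuracy losses can also compound, which is exactly what the comparison against unstacked methods would reveal.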
> For accuracy and other metric evaluation you can make use of the GPU if you keep it in `f_model.eval()` mode. However, if you want to benchmark inference time, then you'd want to use `f_model.deploy()`. Presumably one would only need to actually go to reduced precision when deploying - if QAT can continue to emulate quantization during evaluation, I'd do that.
I'm actually just interested in compressing the weights V_1, V_2. So I don't need to worry about eval? (`model.state_dict` should have V_1, V_2?)
Yes, the `state_dict` will save V1 and V2 as the weights, as well as the batchnorm layers.