Testing ML Systems
- ML systems need to be tested more carefully than traditional software. Why? Because their rules are more loosely defined.
Key testing principles for ML: pre-deployment
- use a schema for features
- model specification test: any change to the model config needs a unit test.
- validate model quality: test for both sudden and slow degradation.
- test input feature code
- training is reproducible: e.g. fix the random seeds.
- integration test the pipeline
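The "training is reproducible" point can be sketched by fixing every seed before anything random runs (a real project would also seed numpy, torch, etc.). The helper below is a hypothetical stand-in for a training run, not code from the source:

```python
import random

def train_tiny_model(data, seed=42):
    # Hypothetical stand-in for a training run: fix the seed up
    # front so shuffling and initialisation are deterministic.
    random.seed(seed)
    random.shuffle(data)                # deterministic shuffle
    weight = random.uniform(-1.0, 1.0)  # deterministic "initialisation"
    return data, weight

run1 = train_tiny_model(list(range(10)), seed=0)
run2 = train_tiny_model(list(range(10)), seed=0)
assert run1 == run2  # same seed, identical "training" run
```

A reproducibility unit test is then just this assertion: run training twice with the same seed and require identical outputs.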
from til.
testing theory
- unit test: tests a single piece of logic or a single class
- integration test: tests assembled components
- system test: end-to-end test
How much testing?
- prioritise the code base: which parts matter most?
- what is mission critical?
- do the tests reduce uncertainty about your system?
test data schema
iris_schema = {
    'sepal length': {
        'range': {
            'min': 4.0,  # determined by looking at the dataframe .describe() output
            'max': 8.0
        },
        'dtype': float,
    },
    'sepal width': {
        'range': {
            'min': 1.0,
            'max': 5.0
        },
        'dtype': float,
    },
    'petal length': {
        'range': {
            'min': 1.0,
            'max': 7.0
        },
        'dtype': float,
    },
    'petal width': {
        'range': {
            'min': 0.1,
            'max': 3.0
        },
        'dtype': float,
    }
}
import unittest

class TestIrisInputData(unittest.TestCase):
    def setUp(self):
        # `setUp` runs before each test, ensuring that you have a
        # fresh pipeline to access in your tests. See the unittest
        # docs if you are unfamiliar with unittest.
        # https://docs.python.org/3/library/unittest.html#unittest.TestCase.setUp
        self.pipeline = SimplePipeline()
        self.pipeline.run_pipeline()

    def test_input_data_ranges(self):
        # get df max and min values for each column
        max_values = self.pipeline.frame.max()
        min_values = self.pipeline.frame.min()
        # loop over each feature (i.e. all 4 column names)
        for feature in self.pipeline.feature_names:
            # use unittest assertions to ensure the max/min values found
            # in the dataset fall within the range expected by the schema.
            self.assertTrue(max_values[feature] <= iris_schema[feature]['range']['max'])
            self.assertTrue(min_values[feature] >= iris_schema[feature]['range']['min'])

    def test_input_data_types(self):
        data_types = self.pipeline.frame.dtypes  # pandas dtypes attribute
        for feature in self.pipeline.feature_names:
            self.assertEqual(data_types[feature], iris_schema[feature]['dtype'])
- Define the schema up front, then test that the input data satisfies the min/max ranges and the data types the schema specifies.
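The tests above reference a SimplePipeline class that isn't shown here. A minimal sketch of what it might look like, assuming it only needs to expose `frame` (a DataFrame of raw features) and `feature_names` (the schema keys); the sample rows are illustrative, where a real pipeline would load the full Iris dataset:

```python
import pandas as pd

class SimplePipeline:
    # Hypothetical stand-in for the pipeline the schema tests assume.
    def __init__(self):
        self.frame = None
        self.feature_names = ['sepal length', 'sepal width',
                              'petal length', 'petal width']

    def run_pipeline(self):
        # A real pipeline would load the full dataset here
        # (e.g. sklearn.datasets.load_iris); a few rows stand in.
        self.frame = pd.DataFrame(
            [[5.1, 3.5, 1.4, 0.2],
             [6.3, 2.9, 5.6, 1.8],
             [4.6, 3.1, 1.5, 0.2]],
            columns=self.feature_names,
        )
```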
testing data engineering
import unittest

class TestIrisDataEngineering(unittest.TestCase):
    def setUp(self):
        self.pipeline = PipelineWithDataEngineering()
        self.pipeline.load_dataset()

    def test_scaler_preprocessing_brings_x_train_mean_near_zero(self):
        # Given
        # convert the dataframe to a single column with pandas stack
        original_mean = self.pipeline.X_train.stack().mean()

        # When
        self.pipeline.apply_scaler()

        # Then
        # The idea behind StandardScaler is that it will transform your data
        # to center the distribution at 0 and scale the variance at 1.
        # Therefore we test that the mean has shifted to be less than the original
        # and close to 0 using assertAlmostEqual to check to 3 decimal places:
        # https://docs.python.org/3/library/unittest.html#unittest.TestCase.assertAlmostEqual
        self.assertTrue(original_mean > self.pipeline.X_train.mean())  # X_train is a numpy array at this point.
        self.assertAlmostEqual(self.pipeline.X_train.mean(), 0.0, places=3)
        print(f'Original X train mean: {original_mean}')
        print(f'Transformed X train mean: {self.pipeline.X_train.mean()}')

    def test_scaler_preprocessing_brings_x_train_std_near_one(self):
        # When
        self.pipeline.apply_scaler()

        # Then
        # We also check that the standard deviation is close to 1
        self.assertAlmostEqual(self.pipeline.X_train.std(), 1.0, places=3)
        print(f'Transformed X train standard deviation: {self.pipeline.X_train.std()}')
- We applied a StandardScaler preprocessing step to the data.
- These tests check that the preprocessing was applied as we intended.
- It is important to split the data-preprocessing logic into pieces that are easy to test.
- A good pattern is to wrap the pipeline itself in a class, run only load_dataset in setUp, and then test each subsequent step in turn.
- That is, the preprocessing functions return nothing and operate only on the frame held inside the pipeline object. Structuring the tests this way in the bigdata-platform project would have made things much easier, too.
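A minimal sketch of what PipelineWithDataEngineering might look like under this pattern. The dataset loading is faked with random numbers, and the scaler is the standard (x - mean) / std transform, equivalent to what sklearn's StandardScaler applies by default; both are assumptions, not code from the source:

```python
import numpy as np
import pandas as pd

class PipelineWithDataEngineering:
    # Hypothetical sketch: each step mutates internal state and
    # returns nothing, so tests can run the pipeline step by step.
    def __init__(self):
        self.X_train = None

    def load_dataset(self):
        # A real pipeline would load and split the Iris data here;
        # random numbers keep the sketch self-contained.
        rng = np.random.default_rng(0)
        self.X_train = pd.DataFrame(rng.normal(5.0, 2.0, size=(100, 4)))

    def apply_scaler(self):
        # Standardise each column: (x - mean) / std, the same
        # transform StandardScaler applies with default settings.
        # After this step X_train is a plain numpy array.
        X = self.X_train.to_numpy()
        self.X_train = (X - X.mean(axis=0)) / X.std(axis=0)
```

After load_dataset the overall mean is around 5; after apply_scaler it is near 0 and the standard deviation near 1, which is exactly what the tests above assert.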