Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system.

License: Apache License 2.0


machine-learning deep-learning python classification clustering natural-language-processing computer-vision spacy nltk scikit-learn

practical-machine-learning-with-python's Introduction

Practical Machine Learning with Python

A Problem-Solver's Guide to Building Real-World Intelligent Systems

"Data is the new oil" is a saying which you must have heard by now along with the huge interest building up around Big Data and Machine Learning in the recent past along with Artificial Intelligence and Deep Learning. Besides this, data scientists have been termed as having "The sexiest job in the 21st Century" which makes it all the more worthwhile to build up some valuable expertise in these areas. Getting started with machine learning in the real world can be overwhelming with the vast amount of resources out there on the web.

"Practical Machine Learning with Python" follows a structured and comprehensive three-tiered approach packed with concepts, methodologies, hands-on examples, and code. This book is packed with over 500 pages of useful information which helps its readers master the essential skills needed to recognize and solve complex problems with Machine Learning and Deep Learning by following a data-driven mindset. By using real-world case studies that leverage the popular Python Machine Learning ecosystem, this book is your perfect companion for learning the art and science of Machine Learning to become a successful practitioner. The concepts, techniques, tools, frameworks, and methodologies used in this book will teach you how to think, design, build, and execute Machine Learning systems and projects successfully.

This repository contains all the code, notebooks and examples used in this book. We will also be adding bonus content here from time to time. So keep watching this space!


About the book

Master the essential skills needed to recognize and solve complex problems with machine learning and deep learning. Using real-world examples that leverage the popular Python machine learning ecosystem, this book is your perfect companion for learning the art and science of machine learning to become a successful practitioner. The concepts, techniques, tools, frameworks, and methodologies used in this book will teach you how to think, design, build, and execute machine learning systems and projects successfully.

We focus on leveraging state-of-the-art data analysis, machine learning and deep learning frameworks, including scikit-learn, pandas, statsmodels, spaCy, nltk, gensim, tensorflow, keras, skater and several others, to process, wrangle, analyze, visualize and model real-world datasets and problems. With a learn-by-doing approach, we try to abstract away complex theory and concepts (while presenting the essentials wherever necessary), since these often hold practitioners back from leveraging the true power of machine learning to solve their own problems.

Edition: 1st
Pages: 532
Language: English
Book Title: Practical Machine Learning with Python
Publisher: Apress (a part of Springer)
Copyright: Dipanjan Sarkar, Raghav Bali, Tushar Sharma
Print ISBN: 978-1-4842-3206-4
Online ISBN: 978-1-4842-3207-1
DOI: 10.1007/978-1-4842-3207-1

Practical Machine Learning with Python follows a structured and comprehensive three-tiered approach packed with hands-on examples and code.

Part 1 focuses on understanding machine learning concepts and tools. This includes machine learning basics with a broad overview of algorithms, techniques, concepts and applications, followed by a tour of the entire Python machine learning ecosystem. Brief guides for useful machine learning tools, libraries and frameworks are also covered.
Part 2 details standard machine learning pipelines, with an emphasis on data processing and analysis, feature engineering, and modeling. You will learn how to process, wrangle, summarize and visualize data in its various forms. Feature engineering and selection methodologies will be covered in detail with real-world datasets, followed by model building, tuning, interpretation and deployment.
Part 3 explores multiple real-world case studies spanning diverse domains and industries like retail, transportation, movies, music, marketing, computer vision and finance. For each case study, you will learn the application of various machine learning techniques and methods. The hands-on examples will help you become familiar with state-of-the-art machine learning tools and techniques and understand what algorithms are best suited for any problem.

Practical Machine Learning with Python will empower you to start solving your own problems with machine learning today!

What You'll Learn

Execute end-to-end machine learning projects and systems
Implement hands-on examples with industry standard, open source, robust machine learning tools and frameworks
Review case studies depicting applications of machine learning and deep learning on diverse domains and industries
Apply a wide range of machine learning models including regression, classification, and clustering.
Understand and apply the latest models and methodologies from deep learning including CNNs, RNNs, LSTMs and transfer learning.


Audience

This book has been specially written for IT professionals, analysts, developers, data scientists, engineers, graduate students and anyone interested in analyzing and deriving insights from data!

Acknowledgements

TBA


practical-machine-learning-with-python's Issues

Error while importing arima_utils and lstm_utils

The following error appears:

from arima_utils import ad_fuller_test, plot_rolling_stats
from arima_utils import plot_acf_pacf, arima_gridsearch_cv

ModuleNotFoundError Traceback (most recent call last)
in
----> 1 from arima_utils import ad_fuller_test, plot_rolling_stats
2 from arima_utils import plot_acf_pacf, arima_gridsearch_cv

ModuleNotFoundError: No module named 'arima_utils'

When I try to install the module with pip, I get the error below:
PS C:\Users\ankur> pip install arima_utils
Collecting arima_utils
Could not find a version that satisfies the requirement arima_utils (from versions: )
No matching distribution found for arima_utils

Are these user-defined utility modules? If so, please send me the link.
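
The traceback in the Chapter 11 issue further down shows arima_utils.py sitting inside notebooks/Ch11_Forecasting_Stock_and_Commodity_Prices, which suggests these helpers are plain files shipped with the repository rather than packages published on PyPI. A minimal sketch of one way to make such a file importable, assuming the repository has been cloned into the current working directory (the relative path below is an assumption about your setup, and add_to_import_path is a made-up helper name):

```python
import sys
from pathlib import Path


def add_to_import_path(folder):
    """Prepend a folder to sys.path so the .py files inside it become importable."""
    folder = str(Path(folder).resolve())
    if folder not in sys.path:
        sys.path.insert(0, folder)
    return folder


# Assumed location of the local clone, relative to the working directory.
repo_root = Path("practical-machine-learning-with-python")
chapter_dir = repo_root / "notebooks" / "Ch11_Forecasting_Stock_and_Commodity_Prices"
add_to_import_path(chapter_dir)

# If the clone really is at that location, this import should now resolve:
# from arima_utils import ad_fuller_test, plot_rolling_stats
```

Running the notebook from inside the chapter folder itself should have the same effect, since the notebook's own directory is normally on the import path.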

PyLDAvis doesn't display anything

Hi, great book.
However, I followed all your instructions in chapter 7, and the code pyLDAvis.sklearn.prepare(pos_nmf, ptvf_features, ptvf, R=15) displays nothing in my notebook. I also found this warning in the Jupyter console:

RuntimeWarning: invalid value encountered in multiply
  relevance = lambda_ * log_ttd + (1 - lambda_) * log_lift

I don't know if there's anything I can do to get the picture.
Thanks

This appears to be a bug in pyLDAvis; the following code may work:

data = pyLDAvis.sklearn.prepare(pos_nmf, ptvf_features, ptvf, R=15)
pyLDAvis.show(data)

Why is the downloaded CSV file just a Git LFS pointer (version https://git-lfs.github.com/spec/v1 oid sha256:3207c5d96b1b404b3d371bf4692369c90602f532b4861664567836eaef089465 size 230) and not the actual file?
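
The text quoted in this issue has the shape of a Git LFS pointer: a small text stub that Git LFS leaves in the working tree when the real content has not been fetched. If the repository does track this CSV with Git LFS, running `git lfs pull` inside the clone (with Git LFS installed) would normally replace the stub with the data. The sketch below is a hypothetical helper, not part of the repository, that checks whether a downloaded file is still such a stub:

```python
import os
import tempfile

LFS_SIGNATURE = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` starts with the Git LFS pointer header."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_SIGNATURE)) == LFS_SIGNATURE


# Demonstration on two throwaway files (the file names are made up).
with tempfile.TemporaryDirectory() as tmp:
    stub_path = os.path.join(tmp, "pointer.csv")
    with open(stub_path, "wb") as handle:
        handle.write(
            b"version https://git-lfs.github.com/spec/v1\n"
            b"oid sha256:3207c5d96b1b404b3d371bf4692369c90602f532b4861664567836eaef089465\n"
            b"size 230\n"
        )
    real_path = os.path.join(tmp, "real.csv")
    with open(real_path, "wb") as handle:
        handle.write(b"col_a,col_b\n1,2\n")

    stub_is_pointer = is_lfs_pointer(stub_path)   # the stub is detected
    real_is_pointer = is_lfs_pointer(real_path)   # ordinary CSV is not
```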

Practical machine learning

Greatest

Index Error

When I execute the following statement from the .ipynb file:
neg_idx = df[(df.news_category=='technology') & (df.sentiment_score == -15)].index[0]

I get the following error:
IndexError Traceback (most recent call last)
in
----> 5 neg_idx = df[(df.news_category=='technology') & (df.sentiment_score == -15)].index[1]
~/anaconda3/envs/myenv/lib/python3.7/site-packages/pandas/core/indexes/base.py in __getitem__(self, key)
2082
2083 if is_scalar(key):
-> 2084 return getitem(key)
2085
2086 if isinstance(key, slice):
IndexError: index 1 is out of bounds for axis 0 with size 0

Any assistance will be appreciated.
Gopinathan K.M
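
The "size 0" in the message indicates that the filter matched no rows at all, so any position in the resulting index is out of bounds. That can happen if the sentiment scores in your data differ from the ones the statement was written against. A hedged sketch with an invented three-row dataframe (the values are not from the book) showing a guard around the hard-coded score, and an alternative that picks the lowest-scoring technology article whatever its value:

```python
import pandas as pd

# Toy stand-in for the notebook's dataframe; the values are invented.
df = pd.DataFrame({
    "news_category": ["technology", "technology", "sports"],
    "sentiment_score": [-6.0, 3.0, -15.0],
})

tech = df[df.news_category == "technology"]

# Hard-coded score: it may match nothing, so check before indexing.
matches = tech[tech.sentiment_score == -15].index
neg_idx = matches[0] if len(matches) > 0 else None

# More robust: take whichever technology article has the lowest score.
most_negative_idx = tech.sentiment_score.idxmin()
```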

Chapter 11 - notebook_gold_forecast_arima.ipynb

Hello,
I hope you're well.
I very much like your handbook and its real-world examples.
My issue concerns Chapter 11 - notebook_gold_forecast_arima.ipynb: in the 24th paragraph of "ARIMA/Training-Testing Split", there is an "AttributeError: 'Series' object has no attribute 'ix'" problem:

In : results_dict = arima_gridsearch_cv(new_df.log_series,cv_splits=5)

Out :
********************
Iteration 1 of 5
TRAIN: [ 0 1 2 ... 2922 2923 2924] TEST: [2925 2926 2927 ... 5847 5848 5849]

AttributeError Traceback (most recent call last)
in
----> 1 results_dict = arima_gridsearch_cv(new_df.log_series,cv_splits=5)

~\anaconda3\workdoc\handbook\practical-machine-learning-with-python-master\notebooks\Ch11_Forecasting_Stock_and_Commodity_Prices\arima_utils.py in arima_gridsearch_cv(series, cv_splits, verbose, show_plots)
117
118 # split train and test sets
--> 119 train_series = series.iloc[train_index]
120 test_series = series.iloc[test_index]
121

~\anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
5137 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5138 return self[name]
-> 5139 return object.__getattribute__(self, name)
5140
5141 def __setattr__(self, name: str, value) -> None:

AttributeError: 'Series' object has no attribute 'ix'

In fact, the 'ix' indexer is deprecated and has been removed from recent pandas versions. So I replaced 'ix' with 'iloc' in the arima_utils.py file (function: arima_gridsearch_cv), but then I ran into another problem and I have no clue how to fix it. Could you please send us the updated files for Chapter 11? Thank you in advance.
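
One detail in the traceback may explain part of the confusion: the source line it prints already reads series.iloc[train_index], yet the error still mentions 'ix'. This pattern typically appears when a module file is edited on disk after the notebook kernel has already imported it. The traceback shows the new text, but the old code keeps running until the kernel is restarted or the module is reloaded. The sketch below reproduces the situation with a throwaway module (demo_utils is a made-up name) and shows `importlib.reload` picking up the edit:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway module on disk; "demo_utils" is a made-up name.
module_dir = tempfile.mkdtemp()
module_file = Path(module_dir) / "demo_utils.py"
module_file.write_text("def indexer():\n    return 'ix'\n")

sys.path.insert(0, module_dir)
import demo_utils

before_edit = demo_utils.indexer()

# Edit the file on disk, as one would when patching arima_utils.py.
module_file.write_text("def indexer():\n    return 'iloc'\n")
stale = demo_utils.indexer()      # the old code is still in memory

# Reload so the running session picks up the edited source.
importlib.invalidate_caches()
demo_utils = importlib.reload(demo_utils)
fresh = demo_utils.indexer()      # now reflects the edited file
```

In a notebook, restarting the kernel and re-running the import cell achieves the same result.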

Best Regards,

Maghnia Dib

unable to install model_evaluation_utils

Hi, I am trying to use your model evaluation package in google colab and I am getting the error as below:
!pip install model_evaluation_utils

Collecting model_evaluation_utils
ERROR: Could not find a version that satisfies the requirement model_evaluation_utils (from versions: none)
ERROR: No matching distribution found for model_evaluation_utils

how can I solve this problem?

Appreciate your reply. Thanks

Merge function is not working for the skip-gram code

The code below does not work with the latest version of Keras:

from keras.layers import Merge
from keras.layers.core import Dense, Reshape
from keras.layers.embeddings import Embedding
from keras.models import Sequential

# build skip-gram architecture
word_model = Sequential()
word_model.add(Embedding(vocab_size, embed_size,
                         embeddings_initializer="glorot_uniform",
                         input_length=1))
word_model.add(Reshape((embed_size, )))

context_model = Sequential()
context_model.add(Embedding(vocab_size, embed_size,
                  embeddings_initializer="glorot_uniform",
                  input_length=1))
context_model.add(Reshape((embed_size,)))

model = Sequential()
model.add(Merge([word_model, context_model], mode="dot"))
model.add(Dense(1, kernel_initializer="glorot_uniform", activation="sigmoid"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")

# view model summary
print(model.summary())

# visualize model structure
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

SVG(model_to_dot(model, show_shapes=True, show_layer_names=False, 
                 rankdir='TB').create(prog='dot', format='svg'))

I have converted the code from the Sequential API to the functional API. Please find the code below:

from keras.layers import dot
from keras.layers import Dense, Input
from keras.layers.core import Reshape
from keras.layers import Embedding
from keras.models import  Model
from keras.backend import reshape

# build skip-gram architecture
inp = Input(shape=(1,),name = "first_input")
word_model22 = Embedding(input_dim=vocab_size, output_dim=embed_size,
                         embeddings_initializer="glorot_uniform",
                         input_length=1)
emb  = word_model22(inp)
word_model22 = Reshape(target_shape= (embed_size,))(emb)

inp1 = Input(shape=(1,),name = "2nd_input")
context_model = Embedding(input_dim=vocab_size, output_dim=embed_size,
                         embeddings_initializer="glorot_uniform",
                         input_length=1)
emb1  = context_model(inp1)
context_model = Reshape(target_shape= (embed_size,))(emb1)

mo = (dot([word_model22, context_model],axes=-1))
mo = (Dense(1, kernel_initializer="glorot_uniform", activation="sigmoid"))(mo)
model = Model(inputs = (inp, inp1), outputs =mo)
model.compile(loss="mean_squared_error", optimizer="rmsprop")

# view model summary
print(model.summary())

# visualize model structure
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

SVG(model_to_dot(model, show_shapes=True, show_layer_names=False, 
                 rankdir='TB').create(prog='dot', format='svg'))

expand contractions is not working as expected

When a contraction contains more than one single quote ('), such as "you'll've", expand_contractions in text_normalizer.py outputs "you willve", but the expected output is "you will have".
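
One plausible cause, assuming the pattern is built by joining the dictionary keys in their stored order (as in the expand_contractions function quoted in a later issue): regex alternation takes the first alternative that matches, so "you'll" can win over "you'll've", leaving "'ve" behind to lose its apostrophe. Sorting the keys longest-first avoids that. The mapping below is a tiny illustrative subset, not the repository's full CONTRACTION_MAP, and the function names are made up for this sketch:

```python
import re

# Tiny illustrative subset; not the full map from the repository.
contraction_map = {
    "you'll": "you will",
    "you'll've": "you will have",
}


def build_pattern(mapping, longest_first):
    """Compile an alternation of the keys, optionally longest key first."""
    keys = sorted(mapping, key=len, reverse=True) if longest_first else list(mapping)
    return re.compile("({})".format("|".join(re.escape(k) for k in keys)),
                      flags=re.IGNORECASE)


def expand(text, mapping, longest_first=True):
    pattern = build_pattern(mapping, longest_first)
    expanded = pattern.sub(lambda m: mapping[m.group(0).lower()], text)
    # Same final step as the original function: drop leftover apostrophes.
    return expanded.replace("'", "")


naive = expand("you'll've", contraction_map, longest_first=False)
fixed = expand("you'll've", contraction_map, longest_first=True)
print(naive)
print(fixed)
```

With the keys in stored order the sketch reproduces the reported "you willve"; with longest-first ordering it yields "you will have".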

Puzzle for Ch7, clean data, normalize_corpus(test_reviews)

Since we have already implemented "normalize_corpus" function, why there is still

after cleaning?

I simply picked the 35002nd comment... so confusing.

test_reviews = reviews[35000:35005]

sample_review_ids = [1, 2, 3]

REVIEW: Be careful with this one. Once you get yer mitts on it, it'll change the way you look at kung-fu flicks. You will be yearning a plot from all of the kung-fu films now, you will be wanting character depth and development, you will be craving mystery and unpredictability, you will demand dynamic camera work and incredible backdrops. Sadly, you won't find all of these aspects together in one kung-fu movie, EXCEPT for Five Deadly Venoms!

Easily the best kung-fu movie of all-time, Venoms blends a rich plot, full of twists and turns, with colourful (and developed) characters, along with some of the best camerawork to come out of the 70s. The success of someone liking the film depends on the viewers ability to decipher which character is which, and who specializes in what venom. One is the Centipede, two is the Snake, three is the Scorpion, four is the Lizard, and five is the Toad. Each character has different traits, characteristics, strengths, and weaknesses. Therein lies the hook, we learn along with the student character, finding out who these different men turn out to be. We are in his shoes (so to speak), and we have to pick who we trust, and who we don't, just like he does. We learn along with him.

Not only is the plot, the characters, and the camerawork great, it's also fun to watch, which in my book makes it more valuable than almost any other movie of it's kind. It's worth quite a few watches to pick up on everything that's going on. Venoms is a lesson on what kung-fu can really do...just don't expect many other kung-fu films to live up to it's gauntlet.
Actual Sentiment: positive
Predicted Sentiment polarity: 28.0

Minor import issue with notebook_gold_forecast_arima.ipynb

In the file "notebook_gold_forecast_arima", the following imports need to be updated to import correctly:

from time_series_utils import ad_fuller_test, plot_rolling_stats
from time_series_utils import plot_acf_pacf, arima_gridsearch_cv

They should read:

from arima_utils import ad_fuller_test, plot_rolling_stats
from arima_utils import plot_acf_pacf, arima_gridsearch_cv

Chapter 4 feature engineering numeric data notebook broken on GitHub

As of 2023-12-26, the notebook
notebooks/Ch04_Feature_Engineering_and_Selection/Feature Engineering on Numeric Data.ipynb
doesn’t render on the GitHub site, but instead results in an error page.

Ch07 use contractions_dict instead of import CONTRACTION_MAP

Hi,

In the chapter 7 :: https://github.com/dipanjanS/practical-machine-learning-with-python/blob/master/notebooks/Ch07_Analyzing_Movie_Reviews_Sentiment/Text%20Normalization%20Demo.ipynb ,

please use contractions_dict instead of "import CONTRACTION_MAP".

Also, please correct the spaCy load.

Requirements file

I think this repository would benefit from a requirements file with pinned versions. As usual, I'm getting stuck on packages with conflicting or incorrect versions in my conda environment.
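
Until such a file exists, one way to record versions from an environment in which the notebooks do run is to query the installed distributions. A minimal sketch using the standard library's `importlib.metadata` (Python 3.8+). The package list is taken from the frameworks named in the README and is an assumption about what the notebooks need, not a tested set of pins; pinned_requirements is a made-up helper name:

```python
from importlib import metadata

# Frameworks named in the README; extend as needed.
PACKAGES = ["scikit-learn", "pandas", "statsmodels", "spacy", "nltk",
            "gensim", "tensorflow", "keras"]


def pinned_requirements(packages):
    """Return 'name==version' lines for installed packages, comments for missing ones."""
    lines = []
    for name in packages:
        try:
            lines.append("{}=={}".format(name, metadata.version(name)))
        except metadata.PackageNotFoundError:
            lines.append("# {} is not installed in this environment".format(name))
    return lines


requirements_text = "\n".join(pinned_requirements(PACKAGES))
print(requirements_text)
```

The printed lines can be saved as requirements.txt and installed elsewhere with pip install -r requirements.txt.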

Contractions bug

"he'll've": "he he will have" in the contractions file should be "he'll've": "he will have"

word.lemma_ will lowercase the token

text = ' '.join([word.lemma_ if word.lemma_ != '-PRON-' else word.text for word in text])

In the latest spaCy, running this code lowercases the text. I do not want that to happen at this stage, so I think this is an issue.

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

Traceback (most recent call last):
File "/Users/evan/PycharmProjects/price-prediction/stock_price_forecast_regression_modeling.py", line 74, in
train_rmse = math.sqrt(mean_squared_error(y_train[train_offset:], np.array(train_pred_seqs).flatten()))
File "/Users/evan/PycharmProjects/price-prediction/venv/lib/python3.6/site-packages/sklearn/metrics/regression.py", line 238, in mean_squared_error
y_true, y_pred, multioutput)
File "/Users/evan/PycharmProjects/price-prediction/venv/lib/python3.6/site-packages/sklearn/metrics/regression.py", line 76, in _check_reg_targets
y_true = check_array(y_true, ensure_2d=False)
File "/Users/evan/PycharmProjects/price-prediction/venv/lib/python3.6/site-packages/sklearn/utils/validation.py", line 453, in check_array
_assert_all_finite(array)
File "/Users/evan/PycharmProjects/price-prediction/venv/lib/python3.6/site-packages/sklearn/utils/validation.py", line 44, in _assert_all_finite
" or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

Process finished with exit code 1

Ch3 : notebook_wrangle_data.ipynb

I am getting the following error.
(I'm a professor using your book as a textbook.)
Kindly help.

When I execute ln 4: describe_dataframe(df)

I get the following error:
Dataframe Sample Rows::

NameError Traceback (most recent call last)
in ()
----> 1 describe_dataframe(df)

in describe_dataframe(df)
111
112 print("Dataframe Sample Rows::")
--> 113 display(df.head(5))
114
115 def cleanup_column_names(df,rename_dict={},do_inplace=True):

NameError: name 'display' is not defined
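
In a notebook cell `display` is usually available without an import, but code that lives in a separate .py module has to import it explicitly. A minimal sketch of a guard that could be placed at the top of whichever module defines describe_dataframe (the fallback to `print` is an assumption for sessions without IPython, not something the book prescribes):

```python
try:
    # Available whenever IPython/Jupyter is installed.
    from IPython.display import display
except ImportError:
    # Fallback for plain Python sessions: just print the object.
    display = print

display("Dataframe Sample Rows::")
```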

CH 11 Sequence Modeling issue

Hello Dipanjan,
I just want to congratulate you for your book. It is one of the best out there.
However, I have a small issue:

on chapter 11 at Sequence Modeling when I call get_seq_train_test function I get an error
on this line: scaled_stock_series = scaler.fit_transform(time_series)

saying : ValueError: Expected 2D array, got 1D array instead: array=[1115.099976 1115.099976 1115.099976 ... 2711.929932 2643.689941
2634.800049].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Can you help me with this?
Thanks
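
The error message itself names the remedy: scikit-learn transformers expect a 2D array of shape (n_samples, n_features), so a single series has to be reshaped into one column first. A small NumPy sketch using a few of the values from the message above. It assumes `time_series` is a NumPy array; for a pandas Series, `time_series.values.reshape(-1, 1)` would be the equivalent:

```python
import numpy as np

# A few of the values quoted in the error message.
time_series = np.array([1115.099976, 1115.099976, 2643.689941, 2634.800049])

# Turn the 1D series into a single-feature column: shape (n_samples, 1).
time_series_2d = time_series.reshape(-1, 1)
print(time_series.shape, "->", time_series_2d.shape)

# With scikit-learn installed, and assuming `scaler` is a fitted-on-demand
# transformer such as MinMaxScaler, the failing call would then become:
# scaled_stock_series = scaler.fit_transform(time_series_2d)
```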

Fault for Ch7 normalize_corpus

When I try to pick a small number of reviews, test_reviews = reviews[35000:35005], it gives the error below:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

How can I fix it? Thank you!

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
2 # normalize dataset
3 #norm_test_reviews = tn.normalize_corpus(test_reviews)
----> 4 normalize_corpus([test_reviews],text_lemmatization=False, stopword_removal=False,text_lower_case=False)

D:\Jupyter_Notebook\Deep_Learning\Sentimen Analysis\L1\text_normalizer.py in normalize_corpus(corpus, html_stripping, contraction_expansion, accented_char_removal, text_lower_case, text_lemmatization, special_char_removal, stopword_removal)
92
93 if html_stripping:
---> 94 doc = strip_html_tags(doc)
95
96 if accented_char_removal:

D:\Jupyter_Notebook\Deep_Learning\Sentimen Analysis\L1\text_normalizer.py in strip_html_tags(text)
26 # # Cleaning Text - strip HTML
27 def strip_html_tags(text):
---> 28 soup = BeautifulSoup(text, "html.parser")
29 stripped_text = soup.get_text()
30 return stripped_text

~\Anaconda3\envs\deeplearning\lib\site-packages\bs4\__init__.py in __init__(self, markup, features, builder, parse_only, from_encoding, exclude_encodings, **kwargs)
223 self.contains_replacement_characters) in (
224 self.builder.prepare_markup(
--> 225 markup, from_encoding, exclude_encodings=exclude_encodings)):
226 self.reset()
227 try:

~\Anaconda3\envs\deeplearning\lib\site-packages\bs4\builder\_htmlparser.py in prepare_markup(self, markup, user_specified_encoding, document_declared_encoding, exclude_encodings)
203 try_encodings = [user_specified_encoding, document_declared_encoding]
204 dammit = UnicodeDammit(markup, try_encodings, is_html=True,
--> 205 exclude_encodings=exclude_encodings)
206 yield (dammit.markup, dammit.original_encoding,
207 dammit.declared_html_encoding,

~\Anaconda3\envs\deeplearning\lib\site-packages\bs4\dammit.py in __init__(self, markup, override_encodings, smart_quotes_to, is_html, exclude_encodings)
350 self.log = logging.getLogger(__name__)
351 self.detector = EncodingDetector(
--> 352 markup, override_encodings, is_html, exclude_encodings)
353
354 # Short-circuit if the data is in Unicode to begin with.

~\Anaconda3\envs\deeplearning\lib\site-packages\bs4\dammit.py in __init__(self, markup, override_encodings, is_html, exclude_encodings)
226
227 # First order of business: strip a byte-order mark.
--> 228 self.markup, self.sniffed_encoding = self.strip_byte_order_mark(markup)
229
230 def _usable(self, encoding, tried):

~\Anaconda3\envs\deeplearning\lib\site-packages\bs4\dammit.py in strip_byte_order_mark(cls, data)
278 # Unicode data cannot have a byte-order mark.
279 return data, encoding
--> 280 if (len(data) >= 4) and (data[:2] == b'\xfe\xff')
281 and (data[2:4] != '\x00\x00'):
282 encoding = 'utf-16be'

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Chapter 8 - frequent_itemsets not defined

2 print("num of required transactions = ", int(input_assoc_rules.shape[0]*support))
3 num_trans = input_assoc_rules.shape[0]*support
----> 4 itemsets = dict(frequent_itemsets(data_tran_uk_en, support))

NameError: name 'frequent_itemsets' is not defined

How can I import model_evaluation_utils?

I have Jupyter Notebook, but I don't know how to load model_evaluation_utils.
Running import model_evaluation_utils as meu does not work. Where can I get the file, and which folder should I put it in?

Easier way to download "en_vectors_web_lg" model in spacy

The procedure for getting the "en_vectors_web_lg" model in spaCy by downloading and unzipping the file and moving it to the appropriate directory, as illustrated here, is long and cumbersome.

Instead of the above procedure, we could simply do the following to load the model:

import spacy
import spacy.cli
spacy.cli.download("en_vectors_web_lg")
nlp = spacy.load('en_vectors_web_lg')

datasets

Is there a link for the datasets used in this book?

Chapter-3,Visualization using Matplotlib,page-170(Legends)

plt.plot(x,y,'g',label='y=x^2')
plt.plot(x,z,'b:',label='y=x')
plt.legend(loc="best")
plt.title('Legend Sample')
While trying to execute the code above, as given in your book, I am getting the following error:
ValueError: x and y must have same first dimension, but have shapes (10,) and (50,)
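
The message indicates that one of the two plot calls received arrays of different lengths, 10 and 50. A likely explanation is that `x` was redefined in an earlier cell while `y` or `z` still came from an older definition. A sketch that derives both curves from the same `x` immediately before plotting (the range and number of points are assumptions, not the book's values):

```python
import numpy as np

x = np.linspace(0, 5, 50)   # assumed range and size
y = x ** 2                  # matches the label 'y=x^2'
z = x                       # matches the label 'y=x'

# Both curves now share x's length, so plt.plot(x, y, ...) and
# plt.plot(x, z, ...) receive arrays with the same first dimension.
shapes_match = x.shape == y.shape == z.shape
print(x.shape, y.shape, z.shape)
```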

Very minor bug in contractions with "ain't"

The word ain't produces "as not" because expand_contractions has the following code:

expanded_contraction = first_char + expanded_contraction[1:]

Ain't does not fit this general rule
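
Because the quoted line glues the first character of the matched text onto the rest of the expansion, any contraction whose expansion starts with a different letter gets mangled; "ain't" turning into "as not" implies that its expansion is "is not". A possible alternative, sketched below with a hypothetical helper named match_case, is to copy only the capitalisation of the first character rather than the character itself:

```python
def match_case(original, expansion):
    """Capitalise `expansion` if `original` starts with an uppercase letter."""
    if original[:1].isupper():
        return expansion[:1].upper() + expansion[1:]
    return expansion


# "ain't" -> "is not" is inferred from the "as not" output reported above.
lower = match_case("ain't", "is not")
upper = match_case("Ain't", "is not")
print(lower)
print(upper)
```

Inside expand_contractions, the line quoted in the issue could then be replaced by a call such as match_case(match, expanded_contraction).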

Error in Accessing Layers of Word Embedding model

In this code, in the "Get word embeddings" section of the "Skip-gram model", the code is as follows:

merge_layer = model.layers[0]
word_model = merge_layer.layers[0]
word_embed_layer = word_model.layers[0]
weights = word_embed_layer.get_weights()[0][1:]

The above code gives the error as

AttributeError: 'InputLayer' object has no attribute 'layers'

The following code should be inserted in place of the above code (it works perfectly):

word_embed_layer = model.layers[2]
weights = word_embed_layer.get_weights()[0][1:]

Chapter 8 - NameError: name 'OneHot' is not defined

OneHot is not defined.

Chapter 9 error in Sklearn "DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)"

The code in Chapter 9 example on "Analyzing Wine Types"

wtp_dnn_predictions = le.inverse_transform(wtp_dnn_ypred)

throws a warning, and then a fatal error. I don't know how to fix this yet. The instructions are vague and unclear.

C:\Programdata\Anaconda3\envs\tensorflow\lib\site-packages\sklearn\preprocessing\_label.py:154: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)

ValueError Traceback (most recent call last)
Input In [40], in <cell line: 1>()
----> 1 wtp_dnn_predictions = le.inverse_transform(wtp_dnn_ypred)

File C:\Programdata\Anaconda3\envs\tensorflow\lib\site-packages\sklearn\preprocessing\_label.py:161, in LabelEncoder.inverse_transform(self, y)
159 diff = np.setdiff1d(y, np.arange(len(self.classes_)))
160 if len(diff):
--> 161 raise ValueError("y contains previously unseen labels: %s" % str(diff))
162 y = np.asarray(y)
163 return self.classes_[y]

ValueError: y contains previously unseen labels: [0.12855081 0.12855084 0.1564262 ... 0.56348777 0.5781211 0.60708743]

TypeError: 'NoneType' object is not subscriptable

I am having an issue with the code you shared in the Medium article named "A Practitioner's Guide to Natural Language Processing (Part I) — Processing & Understanding Text", where the contraction.py file was used in the code.

def expand_contractions(text, contraction_mapping=CONTRACTION_MAP):
    
    contractions_pattern = re.compile('({})'.format('|'.join(contraction_mapping.keys())), 
                                      flags=re.IGNORECASE|re.DOTALL)
    def expand_match(contraction):
        match = contraction.group(0)
        first_char = match[0]
        expanded_contraction = contraction_mapping.get(match)\
                                if contraction_mapping.get(match)\
                                else contraction_mapping.get(match.lower())                       
        expanded_contraction = first_char+expanded_contraction[1:]
        return expanded_contraction
        
    expanded_text = contractions_pattern.sub(expand_match, text)
    expanded_text = re.sub("'", "", expanded_text)
    return expanded_text

and while running the function on the data I am getting the error as mentioned in title.

df['Text_of_quest'] = [expand_contractions(x) for x in df['Text_of_quest'].to_list() if x is not None]

Traceback:

TypeError                                 Traceback (most recent call last)
<ipython-input-23-4358bc968219> in <module>
----> 1 get_ipython().run_cell_magic('timeit', '', "df['Text_of_quest'] = [expand_contractions(x) for x in df['Text_of_quest'].to_list() if x is not None]\n")

~/.conda/envs/project1/lib/python3.8/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
   2360             with self.builtin_trap:
   2361                 args = (magic_arg_s, cell)
-> 2362                 result = fn(*args, **kwargs)
   2363             return result
   2364 

<decorator-gen-60> in timeit(self, line, cell, local_ns)

~/.conda/envs/project1/lib/python3.8/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    185     # but it's overkill for just that one bit of state.
    186     def magic_deco(arg):
--> 187         call = lambda f, *a, **k: f(*a, **k)
    188 
    189         if callable(arg):

~/.conda/envs/project1/lib/python3.8/site-packages/IPython/core/magics/execution.py in timeit(self, line, cell, local_ns)
   1158             for index in range(0, 10):
   1159                 number = 10 ** index
-> 1160                 time_number = timer.timeit(number)
   1161                 if time_number >= 0.2:
   1162                     break

~/.conda/envs/project1/lib/python3.8/site-packages/IPython/core/magics/execution.py in timeit(self, number)
    167         gc.disable()
    168         try:
--> 169             timing = self.inner(it, self.timer)
    170         finally:
    171             if gcold:

<magic-timeit> in inner(_it, _timer)

<magic-timeit> in <listcomp>(.0)

<ipython-input-9-90eb3c3afe2e> in expand_contractions(text, contraction_mapping)
     12         return expanded_contraction
     13 
---> 14     expanded_text = contractions_pattern.sub(expand_match, text)
     15     expanded_text = re.sub("'", "", expanded_text)
     16     return expanded_text

<ipython-input-9-90eb3c3afe2e> in expand_match(contraction)
      9                                 if contraction_mapping.get(match)\
     10                                 else contraction_mapping.get(match.lower())
---> 11         expanded_contraction = first_char+expanded_contraction[1:]
     12         return expanded_contraction
     13 

TypeError: 'NoneType' object is not subscriptable

Why do I get a NoneType error even after checking for None values in my dataframe column?
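
The `if x is not None` filter only guards the input strings; the None in the traceback comes from the dictionary lookup inside expand_match. Because the pattern is compiled with `re.IGNORECASE`, it can match text whose casing appears among the keys neither as written nor lower-cased (for example "i'm" when the only key is "I'm"), and both `.get` calls then return None. A hedged sketch of one fix, using a made-up two-entry mapping rather than the real CONTRACTION_MAP, is to index the expansions by lower-cased key:

```python
import re

# Made-up two-entry mapping; note the capitalised key.
contraction_mapping = {"I'm": "I am", "can't": "cannot"}

# Index the expansions by lower-cased key so any casing can be found.
lookup = {key.lower(): value for key, value in contraction_mapping.items()}

# Longest keys first, so longer contractions are preferred over their prefixes.
pattern = re.compile(
    "({})".format("|".join(re.escape(k) for k in sorted(lookup, key=len, reverse=True))),
    flags=re.IGNORECASE,
)


def expand_contractions(text):
    def expand_match(match):
        found = match.group(0)
        # Fall back to the matched text if the key is somehow missing.
        return lookup.get(found.lower(), found)

    return pattern.sub(expand_match, text).replace("'", "")


result = expand_contractions("i'm sure you can't")
print(result)
```

Separately, filtering None values inside the list comprehension produces a list shorter than the column whenever a None is present, so assigning it back to df['Text_of_quest'] would then fail for a different reason.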

Chapter 9 error Tensorflow AttributeError: 'Sequential' object has no attribute 'predict_class' because predict_class() changed to predict()

AttributeError Traceback (most recent call last)
Input In [41], in <cell line: 3>()
1 # AFTER changing the code, the line runs without error:
----> 3 wtp_dnn_ypred = wtp_dnn_model.predict_class(wtp_test_SX)
4 wtp_dnn_ypred

AttributeError: 'Sequential' object has no attribute 'predict_class'

######################

I found a solution to this...

Original CODE FAILED due to a change in Tensorflow 2.6 in May-June 2022

Problem described here:

https://stackoverflow.com/questions/68836551/keras-attributeerror-sequential-object-has-no-attribute-predict-classes

wtp_dnn_ypred = wtp_dnn_model.predict_classes(wtp_test_SX)

This was changed to:

wtp_dnn_ypred = wtp_dnn_model.predict(wtp_test_SX)
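
Note that `predict` is not a drop-in replacement: it returns the network's raw outputs (probabilities), not class indices. Passing those floats on to le.inverse_transform is consistent with the "y contains previously unseen labels: [0.12855081 ...]" error reported in the other Chapter 9 issue. A minimal NumPy sketch of the conversion, assuming the wine-type model ends in a single sigmoid unit (an assumption about the architecture); the probabilities below are invented:

```python
import numpy as np

# Stand-in for wtp_dnn_model.predict(wtp_test_SX): shape (n_samples, 1).
probabilities = np.array([[0.13], [0.56], [0.61], [0.49]])

# Binary model with one sigmoid unit: threshold at 0.5 and flatten to 1D.
binary_classes = (probabilities > 0.5).astype("int32").ravel()

# Multi-class model with a softmax layer: pick the most probable column.
softmax_output = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
multi_classes = np.argmax(softmax_output, axis=-1)

print(binary_classes, multi_classes)
```

The flattened integer array is what a label encoder's inverse_transform expects, which should also avoid the column-vector warning.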

dipanjans / practical-machine-learning-with-python