Mastering Machine Learning With Python in Six Steps


Published: March 16, 2018



The book in...
One sentence:
A basic introduction to machine learning that includes a decent description of various methods but is decidedly lacking in executable code examples.

Five sentences:
Chapter 1 is, as usual in many of these kinds of books, a whirlwind introduction to Python that is safe to skip or skim. The next few chapters introduce machine learning's fundamentals: supervised versus unsupervised learning, various regression and classification techniques, as well as time series forecasting. The remainder of the book builds on these concepts with model diagnosis and tuning using probability; cleaning up and engineering the features of imbalanced or 'dirty' data; and introductions to variance, bias, hyperparameterization, grid/random searches, and ensemble training (chaining multiple algorithms together) - all important topics that add to an overall understanding of machine learning. Another area of much current interest is text mining, which can be employed in sentiment analysis or recommendation systems. Lastly, neural nets, what is often called deep learning, are covered in a less rigorous fashion, as the topic is extensive in and of itself and sits on the cutting edge of algorithm development.



Thoughts

After starting off on a great foot with a detailed table of contents and a chapter 1 that, while merely an overview of Python I breezed through, contained a few bits of knowledge concerning dicts and sets I wasn't aware of, things quickly took a turn for the worse.

First, the not-so-bad. The algorithms, feature engineering, and general concepts are presented fairly well. I learned quite a bit, this being the first machine learning book I have read. There are plenty of diagrams to supplement the descriptions of the various techniques.

With the good out of the way…

The errors piled up example after example. The most common 'error' was failing to include the necessary imports. I spent a good deal of time online trying to figure out what I needed to import before the code would even try to run. Trying to see a silver lining, I consoled myself with the knowledge that, by being forced to look up package after package, I was garnering a better understanding of what was available in the machine learning world.

After presenting a large section on pandas, numpy, and sklearn, the book takes another turn and starts using statsmodels - unannounced and with no introduction, I might add. As usual, it also does so without telling you how to import it (import statsmodels.api as sm).
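
For anyone following along, this is a minimal sketch of the sort of statsmodels call the book jumps into, with the missing import included; the data here is invented for illustration and is not from the book:

import numpy as np
import statsmodels.api as sm

# invented sample data: a noisy linear relationship
x = np.arange(50)
y = 2.5 * x + np.random.normal(0, 5, 50)

# statsmodels does not add an intercept column automatically
X = sm.add_constant(x)

# fit ordinary least squares and show the fitted coefficients
model = sm.OLS(y, X).fit()
print(model.summary())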

The function ‘plot_decision_regions()’ is used but never defined. I could not find the library it comes from, though I did manage to find a few variations online; they all had different parameterizations. Eventually I gave up on this one.
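
For the record, here is my best guess at a minimal version of that helper, pieced together from the variations I found online. It assumes a fitted sklearn-style classifier and two-feature data, and is emphatically not the book's missing definition:

import numpy as np
import matplotlib.pyplot as plt

def plot_decision_regions(X, y, classifier, resolution=0.02):
    # build a grid covering both feature dimensions
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, resolution),
                         np.arange(y_min, y_max, resolution))

    # classify every grid point and shade the resulting regions
    Z = classifier.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.4)

    # overlay the actual samples, colored by label
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor='black')
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
    plt.show()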

Last, and in this case certainly least egregious, was the English, which is at times quite… “odd”. This, of course, can be excused. The missing code can not.

In conclusion, this was one of the worst programming books I have ever read. It was absolutely infuriating to work with the code examples. The programming-independent machine learning information, on the other hand, was presented fairly clearly. In the end, I would not recommend it.

Interesting

One, somewhat random, observation that I would be interested to know more about is the fact that “clustering analysis origins can be traced to the area of Anthropology and Psychology in the 1930’s.” Given what my nontechnical research has revealed about these two fields, it might prove to be… insightful to understand the origins of clustering.

Code

This is the code I tinkered with; none of it should be considered usable in any way, and it is here merely for archival purposes.


Exceptional Excerpts

“Machine learning is a subfield of computer science that evolved from the study of pattern recognition…"

“As of 2008, the world’s servers processed 9.57 zeta-bytes (9.57 trillion gigabytes) of information, which is equivalent to 12 gigabytes of information per person per day, according to the ‘How Much Information? 2010 report on Enterprise Server Information.’"

“Supervised models such as linear and nonlinear regression techniques are useful to model patterns to predict continuous numerical data types. Whereas logistic regression, decision trees, SVM and kNN are useful to model classification problems (functions are available to use for regression as well). You also learned ARIMA, which is one of the key time-series forecasting models. Unsupervised techniques such as k-means and hierarchical clustering are useful to group similar items, whereas principal component analysis can be used to reduce a large dimension data to lower a dimension to enable efficient computation."


Table of Contents


· 01: Getting Started in Python

page 15:
page 49:
page 52:
# initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# use | operator
print "Union of A | B", A|B

# alternatively, we can use union()
A.union(B)

---- output ----
Union of A | B set([1, 2, 3, 4, 5, 6, 7, 8])
Listing 1-33.
# use & operator
print "Intersection of A & B", A & B

# alternatively, we can use intersection()
print A.intersection(B)

---- output ----
Intersection of A & B set([4, 5])
Listing 1-34.
page 53:
# use - operator on A
print "Difference of A - B", A - B

# alternatively, we can use difference()
print A.difference(B)

---- output ----
Difference of A - B set([1, 2, 3])
Listing 1-35.
# use ^ operator
print "Symmetric difference of A ^ B", A ^ B

# alternatively, we can use symmetric_difference()
A.symmetric_difference(B)

---- output ----
Symmetric difference of A ^ B set([1, 2, 3, 6, 7, 8])
Listing 1-36.
page 56:
dict.get(key, default=None)
# returns default (None) if key not present

dict[key]
# raises KeyError if key not present

dict.setdefault(key, default=None)
# returns dict[key] if present; otherwise inserts key with default and returns it
# like defaultdict?
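
To answer my own note: setdefault() is like a one-shot defaultdict. A quick sketch of the difference (my own example, not the book's):

from collections import defaultdict

words = ['apple', 'avocado', 'banana']

# setdefault inserts the key with the default only when it is missing
groups = {}
for w in words:
    groups.setdefault(w[0], []).append(w)

# defaultdict supplies the default automatically via a factory
groups2 = defaultdict(list)
for w in words:
    groups2[w[0]].append(w)

print(groups)         # {'a': ['apple', 'avocado'], 'b': ['banana']}
print(dict(groups2))  # same result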
page 64:
# Simple function to loop through positional arguments and print them
def sample_function(*args):
    for a in args:
        print a

# Call the function
sample_function(1,2,3)

---- output ----
1
2
3

Listing 1-47.
# Simple function to loop through keyword arguments and print them
def sample_function(**kwargs):
    for a in kwargs:
        print a, kwargs[a]

# Call the function
sample_function(name='John', age=27)

---- output ----
age 27
name John

Listing 1-48.

· 02: Introduction to Machine Learning

page 75:

page 76:
Observe – identify patterns using the data
Plan – find all possible solutions
Optimize – find optimal solution from the list of possible solutions
Action – execute the optimal solution
Learn and Adapt – is the result as expected? If not, adapt
page 77:
page 79:
page 82:
page 83:
page 85:
page 87:
page 88:

page 89:
page 90:
page 91:
page 92:
page 94:
page 97:
page 100:
page 101:
page 102:
page 109:
page 111:
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

# Find the elements of a that are bigger than 2
print (a > 2)

# to get the actual value
print a[a > 2]

---- output ----
[[False False]
[ True True]
[ True True]]
[3 4 5 6]

Listing 2-8.
Boolean array indexing
page 114:
x=np.array([[1,2],[3,4]])

# Compute sum of all elements
print np.sum(x)

# Compute sum of each column
print np.sum(x, axis=0)

# Compute sum of each row
print np.sum(x, axis=1)

# ---- output ----
10
[4 6]
[3 7]

Listing 2-11.
Sum function
page 115:
# create a matrix
a = np.array([[1,2,3], [4,5,6], [7,8,9]])

# create a vector
v = np.array([1, 0, 1])

# create an empty matrix with the same shape as a
b = np.empty_like(a)

# Add the vector v to each row of the matrix a with an explicit loop
for i in range(3):
    b[i, :] = a[i, :] + v
print b

# ---- output ----
[[ 2 2 4]
[ 5 5 7]
[ 8 8 10]]

Listing 2-13.
Broadcasting
# Stack 3 copies of v on top of each other
vv = np.tile(v, (3, 1))
print vv

# ---- output ----
[[1 0 1]
[1 0 1]
[1 0 1]]

# Add a and vv elementwise
b = a + vv
print b

# ---- output ----
[[ 2 2 4]
[ 5 5 7]
[ 8 8 10]]

Listing 2-14.
Broadcasting for large matrix
a = np.array([[1,2,3], [4,5,6], [7,8,9]])
v = np.array([1, 0, 1])

# Add v to each row of a using broadcasting
b = a + v
print b

# ---- output ----
[[ 2 2 4]
[ 5 5 7]
[ 8 8 10]]

Listing 2-15.
Broadcasting using NumPy
# Compute outer product of vectors
# v has shape (3,)
v = np.array([1,2,3])

# w has shape (2,)
w = np.array([4,5])

# To compute an outer product, we first reshape v to be a column
# vector of shape (3, 1); we can then broadcast it against w to yield
# an output of shape (3, 2), which is the outer product of v and w:
print np.reshape(v, (3, 1)) * w

# ---- output ----
[[ 4 5]
[ 8 10]
[12 15]]

# Add a vector to each row of a matrix
x = np.array([[1,2,3],[4,5,6]])

# x has shape (2, 3) and v has shape (3,) so they
# broadcast to (2, 3)
print x + v

# ---- output ----
[[2 4 6]
[5 7 9]]

# Add a vector to each column of a matrix
# x has shape (2, 3) and w has shape (2,).
# If we transpose x then it has shape (3, 2) and can be broadcast
# against w to yield a result of shape (3, 2); transposing this result
# yields the final result of shape (2, 3) which is the matrix x with
# the vector w added to each column
print (x.T + w).T

# ---- output ----
[[ 5 6 7]
[ 9 10 11]]

# Another solution is to reshape w to be a row vector of shape (2, 1);
# we can then broadcast it directly against x to produce the same
# output.
print x + np.reshape(w, (2, 1))

# ---- output ----
[[ 5 6 7]
[ 9 10 11]]

# Multiply a matrix by a constant:
# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3)
print x * 2

# ---- output ----
[[ 2 4 6]
[ 8 10 12]]

Listing 2-16.
Applications of broadcasting
page 118:
page 133:
page 152:

· 03: Fundamentals of Machine Learning

page 154:
page 157:
page 158:
Options for handling missing values (a rough sketch follows below):
Delete
Replace (mean/mode)
Random
Predictive model
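
A rough sketch of those four options using pandas, on an invented frame (my own illustration, not the book's code):

import numpy as np
import pandas as pd

# invented data with missing values
df = pd.DataFrame({'age': [25, np.nan, 31, 47],
                   'city': ['NY', 'SF', None, 'NY']})

# delete: drop rows containing a missing value
print(df.dropna())

# replace: fill numeric gaps with the mean, categorical with the mode
filled = df.copy()
filled['age'] = filled['age'].fillna(filled['age'].mean())
filled['city'] = filled['city'].fillna(filled['city'].mode()[0])
print(filled)

# random: sample a replacement from the observed values
rand = df.copy()
rand['age'] = rand['age'].fillna(np.random.choice(rand['age'].dropna()))
print(rand)

# predictive model: train a model on the complete rows to impute the rest
# (e.g., sklearn's IterativeImputer or a simple LinearRegression)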
page 161:
page 177:
page 211:
page 226:
page 230:
page 233:
page 236:
page 247:
page 249:
page 258:
page 265:


· 04: Model Diagnosis and Tuning

page 266:
page 271:
page 272:
page 275:
page 277:
page 278:
page 279:
page 290:
page 293:

· 05: Text Mining and Recommender Systems

page 318:
page 319:
page 320:
page 366:

· 06: Deep and Reinforcement Learning

page 376:
page 377:
page 380:
page 383:
page 385:
page 422:
page 423:
page 424:
page 434: