On this page we highlight some books that you might find worthwhile to read. If you know of any other books that we should highlight on the MIIA website or via our real-time community messaging platform (MIIA on Slack), please let us know either on Slack or via info@machineintelligenceafrica.org.

By David Beyer

Publisher: O'Reilly

Released: March 2016

Advances in both theory and practice are throwing the promise of machine learning into sharp relief. The field has the potential to transform a range of industries, from self-driving cars to intelligent business applications. Yet machine learning is so complex and wide-ranging that even its definition can change from one person to the next.

Data Scientists at Work

A collection of interviews with 16 of the world's most influential and innovative data scientists from across the spectrum of this hot new profession - from Yann LeCun at Facebook, to Daniel Tunkelang at LinkedIn, to Caitlin Smallwood at Netflix, to Jake Porway at DataKind and more ...

Street-Fighting Mathematics: The Art of Educated Guessing and Opportunistic Problem Solving

Data Driven: Profiting from Your Most Important Business Asset

Competing on Analytics: The New Science of Winning

Data Analysis with Open Source Tools

Data Source Handbook

Who's #1?: The Science of Rating and Ranking

Doing Data Science: Straight Talk from the Frontline

Data Smart: Using Data Science to Transform Information into Insight

Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking

An Introduction to Statistical Learning: with Applications in R

Data Analysis Using Regression and Multilevel/Hierarchical Models

Statistics As Principled Argument

A Handbook of Statistical Analyses Using R, Second Edition

Mathematical Statistics and Data Analysis (with CD Data Sets)

Pattern Recognition and Machine Learning

Bayesian Reasoning and Machine Learning

Machine Learning: A Probabilistic Perspective

The LION Way: Learning plus Intelligent Optimization

Speech and Language Processing, 2nd Edition

Foundations of Statistical Natural Language Processing

Natural Language Processing with Python

Graph-based Natural Language Processing and Information Retrieval

Natural Language Processing for Online Applications: Text retrieval, extraction and categorization. **Second revised edition**

Visualizing Data: Exploring and Explaining Data with the Processing Environment

The Visual Display of Quantitative Information

Mining of Massive Datasets

Data-Intensive Text Processing with MapReduce

Data Mining with R: Learning with Case Studies

The Art of R Programming: A Tour of Statistical Software Design

R Graphics Cookbook

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython

Learning Python, 5th Edition

Learning Spark: Lightning-fast big data analytics

Fast Data Processing with Spark

Hadoop: The Definitive Guide

Programming Pig

Practical Cassandra: A Developer's Approach

**1. Overviews and theories – the ideas behind the Big Data revolution, mostly written for any audience regardless of technical ability.**

*The Human Face of Big Data, created by Rick Smolan and Jennifer Erwitt*

http://www.amazon.com/The-Human-Face-Big-Data/dp/1454908270

Rather than a formulaic textbook, this book talks the reader through the ideas and applications of Big Data through a series of essays and photographs. It pays particular attention to humanizing the story – showing how the technologies being discussed are affecting the lives of real people around the world. The essays come from a range of authors noted for their thoughts on the impact of technology and data on society.

*Big Data: A Revolution that will Transform how we Live, Work and Think*

By Viktor Mayer-Schönberger and Kenneth Cukier

http://www.amazon.com/Big-Data-Revolution-Transform-Think/dp/054422...

This book aims to examine the social impact of the ever-growing amount of data we are collecting, storing and analyzing, as well as providing the reader with a practical toolkit for surviving and thriving in a Big Data world.

*Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie or Die*

By Eric Siegel

http://www.amazon.com/Predictive-Analytics-Power-Predict-Click/dp/1...

Referred to as “The Freakonomics of Big Data”, this book is written for any audience regardless of technical expertise and explores the many ways in which data analysis seems to be giving us the chance to predict, and therefore change, the future. Author Siegel is the founder and editor of the Predictive Analytics Times.

*Pattern Recognition and Machine Learning*

By Christopher Bishop

http://www.amazon.com/Pattern-Recognition-Learning-Information-Stat...

This book assumes no prior knowledge of the subject matter, but readers with some intermediate knowledge of mathematics, such as linear algebra and calculus, will find it easier going than those without. It explains and illustrates the way data scientists are introducing Bayesian algorithms to enable computers to make decisions more quickly and reliably than any human ever could.

*Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World*

By Bruce Schneier

http://www.amazon.com/Data-Goliath-Battles-Collect-Control/dp/03932...

Every day we are being watched and recorded, by governments as well as corporations, hell-bent on collecting as much information about us as they can. But why? What do they want? And, how can we make sure that the benefits we gain from living in an increasingly digitized and data-centred world outweigh the freedom and anonymity we are sacrificing? This book provides answers to these questions.

*Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia*

By Anthony M. Townsend

http://www.amazon.com/Smart-Cities-Civic-Hackers-Utopia/dp/0393349780/

An examination of how the datafication of urban spaces and services is changing the way we live in cities, and how what we are starting to see now, in cities such as Chicago; Zaragoza, Spain; and Milton Keynes, UK, is only the beginning.

**2. Practical use – books which explain specific technical skills, not always suited to beginners.**

*Hadoop: The Definitive Guide*

By Tom White

http://www.amazon.com/Hadoop-Definitive-Guide-Tom-White/dp/1491901632/

The elephant in the room that everyone is talking about. This practical guide to Hadoop is aimed at programmers and data scientists who want to get started using the Hadoop distributed Big Data framework for analytics and predictive modelling.

*The Elements of Statistical Learning: Data Mining, Inference, and Prediction*

By Trevor Hastie, Robert Tibshirani, Jerome Friedman

http://www.amazon.com/Elements-Statistical-Learning-Prediction-Stat...

This is a great book which looks a little deeper into the science behind the theories. You won’t need a maths degree but it goes into some depth on the statistical theories and concepts behind machine learning and predictive algorithms.

*MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems*

By Donald Miner

http://www.amazon.com/MapReduce-Design-Patterns-Effective-Algorithm...

An overview, along with example code, of building MapReduce patterns for use in Big Data and analytical projects. The book was written with the aim of bringing all the disparate information on the subject together from the academic research papers, online communities and blogs where it has evolved.

*Python for Data Analysis*

By Wes McKinney

http://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/144...

There are lots of free courses online which can teach you Python, but sometimes you just can’t beat a well-written and structured book. Python is one of the most popular programming languages for handling data and creating predictive algorithms, and this book explains in detail how to apply it to Big Data tasks.

*Practical Data Science with R*

By Nina Zumel and John Mount

https://www.manning.com/books/practical-data-science-with-r

The basic principles along with real-world case studies showing the many applications of R in statistical modelling and predictive analytics. Not for total R beginners – the emphasis is on explaining how the language can be applied to creating algorithms for data analysis, rather than teaching a beginner to code in R, but most people with a basic understanding of computer programming principles should be able to follow it.

**3. Miscellaneous – books covering the dark side of Big Data, hobbyist applications and specific applications.**

*Future Crimes*

By Marc Goodman

http://www.amazon.com/Future-Crimes-Everything-Connected-Vulnerable...

If you have difficulty sleeping due to thoughts of burglars analyzing social media to determine the best time to break into your house, or hacking your baby monitor to spy on your family, you might want to give this one a miss. An examination of the many ways criminals are taking advantage of our always-connected society.

*Internet of Things – Home Projects for Raspberry Pi, Arduino and BeagleBone Black*

By Donald Norris

http://www.amazon.com/Internet-Things-Do---Yourself-BeagleBone/dp/0...

Fancy having a go at building your own IoT home lighting, security or environmental control system? This book will show you how to put together the hardware using cheap microcontrollers and off-the-shelf components, and explain the programming needed to make it all work.

*Building Data Science Teams*

By DJ Patil

http://www.amazon.com/Building-Data-Science-Teams-Patil-ebook/dp/B0...

Written by the US Chief Data Scientist and currently a free ebook download at Amazon, this book looks at the mix of skills business leaders need to harness to make the most of analytics in their organizations.

*Visualize This: The FlowingData Guide to Design, Visualization, and Statistics*

By Nathan Yau

http://www.amazon.com/Visualize-This-FlowingData-Visualization-Stat...

Explains the principles of visual storytelling with Big Data: how to set goals regarding what you need to explain and what is just noise, and how to creatively express your results in a way that will get the attention of your intended audience.