On this page we highlight some books that you might find worthwhile to read. If you know of any other books that we should highlight on the MIIA website or via our real-time community messaging platform (MIIA on Slack), please let us know either on Slack or via firstname.lastname@example.org.
Advances in both theory and practice are throwing the promise of machine learning into sharp relief. The field has the potential to transform a range of industries, from self-driving cars to intelligent business applications. Yet machine learning is so complex and wide-ranging that even its definition can change from one person to the next.
Data Scientists at Work
A collection of interviews with 16 of the world's most influential and innovative data scientists from across the spectrum of this hot new profession - from Yann LeCun at Facebook, to Daniel Tunkelang at LinkedIn, to Caitlin Smallwood at Netflix, to Jake Porway at DataKind and more ...
Street-Fighting Mathematics: The Art of Educated Guessing and Opportunistic Problem Solving
Data Driven: Profiting from Your Most Important Business Asset
Competing on Analytics: The New Science of Winning
Data Analysis with Open Source Tools
Data Source Handbook
Who's #1?: The Science of Rating and Ranking
Doing Data Science: Straight Talk from the Frontline
Data Smart: Using Data Science to Transform Information into Insight
Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking
An Introduction to Statistical Learning: with Applications in R
Data Analysis Using Regression and Multilevel/Hierarchical Models
Statistics As Principled Argument
A Handbook of Statistical Analyses Using R, Second Edition
Mathematical Statistics and Data Analysis (with CD Data Sets)
Speech and Language Processing, 2nd Edition
Foundations of Statistical Natural Language Processing
Natural Language Processing with Python
Graph-based Natural Language Processing and Information Retrieval
Natural Language Processing for Online Applications: Text retrieval, extraction and categorization. Second revised edition
1. Overviews and theories – the ideas behind the Big Data revolution, mostly written for any audience regardless of technical ability.
The Human Face of Big Data, created by Rick Smolan and Jennifer Erwitt
Rather than a formulaic textbook, this book talks the reader through the ideas and applications of Big Data through a series of essays and photographs. It pays particular attention to humanizing the story – showing how the technologies being discussed are affecting the lives of real people around the world. The essays come from a range of authors noted for their thoughts on the impact of technology and data on society.
Big Data: A Revolution that will Transform how we Live, Work and Think
By Viktor Mayer-Schonberger and Kenneth Cukier.
This book aims to examine the social impact of the ever-growing amount of data we are collecting, storing and analyzing, as well as providing the reader with a practical toolkit for surviving and thriving in a Big Data world.
Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie or Die
By Eric Siegel
Referred to as “The Freakonomics of Big Data”, this book is written for any audience regardless of technical expertise and explores the many ways in which data analysis seems to be giving us the chance to predict, and therefore change, the future. Author Siegel is the founder and editor of the Predictive Analytics Times.
Pattern Recognition and Machine Learning
By Christopher Bishop
This book assumes no prior knowledge of the subject matter, but readers with some intermediate knowledge of mathematics, such as linear algebra and calculus, will find it easier going than those without. It explains and illustrates the way data scientists are introducing Bayesian algorithms to enable computers to make decisions more quickly and reliably than any human ever could.
Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World 1st Edition
By Bruce Schneier
Every day we are being watched and recorded by governments as well as corporations, hell-bent on collecting as much information about us as they can. But why? What do they want? And how can we make sure that the benefits we gain from living in an increasingly digitized and data-centred world outweigh the freedom and anonymity we are sacrificing? This book provides answers to these questions.
Smart Cities - Big Data, Civic Hackers, and the Quest for a New Utopia
By Anthony M. Townsend
An examination of how the datafication of urban spaces and services is changing the way we live in cities, and how what we are starting to see now (in cities such as Chicago; Zaragoza, Spain; and Milton Keynes, UK) is only the beginning.
2. Practical use – Books which explain specific technical skills, not always suited to beginners
Hadoop, the Definitive Guide
By Tom White
The elephant in the room that everyone is talking about. This practical guide to Hadoop is aimed at programmers and data scientists who want to get started using the Hadoop distributed Big Data framework for analytics and predictive modelling.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
By Trevor Hastie, Robert Tibshirani, Jerome Friedman
This is a great book which looks a little deeper into the science behind the theories. You won’t need a maths degree but it goes into some depth on the statistical theories and concepts behind machine learning and predictive algorithms.
MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems
By Donald Miner
An overview, along with example code, of building MapReduce patterns for use in Big Data and analytical projects. The book was written with the aim of bringing all the disparate information on the subject together from the academic research papers, online communities and blogs where it has evolved.
Python for Data Analysis
By Wes McKinney
There are lots of free courses online which can teach you Python, but as mentioned in the intro, you sometimes just can’t beat a well written and structured book. Python is one of the most popular programming languages for handling data and creating predictive algorithms, and this book explains in detail how to apply it to Big Data tasks.
Practical Data Science with R
By Nina Zumel and John Mount
The basic principles along with real-world case studies showing the many applications of R in statistical modelling and predictive analytics. Not for total R beginners – the emphasis is on explaining how the language can be applied to creating algorithms for data analysis, rather than teaching a beginner to code in R, but most people with a basic understanding of computer programming principles should be able to follow it.
3. Miscellaneous – books covering the dark side of Big Data, hobbyist applications and specific applications.
Future Crimes
By Marc Goodman
If you have difficulty sleeping due to thoughts of burglars analyzing social media to determine the best time to break into your house, or hacking your baby monitor to spy on your family, you might want to give this one a miss. An examination of the many ways criminals are taking advantage of our always-connected society.
Internet of Things – Home Projects for Raspberry Pi, Arduino and BeagleBone Black
By Donald Norris
Fancy having a go at building your own IoT home lighting, security or environmental control system? This book will show you how to put together the hardware using cheap microcontrollers and off-the-shelf components, and explain the programming needed to make it all work.
Building Data Science Teams
By DJ Patil
Written by the US Chief Data Scientist and currently a free ebook download at Amazon, this book looks at the mix of skills business leaders need to harness to make the most of analytics in their organizations.
Visualize This: The FlowingData Guide to Design, Visualization, and Statistics 1st Edition
By Nathan Yau
Explains the principles of visual storytelling with Big Data: how to set goals, how to separate what you need to explain from what is just noise, and how to express your results creatively in a way that will get the attention of your intended audience.