Free Data Science eBooks - November 2017
The clocks fell back an hour last week, Halloween is behind us and there are only a few days until Bonfire Night - most of the autumn has already gone and the nights are getting colder and longer.
What better time to get comfy under a nice warm blanket with a hot mug of cocoa and a good book?
It's getting late in our Back To School series, but here are three free eBooks to help you on your educational journey and make those long nights just that bit shorter.
I hope these books prove to be a valuable resource to you and that you will visit regularly (and share with your friends on social media too).
If you haven't subscribed to our newsletter yet, why not subscribe using the form on the right - you'll be the very first to know when new resources are published.
Disclosure: as well as links to the free ebooks, we may also include links to non-free versions of the same books, and we may earn an affiliate commission for purchases you make when using those links.
You can find further details in our T&Cs.
This month we highlight 3 books:
- R for Data Science: Import, Tidy, Transform, Visualize, and Model Data
- Advanced Linear Models for Data Science
- A Probabilistic Theory of Pattern Recognition
They're all FREE, so help yourselves...
R for Data Science: Import, Tidy, Transform, Visualize, and Model Data
by Hadley Wickham and Garrett Grolemund
Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible.
Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You’ll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you’ve learned along the way.
You’ll learn how to:
- Wrangle – transform your datasets into a form convenient for analysis
- Program – learn powerful R tools for solving data problems with greater clarity and ease
- Explore – examine your data, generate hypotheses, and quickly test them
- Model – provide a low-dimensional summary that captures true “signals” in your dataset
- Communicate – learn R Markdown for integrating prose, code, and results
NOTE: This book is offered for free as a PDF only. If you would rather have a Kindle Edition or a paperback copy for your office or department bookshelf, you can get a paid copy from Amazon ← *big fat scary affiliate link!*
Advanced Linear Models for Data Science
by Brian Caffo
Linear models are the cornerstone of statistical methodology. Perhaps more than any other tool, advanced students of statistics, biostatistics, machine learning, data science, econometrics, etcetera should spend time learning the finer-grained details of this subject.
In this book, we give a brief but rigorous treatment of advanced linear models. It is advanced in the sense that it is at the level that an introductory PhD student in statistics or biostatistics would see. The material in this book is standard knowledge for any PhD in statistics or biostatistics.
Students will need a fair number of mathematical prerequisites before undertaking this class. First are multivariate calculus and linear algebra, especially linear algebra, since much of the early part of linear models is a direct application of linear algebra results in a statistical context. Some basic proof-based mathematics is also necessary to follow the proofs, as is some background in regression models and mathematical statistics.
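To give a flavour of what "linear algebra applied in a statistical context" means, here is a minimal sketch in plain Python. It is not taken from the book; the function name `fit_line` and the toy data are made up for illustration. It fits a straight line by solving the normal equations (X^T X) b = X^T y, which for one predictor is just a 2x2 linear system.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1 * x via the normal equations.

    Hypothetical helper for illustration only, not code from the book.
    """
    n = len(xs)
    # Entries of X^T X and X^T y for a design matrix with columns [1, x].
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Solve [[n, sx], [sx, sxx]] @ [b0, b1] = [sy, sxy] with Cramer's rule.
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("x values must not all be identical")
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


# Toy data lying exactly on y = 1 + 2x (made up for this example).
intercept, slope = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

With more than one predictor the same idea applies, but you would normally hand the matrix algebra to a numerical library rather than write it out by hand.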
NOTE: This ebook is currently only 50% complete at Leanpub, but a similar book you might like to check out at Amazon is Advanced Linear Modeling: Multivariate, Time Series, and Spatial Data ← *big fat scary affiliate link!*
A Probabilistic Theory of Pattern Recognition
by Luc Devroye, Laszlo Györfi and Gabor Lugosi
A self-contained and coherent account of probabilistic techniques, covering: distance measures, kernel rules, nearest neighbour rules, Vapnik-Chervonenkis theory, parametric classification, and feature extraction. Each chapter concludes with problems and exercises to further the reader's understanding. Both research workers and graduate students will benefit from this wide-ranging and up-to-date account of a fast-moving field.
Pattern recognition presents one of the most significant challenges for scientists and engineers, and many different approaches have been proposed.
The aim of this book is to provide a self-contained account of probabilistic analysis of these approaches. The book includes a discussion of distance measures, nonparametric methods based on kernels or nearest neighbors, Vapnik-Chervonenkis theory, epsilon entropy, parametric classification, error estimation, tree classifiers, and neural networks. Wherever possible, distribution-free properties and inequalities are derived. A substantial portion of the results or the analysis is new. Over 430 problems and exercises complement the material.
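If you have not met a nearest neighbour rule before, the simplest version is easy to sketch. The snippet below is an illustration of mine, not code from the book: the function name `nearest_neighbour_classify` and the toy points are invented, and it assumes Euclidean distance and Python 3.8+ (for `math.dist`). It labels a new point with the label of the closest training point.

```python
import math


def nearest_neighbour_classify(train, query):
    """Return the label of the training point closest to `query` (1-NN rule).

    `train` is a list of (point, label) pairs, where each point is a tuple
    of the same length as `query`. Hypothetical helper for illustration.
    """
    if not train:
        raise ValueError("training set must not be empty")
    # Pick the (point, label) pair with the smallest Euclidean distance.
    _, label = min(train, key=lambda pair: math.dist(pair[0], query))
    return label


# Toy training set with two well-separated groups (made up for this example).
train = [
    ((0.0, 0.0), "a"),
    ((0.0, 1.0), "a"),
    ((5.0, 5.0), "b"),
    ((6.0, 5.0), "b"),
]
prediction = nearest_neighbour_classify(train, (4.5, 4.0))
```

The book is concerned with how well rules like this perform in theory; the sketch only shows the rule itself.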
NOTE: This book is offered for free as a PDF only. If you would rather have a Kindle Edition or a hardback copy, you can get a paid copy from Amazon ← *big fat scary affiliate link!*
Share this content with your friends...
If you found this content interesting or useful, we would really appreciate it if you would:
- Share it on your favourite social media channel
- Link to this post from your own blog