Data Science Live Book: An Intuitive and Practical Approach to Data Analysis, Data Preparation and Machine Learning, Suitable for All Ages! (B
Contributor(s): Casas, Pablo (Author)
ISBN: 9874269049     ISBN-13: 9789874269041
Publisher: Pablo Adrian Casas
OUR PRICE:   $17.10  
Product Type: Paperback
Published: March 2018
Additional Information
BISAC Categories:
- Computers | Data Modeling & Design
Physical Information: 0.69" H x 5.98" W x 9.02" (0.99 lbs) 306 pages
 
Descriptions, Reviews, Etc.
Publisher Description:
Welcome! It's a book to learn data science, machine learning and data analysis, with tons of examples and explanations around several topics like:
  • Exploratory data analysis
  • Data preparation
  • Selecting best variables
  • Model performance
Note: This is the Black & White version. Link to the color one (higher price): https://www.amazon.com/dp/9874273666.

Everything is related to everything
The book's premise is "everything is related to everything". This shows in the relationships across different sections: for example, choosing the right data type for a variable can be related to dealing with missing values, and vice versa.
In addition, some technical examples are related to "real-life" situations as well as philosophical concepts. The ultimate goal is to simplify the learning journey.

How is it organized?
It's a playbook full of data preparation recipes, using the open source R language. There are two types of examples: some teach general concepts around data analysis (like the information theory concept), while others show how to transform missing values, how to choose the correct data type, and the implications in each case, among others, using easy copy-paste pieces of code.
Please note that this is not a book to learn how to program R from scratch, nor how algorithms are implemented (math and stats area).

Who is the target audience?
It's aimed at programmers and data scientists who work, or want to work, in machine learning projects. However, those who don't want to code, or don't know how, can still get useful insights that add value as data project analysts.
All the R examples are well explained in code comments.
No math or statistical background is needed to understand it.
The book tries to be as tool-independent as possible. For example, the decision of what to do with missing or extreme values is the same whether we choose R, Python or Julia. What changes is the how.

Last words
Developing critical thinking, without taking any statement as the "true truth", is essential in this sea of books, courses, videos and other technical learning material. This book is just another view of the data science perspective. Hope you like it :)
Index

Exploratory data analysis

  • Profiling, The voice of the numbers
  • Correlation and Relationship
Data preparation
  • Handling Data Types
  • High Cardinality Variable in Descriptive Stats
  • High Cardinality Variable in Predictive Modeling
  • Treatment of outliers
  • Missing Data: Analysis, Handling, and Imputation of
  • Considerations involving time
Selecting best variables
  • General Aspects in Selecting Best Variables
  • Intuition
  • The "best" selection?
  • The nature of the selection
  • Improving variables
  • Cleaning by domain knowledge
  • Variables work in groups
  • Correlation between input variables
  • Keep it simple
  • Variable selection in Clustering?
  • Selecting the best variables in practice
  • Target profiling
Assessing Model performance
  • Knowing the Error
  • Out-of-Time Validation
  • Gain and Lift Analysis
  • Scoring Data
Appendix
  • The magic of percentiles
  • funModeling quick-start