English | 2014 | ISBN: 978-1-78398-024-6 | 448 Pages | PDF | 10 MB
As increasing amounts of data are generated each year, the need to analyze and operationalize that data is more important than ever. Companies that know what to do with their data will have a competitive advantage over companies that don’t, and this will drive greater demand for knowledgeable and competent data professionals.
Starting with the basics, this book covers how to set up your numerical programming environment, introduces you to the data science pipeline (an iterative process by which data science projects are completed), and guides you through several data projects in a step-by-step format. By working through the steps in each chapter in sequence, you will quickly become familiar with the process and learn how to apply it to a variety of situations, with examples in R and Python, two of the most popular programming languages for data analysis.
What You Will Learn
- Structure a data science project by using the data science pipeline
- Acquire and ingest data from files, data stores, and directly from the Web
- Clean, munge, and manipulate data into shape so that it is ready for analysis
- Explore the data and conduct analyses that draw insights from it
- Determine and apply the most appropriate model to your data
- Interpret the results of your analysis and modeling
- Communicate your results through a visualization, report, or application
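
To give a feel for how the stages listed above fit together, here is a minimal sketch in Python. It is not taken from the book: the inline data, the function names (`acquire`, `clean`, `analyze`, `communicate`, `run_pipeline`), and the choice of a per-city mean as the "analysis" are all assumptions made for illustration, and the modeling stage is left out to keep the example short.

```python
import csv
import io
import statistics

# Hypothetical inline data standing in for a file, data store, or web source.
RAW_CSV = """city,temp_c
Oslo,4.0
Cairo,30.5
Lima,
Oslo,6.0
Cairo,29.5
"""


def acquire(text):
    """Acquire: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Clean: drop rows with a missing temperature and convert types."""
    return [
        {"city": r["city"], "temp_c": float(r["temp_c"])}
        for r in rows
        if r["temp_c"].strip()
    ]


def analyze(rows):
    """Analyze: compute the mean temperature per city."""
    by_city = {}
    for r in rows:
        by_city.setdefault(r["city"], []).append(r["temp_c"])
    return {city: statistics.mean(temps) for city, temps in by_city.items()}


def communicate(summary):
    """Communicate: render the summary as a small plain-text report."""
    lines = [f"{city}: {mean:.1f} C" for city, mean in sorted(summary.items())]
    return "\n".join(lines)


def run_pipeline(text):
    """Chain the stages; in practice each stage is revisited iteratively."""
    return communicate(analyze(clean(acquire(text))))


if __name__ == "__main__":
    print(run_pipeline(RAW_CSV))
```

Keeping each stage as its own function makes it easy to swap one out, for example replacing `acquire` with a database query, without touching the rest of the chain.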