TY - BOOK
AU - Minteer, Andrew
TI - Analytics for the internet of things (IoT) : intelligent analytics for your intelligent devices
SN - 9781787120730
U1 - 004.678
PY - 2017///
CY - Birmingham
PB - Packt Pub.
KW - Internet of things
KW - Mobile computing
KW - Information visualization
N1 - Preface Chapter 1: Defining IoT Analytics and The situation Defining IoT analytics Defining analytics Defining the Internet of Things The concept of constrained IoT analytics challenges The data volume Problems with time Problems with space Data quality Analytics challenges Business value concerns Summary Chapter 2: IoT Devices and Networking Protocols IoT devices The wild world of IoT devices Healthcare Manufacturing Transportation and logistics Retail Oil and gas Home automation or monitoring Wearables Sensor types Networking basics IoT networking connectivity protocols Connectivity protocols (when the available power is limited) Bluetooth Low Energy (also called Bluetooth Smart) 6LoWPAN ZigBee Advantages of ZigBee Disadvantages of ZigBee Common use cases NFC Common use cases Sigfox Connectivity protocols (when power is not a problem) Wi-Fi Common use cases Cellular (4G/LTE) Common use cases IoT networking data messaging protocols Message Queue Telemetry Transport (MQTT) Topics Advantages to MQTT Disadvantages to MQTT QoS levels QoS 0 QoS 1 QoS 2 Last Will and Testament (LWT) Tips for analytics Common use cases Hyper-Text Transport Protocol (HTTP) Representational State Transfer (REST) principles HTTP and IoT Advantages to HTTP Disadvantages to HTTP Constrained Application Protocol (CoAP) Advantages to CoAP Disadvantages to CoAP Message reliability Common use cases Data Distribution Service (DDS) Common use cases Analyzing data to infer protocol and device characteristics Summary. Chapter 3: IoT Analytics for the Cloud Building elastic analytics What is cloud infrastructure?
Elastic analytics concepts Design with the endgame in mind Designing for scale Decouple key components Encapsulate analytics Decoupling with message queues Distributed computing Avoid containing analytics to one server When to use distributed and when to use one server Assuming that change is constant Leverage managed services Use Application Programming Interfaces (API) Cloud security and analytics Public/private keys Public versus private subnets Access restrictions Securing customer data The AWS overview AWS key concepts Regions Availability Zones Subnet Security groups AWS key core services Virtual Private Cloud (VPC) Identity and Access Management (IAM) Elastic Compute (EC2) Simple Storage Service (S3) AWS key services for IoT analytics Amazon Simple Queue Service (SQS) Amazon Elastic Map Reduce (EMR) AWS machine learning Amazon Relational Database Service (RDS) Amazon Redshift Microsoft Azure overview Azure Data Lake Store Azure Analysis Services HDInsight The R server option The ThingWorx overview ThingWorx Core ThingWorx Connection Services ThingWorx Edge ThingWorx concepts Thing templates Things Properties Services Events Thing shapes Data shapes Entities Summary Chapter 4: Creating an AWS Cloud Analytics Environment The AWS CloudFormation overview The AWS Virtual Private Cloud (VPC) setup walk-through Creating a key pair for the NAT and bastion instances Creating an S3 bucket to store data Creating a VPC for IoT Analytics What is a NAT gateway? What is a bastion host? Your VPC architecture The VPC Creation walk-through. How to terminate and clean up the environment Summary Chapter 5: Collecting All That Data Strategies and Techniques Designing data processing for analytics Amazon Kinesis AWS Lambda AWS Athena The AWS IoT platform Microsoft Azure IoT Hub Applying big data technology to storage Hadoop Hadoop cluster architectures What is a Node? 
Node types Hadoop Distributed File System Parquet Avro Hive Serialization/Deserialization (SerDe) Hadoop MapReduce Yet Another Resource Negotiator (YARN) HBase Amazon DynamoDB Amazon S3 Apache Spark for data processing What is Apache Spark? Spark and big data analytics Thinking about a single machine versus a cluster of machines Using Spark for IoT data processing To stream or not to stream Lambda architectures Handling change Summary Chapter 6: Getting to Know Your Data Exploring IoT Data Exploring and visualizing data The Tableau overview Techniques to understand data quality Look at your data au naturel Data completeness Data validity Assessing Information Lag Representativeness Basic time series analysis What is meant by time series? Applying time series analysis Get to know categories in the data Bring in geography Look for attributes that might have predictive value R (the pirate's language ... if he was a statistician) Installing R and RStudio Using R for statistical analysis Summing it all up Solving industry-specific analysis problems Manufacturing Healthcare Retail Summary Chapter 7: Decorating Your Data Adding External Datasets to Innovate Adding internal datasets Which ones and why? Customer information Production data Field services Financial Adding external datasets External datasets geography Elevation. SRTM elevation National Elevation Dataset (NED) Weather Geographical features Planet.osm Google Maps API USGS national transportation datasets External datasets demographic The U.S. 
Census Bureau CIA World Factbook External datasets economic Organization for Economic Cooperation and Development (OECD) Federal Reserve Economic Data (FRED) Summary Chapter 8: Communicating with Others Visualization and Dashboarding Common mistakes when designing visuals The Hierarchy of Questions method The Hierarchy of Questions method overview Developing question trees Pulling together the data Aligning views with question flows Designing visual analysis for IoT data Using layout positioning to convey importance Use color to highlight important data The impact of using a single color to communicate importance Be consistent across visuals Make charts easy to interpret Creating a dashboard with Tableau The dashboard walk-through Hierarchy of Questions example Aligning visuals to the thought process Creating individual views Assembling views into a dashboard Creating and visualizing alerts Alert principles Organizing alerts using a Tableau dashboard Summary Chapter 9: Applying Geospatial Analytics to IoT Data Why do you need geospatial analytics for IoT? The basics of geospatial analysis Welcome to Null Island Coordinate Reference Systems The Earth is not a ball Vector-based methods The bounding box Contains Buffer Dilation and erosion Simplify Vector summary Raster-based methods Storing geospatial data File formats Spatial extensions for relational databases Storing geospatial data in HDFS Spatial indexing R-tree Processing geospatial data Geospatial analysis software ArcGIS QGIS. Ogr2ogr PostGIS spatial functions Geospatial analysis in the big data world Solving the pollution reporting problem Summary Chapter 10: Data Science for IoT Analytics Machine learning (ML) What is machine learning? 
Representation Evaluation Optimization Generalization Feature engineering with IoT data Dealing with missing values Centering and scaling Time series handling Validation methods Cross-validation Test set Precision, recall, and specificity Understanding the bias-variance tradeoff Bias Variance Trade-off and complexity Comparing different models to find the best fit using R ROC curves Area Under the Curve (AUC) Random forest models using R Random forest key concepts Random forest R examples Gradient Boosting Machines (GBM) using R GBM key concepts The Gradient Boosting Machines R example Ensemble Anomaly detection using R Forecasting using ARIMA Using R to forecast time series IoT data Deep learning Use cases for deep learning with IoT data A Nickel Tour of deep learning Setting up TensorFlow on AWS Summary Chapter 11: Strategies to Organize Data for Analytics Linked Analytical Datasets Analytical datasets Building analytic datasets Linking together datasets Managing data lakes When data lakes turn into data swamps Data refineries Developing a progression process The data retention strategy Goals Retention strategies for IoT data Reducing accessibility Reducing the number of fields Reduce the number of records The retention strategy example Summary Chapter 12: The Economics of IoT Analytics The economics of cloud computing and open source Variable versus fixed costs The option to quit Cloud costs can escalate quickly Monitoring cloud billing closely
N2 - We start with the perplexing task of extracting value from huge amounts of barely intelligible data. The data takes a convoluted route just to be on the servers for analysis, but insights can emerge through visualization and statistical modeling techniques. You will learn to extract value from IoT big data using multiple analytic techniques. Next we review how IoT devices generate data and how the information travels over networks.
You'll get to know strategies to collect and store the data to optimize the potential for analytics, and strategies to handle data quality concerns. Cloud resources are a great match for IoT analytics, so Amazon Web Services, Microsoft Azure, and PTC ThingWorx are reviewed in detail next. Geospatial analytics is then introduced as a way to leverage location information. Combining IoT data with environmental data is also discussed as a way to enhance predictive capability. We'll also review the economics of IoT analytics and you'll discover ways to optimize business value. By the end of the book, you'll know how to handle scale for both data storage and analytics, how Apache Spark can be leveraged to handle scalability, and how R and Python can be used for analytic modeling.
ER -