Exploratory IoT Analysis with R


Description

[00:00] - Introduction
[04:41] - Create Virtual Machine
[07:19] - Install RStudio
[09:14] - Create New Script
[10:11] - Install Packages
[11:12] - Connect to SQL Azure
[16:08] - Statistical Summaries
[19:20] - Variance & Standard Deviation
[21:00] - Filters & Dot Plots
[22:32] - Box Plot
[22:57] - Removing Outliers
[26:18] - Time Series
[29:41] - Density Plots
[32:21] - Conclusion

Abstract:

In this video we perform an initial exploratory analysis of a water flow data set collected from a prototype that I built. The prototype consists of a water pump, a valve, and a flow meter. The data set is stored in SQL Azure. We use R and RStudio to run the analysis from an Azure virtual machine.

The Code:

#install the ODBC package once, then load it
install.packages("RODBC")
library(RODBC)
d <- odbcDriverConnect(connection = "Driver={SQL Server Native Client 11.0};Server=tcp:qvr9aar3i6.database.windows.net,1433;Database=TelemetryDB;Uid=drcrook@qvr9aar3i6;Pwd=David!2345;Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;")
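#list the columns of the WaterFlows table
sqlColumns(d, "WaterFlows")
#read the whole table into a data frame
flows <- as.data.frame(sqlFetch(d, "WaterFlows"))
#close the connection now that the data is in memory
odbcClose(d)
#summary statistics for every column
summary(flows)
#print the full data set, then only the WaterFlow column
flows
flows["WaterFlow"]
summary(flows["WaterFlow"])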
 
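#variance and standard deviation of the flow readings
var(flows$WaterFlow)
sd(flows$WaterFlow)

#row count and dot plot of all readings
nrow(flows["WaterFlow"])
plot(flows["WaterFlow"])
#same again, keeping only readings below 9
nrow(flows[flows$WaterFlow < 9, ])
plot(flows[flows$WaterFlow < 9, ]["WaterFlow"])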
 
#initial boxplot showing outliers
boxplot(
  x = flows$WaterFlow,
  xlab = "Flow/Minute",
  horizontal = TRUE
)
#outlier threshold: 1.5 standard deviations
outlierThreshold <- sd(flows$WaterFlow) * 1.5
#mean of all readings
m <- mean(flows$WaterFlow)
#keep readings above the lower bound (mean - threshold)
filtFlows <- flows[flows$WaterFlow > (m - outlierThreshold),]
#keep readings below the upper bound (mean + threshold)
filtFlows <- filtFlows[filtFlows$WaterFlow < (m + outlierThreshold),]
#compare summaries before and after filtering
summary(flows$WaterFlow)
summary(filtFlows$WaterFlow)
#new box plot, no outliers
boxplot(
  x = filtFlows$WaterFlow,
  xlab = "Flow/Minute",
  horizontal = TRUE
)
#line graph w/outliers
plot(
  y = flows$WaterFlow,
  x = flows$CollectionTime,
  type = "l"
)
#line graph no outliers
plot(
  y = filtFlows$WaterFlow,
  x = filtFlows$CollectionTime,
  type = "l"
)
 
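#time series of the readings; 6307200 is the number of 5-second intervals in a 365-day year
#named flowSeries rather than t so that base R's t() is not masked
flowSeries <- ts(flows$WaterFlow, frequency = 6307200)
plot(flowSeries)
 
#density plot with outliers
plot(density(flows$WaterFlow))
 
#density plot without outliers
plot(density(filtFlows$WaterFlow))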
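
The outlier filter in the code above keeps readings within 1.5 standard deviations of the mean. That is a different rule from the 1.5 * IQR whiskers that boxplot() draws by default, so the filtered box plot can still show a few points beyond the whiskers. The sketch below tries both rules on simulated data. The simulated numbers and the helper name keepWithinSd are assumptions made for illustration; they are not part of the original script or data set.

```r
#simulated stand-in for flows$WaterFlow (assumption: the real data is not reproduced here)
set.seed(42)
waterFlow <- c(rnorm(500, mean = 10, sd = 0.5), 0.2, 0.5, 25)

#hypothetical helper: keep values within k standard deviations of the mean
keepWithinSd <- function(x, k = 1.5) {
  m <- mean(x)
  threshold <- sd(x) * k
  x[x > (m - threshold) & x < (m + threshold)]
}

filtered <- keepWithinSd(waterFlow)

#values that boxplot() would still flag under its default 1.5 * IQR rule
remaining <- boxplot.stats(filtered)$out

#how many readings survive the filter, and how many boxplot() would still mark
length(waterFlow)
length(filtered)
length(remaining)
```

On the real data the same check would be boxplot.stats(filtFlows$WaterFlow)$out.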
 
 
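
The frequency value passed to ts() above equals the number of 5-second intervals in a 365-day year. One plausible reading, and it is only an assumption because the sampling rate is not stated here, is one reading every 5 seconds with a yearly cycle. A minimal sketch of that arithmetic, using simulated readings in place of flows$WaterFlow:

```r
#seconds in a 365-day year
secondsPerYear <- 365 * 24 * 60 * 60
#assumed sampling interval in seconds (not stated in the source)
sampleIntervalSeconds <- 5
samplesPerYear <- secondsPerYear / sampleIntervalSeconds
samplesPerYear

#simulated readings standing in for flows$WaterFlow
set.seed(1)
simulatedFlow <- rnorm(120, mean = 10, sd = 0.5)
flowSeries <- ts(simulatedFlow, frequency = samplesPerYear)

#with this frequency, one unit on the time axis is one year
frequency(flowSeries)
```

Calling plot(flowSeries) would then label the x axis in years, so a short capture spans only a tiny fraction of one unit.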
