Exploratory IoT Analysis with R

Description

[00:00] - Introduction
[04:41] - Create Virtual Machine
[07:19] - Install RStudio
[09:14] - Create New Script
[10:11] - Install Packages
[11:12] - Connect to SQL Azure
[16:08] - Statistical Summaries
[19:20] - Variance & Standard Deviation
[21:00] - Filters & Dot Plots
[22:32] - Box Plot
[22:57] - Removing Outliers
[26:18] - Time Series
[29:41] - Density Plots
[32:21] - Conclusion

Abstract:

In this video we do an initial exploratory analysis of a water flow data set that came from a prototype I built. The prototype consists of a water pump, a valve, and a flow meter. The data set is stored in SQL Azure. We use R and RStudio to perform the analysis from an Azure virtual machine.
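
Because the data set lives in SQL Azure, the script has to supply a server, database, user name, and password. The sketch below shows one possible way to assemble the same style of connection string from environment variables, so the password stays out of the script. The helper `buildConnectionString` and the `TELEMETRY_DB_*` variable names are hypothetical and not part of the video; the `odbcDriverConnect` and `odbcClose` calls are commented out so the snippet runs without a database.

```r
# hypothetical helper: builds an ODBC connection string for SQL Server
buildConnectionString <- function(server, database, uid, pwd) {
  paste0(
    "Driver={SQL Server Native Client 11.0};",
    "Server=tcp:", server, ",1433;",
    "Database=", database, ";",
    "Uid=", uid, ";",
    "Pwd=", pwd, ";",
    "Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;"
  )
}

# assumed environment variable names; Sys.getenv() returns "" when one is unset
connStr <- buildConnectionString(
  server   = Sys.getenv("TELEMETRY_DB_SERVER"),
  database = Sys.getenv("TELEMETRY_DB_NAME"),
  uid      = Sys.getenv("TELEMETRY_DB_USER"),
  pwd      = Sys.getenv("TELEMETRY_DB_PASSWORD")
)

# with RODBC installed and the variables set, the connection would be opened
# and later closed like this:
# d <- RODBC::odbcDriverConnect(connection = connStr)
# RODBC::odbcClose(d)
```

The variables could be set in the session with `Sys.setenv()` or in a `.Renviron` file that is kept out of source control.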

The Code:

#install RODBC (only needed once per machine), then load it
install.packages("RODBC")
library(RODBC)
d <- odbcDriverConnect(connection = "Driver={SQL Server Native Client 11.0};Server=tcp:qvr9aar3i6.database.windows.net,1433;Database=TelemetryDB;Uid=drcrook@qvr9aar3i6;Pwd=David!2345;Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;")
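#inspect the table's columns, then pull the whole table into a data frame
sqlColumns(d, "WaterFlows")
flows <- sqlFetch(d, "WaterFlows")
#statistical summaries: every column, then WaterFlow only
summary(flows)
flows
flows["WaterFlow"]
summary(flows["WaterFlow"])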
 
#variance and standard deviation
var(flows$WaterFlow)
sd(flows$WaterFlow)
 
#row counts and dot plots, for all readings and for readings below 9
nrow(flows)
plot(flows["WaterFlow"])
nrow(flows[flows$WaterFlow < 9, ])
plot(flows[flows$WaterFlow < 9, ]["WaterFlow"])
 
#initial boxplot showing outliers
boxplot(
  x = flows$WaterFlow,
  xlab = "Flow/Minute",
  horizontal = TRUE
)
#outlier threshold: 1.5 standard deviations
outlierThreshold <- sd(flows$WaterFlow) * 1.5
#calculate mean
m <- mean(flows$WaterFlow)
#drop readings more than one threshold below the mean
filtFlows <- flows[flows$WaterFlow > (m - outlierThreshold),]
#drop readings more than one threshold above the mean
filtFlows <- filtFlows[filtFlows$WaterFlow < (m + outlierThreshold),]
summary(flows$WaterFlow)
summary(filtFlows$WaterFlow)
#new box plot, no outliers
boxplot(
  x = filtFlows$WaterFlow,
  xlab = "Flow/Minute",
  horizontal = TRUE
)
#line graph w/outliers
plot(
  y = flows$WaterFlow,
  x = flows$CollectionTime,
  type = "l"
)
#line graph no outliers
plot(
  y = filtFlows$WaterFlow,
  x = filtFlows$CollectionTime,
  type = "l"
)
 
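#time series; 6307200 is the number of 5-second intervals in a 365-day year
#named flowSeries rather than t to avoid masking base R's t() (transpose)
flowSeries <- ts(flows$WaterFlow, frequency = 6307200)
plot(flowSeries)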
 
#density plot w/outliers
plot(density(flows$WaterFlow))
 
#density plot no outliers
plot(density(filtFlows$WaterFlow))
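
The outlier step in the listing keeps readings within 1.5 standard deviations of the mean. As a rough comparison, the sketch below applies that rule next to the rule `boxplot()` itself uses to flag points (via `boxplot.stats`, which marks values more than 1.5 times the interquartile range beyond the hinges). The readings are synthetic, generated only for illustration and not taken from the WaterFlows table, and the variable names are hypothetical.

```r
#synthetic readings: 200 values near 10, plus three injected extremes
set.seed(42)
waterFlow <- c(rnorm(200, mean = 10, sd = 0.5), 0.2, 0.1, 25)

#rule from the listing: keep values within mean +/- 1.5 standard deviations
m <- mean(waterFlow)
threshold <- 1.5 * sd(waterFlow)
keptSd <- waterFlow[waterFlow > (m - threshold) & waterFlow < (m + threshold)]

#alternative: drop exactly the points boxplot() would draw as outliers
flagged <- boxplot.stats(waterFlow)$out
keptBox <- waterFlow[!(waterFlow %in% flagged)]

#compare how many readings each rule keeps
cat("all:", length(waterFlow),
    "| mean +/- 1.5 sd:", length(keptSd),
    "| boxplot rule:", length(keptBox), "\n")
```

One design point to keep in mind: the mean and standard deviation are themselves pulled around by extreme readings, and on roughly normal data without extremes a 1.5 standard deviation cutoff would discard about 13% of ordinary values, so it may be worth comparing both rules before settling on one.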
 
 
