WEBVTT

00:00:00.000 --> 00:00:10.530
[MUSIC]

00:00:10.530 --> 00:00:13.170
>> Hey everyone. Welcome to
this episode of Data Exposed.

00:00:13.170 --> 00:00:15.240
I'm Travis Wright, Group
Product Manager for

00:00:15.240 --> 00:00:18.435
the SQL Server and Azure data
engineering team at Microsoft.

00:00:18.435 --> 00:00:22.335
Today I'm excited to introduce
to you SQL Server 2019,

00:00:22.335 --> 00:00:24.945
the most recently released SQL Server.

00:00:24.945 --> 00:00:28.515
SQL Server is celebrating its
25th anniversary this year.

00:00:28.515 --> 00:00:31.830
That's quite a while. If I look back
on the early days of my career,

00:00:31.830 --> 00:00:34.230
I started on SQL Server 2000.

00:00:34.230 --> 00:00:36.300
In that 25-year history,

00:00:36.300 --> 00:00:38.490
SQL Server has really
come a long way.

00:00:38.490 --> 00:00:40.050
It's really expanded to meet

00:00:40.050 --> 00:00:42.030
the needs of our
customers over time as

00:00:42.030 --> 00:00:44.390
the different types of data
that customers need to

00:00:44.390 --> 00:00:47.060
collect and process
and query has changed,

00:00:47.060 --> 00:00:49.310
and as there's been more
and different kinds of

00:00:49.310 --> 00:00:51.965
database engine requirements
that have come along.

00:00:51.965 --> 00:00:54.470
So let's take a trip back
down memory lane for

00:00:54.470 --> 00:00:57.515
a moment and just look at where
SQL Server has come from,

00:00:57.515 --> 00:00:59.390
and then we'll take a look
at where SQL Server is

00:00:59.390 --> 00:01:02.515
going next with SQL Server 2019.

00:01:02.515 --> 00:01:05.350
Let's start with SQL Server 2008.

00:01:05.350 --> 00:01:07.295
SQL Server 2008 is actually

00:01:07.295 --> 00:01:09.995
out of extended support
just this year.

00:01:09.995 --> 00:01:14.390
If you fast forward a bit to look
at SQL Server 2012 and 2014,

00:01:14.390 --> 00:01:17.870
we really made some big improvements
in terms of performance and

00:01:17.870 --> 00:01:19.880
high availability by
introducing Always

00:01:19.880 --> 00:01:22.565
On availability groups
for high availability,

00:01:22.565 --> 00:01:24.500
and in-memory capabilities to really

00:01:24.500 --> 00:01:26.845
boost the performance
of your databases.

00:01:26.845 --> 00:01:29.630
In SQL Server 2016 and 2017,

00:01:29.630 --> 00:01:31.295
we really changed the game a lot

00:01:31.295 --> 00:01:33.320
by introducing some
new capabilities in

00:01:33.320 --> 00:01:37.885
SQL Server to store and query
JSON and graph as well,

00:01:37.885 --> 00:01:41.210
and we also did something
very surprising by bringing

00:01:41.210 --> 00:01:45.580
SQL Server to Linux and
containers in SQL Server 2017.

00:01:45.580 --> 00:01:47.895
In SQL Server 2019,

00:01:47.895 --> 00:01:49.540
we're changing the game yet again,

00:01:49.540 --> 00:01:50.840
and really expanding and

00:01:50.840 --> 00:01:53.480
redefining the definition
of what SQL Server is.

00:01:53.480 --> 00:01:55.490
SQL Server, of course, is still

00:01:55.490 --> 00:01:58.220
the relational database
that it was 25 years ago.

00:01:58.220 --> 00:02:00.770
You can still store
your data in SQL Server

00:02:00.770 --> 00:02:03.335
and query it in the same
way that you always have.

00:02:03.335 --> 00:02:06.560
But at the same time, we're
redefining SQL Server and

00:02:06.560 --> 00:02:09.920
extending it well beyond just
the relational database space.

00:02:09.920 --> 00:02:14.135
So let's take a look at what
we're doing in SQL Server 2019.

00:02:14.135 --> 00:02:17.045
In SQL Server 2019,

00:02:17.045 --> 00:02:18.380
we're giving you access

00:02:18.380 --> 00:02:20.420
to query and process data

00:02:20.420 --> 00:02:23.990
outside of the boundary of a
traditional SQL Server instance

00:02:23.990 --> 00:02:26.840
by taking PolyBase, a feature we first

00:02:26.840 --> 00:02:30.445
introduced in SQL Server
2016, to the next level.

00:02:30.445 --> 00:02:34.280
PolyBase allows you to create a
data virtualization layer across

00:02:34.280 --> 00:02:36.170
multiple different
data sources such as

00:02:36.170 --> 00:02:38.810
Oracle, other SQL Server instances,

00:02:38.810 --> 00:02:42.460
Teradata, MongoDB, and much more.

00:02:42.460 --> 00:02:46.460
We've also taken HDFS and
Spark and built them into the box.

00:02:46.460 --> 00:02:48.230
So now with SQL Server,

00:02:48.230 --> 00:02:52.370
you can process and store
data at petabyte scale, and

00:02:52.370 --> 00:02:57.650
you can even process and store
unstructured data.

00:02:57.650 --> 00:03:01.520
You can use SQL Server with
virtually any programming language.

00:03:01.520 --> 00:03:04.310
You can run it on pretty
much any platform now.

00:03:04.310 --> 00:03:06.155
With SQL Server 2019,

00:03:06.155 --> 00:03:08.000
you can run it on Windows of course.

00:03:08.000 --> 00:03:11.345
You can also run it on
Linux on Red Hat, on SUSE,

00:03:11.345 --> 00:03:13.670
or Ubuntu, you can run
it in a container,

00:03:13.670 --> 00:03:15.320
you can run it on Kubernetes.

00:03:15.320 --> 00:03:18.875
You can run it on different
processor architectures now.

00:03:18.875 --> 00:03:20.630
With Azure SQL Database Edge,

00:03:20.630 --> 00:03:24.640
you can run it on an ARM64
device like a Raspberry Pi,

00:03:24.640 --> 00:03:27.680
and you can run it in the
cloud in Azure SQL Database,

00:03:27.680 --> 00:03:29.030
or you can run it on-premises,

00:03:29.030 --> 00:03:31.115
or you can run it in
other public clouds.

00:03:31.115 --> 00:03:32.720
There's a lot of versatility there.

00:03:32.720 --> 00:03:36.130
You can use SQL Server wherever
it suits you the best.

00:03:36.130 --> 00:03:39.290
SQL Server 2019 continues to

00:03:39.290 --> 00:03:42.190
expand our industry-leading
performance.

00:03:42.190 --> 00:03:45.710
SQL Server has established itself
for many years now as the number

00:03:45.710 --> 00:03:49.490
1 in terms of OLTP performance
with TPC-E benchmarks,

00:03:49.490 --> 00:03:50.990
and as the number 1 in terms of

00:03:50.990 --> 00:03:54.050
data warehouse performance
with TPC-H Benchmarks.

00:03:54.050 --> 00:03:56.090
We've also led the industry by having

00:03:56.090 --> 00:03:58.670
the fewest number of
vulnerabilities reported out of any

00:03:58.670 --> 00:04:01.910
of the major database engines
across the last eight years

00:04:01.910 --> 00:04:06.010
according to the National Institute
of Standards and Technology.

00:04:06.010 --> 00:04:08.330
So let's take a closer
look at just some of

00:04:08.330 --> 00:04:11.075
the highlights of SQL Server 2019.

00:04:11.075 --> 00:04:12.770
Let's start with some
improvements we're

00:04:12.770 --> 00:04:15.005
making in the performance space.

00:04:15.005 --> 00:04:17.600
So first of all, persistent memory is

00:04:17.600 --> 00:04:20.585
a new technology that's
entering the hardware market.

00:04:20.585 --> 00:04:22.730
We've taken advantage
of persistent memory

00:04:22.730 --> 00:04:24.785
to really boost the performance.

00:04:24.785 --> 00:04:27.230
You don't have to make any
changes to your application,

00:04:27.230 --> 00:04:28.430
and you can store your data in

00:04:28.430 --> 00:04:31.330
persistent memory for
faster performance.

00:04:31.330 --> 00:04:34.030
Secondly, for intelligent
query processing,

00:04:34.030 --> 00:04:36.440
we've really expanded the
family of features here

00:04:36.440 --> 00:04:38.990
as you can see in this
chart to include lots

00:04:38.990 --> 00:04:41.615
of new ways where the
query optimizer can

00:04:41.615 --> 00:04:45.679
learn over time based on the
execution of how queries go,

00:04:45.679 --> 00:04:48.935
how future executions of those
queries can be improved,

00:04:48.935 --> 00:04:51.560
boosting the performance
of your applications over

00:04:51.560 --> 00:04:55.225
time without you having to change
anything in your applications,

00:04:55.225 --> 00:04:57.980
and lastly, we've put the TempDB in

00:04:57.980 --> 00:05:01.415
memory for even faster
performance of the temp database.

00:05:01.415 --> 00:05:03.650
Next, let's take a look at
some improvements we're

00:05:03.650 --> 00:05:05.690
making in security and compliance.

00:05:05.690 --> 00:05:08.330
First of all, especially with GDPR,

00:05:08.330 --> 00:05:09.905
customers are faced with

00:05:09.905 --> 00:05:13.220
even more regulatory requirements
that they have to meet.

00:05:13.220 --> 00:05:14.720
To make that easier,

00:05:14.720 --> 00:05:18.230
we provide data classification
capabilities out of the box.

00:05:18.230 --> 00:05:21.850
You can point the data classification
engine at your database,

00:05:21.850 --> 00:05:23.555
and it will automatically discover

00:05:23.555 --> 00:05:25.130
the different types
of data you have in

00:05:25.130 --> 00:05:29.425
your database such as
PCI data or GDPR data,

00:05:29.425 --> 00:05:31.790
and automatically
classify that and produce

00:05:31.790 --> 00:05:34.670
reports for you like you see
in this screenshot here,

00:05:34.670 --> 00:05:37.625
and you can define your own
classification rules as well.

00:05:37.625 --> 00:05:39.470
Next in terms of security,

00:05:39.470 --> 00:05:43.340
we've improved Always Encrypted,
our client-side encryption

00:05:43.340 --> 00:05:44.645
technology that allows you to

00:05:44.645 --> 00:05:47.630
separate the encryption
from the database.

00:05:47.630 --> 00:05:50.270
So that way, the
database administrators

00:05:50.270 --> 00:05:53.120
cannot decrypt the data in
the database. That allows

00:05:53.120 --> 00:05:55.640
you to separate duties here between

00:05:55.640 --> 00:05:56.840
the database administrators and

00:05:56.840 --> 00:05:59.425
the application developers and users,

00:05:59.425 --> 00:06:01.910
and lastly just as an example here of

00:06:01.910 --> 00:06:03.950
improvements that we're
making is that we have also

00:06:03.950 --> 00:06:06.230
added performing the encryption

00:06:06.230 --> 00:06:09.480
of all the data inside of enclaves.

00:06:10.160 --> 00:06:15.050
Now, in the space of developer
and DBA tools, hopefully,

00:06:15.050 --> 00:06:16.670
you've all learned about and tried

00:06:16.670 --> 00:06:19.595
Azure Data Studio, a
new cross-platform

00:06:19.595 --> 00:06:22.550
open-source tool for all types

00:06:22.550 --> 00:06:25.190
of data persona, whether you're
a database administrator,

00:06:25.190 --> 00:06:28.415
a database engineer,
or a data scientist.

00:06:28.415 --> 00:06:33.350
This tool is available for you
to download for free and use,

00:06:33.350 --> 00:06:35.225
and it is designed to be

00:06:35.225 --> 00:06:39.200
multi-database-engine, so you can
use it not just with SQL Server,

00:06:39.200 --> 00:06:41.510
but also with SQL Server in

00:06:41.510 --> 00:06:44.060
the Cloud such as
Azure SQL Database or

00:06:44.060 --> 00:06:46.460
with Azure SQL Data
Warehouse, and also with

00:06:46.460 --> 00:06:49.370
other database engines
like PostgreSQL and MySQL.

00:06:49.370 --> 00:06:52.460
One of the improvements that
people are most excited about

00:06:52.460 --> 00:06:55.340
in Azure Data Studio is
the notebook experience.

00:06:55.340 --> 00:06:58.550
Notebooks allow you to create
a file that contains

00:06:58.550 --> 00:07:01.670
Markdown as well as code cells.

00:07:01.670 --> 00:07:03.380
In the markdown, you can describe

00:07:03.380 --> 00:07:06.470
some analysis that you're doing or
steps that should be performed,

00:07:06.470 --> 00:07:08.240
and then in the code cells that are

00:07:08.240 --> 00:07:10.640
intermingled with
those markdown cells,

00:07:10.640 --> 00:07:13.705
you can have some code that you
or somebody else can execute.

00:07:13.705 --> 00:07:17.250
We have notebooks for
T-SQL, for PowerShell,

00:07:17.250 --> 00:07:20.240
for Python, and you

00:07:20.240 --> 00:07:23.075
can run it either locally
or you can run it in Spark.

00:07:23.075 --> 00:07:25.910
It's a very powerful
way to collaborate with

00:07:25.910 --> 00:07:29.915
other people by capturing this
information in notebooks,

00:07:29.915 --> 00:07:32.180
and these notebooks
can be used to capture

00:07:32.180 --> 00:07:35.450
samples or maybe some standard
operating procedures or

00:07:35.450 --> 00:07:38.180
troubleshooting guides and share
those with other people through

00:07:38.180 --> 00:07:42.085
the Git integration that we have
built into Azure Data Studio,

00:07:42.085 --> 00:07:43.685
and lastly, we've integrated

00:07:43.685 --> 00:07:45.650
some really cool technology from

00:07:45.650 --> 00:07:48.290
Microsoft Research called
SandDance which allows

00:07:48.290 --> 00:07:51.725
you to do ad hoc data
visualization and exploration

00:07:51.725 --> 00:07:54.020
using some really cool
charting capabilities

00:07:54.020 --> 00:07:55.975
right there inside of
Azure Data Studio.

00:07:55.975 --> 00:07:59.585
So definitely, go grab Azure Data
Studio if you haven't already.

00:07:59.585 --> 00:08:01.280
It's a super powerful tool,

00:08:01.280 --> 00:08:03.950
and the innovation is coming
there on a monthly basis as we

00:08:03.950 --> 00:08:07.640
release every month
for Azure Data Studio.

00:08:07.640 --> 00:08:11.270
So we continue to double-down on

00:08:11.270 --> 00:08:14.180
our new approach to

00:08:14.180 --> 00:08:16.820
how we look at different
platforms for SQL Server.

00:08:16.820 --> 00:08:18.500
In SQL Server 2017,

00:08:18.500 --> 00:08:20.465
we introduced support for Linux.

00:08:20.465 --> 00:08:22.100
But in SQL Server 2019,

00:08:22.100 --> 00:08:24.470
we're taking that to the
next step by creating

00:08:24.470 --> 00:08:27.620
even greater feature parity
between SQL Server on Windows,

00:08:27.620 --> 00:08:31.875
and SQL Server on Linux by bringing
PolyBase, ML Services,

00:08:31.875 --> 00:08:35.680
distributed transaction coordinator
and replication to Linux,

00:08:35.680 --> 00:08:37.160
and that pretty much checks off

00:08:37.160 --> 00:08:39.515
all the boxes for the
database engine features.

00:08:39.515 --> 00:08:42.200
So you have near 100
percent compatibility

00:08:42.200 --> 00:08:45.695
between SQL Server on Windows
and SQL Server on Linux.

00:08:45.695 --> 00:08:47.450
In partnership with Red Hat,

00:08:47.450 --> 00:08:49.880
we've also created RHEL-based
container images

00:08:49.880 --> 00:08:52.585
which are available on the
Microsoft Container Registry,

00:08:52.585 --> 00:08:54.170
and you can discover them in

00:08:54.170 --> 00:08:56.675
the Red Hat Container
catalog as well.

00:08:56.675 --> 00:08:58.730
Lastly in preview right now,

00:08:58.730 --> 00:09:02.080
we have support for Always On
availability groups in Kubernetes,

00:09:02.080 --> 00:09:04.610
so that you can get the
benefits of having Always On

00:09:04.610 --> 00:09:07.415
availability groups
for scale-out reads

00:09:07.415 --> 00:09:09.350
or for high availability

00:09:09.350 --> 00:09:13.760
living right there on top of the
Kubernetes layer underneath.

00:09:13.970 --> 00:09:17.270
Lastly, probably the
most significant area

00:09:17.270 --> 00:09:19.040
of improvements and just

00:09:19.040 --> 00:09:21.290
spreading out the tent
of SQL Server if you

00:09:21.290 --> 00:09:24.215
will to handle new
types of scenarios,

00:09:24.215 --> 00:09:26.540
is the improvements that
we're making in PolyBase

00:09:26.540 --> 00:09:28.850
and data virtualization as I
mentioned at the beginning,

00:09:28.850 --> 00:09:30.140
where we can create

00:09:30.140 --> 00:09:31.760
a data virtualization layer across

00:09:31.760 --> 00:09:33.890
many different data
sources like Oracle,

00:09:33.890 --> 00:09:37.755
other SQL Server
instances, and Teradata.

00:09:37.755 --> 00:09:40.100
That allows us to bring
together data across

00:09:40.100 --> 00:09:42.800
multiple data sources at query time,

00:09:42.800 --> 00:09:44.840
and really minimize
the need for using

00:09:44.840 --> 00:09:47.420
ETL as a way to integrate
our data together.

00:09:47.420 --> 00:09:50.705
Nobody likes building and
maintaining ETL pipelines.

00:09:50.705 --> 00:09:54.200
So we want to give you another
option that you can use in

00:09:54.200 --> 00:09:58.385
addition to ETL for how you
integrate your data together.

00:09:58.385 --> 00:10:00.545
In SQL Server 2019,

00:10:00.545 --> 00:10:03.110
we've introduced a new
pattern for how we deploy

00:10:03.110 --> 00:10:07.970
SQL Server, a
pattern called big data clusters,

00:10:07.970 --> 00:10:09.650
and big data clusters allows you to

00:10:09.650 --> 00:10:12.440
deploy a SQL Server
instance with all of

00:10:12.440 --> 00:10:16.400
its typical capabilities
along with HDFS and

00:10:16.400 --> 00:10:20.825
Spark in one integrated solution
deployed on Kubernetes.

00:10:20.825 --> 00:10:22.610
That provides you the ability to take

00:10:22.610 --> 00:10:24.820
SQL Server and do all the things
that you do with SQL Server,

00:10:24.820 --> 00:10:26.750
but then easily integrate that

00:10:26.750 --> 00:10:29.120
together with HDFS and
Spark so you can do

00:10:29.120 --> 00:10:32.600
queries over high volume
data that may scale

00:10:32.600 --> 00:10:34.400
out 1000 times greater than you

00:10:34.400 --> 00:10:37.070
could possibly store
in SQL Server today,

00:10:37.070 --> 00:10:39.500
up into the tens or even
hundreds of petabytes of

00:10:39.500 --> 00:10:42.260
data as well as being
able to store and

00:10:42.260 --> 00:10:44.540
query and process
unstructured data like

00:10:44.540 --> 00:10:48.174
video files or audio files in HDFS,

00:10:48.174 --> 00:10:50.900
and you have the benefit
of having the Spark engine

00:10:50.900 --> 00:10:53.260
there for data preparation
activities or for doing

00:10:53.260 --> 00:10:55.310
Machine Learning model training or

00:10:55.310 --> 00:10:58.525
operationalization of those
models inside of Spark.

00:10:58.525 --> 00:11:00.815
So by Microsoft providing

00:11:00.815 --> 00:11:02.660
an integrated solution and supporting

00:11:02.660 --> 00:11:05.420
that one integrated solution
in big data clusters,

00:11:05.420 --> 00:11:08.810
you get a shared scalable
data lake built on

00:11:08.810 --> 00:11:12.545
HDFS that either SQL Server
or Spark can access.

00:11:12.545 --> 00:11:15.500
This really provides you
a complete AI platform

00:11:15.500 --> 00:11:17.420
for doing everything
from the ingestion

00:11:17.420 --> 00:11:22.070
of the data by storing it
in HDFS or in SQL Server,

00:11:22.070 --> 00:11:23.900
and then doing data preparation tasks

00:11:23.900 --> 00:11:26.250
using either Spark or SQL Server,

00:11:26.250 --> 00:11:28.995
and then doing Machine
Learning model training using

00:11:28.995 --> 00:11:31.185
either the built-in Machine
Learning libraries in

00:11:31.185 --> 00:11:34.380
Spark or by using

00:11:34.380 --> 00:11:35.900
the Machine Learning
services built into

00:11:35.900 --> 00:11:38.600
the SQL Server Master instance
and then you can operationalize

00:11:38.600 --> 00:11:41.030
those either in the Spark Runtime

00:11:41.030 --> 00:11:43.520
by doing batch Machine
Learning scoring,

00:11:43.520 --> 00:11:45.500
or you could do it inside
of a stored procedure

00:11:45.500 --> 00:11:47.090
in SQL Server for example,

00:11:47.090 --> 00:11:49.640
or we have a way where you
can actually take a model and

00:11:49.640 --> 00:11:53.180
automatically wrap it up
in a REST API container,

00:11:53.180 --> 00:11:54.980
and provision that
container on top of

00:11:54.980 --> 00:11:56.600
the big data cluster so that

00:11:56.600 --> 00:11:58.220
it's easy for application
developers to

00:11:58.220 --> 00:12:01.160
call in and use that
container as a way to

00:12:01.160 --> 00:12:04.745
submit some data, have it scored,
and get a score value back.

00:12:04.745 --> 00:12:07.940
So it makes for a really
complete AI platform, end to

00:12:07.940 --> 00:12:09.500
end to be able to do
everything you need to

00:12:09.500 --> 00:12:11.770
do around AI and Machine Learning.

00:12:11.770 --> 00:12:14.615
So hopefully, that gives
you a quick introduction

00:12:14.615 --> 00:12:18.085
into SQL Server 2019.

00:12:18.085 --> 00:12:22.085
This is really just one
video in a series of videos

00:12:22.085 --> 00:12:24.080
on the SQL 2019 channel

00:12:24.080 --> 00:12:26.465
that you see linked here at
the bottom of the screen,

00:12:26.465 --> 00:12:27.860
and we really hope that

00:12:27.860 --> 00:12:29.840
you have a chance to go
through all these videos.

00:12:29.840 --> 00:12:31.220
We hope to publish maybe around

00:12:31.220 --> 00:12:33.290
a hundred videos that go into lots of

00:12:33.290 --> 00:12:37.730
details about everything
that's new in SQL Server 2019.

00:12:37.730 --> 00:12:39.095
If you have any feedback,

00:12:39.095 --> 00:12:40.700
please post that in
the comments below

00:12:40.700 --> 00:12:42.830
and subscribe to the channel.

00:12:42.830 --> 00:12:44.990
So thanks for joining us today to

00:12:44.990 --> 00:12:47.375
learn more about SQL Server 2019,

00:12:47.375 --> 00:12:49.220
and we'll see you out
there at the next event

00:12:49.220 --> 00:12:50.720
or SQL Saturday. Thank you.

00:12:50.720 --> 00:13:05.290
[MUSIC]

