WEBVTT

00:00:00.000 --> 00:00:02.280
>> SQL Server big data
clusters provide

00:00:02.280 --> 00:00:05.385
embedded administration experiences
to manage the platform.

00:00:05.385 --> 00:00:07.140
There's a lot going
on in the platform, so

00:00:07.140 --> 00:00:08.955
we made sure to build a lot of

00:00:08.955 --> 00:00:10.800
administration experiences to make

00:00:10.800 --> 00:00:12.885
it easy to understand
what's going on.

00:00:12.885 --> 00:00:17.860
Mihaela here will tell us all
about it today at Data Exposed.

00:00:26.720 --> 00:00:27.820
>> [MUSIC]

00:00:27.820 --> 00:00:29.685
>> Hi and welcome to
another episode of

00:00:29.685 --> 00:00:32.295
Data Exposed. I'm your host Jeroen.

00:00:32.295 --> 00:00:36.450
Today, I have with me Mihaela to
talk about some of the built-in

00:00:36.450 --> 00:00:38.280
administration experiences that are

00:00:38.280 --> 00:00:40.940
available for you in SQL
Server big data clusters.

00:00:40.940 --> 00:00:43.670
So Mihaela, back on
the show, welcome.

00:00:43.670 --> 00:00:46.015
>> Thank you for
having me here today.

00:00:46.015 --> 00:00:48.975
>> So administration
experiences, right?

00:00:48.975 --> 00:00:50.550
So what can you tell us about it?

00:00:50.550 --> 00:00:52.320
>> Yeah, that's one of

00:00:52.320 --> 00:00:55.245
the key value props
for big data clusters.

00:00:55.245 --> 00:00:57.260
We're going to go through some of

00:00:57.260 --> 00:01:00.380
those experiences today
highlighting what are some of

00:01:00.380 --> 00:01:05.660
the management built-in cluster
administration experience

00:01:05.660 --> 00:01:08.425
that we have available
for you to try out.

00:01:08.425 --> 00:01:12.380
As you know, big data clusters are

00:01:12.380 --> 00:01:16.160
deployed as containerized
applications on top of Kubernetes.

00:01:16.160 --> 00:01:18.920
That is giving us some of

00:01:18.920 --> 00:01:23.195
the flexibility to enable
some of these experiences.

00:01:23.195 --> 00:01:26.930
It's very easy to deploy;
some of you that

00:01:26.930 --> 00:01:30.805
have tried to deploy big data
clusters know that it's very fast.

00:01:30.805 --> 00:01:34.530
Similarly for upgrades,
that's going to be very easy.

00:01:34.530 --> 00:01:38.780
Leveraging the elasticity
and scalability of

00:01:38.780 --> 00:01:41.330
Kubernetes containers that transfers

00:01:41.330 --> 00:01:44.885
to big data cluster experiences where

00:01:44.885 --> 00:01:49.460
it's very easy to scale out
and up and down and to have

00:01:49.460 --> 00:01:50.510
all this experience in

00:01:50.510 --> 00:01:52.805
a predictable and consistent way

00:01:52.805 --> 00:01:55.140
irrespective of where we are
going to deploy it, right?

00:01:55.140 --> 00:01:57.645
Because when you deploy
big data cluster,

00:01:57.645 --> 00:02:00.420
the underlying platform is Kubernetes,

00:02:00.420 --> 00:02:03.660
irrespective of whether it's on-prem
or in Azure or anywhere else,

00:02:03.660 --> 00:02:06.965
we are agnostic to where that
Kubernetes cluster is running.

00:02:06.965 --> 00:02:08.790
>> Cool. Sounds good.

00:02:08.790 --> 00:02:10.535
>> Again, that transfers to

00:02:10.535 --> 00:02:14.450
these experiences that we
built for you to manage

00:02:14.450 --> 00:02:16.970
this platform because that's
something that you deploy on

00:02:16.970 --> 00:02:19.780
your own infrastructure and
you have to manage it.

00:02:19.780 --> 00:02:21.390
>> Sure. That makes sense.

00:02:21.390 --> 00:02:26.450
So what do we mean when we talk
about built-in administration?

00:02:26.450 --> 00:02:28.100
>> Yeah. So that means that we deploy

00:02:28.100 --> 00:02:30.200
a set of components and services

00:02:30.200 --> 00:02:33.980
in the cluster for you to be able
to deploy the cluster,

00:02:33.980 --> 00:02:36.815
of course, to scale, to upgrade.

00:02:36.815 --> 00:02:39.380
Similarly for security and I'm

00:02:39.380 --> 00:02:42.200
going to mention a bit
later what does that mean,

00:02:42.200 --> 00:02:45.200
you have built-in HADR

00:02:45.200 --> 00:02:49.110
as well as a cluster
health system that is

00:02:49.110 --> 00:02:52.370
used for other workflows like

00:02:52.370 --> 00:02:56.760
when we do a deployment
or do an upgrade,

00:02:57.190 --> 00:03:01.370
we are reading the signals from
the cluster automatically for you

00:03:01.370 --> 00:03:05.630
to be able to make sure that we're
upgrading in the right order,

00:03:05.630 --> 00:03:09.215
we are listening to health
signals from the cluster to

00:03:09.215 --> 00:03:13.085
not roll forward services that
are not healthy and so on.

00:03:13.085 --> 00:03:14.185
>> Okay.

00:03:14.185 --> 00:03:17.570
>> At the same time, we
have components in

00:03:17.570 --> 00:03:20.675
the cluster that are collecting
metrics, collecting logs,

00:03:20.675 --> 00:03:23.960
storing them, and then exposing
them through dashboards,

00:03:23.960 --> 00:03:27.920
again we're deploying automatically
for you to make use of that.

00:03:27.920 --> 00:03:29.245
>> Cool.

00:03:29.245 --> 00:03:31.980
>> How exactly, you may want to ask?

00:03:31.980 --> 00:03:38.810
How exactly does that work, and what
are some of the services that are

00:03:38.810 --> 00:03:42.200
deployed in the cluster
and we are making use

00:03:42.200 --> 00:03:47.190
of to further enable all
these experiences for you?

00:03:48.320 --> 00:03:52.880
We call all these
components that are part of

00:03:52.880 --> 00:03:56.660
the management or
administration experience

00:03:56.660 --> 00:03:59.990
the cluster management
service umbrella, or control plane.

00:03:59.990 --> 00:04:02.945
You're going to hear
"control plane" sometimes.

00:04:02.945 --> 00:04:08.310
I would split them among
monitoring services,

00:04:08.310 --> 00:04:10.290
as I was mentioning earlier,

00:04:10.290 --> 00:04:14.390
we have components that
are collecting metrics.

00:04:14.390 --> 00:04:16.970
Let's take a very simple example:

00:04:16.970 --> 00:04:20.600
SQL Server has DMVs that are
producing a lot of metrics.

00:04:20.600 --> 00:04:23.470
We have components in

00:04:23.470 --> 00:04:26.340
the cluster that are
reading from those DMVs,

00:04:26.340 --> 00:04:28.050
storing them in InfluxDB,

00:04:28.050 --> 00:04:30.520
and then we have Grafana
that is sitting on

00:04:30.520 --> 00:04:33.190
top of that to expose those metrics.

00:04:33.190 --> 00:04:33.430
>> Cool.

00:04:33.430 --> 00:04:36.790
>> Similarly for node
hosts in Kubernetes,

00:04:36.790 --> 00:04:39.730
we collect some of the
resource consumption metrics, like

00:04:39.730 --> 00:04:43.315
memory and so on and expose
dashboards on top of that.

00:04:43.315 --> 00:04:47.890
Another set of services are
helping to deploy and to upgrade

00:04:47.890 --> 00:04:53.380
to set up security or
high availability, right?

00:04:53.380 --> 00:04:57.130
Those are components that are
working together to ensure

00:04:57.130 --> 00:04:58.840
cluster health to ensure that

00:04:58.840 --> 00:05:02.315
all those things are
working properly.

00:05:02.315 --> 00:05:04.125
>> Okay. So basically, we
have two groups, right?

00:05:04.125 --> 00:05:06.585
Monitoring and more of
management control?

00:05:06.585 --> 00:05:06.975
>> Yeah.

00:05:06.975 --> 00:05:08.500
>> Okay. Cool.

00:05:08.570 --> 00:05:11.980
>> So for example for security,

00:05:11.980 --> 00:05:14.030
to go through what

00:05:14.030 --> 00:05:17.615
exactly does it mean that
we have built-in security?

00:05:17.615 --> 00:05:20.930
As with any SQL Server release,

00:05:20.930 --> 00:05:25.360
security's mission critical for
us to enable for our customers.

00:05:25.360 --> 00:05:28.980
It was very important to
enable AD Authentication.

00:05:28.980 --> 00:05:33.770
What exactly that means is
that once, at deployment time,

00:05:33.770 --> 00:05:36.740
you tell us that I
want the services in

00:05:36.740 --> 00:05:40.220
the clusters to be integrated
with AD so later for

00:05:40.220 --> 00:05:42.605
users to authenticate using

00:05:42.605 --> 00:05:47.450
their AD identity, everything
is taken care of in

00:05:47.450 --> 00:05:56.535
terms of integrating containers
with Active Directory,

00:05:56.535 --> 00:05:59.790
tokens, certificates and everything,

00:05:59.790 --> 00:06:04.800
deploying all that stuff in
a containerized environment,

00:06:05.570 --> 00:06:09.045
it's something that's
new in the industry, right?

00:06:09.045 --> 00:06:09.250
>> Sure.

00:06:09.250 --> 00:06:11.540
>> So that's something that
we worked very hard to make it

00:06:11.540 --> 00:06:14.165
happen, and we have it
available with big data clusters.

00:06:14.165 --> 00:06:15.005
>> Awesome.

00:06:15.005 --> 00:06:17.765
>> Another thing that was
very important and we

00:06:17.765 --> 00:06:20.885
listened to customers' feedback
when they were saying that

00:06:20.885 --> 00:06:25.190
I want to make sure that the user
identity that is being used to

00:06:25.190 --> 00:06:27.500
log in to a certain
service is passed through

00:06:27.500 --> 00:06:30.200
the entire stack because we
know in big data cluster,

00:06:30.200 --> 00:06:32.315
we have various layers of service,

00:06:32.315 --> 00:06:37.490
and when a new user identity
connects for example to SQL Server,

00:06:37.490 --> 00:06:40.220
I want the same identity to be passed

00:06:40.220 --> 00:06:43.190
through down to HDFS if needed so I

00:06:43.190 --> 00:06:48.890
can audit and track the
activity of that user, right?

00:06:48.890 --> 00:06:50.300
So that's something that is available

00:06:50.300 --> 00:06:52.250
in big data clusters as well.

00:06:52.250 --> 00:06:55.790
Again, certificates and
rotation of certificates

00:06:55.790 --> 00:06:57.620
are happening automatically for you.

00:06:57.620 --> 00:07:02.655
You don't have to do
anything for that.

00:07:02.655 --> 00:07:04.395
>> Okay. So that's great.

00:07:04.395 --> 00:07:05.790
This sounds cool.

00:07:05.790 --> 00:07:07.170
We solved all of this,

00:07:07.170 --> 00:07:10.905
we made sure that your credentials
flow from top to bottom,

00:07:10.905 --> 00:07:12.180
we configure all of it,

00:07:12.180 --> 00:07:13.710
but that's just security, right?

00:07:13.710 --> 00:07:14.025
>> Yeah.

00:07:14.025 --> 00:07:15.290
>> How about something
else that's very

00:07:15.290 --> 00:07:16.850
important like scalability,

00:07:16.850 --> 00:07:18.230
making sure that when

00:07:18.230 --> 00:07:20.960
something breaks something
else is there to pick it up?

00:07:20.960 --> 00:07:22.460
That's an important
factor of a database.

00:07:22.460 --> 00:07:24.305
>> Yeah. So that's something

00:07:24.305 --> 00:07:27.620
that was very important for
us as well to make sure

00:07:27.620 --> 00:07:30.200
that the mission-critical
services available

00:07:30.200 --> 00:07:33.784
in big data cluster
like SQL Server Master,

00:07:33.784 --> 00:07:37.595
HDFS NameNode, they're
highly available.

00:07:37.595 --> 00:07:38.195
>> Okay.

00:07:38.195 --> 00:07:41.480
>> That's where we enabled

00:07:41.480 --> 00:07:45.785
an experience where you
can deploy and manage

00:07:45.785 --> 00:07:51.680
all these aspects, again in a very easy

00:07:51.680 --> 00:07:58.145
way, embedded within the
control plane as well.

00:07:58.145 --> 00:07:59.330
For example, availability groups.

00:07:59.330 --> 00:08:01.460
This is a flagship feature that was

00:08:01.460 --> 00:08:04.100
available for SQL Server since 2012,

00:08:04.100 --> 00:08:05.960
I think at least, and

00:08:05.960 --> 00:08:14.670
[inaudible] they know that they
have multiple prerequisites,

00:08:14.670 --> 00:08:15.960
they have to set up,

00:08:15.960 --> 00:08:19.530
they have to set up database
mirroring endpoints,

00:08:19.530 --> 00:08:22.545
they need to set up the certificates.

00:08:22.545 --> 00:08:25.175
There are multiple steps to
even set up the cluster.

00:08:25.175 --> 00:08:27.680
Once you tell us that you want HA and

00:08:27.680 --> 00:08:30.490
big data clusters for
SQL Server Master,

00:08:30.490 --> 00:08:33.555
we take care of everything for you.

00:08:33.555 --> 00:08:35.310
>> Wow. So we simplified it, right?

00:08:35.310 --> 00:08:39.290
>> It's very easy for you to set up.

00:08:39.290 --> 00:08:40.750
You don't have to think about,

00:08:40.750 --> 00:08:44.690
did I use the right URL for the
replicas or things like that?

00:08:44.690 --> 00:08:47.735
You don't have to worry
about these things.

00:08:47.735 --> 00:08:48.630
>> Cool, so.

00:08:48.630 --> 00:08:51.415
>> Guess what is the cluster
technology that we use for that?

00:08:51.415 --> 00:08:53.415
>> Well, you tell me.

00:08:53.415 --> 00:08:57.810
>> None. So that's the
beauty of Kubernetes.

00:08:57.810 --> 00:09:01.480
So through a tight integration
with Kubernetes and adding

00:09:01.480 --> 00:09:03.950
the logic for monitoring and

00:09:03.950 --> 00:09:07.670
orchestration right into
this control plane,

00:09:07.670 --> 00:09:10.280
there is no need for additional
cluster technology

00:09:10.280 --> 00:09:13.550
to put in the big data cluster

00:09:13.550 --> 00:09:16.400
to manage that aspect
of SQL Server Master.

00:09:16.400 --> 00:09:17.030
>> Okay.

00:09:17.030 --> 00:09:20.180
>> Similarly for HDFS, right?

00:09:20.180 --> 00:09:24.425
Other resources from
the Hadoop stack,

00:09:24.425 --> 00:09:27.125
they have to be highly
available as well.

00:09:27.125 --> 00:09:30.830
In this case, we use ZooKeeper,

00:09:30.830 --> 00:09:32.285
which is an open-source,

00:09:32.285 --> 00:09:34.340
well-established
cluster technology

00:09:34.340 --> 00:09:36.710
to help with orchestration and

00:09:36.710 --> 00:09:41.585
storing the metadata for the high
availability of these services.

00:09:41.585 --> 00:09:43.640
>> So you told us about
mission critical,

00:09:43.640 --> 00:09:44.690
you told us about security.

00:09:44.690 --> 00:09:47.000
So you got my head spinning now.

00:09:47.000 --> 00:09:48.080
There's a lot of stuff going on,

00:09:48.080 --> 00:09:49.775
but how do I actually use this?

00:09:49.775 --> 00:09:51.110
Do you have tools that

00:09:51.110 --> 00:09:51.980
you can give to

00:09:51.980 --> 00:09:53.110
me to make sure that I
understand what's going on here?

00:09:53.110 --> 00:09:54.470
>> Yes, don't worry about it.

00:09:54.470 --> 00:10:00.610
So again, I was telling you that
you can easily deploy, right?

00:10:00.610 --> 00:10:00.900
>> All right.

00:10:00.900 --> 00:10:02.810
>> The only thing that you need to do

00:10:02.810 --> 00:10:04.760
as with anything in
Kubernetes you just

00:10:04.760 --> 00:10:07.040
have to declare your
intention and describe

00:10:07.040 --> 00:10:11.225
your target configuration and
we take care of everything.

00:10:11.225 --> 00:10:13.250
So one of the things you want

00:10:13.250 --> 00:10:15.800
to make sure you're
using and you have

00:10:15.800 --> 00:10:20.975
as a tool on your client
machine is azdata.

00:10:20.975 --> 00:10:23.180
You can do deployment,
you can do configuration,

00:10:23.180 --> 00:10:27.750
you can do monitoring even with
azdata, as well as Azure Data Studio.

00:10:27.750 --> 00:10:33.110
So if you want to go through a more
guided experience, or to see dashboards in

00:10:33.110 --> 00:10:36.890
a more user-friendly way

00:10:36.890 --> 00:10:39.080
and that's what I'm going
to show you further,

00:10:39.080 --> 00:10:42.065
you can make use of
Azure Data Studio to

00:10:42.065 --> 00:10:45.800
leverage some of these experiences
that we're adding there.

00:10:45.800 --> 00:10:47.990
>> Okay. Well, talking about
experiences, can you show us?

00:10:47.990 --> 00:10:50.640
>> Let's see what that
looks like for monitoring.

00:10:50.640 --> 00:10:52.860
If I want to see the
status of the cluster,

00:10:52.860 --> 00:10:54.675
are my services healthy or not.

00:10:54.675 --> 00:10:56.030
>> There's a lot going
on in the cluster.

00:10:56.030 --> 00:10:57.150
So I need to know what's going on.

00:10:57.150 --> 00:11:01.640
>> There are many of them exactly
and we have a new experience

00:11:01.640 --> 00:11:04.250
in Azure Data Studio where you

00:11:04.250 --> 00:11:07.205
can see the status of the
cluster through the controller,

00:11:07.205 --> 00:11:09.050
because again this is the brain of

00:11:09.050 --> 00:11:11.330
your cluster and this
is the source of truth.

00:11:11.330 --> 00:11:12.660
>> Sure.

00:11:13.340 --> 00:11:18.570
>> You can see here all of the
services that are deployed.

00:11:18.570 --> 00:11:20.495
Their health status.

00:11:20.495 --> 00:11:22.130
If I want details,

00:11:22.130 --> 00:11:24.920
I can go further to HDFS for example

00:11:24.920 --> 00:11:29.615
and see what is the
health there and so on.

00:11:29.615 --> 00:11:32.270
So this is one of the
things that we have new

00:11:32.270 --> 00:11:35.405
in the upcoming releases
for big data clusters.

00:11:35.405 --> 00:11:38.030
All this experience, again,
you can use azdata for

00:11:38.030 --> 00:11:41.820
as well with the BDC status,

00:11:44.710 --> 00:11:48.685
and it's more intuitive
to use a tool like this.

00:11:48.685 --> 00:11:51.480
>> Sometimes. Yeah,
makes sense. Cool.

00:11:51.480 --> 00:11:55.570
>> Again, all this and how to deploy

00:11:55.570 --> 00:11:56.630
all these services and

00:11:56.630 --> 00:11:59.390
highly available configuration
for security and so on,

00:11:59.390 --> 00:12:02.270
you can find on our
documentation page.

00:12:02.270 --> 00:12:05.045
I put some pointers here for you to

00:12:05.045 --> 00:12:08.630
leverage as a starting point
for our documentation.

00:12:08.630 --> 00:12:13.055
So either for deployment or
for more resources on BDC,

00:12:13.055 --> 00:12:15.200
workshops, samples, you can go

00:12:15.200 --> 00:12:18.825
to these resources to make use of.

00:12:18.825 --> 00:12:21.150
>> Cool. Well again,

00:12:21.150 --> 00:12:23.055
thanks a lot for sharing.

00:12:23.055 --> 00:12:25.280
I'm happy to see there's a lot of

00:12:25.280 --> 00:12:27.245
administration and
monitoring going on,

00:12:27.245 --> 00:12:29.450
and very happy to see that we have

00:12:29.450 --> 00:12:33.365
both a command line version of
it to automate things again,

00:12:33.365 --> 00:12:35.510
building graphs, building dashboards,

00:12:35.510 --> 00:12:37.100
and then the interface, I
like the interface as well.

00:12:37.100 --> 00:12:39.020
So I'm very happy to [inaudible].

00:12:39.020 --> 00:12:41.060
So thanks a lot for being here
and sharing this with others.

00:12:41.060 --> 00:12:42.080
>> No problem, thank you.

00:12:42.080 --> 00:12:43.775
>> Thank you for watching.

00:12:43.775 --> 00:12:45.340
Please like and subscribe,

00:12:45.340 --> 00:12:48.180
leave a comment and hope to
see you next time. Thanks.

00:12:48.180 --> 00:13:03.100
[MUSIC]

