WEBVTT

00:00:00.000 --> 00:00:01.830
>> SQL Server 2019 provides

00:00:01.830 --> 00:00:04.995
a new feature for Linux called
Persistent Log Buffers.

00:00:04.995 --> 00:00:06.960
It was available for Windows before,

00:00:06.960 --> 00:00:08.385
nowadays on Linux as well,

00:00:08.385 --> 00:00:10.740
and it helps you to eliminate
bottlenecks that might

00:00:10.740 --> 00:00:14.130
occur when waiting for a log
buffer two flush to disc.

00:00:14.130 --> 00:00:18.300
Brian is here to tell us all
about it today on Data Exposed.

00:00:18.300 --> 00:00:29.040
[MUSIC]

00:00:29.040 --> 00:00:32.115
>> Hi, and welcome to another
episode of Data Exposed.

00:00:32.115 --> 00:00:35.220
I'm your host Jeroen, and
today I have Brian with me

00:00:35.220 --> 00:00:38.460
to talk about Persistent
Log Buffers in SQL 2019.

00:00:38.460 --> 00:00:40.230
So hi, Brian, welcome to the show.

00:00:40.230 --> 00:00:42.195
>> Hi, Jeroen. Thank you.

00:00:42.195 --> 00:00:46.045
>> So what are we going to talk
about Persistent Long Buffers?

00:00:46.045 --> 00:00:47.160
>> Yes. So-

00:00:47.160 --> 00:00:47.685
>> What is that?

00:00:47.685 --> 00:00:50.400
>> So Persistent Log
Buffer is one of what

00:00:50.400 --> 00:00:53.325
we call the In-Memory
Database feature family,

00:00:53.325 --> 00:00:55.965
which includes In-Memory OLTP,

00:00:55.965 --> 00:00:59.265
Persistent Log Buffer which
I'll demonstrate today,

00:00:59.265 --> 00:01:01.845
sometimes called Tail of Log Caching,

00:01:01.845 --> 00:01:05.040
a Data and Log File
Enlightenment in Linux,

00:01:05.040 --> 00:01:07.470
Hybrid Buffer Pool in
Linux and Windows,

00:01:07.470 --> 00:01:09.870
and Memory-Optimize TempDB Metadata.

00:01:09.870 --> 00:01:11.370
>> Okay. Cool.

00:01:11.370 --> 00:01:17.195
>> So I'll just mention quickly
about persistent memory devices.

00:01:17.195 --> 00:01:19.550
A lot of people haven't
seen them but essentially

00:01:19.550 --> 00:01:21.730
these are regular DIMMs that you

00:01:21.730 --> 00:01:26.275
feed into your server that
come in different capacities.

00:01:26.275 --> 00:01:30.545
MVDIMM-N which is one type of
persistent memory technology,

00:01:30.545 --> 00:01:34.325
comes an 8, 16, or 32
gig DIMM capacity,

00:01:34.325 --> 00:01:36.980
and then the latest Intel obtained

00:01:36.980 --> 00:01:41.150
DC Persistent memory comes in
much higher capacities of a 128,

00:01:41.150 --> 00:01:44.810
256 gigabytes, or 512 gigabyte DIMMs.

00:01:44.810 --> 00:01:46.820
>> That's all of them
persistent memory. Wow.

00:01:46.820 --> 00:01:48.060
>> Yes. So you can,

00:01:48.060 --> 00:01:49.290
on a nate socket server,

00:01:49.290 --> 00:01:52.370
you can support up to 24
terabytes of persistent memory.

00:01:52.370 --> 00:01:53.750
>> I can unlock all of

00:01:53.750 --> 00:01:55.970
that with this persistent
log buffer, right?

00:01:55.970 --> 00:01:56.570
>> Correct.

00:01:56.570 --> 00:01:57.680
>> Wow.

00:01:57.680 --> 00:02:00.110
>> Persistent Log
Buffer is designed to

00:02:00.110 --> 00:02:02.075
solve a particular use case

00:02:02.075 --> 00:02:07.400
where you were incurring slowdowns
or waits in your workload,

00:02:07.400 --> 00:02:12.385
waiting for the log buffer that
is in memory to flush to disk.

00:02:12.385 --> 00:02:13.005
>> Okay.

00:02:13.005 --> 00:02:16.114
>> So it uses the
persistent memory device

00:02:16.114 --> 00:02:19.355
on it knows that once it's
written to that device,

00:02:19.355 --> 00:02:21.650
that it doesn't need
to wait for the flush

00:02:21.650 --> 00:02:24.270
because it's already on
a persistent device.

00:02:24.270 --> 00:02:26.195
>> Then the device will
take care of the rest.

00:02:26.195 --> 00:02:28.835
>> Yes, the device will
then take care of the rest

00:02:28.835 --> 00:02:31.730
while you carry on essentially
with your workload.

00:02:31.730 --> 00:02:32.180
>> Yeah.

00:02:32.180 --> 00:02:35.585
>> So when you're setting up
these devices in Windows,

00:02:35.585 --> 00:02:41.600
we have some basic recommendations
that you lock pages in memory,

00:02:41.600 --> 00:02:44.150
you use the two megabyte
allocation unit size

00:02:44.150 --> 00:02:46.760
for NTFS which won't be to default.

00:02:46.760 --> 00:02:47.180
>> Okay.

00:02:47.180 --> 00:02:49.715
>> Also you need to
set this flag DAX.

00:02:49.715 --> 00:02:51.920
So DAX is really what enables us to

00:02:51.920 --> 00:02:55.280
treat a persistent memory
device and write to it

00:02:55.280 --> 00:02:57.260
directly skipping all of

00:02:57.260 --> 00:02:59.795
the kernel stack that

00:02:59.795 --> 00:03:03.090
you would typically need
when dealing with files.

00:03:03.090 --> 00:03:05.145
Won't be available in the GUI,

00:03:05.145 --> 00:03:07.250
so you will need to use
some PowerShell for this.

00:03:07.250 --> 00:03:09.560
>> Okay. All right. You will
show us how this works, right?

00:03:09.560 --> 00:03:13.325
>> Yes. I will show how
these get configured.

00:03:13.325 --> 00:03:16.430
Also some of your OS level
disc counters that you

00:03:16.430 --> 00:03:19.510
may be used to looking at like
these transfers and so forth,

00:03:19.510 --> 00:03:21.830
may not be available to

00:03:21.830 --> 00:03:24.200
you when you're working with
persistent memory devices.

00:03:24.200 --> 00:03:28.865
That's just one of the things
you need to be aware of.

00:03:28.865 --> 00:03:29.330
>> Sure.

00:03:29.330 --> 00:03:33.575
>> These are new devices and this
is very brand-new exciting tack.

00:03:33.575 --> 00:03:33.935
>> Okay.

00:03:33.935 --> 00:03:37.565
>> So there may be some catching
up to do on the monitoring side.

00:03:37.565 --> 00:03:38.245
>> Sure.

00:03:38.245 --> 00:03:42.580
>> For Linux, non-volatile
device control

00:03:42.580 --> 00:03:45.110
is the utility that you
use to configure this.

00:03:45.110 --> 00:03:47.840
You will set it to fsdax mode,

00:03:47.840 --> 00:03:50.795
use two megabyte huge page faults,

00:03:50.795 --> 00:03:53.555
set your block allocation
also to two megabytes.

00:03:53.555 --> 00:03:56.180
We support XFS or EXT

00:03:56.180 --> 00:04:00.620
for these are two supported
file systems with DAX.

00:04:00.620 --> 00:04:01.295
>> Okay.

00:04:01.295 --> 00:04:03.050
>> So Persistent Log Buffer,

00:04:03.050 --> 00:04:05.585
this has been available
actually in SQL since

00:04:05.585 --> 00:04:10.140
SQL 2016 only for Windows until now.

00:04:10.140 --> 00:04:12.470
With SQL 2019, we'll also have

00:04:12.470 --> 00:04:15.875
this feature now available
in Linux as well as Windows.

00:04:15.875 --> 00:04:18.590
Uses only a very small
amount of capacity,

00:04:18.590 --> 00:04:21.720
the log buffer is only 20
megabytes per user database.

00:04:21.720 --> 00:04:22.355
>> Okay.

00:04:22.355 --> 00:04:26.330
>> So it really doesn't take
up a huge amount of capacity,

00:04:26.330 --> 00:04:28.850
and the behavior that you get is very

00:04:28.850 --> 00:04:31.250
similar to forcing
delayed durability.

00:04:31.250 --> 00:04:31.910
>> Okay.

00:04:31.910 --> 00:04:34.040
>> So again, you're not waiting for

00:04:34.040 --> 00:04:36.890
that Log flush to happen to disk

00:04:36.890 --> 00:04:40.040
but encouraged none of the risks that

00:04:40.040 --> 00:04:43.235
you take what Forced Delayed
Durability around data loss.

00:04:43.235 --> 00:04:45.290
>> So can you tell us a
little bit more about

00:04:45.290 --> 00:04:47.550
Forced Delayed Durability
for those that are-

00:04:47.550 --> 00:04:48.615
>> Sure, for those-

00:04:48.615 --> 00:04:49.425
>> -not aware of it?

00:04:49.425 --> 00:04:52.095
>> Yes. For those who
are not familiar,

00:04:52.095 --> 00:04:53.840
this is essentially

00:04:53.840 --> 00:04:57.260
an asynchronous commit
mechanism in SQL Server.

00:04:57.260 --> 00:04:57.710
>> Okay.

00:04:57.710 --> 00:05:01.280
>> So there are a couple
of ways to do it.

00:05:01.280 --> 00:05:03.740
One is allowed, in which case

00:05:03.740 --> 00:05:07.190
your normal commits
happen as you expect,

00:05:07.190 --> 00:05:08.270
you wait for the flush,

00:05:08.270 --> 00:05:10.455
wait for them to be hardened on disc,

00:05:10.455 --> 00:05:15.440
or in a forced mode where all
commits behave like this.

00:05:15.440 --> 00:05:16.000
>> Okay.

00:05:16.000 --> 00:05:19.220
>> So what allowed in
you specify on a per

00:05:19.220 --> 00:05:22.880
commit basis if you want this
behavior and that's allowed,

00:05:22.880 --> 00:05:24.935
disallowed which is the default

00:05:24.935 --> 00:05:27.425
doesn't matter what you have in
there it's not going to happen.

00:05:27.425 --> 00:05:27.905
>> Sure.

00:05:27.905 --> 00:05:30.170
>> Then forced all
commits behaves this way.

00:05:30.170 --> 00:05:32.285
>> Okay. So in a persistent
low level is very

00:05:32.285 --> 00:05:34.890
similar but not entirely the same.

00:05:34.890 --> 00:05:37.215
>> Very similar but
not entirely the same,

00:05:37.215 --> 00:05:39.845
because we have the
persistent memory device,

00:05:39.845 --> 00:05:42.965
we put our log buffer on there,

00:05:42.965 --> 00:05:46.640
and once we write there we know
that it's persisted and we

00:05:46.640 --> 00:05:50.360
don't have any risk of data loss
in the event of a server crash,

00:05:50.360 --> 00:05:53.000
power failure, anything
of that nature,

00:05:53.000 --> 00:05:56.570
we can recover from the data on
the persistent memory device.

00:05:56.570 --> 00:05:57.920
>> Okay. Cool.

00:05:57.920 --> 00:06:00.230
>> It's actually quite simple.

00:06:00.230 --> 00:06:01.640
A lot of people don't realize,

00:06:01.640 --> 00:06:04.355
you simply add a log file

00:06:04.355 --> 00:06:07.580
of 20 megabytes on the
persistent memory device,

00:06:07.580 --> 00:06:10.370
SQL Server will
recognize this device,

00:06:10.370 --> 00:06:13.265
and will treat it as the log buffer.

00:06:13.265 --> 00:06:14.405
>> It's very simple

00:06:14.405 --> 00:06:15.665
>> Really that simple.

00:06:15.665 --> 00:06:16.205
>> Wow.

00:06:16.205 --> 00:06:19.550
>> Yeah, and as we can see
here are log buffer sitting on

00:06:19.550 --> 00:06:23.090
our storage class memory
which is PMM sometimes

00:06:23.090 --> 00:06:26.480
we call it storage class
memory and in some places

00:06:26.480 --> 00:06:30.405
but same thing and our
log records are there,

00:06:30.405 --> 00:06:31.950
and as I mentioned,

00:06:31.950 --> 00:06:33.200
we don't have to wait
for them through

00:06:33.200 --> 00:06:36.365
flushed to the main
transaction log file.

00:06:36.365 --> 00:06:37.010
>> Cool.

00:06:37.010 --> 00:06:41.875
>> So I'll just switch
quickly to my demo here.

00:06:41.875 --> 00:06:42.990
>> Yeah.

00:06:42.990 --> 00:06:46.280
>> First I'll just show
that we have configured

00:06:46.280 --> 00:06:49.310
here our persistent memory devices.

00:06:49.310 --> 00:06:50.945
As I mentioned, these
are regular DIMMs,

00:06:50.945 --> 00:06:53.180
you can see the Device IDs there.

00:06:53.180 --> 00:06:56.405
We've configured two
devices one per NUMA node.

00:06:56.405 --> 00:06:56.855
>> Okay.

00:06:56.855 --> 00:07:01.565
>> Interleaved across the devices
at DIMMs on that NUMA node.

00:07:01.565 --> 00:07:05.330
So this is the recommended
way that we say to set it up.

00:07:05.330 --> 00:07:06.410
>> Okay.

00:07:06.410 --> 00:07:08.950
>> Again, we can see that

00:07:08.950 --> 00:07:12.920
our DAX value is enabled
it set to true here,

00:07:12.920 --> 00:07:17.464
and if we want to use our older
command line type utility,

00:07:17.464 --> 00:07:21.830
we can just get that little
bit more info here and we can

00:07:21.830 --> 00:07:26.450
see that we have set the allocation
unit size to two megabytes.

00:07:26.450 --> 00:07:28.640
>> As you just described.
It should be- yeah.

00:07:28.640 --> 00:07:31.505
>> Yeah. As I've just
described and quite

00:07:31.505 --> 00:07:36.185
simple we just add the
log-file, as I mentioned,

00:07:36.185 --> 00:07:38.205
and we just create and

00:07:38.205 --> 00:07:40.700
regardless of what size you
put it in here we'll actually

00:07:40.700 --> 00:07:42.860
integrated to use 20 megabytes

00:07:42.860 --> 00:07:46.025
but just go ahead and
say 20 megabytes sits.

00:07:46.025 --> 00:07:47.975
>> Yeah. Just to make sure.

00:07:47.975 --> 00:07:50.960
>> Yeah, and it's really that simple.

00:07:50.960 --> 00:07:52.550
>> Wow. All right.

00:07:52.550 --> 00:07:54.200
So that's impressive.

00:07:54.200 --> 00:07:56.900
So basically I can unlock
all these new tech with

00:07:56.900 --> 00:07:58.580
a Persistent Log Buffer by just

00:07:58.580 --> 00:08:00.650
running very simple command, right?

00:08:00.650 --> 00:08:01.055
>> Yeah.

00:08:01.055 --> 00:08:02.930
>> Sure. You have to
configure the device first,

00:08:02.930 --> 00:08:05.965
and then after that's done
in SQL you just add a log.

00:08:05.965 --> 00:08:09.350
>> Yeah, and this type

00:08:09.350 --> 00:08:12.725
of technology is really
enabling a new tier of

00:08:12.725 --> 00:08:15.020
storage helping remove some of

00:08:15.020 --> 00:08:17.075
the traditional
bottlenecks that we see

00:08:17.075 --> 00:08:19.640
in SQL Server on high-end workloads.

00:08:19.640 --> 00:08:22.220
>> Right. So big innovation but

00:08:22.220 --> 00:08:24.710
then done in a very simple manner

00:08:24.710 --> 00:08:26.570
for the user and for
the configuration.

00:08:26.570 --> 00:08:29.360
>> Yes. We build intelligence
into SQL Server to

00:08:29.360 --> 00:08:32.240
recognize these devices
and behave accordingly.

00:08:32.240 --> 00:08:34.295
>> Yeah. Very cool. Well,
thanks for sharing.

00:08:34.295 --> 00:08:34.895
>> Thank you.

00:08:34.895 --> 00:08:36.560
>> I think this was very useful,

00:08:36.560 --> 00:08:37.910
very interesting, at least to me.

00:08:37.910 --> 00:08:40.490
I hope this was useful and
interesting to you as well.

00:08:40.490 --> 00:08:43.065
Please subscribe, like,
comment on the video,

00:08:43.065 --> 00:08:44.660
and I hope to see you next time on

00:08:44.660 --> 00:08:47.040
another episode of
Data Exposed. Thanks.

00:08:47.040 --> 00:09:01.630
[MUSIC]

