Best practices for Windows Media Encoder in 2009
- Posted: Jan 05, 2009 at 5:54PM
I’ve had a bunch of emails lately from people still using Windows Media Encoder for a variety of reasons. A surprising number of people seem to be using ancient versions of the actual encoder .dll files, so I figured it was time for one last roundup post for best practices with WME and the old Format SDK. Hopefully everyone’s planning their migration to a VC-1 Encoder SDK based product, but for those who can’t move yet, please follow these best practices.
Bear in mind that Windows Media Encoder is more than five years old now, which is when multimedia products start moving from Mainstream Support to Extended Support in the support lifecycle. We don’t set support policy for unreleased products, so note that there’s been no formal guidance on whether WME will be supported on Windows 7.
WME is ancient code at this point, and all updates since the original Windows Media 9 Series launch have been done as hotfixes. You need to have these installed for security, stability, and performance.
If you’re on Vista, you’ll absolutely want to install this one.
It can be a bit tricky to install sometimes, so follow these instructions from the invaluable
This hotfix addresses three issues:
When you run the Windows Media Encoder command-line script WMCmd.vbs, the script host Cscript.exe may crash.
The icons that appear on the encoder toolbar and in the encoder dialog boxes are displayed by using a low bit depth. Therefore, the icons appear to have a low resolution.
When you configure the encoding profile or start an encoding session, the encoder may crash.
This one fixes a pretty embarrassing bug – Format SDK 11 wouldn’t use multiple processors when encoding in multiple bitrates (aka “Intelligent Streaming”). Originally blogged here.
This is a security fix for a critical vulnerability that could allow remote code execution on a machine running WME. Install it before you launch WME! Note that the vulnerability requires two things that shouldn’t be happening anyway on a production encoder
WME in the end is mainly a graphical front-end to the Windows Media Format SDK. The actual codec version you’re running is determined by the SDK. Most people get those updates bundled with Windows Media Player, so as long as you have the most recent WMP version for your platform, you’re good to go.
If there’s some reason that you don’t want to update WMP but still want the current codec .dlls, you can also install the Format SDK directly. The SDK version also includes lots of samples and all the normal SDK goodness for building apps for authoring Windows Media content.
For vendors of compression tools, it’s always a good idea to chain the redistributable installer (WMFDist11-WindowsXP-X86-ENU.exe) along with your installer, so you can ensure that users have the current version. That installer can also be run manually to install .dll updates without updating WMP.
I’ve seen way too many production encoders in the last few months that have WMP 9 and thus the 9.0 SDK. That’s a good six years old now, and will offer much lower quality and perf than the current versions deliver. Remember this comparison?
Windows XP shipped with Windows Media Player XP, which was a flavor of WMP 7. However, the almost universally installed Service Pack 2 included WMP 9. WMP 11 is also available for XP, and should absolutely be installed on any XP-based encoder (and really all XP machines in general).
Windows Vista shipped with WMP 11 preinstalled, so no action is required there.
On Windows Server 2008, Windows Media Player, and hence the FSDK, is installed as part of the Desktop Experience feature. You’ll want to enable Desktop Experience on any encoder box (and in general for any 2K8 box you’ll be using as a workstation).
The most recent version for Server 2003 is the older Format SDK 9.5 (which is still better than 9.0). As a service to our customers looking for a high degree of confusion, Format SDK 9.5 was released with Windows Media Player 10.
Note that running on Server 2003 means you won’t have access to the SMPTE-compliant “WVC1” flavor of Windows Media Video 9 Advanced Profile, nor the excellent WMA 10 Pro LBR audio codec, and hence the 32-96 Kbps range for WMA Pro. The codec is also only 2-way threaded instead of 4-way threaded, slower in general, and lacks support for Tarari acceleration and the advanced registry keys.
If you need to encode on a Server OS, you should use a VC-1 Encoder SDK based product or upgrade to Server 2008 (ideally both!), which will give you better video. However, WMA 10 Pro will still not be available on Server 2003.
There have been reports of people using the WMFDist11-WindowsXP-X86-ENU.exe installer from FSDK 11 set to XP compatibility mode to install the FSDK 11 .dll files onto Server 2003. While there aren’t known issues with this, I should point out that this isn’t a supported configuration for Server 2003, and so Microsoft support will not be available for any issues when running with mismatched .dll files.
While there is a “Windows Media Encoder 64-bit edition”, don’t use it (I’m not even going to link to it). It predates FSDK 11, and offers lower quality and performance than FSDK 11 does; the performance advantage of running 64-bit native code is a lot smaller than the other improvements in FSDK 11.
So, keep using the 32-bit stuff on 64-bit as well as 32-bit versions of Windows.
Once upon a time, this blog seemed to be mainly about using special registry key options in the Format SDK. Thankfully, Expression Encoder and the other VC-1 Encoder SDK based products mean we now have GUI access to all of these in modern products.
First off, install Alex Zambelli’s WMV9 PowerToy, which is a simple .NET GUI for setting the codec settings. The tooltips are a better reference than anything I’ll type here; I’ll just give some basic recommendations for different scenarios.
This is a good set of defaults with no significant performance downside that almost always helps quality.
Perceptional Option: Adaptive Dead-Zone 1. This maps to the “Adaptive Dead Zone: Conservative” option from Expression Encoder. This lets the codec reduce detail before introducing artifacts, and generally improves quality at lower bitrates.
In-Loop Filter: On. This turns on the In-Loop deblocking filter which softens the edges of block artifacts. This improves the current frame, and also future frames based on it.
Overlap Filter: On. This further smoothes the edges of blocks. It can reduce detail a little at high bitrates, but is almost always helpful at typical web bitrates.
B-Frame Number: 1. Turns on B-frames, and hence enables flash/fade detection when using Lookahead or 2-pass encoding, and also improves compression efficiency.
Lookahead: 16. Tells the codec to buffer ahead 16 frames in 1-pass (CBR or VBR) encoding, letting the codec detect flash frames and fades and switch the frame type based on it. Maps to the “Scene Change Detection” option in Expression Encoder. It will increase end-to-end latency by that many frames in live encoding, but is generally worth it due to quality improvements.
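To put a number on the latency cost of Lookahead, here is a minimal sketch of the arithmetic. It only assumes that the added delay is the lookahead frame count divided by the frame rate; the function name is made up for illustration and is not part of any SDK.

```python
def lookahead_latency_ms(lookahead_frames: int, fps: float) -> float:
    """Extra end-to-end delay, in milliseconds, from buffering frames ahead.

    Assumption: the added delay is simply frames / frame rate.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return lookahead_frames / fps * 1000.0


if __name__ == "__main__":
    # Lookahead of 16 frames at a few common frame rates.
    for fps in (15.0, 25.0, 29.97):
        added = lookahead_latency_ms(16, fps)
        print(f"{fps:>6} fps: about {added:.0f} ms of added latency")
```

At typical broadcast frame rates that works out to roughly half a second, which is usually an easy trade for the quality gain in a live stream.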
Assuming you have a fast enough machine to run these settings in at least Complexity 3, they will improve the live experience.
Motion Search Level: Fixed Integer Chroma. This adds basic chroma search to the encoding, which can help the quality of motion graphics significantly. It’s a pretty small boost with more typical camera-shot content, so turn it off if you have perf issues; getting to Complexity 3 is more important.
Motion Search Range: Adaptive. This tells the encoder to switch to a bigger motion search range for frames with high motion, and then go back to a smaller range when motion dies down. This dramatically improves quality with higher motion at bigger frame sizes. The default range is 64 pixels left/right and 32 pixels up/down, so if any objects move more than that between any two P-frames (if you’re using B-frames, that’ll be 2 frames apart), then this feature will help.
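A rough way to judge whether your content is likely to overrun the default range is sketched below. The 64 and 32 pixel limits come from the description above; the function name, and the assumption that the gap between P-frames is the B-frame count plus one, are mine for illustration.

```python
DEFAULT_RANGE_X = 64  # pixels left/right, as described above
DEFAULT_RANGE_Y = 32  # pixels up/down, as described above


def exceeds_default_range(dx_per_frame: float, dy_per_frame: float,
                          b_frames: int = 1) -> bool:
    """Return True if per-frame motion would overrun the default search range.

    Assumption: P-frames sit (b_frames + 1) frames apart, so motion
    accumulates over that many frames before the search runs.
    """
    p_frame_gap = b_frames + 1
    dx = abs(dx_per_frame) * p_frame_gap
    dy = abs(dy_per_frame) * p_frame_gap
    return dx > DEFAULT_RANGE_X or dy > DEFAULT_RANGE_Y


if __name__ == "__main__":
    # A fast horizontal pan of 40 pixels per frame.
    print(exceeds_default_range(40, 0, b_frames=1))  # 80 px between P-frames
    print(exceeds_default_range(40, 0, b_frames=0))  # 40 px between P-frames
```

Note how turning on one B-frame doubles the distance an object travels between P-frames, which is why fast pans at larger frame sizes benefit most from the Adaptive setting.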
These settings offer maximum quality for offline encoding, and are slower yet. Use them when you’ve got the time, and run at Complexity 4. Complexity 5 ignores the Motion Search Level and Motion Match Method settings, making it lower quality. The only time you need to use Complexity 5 is if you can’t set registry keys (in which case it’s quite a bit better) or need to use WME to scale your video, as scaling quality is much better in Complexity 5. However, you’re better off preprocessing in another tool if at all possible.
Motion Search Level: Fixed True Chroma. This is a full-precision motion search for chroma. It never hurts, and can help quality a lot with motion graphics and animation.
Motion Match Method: Adaptive. This switches between the Sum of Absolute Differences (SAD) and the Hadamard method to compare motion between frames as appropriate for each macroblock. Full Hadamard can be higher quality for some very complex content, but the Adaptive mode is faster and better most of the time.
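For anyone curious about what the two match methods measure, here is a textbook-style sketch of SAD and a Hadamard-based cost on a 4x4 block. This is not the codec’s actual implementation (block sizes, normalization, and the adaptive switching logic are not covered here); it only illustrates the general idea, and all names are made up for the example.

```python
# 4x4 Hadamard matrix (symmetric, so it equals its own transpose).
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def _diff(cur, ref):
    """Per-pixel difference between two 4x4 blocks."""
    return [[cur[i][j] - ref[i][j] for j in range(4)] for i in range(4)]


def _matmul(a, b):
    """Plain 4x4 matrix multiply."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def sad(cur, ref):
    """Sum of Absolute Differences: cheap, works directly on pixel errors."""
    return sum(abs(v) for row in _diff(cur, ref) for v in row)


def hadamard_cost(cur, ref):
    """Sum of absolute Hadamard-transformed differences (unnormalized).

    Transforming the error first gives a cost closer to what the
    transform-based codec will actually have to encode.
    """
    transformed = _matmul(_matmul(H4, _diff(cur, ref)), H4)
    return sum(abs(v) for row in transformed for v in row)


if __name__ == "__main__":
    ref = [[8] * 4 for _ in range(4)]
    flat = [[10] * 4 for _ in range(4)]   # uniform brightness shift
    spike = [row[:] for row in ref]
    spike[0][0] += 4                      # single-pixel error
    for name, block in (("flat", flat), ("spike", spike)):
        print(name, "SAD:", sad(block, ref), "Hadamard:", hadamard_cost(block, ref))
```

The point of the comparison is that two candidate matches with similar SAD can have very different Hadamard costs, because an isolated error spreads across every transform coefficient while a uniform shift lands in just one.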
WME’s preprocessing was designed for good live encoding performance on circa 2002 machines, which means it’s tuned far more for speed than quality. In particular, deinterlacing and scaling aren’t very good (to say the least). So if you’re using WME with a hardware capture card like an Osprey, you’ll get better results doing all the preprocessing on the card, and just passing the final scaled, cropped, deinterlaced YV12 bitmaps off to WME. This also saves some additional CPU cycles for the codec. For file-to-file encoding, you also want to do any deinterlacing and scaling before you import the file into WME. Expression Encoder has great scaling and good deinterlacing, so that’s another reason to use it over WME.
If WMP 9 is all you have on your encoder because that’s all there was when you configured it, you’re going to be running on some very old and very slow hardware! Bear in mind that Moore’s Law predicts you can get twice the computer for your dollar every 18 months or so. So today’s best machines will have about 16x the encoding horsepower of what you could have had when WME was released!
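The 16x figure is just compound doubling over roughly six years. A quick sketch of that arithmetic, with a function name invented for the example:

```python
def moores_law_factor(years: float, doubling_period_years: float = 1.5) -> float:
    """Growth factor after `years`, doubling every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Six years at one doubling every 18 months is four doublings.
    print(moores_law_factor(6))  # 16.0
```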
When tuning hardware, the goal is to enable higher encoder complexity values to be used. Your target is a complexity of at least 3; that’s most of the way towards optimal quality. Lower values sacrifice quite a bit of quality for improved speed, while higher values sacrifice a lot of speed for only a little additional quality improvement. Complexity 4 is optimum when time/performance isn’t a concern. As mentioned above, Complexity 5 in FSDK 11 invalidates some registry keys (not an issue with VC-1 Encoder SDK).
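The complexity guidance in this post boils down to a small decision rule. Here is one way to write it down, as a sketch only: the function and parameter names are hypothetical, and for live encoding it returns the minimum target rather than a guaranteed best value for your hardware.

```python
def recommended_complexity(live: bool,
                           can_set_registry_keys: bool = True,
                           needs_wme_scaling: bool = False) -> int:
    """Summarize this post's FSDK 11 complexity guidance as a rule of thumb."""
    if live:
        # Minimum target for live encoding; go higher only if no frames drop.
        return 3
    if not can_set_registry_keys or needs_wme_scaling:
        # Complexity 5 ignores some registry keys but scales better in WME.
        return 5
    # Offline encoding with registry keys available.
    return 4


if __name__ == "__main__":
    print(recommended_complexity(live=True))
    print(recommended_complexity(live=False))
    print(recommended_complexity(live=False, can_set_registry_keys=False))
```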
For live encoding, the best way to test is to run the encode using real-world sources and make sure you’re not dropping frames at your target frame size. You’ll want to set Video Smoothness to 0 for this test to make sure frames aren’t being dropped for quality reasons instead of performance reasons.
Since the encoder is 4-way threaded, you want to be on at least a 4-core machine. These are dirt cheap these days, with affordable 4-core laptops coming soon. Generally speaking, a 4-core machine is a lot faster than a 2-core machine with a slightly higher clock speed.
Clock speed is a big factor as well, but it only matters in the context of the processor generation. For example, the new Intel Core i7 “Nehalem” CPUs get a whole lot more work done at 3 GHz than the original P4 “NetBurst” CPUs did at 3 GHz.
Memory isn’t a big factor in encoding, as long as you have enough. You never want an encoder to start swapping, as performance will fall through the floor. 2 GB should be plenty for a dedicated encoder, even doing HD encoding.
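As a back-of-the-envelope check on why 2 GB goes a long way, here is a sketch of how much memory raw YV12 frames take. YV12 stores 12 bits per pixel (full-resolution luma plus two quarter-resolution chroma planes). How many frames the encoder actually holds is an assumption here; 16 is used only because it matches the Lookahead setting recommended earlier.

```python
def yv12_frame_bytes(width: int, height: int) -> int:
    """Size of one uncompressed YV12 frame (12 bits per pixel)."""
    return width * height * 3 // 2


def buffered_bytes(width: int, height: int, frames: int) -> int:
    """Memory needed to hold `frames` raw YV12 frames."""
    return yv12_frame_bytes(width, height) * frames


if __name__ == "__main__":
    total = buffered_bytes(1920, 1080, frames=16)
    print(f"16 frames of 1080p YV12: {total / (1024 * 1024):.1f} MiB")
```

Even at 1080p, a 16-frame buffer is on the order of tens of megabytes, so the risk is less about the encoder itself and more about whatever else on the box pushes it into swapping.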
If you’re encoding to higher resolutions and just can’t hit the performance you need at a sufficient quality, check out the Tarari Encoder Accelerator. This offloads a number of the computationally intensive parts of compression to a PCI-X board. While it can’t produce the absolute best quality that a tweaked Complexity 4 + registry key software encode can, it can certainly deliver much higher quality for live HD than pure software, particularly on older computers.
The Tarari board is also fully supported by VC-1 Encoder SDK products, so your investment isn’t tied to a FSDK workflow.
So, that’s how to get best results out of legacy Format SDK encoders, particularly WME. But really, it’s time to start planning a migration to a VC-1 Encoder SDK based encoder. If there’s something you need that they simply don’t provide, please let me know. Future improvements in VC-1 and Windows Media encoding are going to focus on the SDK, so the gap between Windows Media Encoder and the best encoders is going to be growing bigger over time.
And just so they don’t disappear off the bottom of the blog, here are my blog posts about WME and the FSDK 11 codecs that readers may still find useful: