New WMCmd.vbs with improved multithreading and new presets

Alex Zambelli has updated his invaluable WMCmd.vbs yet again. The three main new features are improved support for running multiple instances at once (great with my new 8-core Barcelona workstation), an explicit QP mode for 1-pass VBR, and refactored presets for different compression levels.

The full details are in the Readme, but here's my summary, with some elaboration along the way.

For running multiple instances, the script sets the registry keys, starts the encode, and then reverts the keys immediately after the encode starts. This should reduce the chance of the wrong keys being set during an encode (not that I've had any problems with that in the last six months or so).
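The set, start, revert sequence described above can be sketched in a few lines. This is only an illustration of the pattern, not the script's actual code: the function name and the three callables (`read_setting`, `write_setting`, `start_encode`) are hypothetical stand-ins for whatever reads and writes the registry values and launches the encoder, and it assumes the encoder picks up the settings when it starts.

```python
def launch_with_temporary_settings(settings, read_setting, write_setting, start_encode):
    """Apply settings, start an encode, then put the old values back right away.

    All three callables are hypothetical stand-ins:
      read_setting(name)         -> current value, or None if the value is not set
      write_setting(name, value) -> store a value; a value of None means "remove it"
      start_encode()             -> launch the encoder and return without waiting
    """
    # Remember what was there before so other instances see it again.
    saved = {name: read_setting(name) for name in settings}
    try:
        for name, value in settings.items():
            write_setting(name, value)
        handle = start_encode()
    finally:
        # Revert even if the launch failed, so stale values never linger.
        for name, old_value in saved.items():
            write_setting(name, old_value)
    return handle
```

The point of the `try`/`finally` is that the window in which the temporary values are visible to other instances is as short as the launch itself, whether or not the launch succeeds.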

QP is a measure of how compressed the image is, with lower numbers being less compressed. For the explicit QP mode, the script now lets you specify the QP you want directly, instead of making you provide a value on a 0-100 scale and know the magical translation table. Why is this useful? Well, most readers probably don't have an intuitive sense of what quantization parameter they want to use, but it's there for those who do. And better yet, it gives you a chance to understand how QP works.

Reading the above, it's clear I need to do a blog post on QP and how to use it. That will reveal the mysteries of the subtle "Quality" control for WMV 1-pass CBR modes.

Lastly, we have preset refactoring, where Alex has cleaned up which combination of settings gets applied for different encode time targets. You can think of these as an extension of the old "Complexity" slider, applying yet more options and getting better results overall. We'll be sharing these recommendations with vendors using our VC-1 Encoder SDK. These new modes are worth some detailed discussion:

fast: Up to 1.5x faster than default with comparable quality.
-v_complexity 2
-v_bframedist 1
-v_lookahead 16
-v_loopfilter 1
-v_overlap 1


Even for the fastest mode, we don't mess with Complexity 1 (the live default in Windows Media Encoder, but very rarely needed even for live encoding on a modern system). And we can use features that help quality a lot without much CPU hit like B-Frames and Lookahead. For any Main or Advanced Profile encode, B-Frames are almost always a big plus. And Lookahead should be used for all 1-pass encodes (there's no downside to having it set for 2-pass encodes; it's ignored).

good: Up to 1.5x slower than default.
-v_complexity 3
-v_bframedist 1
-v_lookahead 16
-v_loopfilter 1


A little higher complexity, and Overlap is off. Overlap causes the image to get softer, so ideally it won't be needed. But for aggressive bitrates, it might be needed with any preset.

better: Up to 2.5x slower than default.
-v_complexity 3
-v_bframedist 1
-v_lookahead 16
-v_loopfilter 1
-v_mslevel 1
-v_msrange 0

We add Integer Chroma Search (-v_mslevel 1), which can help a lot with animation and motion graphics, and adaptive motion search range (-v_msrange 0), which helps with higher resolutions and higher motion.

best: Up to 4.5x slower than default.
-v_complexity 5
-v_bframedist 1
-v_lookahead 16
-v_loopfilter 1
-v_msrange 0

Complexity jumps from 3 to 5. MSLevel isn't specified because Complexity 5 is a special case: it has hardcoded amounts of both chroma search and Hadamard motion match that can't otherwise be specified. The nice thing about Complexity 5 is that it can provide some of the quality gains of using registry keys on machines where those can't be set. On its own, however, Complexity 5 doesn't set B-frames or Lookahead, so the "better" preset would generally look better and encode faster than a default "Complexity 5".

insane: The slowest and highest quality preset.
-v_complexity 4
-v_bframedist 1
-v_lookahead 16
-v_loopfilter 1
-v_mslevel 2
-v_msrange 0
-v_mmatch 0

And lastly, Insane. Note this goes back down to Complexity 4, which allows us to specify a Full Chroma Search and adaptive SAD/Hadamard Motion Match. This is both better and slower than Complexity 5.

And Insane is what I use for most of my encodes, personally.
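To make the differences between the five presets easier to see, here is a small sketch that records the flag lists exactly as given above and compares them. The flag names and values come straight from the lists in this post; the `PRESETS` table and the two helper functions are my own illustration, not part of WMCmd.vbs.

```python
# Flag values copied from the preset lists above; the table itself is illustrative.
PRESETS = {
    "fast":   {"v_complexity": 2, "v_bframedist": 1, "v_lookahead": 16,
               "v_loopfilter": 1, "v_overlap": 1},
    "good":   {"v_complexity": 3, "v_bframedist": 1, "v_lookahead": 16,
               "v_loopfilter": 1},
    "better": {"v_complexity": 3, "v_bframedist": 1, "v_lookahead": 16,
               "v_loopfilter": 1, "v_mslevel": 1, "v_msrange": 0},
    "best":   {"v_complexity": 5, "v_bframedist": 1, "v_lookahead": 16,
               "v_loopfilter": 1, "v_msrange": 0},
    "insane": {"v_complexity": 4, "v_bframedist": 1, "v_lookahead": 16,
               "v_loopfilter": 1, "v_mslevel": 2, "v_msrange": 0, "v_mmatch": 0},
}


def preset_args(name):
    """Expand a preset into a flat list of command-line arguments."""
    args = []
    for flag, value in PRESETS[name].items():
        args += ["-" + flag, str(value)]
    return args


def preset_diff(a, b):
    """Return {flag: (value_in_a, value_in_b)} for flags that differ.

    None means the preset does not set that flag at all.
    """
    first, second = PRESETS[a], PRESETS[b]
    flags = list(first) + [f for f in second if f not in first]
    return {f: (first.get(f), second.get(f))
            for f in flags if first.get(f) != second.get(f)}
```

For example, `preset_diff("good", "better")` reports only `v_mslevel` and `v_msrange`, which matches the description of "better" as "good" plus chroma search and adaptive search range. `preset_diff("best", "insane")` shows the step back from Complexity 5 to 4 alongside the explicit `v_mslevel` and `v_mmatch` settings.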

I'll sometimes use what I think of as "Hyper Insane", which is turning -v_numthreads down to 1 for a very slight further improvement. Also, four simultaneous single-threaded encodes get through more work than a single 4-thread encode on the same hardware. Which is why I wind up using multiple instances so much: for a huge batch of files to encode, I'll be done sooner running 4 simultaneous single-threaded encodes.
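Here is a rough sketch of that batch approach: take a list of per-file encode commands, force each to one thread, and keep four of them running at a time. The only WMCmd.vbs flag it uses is -v_numthreads, from above; the helper names are hypothetical, and each command is assumed to be whatever full command line you already use to encode one file.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def single_threaded(command):
    """Return a copy of an encode command with the thread count forced to 1."""
    return list(command) + ["-v_numthreads", "1"]


def encode_batch(commands, workers=4, run=subprocess.call):
    """Run every command, at most `workers` at a time.

    Returns the exit codes in the same order as `commands`.
    `run` is injectable so the scheduling can be tried without launching anything.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, commands))
```

The Python threads here do nothing but wait on the external encoder processes, so a thread pool is enough; the actual work happens in the four separate encoder instances.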

 
