How does MSBuild's batching algorithm work?
Terminology:
* Metadata. This refers to the set of arbitrary user-defined name/value string pairs associated with a particular item. Note: The term "metadata" is also used in the singular, referring to a single name/value pair. Even though the grammatically correct singular term would be "metadatum", we don't use it because it is unfamiliar to many people.
* Metadata reference. This is the syntax beginning with a percent sign that is used to refer to item metadata inside an MSBuild project file. For example, either %(metadataname) or %(itemname.metadataname).
* Qualified metadata reference. This is a metadata reference that specifies the item name, as in %(itemname.metadataname).
* Unqualified metadata reference. This is a metadata reference that does not specify the item name, as in %(metadataname).
* Item list. A single list of items all with the same item name.
* Item reference. This is the syntax beginning with an at-sign that is used to refer to an item list inside an MSBuild project file. For example, @(itemname), or @(itemname->'transformexpression'), or @(itemname, 'separator'), or @(itemname->'transformexpression', 'separator').
* Transform expression. Within an item reference that uses the arrow (->) to transform the list into a new list, the transform expression is the portion within the single quotes following the arrow.
* Batching tag. The blob of XML (or portion thereof) that we will be batching. We support target-level and task-level batching today. For task-level batching, the batching tag is the entire <Task> element, including any child <Output> elements. For target-level batching, the batching tag consists of just the Inputs and Outputs attributes on the <Target> element.
* Batched item list. A batched item list is an item list that will be divided up into buckets, such that only a subset of the items in that list will be passed in to each invocation of the target or task.
* Wholesale item list. This is an item list that will not be split up, meaning that every single item in the list will be passed into every invocation of that target or task.
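To make the last two terms concrete, here is a minimal sketch of a task that consumes one batched and one wholesale item list. The item names (Source, Reference), the Culture metadata, and the target name are hypothetical and chosen only for illustration.

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <Source Include="a.cs"><Culture>en</Culture></Source>
    <Source Include="b.cs"><Culture>fr</Culture></Source>
    <Reference Include="x.dll"/>
    <Reference Include="y.dll"/>
  </ItemGroup>
  <Target Name="Show">
    <!-- %(Source.Culture) is a qualified metadata reference, so Source is batched. -->
    <!-- Reference is consumed only through an item reference, so it is wholesale. -->
    <Message Text="Culture=%(Source.Culture) Sources=@(Source) References=@(Reference)"/>
  </Target>
</Project>
```

Under the definitions above, the Message task should run twice: once with a.cs and once with b.cs, while each invocation sees the full Reference list (x.dll;y.dll).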
Algorithm:
Preparation.
- Collect a list of the consumed item names from all the item references in the batching tag. Call this ConsumedItemNames.
- Collect a list of all the metadata references in the batching tag, both qualified and unqualified. Do not include metadata references within a transform expression. Call this ConsumedMetadataReferences.
- Take all the unqualified metadata references in ConsumedMetadataReferences, and call this ConsumedUnqualifiedMetadataReferences.
- Take all the qualified metadata references in ConsumedMetadataReferences, and call this ConsumedQualifiedMetadataReferences.
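The exclusion of transform expressions is easy to miss, so here is a small sketch contrasting the two cases. The item name Source and the target name are hypothetical; %(Filename) is the special metadata mentioned later in this page.

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <Source Include="a.cs"/>
    <Source Include="b.cs"/>
  </ItemGroup>
  <Target Name="ShowObjects">
    <!-- The only metadata reference sits inside a transform expression, -->
    <!-- so ConsumedMetadataReferences is empty and the task is not batched. -->
    <Message Text="@(Source->'%(Filename).obj', ' ')"/>
    <!-- Here the metadata reference is outside any transform, -->
    <!-- so it is collected and Source becomes a batched item list. -->
    <Message Text="%(Source.Filename).obj"/>
  </Target>
</Project>
```

The first Message should run once and print the whole transformed list (a.obj b.obj); the second should run once per distinct Filename value, printing a.obj and b.obj in separate invocations.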
Determining which item lists to batch.
- For each metadata reference in ConsumedUnqualifiedMetadataReferences, make sure that every item in every item list in ConsumedItemNames has a non-null value for that metadata. If not, throw an error, because unqualified metadata references are not allowed unless this criterion is met. Assuming this check passes, and there is at least one unqualified metadata reference in ConsumedUnqualifiedMetadataReferences, then add all item names from ConsumedItemNames to BatchedItemNames.
- For each item name represented in ConsumedQualifiedMetadataReferences, add the item name to BatchedItemNames.
- For each item name represented in ConsumedItemNames that is not represented in BatchedItemNames, place that item name in WholesaleItemNames (i.e., WholesaleItemNames = ConsumedItemNames - BatchedItemNames).
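As an illustration of the unqualified case, the following sketch uses %(Culture) without an item name, which pulls every consumed item list into the batched set. The item names (Source, Resource), the Culture metadata, and the target name are hypothetical.

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <Source Include="a.cs"><Culture>en</Culture></Source>
    <Source Include="b.cs"><Culture>fr</Culture></Source>
    <Resource Include="a.resx"><Culture>en</Culture></Resource>
    <Resource Include="b.resx"><Culture>fr</Culture></Resource>
  </ItemGroup>
  <Target Name="Show">
    <!-- %(Culture) is unqualified, so both Source and Resource are batched, -->
    <!-- and every item in both lists must define Culture. -->
    <Message Text="%(Culture): @(Source) @(Resource)"/>
  </Target>
</Project>
```

Following the rules above, this should yield one invocation for "en" (a.cs and a.resx) and one for "fr" (b.cs and b.resx). Adding a Resource item with no Culture metadata would violate the criterion in the first step and should produce an error rather than a third bucket.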
Bucketizing.
- For each item list represented in BatchedItemNames, and then for each item within that list, get the values for that item for each of the metadata in ConsumedMetadataReferences. In the table of metadata values, %(MyItem.MyMetadata) would get a separate entry from %(MyMetadata), even though the metadata name is the same. In this step (and only this step), a "null" value for metadata is treated as an empty string.
- For each unique set of metadata values, create a bucket. Into each bucket, place the items that have that exact set of metadata values.
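A short sketch of bucketizing on more than one metadata value follows. The item name Source, the metadata names Culture and Kind, and the target name are hypothetical.

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <Source Include="a.cs"><Culture>en</Culture><Kind>lib</Kind></Source>
    <Source Include="b.cs"><Culture>en</Culture><Kind>lib</Kind></Source>
    <Source Include="c.cs"><Culture>en</Culture><Kind>exe</Kind></Source>
    <Source Include="d.cs"><Culture>fr</Culture><Kind>lib</Kind></Source>
  </ItemGroup>
  <Target Name="Show">
    <!-- Buckets are keyed on the pair (Culture, Kind), not on either value alone. -->
    <Message Text="%(Source.Culture)/%(Source.Kind): @(Source)"/>
  </Target>
</Project>
```

There are three unique (Culture, Kind) pairs here, so three buckets should result: en/lib holding a.cs and b.cs, en/exe holding c.cs, and fr/lib holding d.cs.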
Run!
- For each bucket created above, execute the target or task. The task will receive only the items represented in that bucket, as well as all of the items in all of the item lists represented in WholesaleItemNames. Metadata references will be expanded according to the values associated with that bucket.
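The same steps apply at the target level, where the batching tag is the target's attributes rather than a task element. The sketch below assumes a hypothetical Source item list with Culture metadata and uses only the Outputs attribute to trigger batching.

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <Source Include="a.cs"><Culture>en</Culture></Source>
    <Source Include="b.cs"><Culture>en</Culture></Source>
    <Source Include="c.cs"><Culture>fr</Culture></Source>
  </ItemGroup>
  <!-- The qualified reference in Outputs makes Source a batched item list -->
  <!-- for the whole target, so the target body runs once per bucket. -->
  <Target Name="ShowPerCulture" Outputs="%(Source.Culture)">
    <Message Text="@(Source)"/>
  </Target>
</Project>
```

Here the target should execute twice, and within each execution @(Source) should contain only that bucket's items: a.cs;b.cs for "en", then c.cs for "fr".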
Determining metadata values for an item.
Given an item and a metadata reference, what is the value of that metadata? ...
* If it is a qualified metadata reference, then ...
  * If the item's name matches the item name in the metadata reference, then ...
    * Return the value for that metadata (empty string is allowed), or return null if that metadata is not defined on that item.
  * If the item name doesn't match, return null.
* If it is an unqualified metadata reference, return the value for that metadata (empty string is allowed), or return null if that metadata is not defined on that item.
Note: Special metadata like %(Filename) is considered to be "defined" on every single item, even if the item's Include spec doesn't look like a valid filesystem path. The complete list of special metadata for Whidbey is at BuiltInItemAttributes.
Quick Example
Let's illustrate this with a simple example where we want to display the elements of an ItemGroup using the Exec task (echo ...).
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <ExecArgs Include="arg1"/>
    <ExecArgs Include="arg2"/>
    <ExecArgs Include="arg3"/>
  </ItemGroup>
  <Target Name="Exec">
    <Exec Command="echo %(ExecArgs.Filename)"/>
  </Target>
</Project>
Here's the output:
>msbuild
Microsoft (R) Build Engine Version 2.0.xxx
[Microsoft .NET Framework, Version 2.0.xxx]
Copyright (C) Microsoft Corporation 2004. All rights reserved.
Target "Exec" in project "x.proj"
Task "Exec"
echo arg1
arg1
Task "Exec"
echo arg2
arg2
Task "Exec"
echo arg3
arg3