So, I fired up the profiler and SkeletonInstance::resetToPose appeared in the hotspot list. Since in theory it’s not a performance-sensitive function, that felt strange.
Basically, VS thinks the address may be unaligned when storing and uses the movq trick, even though it already knows the address is aligned: aligned operations were performed on the same address a few lines above.
At least VS 2013 uses movdqu instead of movq, which is a major improvement. Because VS is using a movaps-then-movq pattern to store memory “safely”, I don’t need an advanced profiler to tell me that this will cause loads blocked by store forwarding.
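For reference, here is a minimal sketch of the kind of code involved, assuming a pose stored as 16-byte aligned 4-float vectors. `Vec4` and `copyPose` are hypothetical names made up for illustration, not the engine’s actual types or functions. The intrinsics request an aligned load and an aligned store on addresses known to be aligned; which instructions actually get emitted depends on the compiler and its version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Hypothetical stand-in for one row of a bone transform (assumption, not the real type).
struct alignas(16) Vec4 {
    float v[4];
};

// Copies `count` vectors from src to dst.
// Both pointers must be 16-byte aligned, which alignas(16) guarantees for Vec4 objects.
inline void copyPose(Vec4* dst, const Vec4* src, std::size_t count) {
    assert(reinterpret_cast<std::uintptr_t>(dst) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(src) % 16 == 0);

    for (std::size_t i = 0; i < count; ++i) {
#if SKETCH_HAS_SSE
        // Aligned 128-bit load followed by an aligned 128-bit store.
        __m128 row = _mm_load_ps(src[i].v);
        _mm_store_ps(dst[i].v, row);
#else
        // Portable fallback for targets without SSE.
        std::memcpy(dst[i].v, src[i].v, sizeof(Vec4));
#endif
    }
}
```

Since the store goes through `_mm_store_ps`, the compiler has been told the destination is aligned, so a single full-width store is what one would hope to see in the disassembly rather than a split into narrower stores.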
Tomorrow I’ll try to isolate the bug into a test case so I can file a bug report, and see what happens.
Update: A bug report has been filed. Turns out the issue is quite easy to trigger.