Efficient Selection of Vector Instructions Using Dynamic Programming
Created on 2022-01-27T20:39:50-06:00
Create dependency graph of values at a basic block level
Dependency graph is used to determine degrees of freedom for reorganizing the formula
Filter functions run which transform the dependency graph
Scoring metrics are selected to rate the current state of a dependency graph.
Unrelated formulas that share operands should be combined if it would not create issues. For example decomposing two complex formulas in to individual add and multiply nodes and then having the add and multiplies done for all expressions.
Scoring metrics determine where to put scalars. When is it better to pack and unpack the SIMD registers versus using shuffling operators.
Determining the order to run filters, when to shuffle/swizzle and migrate terms, etc, is delegated to a Knuth-Plass style "dynamic programming" engine. It will explore promising optimization steps
TODO implement this algorithm so we can vectorize some crypto and art code
20220417 Came back to download and skim the paper.
20230504 Came back to re-read the paper.