From: "Zebediah Figura (she/her)" Subject: Re: [PATCH vkd3d v6 5/6] vkd3d-shader/hlsl: Handle conditionals in copy propagation. Message-Id: Date: Tue, 16 Nov 2021 22:41:22 -0600 In-Reply-To: <20211116190033.1889518-5-gmascellani@codeweavers.com> References: <20211116190033.1889518-1-gmascellani@codeweavers.com> <20211116190033.1889518-5-gmascellani@codeweavers.com> On 11/16/21 13:00, Giovanni Mascellani wrote: > Signed-off-by: Giovanni Mascellani > --- > libs/vkd3d-shader/hlsl_codegen.c | 157 +++++++++++++++++++++++++++++-- > 1 file changed, 149 insertions(+), 8 deletions(-) > I'd been putting off these two patches because I thought they would be harder and I wanted to get the basic parts in first, but I had a look at them anyway and they are actually pretty simple. So, nice job :-) I do have some comments, however... > diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c > index 42422220..775fec19 100644 > --- a/libs/vkd3d-shader/hlsl_codegen.c > +++ b/libs/vkd3d-shader/hlsl_codegen.c > @@ -255,7 +255,18 @@ static void replace_node(struct hlsl_ir_node *old, struct hlsl_ir_node *new) > * updated. When scanning through a load, it is checked if all the > * registers involved in the load come from a single node. In such > * case, the store can be replaced with a swizzle based on that > - * node. */ > + * node. > + * > + * All of the above works when we disregard control flow. With control > + * flow it becames slightly more complicated: instead of a single map > + * we keep a stack of them, pushing a new entry each time we enter an > + * embedded block, and popping the entry when leaving the block. Typo, 'becames'. Also, this is nitpicky, but the first sentence is a little weird when the earlier comment contains the phrase 'because control flow forces us to drop information'. > + * > + * When entering a conditional block, both branches ("then" and > + * "else") can inherit the variable state available just before the > + * conditional block. After the conditional block, all variables that > + * might have been written in either branch must be invalidated, > + * because we don't know which branch has executed. */ > > struct copy_propagation_value > { > @@ -272,7 +283,9 @@ struct copy_propagation_variable > > struct copy_propagation_state > { > - struct rb_tree variables; > + struct rb_tree *variables; > + unsigned int depth; > + unsigned int capacity; > }; As an alternative, you could create a new copy_propagation_state struct on stack in copy_propagation_process_if(). > > static int copy_propagation_variable_compare(const void *key, const struct rb_entry *entry) > @@ -293,7 +306,7 @@ static void copy_propagation_variable_destroy(struct rb_entry *entry, void *cont > static struct copy_propagation_variable *copy_propagation_get_variable(struct copy_propagation_state *state, > struct hlsl_ir_var *var) > { > - struct rb_entry *entry = rb_get(&state->variables, var); > + struct rb_entry *entry = rb_get(&state->variables[state->depth], var); > > if (entry) > return RB_ENTRY_VALUE(entry, struct copy_propagation_variable, entry); > @@ -304,7 +317,7 @@ static struct copy_propagation_variable *copy_propagation_get_variable(struct co > static struct copy_propagation_variable *copy_propagation_create_variable(struct hlsl_ctx *ctx, > struct copy_propagation_state *state, struct hlsl_ir_var *var) > { > - struct rb_entry *entry = rb_get(&state->variables, var); > + struct rb_entry *entry = rb_get(&state->variables[state->depth], var); > struct copy_propagation_variable *variable; > int res; > > @@ -323,7 +336,7 @@ static struct copy_propagation_variable *copy_propagation_create_variable(struct > return NULL; > } > > - res = rb_put(&state->variables, var, &variable->entry); > + res = rb_put(&state->variables[state->depth], var, &variable->entry); > assert(!res); > > return variable; > @@ -354,6 +367,99 @@ static void copy_propagation_set_value(struct copy_propagation_variable *variabl > } > } > > +static void copy_propagation_invalidate_from_block(struct hlsl_ctx *ctx, struct copy_propagation_state *state, > + struct hlsl_block *block) > +{ > + struct hlsl_ir_node *instr; > + > + LIST_FOR_EACH_ENTRY(instr, &block->instrs, struct hlsl_ir_node, entry) > + { > + switch (instr->type) > + { > + case HLSL_IR_STORE: > + { > + struct hlsl_ir_store *store = hlsl_ir_store(instr); > + struct copy_propagation_variable *variable; > + struct hlsl_deref *lhs = &store->lhs; > + struct hlsl_ir_var *var = lhs->var; > + unsigned int offset; > + > + variable = copy_propagation_get_variable(state, var); > + if (!variable) > + continue; > + > + if (hlsl_offset_from_deref(lhs, &offset)) > + copy_propagation_set_value(variable, offset, store->writemask, NULL); > + else > + copy_propagation_invalidate_whole_variable(variable); > + > + break; > + } > + > + case HLSL_IR_IF: > + { > + struct hlsl_ir_if *iff = hlsl_ir_if(instr); > + > + copy_propagation_invalidate_from_block(ctx, state, &iff->then_instrs); > + copy_propagation_invalidate_from_block(ctx, state, &iff->else_instrs); > + > + break; > + } > + > + case HLSL_IR_LOOP: > + { > + struct hlsl_ir_loop *loop = hlsl_ir_loop(instr); > + > + copy_propagation_invalidate_from_block(ctx, state, &loop->body); > + > + break; > + } > + > + default: > + break; > + } > + } > +} > + > +static void copy_propagation_pop(struct copy_propagation_state *state) > +{ > + assert(state->depth > 0); > + rb_destroy(&state->variables[state->depth], copy_propagation_variable_destroy, NULL); > + --state->depth; > +} > + > +static bool copy_propagation_duplicate(struct hlsl_ctx *ctx, struct copy_propagation_state *state) > +{ > + struct copy_propagation_variable *var; > + > + if (state->depth + 1 == state->capacity) > + { > + unsigned int new_capacity = 2 * state->capacity; > + struct rb_tree *new_vars; > + > + new_vars = hlsl_realloc(ctx, state->variables, sizeof(*state->variables) * new_capacity); > + if (!new_vars) > + return false; > + state->capacity = new_capacity; > + state->variables = new_vars; > + } This looks like an open-coded hlsl_array_reserve(); any reason not to use that? > + ++state->depth; > + > + rb_init(&state->variables[state->depth], copy_propagation_variable_compare); > + > + RB_FOR_EACH_ENTRY(var, &state->variables[state->depth - 1], struct copy_propagation_variable, entry) > + { > + struct copy_propagation_variable *new_var = copy_propagation_create_variable(ctx, state, var->var); > + > + if (!new_var) > + return false; > + > + memcpy(new_var->values, var->values, sizeof(*var->values) * var->var->data_type->reg_size); > + } > + > + return true; > +} > + One potential alternative that occurred to me: instead of duplicating the whole state, we can search backwards in each parent scope in copy_propagation_get_variable(). Notably, in the case of a loop, we'd hit NULL before a "real" variable, and return NULL. Assuming that works it'd reduce the amount of allocation we have to do.