Understanding Lookahead:_compute_cfvs() #23
I understand how a parent node computes its CFV in normal CFR, but I'm struggling to understand how this is done in the lookahead's vectorized implementation.
Specifically, in Lookahead:_compute_cfvs():
```lua
function Lookahead:_compute_cfvs()
  for d=self.depth,2,-1 do
    local gp_layer_terminal_actions_count = self.terminal_actions_count[d-2]
    local ggp_layer_nonallin_bets_count = self.nonallinbets_count[d-3]

    --zero out the entries of empty actions at both player indices (dimension 4)
    self.cfvs_data[d][{{}, {}, {}, {1}, {}}]:cmul(self.empty_action_mask[d])
    self.cfvs_data[d][{{}, {}, {}, {2}, {}}]:cmul(self.empty_action_mask[d])

    self.placeholder_data[d]:copy(self.cfvs_data[d])

    --player indexing is swapped for cfvs
    self.placeholder_data[d][{{}, {}, {}, self.acting_player[d], {}}]:cmul(self.current_strategy_data[d])

    --sum over dimension 1, writing the result into regrets_sum[d]
    torch.sum(self.regrets_sum[d], self.placeholder_data[d], 1)

    --use a swap placeholder to change {{1,2,3}, {4,5,6}} into {{1,2}, {3,4}, {5,6}}
    local swap = self.swap_data[d-1]
    swap:copy(self.regrets_sum[d])
    self.cfvs_data[d-1][{{gp_layer_terminal_actions_count+1, -1}, {1, ggp_layer_nonallin_bets_count}, {}, {}, {}}]:copy(swap:transpose(2,3))
  end
end
```

So I understand that we multiply empty (illegal) actions by the mask so their CFV is zero, but I'm lost about what the swap and transpose are doing, and how slicing the parent's cfvs_data copies things to the right place.
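To make the confusion concrete, here is how I currently read the in-code comment, as a toy sketch in plain Python rather than Torch. It assumes that `swap:copy(...)` only needs matching element counts and fills the destination in row-major order (so it acts like a reshape), and that the sliced `cfvs_data[d-1]` is a view, so copying into it writes into the parent tensor. The helper names (`copy_into_shape`, `transpose`, `write_block`), the 2-D shapes, and the `terminal_actions` value are all made up for illustration; the real tensors are 5-D and the transpose is over dimensions 2 and 3.

```python
def copy_into_shape(src, rows, cols):
    """Stand-in for a shape-agnostic copy: fill a rows x cols buffer with
    src's elements in row-major order (element counts must match)."""
    flat = [x for row in src for x in row]
    assert len(flat) == rows * cols, "element counts must match"
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def transpose(m):
    """Swap the two axes of a 2-D nested list."""
    return [list(col) for col in zip(*m)]


def write_block(dest, block, row_offset):
    """Write block into dest starting at row_offset and column 0, leaving all
    other entries untouched (stand-in for copying into a sub-tensor view)."""
    for r, row in enumerate(block):
        dest[row_offset + r][:len(row)] = row


# Values summed over the action dimension (toy numbers from the code comment).
summed = [[1, 2, 3], [4, 5, 6]]

# Step 1: copying into a differently shaped buffer regroups the elements.
swap = copy_into_shape(summed, 3, 2)      # [[1, 2], [3, 4], [5, 6]]

# Step 2: the transpose swaps the two axes of that buffer.
swapped = transpose(swap)                 # [[1, 3, 5], [2, 4, 6]]

# Step 3: write into the parent's non-terminal rows only.
terminal_actions = 1                      # hypothetical count of terminal actions
parent = [[0, 0, 0, 0] for _ in range(3)] # hypothetical parent layer, 3 x 4
write_block(parent, swapped, terminal_actions)

print(swap)
print(swapped)
print(parent)
```

If this reading is right, the first row of the parent (the terminal action) and the last column (outside the non-all-in range) are left alone, and only the sliced block receives the children's summed values.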
Can anyone explain more clearly?