question about APL calculation formula #4

@tang-ji

Hi,

I read your paper and your code, but I am confused about your "average_path_length" function in train.py.

According to the definition in your paper, APL is the Average-Path-Length, and "Path-Length counts how many nodes are needed to make a specific input to an output node in the provided decision tree."

In the code, however, the "average_path_length" function appears to compute the average sum of the indexes of all the nodes that a given input passes through, rather than its path length. You can see the effect in figure (d) on page 8 of the paper: how could the value of APL be over 60? (For a balanced tree, that would mean more than $2^{60}$ leaves.)

In my opinion, the function could be modified as follows to match the definition of APL in the paper:

    def average_path_length(self, X_train, y_train):
        tree = self.fit_tree(X_train, y_train)
        # Compute average path length
        path_length = np.sum(tree.decision_path(X_train)) / float(X_train.shape[0])
        return path_length
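To make the difference concrete, here is a minimal sketch on toy data. It assumes a scikit-learn DecisionTreeClassifier; the data, the helper names `path_length_apl` and `mean_leaf_index`, and the index-based quantity are my own illustration and are not copied from train.py. It only shows that a count of visited nodes is bounded by the tree depth plus one, while an index-based quantity scales with the number of nodes.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data (an assumption for illustration, not the paper's dataset).
rng = np.random.RandomState(0)
X = rng.rand(200, 4)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X, y)


def path_length_apl(tree, X):
    """Average number of nodes visited per sample (root and leaf included)."""
    # decision_path returns a sparse (n_samples, n_nodes) indicator matrix,
    # so its total sum is the number of visited nodes over all samples.
    return tree.decision_path(X).sum() / float(X.shape[0])


def mean_leaf_index(tree, X):
    """Average index of the leaf each sample lands in (hypothetical contrast)."""
    # apply returns, for each sample, the index of the leaf it ends up in.
    return tree.apply(X).mean()


apl = path_length_apl(tree, X)
idx = mean_leaf_index(tree, X)
print("average path length:", apl)
print("mean leaf index:", idx)
print("tree depth:", tree.tree_.max_depth, "node count:", tree.tree_.node_count)
```

Note that decision_path counts the root and the leaf, so a sample that reaches a leaf at depth d contributes d + 1 nodes.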

Am I misunderstanding the definition of APL? I look forward to your reply. Thank you.
