-
Notifications
You must be signed in to change notification settings - Fork 469
Abstract all primary key references #698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
lib/ancestry/class_methods.rb
Outdated
| object | ||
| else | ||
| unscoped_where { |scope| scope.find(object.try(primary_key) || object) } | ||
| unscoped_where { |scope| scope.find(object.try(ancestry_identifier_column) || object) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here, primary_key (:id), vs ancestry_identifier_column (:id) -- I'll keep them the same for now, but want to say this to highlight issues in the rest of this pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and find inherently is a find_by(:id => object.id)
So this should read find_by(ancestry_identifier_column => object.read_attribute(ancestry_identifier_column))
As I'm thinking more about this, this getting unwieldy very quickly.
ods are pretty low.
By the way, this method could be defined as def to_node(id_or_object). The try is saying "if this is an Object, then lets fetch the id, otherwise lets assume it is an Integer id", which works because ruby dropped the Object#id maybe 15 years ago.'
And the unscoped_where is there so STI works.
lib/ancestry/class_methods.rb
Outdated
|
|
||
| nodes.each_with_object({}) do |node, arranged| | ||
| children = index[node.id] | ||
| children = index[node.ancestry_identifier_column] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| children = index[node.ancestry_identifier_column] | |
| children = index[node.read_attribute(ancestry_identifier_column)] |
not sure if we want read_attribute or send
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, sorry — I replaced it too naively... I should've used read_attribute instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
id is nice. behind the scenes it does smart stuff. wonder if something similar would work here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on your suggestion, I introduced the method ancestry_identifier_value, which returns read_attribute(ancestry_identifier_column).
And id in Rails method implicitly returns the primary key value, so this ancestry_identifier_value can serve a similar purpose if I'm correct.
By using this method, I can avoid repeatedly writing read_attribute or send whenever I need to fetch the value, which helps keep the code more readable.
lib/ancestry/class_methods.rb
Outdated
| if !node.sane_ancestor_ids? | ||
| raise Ancestry::AncestryIntegrityException, I18n.t("ancestry.invalid_ancestry_column", | ||
| :node_id => node.id, | ||
| :node_id => node.ancestry_identifier_column, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| :node_id => node.ancestry_identifier_column, | |
| :node_id => node.read_attribute(ancestry_identifier_column), |
lib/ancestry/class_methods.rb
Outdated
| unless exists?(ancestor_id) | ||
| raise Ancestry::AncestryIntegrityException, I18n.t("ancestry.reference_nonexistent_node", | ||
| :node_id => node.id, | ||
| :node_id => node.ancestry_identifier_column, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| :node_id => node.ancestry_identifier_column, | |
| :node_id => node.read_attribute(ancestry_identifier_column), |
lib/ancestry/class_methods.rb
Outdated
| node.path_ids.zip([nil] + node.path_ids).each do |node_identifier, parent_identifier| | ||
| parents[node_identifier] = parent_identifier unless parents.key?(node_identifier) | ||
| unless parents[node_identifier] == parent_identifier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should leave as node_id, parent_id -- otherwise this is pretty verbose. What do you think? (I may change my mind later. honestly asking your opinion)
this will also cut down on churn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your comment. Sorry, I mixed up parent_identifier and parent_id...
Here's how I'm currently thinking about this:
-
I would not like to change the naming of interface-level fields like parent_id or node_id.
-
These names are compatible with existing behavior, and semantically they make sense as the identifiers used within the context of ancestry gem.
For this reason, I’ve reverted usage of parent_identifier back to parent_id. I'd appreciate your thoughts on this direction.
lib/ancestry/class_methods.rb
Outdated
| end | ||
| # ... save parent id of this node in parent_ids array if it exists | ||
| parent_ids[node.id] = node.parent_id if exists? node.parent_id | ||
| parent_ids[node.ancestry_identifier_column] = node.parent_id if exists? node.parent_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| parent_ids[node.ancestry_identifier_column] = node.parent_id if exists? node.parent_id | |
| parent_ids[node.read_attribute(ancestry_identifier_column)] = node.parent_id if exists? node.parent_id |
feel like these are getting mixed up. parent_id vs parent_identifier.
Maybe you are right and we need to distinguish between the two.
not liking this identifier vs id idea.
Feeling like the path/breadcrumb test may meet your needs better
lib/ancestry/class_methods.rb
Outdated
| # alias for primary_key | ||
| def ancestry_identifier_column | ||
| primary_key | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea to do this for the short term
lib/ancestry/instance_methods.rb
Outdated
| # Validate that the ancestors don't include itself | ||
| def ancestry_exclude_self | ||
| errors.add(:base, I18n.t("ancestry.exclude_self", class_name: self.class.name.humanize)) if ancestor_ids.include?(id) | ||
| errors.add(:base, I18n.t("ancestry.exclude_self", class_name: self.class.name.humanize)) if ancestor_ids.include?(ancestry_identifier_column) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe you need to introduce ancestry_identifier_value - and any time you see id we use ancestry_identifer_value?
|
Yea, this is getting confusing for me. So parent_id is the id from the parent node, but that is technically the parent's ancestry_identifier_column. So when we have root_id and parent_id, then it is the root_identity field and the parent_identity field. So all lookups of any nodes in the tree need to change. I'm still thinking breadcrumbing may be the way you want. Do you have an example of the code that won't work with the current model? |
|
@a5-stable would your need be met if we had |
|
@kbrock UsecaseOne Example of the use case I’m considering involves a model that uses Temporal Modeling. Temporal models store historical data, so multiple records can represent the lifecycle of a single "resource." For example, using a bitemporal model might look like this: CREATE TABLE employee_department_bitemporal (
id SERIAL PRIMARY KEY, -- Surrogate Key
bitemporal_id INT NOT NULL, -- Identifier for the logical employee
valid_from DATE NOT NULL,
valid_to DATE NOT NULL,
transaction_from TIMESTAMP NOT NULL,
transaction_to TIMESTAMP NOT NULL,
UNIQUE (bitemporal_id, valid_from, transaction_from)
);In this case, id is unique across each version of the record, while bitemporal_id uniquely identifies the logical employee. Now, let’s say that we want to add a hierarchical association to this model. In some cases, it makes more sense for the ancestry column to store values like
In the use case above, the coexistence strategy (having both id/id and identifier/identifier) might work, but I wonder if storing the unused id/id hierarchy leads to unnecessary code and data overhead. public API such as
|
|
Ok, cool concept. So basically the TANGENT: I'd check if creating your indexes with So if you used But also, just saying If you are recording a hierarchy in a point in time, then you want When you move a node in a tree, you need to update a bunch of records. But in your case, you don't really want to modify any records (other than updating the transaction_to and possibly the valid_to dates). You'll need to ensure that updates in the callback do the necessary modification. Tricky edge case. Hopefully the update strategy system would work for you. TANGENT: I'd like to moving towards sql transactions rather than updating the records in ruby and sending those across N+1 style. But time has been limited so that may never occur. But I think it may affect you. TANGENT: Are you writing TANGENT: Wondering if thread local variables ref for All the I agree that you don't want TANGENT: look into how the multi tenant solutions solve this scoping issue since the behavior may be similar to what you want. I notice that multi-tenant gem uses the active support wrapper over thread local while acts_as_tenant uses Do keep in mind, in order to work with STI, the code does throw away default scopes, specifically when fetching a parent, but possibly in other cases, as well. So this may introduce issues depending upon how you implement your solution. A number of people over the years seem to be polar pro dropping scopes vs anti dropping scopes. Ok, that was a little bit of a ramble. Sorry I don't have more time to better organize the thoughts. Hopefully some of that is useful to you. This seems like it would require quite a bit of change. Need some time to process. Adding the |
|
ooh, I've used |
|
@kbrock In my domain, we do require both the transaction and valid time axes—not just for logical deletion like discard, but to properly track the lifecycle of a resource as a historical record. (even in unitemporal use cases, we might want Also, I’m planning to use a multi-tenant-like approach, perhaps with something like the multi-tenant gem to manage valid and transaction times via thread-local variables. Your thoughts on default scopes and STI behavior were particularly helpful—I'll take extra care there. Just to clarify, I shared this use case mainly to help illustrate the idea of supporting custom identifier columns for ancestry—not to require the entire bitemporal model to be supported out of the box. I completely understand that this could lead to significant internal changes, and I’d be happy to help. If you have thoughts on where such a change should start or what kind of interface would make sense, I’d love to hear them and contribute. Again, thank you for your time and insights. |
Yea, in your case, I feel |
|
@kbrock |
|
@a5-stable I am not sure if this is related or totally tangental, and it is still an idea and not a know commodity, but... - def parent_id
- ancestry.split("/").last
- end
+ def parent_id
+ self.class.calculate_parent_id(self, ancestry_column)
+ end
+ def self.calculate_parent_id(rec, col_name)
+ rec.send(col_name).split("/").last
+ endAnd further down the line... def has_ancestry
# ...
define_method(:parent_id)
self.class.calculate_parent_id(self, ancestry_column)
end
# ...
endDoes that change the way this PR is implemented? |
|
@kbrock Either way, I don't see any conflict between that idea and the current PR, so no problem from my side. |
Sample PR for #695
(This pull request introduces a significant refactor to the
ancestrylibrary by replacing direct references toprimary_keyandidwith a new method,ancestry_identifier_column)