This repository was archived by the owner on Apr 27, 2023. It is now read-only.
reduce reliance on mutation in conversion process #105
Open
jordanverasamy wants to merge 17 commits into master
surajreddy (Contributor) reviewed on Oct 26, 2018 and left a comment:
I like the approach so far. 👍
06ff53a to 35edc8f
Problem I'm trying to solve
Anyone anywhere can mutate the Shopify record
Here's how the convert pipeline works on `master`:
When the convert process goes through each pipeline stage in sequence and calls `stage.convert(row, record)`, that pipeline stage is expected to mutate `record`.
This is accomplished in several different ways. Most `#convert` functions use `record.merge!` to mutate the record, adding the new data parsed from the hash.
However, it turns out that every `AttributesAccumulator` also mutates the variable that is used to initialize it. A lot of our code was relying on this fact, because the accumulator was adding data to the record that was never `merge!`'d in.
In addition, some pipeline stages were mutating the record in other ways. For example, the `VariantImage` pipeline stage was mutating both the parent object and the corresponding variant.
Magento data always gets inserted into the Shopify record, then erased later
On a separate note, I noticed that our `record` was sometimes polluted with Magento data. This made it very confusing to determine what was going on, since the Shopify record would be a giant hash with ~10 keys of Shopify-formatted data and 50+ keys of Magento-formatted data.
It turns out that this was happening because the `VariantAttributes` pipeline stage was inserting the entire Magento hash for that simple product into the variants list of the Shopify record we were accumulating. This wasn't affecting the output, since we have a key whitelist in our CSV builder, but it made debugging very annoying.
Proposed solution
I'm proposing a change to the way our pipeline stages are expected to work.
Restructure the way pipeline stages are wired together
The `#convert` function of a pipeline stage is no longer expected to mutate anything. Instead, the return value of one pipeline stage is passed along as the input to the next pipeline stage.
This is accomplished by changing our call to `@pipeline_stages.each` into a call to `@pipeline_stages.reduce`.
This means that if you want to know where some data gets added to the record, you no longer have to search everywhere for the line of code that could have mutated it. You simply look at the data being passed between pipeline stages, since nothing else could be modifying that record.
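A minimal sketch of the `reduce`-based wiring, to make the data flow concrete. The stage classes, the `run_pipeline` helper, and the sample row are hypothetical and are not taken from this repository; only the `stage.convert(row, record)` call shape and the use of `reduce` come from the description above.

```ruby
# Hypothetical stages for illustration only. Each #convert returns a new
# record and leaves its input untouched (Hash#merge, not Hash#merge!).
class TitleStage
  def convert(row, record)
    record.merge('title' => row['name'])
  end
end

class PriceStage
  def convert(row, record)
    record.merge('price' => row['price'])
  end
end

# Hypothetical helper: threads the record through the stages with reduce,
# so the return value of one stage becomes the input of the next.
def run_pipeline(pipeline_stages, row, initial_record)
  pipeline_stages.reduce(initial_record) do |record, stage|
    stage.convert(row, record)
  end
end

row = { 'name' => 'T-shirt', 'price' => '19.99' }
initial = {}.freeze # frozen, so any accidental mutation would raise
result = run_pipeline([TitleStage.new, PriceStage.new], row, initial)
```

Freezing the initial record is just a cheap way to prove in this sketch that no stage mutates its input; a stage that called `merge!` would raise a `FrozenError` instead of silently changing the record.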
The goal of this change is to ensure that you can ALWAYS simply look at the input and return records of any `#convert` function and know what it did.
Restructure the RecordBuilder to match the new functional design
One slightly weird implementation detail of this is that the `RecordBuilder`, as written, actually expects that the record it yields will be mutated. I've added a bit of code to the `RecordBuilder` so that, instead of expecting the record to be mutated with all the new data, it expects the block it is passed to return a new record with all the new data, and adjusts its instances table accordingly.
Don't let accumulators ever mutate external objects
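To make the aliasing problem concrete, here is a minimal sketch of an accumulator that either shares or copies the object it is initialized with. `SketchAccumulator`, its `copy:` flag, and the sample keys are hypothetical stand-ins, not the real `AttributesAccumulator`; the sketch only assumes the accumulator builds up a hash in an `@output` instance variable.

```ruby
# Hypothetical stand-in for illustration; the real class is not shown here.
class SketchAccumulator
  attr_reader :output

  def initialize(initial, copy: true)
    # copy: false reproduces the old behaviour (the caller's hash is shared);
    # copy: true takes a shallow copy with dup, so the caller's hash is safe.
    @output = copy ? initial.dup : initial
  end

  def accumulate(attributes)
    @output.merge!(attributes) # only ever touches @output
    self
  end
end

# Old behaviour: the caller's record silently gains the accumulated keys.
aliased_record = { 'handle' => 'shirt' }
SketchAccumulator.new(aliased_record, copy: false).accumulate('title' => 'Shirt')

# With dup: the caller's record is unchanged; the data lives in #output.
copied_record = { 'handle' => 'shirt' }
copied = SketchAccumulator.new(copied_record).accumulate('title' => 'Shirt')
```

Note that `dup` makes a shallow copy: top-level keys are protected, but nested hashes or arrays inside the record would still be shared with the caller.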
The `AttributesAccumulator` class now calls `.dup` on the object passed to the initializer. This means that as the accumulator slurps up data from hashes and accumulates it in the `@output` instance variable, the internal state of the `AttributesAccumulator` is being built up, but the variable that was initially passed in is no longer being modified.
Rewrite pipeline stages to match the new structure
Remove the `VariantAttributes` pipeline stage. The purpose of this stage was simply to make sure that the right variants got populated to the right parent object, but this can actually be done very easily from the `TopLevelVariantAttributes` pipeline stage instead. This means we can remove this pipeline stage entirely.
A side effect of this is that the Shopify record is never deliberately polluted with keys that we don't expect to make it into the final CSV (except for the Magento product IDs, which are required for the association of simple products to configurable products). This makes the record data easier to read as it gets incrementally built up by pipeline stages.
Rewrite the `VariantImage` pipeline stage to not mutate anything.
Checklist:
Related issues:
Closes: