-
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let context be merged into import data #27
base: main
Are you sure you want to change the base?
Conversation
b32e050
to
bfb3d44
Compare
In some cases, the given import data will not be enough to create valid Active Record objects. Developers may need to seed the import data to allievate errors related to required `belongs_to` associations, and so on. This commit provides an interface for this.
bfb3d44
to
5bbd1b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for opening this. I'm interested in exploring this more, but in the meantime, I was wondering if the existing interface could support relationships?
csv_string = CSV.generate do |csv|
csv << %w[title1 content1 1]
csv << %w[title2 content2 1]
end
ArtVandelay::Import.new(:posts, rollback: true)
.csv(
csv_string,
headers: [:title, :content, :user_id],
)
end
You're right that the existing interface can support relationships by having a CSV column like First, here is why I think it would be valuable even if having a column with an ID (like
I intend to use Art Vandelay as the import/export backend to a self-serve import/export UI. It allows the end user to import and export the contents of their account(s). The end user does not know their own Maybe a self-serve imports/exports UI is beyond the scope what you envision Art Vandelay being used for. I would love to discuss that further. But even if that is not in your vision of what Art Vandelay should be used for, I think points 1. and 2. still make this a valuable feature. With regard to point 2., I can imagine a developer setting up a script to be reused for multiple accounts in an application with many tenants: ACCOUNT_NAME=art_vandelay_industries CSV=AVI.csv rake run_import
ACCOUNT_NAME=j_peterman_catalog CSV=JPC.csv rake run_import I hope this clarifies why I think this feature is valuable. I'm of course happy to discuss it further. |
Thank you for the additional context!
This is a valid point, but I imagine in this case, you could use a different unique identifier, if available. However, I recognize that not all records have a unique identifier other than the
In this case, wouldn't an end user still need to know something unique about the record, though? In the example provided in the tests, we create a record, but I'd imagine in practice we'd need to find a user. Most likely by their
In this case, I think there's an opportunity to parse the CSV before passing it to
Admittedly, the API you're introducing is more direct, but I worry that it might be a premature feature. |
I think this makes sense until the CSV has millions of rows. I expect to be importing CSVs with millions of rows. I am trying to reduce the amount of boilerplate code I have to manage (per new import type) to safely import large amounts of data (in batches). I don't know exactly how memory management works when you need to parse and transform a large CSV, so maybe I am wrong, but I think this leads to the developer having to manage a lot of the transformations using the CSV interface. This pull request is actually just a baby step in the direction I ended up going in my own implementation, where Here is how I am initializing an import in my application with changes to the gem in my fork, for example: ArtVandelay::Import
.new(Bookmark, filtered_attributes: FILTERED_ATTRIBUTES)
.csv(
contents,
attributes: ATTRIBUTES_MAPPING,
context: {
account: account,
private: ->(value) { value == "yes" }, # A boolean attribute.
read_at: ->(value) { value == "no" ? Time.now : nil }, # A datetime attribute that is a boolean in the CSV.
tags: ->(value) { find_or_create_tags_from(value) } # A has_many relationship that is a list of comma-separated tags in the CSV.
}
) I hope this demonstrates the benefits that context could provide. If context is a premature or unwanted feature for the importer, I completely understand and will close this pull request. |
Thank you for sharing that detailed example. I like the interface you've come up with, and can see its value. For now, though, I think I want to sit on this for a bit before pursing it. My main concern being that I'm not sure if there's a wider need for thus feature (yet). |
Hi, hope you all are well.
In some cases, the given import data will not be enough to create valid Active Record objects. Developers may need to seed the import data to allievate errors related to required
belongs_to
associations, and so on. This commit provides an interface for this.