Suggestion: Expose function for creating valid field names #30

Open
iainmwallace opened this issue Jun 24, 2017 · 2 comments
@iainmwallace

Hi,

It would be really useful to expose the function that the import function uses to convert the column names in a file to the field names that appear in SolveBio.
This would avoid mismatches when supplying custom field definitions.

Based on trial and error, I believe the function removes spaces and converts all text to lowercase, but I am not sure whether there are other edge cases.

Cheers,

Iain

@davecap
Member

davecap commented Jun 24, 2017

Hi Iain,

Can you provide an example of what you're seeing? Is it from a CSV/TSV file?

It's possible that the field transform is applied only to CSV/TSV headers, and not to other formats like JSON. If that's the case, it does the following to each header x:

x.replace(' ', '_').lower()

It may not actually be necessary for us to do this. We'll look into it and see if we can leave the fields as-is.
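
For reference, here is a minimal Python sketch of the transform described above, assuming it is exactly `x.replace(' ', '_').lower()` applied to each CSV/TSV header. The helper name `normalize_field_name` is hypothetical and is not part of the SolveBio client; the sample headers are made up for illustration.

```python
def normalize_field_name(name):
    """Hypothetical helper mirroring the described transform:
    replace spaces with underscores, then lowercase."""
    return name.replace(' ', '_').lower()


# Made-up CSV headers, for illustration only.
headers = ["Gene Symbol", "P Value", "chromosome"]
field_names = [normalize_field_name(h) for h in headers]
print(field_names)
```

Applying the same function to the names in a custom field definition before import should keep them aligned with the imported headers, provided the server-side transform really is limited to these two steps.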

@iainmwallace
Author

Yeah, it was a CSV file. If it is possible to not change the field names, I would strongly recommend doing that.

I would also suggest using readr to guess the column types, since the import auto-guess feature doesn't seem to work as well. I am planning on using readr to guess the column types and then creating the appropriate template. I can let you know how I get on.
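
As a rough illustration of that workflow, here is a Python sketch that stands in for readr: it guesses a type per column from the raw CSV values and builds a list of field definitions. The type names (`integer`, `double`, `string`) and the `name`/`data_type` keys are assumptions about what a template expects, and `guess_type` and `guess_fields` are hypothetical helpers, not readr or SolveBio functions.

```python
import csv
import io


def guess_type(values):
    """Return 'integer', 'double', or 'string' for a column of raw strings."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    # Try the narrowest type first; fall through on the first failed cast.
    for type_name, cast in (("integer", int), ("double", float)):
        try:
            for v in non_empty:
                cast(v)
            return type_name
        except ValueError:
            continue
    return "string"


def guess_fields(csv_text):
    """Build one assumed field definition per CSV column."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    columns = list(zip(*body)) if body else [()] * len(header)
    return [{"name": name, "data_type": guess_type(col)}
            for name, col in zip(header, columns)]


# Made-up sample data, for illustration only.
sample = "Gene Symbol,Count,Score\nBRCA1,12,0.5\nTP53,7,1.25\n"
fields = guess_fields(sample)
print(fields)
```

Unlike readr, this sketch scans every row and knows nothing about dates, booleans, or missing-value markers, so it only shows the shape of the approach.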

Also, I tried to use the import function to import records directly rather than via a file, but that gave an error. It would be very nice if one could import a tibble directly.
Specifically, the call and error were:
DatasetImport.create(dataset_id=dataset$id, data_records=x)
Error in DatasetImport.create(dataset_id = dataset$id, data_records = x) :
Either an upload ID or manifest is required.

@davecap davecap self-assigned this Jun 25, 2017