SPSS: variable_value_label becomes None if any string value is longer than 8 characters #264

Open
wilfredrial opened this issue Jun 10, 2024 · 1 comment
Labels
bug (Something isn't working) · requires changes in Readstat (waiting for changes in the C library Readstat) · to be reported in Readstat

Comments


wilfredrial commented Jun 10, 2024

Hi, I'm not sure how to tell whether this is an issue in the C library or in pyreadstat. Apologies if it is the former.

As the title says, the "Values" column becomes "None" for a String variable in SPSS if any value is longer than 8 characters.

To reproduce:

import pandas as pd
import pyreadstat

var_values = {'mycol':{'-111':'err1'}}

# I thought setting var_format might have an effect but it seems it does not
# just leaving this here in case you want to compare
var_format = {} # {'mycol':'A50'} 

pyreadstat.write_sav(df=pd.DataFrame({'mycol':['-111', '123456789']}),  # 2nd value is 9 chars
                     dst_path='var_values_is_none.sav', 
                     variable_value_labels=var_values,
                     variable_format=var_format)


pyreadstat.write_sav(df=pd.DataFrame({'mycol':['-111', '12345678']}), # 2nd value is 8 chars
                     dst_path='var_values_is_NOT_none.sav', 
                     variable_value_labels=var_values,
                     variable_format=var_format)

Relevant versions:
python 3.12.3
pyreadstat 1.2.7
pandas 2.2.2

ofajardo added the labels bug (Something isn't working) and requires changes in Readstat (waiting for changes in the C library Readstat) on Sep 2, 2024
Collaborator

ofajardo commented Sep 2, 2024

Thanks for the perfectly reproducible issue!
It looks similar to this. If it is indeed the same issue, the author of the C library Readstat says that it has not been implemented yet. I do not currently see an issue on this topic in the Readstat repo, so it would probably be good to file one there.
Once fixed there, the fix should cascade to here and the problem should disappear.
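
Until a fix lands upstream, it may help to flag the affected columns before writing. The following is only a minimal sketch: `find_affected_columns` is a hypothetical helper (not part of pyreadstat), and the 8-character threshold is taken from the report above rather than from the Readstat source, so the real limit may be defined differently (for example in bytes). It accepts anything indexable by column name, such as a plain dict of lists or a pandas DataFrame.

```python
def find_affected_columns(data, variable_value_labels, max_len=8):
    """Return {column: [string values longer than max_len]} for labelled columns.

    Hypothetical helper. max_len=8 is an assumption based on the behaviour
    described in this issue, not a documented Readstat constant.
    """
    affected = {}
    for col in variable_value_labels:
        # Only string values are relevant; numeric values are skipped.
        too_long = [v for v in data[col] if isinstance(v, str) and len(v) > max_len]
        if too_long:
            affected[col] = too_long
    return affected


# Same data as in the reproduction above, as a plain dict of lists.
data = {'mycol': ['-111', '123456789']}
var_values = {'mycol': {'-111': 'err1'}}
print(find_affected_columns(data, var_values))  # {'mycol': ['123456789']}
```

A non-empty result lists the columns whose value labels would presumably be lost on writing, so they can be reviewed before calling `write_sav`.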
