Commit
merge dfs on 'protein symbol' instead
miseminger committed Jul 30, 2024
1 parent ed67400 commit 28ad84d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion bin/functional_annotation.py
@@ -263,7 +263,7 @@ def write_tsv(dframe):
index_cols_to_use = ['nucleotide position', 'nucleotide mutation', 'amino acid mutation', 'amino acid mutation alias',
'protein name', 'protein symbol', 'gene name']
dataFrame = dataFrame.drop(columns=index_cols_to_use)
- merged_dataFrame = pd.merge(dataFrame, mutation_index, on=['original mutation description', 'gene symbol'], how='left') #, 'alias'
+ merged_dataFrame = pd.merge(dataFrame, mutation_index, on=['original mutation description', 'protein symbol'], how='left') #, 'alias'
#dups = mutation_index[mutation_index.duplicated(subset=['nucleotide position', 'original mutation description'], keep=False)]
#dups = dups.sort_values(by='nucleotide position')
#dups.to_csv('madeline_testing/dups.tsv', sep='\t', index=False)
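A minimal sketch of what the changed line does, for reviewers who want to check the new join key in isolation. The toy rows, the 'note' column, and the nucleotide position values are made up for illustration; only the two key column names and how='left' come from the diff. The indicator=True argument is an addition here (not in the commit) that flags rows of the left frame with no match in mutation_index.

```python
import pandas as pd

# Hypothetical stand-in for the annotation table (left side of the merge).
dataFrame = pd.DataFrame({
    'original mutation description': ['D614G', 'P13L', 'X1Y'],
    'protein symbol': ['S', 'N', 'S'],
    'note': ['a', 'b', 'c'],  # made-up payload column
})

# Hypothetical stand-in for the mutation index (right side of the merge).
mutation_index = pd.DataFrame({
    'original mutation description': ['D614G', 'P13L'],
    'protein symbol': ['S', 'N'],
    'nucleotide position': [100, 200],  # made-up values
})

# Same keys and join type as the new line in the commit; indicator=True
# adds a '_merge' column saying where each row came from.
merged_dataFrame = pd.merge(
    dataFrame,
    mutation_index,
    on=['original mutation description', 'protein symbol'],
    how='left',
    indicator=True,
)

# Rows from dataFrame that found no partner in mutation_index.
unmatched = merged_dataFrame[merged_dataFrame['_merge'] == 'left_only']
print(unmatched[['original mutation description', 'protein symbol']])
```

With a left join every row of dataFrame is kept, so a key mismatch shows up as missing index columns rather than as dropped rows; inspecting the 'left_only' rows is one way to confirm that 'protein symbol' matches as intended.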

1 comment on commit 28ad84d

@miseminger
Collaborator Author
