Commit

Renamed pivot.
mihaeladuta committed Feb 22, 2024
1 parent 15c35b4 commit 17545d9
Showing 3 changed files with 42 additions and 42 deletions.
Binary file modified data/processed/sheets/Results.parquet
Binary file not shown.
82 changes: 41 additions & 41 deletions logs/Results.log
@@ -1,41 +1,41 @@
2024-02-22 14:14:47,157 [INFO] Results - read sheet from 'data/raw/REF-2021-Results-All-2022-05-06.xlsx'
2024-02-22 14:14:47,548 [INFO] Results - parsed sheet: 7552 records
2024-02-22 14:14:47,549 [INFO] Results - rename 'Main panel' to 'Main panel code'
2024-02-22 14:14:47,551 [INFO] Results - replace '['/', ':']' with '_' in 'Institution name'
2024-02-22 14:14:47,552 [INFO] Results - add columns for panel names
2024-02-22 14:14:47,557 [INFO] Results - replace '-' with na in ['4 stars', '3 stars', '2 stars', '1 star', 'Unclassified']
2024-02-22 14:14:47,559 [INFO] Results - bin percentages for ['4 stars', '3 stars', '2 stars', '1 star', 'Unclassified']
2024-02-22 14:14:47,560 [INFO] Results - drop columns '['Institution code (UKPRN)', 'Institution sort order']'
2024-02-22 14:14:47,569 [INFO] Results - pivot to make wide format for ratings per profile to enable analyses
2024-02-22 14:14:47,569 [INFO] Results - make categorical ['Environment evaluation - Unclassified (binned)', 'Impact evaluation - 1 star (binned)', 'Overall evaluation - 2 stars (binned)', 'Outputs evaluation - 4 stars (binned)', 'Outputs evaluation - 2 stars (binned)', 'Environment evaluation - 4 stars (binned)', 'Environment evaluation - 2 stars (binned)', 'Overall evaluation - 1 star (binned)', 'Impact evaluation - 3 stars (binned)', 'Environment evaluation - 3 stars (binned)', 'Overall evaluation - 3 stars (binned)', 'Overall evaluation - Unclassified (binned)', 'Environment evaluation - 1 star (binned)', 'Outputs evaluation - 3 stars (binned)', 'Outputs evaluation - 1 star (binned)', 'Overall evaluation - 4 stars (binned)', 'Outputs evaluation - Unclassified (binned)', 'Impact evaluation - 4 stars (binned)', 'Impact evaluation - 2 stars (binned)', 'Impact evaluation - Unclassified (binned)']
2024-02-22 14:14:48,048 [INFO] Results - read dataset from 'data/processed/sheets/Outputs.parquet'
2024-02-22 14:14:48,064 [INFO] Results - added column 'Output submissions (added)'
2024-02-22 14:14:48,069 [INFO] Results - added column 'Output submissions - Chapter in book (added)'
2024-02-22 14:14:48,076 [INFO] Results - added column 'Output submissions - Journal article (added)'
2024-02-22 14:14:48,081 [INFO] Results - added column 'Output submissions - Authored book (added)'
2024-02-22 14:14:48,085 [INFO] Results - added column 'Output submissions - Edited book (added)'
2024-02-22 14:14:48,089 [INFO] Results - added column 'Output submissions - Exhibition (added)'
2024-02-22 14:14:48,092 [INFO] Results - added column 'Output submissions - Performance (added)'
2024-02-22 14:14:48,096 [INFO] Results - added column 'Output submissions - Digital or visual media (added)'
2024-02-22 14:14:48,100 [INFO] Results - added column 'Output submissions - Conference contribution (added)'
2024-02-22 14:14:48,103 [INFO] Results - added column 'Output submissions - Scholarly edition (added)'
2024-02-22 14:14:48,107 [INFO] Results - added column 'Output submissions - Other (added)'
2024-02-22 14:14:48,111 [INFO] Results - added column 'Output submissions - Working paper (added)'
2024-02-22 14:14:48,114 [INFO] Results - added column 'Output submissions - Patent/ published patent application (added)'
2024-02-22 14:14:48,118 [INFO] Results - added column 'Output submissions - Composition (added)'
2024-02-22 14:14:48,122 [INFO] Results - added column 'Output submissions - Website content (added)'
2024-02-22 14:14:48,125 [INFO] Results - added column 'Output submissions - Design (added)'
2024-02-22 14:14:48,129 [INFO] Results - added column 'Output submissions - Artefact (added)'
2024-02-22 14:14:48,132 [INFO] Results - added column 'Output submissions - Research report for external body (added)'
2024-02-22 14:14:48,136 [INFO] Results - added column 'Output submissions - Research data sets and databases (added)'
2024-02-22 14:14:48,139 [INFO] Results - added column 'Output submissions - Translation (added)'
2024-02-22 14:14:48,143 [INFO] Results - added column 'Output submissions - Software (added)'
2024-02-22 14:14:48,146 [INFO] Results - added column 'Output submissions - Devices and products (added)'
2024-02-22 14:14:48,263 [INFO] Results - read dataset from 'data/processed/sheets/ImpactCaseStudies.parquet'
2024-02-22 14:14:48,268 [INFO] Results - added column 'Impact case study submissions (added)'
2024-02-22 14:14:48,279 [INFO] Results - read dataset from 'data/processed/sheets/ResearchDoctoralDegreesAwarded.parquet'
2024-02-22 14:14:48,283 [INFO] Results - added columns '['Total number of doctoral degrees awarded (added)']'
2024-02-22 14:14:48,461 [INFO] Results - read dataset from '/Users/mihaela/Documents/work/ssi_work/ref-2021-analysis/data/processed/environment_statements/EnvironmentStatementsUnitLevel.parquet'
2024-02-22 14:14:48,469 [INFO] Results - merged with unit environment statements: 1888 records
2024-02-22 14:14:48,469 [INFO] Results - make categorical ['Joint submission', 'Main panel name', 'Multiple submission letter', 'Institution name', 'Multiple submission name', 'Unit of assessment name']
2024-02-22 14:14:48,933 [INFO] Results - write dataset to 'data/processed/sheets/Results.parquet'
2024-02-22 14:28:18,700 [INFO] Results - read sheet from 'data/raw/REF-2021-Results-All-2022-05-06.xlsx'
2024-02-22 14:28:19,089 [INFO] Results - parsed sheet: 7552 records
2024-02-22 14:28:19,091 [INFO] Results - rename 'Main panel' to 'Main panel code'
2024-02-22 14:28:19,093 [INFO] Results - replace '['/', ':']' with '_' in 'Institution name'
2024-02-22 14:28:19,096 [INFO] Results - add columns for panel names
2024-02-22 14:28:19,100 [INFO] Results - replace '-' with na in ['4 stars', '3 stars', '2 stars', '1 star', 'Unclassified']
2024-02-22 14:28:19,103 [INFO] Results - bin percentages for ['4 stars', '3 stars', '2 stars', '1 star', 'Unclassified']
2024-02-22 14:28:19,104 [INFO] Results - drop columns '['Institution code (UKPRN)', 'Institution sort order']'
2024-02-22 14:28:19,112 [INFO] Results - pivot to make wide format for ratings per profile to enable analyses
2024-02-22 14:28:19,113 [INFO] Results - make categorical ['Outputs profile - Unclassified (binned)', 'Impact profile - 4 stars (binned)', 'Outputs profile - 2 stars (binned)', 'Overall profile - 4 stars (binned)', 'Overall profile - 3 stars (binned)', 'Outputs profile - 1 star (binned)', 'Environment profile - Unclassified (binned)', 'Impact profile - 1 star (binned)', 'Environment profile - 4 stars (binned)', 'Environment profile - 2 stars (binned)', 'Outputs profile - 3 stars (binned)', 'Impact profile - 2 stars (binned)', 'Impact profile - 3 stars (binned)', 'Environment profile - 3 stars (binned)', 'Impact profile - Unclassified (binned)', 'Overall profile - 2 stars (binned)', 'Outputs profile - 4 stars (binned)', 'Overall profile - 1 star (binned)', 'Overall profile - Unclassified (binned)', 'Environment profile - 1 star (binned)']
2024-02-22 14:28:19,325 [INFO] Results - read dataset from 'data/processed/sheets/Outputs.parquet'
2024-02-22 14:28:19,340 [INFO] Results - added column 'Output submissions (added)'
2024-02-22 14:28:19,345 [INFO] Results - added column 'Output submissions - Chapter in book (added)'
2024-02-22 14:28:19,353 [INFO] Results - added column 'Output submissions - Journal article (added)'
2024-02-22 14:28:19,357 [INFO] Results - added column 'Output submissions - Authored book (added)'
2024-02-22 14:28:19,361 [INFO] Results - added column 'Output submissions - Edited book (added)'
2024-02-22 14:28:19,365 [INFO] Results - added column 'Output submissions - Exhibition (added)'
2024-02-22 14:28:19,369 [INFO] Results - added column 'Output submissions - Performance (added)'
2024-02-22 14:28:19,372 [INFO] Results - added column 'Output submissions - Digital or visual media (added)'
2024-02-22 14:28:19,376 [INFO] Results - added column 'Output submissions - Conference contribution (added)'
2024-02-22 14:28:19,380 [INFO] Results - added column 'Output submissions - Scholarly edition (added)'
2024-02-22 14:28:19,383 [INFO] Results - added column 'Output submissions - Other (added)'
2024-02-22 14:28:19,387 [INFO] Results - added column 'Output submissions - Working paper (added)'
2024-02-22 14:28:19,391 [INFO] Results - added column 'Output submissions - Patent/ published patent application (added)'
2024-02-22 14:28:19,394 [INFO] Results - added column 'Output submissions - Composition (added)'
2024-02-22 14:28:19,398 [INFO] Results - added column 'Output submissions - Website content (added)'
2024-02-22 14:28:19,402 [INFO] Results - added column 'Output submissions - Design (added)'
2024-02-22 14:28:19,405 [INFO] Results - added column 'Output submissions - Artefact (added)'
2024-02-22 14:28:19,409 [INFO] Results - added column 'Output submissions - Research report for external body (added)'
2024-02-22 14:28:19,413 [INFO] Results - added column 'Output submissions - Research data sets and databases (added)'
2024-02-22 14:28:19,416 [INFO] Results - added column 'Output submissions - Translation (added)'
2024-02-22 14:28:19,420 [INFO] Results - added column 'Output submissions - Software (added)'
2024-02-22 14:28:19,423 [INFO] Results - added column 'Output submissions - Devices and products (added)'
2024-02-22 14:28:19,541 [INFO] Results - read dataset from 'data/processed/sheets/ImpactCaseStudies.parquet'
2024-02-22 14:28:19,546 [INFO] Results - added column 'Impact case study submissions (added)'
2024-02-22 14:28:19,556 [INFO] Results - read dataset from 'data/processed/sheets/ResearchDoctoralDegreesAwarded.parquet'
2024-02-22 14:28:19,560 [INFO] Results - added columns '['Total number of doctoral degrees awarded (added)']'
2024-02-22 14:28:19,744 [INFO] Results - read dataset from '/Users/mihaela/Documents/work/ssi_work/ref-2021-analysis/data/processed/environment_statements/EnvironmentStatementsUnitLevel.parquet'
2024-02-22 14:28:19,753 [INFO] Results - merged with unit environment statements: 1888 records
2024-02-22 14:28:19,754 [INFO] Results - make categorical ['Institution name', 'Multiple submission name', 'Multiple submission letter', 'Joint submission', 'Unit of assessment name', 'Main panel name']
2024-02-22 14:28:20,196 [INFO] Results - write dataset to 'data/processed/sheets/Results.parquet'
2 changes: 1 addition & 1 deletion src/REF2021_processing/utils.py
@@ -391,7 +391,7 @@ def pivot_results_by_profile(dset, sname):
     columns_values.extend(
         [f"{column}{cb.COLUMN_NAME_BINNED_SUFFIX}" for column in columns_values[2:]]
     )
-    suffix = "evaluation"
+    suffix = "profile"

     # columns to drop from the wide format because they are duplicates
     columns_to_drop = [
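The effect of the suffix change on the wide-format column labels can be illustrated with a minimal, standalone sketch. It is not the code in `pivot_results_by_profile`: the helper names `wide_column_names` and `rename_map` are hypothetical, the label pattern is inferred from the column names in `logs/Results.log`, and the value `" (binned)"` for `cb.COLUMN_NAME_BINNED_SUFFIX` is an assumption based on the same log.

```python
# Hypothetical helpers for illustration only; the profile names, rating names
# and label pattern are read off the column names in logs/Results.log.
PROFILES = ["Outputs", "Impact", "Environment", "Overall"]
RATINGS = ["4 stars", "3 stars", "2 stars", "1 star", "Unclassified"]
BINNED_SUFFIX = " (binned)"  # assumed value of cb.COLUMN_NAME_BINNED_SUFFIX


def wide_column_names(suffix, binned=False):
    """Build one wide-format column label per (profile, rating) pair."""
    tail = BINNED_SUFFIX if binned else ""
    return [
        f"{profile} {suffix} - {rating}{tail}"
        for profile in PROFILES
        for rating in RATINGS
    ]


def rename_map(old_suffix, new_suffix, binned=False):
    """Map each label built with old_suffix to its new_suffix equivalent."""
    return dict(
        zip(
            wide_column_names(old_suffix, binned),
            wide_column_names(new_suffix, binned),
        )
    )


mapping = rename_map("evaluation", "profile", binned=True)
print(len(mapping))  # 20 binned columns, matching the 'make categorical' log line
print(mapping["Overall evaluation - 4 stars (binned)"])
```

A mapping like this could be used to update downstream code that still refers to the old `... evaluation - ...` column names after `Results.parquet` is regenerated.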
