Feature/test cases #89

Merged
merged 7 commits into from
May 4, 2024
2 changes: 2 additions & 0 deletions .coveragerc
@@ -0,0 +1,2 @@
[run]
omit = app.py, test/utils/*
9 changes: 5 additions & 4 deletions CONTRIBUTION.md
@@ -23,10 +23,11 @@ Thank you for considering contributing to our project! We welcome contributions
 4. Install dependencies using `pip install -r requirements.txt`
 5. Install pre-commit hook using `pre-commit install`
 6. Run tests using `pytest tests/test_file_name.py` or specific test name like `pytest tests/test_file_name.py::test_function_name`
-7. Do not push changes without the tests and coverage passing
-8. Commit your changes with **proper** commit messages in imperative form like `Add my best feature`, `Fix issues causing whatever`, `Update docs` etc: [Good reference here](https://cbea.ms/git-commit/)
-9. Make changes and push to your forked repository
-10. Create PR to the forked repository as mentioned below
+7. Ensure the feature passes the acceptance criteria by running `pytest test_results.py` or `python test_results.py`
+8. Do not push changes without the tests and coverage passing
+9. Commit your changes with **proper** commit messages in imperative form like `Add my best feature`, `Fix issues causing whatever`, `Update docs` etc: [Good reference here](https://cbea.ms/git-commit/)
+10. Make changes and push to your forked repository
+11. Create PR to the forked repository as mentioned below


 ### Pull Requests (PRs):
Empty file added test/__init__.py
Empty file.
158 changes: 158 additions & 0 deletions test/test_results.py
@@ -0,0 +1,158 @@
import unittest
import sys
import os
import subprocess
import warnings

# Add the parent directory to the Python path
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from test.utils.custom_tsv_parser import ParseTSVFile

PREF_CUTOFF = -4
samurato marked this conversation as resolved.


def get_scode_cscode_id(data):
Contributor:
Is this function just for this test, or could it have been used in the actual code?

Contributor Author:
This is one of the helpers for the test. I am not very proud of it, but it is a makeshift hashmap (dictionary) to identify collisions between centers and schools.

Contributor:
Could we make it private or mark it as a pytest fixture?

    # Build one id per row from the sorted (scode, cscode) pair, so that
    # (A, B) and (B, A) produce the same id
    school_centers = []
    for row in data:
        scode = int(row["scode"])
        center = int(row["cscode"])
        sccode_center_id = sorted((scode, center))
        sccode_center_id = "_".join(map(str, sccode_center_id))
        school_centers.append(sccode_center_id)
    return school_centers


class TestSchoolCenter(unittest.TestCase):
Contributor:
The project intends to use pytest, but for some reason these tests are written with unittest 🤔

Contributor Author:
I can convert it to pytest later on, but pytest should be able to run unittest tests as well with no issues.

Contributor (@horrormyth, May 7, 2024):
Yes, it would be great to use pytest, as it brings a lot of power to fixture management and testing. It's not about pytest being able to run unittest tests; it's more about the above and about consistency. Since we intend to use pytest, it would be nice not to have the disparity. But all good, one step at a time ✅

    """_Tests to validate that the generated output matches
    the requirements

    Needs the output files results/school-center.tsv and
    results/school-center-distance.tsv to be present

    """

    def setUp(self):
        self.school_center_file = "results/school-center.tsv"
        self.school_center_distance_file = "results/school-center-distance.tsv"
        self.school_center_pref_file = "sample_data/prefs.tsv"
        schools_tsv = "sample_data/schools_grade12_2081.tsv"
        centers_tsv = "sample_data/centers_grade12_2081.tsv"
        cmd = f"python school_center.py {schools_tsv} {centers_tsv} {self.school_center_pref_file}"
        subprocess.run(cmd, shell=True)

    def tearDown(self):
        os.remove(self.school_center_file)
        os.remove(self.school_center_distance_file)
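
`setUp` runs the script through a shell and ignores its exit status, so a crash in `school_center.py` would only surface later as missing result files. One possible alternative, assuming the script should run with the same interpreter as the tests, is to pass an argument list and set `check=True`. The helper name `run_script` is hypothetical, and the `-c` invocation below merely stands in for the real script arguments:

```python
import subprocess
import sys


def run_script(args):
    """Run the current Python interpreter with `args`; raise if it fails."""
    return subprocess.run(
        [sys.executable, *args],  # no shell, so no quoting problems
        check=True,               # raises CalledProcessError on non-zero exit
        capture_output=True,
        text=True,
    )


# Stand-in for: run_script(["school_center.py", schools_tsv, centers_tsv, prefs_tsv])
result = run_script(["-c", "print('ok')"])
```

With `check=True`, a failing run stops the test in `setUp` with the script's own exit code instead of a later file-not-found error.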

    def test_results_exists(self):
        """_Test if the application runs and writes its results to the
Contributor:
The test name should be descriptive enough that no comment is needed. If a function or test needs a comment to explain it, that is usually a smell. It might not be the case here, but I think a descriptive test name would serve better, even when reading the console output.

Contributor Author:
I am using the Python auto-docstring, so I am writing the comment for consistency; it makes automated documentation easier later on. I can remove it, but basically it checks whether the output file exists or not :).

        results folder_

        Returns:
            Pass: If the file exists in the results folder
            Fail: If the file does not exist in the results folder
        """
        self.assertTrue(os.path.exists(self.school_center_file))
        self.assertTrue(os.path.exists(self.school_center_distance_file))
        self.assertTrue(os.path.exists(self.school_center_pref_file))

    def test_scode_student_count_not_more_than_200(self):
        """_Test if the student count is not more than 200_
        Test case ID 001:- Depending on a school's number of examinees, distribute
        them so that, as far as possible, no more than 100, and at most 200,
        examinees fall in the same center

        Returns:
            Pass: If the student count is not more than 200
            Warn: If the student count is more than 200 (a warning is issued; the test does not fail)
        """
        ptf = ParseTSVFile(self.school_center_file)
        data = ptf.get_rows()
        for row in data:
            student_count = row["allocation"]
            if int(student_count) > 200:
                warnings.warn(f"student count is more than 200 for the school {row}")
samurato marked this conversation as resolved.


    def test_scode_cscode_not_same(self):
        """_Test if the output of scode is not equal to cscode_
        Test case ID :- A school must not be assigned itself as its center

        Returns:
            Pass: If the scode is not same as cscode
            Fail: If the scode is same as cscode
        """
        scf = ParseTSVFile(self.school_center_file)
        data = scf.get_rows()
        failures = []
        for row in data:
            scode = row["scode"]
            cscode = row["cscode"]
            if scode == cscode:
                failures.append(f"scode and cscode are same for row {row} {scode}")
        assert len(failures) == 0, f'{len(failures)} rows failed. {chr(10).join(failures)}'

    def test_no_mutual_centers(self):
        """_Test if the scode's center is not same as cscode's
        centre and vice versa_
        Test case ID :- The examinees of two schools must not be assigned to each other's school as center; that is, if a school's examinees are assigned to a center school, that center school's examinees must not be assigned to the first school.

        Returns:
            Pass: If no two schools are each other's center
            Fail: If two schools are each other's center
        """
        scf = ParseTSVFile(self.school_center_file)
        data = scf.get_rows()
        scodes_centers = get_scode_cscode_id(data)

        # A duplicate id means a school and its center are assigned to each other
        duplicates = [
            item for item in set(scodes_centers) if scodes_centers.count(item) > 1
        ]

        self.assertFalse(
            duplicates,
            f"Duplicate values found in scode_center_code: {', '.join(duplicates)}",
        )
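
The duplicate check above calls `list.count` once per distinct id, which is quadratic in the number of rows. That is fine for small files; if it ever matters, a single counting pass is one option. This is a sketch only, and `find_duplicates` is a hypothetical name:

```python
from collections import Counter


def find_duplicates(ids):
    """Return the ids that occur more than once, in sorted order."""
    counts = Counter(ids)  # one pass over the list
    return sorted(item for item, n in counts.items() if n > 1)


# Made-up ids: "101_205" appears twice, i.e. a mutual assignment
duplicates = find_duplicates(["101_205", "101_205", "101_300"])
```

Sorting the result also keeps the failure message stable between runs, which iterating over a `set` does not guarantee.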

    @unittest.skip("needs review")
    def test_undesired_cscode_scode_pair(self):
        """_Test if the schools and the centers are not matched based on the
        cost preferences defined in the prefs.tsv file_
        Test case ID :-
        1 A school must not be assigned to a center identified as being under the same ownership/management
        2 Where assigning a school to a particular center caused problems in the past, that center must not be repeated

        Returns:
            Pass: If the schools are not paired with their undesired cscodes
            Fail: If a school is paired with one of its undesired cscodes
        """

        scf = ParseTSVFile(self.school_center_file)
        cpf = ParseTSVFile(self.school_center_pref_file)
        data_scf = scf.get_rows()
        data_cpf = cpf.get_rows()
        # Drop rows below the cutoff by building a new list; removing items
        # from a list while iterating over it skips elements
        data_cpf = [row for row in data_cpf if int(row["pref"]) >= PREF_CUTOFF]

        failures = []

        scodes_centers = get_scode_cscode_id(data_scf)

        undesired_csodes_centers = get_scode_cscode_id(data_cpf)
        for undesired_cscodes_center in undesired_csodes_centers:
            if undesired_cscodes_center in scodes_centers:
                failures.append(
                    f"Schools with undesired centers {undesired_cscodes_center}"
                )

        assert len(failures) == 0, f'{len(failures)} rows failed. {chr(10).join(failures)}'
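
Removing items from a list while iterating over it skips the element that follows each removal, which is why filtering the prefs rows is safer when expressed as building a new list. A minimal illustration with made-up rows (the `pref` values are hypothetical, not taken from `prefs.tsv`):

```python
PREF_CUTOFF = -4

rows = [{"pref": "-5"}, {"pref": "-6"}, {"pref": "1"}]

# Remove-while-iterating: after the first removal the iterator moves on
# past the row that slid into the freed slot
mutated = list(rows)
for row in mutated:
    if int(row["pref"]) < PREF_CUTOFF:
        mutated.remove(row)

# Building a new list examines every row exactly once
filtered = [row for row in rows if int(row["pref"]) >= PREF_CUTOFF]
```

Here `mutated` still contains the `-6` row, while `filtered` keeps only the row at or above the cutoff. Whether the test should keep the rows above or below `PREF_CUTOFF` is a separate question for the pending review.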



if __name__ == "__main__":
    unittest.main()
Empty file added test/utils/__init__.py
Empty file.
50 changes: 50 additions & 0 deletions test/utils/custom_tsv_parser.py
@@ -0,0 +1,50 @@
import csv


class ParseTSVFile:
    def __init__(self, file_path):
        self.file_path = file_path

    def read_file(self):
        with open(self.file_path, "r", newline="", encoding="utf-8") as file:
            reader = csv.DictReader(file, delimiter="\t")
            # Materialise the rows before the file closes; a DictReader
            # returned from here could not be read once the file is closed
            return list(reader)

    def get_columns(self):
        with open(self.file_path, "r", newline="", encoding="utf-8") as file:
Contributor (@horrormyth, May 5, 2024):
It would be nice not to open and close files everywhere in the code; it makes testing pretty cumbersome, not to mention the repetition. It also means files are opened in many places, which leaves more open doors for a malicious file. I would like to know the rationale behind it. IO should be done only once, as data input, not wherever we can. Also, since the data size is small, we can load it into memory and pass it along, so we don't have to worry about the reader being exhausted.

I think this TSV parser could be restructured if we really need one; otherwise, just a couple of functions would have done the job.

import csv


class TsvFileParser:
    def __init__(self, file_path):
        self.file_path = file_path
        self.parsed_data = None  # list of row dicts once parse() has run

    def parse(self):
        try:
            with open(self.file_path, "r", newline="", encoding="utf-8") as file:
                reader = csv.DictReader(file, delimiter="\t")
                # Read all rows while the file is still open
                self.parsed_data = list(reader)
        except OSError:
            # OSError stands in for whichever exception applies. Log, exit, or
            # inform the user depending on the need, or raise a custom exception
            # handled elsewhere. Logging, exiting, and telling the user should
            # be enough here.
            raise
        return self


tsv_parser = TsvFileParser(lovely_file_name)
parsed_data = tsv_parser.parse().parsed_data
# or
tsv_parser.parse()
parsed_data = tsv_parser.parsed_data

Contributor Author:
Agreed, I will optimise this code in my next PR.

            reader = csv.DictReader(file, delimiter="\t")
            return reader.fieldnames

    def get_row_count(self):
        with open(self.file_path, "r", newline="", encoding="utf-8") as file:
            reader = csv.DictReader(file, delimiter="\t")
            return len(list(reader))

    def get_column_count(self):
        with open(self.file_path, "r", newline="", encoding="utf-8") as file:
            reader = csv.DictReader(file, delimiter="\t")
            return len(reader.fieldnames)

    def get_rows(self):
        with open(self.file_path, "r", newline="", encoding="utf-8") as file:
            reader = csv.DictReader(file, delimiter="\t")
            data = []
            for row in reader:
                data.append(row)
            return data

    def get_column_data(self, column_name):
Contributor:
These seem to be unnecessary functions; it would be good not to add functions we don't need. They become unmanageable down the line, and we end up with functions that have no tests, which is a heavy burden for anyone maintaining them.

Contributor Author:
I beg to differ, though. These functions are decoupled and maintained inside utils. This way, many other objects can inherit the same functionality with less repeated code.

Contributor:
Hmm, are we sure that this function is used elsewhere? Otherwise, leaving unused and untested functions around is a relatively bad idea.

        with open(self.file_path, "r", newline="", encoding="utf-8") as file:
            reader = csv.DictReader(file, delimiter="\t")
            data = []
            for row in reader:
                data.append(row[column_name])
            return data

    def get_row_data(self, row_number):
        with open(self.file_path, "r", newline="", encoding="utf-8") as file:
            reader = csv.DictReader(file, delimiter="\t")
            data = []
            for row in reader:
                data.append(row)
            return data[row_number]
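
Following the review discussion about doing IO only once, the accessors above can be derived from rows that are loaded a single time. This is a minimal sketch, assuming the files are small enough to hold in memory; `load_tsv` and the sample data are hypothetical:

```python
import csv
import os
import tempfile


def load_tsv(path):
    """Read a TSV file once and return its rows as a list of dicts."""
    with open(path, "r", newline="", encoding="utf-8") as file:
        return list(csv.DictReader(file, delimiter="\t"))


# Write a tiny made-up file so the example is self-contained
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.tsv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        f.write("scode\tcscode\tallocation\n101\t205\t150\n")
    rows = load_tsv(path)

# Everything else is plain list and dict work on the in-memory rows
columns = list(rows[0].keys())
row_count = len(rows)
allocations = [row["allocation"] for row in rows]
```

Once the rows are a plain list, the tests can also be fed hand-written dicts without touching the file system.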