Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] - func subtract_temporals in operations.py summing a list + tuple in case the list is empty #368

Open
maklil opened this issue Mar 24, 2024 · 1 comment

Comments

@maklil
Copy link

maklil commented Mar 24, 2024

Software versions

  • Django: 5.0.2
  • mssql-django: 1.4
  • python: 3.11.7
  • SQL Server: Microsoft SQL Server 2022 (RTM) - 16.0.1000.6 (X64)
  • OS: Microsoft Windows 10 Entreprise (Version 10.0.19044 Build 19044)

Table schema and Model

class Project(models.Model):
    project_name = models.CharField(max_length=300, null=True)
    creation_date = models.DateTimeField(null=True, blank=True)
    hours_spent = models.FloatField()

Database Connection Settings

DATABASES = {
    'default': {
        'ENGINE': 'mssql',
        'NAME': 'somedb',
        'HOST': 'XYZ\SQLEXPRESS',
        'OPTIONS': {
            'driver': 'ODBC Driver 17 for SQL Server',
        }
    }
}

Problem description and steps to reproduce

I have a function that takes as a parameter a list of "Projects" (list of the model above instances), and calculates a score based on the two fields "creation_date" and "hours_spent", here is the code :

def score_projects(projects):
    difference_duration = ExpressionWrapper(
        F('creation_date__date') - datetime.strptime(CRITERE_DATE_MIN, '%Y-%m-%d').date(), 
        output_field=fields.FloatField())  
    days_difference = ExpressionWrapper(F('difference_duration') / MICROSECONDS_IN_DAY, 
        output_field=fields.DecimalField())
        
    scored_projects = projects\
            .annotate(difference_duration=difference_duration)\
            .annotate(days_difference=days_difference)\
            .annotate(score=ExpressionWrapper(
                ( (P1 * F('days_difference') + P2 * F('hours_spent') ) / (P1 + P2) )
                ,output_field=fields.DecimalField())
                )
    return scored_projects

Where :

CRITERE_DATE_MIN = '2015-01-01'
MICROSECONDS_IN_DAY = 86400000000
P1 = 1
P2 = 1

I'm using ExpressionWrapper in order to extend my model by adding some calculated fields :

  • difference_duration : calculates the difference in miliseconds between a reference date (here CRITERE_DATE_MIN) and instance's creation_date
  • days_difference : equals to difference_duration converted in days (by dividing by MICROSECONDS_IN_DAY)
  • score : a weighted mean between the added field days_difference and model's field hours_spent, the wieghts here are P1 and P2

a dataset example :

- Project(project_name='A', creation_date='2016-09-06', hours_spent=1000)
- Project(project_name='B', creation_date='2014-01-27', hours_spent=200)
- Project(project_name='C', creation_date='2020-03-11', hours_spent=300)
- Project(project_name='D', creation_date=None, hours_spent=400)
- Project(project_name='E', creation_date='2017-02-02', hours_spent=0)

Please note that my creation_date is a nullable field, and in the case of hours_spent, the field is not nullable but there can be >= 0 or <0 values. In my case, the bug occured without the presence of null values in creation_date (omit project D here).

The problem occurs when I'm calling the function to calculate the score of some dataset, an exception occurs specifically in operations.py in the function subtract_temporals

Expected behavior and actual behavior

Expected behavior : a list of Projects with the 3 fields annotated to it (difference_duration, days_difference, score)
Actual behavior : exception occuring when annotating the difference_duration

Error message/stack trace

Traceback (most recent call last):
  File "C:\Users\XXX\Documents\Django_projects\myprj\api\views\views_all_projects.py", line 94, in post
    p3 = list(score_projects(c1))
         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\django\db\models\query.py", line 400, in __iter__
    self._fetch_all()
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\django\db\models\query.py", line 1928, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\django\db\models\query.py", line 91, in __iter__
    results = compiler.execute_sql(
              ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\django\db\models\sql\compiler.py", line 1549, in execute_sql
    sql, params = self.as_sql()
                  ^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\mssql\compiler.py", line 212, in as_sql
    extra_select, order_by, group_by = self.pre_sql_setup()
                                       ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\django\db\models\sql\compiler.py", line 84, in pre_sql_setup
    self.setup_query(with_col_aliases=with_col_aliases)
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\django\db\models\sql\compiler.py", line 73, in setup_query
    self.select, self.klass_info, self.annotation_col_map = self.get_select(
                                                            ^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\django\db\models\sql\compiler.py", line 296, in get_select
    sql, params = self.compile(col)
                  ^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\mssql\compiler.py", line 442, in compile
    return super().compile(node, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\django\db\models\sql\compiler.py", line 546, in compile
    sql, params = node.as_sql(self, self.connection)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\django\db\models\expressions.py", line 1326, in as_sql
    return compiler.compile(self.expression)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\mssql\compiler.py", line 442, in compile
    return super().compile(node, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\django\db\models\sql\compiler.py", line 546, in compile
    sql, params = node.as_sql(self, self.connection)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\django\db\models\expressions.py", line 830, in as_sql
    return connection.ops.subtract_temporals(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\Documents\PythonVirtualEnvironments\dc_venv_3.11.7\Lib\site-packages\mssql\operations.py", line 540, in subtract_temporals
    params = rhs_params + lhs_params
             ~~~~~~~~~~~^~~~~~~~~~~~
TypeError: can only concatenate list (not "tuple") to list

Any other details that can be helpful

The problem seems caused by a non-initialized empty list (in this case either one of rhs_params or lhs_params), that instead of providing an empty list [], it's providing an empty tuple (), precisely it occurs in the (if internal_type == 'DateField') statement, here's what I got in my failing test when printing these values :

                rhs_params = ['2015-01-01']
                lhs_params = ()

I was able to fix this problem simply by editing the function subtract_temporals and adding intialization in the case of empty lists :

def subtract_temporals(self, internal_type, lhs, rhs):
        lhs_sql, lhs_params = lhs
        rhs_sql, rhs_params = rhs
        
        # ADDED this fix
        if not lhs_params:
            lhs_params = []
        if not rhs_params:
            rhs_params = []
        ###
            
        if internal_type == 'DateField':
            sql = "CAST(DATEDIFF(day, %(rhs)s, %(lhs)s) AS bigint) * 86400 * 1000000"
            # print(f"\n\n \t\trhs_params = {rhs_params}")
            # print(f" \t\tlhs_params = {lhs_params} \n")
            params = rhs_params + lhs_params
        else:
            SECOND = "DATEDIFF(second, %(rhs)s, %(lhs)s)"
            MICROSECOND = "DATEPART(microsecond, %(lhs)s) - DATEPART(microsecond, %(rhs)s)"
            sql = "CAST({} AS bigint) * 1000000 + {}".format(SECOND, MICROSECOND)
            params = rhs_params + lhs_params * 2 + rhs_params
        return sql % {'lhs': lhs_sql, 'rhs': rhs_sql}, params
@mShan0
Copy link
Contributor

mShan0 commented Mar 26, 2024

Thanks for the detailed info. We'll definitely take a look into this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants