Merge pull request #347 from capitalone/develop
Release v0.14.3
fdosani authored Oct 30, 2024
2 parents e7cd7e3 + 9c1fe52 commit b23da8e
Showing 2 changed files with 16 additions and 12 deletions.
datacompy/__init__.py (2 changes: 1 addition & 1 deletion)
@@ -18,7 +18,7 @@
 Then extended to carry that functionality over to Spark Dataframes.
 """

-__version__ = "0.14.2"
+__version__ = "0.14.3"

 import platform
 from warnings import warn
docs/source/snowflake_usage.rst (26 changes: 15 additions & 11 deletions)
@@ -6,11 +6,12 @@ For ``SnowflakeCompare``
 - ``on_index`` is not supported.
 - Joining is done using ``EQUAL_NULL`` which is the equality test that is safe for null values.
 - Compares ``snowflake.snowpark.DataFrame``, which can be provided as either raw Snowflake dataframes
-  or the as the names of full names of valid snowflake tables, which we will process into Snowpark dataframes.
+  or as the names of full names of valid snowflake tables, which we will process into Snowpark dataframes.


-SnowflakeCompare Object Setup
----------------------------------------------------
+SnowflakeCompare setup
+----------------------

 There are two ways to specify input dataframes for ``SnowflakeCompare``
+
 Provide Snowpark dataframes
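
For context, a minimal sketch of the first input style, comparing two Snowpark dataframes. This is an illustrative aside, not part of the diff; the connection parameters, toy data, and ``join_columns`` value are assumptions:

.. code-block:: python

    from snowflake.snowpark import Session

    import datacompy

    # Placeholder connection details; assumes valid Snowflake credentials.
    connection_parameters = {"account": "...", "user": "...", "password": "..."}
    session = Session.builder.configs(connection_parameters).create()

    # Hypothetical toy dataframes sharing an "id" join key.
    df_1 = session.create_dataframe(
        [(1, "a", 10.0), (2, "b", 20.0)], schema=["id", "name", "value"]
    )
    df_2 = session.create_dataframe(
        [(1, "a", 10.0), (2, "b", 21.0)], schema=["id", "name", "value"]
    )

    compare = datacompy.SnowflakeCompare(
        session, df_1, df_2, join_columns="id"  # on_index is not supported
    )
    print(compare.report())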
@@ -66,11 +67,12 @@ Provide Snowpark dataframes
     print(compare.report())

-Provide the full name (``{db}.{schema}.{table_name}``) of valid Snowflake tables
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Provide the full name (``db.schema.table_name``) of valid Snowflake tables
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
 Given the dataframes from the prior examples...

 .. code-block:: python

     df_1.write.mode("overwrite").save_as_table("toy_table_1")
     df_2.write.mode("overwrite").save_as_table("toy_table_2")
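
Continuing the sketch above (again, not part of the diff), the second input style passes fully qualified table names instead of dataframes; ``MY_DB`` and ``MY_SCHEMA`` are placeholder names:

.. code-block:: python

    # Persist the toy dataframes as tables, then compare by fully
    # qualified name; datacompy converts them to Snowpark dataframes.
    df_1.write.mode("overwrite").save_as_table("toy_table_1")
    df_2.write.mode("overwrite").save_as_table("toy_table_2")

    compare = datacompy.SnowflakeCompare(
        session,
        "MY_DB.MY_SCHEMA.TOY_TABLE_1",  # placeholder db.schema.table names
        "MY_DB.MY_SCHEMA.TOY_TABLE_2",
        join_columns="id",
    )
    print(compare.report())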
@@ -210,6 +212,7 @@ There are a few convenience methods and attributes available after the comparison
     print(compare.df2_unq_columns())
     # OrderedSet()
+
 Duplicate rows
 --------------

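A hedged illustration of a few more convenience accessors alongside the ``df2_unq_columns()`` call shown above; these mirror datacompy's core ``Compare`` API, and their exact availability on ``SnowflakeCompare`` should be confirmed against the released docs:

.. code-block:: python

    # Column-level accessors return OrderedSets, as shown above.
    print(compare.df1_unq_columns())    # columns present only in df_1
    print(compare.intersect_columns())  # columns present in both inputs

    # Row-level summaries.
    print(compare.matches())       # True only if the inputs match fully
    print(compare.all_mismatch())  # rows where any compared value differs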
@@ -260,9 +263,10 @@ as uniquely in the second.

 Additional considerations
 -------------------------
-- It is strongly recommended against joining on float columns (or any column with floating point precision).
-  Columns joining tables are compared on the basis of an exact comparison, therefore if the values comparing
-  your float columns are not exact, you will likely get unexpected results.
-- Case-sensitive columns are only partially supported. We essentially treat case-sensitive
-  columns as if they are case-insensitive. Therefore you may use case-sensitive columns as long as
-  you don't have several columns with the same name differentiated only be case sensitivity.
+- It is strongly recommended against joining on float columns or any column with floating point precision.
+  Columns joining tables are compared on the basis of an exact comparison, therefore if the values
+  comparing your float columns are not exact, you will likely get unexpected results.
+- Case-sensitive columns are only partially supported. We essentially treat case-sensitive columns as
+  if they are case-insensitive. Therefore you may use case-sensitive columns as long as you don't have several
+  columns with the same name differentiated only be case sensitivity.
+

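To make the float caveat concrete (an illustrative aside; ``abs_tol`` support on ``SnowflakeCompare`` is assumed to match datacompy's core ``Compare``): joins use exact equality, so float keys that are not bit-identical will not match, while tolerances only soften the comparison of non-key columns:

.. code-block:: python

    # Classic floating point surprise: an exact join on such keys would miss.
    print(0.1 + 0.2 == 0.3)  # False

    # Prefer integer or string join keys, and use tolerances to absorb small
    # numeric differences in the compared (non-join) columns instead.
    compare = datacompy.SnowflakeCompare(
        session, df_1, df_2, join_columns="id", abs_tol=1e-9
    )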