I am using pandas read_csv() function to read some CSV content and want to use "high" or "round_trip" floating-point precision.

The following works in the Python REPL, or when running a program with the python interpreter directly:

import pandas as pd
from io import StringIO

df = pd.read_csv( StringIO('0.00042119645,3.4,8.8244e-5\r\n'), float_precision='round_trip' )

It also works when I put it in a pytest test.

However, it fails as soon as the test uses the pyfakefs fs fixture. For some reason pd.read_csv() then uses the Python engine instead of the default C engine, even if I explicitly set engine='c'.

The error reported is:

ValueError("The 'float_precision' option is not supported with the 'python' engine")

A minimal test that reproduces it:

import pandas as pd
from io import StringIO

def test_pandas_read_csv(
        #mocker,     #! pytest-mock test fixture
        fs,         #! pyfakefs test fixture
        ):
    try :
        df = pd.read_csv( StringIO('0.00042119645,3.4,8.8244e-5\r\n'), float_precision='round_trip' )
        assert True
    except Exception as exc:
        assert False

How do I use the pandas read_csv() function with the default C engine in pytest tests that also require the pyfakefs fs fixture?

Here is my full pytest code. The tests fail as written; commenting out the fs parameter makes all of them pass.

import pytest

import pandas as pd
from io import StringIO

class Test_Pandas_Read_CSV :

    @pytest.mark.parametrize(
            'csv_str,kwargs,exp_status',
            [
                (
                    ('0.00042119645,3.4,8.8244e-5\r\n'),    #! csv_str
                    dict(                                   #! kwargs
                        float_precision='round_trip',
                        # engine='c',
                    ),
                    True,                                   #! exp_status
                ),
                (
                    ('0.00042119645,3.4,8.8244e-5\r\n'),    #! csv_str
                    dict(                                   #! kwargs
                        float_precision='round_trip',
                        engine='c',
                    ),
                    True,                                   #! exp_status
                ),
                (
                    ('0.00042119645,3.4,8.8244e-5\r\n'),    #! csv_str
                    dict(                                   #! kwargs
                        float_precision='round_trip',
                        engine='python',
                    ),
                    False,                                  #! exp_status
                ),
            ]
        )
    def test_pandas_read_csv(
            self,
            # mocker,                                         #! pytest-mock test fixture
            fs,                                             #! pyfakefs test fixture
            csv_str     : str,
            kwargs,
            exp_status  : bool,
        ) :

        try :
            df = pd.read_csv( StringIO(csv_str), **kwargs )
            status = True
        except Exception as exc:
            status = False

        assert status == exp_status
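
The same expectations can also be written without the try/except and status flag. The following is only a sketch: the names CSV_ROW, expectation and test_read_csv_float_precision are made up for illustration, and the fs fixture is deliberately left out so the file runs on its own. Adding fs to the parameter list should reproduce the failure described above on the affected pandas version.

```python
from contextlib import nullcontext
from io import StringIO

import pandas as pd
import pytest

# Same single-row CSV as in the tests above.
CSV_ROW = '0.00042119645,3.4,8.8244e-5\r\n'


def expectation(should_pass):
    """Return a context manager matching the expected outcome."""
    # nullcontext() does nothing; pytest.raises() fails the test
    # unless a ValueError is raised inside the with-block.
    return nullcontext() if should_pass else pytest.raises(ValueError)


@pytest.mark.parametrize(
    'kwargs,should_pass',
    [
        (dict(float_precision='round_trip'), True),
        (dict(float_precision='round_trip', engine='c'), True),
        (dict(float_precision='round_trip', engine='python'), False),
    ],
)
def test_read_csv_float_precision(kwargs, should_pass):
    with expectation(should_pass):
        pd.read_csv(StringIO(CSV_ROW), header=None, **kwargs)
```

With pytest.raises(ValueError), an unrelated exception type shows up as a test error instead of being silently counted as the expected failure.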
  • I upgraded (pip install -U pandas) to the latest pandas (1.3.5), which also upgraded numpy (1.21.6), and the tests now pass. I previously had pandas (0.23.3) and numpy (1.16.2) installed as site-packages on my Debian 10 Buster box. Those are the versions actually used on the target device, so I would prefer to run my pytest tests against them as well. Not sure if that's possible now? Commented Feb 1, 2023 at 10:58
  • The minimum version of pandas that works is 1.3 (1.3.5), which requires numpy >= 1.17.3. Commented Feb 1, 2023 at 11:10
  • I have a workaround that involves mocking the pandas.read_csv() call with a function that first pops the float_precision key from the keyword arguments, and then calls the original pandas.read_csv() function. I wrapped that in a customized pyfakefs fixture to do the mocking. Though this makes the tests pass, I am concerned that the real C-based functions aren't being called and thus the results may differ from what would run on the real target. Commented Feb 1, 2023 at 23:44

1 Answer

My workaround involves mocking the pandas.read_csv() call with a function that first pops the float_precision key from the keyword arguments, and then calls the original pandas.read_csv() function. I wrapped that in a customized pyfakefs fixture to do the mocking.

Though this makes the tests pass, I am concerned that the real C-based functions aren't being called and thus the results may vary from what would run on the real target.

orig_pandas_read_csv = pd.read_csv

def mock_pandas_read_csv( *args, **kwargs ):
    kwargs.pop('float_precision', None)
    return orig_pandas_read_csv( *args, **kwargs )


@pytest.fixture
def my_fs(mocker, fs):
    mocker.patch.object( pd, 'read_csv', new=mock_pandas_read_csv )
    yield fs
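
A possible refinement is to strip float_precision only on pandas versions that need it, so that newer installs still exercise the real option. This is a sketch, not a tested fix: the 1.3 threshold is taken from the comments on the question, and version_tuple, needs_float_precision_patch and make_read_csv are hypothetical helper names.

```python
import re


def version_tuple(version, parts=2):
    """Return the leading numeric components of a version string."""
    numbers = []
    for piece in version.split('.')[:parts]:
        match = re.match(r'\d+', piece)
        numbers.append(int(match.group()) if match else 0)
    return tuple(numbers)


def needs_float_precision_patch(pandas_version):
    """True for pandas older than 1.3 (assumed threshold, see above)."""
    return version_tuple(pandas_version) < (1, 3)


def make_read_csv(orig_read_csv, pandas_version):
    """Return orig_read_csv, wrapped only if the version needs it."""
    if not needs_float_precision_patch(pandas_version):
        return orig_read_csv

    def patched_read_csv(*args, **kwargs):
        # Drop the option that triggers the ValueError, keep the rest.
        kwargs.pop('float_precision', None)
        return orig_read_csv(*args, **kwargs)

    return patched_read_csv
```

In the fixture this would be used as mocker.patch.object( pd, 'read_csv', new=make_read_csv( orig_pandas_read_csv, pd.__version__ ) ).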

...

    def test_pandas_read_csv(
            self,
            # mocker,                                         #! pytest-mock test fixture
            my_fs,                                          #! pyfakefs test fixture (with mocked pandas functions)
            csv_str     : str,
            kwargs,
            exp_status  : bool,
            ) :
        try :
            df = pd.read_csv( StringIO(csv_str), **kwargs )
            status = True
        except Exception as exc:
            status = False

        assert status == exp_status
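
To get a feel for how much the dropped option could matter, one rough check is to compare the parsed values against Python's own float(), outside of pyfakefs. This is a sketch under the assumption that float_precision='round_trip' is meant to agree with float(); parse_row and the variable names are illustrative only, and the result on the old target versions may differ from what a newer pandas gives.

```python
from io import StringIO

import pandas as pd

CSV_ROW = '0.00042119645,3.4,8.8244e-5\r\n'


def parse_row(csv_str, **kwargs):
    """Parse a single headerless CSV row and return its values as floats."""
    frame = pd.read_csv(StringIO(csv_str), header=None, **kwargs)
    return [float(value) for value in frame.iloc[0]]


# Reference values as parsed by Python itself.
reference = [float(token) for token in CSV_ROW.strip().split(',')]

# Values with and without the option that the workaround removes.
round_trip = parse_row(CSV_ROW, float_precision='round_trip')
default = parse_row(CSV_ROW)

# Largest relative deviation of the default parser from the reference.
worst_rel_error = max(
    abs(d - r) / abs(r) for d, r in zip(default, reference)
)
print(round_trip == reference, worst_rel_error)
```

If worst_rel_error is zero or within a few units in the last place, the workaround is unlikely to hide a meaningful difference for this kind of data, but that has to be checked with the versions installed on the target.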