pyfakefs fixture causes pandas.read_csv() to fail in pytest

Question

I am using pandas read_csv() function to read some CSV content and want to use "high" or "round_trip" floating-point precision.

The following works in the Python REPL or when running a program with the python interpreter directly.

import pandas as pd
from io import StringIO

df = pd.read_csv( StringIO('0.00042119645,3.4,8.8244e-5\r\n'), float_precision='round_trip' )

If I put it in a pytest test, it also works.

However, if I use the pyfakefs fixture, it fails. For some reason pd.read_csv() then uses the python engine instead of the default C engine, even if I explicitly set engine='c'.

The error reported is:

ValueError("The 'float_precision' option is not supported with the 'python' engine")
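For comparison, the same message can be triggered without pyfakefs by requesting the python engine explicitly, which suggests the fixture is somehow steering read_csv() onto that engine. This is only a minimal sketch, assuming a pandas version whose C engine accepts float_precision; the helper name read_with_engine is made up for illustration.

```python
from io import StringIO

import pandas as pd

CSV_TEXT = "0.00042119645,3.4,8.8244e-5\r\n"


def read_with_engine(engine):
    """Hypothetical helper: return None on success, else the ValueError text."""
    try:
        pd.read_csv(
            StringIO(CSV_TEXT),
            header=None,
            float_precision="round_trip",
            engine=engine,
        )
    except ValueError as exc:
        return str(exc)
    return None


# The C engine accepts float_precision; the python engine rejects it.
c_result = read_with_engine("c")
python_result = read_with_engine("python")

print("engine='c'      ->", c_result)
print("engine='python' ->", python_result)
```

If the second line printed matches the error seen under the fs fixture, the failure is the python engine's option check rather than anything specific to the CSV content.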

import pandas as pd
from io import StringIO

def test_pandas_read_csv(
        #mocker,     #! pytest-mock test fixture
        fs,         #! pyfakefs test fixture
        ):
    try :
        df = pd.read_csv( StringIO('0.00042119645,3.4,8.8244e-5\r\n'), float_precision='round_trip' )
        assert True
    except Exception as exc:
        assert False

How do I use the pandas read_csv() function with the default C engine in my pytest tests that also require the pyfakefs fixture?

Here is my pytest code. The tests fail as written; comment out the fs, line to make them all pass.

import pytest

import pandas as pd
from io import StringIO

class Test_Pandas_Read_CSV :

    @pytest.mark.parametrize(
            'csv_str,kwargs,exp_status',
            [
                (
                    ('0.00042119645,3.4,8.8244e-5\r\n'),    #! csv_str
                    dict(                                   #! kwargs
                        float_precision='round_trip',
                        # engine='c',
                    ),
                    True,                                   #! exp_status
                ),
                (
                    ('0.00042119645,3.4,8.8244e-5\r\n'),    #! csv_str
                    dict(                                   #! kwargs
                        float_precision='round_trip',
                        engine='c',
                    ),
                    True,                                   #! exp_status
                ),
                (
                    ('0.00042119645,3.4,8.8244e-5\r\n'),    #! csv_str
                    dict(                                   #! kwargs
                        float_precision='round_trip',
                        engine='python',
                    ),
                    False,                                  #! exp_status
                ),
            ]
        )
    def test_pandas_read_csv(
            self,
            # mocker,                                         #! pytest-mock test fixture
            fs,                                             #! pyfakefs test fixture
            csv_str     : str,
            kwargs,
            exp_status  : bool,
        ) :

        try :
            df = pd.read_csv( StringIO(csv_str), **kwargs )
            status = True
        except Exception as exc:
            status = False

        assert status == exp_status

I upgraded (pip install -U pandas) to the latest pandas (1.3.5), which also upgraded numpy (1.21.6), and the tests now pass. I had pandas (0.23.3) and numpy (1.16.2) installed as site-packages on my Debian 10 Buster box. Those are the versions actually used on the target device, so I would prefer to run my pytest tests with them as well. Not sure if that's possible now?
– user19007114, Commented Feb 1, 2023 at 10:58
The minimum version of pandas that works is 1.3 (1.3.5), which requires numpy >= 1.17.3.
– user19007114, Commented Feb 1, 2023 at 11:10
I have a workaround that involves mocking the pandas.read_csv() call with a function that first pops the float_precision key from the keyword arguments and then calls the original pandas.read_csv() function. I wrapped that in a customized pyfakefs fixture to do the mocking. Though this makes the tests pass, I am concerned that the real C-based functions aren't being called and thus the results may vary from what would run on the real target.
– user19007114, Commented Feb 1, 2023 at 23:44

tdy · Accepted Answer · 2023-02-04 02:19:32Z

My workaround involves mocking the pandas.read_csv() call with a function that first pops the float_precision key from the keyword arguments and then calls the original pandas.read_csv() function. I wrapped that in a customized pyfakefs fixture to do the mocking.

Though this makes the tests pass, I am concerned that the real C-based functions aren't being called and thus the results may vary from what would run on the real target.

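import pandas as pd
import pytest

# Keep a reference to the real function before it gets patched.
orig_pandas_read_csv = pd.read_csv

def mock_pandas_read_csv( *args, **kwargs ):
    # Drop the option that the python engine rejects, then delegate.
    kwargs.pop('float_precision', None)
    return orig_pandas_read_csv( *args, **kwargs )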


@pytest.fixture
def my_fs(mocker, fs):
    mocker.patch.object( pd, 'read_csv', new=mock_pandas_read_csv )
    yield fs

...

    def test_pandas_read_csv(
            self,
            # mocker,                                         #! pytest-mock test fixture
            my_fs,                                          #! pyfakefs test fixture (with mocked pandas functions)
            csv_str     : str,
            kwargs,
            exp_status  : bool,
            ) :
        try :
            df = pd.read_csv( StringIO(csv_str), **kwargs )
            status = True
        except Exception as exc:
            status = False

        assert status == exp_status
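One way to gauge the concern about results varying is to run a check outside the fake filesystem that compares what the C engine returns with and without float_precision='round_trip'. The following is a rough sketch, not part of the original tests: parse_row is a hypothetical helper, header=None is added so the single row is read as data, and Python's float() is used as the reference value.

```python
from io import StringIO

import pandas as pd

CSV_TEXT = "0.00042119645,3.4,8.8244e-5\r\n"

# Reference values: what Python's own float() produces for each field.
expected = [float(token) for token in CSV_TEXT.strip().split(",")]


def parse_row(**kwargs):
    """Hypothetical helper: parse the single CSV row and return it as floats."""
    df = pd.read_csv(StringIO(CSV_TEXT), header=None, **kwargs)
    return df.iloc[0].tolist()


round_trip = parse_row(float_precision="round_trip")
default = parse_row()

print("round_trip matches float():", round_trip == expected)
print("default matches float():   ", default == expected)
```

If the default parser also matches float() for the values that matter on the target, dropping float_precision inside the mocked fixture should not change those particular results. That is a per-dataset observation rather than a general guarantee.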
