
I have two functions in a Python file, and I want to unit test them using Mock.

from functools import reduce

def col_rename(col_name):
    # Apply each (old, new) replacement to the column name in order.
    reps = ((' ', '_&'), ('(', '*_'), (')', '_*'), ('{', '#_'), ('}', '_#'))
    new_cols = reduce(lambda a, kv: a.replace(*kv), reps, col_name)
    return new_cols

def rename_characters(df):
    df_cols = df.schema.names
    for x in df_cols:
        df = df.withColumnRenamed(x, col_rename(x))
    return df

In the above function, withColumnRenamed is a PySpark method that returns a new DataFrame with the given column renamed. df is a PySpark DataFrame.

I am able to unit test the col_rename function.
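For reference, a test for col_rename needs no mocking at all, since it is a pure string function. A minimal sketch, assuming Python 3 and the standard unittest module (the class name ColRenameTest and the sample column names are only illustrative):

```python
import unittest
from functools import reduce


def col_rename(col_name):
    # Same replacement table as in the question.
    reps = ((' ', '_&'), ('(', '*_'), (')', '_*'), ('{', '#_'), ('}', '_#'))
    return reduce(lambda a, kv: a.replace(*kv), reps, col_name)


class ColRenameTest(unittest.TestCase):
    def test_replaces_special_characters(self):
        # Space, parentheses and braces are each swapped for their placeholder.
        self.assertEqual(col_rename('a (b)'), 'a_&*_b_*')
        self.assertEqual(col_rename('{x}'), '#_x_#')

    def test_leaves_plain_names_alone(self):
        self.assertEqual(col_rename('plain'), 'plain')
```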

I am also able to unit test the rename_characters function by creating DataFrames manually in PySpark.

Now I want to write the unit tests using Mock in Python.

I have tried something like the code below. I am not sure whether this is correct or whether what I am doing is completely wrong.

import unittest
from mock import patch

class Test(unittest.TestCase):
    @patch('mymodule.rename_characters')
    def test_func(self, rename_characters_mock):
        rename_characters_mock.return_value = 'mocked values'
        self.assertEqual(return_value, 'mocked_values'))

How can I use mocking for unit testing in the above scenario?

  • from mymodule import rename_characters: are you sure we can import a function? Commented Mar 15, 2018 at 2:39
  • @Gang PyCharm gave me an unused import statement error, so I removed the import statement. Commented Mar 15, 2018 at 2:46
  • Almost there. It makes more sense if you want to mock pyspark.x. In self.assertEqual(return_value, 'mocked_values')), return_value is not defined. Are you trying to do self.assertEqual(mymodule.rename_characters(), 'mocked_value')? Commented Mar 15, 2018 at 2:52
  • @Gang I want to try self.assertEqual(mymodule.rename_characters(), 'mocked_value'). Commented Mar 15, 2018 at 3:00

1 Answer

You might need this import:

import mymodule

Outside the Test class, define a local function:

def local_rename_characters():
    return 'mocked_local_values'

This should work:

@patch('mymodule.rename_characters')
def test_func(self, rename_characters_mock):
    # The return value must match the asserted string exactly.
    rename_characters_mock.return_value = 'mocked_values'
    self.assertEqual(mymodule.rename_characters(), 'mocked_values')

Alternatively, use side_effect:

@patch('mymodule.rename_characters')
def test_func(self, rename_characters_mock):
    rename_characters_mock.side_effect = local_rename_characters
    self.assertEqual(mymodule.rename_characters(), 'mocked_local_values')
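
The two variants can be tried without a real mymodule on the path. The sketch below is only an illustration under some assumptions: it uses a SimpleNamespace as a hypothetical stand-in for mymodule, patch.object instead of the string form of patch, and unittest.mock from the Python 3 standard library rather than the mock backport. Note that a test like this only checks the mock itself; none of the real rename_characters code runs while it is patched.

```python
import unittest
from types import SimpleNamespace
from unittest.mock import patch

# Hypothetical stand-in for `mymodule`, so the sketch runs on its own.
mymodule = SimpleNamespace(rename_characters=lambda df: df)


def local_rename_characters(df):
    # Replacement behaviour used through side_effect.
    return 'mocked_local_values'


class PatchTest(unittest.TestCase):
    def test_return_value(self):
        with patch.object(mymodule, 'rename_characters') as mock_fn:
            mock_fn.return_value = 'mocked_values'
            self.assertEqual(mymodule.rename_characters('any df'), 'mocked_values')
            # The mock records how it was called.
            mock_fn.assert_called_once_with('any df')

    def test_side_effect(self):
        with patch.object(mymodule, 'rename_characters',
                          side_effect=local_rename_characters):
            self.assertEqual(mymodule.rename_characters('any df'),
                             'mocked_local_values')

    def test_original_is_restored(self):
        # Outside the patch context the original function is back in place.
        self.assertEqual(mymodule.rename_characters('df'), 'df')
```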

3 Comments

If I want to mock the withColumnRenamed method that is called inside the rename_characters function, how can I do that?
The DataFrame instance is already a variable, so you do not really need to mock it. But if you do want to experiment and learn, use @patch('mymodule.pyspark.sql.DataFrame.withColumnRenamed') and then set mocked_withColumnRenamed.side_effect = local_create_dummy_df, using createDataFrame.
Could you please have a look at https://stackoverflow.com/questions/49420660/unit-test-pyspark-code-using-python?
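
Since df is passed in as an argument, one option is to hand rename_characters a mock object in place of a real DataFrame, so no patching and no Spark session are needed. A rough sketch, assuming Python 3's unittest.mock; the two functions are copied from the question, and the column names and test class name are made up for illustration:

```python
import unittest
from functools import reduce
from unittest.mock import MagicMock, call


def col_rename(col_name):
    reps = ((' ', '_&'), ('(', '*_'), (')', '_*'), ('{', '#_'), ('}', '_#'))
    return reduce(lambda a, kv: a.replace(*kv), reps, col_name)


def rename_characters(df):
    df_cols = df.schema.names
    for x in df_cols:
        df = df.withColumnRenamed(x, col_rename(x))
    return df


class RenameCharactersTest(unittest.TestCase):
    def test_renames_every_column(self):
        # A MagicMock stands in for the PySpark DataFrame.
        df = MagicMock()
        df.schema.names = ['a b', 'c(d)']
        # Return the same mock each time so the loop keeps calling it.
        df.withColumnRenamed.return_value = df

        result = rename_characters(df)

        self.assertIs(result, df)
        self.assertEqual(df.withColumnRenamed.call_count, 2)
        df.withColumnRenamed.assert_has_calls(
            [call('a b', 'a_&b'), call('c(d)', 'c*_d_*')])
```

This checks that withColumnRenamed is called once per column with the expected new name. It does not check what Spark actually does with those calls, so a test against a real DataFrame is still worth keeping.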
