import time
import logging
from functools import reduce

logging.basicConfig(filename='debug.log', level=logging.DEBUG)



def read_large_file(file_object):
    """Uses a generator to read a large file lazily"""

    while True:
        data = file_object.readline()
        if not data:
            break
        yield data


def process_file_1(file_path):
    """Opens a large file and reads it in"""

    try:
        with open(file_path) as fp:
            for line in read_large_file(fp):
                logging.debug(line)
                pass

    except (IOError, OSError):
        print('Error Opening or Processing file')


def process_file_2(file_path):
    """Opens a large file and reads it in"""

    try:
        with open(file_path) as file_handler:
            while True:
                logging.debug(next(file_handler))
    except (IOError, OSError):
        print("Error opening / processing file")
    except StopIteration:
        pass


if __name__ == "__main__":
    path = "TB_data_dictionary_2016-04-15.csv"

    l1 = []
    for i in range(1, 10):
        start = time.clock()
        process_file_1(path)
        end = time.clock()
        diff = (end - start)
        l1.append(diff)

    avg = reduce(lambda x, y: x + y, l1) / len(l1)
    print('processing time (with generators) {}'.format(avg))


    l2 = []
    for i in range(1, 10):
        start = time.clock()
        process_file_2(path)
        end = time.clock()
        diff = (end - start)
        l2.append(diff)

    avg = reduce(lambda x, y: x + y, l2) / len(l2)
    print('processing time (with iterators) {}'.format(avg))

Output of the program:

C:\Python34\python.exe C:/pypen/data_structures/generators/generators1.py
processing time (with generators) 0.028033358176432314
processing time (with iterators) 0.02699498330810426

In the above program I was attempting to compare the time taken to open and read a large file using iterators with the time taken using generators. The file is available here. The time for reading the file with iterators is lower than the time with generators.

I am assuming that if I were to measure the amount of memory used by the functions process_file_1 and process_file_2, the generator version would outperform the iterator version. Is there a way to measure memory usage per function in Python?

  • Do a read that you simply discard before the 2 tests to make sure any caching of the file by the operating system applies to both runs (see the sketch below). Commented Nov 27, 2016 at 22:05
  • @tdelaney - I have updated the program slightly Commented Nov 27, 2016 at 22:30
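
A warm-up read along the lines of the first comment might look like this (a minimal sketch; it assumes it is placed in the __main__ block before either timing loop):

# Read the file once and discard the data so the operating system's file
# cache is equally warm for both timed runs.
with open(path, 'rb') as warmup:
    warmup.read()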

1 Answer


Firstly, using a single iteration of the code to measure its performance is not a good idea. Your results might vary because of glitches in system performance (for example: a background process, the CPU doing garbage collection, etc.). You should check it over multiple iterations of the same code.

To measure the performance of the code, use the timeit module:

This module provides a simple way to time small bits of Python code. It has both a Command-Line Interface as well as a callable one. It avoids a number of common traps for measuring execution times.
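
For example, the manual timing loops in the __main__ block could be replaced with something like the following (a minimal sketch, assuming process_file_1, process_file_2 and path are defined in the running script, as in the question):

import timeit

setup = "from __main__ import process_file_1, process_file_2, path"

# Run each function 10 times and report the average time per call.
t_gen = timeit.timeit("process_file_1(path)", setup=setup, number=10)
t_it = timeit.timeit("process_file_2(path)", setup=setup, number=10)

print("processing time (with generators) {}".format(t_gen / 10))
print("processing time (with iterators) {}".format(t_it / 10))

The module also has a command-line interface, e.g. python -m timeit -s "import mymodule" "mymodule.process_file_1('file.csv')", where mymodule and the file name are placeholders for your own script and data file.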

For checking the memory consumption of your code, use Memory Profiler:

This is a python module for monitoring memory consumption of a process as well as line-by-line analysis of memory consumption for python programs.
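
A minimal sketch of how it might be used here (assuming the package is installed, e.g. with pip install memory_profiler): decorate the function you want to inspect and run the script under the profiler.

from memory_profiler import profile

@profile
def process_file_1(file_path):
    """Same body as in the question; @profile reports per-line memory usage."""
    try:
        with open(file_path) as fp:
            for line in read_large_file(fp):
                pass
    except (IOError, OSError):
        print('Error Opening or Processing file')

# Run with:  python -m memory_profiler generators1.py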
