CSV iteration addition algorithm Python

Question

I am trying to write an algorithm that reads a CSV file, takes a list of names from column 0 and a list of hours worked from column 6, and then iterates through the staff names. For each name, every row is checked: if the row's name equals the current name, its hours are added to a running total. Once the whole file has been checked for that name, the name is removed from the list and the next name is checked.

main.py

from modules import gui_module as gui
from modules import csv_operations as csvcalc

if __name__ == "__main__":

    programWindow = gui.programGUI("CSV Calculator", 200, 150)
    programWindow.run()

    finalStaffHours = csvcalc.calculate_csv_data(gui.csv_file_path, 'staff', 'shift time ex. break (hr)') # Send the CSV file path to the iteration module.
    print (finalStaffHours)

csv_operations.py

import pandas as pd
import csv

def calculate_csv_data(csv_file_path, name_column, hours_column):
    """ Take CSV File, remove header row and find the length of said file."""

    dataFrame = pd.read_csv(csv_file_path)
    file_length = len(dataFrame) # Find length of file

    uncheckedNames = get_name_column(dataFrame, 'staff') # Get list of staff names - contains duplicates
    allHoursWorked = get_hours_column(dataFrame, 'shift time ex. break (hr)')

    finalStaffHours = calculate_staff_hours(allHoursWorked, uncheckedNames, int(file_length))
    return finalStaffHours

def calculate_staff_hours(all_hours_worked, unchecked_names, file_length):
    """ Calculates each staff members hours then returns the list to main."""
    staffHoursAndName = []
    currentHourTotal = 0.0
    uncheckedNamesRemovedDupe = list(set(unchecked_names))
    while uncheckedNamesRemovedDupe:
        currentNameToCheck = uncheckedNamesRemovedDupe[0]
        for i in range(file_length):
            if i == (file_length + 1):
                hourAndNameCombined = currentNameToCheck + str(currentHourTotal)
                staffHoursAndName.append(hourAndNameCombined)
                currentHourTotal = 0.0
                del uncheckedNamesRemovedDupe[0]
            if currentNameToCheck != unchecked_names[i]:
                 continue
            if currentNameToCheck == unchecked_names[i]:
                 currentHourTotal += all_hours_worked[i]
                 print(currentHourTotal)
    if not uncheckedNamesRemovedDupe:
        return staffHoursAndName

def get_hours_column(csv_file_path, hours_column):
    """ Pull all of the hours worked from column 6 and return that. """
    if hours_column in csv_file_path.columns:
        return csv_file_path[hours_column].tolist()
    
def get_name_column(csv_file_path, name_column):
    """ Pull all of the staff names from column 0 CSV and return them. """
    if name_column in csv_file_path.columns:
        return csv_file_path[name_column].tolist()

I can read in the CSV, get the list from column 0 and column 6 - the problem lies in the calculate_staff_hours function in csv_operations.py

It should add the numbers together for one person, once done then remove that person from the list, move on to the next and so on.

Instead, it adds the numbers from the CSV file in a never-ending loop.

If this is not a homework question, pandas can read in your CSV and sum the hours worked for each name using .groupby() in a couple of lines of code.
– Emi OB, Commented Sep 18 at 6:43
It's not a homework question. I didn't know this about pandas and it has blown my mind! Thank you, this saved hours of work.
– JackInDaBeanSock, Commented Sep 18 at 8:39
Maybe first use print() (and print(type(...)), print(len(...)), etc.) to see which part of the code is executed and what you really have in your variables. This is called "print debugging" and it helps you see what the code is really doing.
– furas, Commented Sep 18 at 12:15
Some of your code seems too complicated. Why create get_name_column and get_hours_column just to get a column? They do the same thing. Besides, if the column doesn't exist they return None, but you never check whether you got a column or None. You could run csv_file_path['staff'].tolist() directly, without the functions. That would make the code shorter and more readable.
– furas, Commented Sep 18 at 12:18

S.B · Accepted Answer · 2025-09-18 06:47:16Z

5

i == (file_length + 1) will never be True: range(file_length) only yields values from 0 to file_length - 1, so i never reaches file_length + 1. Therefore the first element of uncheckedNamesRemovedDupe will never be deleted.

Thus, provided uncheckedNamesRemovedDupe is a non-empty list before entering the while loop, your code will run ad infinitum.
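One way the loop could be restructured so that it terminates is sketched below. This is not the asker's final solution, only an illustration under some assumptions: the file_length parameter is dropped, the totals are returned as (name, total) tuples rather than concatenated strings, and the sample names and hours are made up.

```python
def calculate_staff_hours(all_hours_worked, unchecked_names):
    """Return a list of (name, total_hours) pairs, one per distinct name."""
    staff_hours_and_name = []
    # sorted() only makes the output order predictable; a bare set is unordered.
    for current_name in sorted(set(unchecked_names)):
        current_hour_total = 0.0
        # zip() pairs each name with its hours, so no index comparison is needed.
        for name, hours in zip(unchecked_names, all_hours_worked):
            if name == current_name:
                current_hour_total += hours
        # Record the total only after the inner loop has scanned every row.
        staff_hours_and_name.append((current_name, current_hour_total))
    return staff_hours_and_name


# Hypothetical sample data, not taken from the asker's CSV.
names = ["Jim", "Sarah", "Jim", "Judy"]
hours = [5, 7, 3, 2]
result = calculate_staff_hours(hours, names)
print(result)
```

Iterating over the distinct names with a for loop removes the need to delete items from the list, which is where the original loop got stuck.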

edited Sep 18 at 6:47

S.B


answered Sep 18 at 5:44

jackal



1 Comment

JackInDaBeanSock Sep 18 at 8:40

I don't know how I didn't notice that it will never reach file_length +1 - thank you for that.

Emi OB · Accepted Answer · 2025-09-18 09:27:04Z

3

(As discussed in the comments) you can do this really easily using Pandas, avoiding complicated logic.

Use .read_csv() to read in your file, and .groupby() to sum up the hours:

import pandas as pd
df = pd.read_csv('filename.csv')
df.groupby('Name Column')['Hours'].sum()

As an example:

df = pd.DataFrame({'Name Column': ['Jim', 'Sarah', 'Jim', 'Judy', 'Jim', 'Sarah', 'Sarah', 'Judy'],
                   'Hours': [5, 7, 3, 2, 6, 8, 8, 4]})

  Name Column  Hours
0         Jim      5
1       Sarah      7
2         Jim      3
3        Judy      2
4         Jim      6
5       Sarah      8
6       Sarah      8
7        Judy      4

df.groupby('Name Column')['Hours'].sum()

Name Column
Jim      14
Judy      6
Sarah    23
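The result of .groupby(...).sum() here is a pandas Series indexed by name. If a plain dictionary or an ordinary two-column table is easier to work with afterwards, something like the following sketch might help; it reuses the example data above, and the variable names totals_dict and totals_table are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name Column": ["Jim", "Sarah", "Jim", "Judy", "Jim", "Sarah", "Sarah", "Judy"],
    "Hours": [5, 7, 3, 2, 6, 8, 8, 4],
})

# Series of summed hours, indexed by the values of "Name Column".
totals = df.groupby("Name Column")["Hours"].sum()

# Plain dict mapping each name to its total hours.
totals_dict = totals.to_dict()

# DataFrame with "Name Column" and "Hours" as regular columns again.
totals_table = totals.reset_index()
```

to_dict() is convenient for lookups by name, while reset_index() keeps the data in pandas for further processing or for writing back out with to_csv().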

answered Sep 18 at 9:27

Emi OB


2 Comments

JackInDaBeanSock Sep 18 at 9:30

Thank you for the solution. After the previous comment I did some reading and came to the same sort of answer! groupby() has saved 30+ lines of writing!

furas Sep 18 at 12:23

If this answer helped you, then you could mark it as accepted. That will also tell others that the problem was solved.

JackInDaBeanSock · Accepted Answer · 2025-09-18 09:31:37Z

1

This was how I solved this!

import pandas as pd

def calculate_csv_data(csv_file_path, name_column, hours_column):
    """ Take CSV File, Staff Column and Hours Column, returns each staff member with their total hours. """
    dataFrame = pd.read_csv(csv_file_path)
    groupedData = dataFrame.groupby(['staff','shift time ex. break (hr)']).sum()
    finalData = dataFrame.groupby(['staff'], sort = True)['shift time ex. break (hr)'].sum()
    return finalData

answered Sep 18 at 9:31

JackInDaBeanSock


1 Comment

furas Sep 18 at 12:21

You create groupedData but never use it later, so you could remove this line.
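Following that suggestion, a trimmed version might look like the sketch below. It is only one possible variant, with a few assumptions: it uses the name_column and hours_column parameters instead of hard-coded column names, it raises a KeyError when a column is missing (a choice made here, not something from the thread), and the in-memory sample CSV is invented for illustration.

```python
import io

import pandas as pd


def calculate_csv_data(csv_file, name_column, hours_column):
    """Return a Series of total hours per staff member."""
    data_frame = pd.read_csv(csv_file)
    # Fail loudly on a missing column instead of silently returning None.
    for column in (name_column, hours_column):
        if column not in data_frame.columns:
            raise KeyError(f"column {column!r} not found in CSV")
    return data_frame.groupby(name_column, sort=True)[hours_column].sum()


# Hypothetical sample data; read_csv accepts a path or a file-like object.
sample_csv = io.StringIO(
    "staff,shift time ex. break (hr)\n"
    "Jim,5\n"
    "Sarah,7\n"
    "Jim,3\n"
)
totals = calculate_csv_data(sample_csv, "staff", "shift time ex. break (hr)")
print(totals)
```

Passing the column names through the parameters means the same function can be reused for CSV files with different headers.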
