1

I have an Excel file with two columns. The left column holds labels grouped by date: each date is followed by a list of labels, with the corresponding values in the right column. I need to read each date, find some specific labels, and print their values. Here is an excerpt from the original file to show how it looks.

Row Label 1    Row Label 2
7/21/2015      123
Label 1        10.5
Label 2        20.6
[.....]        15
Label 5        25.9
Label 6        30.5
[.....]        544
7/22/2015      456
Label 1        15.8
Label 2        52.8
[.....]        87
Label 5        99
Label 6        55
Goes on....

As you can see above, I need to find the date first, then print only Labels 1, 2 and 6 along with their values from the right column. These labels repeat for each date with different values. The Excel file has thousands of lines of this kind of text, and I need to print each date, followed by those labels and their respective values.

The output should be something like this.

7/21/2015
Label 1       10.5
Label 2       20.6
Label 6       30.5

I'm quite new to Python and I saw some posts that used xlrd. I'm not sure how to approach this problem, so if anybody can help me out, that would be great! Any sort of help is appreciated :)

1
  • 1
    I'm thinking Pandas is the right lib here. Commented Jul 21, 2015 at 6:02

3 Answers

2

The following script should get you started. It uses the openpyxl library to read an Excel spreadsheet in.

import openpyxl

wb = openpyxl.load_workbook(filename='input.xlsx')
ws = wb.active

# Start at row 2 to skip the header row.
for row in range(2, ws.max_row + 1):
    row_label_1 = ws['A%d' % row].value
    row_label_2 = ws['B%d' % row].value

    if row_label_1 is None:  # Skip empty cells
        continue
    if "/" in str(row_label_1):  # Simple test for date
        print(row_label_1)
    elif row_label_1 in ["Label 1", "Label 2", "Label 6"]:
        print("%-20s  %s" % (row_label_1, row_label_2))

Tested using Python 2.7
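
The loop above mixes reading cells with deciding what to print. As a minimal sketch, the selection logic can be pulled into a function that works on plain (label, value) pairs, which makes it easy to try without a workbook. The names `WANTED`, `is_date` and `extract_labels` are made up for illustration, and the date check assumes dates arrive either as `M/D/YYYY` strings or as `datetime` objects (which openpyxl returns for cells formatted as dates). This version targets Python 3.

```python
from datetime import date, datetime

WANTED = ("Label 1", "Label 2", "Label 6")  # labels to keep (hypothetical name)


def is_date(cell):
    """Return True for date/datetime cells or strings shaped like 7/21/2015."""
    if isinstance(cell, (datetime, date)):
        return True
    if not isinstance(cell, str):
        return False
    try:
        datetime.strptime(cell.strip(), "%m/%d/%Y")
        return True
    except ValueError:
        return False


def extract_labels(rows, wanted=WANTED):
    """Yield output lines: each date, then the wanted labels under it."""
    for label, value in rows:
        if is_date(label):
            yield str(label)
        elif label in wanted:
            yield "%-20s  %s" % (label, value)


# Sample rows copied from the excerpt in the question.
rows = [
    ("7/21/2015", 123), ("Label 1", 10.5), ("Label 2", 20.6),
    ("Label 5", 25.9), ("Label 6", 30.5),
    ("7/22/2015", 456), ("Label 1", 15.8),
]
lines = list(extract_labels(rows))
print("\n".join(lines))
```

With openpyxl, the pairs could come from `ws.iter_rows(min_row=2, max_col=2, values_only=True)`, assuming a version recent enough to support `values_only`.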


1

The script below uses xlrd, which only works with older Excel files that have the '.xls' extension. For '.xlsx' files, use openpyxl instead.

The example also assumes that all the data in the first column is stored as text. Otherwise, it could be modified to check the cell data types.

Tested with Python 2.7

import xlrd


header_column = 0
value_column = 1
accepted_labels = ['Label 1', 'Label 2', 'Label 6']
output = {}

with xlrd.open_workbook("C:\\temp\\book1.xls") as work_book:
    work_sheet = work_book.sheet_by_index(0)

    current_date = None
    # loop through every row, including the last one
    for current_row in range(work_sheet.nrows):
        header = work_sheet.cell_value(current_row, header_column)

        if 'label' not in header.lower():
            # any first-column cell without "label" in it starts a new date block
            current_date = header
            output[current_date] = {}  # a fresh dict for each date
        elif header in accepted_labels and current_date is not None:
            output[current_date][header] = work_sheet.cell_value(current_row, value_column)

print(output)
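
The script above assumes the first column is stored as text. If the dates are real Excel date cells, xlrd returns them as floating-point serial numbers instead, so calling `.lower()` on them would fail. As a rough sketch of what the conversion involves, the functions below turn a serial number into a date using only the standard library. They assume the workbook uses the default 1900 date system and that the serial falls after February 1900; `excel_serial_to_date` and `header_text` are hypothetical helper names. With xlrd itself, comparing the cell type against `xlrd.XL_CELL_DATE` and calling `xlrd.xldate_as_tuple(value, work_book.datemode)` should cover the same ground.

```python
from datetime import date, timedelta

# Day zero of the 1900 date system, shifted to absorb Excel's phantom 29 Feb 1900.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_date(serial):
    """Convert an Excel serial number (1900 date system) to a date.

    Any fractional part (the time of day) is discarded.
    """
    return EXCEL_EPOCH + timedelta(days=int(serial))


def header_text(cell_value):
    """Return a first-column cell as text, converting numeric serials to M/D/YYYY."""
    if isinstance(cell_value, (int, float)):
        d = excel_serial_to_date(cell_value)
        return "%d/%d/%d" % (d.month, d.day, d.year)
    return cell_value


print(header_text(42206.0))    # a serial number as read from a date cell
print(header_text("Label 1"))  # text cells pass through unchanged
```

Wrapping the first-column value in `header_text` before the `'label' not in ...` test would let the same loop handle both text and date cells.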


1
C:\>pip install pandas

After installing pandas (the Python data analysis library) as shown above:

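import pandas as pd

filename = 'input.xlsx'  # placeholder path to your workbook
sheet_name = 0           # first sheet; a sheet name string also works

# Skip the header row and the first date row; use the label column as the index.
df = pd.read_excel(filename, sheet_name=sheet_name, skiprows=[0, 1], header=None, index_col=0)
df.index.name = '7/21/2015'
df.columns = ['Data']

# The context manager saves and closes result.xlsx on exit.
with pd.ExcelWriter('result.xlsx', datetime_format='yyyy-mm-dd') as writer:
    df.to_excel(writer)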

If you want to handle xls, csv and many other types of dataset files, I highly recommend pandas.
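
The snippet above copies the sheet but does not yet pick out Labels 1, 2 and 6 for each date. One way this could look in pandas is sketched below, using an in-memory DataFrame in place of `pd.read_excel` so it runs without a file. The column names `label` and `value`, and the assumption that dates are stored as `M/D/YYYY` text, are illustrative and not taken from the original workbook.

```python
import pandas as pd

# Stand-in for pd.read_excel(...); column names are assumptions for this sketch.
df = pd.DataFrame({
    "label": ["7/21/2015", "Label 1", "Label 2", "Label 5", "Label 6",
              "7/22/2015", "Label 1", "Label 2", "Label 5", "Label 6"],
    "value": [123, 10.5, 20.6, 25.9, 30.5, 456, 15.8, 52.8, 99, 55],
})

# Rows whose label parses as a date start a new block.
is_date = pd.to_datetime(df["label"], format="%m/%d/%Y", errors="coerce").notna()

# Carry each date down onto the label rows beneath it.
df["date"] = df["label"].where(is_date).ffill()

# Keep only the wanted labels, now tagged with their date.
wanted = df[df["label"].isin(["Label 1", "Label 2", "Label 6"])]
result = wanted[["date", "label", "value"]].reset_index(drop=True)
print(result)
```

From there, `result.to_excel('result.xlsx', index=False)` would write the filtered rows back out, and `result.groupby('date')` gives one group per date if you prefer to print them block by block.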
