0

Apologies if this is a basic question, but let us say I have a tab delimited file named file.txt formatted as follows:

Label-A    [tab]    Value-1

Label-B    [tab]    Value-2

Label-C    [tab]    Value-3

[...]

Label-i    [tab]    Value-n

I want xlrd or openpyxl to add this data to the Excel worksheet named Worksheet in the file workbook.xlsx such that the cells contain the following values. I do not want to affect the contents of any part of workbook.xlsx other than the two columns being written.

A1=Label-A

B1=Value-1

A2=Label-B

B2=Value-2

[etc.]

EDIT: Solution

import sys
import csv
import openpyxl

# Usage: python script.py workbook.xlsx SheetName [first_col] < file.txt
if len(sys.argv) < 3:
    sys.exit("ERROR: expected a workbook path and a sheet name")
workbook = sys.argv[1]
write_sheet = sys.argv[2]

# Optional third argument: first column to write to.
# Columns are 1-based in openpyxl 2.x and later (they were 0-based in 1.x).
try:
    first_col = int(sys.argv[3])
except (IndexError, ValueError):
    first_col = 1

# Read the tab-delimited rows from standard input.
tab_reader = csv.reader(sys.stdin, delimiter='\t')

xls_book = openpyxl.load_workbook(filename=workbook)
xls_sheet = xls_book[write_sheet]
# Rows are 1-based as well, so start counting at 1.
for row_index, row in enumerate(tab_reader, start=1):
    for offset, value in enumerate(row):
        xls_sheet.cell(row=row_index, column=first_col + offset).value = value
xls_book.save(workbook)
  • It's not basic, but you should show some effort on what you tried to do yourself. That way, people looking to answer this question will have something to work with (otherwise you risk getting downvoted too). Commented Apr 16, 2014 at 21:56
  • Thanks for the advice. I have added in what I have done, but this is my first time using Python (I generally work in Bash), so I'm unsure how helpful it will be to readers. Commented Apr 16, 2014 at 22:29
  • If you don't set use_iterators=True then you can modify an existing file. Commented Apr 17, 2014 at 14:07
  • One more (slight) recommendation for Stack Overflow: you don't need to edit the title of your question to mark it as solved (as other forums recommend). The fact that the post has an answer selected as the chosen answer will tell other people that. BTW, kudos for clearly labeling your EDIT with an EDIT, and not just changing the original question. Well done :-) Commented Apr 17, 2014 at 20:18

3 Answers

1

Since you said you are used to working in Bash, I'm assuming you're using some kind of Unix/Linux, so here's something that will work on Linux.

Before pasting the code, I'd like to point out a few things:

Working with Excel in Unix (and Python) is not that straightforward. For instance, you can't open an Excel sheet for reading and writing at the same time (at least, not as far as I know, although I must admit that I have never worked with the openpyxl module). Python has two well-known modules (that I am used to working with :-D) for handling Excel sheets: one for reading them (xlrd) and one for writing them (xlwt). With those two modules, if you want to modify an existing sheet, as I understand you want to do, you need to read the existing sheet, copy it to a writable sheet, and edit that one. Check the answers to this other S.O. question, which explain it in more detail.

Reading whatever-separated files is much easier thanks to the csv module (it's designed for comma-separated files, but it can easily be set up for other separators). Check it out.

Also, I wasn't sure from your example whether the contents of the tab-separated file somehow indicate the row indexes on the Excel sheet or whether they're purely positional. When you say that the tab-separated file contains Value-2, I wasn't sure if that 2 meant the second row of the Excel file or was just an example of some text. I assumed the latter (which is easier to deal with), so whatever Label/Value pair appears on the first line of your tab-separated file will be the pair on the first row of the Excel file. If this is not the case, leave a comment and we will deal with it ;-)
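
As a minimal sketch of the csv point above: the reader only needs delimiter='\t' to split on tabs. The sample string below is a hypothetical stand-in for the contents of the tab-separated file, and io.StringIO is used only so the snippet runs without a file on disk.

```python
import csv
import io

# Hypothetical sample data standing in for the tab-separated file.
sample = "Label-A\tValue-1\nLabel-B\tValue-2\nLabel-C\tValue-3\n"

# delimiter='\t' tells csv.reader to split fields on tabs instead of commas.
rows = list(csv.reader(io.StringIO(sample), delimiter='\t'))

# Each row is a list of strings, one item per tab-separated field.
for label, value in rows:
    print(label, value)
```

With a real file, the same reader can be built from the file object returned by open().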

Ok, so let's assume the following scenario:

You have a tab-separated file like this:

stack37.txt:

Label-A Value-1
Label-B Value-2
Label-C Value-3

The excel file you want to modify is stack37.xls. It only has one sheet (or better said, the sheet you want to modify is the first one in the file) and it initially looks like this (in LibreOffice Calc):

[Screenshot: the original stack37.xls open in LibreOffice Calc]

Now, this is the Python code (I stored it in a file called stack37.py, located in the same directory as the tab-separated file and the Excel file):

import csv
import xlwt
import xlrd
from xlutils import copy as xl_copy

with open('stack37.txt') as tab_file:
    tab_reader = csv.reader(tab_file, delimiter='\t')
    xls_readable_book = xlrd.open_workbook('stack37.xls')
    xls_writeable_book = xl_copy.copy(xls_readable_book)
    xls_writeable_sheet = xls_writeable_book.get_sheet(0)
    for row_index, row in enumerate(tab_reader):
        xls_writeable_sheet.write(row_index, 0, row[0])
        xls_writeable_sheet.write(row_index, 1, row[1])
    xls_writeable_book.save('stack37.xls')

After you run this code, the file stack37.xls will look like this:

[Screenshot: stack37.xls after running the script]

What I meant about not knowing exactly what you wanted to do with the values in your tab-separated file is this: regardless of what you name your items, the code modifies the first row of the Excel sheet, then the second, and so on. Even if your first value is called Value-2, the code above will not put it on the second row of the Excel sheet, but on the first row. It simply assumes that the first line in the tab-separated file holds the values for the first row of the Excel sheet.

Let me explain with a slightly modified example:

Let's assume your original Excel file looks like the original Excel file in my screenshot (the one full of | Hello-Ax | Bye-Bx |) but your tab-separated file now looks like this:

stack37.txt:

foo bar
baz baz2

After you run stack37.py, this is what your Excel file will look like:

[Screenshot: the Excel file after running stack37.py with the modified input]

(see? first row of the tab-separated file goes to the first row in the Excel file)
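
The positional rule can be sketched without any Excel library. The two helpers below, cell_assignments and column_letter, are hypothetical names made up for this illustration (they are not part of xlrd, xlwt or openpyxl). They only compute which cell each value would land in, using the A1=..., B1=... notation from the question.

```python
def column_letter(index):
    """Convert a 1-based column index to spreadsheet letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def cell_assignments(rows, first_row=1, first_col=1):
    """Map each value to a cell address based only on its position in rows."""
    assignments = {}
    for row_offset, row in enumerate(rows):
        for col_offset, value in enumerate(row):
            address = f"{column_letter(first_col + col_offset)}{first_row + row_offset}"
            assignments[address] = value
    return assignments


# The text of each value plays no role; only its line and field position do.
print(cell_assignments([["foo", "bar"], ["baz", "baz2"]]))
```

The values themselves are never inspected, so a first line containing Value-2 would still land on row 1.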

UPDATE 1:

I'm trying the openpyxl module myself... Theoretically (according to the documentation) the following should work (note that I've changed the extensions to Excel 2007/2010 .xlsx):

import csv
import openpyxl

with open('stack37.txt') as tab_file:
    tab_reader = csv.reader(tab_file, delimiter='\t')
    xls_book = openpyxl.load_workbook(filename='stack37.xlsx')
    # Pick the first sheet in the workbook
    xls_sheet = xls_book[xls_book.sheetnames[0]]
    # Rows and columns are 1-based in openpyxl 2.x and later (0-based in 1.x)
    for row_index, row in enumerate(tab_reader, start=1):
        xls_sheet.cell(row=row_index, column=1).value = row[0]
        xls_sheet.cell(row=row_index, column=2).value = row[1]
    xls_book.save('stack37_new.xlsx')

But if I do that, my LibreOffice refuses to open the newly generated file stack37_new.xlsx (maybe it's because my LibreOffice is old? I'm on Ubuntu 12.04 with LibreOffice version 3.5.7.2... who knows, maybe it's just that).

15 Comments

Thanks. This works on my local machine (I did not intend for the string values to correspond to Excel row values), however, since my system administrators have only imported openpyxl and xlrd those are the only ones I can use if I want it to run remotely (as would be my strong preference). I'm going to use a version of this for now, but I'd like to also see if this can be done with openpyxl (this thread has some promise: stackoverflow.com/questions/15004838/…)
openpyxl does support modifying an existing file.
@user2385133: Yeah, I was trying it right now... but I don't know what it did (or I did): when I tried to adapt the code to openpyxl, it corrupted the new file (LibreOffice won't open it)... Boh? I'll keep trying, though
Hmmm..okay. Would it be possible then, based on your understanding, to create a new workbook entirely based on portions of the old workbook? So let's say I have two tabs in an existing workbook, "Keep" and "Change." I want to create a new workbook that has "Keep" exactly as it was in the old workbook, and then "Change" which is completely distinct (and places in the data from the text file). Based on my project needs, a solution like that would also work.
@user3543052, with openpyxl or the "old" xlrd/xlwt? (I'm hoping for the latter, because I don't seem to be able to produce a working .xlsx Excel sheet with openpyxl... at least, nothing LibreOffice likes :-D ) If it's the oldies (xlwt), sure, you can do that. (By tabs you mean sheets, right? Not columns? Not that it matters that much, just checking... xlrd and xlwt are powerful!! :-) )
0

That's a job for VBA, but if I had to do it in Python I would do something like this:

import Excel
xl = Excel.ExcelApp(False)
wb = xl.app.Workbooks("MyWorkBook.xlsx")
wb.Sheets("Ass'y").Cells(1, 1).Value2 = "something"
wb.Save()

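With a helper Excel.py class like this:

import win32com.client

class ExcelApp(object):
    def __init__(self, createNewInstance, visible = False):
        self._createNewInstance = createNewInstance
        self.app = None

        if createNewInstance:
            # Start a new Excel instance
            self.app = win32com.client.Dispatch('Excel.Application')
            if visible:
                self.app.Visible = True
        else:
            # Attach to an Excel instance that is already running
            self.app = win32com.client.GetActiveObject("Excel.Application")

    def __enter__(self):
        # Allows usage as "with ExcelApp(True) as xl:"
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Only close Excel if this object started it
        if self._createNewInstance:
            self.quit()

    def __del__(self):
        if self._createNewInstance:
            self.quit()

    def quit(self):
        # Drop the reference afterwards so Excel is not closed twice
        if self.app:
            self.app.Quit()
            self.app = None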

1 Comment

This approach requires Excel for Windows. A lot of folks work with the files but not with Excel and not on Windows.
0

You should use the csv module in the standard library to read the file.

In openpyxl you can have something like this:

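import csv
from openpyxl import load_workbook

sheetname = 'Worksheet'  # sheet name taken from the question
wb = load_workbook('workbook.xlsx')
ws = wb[sheetname]
with open('file.txt', newline='') as f:
    csvfile = csv.reader(f, delimiter='\t')  # yields each line as a list of fields
    # Rows and columns are 1-based in openpyxl 2.x and later
    for idx, line in enumerate(csvfile, start=1):
        ws.cell(row=idx, column=1).value = line[0]
        ws.cell(row=idx, column=2).value = line[1]
wb.save("changed.xlsx")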

2 Comments

So I like the simplicity of this but I'm running into some trouble. Here is what I am doing: 1) Convert my file to csv and save it as "csvfile.txt" 2) set csvfile='csvfile.txt' 3) Run this code. Am I doing something wrong?
@user3543052 without seeing your code and a test file I have no idea what is going right or wrong. What are you converting to CSV?
