2

How can I replace a string/word in a Word document via ASP.NET? I just need to replace a couple of words in the document, so I would like to stay AWAY from third-party plugins and interop. I would like to do this by opening the file and replacing the text directly.

The following attempts were made:

I created a StreamReader and a StreamWriter to read and rewrite the file, but I think I am reading and writing in the wrong format. I think Word documents are stored in a binary format. If they are, how would I read and write the file as binary?

    Dim template As String = Request.MapPath("documentName.doc")
    If File.Exists(template) Then
        Dim sr As New StreamReader(template)
        Dim content As String = sr.ReadToEnd()
        sr.Close()
        Dim sw As New StreamWriter(template)        
        content = content.Replace("@ T O D A Y S D A T E", Date.Now.ToString("MM/dd/yyyy"))
        sw.Write(content)
        sw.Close()
    Else
1
  • Did you get "@ T O D A Y S D A T E" from a hex dump? If so, lose the extra spaces. Commented Mar 16, 2010 at 22:32

4 Answers

2

The Word binary format is proprietary to Microsoft. The specification is complex, and it will take you ages to learn the document structure and the internal bit and byte layout. I really don't think you will save yourself any time going down this path, so consider the options below:

  • Use Open XML
  • Automate Word
  • Use a third-party library like Aspose
  • Use RTF rather than DOC. You can then look for the specific RTF run that contains your text and replace it with another block of RTF text. This is probably the simplest option for what you want to do, if RTF is an acceptable format.
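
The RTF option can be sketched roughly as follows. This is a minimal illustration, not a tested production routine: `FillRtf`, `EscapeRtf`, and the file paths are hypothetical names, and it assumes the placeholder was typed in one go so that Word stored it as one contiguous piece of text (Word sometimes splits text across formatting runs, in which case a plain search will not find it). Values containing non-ASCII characters would additionally need RTF `\u` escapes, which this sketch does not handle.

```vbnet
Imports System.IO
Imports System.Text

Module RtfTemplate
    ' Replaces a plain-text placeholder in an RTF template and writes the
    ' result to a separate output file, leaving the template untouched.
    Sub FillRtf(templatePath As String, outputPath As String,
                placeholder As String, value As String)
        ' ISO-8859-1 maps every byte to exactly one character and back,
        ' so bytes we do not touch are written out unchanged.
        Dim latin1 As Encoding = Encoding.GetEncoding(28591)
        Dim rtf As String = File.ReadAllText(templatePath, latin1)
        rtf = rtf.Replace(placeholder, EscapeRtf(value))
        File.WriteAllText(outputPath, rtf, latin1)
    End Sub

    ' Escapes the three characters that have special meaning in RTF: \ { }
    Function EscapeRtf(text As String) As String
        Return text.Replace("\", "\\").Replace("{", "\{").Replace("}", "\}")
    End Function
End Module
```

A call might look like `FillRtf(Request.MapPath("documentName.rtf"), outputPath, "@TODAYSDATE", Date.Now.ToString("MM/dd/yyyy"))`, where `outputPath` is wherever the filled-in copy should go. Writing to a copy rather than overwriting the template keeps the placeholder available for the next request.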

From personal experience, automating Word isn't as bad as it sounds. It is really not suitable for a high-volume server environment, but for smaller loads it works well, provided you write your code carefully to manage the application object and handle exceptions.
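
For illustration, here is a rough sketch of what "managing the application object" can look like with late binding. It assumes Word is installed on the machine and that the process is allowed to start it; `ReplaceViaWord` is a hypothetical name, and the numeric constants are the values of `wdAlertsNone` (0) and `wdReplaceAll` (2) from the Word object model. It only searches the main document body, not headers or footers.

```vbnet
Option Strict Off ' late binding, so no interop assembly reference is needed

Imports System.Runtime.InteropServices

Module WordAutomation
    Sub ReplaceViaWord(docPath As String, placeholder As String, value As String)
        Dim word As Object = Nothing
        Dim doc As Object = Nothing
        Try
            word = CreateObject("Word.Application")
            word.Visible = False
            word.DisplayAlerts = 0 ' wdAlertsNone: suppress prompts where possible

            doc = word.Documents.Open(docPath)
            ' Replace:=2 is wdReplaceAll
            doc.Content.Find.Execute(FindText:=placeholder, ReplaceWith:=value, Replace:=2)
            doc.Save()
        Finally
            ' Always close the document and quit Word, even after an exception,
            ' otherwise orphaned WINWORD.EXE processes pile up on the server.
            If doc IsNot Nothing Then
                doc.Close(False)
                Marshal.ReleaseComObject(doc)
            End If
            If word IsNot Nothing Then
                word.Quit()
                Marshal.ReleaseComObject(word)
            End If
        End Try
    End Sub
End Module
```

Suppressing alerts reduces the chance of an invisible dialog blocking the request, but it does not remove it entirely, so the caveats about server use still apply.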

EDITED: Corrected my initial comment about an NDA. That was the case when I worked on this back in 2005/6, and I didn't realize Microsoft had decided to publish the specification in recent years.

1

Lots of choices:

  1. Some of them expensive (Aspose)
  2. Some of them hard (binary formats)
  3. Some of them require Interop (VSTO) or newer formats (Open XML)
  4. Some of them not mentioned yet, like
    1. running Word on the server and just writing to that (not recommended by MSFT, but probably your only real choice for a) cheap, b) simple)
    2. OfficeWriter.

1 Comment

I did mention point 4.1, with 'DO NOT' in front of it. I agree it may be the last resort, but add a restarting schedule for the server too.
0

If word documents are binary, how would I read and write the file in binary?

They are, and that's why you should use a third-party library to program against them.
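
To see why the StreamReader/StreamWriter attempt damages the file, consider this small sketch. It takes the eight-byte signature that legacy .doc files (OLE compound files) begin with, decodes it as UTF-8 text the way `StreamReader` does by default, and encodes it back. The module name is made up for the example.

```vbnet
Imports System
Imports System.Text

Module BinaryRoundTrip
    Sub Main()
        ' Signature at the start of an OLE compound file such as a legacy .doc
        Dim header As Byte() = {&HD0, &HCF, &H11, &HE0, &HA1, &HB1, &H1A, &HE1}

        ' Decode as text and encode again, as a StreamReader/StreamWriter pair would.
        ' Byte sequences that are not valid UTF-8 are replaced during decoding.
        Dim asText As String = Encoding.UTF8.GetString(header)
        Dim roundTripped As Byte() = Encoding.UTF8.GetBytes(asText)

        Console.WriteLine("original:      " & BitConverter.ToString(header))
        Console.WriteLine("after round trip: " & BitConverter.ToString(roundTripped))
    End Sub
End Module
```

The two printed lines differ, which means the file is already corrupted before any replacement happens. Reading and writing raw bytes with `File.ReadAllBytes` and `File.WriteAllBytes` avoids that particular problem, but it still leaves you working against the undocumented-feeling internals of the .doc structure.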

I would like to stay AWAY from 3rd party plugins & interop

This requirement makes the task extremely hard. If your documents are in the "old Word format" (.doc), I would almost say that you are out of luck. If you can use Word 2007 documents (.docx) instead, you should be able to solve the problem by unzipping the file (it's essentially a ZIP archive), doing a search/replace in the contained XML files, and zipping the document up again.
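
As a rough sketch of the .docx route: the example below assumes .NET Framework 4.5 or later (for `System.IO.Compression.ZipFile`, which needs a reference to System.IO.Compression.FileSystem) and that the placeholder sits in the main body part, `word/document.xml`. `ReplaceInDocx` is a hypothetical name. It also assumes Word stored the placeholder as one contiguous string; if Word split it across several runs, a plain string search will miss it.

```vbnet
Imports System.IO
Imports System.IO.Compression
Imports System.Security
Imports System.Text

Module DocxTemplate
    Sub ReplaceInDocx(docxPath As String, placeholder As String, value As String)
        ' Open the .docx as a ZIP archive that can be modified in place.
        Using archive As ZipArchive = ZipFile.Open(docxPath, ZipArchiveMode.Update)
            Dim entry As ZipArchiveEntry = archive.GetEntry("word/document.xml")
            If entry Is Nothing Then
                Throw New InvalidDataException("word/document.xml not found")
            End If

            ' Read the XML of the main document part.
            Dim xml As String
            Using reader As New StreamReader(entry.Open(), Encoding.UTF8)
                xml = reader.ReadToEnd()
            End Using

            ' XML-escape the new value so characters like & and < stay valid.
            xml = xml.Replace(placeholder, SecurityElement.Escape(value))

            ' Truncate the entry and write the modified XML back (UTF-8, no BOM).
            Using stream As Stream = entry.Open()
                stream.SetLength(0)
                Using writer As New StreamWriter(stream, New UTF8Encoding(False))
                    writer.Write(xml)
                End Using
            End Using
        End Using
    End Sub
End Module
```

Text in headers and footers lives in separate parts of the archive (for example `word/header1.xml`), so those would need the same treatment if the placeholder appears there.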

See also: Generating a Word Document with C#

4 Comments

Thank you for your input. Unfortunately, I am working with Word XP documents and cannot upgrade them to go the XML route (the company can't upgrade all of its Office XP installations to a newer version). I know for a fact that this can be done; I created a program to do something similar back in the day using VB3.
@jReedInc, there exists a plugin to make Word/XP read and write DOCX.
@jreedinc: Of course it can be done, but it might just be very very hard. How much do you charge per hour? How much does Aspose Words cost? :)
$899 for Aspose, THEN learning the syntax and dealing with another animal (even more hours). Jørn Schou-Rode, have you done this before? I really do not think that it's hard at all. Like I said before, A LONG TIME AGO I wrote a simple script to edit a string in a doc. Maybe I should just dig through old hard drives...
0

You could perform Word automation on the server to do it easily, but that route is fraught with danger. Automation is not designed to run server-side, and you will find it regularly hangs when Word pops up a prompt or confirmation box waiting for input that nobody can see.

You have to make a trade-off: use Word automation and accept that it may hang pretty regularly (anything from daily to weekly), or buy a third-party solution. I use Aspose and it has solved a lot of problems.
