Online Technical Support
An Overview of Converting Microsoft Word Documents to ASCII - Examples
For our example, assume the MS Word document looks like this.
In this example, we used the Pleading Wizard with line numbering enabled.
We will assume this is a typical court transcript style.
MS Word Example
You can open this MS Word sample file by clicking here.
This MS Word document is named WordSample1.doc. (It will open in a new window.)

Now, let's examine some of the important MS Word Save As types.
We see these when we select File, then Save As, from the Microsoft Word menu.
MS Word Example

When we use Save As "Text Only with Line Breaks (*.txt)"
This format converts soft line breaks correctly. Line numbers and page breaks are lost,
but we can fix that fairly easily when we open the converted text file in ProEDIT.
The text file is saved single-spaced, but that works to our benefit in ProEDIT.
MS Word Example

When we use Save As "MS-DOS Text with Line Breaks (*.txt)"
This format converts soft line breaks correctly. Line numbers and page breaks are lost,
but we can fix that fairly easily when we open the converted text file in ProEDIT.
The text file is saved single-spaced, but that works to our benefit in ProEDIT.
MS Word Example
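
Both text formats above drop line numbers and page breaks. As a rough illustration of what restoring them involves, the Python sketch below re-pages a single-spaced text file and numbers the lines on each page. This is not how ProEDIT does it; the function name paginate, the default of 25 lines per page, and the use of a form feed character as the page break are all assumptions made for illustration.

```python
def paginate(text, lines_per_page=25):
    """Split single-spaced text into pages and number the lines on each page.

    Assumptions (not from Word or ProEDIT): a fixed number of lines per page,
    a right-aligned two-column line number, and a form feed between pages.
    """
    lines = text.splitlines()
    pages = []
    for start in range(0, len(lines), lines_per_page):
        chunk = lines[start:start + lines_per_page]
        # Numbering restarts at 1 on every page, as on a transcript page.
        numbered = [f"{n:>2}  {line}".rstrip()
                    for n, line in enumerate(chunk, start=1)]
        pages.append("\n".join(numbered))
    # "\f" (form feed, 0x0C) marks the page break in the output.
    return "\f".join(pages)


if __name__ == "__main__":
    sample = "THE COURT: Good morning.\nMR. SMITH: Good morning.\nTHE COURT: Proceed."
    print(paginate(sample, lines_per_page=2).replace("\f", "\n----- page break -----\n"))
```

A real transcript would also need the original page length and any header or footer lines, which this sketch ignores.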

When we "Print to File" using the "Generic / Text Only" printer driver
We first had to select "Tools", "Options", "Compatibility", and check "Use printer metrics to lay out document".
  • If the entire Word document uses a single, fixed-pitch font (like Courier New):
    This is usually the BEST METHOD.
    Page breaks are preserved - this is good.
    Line numbers are normally preserved - this is good.
  • If the Word document contains mixed fonts:
    Page breaks are preserved - this is good.
    Line numbers can be rearranged as shown in this picture. This is hard to clean up.
MS Word Example
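
One way to tell which of the two outcomes above you got is to check the line numbers in the printed file. The Python sketch below assumes that the driver wrote a form feed character between pages and that every numbered line begins with its line number. Both are assumptions about the output, and misnumbered_pages is a hypothetical helper name, not part of Word or ProEDIT.

```python
import re


def misnumbered_pages(text):
    """Return 1-based page numbers whose leading line numbers are not 1, 2, 3, ...

    Assumes pages are separated by form feeds and that each numbered line
    starts with optional spaces or tabs followed by its line number.
    """
    bad = []
    for page_no, page in enumerate(text.split("\f"), start=1):
        # Collect the number found at the start of each line, in file order.
        found = [int(m.group(1))
                 for m in re.finditer(r"^[ \t]*(\d+)\b", page, re.MULTILINE)]
        if found and found != list(range(1, len(found) + 1)):
            bad.append(page_no)
    return bad


if __name__ == "__main__":
    clean = " 1  Q. State your name.\n 2  A. Jane Doe.\f 1  Q. Thank you."
    jumbled = " 1  Q. State your name.\n 3  A. Jane Doe.\n 2  Q. Thank you."
    print(misnumbered_pages(clean))
    print(misnumbered_pages(jumbled))
```

A page flagged by this check is a hint that mixed fonts shifted the numbers; it does not repair them.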

Below are some Save As types that DON'T convert to ASCII well from MS Word

When we use Save As "MS-DOS Text with Layout"
This format doesn't work well. Lines don't wrap as they do in the original document.
Line numbers and page breaks are lost. This is bad.
MS Word inserts two carriage returns after each line to simulate double spacing.
MS Word Example
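
The doubled carriage returns are the one part of this output that is simple to undo. The Python sketch below assumes that every line in the file, including blank ones, ends with exactly two line endings, and collapses each pair into one. It does nothing about the changed line wrapping or the lost line numbers and page breaks, and undo_double_spacing is a hypothetical name chosen for illustration.

```python
def undo_double_spacing(text):
    """Collapse each pair of line endings into a single line ending.

    Assumes the file is strictly double spaced: every line, blank or not,
    is followed by two line endings.
    """
    # Normalize CR LF and bare CR to LF so the pairs are easy to match.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # str.replace works left to right on non-overlapping matches, so a run
    # of four line endings (a blank line in the original) becomes two.
    return normalized.replace("\n\n", "\n")


if __name__ == "__main__":
    doubled = "Line one\r\n\r\nLine two\r\n\r\n\r\n\r\nLine four\r\n\r\n"
    print(undo_double_spacing(doubled))
```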

When we use Save As "Hypertext Markup Language (*.htm)"
This format doesn't work well. Line numbers and page breaks are lost.
Additionally, the HTML markup codes are a big mess. Don't use this format.
MS Word Example

When we use Save As "Rich Text Format (*.rtf)"
This format doesn't work well. All soft returns are completely lost.
Paragraphs can become single lines that contain hundreds of characters. What a mess.
Page breaks and line numbers are lost. It is impossible to clean up this ASCII file.
MS Word Example
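
A quick way to spot a file that has lost its soft returns is to look for very long lines. The Python sketch below lists every line wider than a chosen limit. The 80-character limit and the name overlong_lines are assumptions for illustration; the check only reports the problem and cannot restore the original line breaks.

```python
def overlong_lines(text, max_width=80):
    """Return (line_number, length) for every line longer than max_width."""
    return [(n, len(line))
            for n, line in enumerate(text.splitlines(), start=1)
            if len(line) > max_width]


if __name__ == "__main__":
    sample = "A short line.\n" + "word " * 60 + "\nAnother short line."
    for number, length in overlong_lines(sample):
        print(f"line {number} is {length} characters long")
```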