How To Convert Microsoft Word 2010 Documents to HTML


Converting Word documents to HTML web pages can be useful for publishing content online or making it easier to edit in a web editor. While Microsoft Word 2010 has a built-in feature to save documents as HTML, the markup it produces is padded with Word-specific code that is hard to style and maintain later.

In this tutorial, we’ll cover how to convert Word docs to HTML using Word itself, as well as better alternative options for cleaner conversions.

Overview of Converting Word to HTML

Here’s a quick overview before we dive into the step-by-step instructions:

  • Word’s Save as HTML – Word has a Save as type option for HTML that converts documents. However, the resulting HTML contains Word-specific code and extra styling that can cause problems later on.
  • Alternatives – Better options exist for publishing web pages long-term, like using a dedicated web editor or converter tool to produce cleaner HTML markup.
  • Manual Copy-Paste – You can copy-paste content from Word into an HTML editor, then manually style it with CSS. More work but gives you full control.

Now let’s look at how to actually convert Word documents to HTML.

Using Word’s Save As HTML

Here is how to save a Word document as an HTML web page:

  1. Open the Word document you want to convert
  2. Click File > Save As
  3. Choose the location to save your HTML file
  4. For the Save as type option, select Web Page (*.htm; *.html)
  5. Click Save

This will convert and save your Word document as an HTML web page.

However, the HTML Word produces leaves much to be desired…

Limitations of Word’s HTML

While handy for quick conversions, using Word’s Save as HTML has some notable downsides:

  • Strange CSS styles that can override other styles on a web page
  • Non-standard, namespaced tags such as <o:p> and tags starting with “w:” that browsers do not recognize
  • Extra XML blocks that inflate file size and slow page loading
  • Issues displaying properly on mobile devices

Additionally, Word’s converter gives you very little control over the resulting HTML.

For these reasons, it’s best to avoid using Word’s HTML option if you plan to publish web pages long-term. The extra code and styles it adds can cause problems down the road.
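
To see how much of this code a given export contains, you can count the usual markers. The following is a minimal Python sketch, not a complete checker: the pattern list and the function name count_word_markers are our own illustrative choices, based on markup commonly seen in Word exports (namespaced tags, mso- style properties, Mso class names, and conditional comments).

```python
import re

# Patterns commonly seen in Word-exported HTML (illustrative, not exhaustive).
WORD_MARKERS = {
    "namespaced tags": re.compile(r"</?[ovw]:[A-Za-z]+", re.IGNORECASE),
    "mso- styles": re.compile(r"mso-[a-z-]+\s*:", re.IGNORECASE),
    "Mso classes": re.compile(r"class\s*=\s*[\"']?Mso", re.IGNORECASE),
    "conditional comments": re.compile(r"<!--\[if\s", re.IGNORECASE),
}


def count_word_markers(html: str) -> dict:
    """Return how many times each Word-specific pattern occurs in html."""
    return {name: len(pattern.findall(html)) for name, pattern in WORD_MARKERS.items()}


# A small hand-written sample in the style of a Word export.
sample = (
    "<p class=MsoNormal style='mso-margin-top-alt:auto'>Hello<o:p></o:p></p>"
    "<!--[if gte mso 9]><xml><w:WordDocument></w:WordDocument></xml><![endif]-->"
)
print(count_word_markers(sample))
```

A page that reports zero for every marker is not guaranteed to be clean, but high counts are a strong hint that the file came straight out of Word.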

Alternative Options for Converting Word to HTML

Here are better options for converting Word to clean, publish-ready HTML:

Use a Web Editor

Dedicated web editors like Dreamweaver or Visual Studio Code let you work on the markup directly. You can copy-paste content from Word, then use the editor’s tools to style and format the HTML.

This gives you fine-grained control over the HTML markup without any extra Word code getting added.

Convert with Notepad++

Notepad++ is a free text editor for code and markup. To convert a Word doc:

  1. Copy-paste the text into Notepad++
  2. Manually format it with HTML tags
  3. Clean up any formatting issues caused by pasting
  4. Add a .html file extension when saving

While more work than an automated converter, this results in clean and valid HTML free of Word-specific code.

Save as Filtered HTML in Word

Word’s Save as type list also includes Web Page, Filtered (*.htm; *.html), which produces cleaner HTML than the standard Web Page format. It strips out some Word-specific tags and code.

It is still less customizable than a dedicated web editor or converter tool, but filtered HTML is the better choice if you must convert docs within Word itself.

Use VBA Macros to Clean Up HTML

For batch converting multiple Word documents, you can automate the process with Visual Basic for Applications (VBA) macros.

Write a macro to:

  1. Open each Word doc
  2. Save as HTML
  3. Strip out unwanted markup
  4. Save the cleaned up HTML

This takes more technical know-how but handles large volumes of Word to HTML conversions while removing unwanted code.
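
Steps 3 and 4 do not have to happen inside Word. As a rough illustration, here is a Python sketch (not VBA) that cleans files Word has already saved as HTML. The function names, the regular expressions, and the windows-1252 default encoding are assumptions for this example. Regex cleanup is best-effort, and this version removes every inline style attribute, which may be more aggressive than you want.

```python
import re
from pathlib import Path


def strip_word_markup(html: str) -> str:
    """Remove common Word-specific constructs from exported HTML (best effort)."""
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    # Namespaced tags such as <o:p> and </o:p>; text between them is kept
    html = re.sub(r"</?[ovw]:[^>]*>", "", html, flags=re.IGNORECASE)
    # class attributes whose value starts with Mso (quoted or unquoted)
    html = re.sub(r"""\s+class\s*=\s*("Mso[^"]*"|'Mso[^']*'|Mso[^\s>]*)""", "", html,
                  flags=re.IGNORECASE)
    # Quoted inline style attributes (all of them, not only mso- ones)
    html = re.sub(r"""\s+style\s*=\s*("[^"]*"|'[^']*')""", "", html,
                  flags=re.IGNORECASE)
    return html


def clean_file(src: Path, dst: Path, encoding: str = "windows-1252") -> None:
    """Read one exported file, strip Word markup, and write the result to dst."""
    dst.write_text(strip_word_markup(src.read_text(encoding=encoding)), encoding=encoding)


raw = (
    "<p class=MsoNormal style='mso-margin-top-alt:auto'>Hello<o:p></o:p></p>"
    "<!--[if gte mso 9]><xml><w:WordDocument/></xml><![endif]-->"
)
print(strip_word_markup(raw))  # <p>Hello</p>
```

To batch process a folder, you could loop over Path(folder).glob("*.htm") and call clean_file on each result. Check the output of a few files by hand before trusting it on a large set.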

Convert with Open XML SDK

The Open XML SDK is a .NET library from Microsoft for reading and writing .docx Word files programmatically.

Developers can use the SDK to extract document contents and save them as HTML. This allows full control over the output HTML.

However, it requires writing code and more advanced programming skills.
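
To give a feel for what programmatic extraction involves, here is a small Python sketch. It does not use the Open XML SDK. Instead it relies on the fact that a .docx file is a ZIP archive whose main text lives in word/document.xml, and reads that part with the standard library. The function name docx_to_html is hypothetical, and the sketch only emits plain paragraphs: headings, lists, tables, images, and character formatting are all ignored.

```python
import zipfile
import xml.etree.ElementTree as ET
from html import escape

# Namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_html(path) -> str:
    """Return one <p> element per non-empty paragraph found in a .docx file."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    parts = []
    for para in root.iter(W + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        text = "".join(node.text or "" for node in para.iter(W + "t"))
        if text.strip():
            parts.append("<p>" + escape(text) + "</p>")
    return "\n".join(parts)


# Example call (the file name is a placeholder):
# print(docx_to_html("report.docx"))
```

Because you build every tag yourself, the output contains no Word-specific code at all. The trade-off is that you have to map each Word feature you care about to HTML by hand.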

Manually Convert Word to HTML

If going the manual route, here is one way to convert Word to HTML by copy-pasting:

  1. Copy the text content from a Word document
  2. Paste it into a plain text editor like Notepad or TextEdit
  3. Wrap headings in heading tags such as <h1>
  4. Wrap paragraphs in <p> tags
  5. Add other text formatting with HTML
  6. Clean up any weird formatting caused by pasting from Word
  7. Save the file with a .html extension

Although more labor intensive, copy-pasting content piece-by-piece gives you complete control over the structure and semantics of the resulting HTML.
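
If the document is mostly plain paragraphs, the repetitive part of the steps above can be scripted. The Python sketch below assumes the pasted text uses blank lines between paragraphs and that you supply the page title yourself. The function name text_to_html is our own. It only produces one <h1> and a series of <p> elements, so any further structure still has to be added by hand.

```python
import re
from html import escape


def text_to_html(text: str, title: str) -> str:
    """Wrap blank-line-separated paragraphs in <p> tags inside a minimal page."""
    # Normalize Windows line endings, then split on blank lines.
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip())
    body = ["<h1>" + escape(title) + "</h1>"]
    for block in blocks:
        if block.strip():
            # Collapse line wraps and repeated spaces inside a paragraph.
            body.append("<p>" + escape(" ".join(block.split())) + "</p>")
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n"
        "<title>" + escape(title) + "</title>\n</head>\n<body>\n"
        + "\n".join(body)
        + "\n</body>\n</html>\n"
    )


pasted = "First  line\nwraps here.\n\nFish & chips"
print(text_to_html(pasted, "Menu"))
```

Escaping the text with html.escape matters here: characters such as & and < in the pasted content would otherwise produce invalid markup.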

Conclusion

Converting Word documents to HTML is easy with Word’s built-in exporter, but the result is problematic. While handy, the exporter produces bloated HTML that can cause issues once the page is published online.

For the best quality and cleanest conversions, use a dedicated web editor or converter tool. Or go the manual route of copy-pasting content into HTML tags yourself.

The goal is HTML that is:

  • Clean and valid markup without Word-specific code
  • Easily customizable with CSS styling
  • Lean code for fast page loads
  • Works on all devices, including phones and tablets

Taking the extra time up front to convert Word to publish-ready HTML pays off in the long run compared to quick and dirty exports using Word’s flawed Save as HTML feature.
