MS Word documents are one of the most popular formats for reporting. They present information with styles and formatting exactly as it should look on paper. MS Word is often not installed on the server or computer, yet a developer still needs to process these reports inside a C#/VB.NET/ASP.NET project. The best way is to use a professional .NET library that includes various Word API functions. One such library is offered by Elerium Software.
Elerium Word .NET Reader provides an easy way to read the data and formatting of Word documents. Here are the basic steps for getting the text of a document.
First of all, a developer must install Elerium Word .NET Reader into the project and import its namespace:
using Docs.Word;
After that, the developer can easily read data from the Word document.
C# Example:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Docs.Word;
namespace OpenDocument
{
    class Program
    {
        static void Main(string[] args)
        {
            // Creates an instance of the Document class
            Document Doc = new Document();
            // Reads a .doc file into the internal document structure
            Doc.ReadDoc(@"..\..\Data\DocFile.doc");
            // Gets the text of the 1st paragraph of the 1st section of the document
            string Text = ((Paragraph)Doc.Sections[0].Nodes[0]).Text;
            // Writes the retrieved text to the console
            Console.WriteLine(Text);
            // Waits for a key press so the console window stays open
            Console.ReadKey();
        }
    }
}
VB.NET Example:
Imports Docs.Word
Module Module1
    Sub Main()
        ' Creates an instance of the Document class
        Dim Doc As New Document()
        ' Reads a .doc file into the internal document structure
        Doc.ReadDoc("..\..\Data\DocFile.doc")
        ' Gets the text of the 1st paragraph of the 1st section of the document
        Dim Text As String = DirectCast(Doc.Sections(0).Nodes(0), Paragraph).Text
        ' Writes the retrieved text to the console
        Console.WriteLine(Text)
        ' Waits for a key press so the console window stays open
        Console.ReadKey()
    End Sub
End Module
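The cast in the samples above takes for granted that the first node of the first section is a paragraph. The following C# variant is a minimal sketch that checks the node type before reading its text. It uses only the members already shown (Document, ReadDoc, Sections, Nodes, Paragraph, Text). The namespace name OpenDocumentSafely is made up for illustration, the idea that Nodes can hold items other than paragraphs is an assumption, and the pattern-matching syntax requires C# 7 or later.

```csharp
using System;
using Docs.Word;

namespace OpenDocumentSafely
{
    class Program
    {
        static void Main(string[] args)
        {
            // Same calls as in the sample above
            Document doc = new Document();
            doc.ReadDoc(@"..\..\Data\DocFile.doc");

            // Assumption: Nodes may contain items that are not paragraphs,
            // so the type is tested before the Text property is read
            if (doc.Sections[0].Nodes[0] is Paragraph paragraph)
            {
                Console.WriteLine(paragraph.Text);
            }
            else
            {
                Console.WriteLine("The first node is not a paragraph.");
            }
        }
    }
}
```

With this check, a document that starts with something other than a paragraph produces a message instead of an InvalidCastException.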
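The next sample demonstrates reading the formatting of a text run: font name, font size, text color, and font style (bold, italic, underlined, strike-out). Other formatting, such as background color and footnotes, can be read as well.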
C# Example:
using System;
using System.Collections.Generic;
using System.Text;
using System.Windows.Forms;
using Docs.Word;
namespace TextRun_Styles
{
    // The form is expected to contain a multiline TextBox named textBox1,
    // declared in the designer part of this partial class
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }
        private void Form1_Load(object sender, EventArgs e)
        {
            // Creates a new instance of the Document class and reads a .doc file into this structure
            Document Doc = new Document();
            Doc.ReadDoc(@"..\..\Data\WordTextFormatting.doc");
            // Gets the first two text runs, in this example - two sentences
            for (int i = 0; i < 2; i++)
            {
                // Gets the text run
                TextRun tTextRun = ((Paragraph)Doc.Sections[0].Nodes[0]).TextRuns[i];
                // Writes its properties
                textBox1.Text += "=== Text run " + (i + 1) + " ===" + "\r\n";
                textBox1.Text += "Text : " + tTextRun.Text + "\r\n";
                textBox1.Text += "Font name : " + tTextRun.Style.FontName + "\r\n";
                textBox1.Text += "Font size (in half-points) : " + tTextRun.Style.FontSize + "\r\n";
                textBox1.Text += "Text color : " + tTextRun.Style.TextColor + "\r\n";
                textBox1.Text += "Bold : " + tTextRun.Style.FontStyle.Bold + "\r\n";
                textBox1.Text += "Italic : " + tTextRun.Style.FontStyle.Italic + "\r\n";
                textBox1.Text += "Underlined : " + tTextRun.Style.FontStyle.Underlined + "\r\n";
                textBox1.Text += "Strike-out : " + tTextRun.Style.FontStyle.StrikeOut + "\r\n\r\n";
            }
        }
    }
}
VB.NET Example:
Imports Docs.Word
Public Class Form1
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        ' Creates a new instance of the Document class and reads a .doc file into this structure
        Dim Doc As New Document()
        Doc.ReadDoc("..\..\Data\WordTextFormatting.doc")
        ' Gets the first two text runs, in this example - two sentences
        For i As Integer = 0 To 1
            ' Gets the text run
            Dim tTextRun As TextRun = DirectCast(Doc.Sections(0).Nodes(0), Paragraph).TextRuns(i)
            ' Writes its properties
            textBox1.Text += "=== Text run " & (i + 1).ToString & " ===" & vbCr & vbLf
            textBox1.Text += "Text" & vbTab & vbTab & vbTab & ": " & tTextRun.Text & vbCr & vbLf
            textBox1.Text += "Font name" & vbTab & vbTab & ": " & tTextRun.Style.FontName & vbCr & vbLf
            textBox1.Text += "Font size" & vbTab & "(in half-points)" & vbTab & ": " & tTextRun.Style.FontSize.ToString & vbCr & vbLf
            textBox1.Text += "Text color" & vbTab & vbTab & vbTab & ": " & tTextRun.Style.TextColor.ToString & vbCr & vbLf
            textBox1.Text += "Bold" & vbTab & vbTab & vbTab & ": " & tTextRun.Style.FontStyle.Bold.ToString & vbCr & vbLf
            textBox1.Text += "Italic" & vbTab & vbTab & vbTab & ": " & tTextRun.Style.FontStyle.Italic.ToString & vbCr & vbLf
            textBox1.Text += "Underlined" & vbTab & vbTab & ": " & tTextRun.Style.FontStyle.Underlined.ToString & vbCr & vbLf
            textBox1.Text += "Strike-out" & vbTab & vbTab & vbTab & ": " & tTextRun.Style.FontStyle.StrikeOut.ToString & vbCr & vbLf & vbCr & vbLf
        Next
    End Sub
End Class
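Both samples label the font size as a value in half-points, so a 12 pt font is reported as 24. To show the size in ordinary points, the value has to be divided by two. The helper below is a small sketch of that conversion. FontSizeUnits and HalfPointsToPoints are hypothetical names that are not part of the Elerium library, and the sketch assumes the FontSize value can be passed as a number.

```csharp
using System;

static class FontSizeUnits
{
    // Hypothetical helper (not part of the library):
    // converts a size given in half-points to typographic points
    public static double HalfPointsToPoints(double halfPoints)
    {
        return halfPoints / 2.0;
    }

    static void Main()
    {
        // 24 half-points correspond to a 12 pt font
        Console.WriteLine(HalfPointsToPoints(24));
        // 21 half-points correspond to a 10.5 pt font
        Console.WriteLine(HalfPointsToPoints(21));
    }
}
```

In the formatting sample, the result of such a helper could be appended to the text box instead of the raw FontSize value.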
About Elerium Software
Elerium Software develops professional solutions for .NET projects (C#, VB.NET, ASP.NET) that read, write, and convert various office and web documents and formats. Elerium Software components are built on a unique design and fast algorithms, which makes them independent of third-party applications and libraries.