Converting RTF to HTML

Have you ever wanted to convert some RTF text into HTML? Probably not. But if you do, you are in luck! I recently needed to do this conversion and, after some searching, found a way to do it by enhancing a sample distributed in the MSDN library. The sample is called XAML to HTML Conversion Demo.

The sample contains code that converts HTML to and from a XAML FlowDocument. That doesn’t help by itself until you realize that RTF can easily be converted to XAML. The key is to use System.Windows.Controls.RichTextBox: a TextRange over its document can load RTF from a stream and save it back out as XAML. This conversion is shown below:

using System.IO;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Documents;

private static string ConvertRtfToXaml(string rtfText)
{
    // Nothing to convert, so skip creating the control altogether.
    if (string.IsNullOrEmpty(rtfText)) return "";

    var richTextBox = new RichTextBox();
    var textRange = new TextRange(richTextBox.Document.ContentStart, richTextBox.Document.ContentEnd);

    // Write the RTF into a stream and load it into the control's document.
    using (var rtfMemoryStream = new MemoryStream())
    using (var rtfStreamWriter = new StreamWriter(rtfMemoryStream))
    {
        rtfStreamWriter.Write(rtfText);
        rtfStreamWriter.Flush();
        rtfMemoryStream.Seek(0, SeekOrigin.Begin);
        textRange.Load(rtfMemoryStream, DataFormats.Rtf);
    }

    // Save the loaded document back out as XAML and return it as a string.
    using (var xamlMemoryStream = new MemoryStream())
    {
        // Select the whole document again now that it holds the loaded content.
        textRange = new TextRange(richTextBox.Document.ContentStart, richTextBox.Document.ContentEnd);
        textRange.Save(xamlMemoryStream, DataFormats.Xaml);
        xamlMemoryStream.Seek(0, SeekOrigin.Begin);
        using (var xamlStreamReader = new StreamReader(xamlMemoryStream))
        {
            return xamlStreamReader.ReadToEnd();
        }
    }
}

With this code we have all we need to convert RTF to HTML. I modified the sample to add this RTF to XAML conversion, and then I ran that XAML through the HTML converter, which produces the HTML text. I added an interface to these conversion utilities and converted the sample into a library so that I could use it from other projects. Here is the interface:

public interface IMarkupConverter
{
    string ConvertXamlToHtml(string xamlText);
    string ConvertHtmlToXaml(string htmlText);
    string ConvertRtfToHtml(string rtfText);
}

public class MarkupConverter : IMarkupConverter
{
    public string ConvertXamlToHtml(string xamlText)
    {
        return HtmlFromXamlConverter.ConvertXamlToHtml(xamlText, false);
    }
    public string ConvertHtmlToXaml(string htmlText)
    {
        return HtmlToXamlConverter.ConvertHtmlToXaml(htmlText, true);
    }
    public string ConvertRtfToHtml(string rtfText)
    {
        return RtfToHtmlConverter.ConvertRtfToHtml(rtfText);
    }
}

With this I am now able to convert from RTF to HTML. However, there is one catch: the conversion uses the WPF RichTextBox control, which requires a single-threaded apartment (STA). Any code that calls the ConvertRtfToHtml function must therefore also run in an STA. If your program can’t run in an STA, then you must create a new STA thread to run the conversion, like this:

using System.Threading;

MarkupConverter markupConverter = new MarkupConverter();

private string ConvertRtfToHtml(string rtfText)
{
   // Run the conversion on a dedicated STA thread and wait for it to finish.
   var thread = new Thread(ConvertRtfInSTAThread);
   var threadData = new ConvertRtfThreadData { RtfText = rtfText };
   thread.SetApartmentState(ApartmentState.STA);
   thread.Start(threadData);
   thread.Join();
   return threadData.HtmlText;
}

private void ConvertRtfInSTAThread(object rtf)
{
   // The argument is always the ConvertRtfThreadData passed to Start above.
   var threadData = (ConvertRtfThreadData)rtf;
   threadData.HtmlText = markupConverter.ConvertRtfToHtml(threadData.RtfText);
}

// Carries the input to the STA thread and the result back to the caller.
private class ConvertRtfThreadData
{
   public string RtfText { get; set; }
   public string HtmlText { get; set; }
}
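
If you need this in more than one place, the same idea can be wrapped in a small reusable helper. The following is only a sketch: StaRunner and its Run method are names I made up for illustration and are not part of the MSDN sample. It assumes Windows and .NET 4.5 or later (for ExceptionDispatchInfo). Unlike the code above, it rethrows any exception from the conversion on the calling thread instead of leaving it unhandled on the worker thread.

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Threading;

// Hypothetical helper (not part of the MSDN sample): runs a function on a
// dedicated STA thread and hands its result back to the caller.
public static class StaRunner
{
    public static T Run<T>(Func<T> func)
    {
        if (func == null) throw new ArgumentNullException(nameof(func));

        T result = default(T);
        ExceptionDispatchInfo error = null;

        var thread = new Thread(() =>
        {
            try
            {
                result = func();
            }
            catch (Exception ex)
            {
                // Capture the exception so it can be rethrown on the caller's
                // thread with its original stack trace.
                error = ExceptionDispatchInfo.Capture(ex);
            }
        });

        // The apartment state must be set before the thread is started.
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join();

        if (error != null) error.Throw();
        return result;
    }
}
```

With a helper like this, the call would look something like StaRunner.Run(() => markupConverter.ConvertRtfToHtml(rtfText)). It still blocks the calling thread until the conversion finishes, just like the Join above.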