Word, WordPad and RTF
In yet another stunning victory for Microsoft's cross-compatibility, their RTF system is incompatible between their own products. Let me begin by explaining that I have a Delphi app that uses a mail merge system to merge database text into a document. Naturally, you can do this a lot of different ways but, for my particular instance, I need to merge from a non-ODBC compliant database and then automatically E-mail the resulting document to the correct person. This is a management tool we use fairly heavily.
In the past, I've always just used raw HTML formatting because it's handy and relatively standard. The particular request I've been working on is to format the resulting merge so that it can be printed in a "one summary page per item" format. HTML has the ability to do this with a style sheet as such:
and use it via:
Thanks to HTML Goodies for the information.
Now, the downside to this is that HTML print formatting appears to be considered optional by... well... everyone. I initially sent the HTML to my Gmail account. It appears fine but without the page breaks. I would expect this considering Gmail has it's own HTML enclosure. I would not expect the PRINT to ignore all of the HTML formatting, but it does. I suppose this makes some sense in light of the header information they put at the top of the output. The process of placing the header must remove the rest of the print formatting. Keep in mind this is the equivalent of an HTML E-mail, not an HTML attachment. When I converted over to an HTML attachment, Gmail dutifully opens the attachment and FF3 prints the document out just fine.
On to Outlook. One would think that Outlook should honor HTML formatting since, unlike Gmail, it doesn't actually use any HTML for display or printing. This is not so. HTML prints from Outlook without the page breaks from the normal reader pane. Thinking I could force the issue, I dbl-clicked the message in Outlook (which of course puts it in Word) and voila - no difference. This is turning into a major hassle. Additionally, because Outlook is so clever, it won't really let you send an HTML attachment. Instead, it displays it in-line with the same problems printing as with an HTML E-mail.
On to the solution...
I had to select some format other than HTML but it must be 7-bit text-based because of the way my Delphi merge app works. That pretty much leaves me with PDF or RTF as easy standards that everyone can open. I settled on RTF because it's easy to create an RTF file in OpenOffice, save it out and hack it apart with TSE Pro. My merge codes are just enclosed in << >> (ala MS Word style) so a name would be
This is not to say that Word doesn't open the document. It does, it just ignores the page breaks. I tried the same RTF file in Word 07 and Word 03 with the same results. Even more annoying, if I open the file in WordPad and then save it back out as RTF, Word still won't have the page breaks. After reading the RTF documentation, I was left with the impression that \page should always generate a new page. It turns out that Word won't take just a \page like the documentation, WordPad and OpenOffice. It HAS to be "\par \page \par " and please don't forget the ending space. In Word, this works fine. In WordPad and OO, it generates a leading blank line. In my case, I can call this close enough sine the summary pages are so short anyway. If I was pressed for space though, it would be an issue that would probably force me to PDF.
In another quirky little thing, when I created my file in OO and saved it out as RTF, my file size was 38k. If I open the file in WordPad and save it out, it shrinks to 13k. If I open the original in Word 07 and save it out, it grows to 87k. How can the same RTF encoding size vary that much from the same company? Even more annoying, why aren't WordPad and Word interoperable using RTF?
In the past, I've always just used raw HTML formatting because it's handy and relatively standard. The particular request I've been working on is to format the resulting merge so that it can be printed in a "one summary page per item" format. HTML has the ability to do this with a style sheet as such:
<STYLE TYPE="text/css">
P.breakhere {page-break-before: always}
</STYLE>
and use it via:
<p class="breakhere">
Thanks to HTML Goodies for the information.
Now, the downside to this is that HTML print formatting appears to be considered optional by... well... everyone. I initially sent the HTML to my Gmail account. It appears fine but without the page breaks. I would expect this considering Gmail has it's own HTML enclosure. I would not expect the PRINT to ignore all of the HTML formatting, but it does. I suppose this makes some sense in light of the header information they put at the top of the output. The process of placing the header must remove the rest of the print formatting. Keep in mind this is the equivalent of an HTML E-mail, not an HTML attachment. When I converted over to an HTML attachment, Gmail dutifully opens the attachment and FF3 prints the document out just fine.
On to Outlook. One would think that Outlook should honor HTML formatting since, unlike Gmail, it doesn't actually use any HTML for display or printing. This is not so. HTML prints from Outlook without the page breaks from the normal reader pane. Thinking I could force the issue, I dbl-clicked the message in Outlook (which of course puts it in Word) and voila - no difference. This is turning into a major hassle. Additionally, because Outlook is so clever, it won't really let you send an HTML attachment. Instead, it displays it in-line with the same problems printing as with an HTML E-mail.
On to the solution...
I had to select some format other than HTML but it must be 7-bit text-based because of the way my Delphi merge app works. That pretty much leaves me with PDF or RTF as easy standards that everyone can open. I settled on RTF because it's easy to create an RTF file in OpenOffice, save it out and hack it apart with TSE Pro. My merge codes are just enclosed in << >> (ala MS Word style) so a name would be
. Anyway, this all works, although I don't really recommend reading RTF as a form of entertainment. Many trial and error attempts later I had a document that works fine in WordPad and OpenOffice. It just won't work in Word.
<<name>>
This is not to say that Word doesn't open the document. It does, it just ignores the page breaks. I tried the same RTF file in Word 07 and Word 03 with the same results. Even more annoying, if I open the file in WordPad and then save it back out as RTF, Word still won't have the page breaks. After reading the RTF documentation, I was left with the impression that \page should always generate a new page. It turns out that Word won't take just a \page like the documentation, WordPad and OpenOffice. It HAS to be "\par \page \par " and please don't forget the ending space. In Word, this works fine. In WordPad and OO, it generates a leading blank line. In my case, I can call this close enough sine the summary pages are so short anyway. If I was pressed for space though, it would be an issue that would probably force me to PDF.
In another quirky little thing, when I created my file in OO and saved it out as RTF, my file size was 38k. If I open the file in WordPad and save it out, it shrinks to 13k. If I open the original in Word 07 and save it out, it grows to 87k. How can the same RTF encoding size vary that much from the same company? Even more annoying, why aren't WordPad and Word interoperable using RTF?