Chris De Herrera's Windows CE Website

By Chris De Herrera. Copyright 1998-2007. All Rights Reserved.
A member of the Talksites Family of Websites.

Windows and Windows CE are trademarks of Microsoft Corporation and are used under license from owner. CEWindows.NET is not associated with Microsoft Corporation.

All Trademarks are owned by their respective companies.

A Quick Guide to Making Microsoft Reader Books
By Doug Clapp, 6/2000
This document is placed in the public domain by its author. Do whatever you want with it.

First, Thanks

I’d first like to thank The Amazing Chris De Herrera for asking me to write this brief spiel about making Microsoft Reader books. And I’d like to thank Chris for hosting my getting-to-be-sizeable collection of Reader conversions.

When thinking about "where to post?" my first and only thought was www.pocketpcfaq.com. I think it’s the single best site about Windows CE and Pocket PCs – and the amount of work and love that obviously went into Chris’s always-evolving site is breathtaking.

He’s a deserved "Microsoft MVP" and a great guy. Thanks again, Chris…

The Big Idea

Despite everything that I say later on, always remember this:

It’s fun and easy to make Microsoft Reader books using ReaderWorks software. Download FREE ReaderWorks Standard

If you’d like, you can stop reading now, go to the ReaderWorks website, download the free "Personal" ReaderWorks software, and have at it. It’s not hard, truly.

So far, I’ve made about 60 Reader books, so how hard can it be? And consider how amazing this is:

On my Jornada, tucked onto my 48 meg CompactFlash card, are many of the greatest works ever written:

  • Hamlet
  • Gibbon’s Decline and Fall of the Roman Empire (well, Vol I, anyway)
  • The Koran
  • The Imitation of Christ by Thomas a Kempis
  • The Varieties of Religious Experience
  • Darwin’s The Voyage of the Beagle
  • Homer’s Iliad
  • Lao Tze’s Tao te Ching
  • The Bhagavad-Gita
  • James Joyce’s Ulysses (which I haven’t released due to copyright concerns)

And, though not as noteworthy, I’ve also got the Rules of Golf, some good frequency lists (I love radio), the complete NFL schedule for the upcoming season and, of course, the "good parts" from the Kama Sutra – which Chris De Herrera probably wisely declined to make available on www.pocketpcfaq.com.

And more still. In my pocket. Thanks to Microsoft, amazing computer hardware, OverDrive Systems’ ReaderWorks software – and the many people who tirelessly scanned so that we can joyfully read.

Wanna make some books?

The Process of Reader Bookmaking

First, the painful and nasty part. After you’ve downloaded ReaderWorks, download the documentation. Now read it. Sorry. Now open ReaderWorks and, from the Help Menu, choose Help topics.

Now read the Help Topics. Again, sorry.

This should take less than an hour. When you’ve finished, you’ll know far more than I’ll say here – and probably more than I know! It’s a short-term memory thing…

Or let me condense the process like this:

  1. Get a public domain text from the Internet (more to say about that in a bit)
  2. Munge it in Word until it’s just the way that you and ReaderWorks like it
  3. Place the soon-to-be book in its own folder somewhere (ReaderWorks creates a number of files; you really need each book and related files in a single folder)
  4. Open ReaderWorks
  5. Add the book file
  6. Set the Properties
  7. Make the Table of Contents
  8. Save your work using Save As and the .rwp extension (so you don’t have to start from the very beginning if the next step takes away all your Hit Points)
  9. Choose Build eBook from the File Menu
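Step 3 is easy to skip and annoying to fix later. As a minimal sketch, assuming you keep your source texts as ordinary files and want one folder per book, the housekeeping could look like this in Python. The function name and folder layout are hypothetical, not anything ReaderWorks requires.

```python
from pathlib import Path
import shutil


def prepare_book_folder(source_file: Path, library_root: Path) -> Path:
    """Copy a source text into its own folder, named after the file.

    Returns the path of the copy, which is the file you would then
    open in Word and later add to ReaderWorks.
    """
    # e.g. library_root/hamlet/ for a source file called hamlet.txt
    book_dir = library_root / source_file.stem
    book_dir.mkdir(parents=True, exist_ok=True)

    target = book_dir / source_file.name
    shutil.copy2(source_file, target)
    return target
```

Everything that ReaderWorks later writes for that book can then live next to the copy, and the original download stays untouched.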

Finding Public Domain Texts

The best-known site for public domain texts is Project Gutenberg at www.gutenberg.net. You can also go to Yahoo or, my favorite, www.google.com and search for "etext public domain ebooks" or somesuch. You’ll soon find yourself with more URLs and more books than you’ll ever have time to convert.

A few good sites are:

The On-Line Books Page at http://digital.library.upenn.edu/books/

Bibliomania at http://www.bibliomania.com/

Books on the English Server at http://eserver.org/books/

The Windows CE Archives of (as they call them) E-Text sites at http://www.pda-archives.com/wince/12.htm

An even better list of sites at http://gort.ucsd.edu/jj/1/book.html

The wonderful Internet Classics Archive at http://classics.mit.edu/

Another very good listing of public domain etexts at http://utenti.tripod.it/libridigitali/publicdomain.htm

A still even better list at http://www.lib.utexas.edu/Libs/PCL/Etext.html

And another at http://dmoz.org/Arts/Literature/Electronic_Text_Archives/ where you’ll find many good links, including "The Society for the Appreciation of the Post-Dialogic Novel" where you can ponder their manifesto, which ends with the ringing declaration "That David Foster Wallace's distinction between recursive and referential writing is decidedly valuable, and our belief is that the post-dialogic reflects an essential blend of the two."

And so on. So many sites, so many books, so little time.

Is it Really Public Domain?

As we enter the Media Everywhere Age, the notion of copyright deserves some attention. We’ve got Napster, which allows anyone to gleefully circumvent – hell, break! – the law. We’ve got Barnes and Noble and other booksellers wanting to sell you eBooks – but not wanting you to make copies for all your friends. As you read this, people around the world are ripping CDs, ripping DVDs, converting media files from this format to that, and filling up hard drives and web sites with books, music and video.

There’s a lot of thievery going on. And that’s exactly what it is: Theft. It’s illegal and, more important, it’s morally wrong. Don’t do it.

Sadly, life before the monitor will soon be more complex because of the need for Digital Rights Management. DRM deserves more space than I can give it, but the short story is that software will become more complex – and harder to use – as Microsoft and other companies build new applications, and add new layers in existing apps to prevent you from illegally reproducing media. Not that you would, of course…

Given that, how do you know if something is truly in "the public domain" and eligible to be reproduced and distributed?

(Note: I am not a lawyer. I am not a lawyer)

First, in general, any work that was published more than 75 years ago is in the public domain. In general, of course. 1850 A.D. or before, not to worry. Homer has no lawyers anymore.

And, of course, if the author – or copyright holder – freely puts the work in the public domain, that’s that. Unless, of course, the copyright holder adds: "…but, you have to contact me for permission before reproducing" or other caveats.

There’s also a gray area. When I went to find Hesiod’s Works and Days -- it’s Greek, it’s old, it’s obviously in the public domain! -- I found it at Berkeley.edu, with this at the bottom:

Copyright © 1995. All rights reserved.

Document maintained on server: http://sunsite.berkeley.edu/ by the SunSITE
Manager.
Last update 11/10/95. SunSITE Manager: manager@sunsite.berkeley.edu

Can they do that? Copyright something written before Christ took a breath? Beats me. (I am not a lawyer.)

So I wrote the SunSITE manager a nice email and got a nice email back, giving me permission to reproduce the text for Microsoft Reader, and away I went.

When I made the Reader eBook, I included, at the top, under a "Copyright and Permissions" heading:

Hesiod, the Homeric Hymns and Homerica
Housed at
[Berkeley Digital Library SunSITE]
Copyright © 1995. All rights reserved.
Document maintained on server: http://sunsite.berkeley.edu/ by the SunSITE Manager.
Last update 11/10/95. SunSITE Manager: manager@sunsite.berkeley.edu
Permission given to convert into "not for profit" Microsoft® Reader™ format

When in doubt, inquire about permission. And don’t steal. Most writers and musicians have house payments, same as you. And I don’t care how much money Madonna has, and neither should you.

For more about copyright, the U.S. Copyright Office has a good "Frequently Asked Questions" page at http://lcweb.loc.gov/copyright/faq.html

Sometimes It’s So Easy

In the first version of this Quick Guide, this is where you’d start to read about Word wrestling.

But that’s not always the case. It’s possible to find lovely and informative web pages, with excellent formatting, tables of contents, footnotes – all that – which fly into ReaderWorks with no additional effort needed. Really.

Here’s an example. The Rand Corporation has an excellent report, "The Cyber-Posture of the National Information Infrastructure" at http://www.rand.org/publications/MR/MR976/mr976.html.

It has a detailed table of contents, footnotes, bulleted lists…all that you’d expect in a thorough research paper. From your favorite browser, you can Save As HTML, toss it into ReaderWorks, and press Start.

Out comes a Reader .lit file that looks great on your Pocket PC. The table of contents is clean, properly nested, and functional. The headings look good. The text looks good. The footnotes even work!

And you did nothing but Save As HTML and run it through ReaderWorks. It can happen.

Well, okay. There was one thing. There’s a section titled Acronyms. In the report, a line looks like this:

DARPA Defense Advanced Research Projects Agency

On the Pocket PC, it looks like this:

DARPADefense Advanced Research Projects Agency

Still, not bad. I’ve done conversions where I .literally (sic) did nothing to the text before turning to ReaderWorks. But, the more likely case is still…

The Word Massage

Ok, you’ve got a text and it’s truly public domain. It’s probably also a plain text file. If it happens to be a "Palm format" file, with a .prc or .pdb extension, you need MobiPocket Publisher from www.mobipocket.com. It’s free. It will convert those Palm files to HTML files, which ReaderWorks (after a spin through Word) will accept.

Microsoft Word – or another high-powered word processor or layout program – is where you’ll format the text. ReaderWorks makes eBooks, but it doesn’t format them…much. (Later on, FrontPage gets an informal recommendation from Steve Potash, President of OverDrive Systems, as the "tool of choice" for making Reader eBooks.)

When it comes to formatting, the better you are with Search and Replace, the better. In my experience, making books is 90% search and replace.

Why? Because most public domain texts come with hard breaks at the end of every line. This will create a truly ugly Reader book, unless those line breaks are carefully expunged.

The line breaks will either be paragraph marks (^p in Word) or end-of-line marks (^l in Word). To complicate things, Word often "can’t see" end-of-line marks – though Word craftily displays them as paragraph marks, just to get your hopes up. And since it won’t replace what it can’t see, you’ve got a problem.

To determine if your task is simple or a bit tedious, Find ^p (the ‘p’ must be lowercase). Find again, and see if Word happily skips over what looks to be a bona fide paragraph mark. If so, I do this: save the file as a Word doc, close it, open it, and try again. Still not working? Save as an RTF file, close, open, try again. That should do it.

Now, change all ^l to ^p.
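If Word keeps refusing to see the breaks, one alternative is to normalize them before the text ever reaches Word. The following is a minimal sketch, assuming you are working on the raw text in Python rather than inside Word. It treats Windows line endings, bare carriage returns, and the vertical-tab character (which, to my knowledge, is how Word stores a manual line break) as the same thing. The function name is made up for illustration.

```python
import re


def normalize_line_breaks(text: str) -> str:
    """Replace every flavor of line break with a plain newline."""
    # Order matters: match the two-character Windows ending first,
    # then a lone carriage return, then a vertical tab (assumed here
    # to be Word's manual line break).
    return re.sub(r"\r\n|\r|\x0b", "\n", text)
```

After this, every break in the text is the same character, which is the plain-text equivalent of changing all ^l to ^p.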

Next select all, make everything Normal style and 12 point type (any smaller and ReaderWorks may make your text smaller than you’d like) and save as a Word doc.

Now the interesting part.

Look at your text to determine if it’s what I call "well formed." A well-formed document is the easiest to convert, no matter how long it is. Length isn’t what makes for difficulty. Inconsistency in spacing and other formatting weirdness makes for difficulty. Also, tables that are too wide to fit across an eBook display are nightmarish. Plain text tables: work of the devil. I’ve been known to "give up," in fact. I’m not proud.

Ideally, you want a beautiful chain of paragraphs, separated by a consistent number of paragraph marks and a manageable number of chapter and other headings – because you’ll have to make a style for each heading, so that ReaderWorks can create your table of contents.
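One rough way to judge "well formed" before you start is to count how the line breaks cluster. This is a small Python sketch under the assumption that the text uses plain newlines. The function name is hypothetical, and the numbers only hint at consistency; they don’t replace looking at the text.

```python
import re
from collections import Counter


def newline_runs(text: str) -> Counter:
    """Count how often each run length of consecutive newlines occurs.

    A run of 1 is an ordinary line ending; a run of 2 is one blank
    line between paragraphs; longer runs are extra blank lines.
    """
    return Counter(len(run) for run in re.findall(r"\n+", text))
```

If nearly all the runs are of length 1 and 2, the paragraphs are consistently separated. A scattering of runs of 3, 4, and 5 suggests more cleanup before the search-and-replace passes will behave.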

Still game? Here’s the fun part. First, save. Next, we’re going to get rid of the paragraph marks between paragraphs. Although this procedure differs from book to book, generally I:

  1. Convert three or more ^p to two ^p. Repeat as needed.
  2. Next I convert the two ^p to @@@ (or another sequence that isn’t found in the text)
  3. Next I see if individual lines end with a paragraph mark, or a space then a mark. (usually the former)
  4. Now I ditch Every Single Paragraph Mark (except the final one, which Word requires) and add in a necessary space (so you won’t have hundreds of runtogether words) by Finding ^p and replacing with a single space (just hit the space bar once in the Replace with field)
  5. Finally, I change @@@ back into two ^p.

Now, either save so Word can collect itself after that bruising exchange, or Undo some or all of what you just did and try again, in a slightly different manner.

If you don’t want blank lines between paragraphs, have that final step change @@@ into a single ^p followed by as many spaces as you’d like for an indent. ReaderWorks converts tabs to spaces, but you might want to give it a little heads up here.
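For anyone who would rather do this outside Word, the same five steps can be sketched in Python. This is an illustration under assumptions, not a replacement for checking the result by eye: it assumes the text uses plain newlines, and the function name and the paragraph_sep parameter are invented for the example.

```python
import re

PLACEHOLDER = "@@@"  # same stand-in as in the steps; must not occur in the text


def unwrap_hard_breaks(text: str, paragraph_sep: str = "\n\n") -> str:
    """Remove hard line breaks inside paragraphs, keeping paragraph breaks."""
    if PLACEHOLDER in text:
        raise ValueError("placeholder already occurs in the text; pick another")

    # Ignore leading and trailing breaks; one final newline is added back below.
    text = text.strip("\n")

    # Step 1: three or more breaks in a row become exactly two.
    text = re.sub(r"\n{3,}", "\n\n", text)

    # Step 2: protect the real paragraph boundaries.
    text = text.replace("\n\n", PLACEHOLDER)

    # Steps 3 and 4: every remaining break is a hard line ending.
    # Replace it (and any spaces just before it) with a single space.
    text = re.sub(r" *\n", " ", text)

    # Step 5: restore the paragraph boundaries.
    return text.replace(PLACEHOLDER, paragraph_sep) + "\n"
```

Passing something like paragraph_sep="\n  " gives the no-blank-line variant, with each new paragraph starting on its own line after a two-space indent.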

Using Macros in Word

To make this process easier, make a Macro. Use Word help if you need Macro basics. Simply put, you choose "Record," then do a sequence of steps, then Stop Recording. Not hard. The steps listed above, for example, become a Macro that looks something like this:

Sub ditchardbreaks()
'
' ditchardbreaks Macro
' Macro recorded 05/24/2000 by Doug Clapp
'
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting

    ' Pass 1: protect paragraph boundaries (two paragraph marks) with @@@
    With Selection.Find
        .Text = "^p^p"
        .Replacement.Text = "@@@"
        .Forward = True
        .Wrap = wdFindAsk
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll

    ' Pass 2: replace every remaining paragraph mark with a single space
    With Selection.Find
        .Text = "^p"
        .Replacement.Text = " "
        .Forward = True
        .Wrap = wdFindAsk
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll

    ' Pass 3: turn @@@ back into two paragraph marks
    With Selection.Find
        .Text = "@@@"
        .Replacement.Text = "^p^p"
        .Forward = True
        .Wrap = wdFindAsk
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub

Now, I’ll admit: This is a very simple Macro. It could do much more. But, for now, remember that any repetitive unchanging sequence is a prime Macro candidate.

After searching and replacing – whether by hand or with power-assist – you may now need only a little futzing here and there – adding or deleting spaces, etc. – before the next step.

Adding Styles

Although ReaderWorks can accept multiple files, and create tables of contents from filenames, let’s merely consider making a book from a single file. If that’s the case, you need to create headings, which ReaderWorks uses to make a table of contents.

Again, Search and Replace is your friend. If every chapter begins with the word CHAPTER, for example, you can search for CHAPTER (with Match Case checked) and replace it with CHAPTER, with the replacement formatted in a heading style (Heading 3, say).

Instant table of contents. (Another Macro candidate??)
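If you’re building the HTML yourself instead of letting Word save it, the same trick can be sketched in Python. The assumptions here are mine: paragraphs are separated by one blank line, chapter headings start with the uppercase word CHAPTER, and an h3 tag is the HTML counterpart of a level-3 Word heading. The function name is hypothetical.

```python
import html
import re

# Case-sensitive on purpose, like searching with Match Case checked.
CHAPTER_RE = re.compile(r"^CHAPTER\b")


def paragraphs_to_html(text: str, heading_tag: str = "h3") -> str:
    """Wrap each paragraph in <p>, and each CHAPTER line in a heading tag."""
    parts = []
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue  # skip empty chunks left by stray blank lines
        tag = heading_tag if CHAPTER_RE.match(para) else "p"
        parts.append(f"<{tag}>{html.escape(para)}</{tag}>")
    return "\n".join(parts)
```

A paragraph that merely mentions a chapter in lowercase stays an ordinary paragraph; only lines that open with CHAPTER become headings.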

More likely, you’ll be doing a bit of work "by hand" to properly format the headings, deciding on levels for the headings, and so on. Remember: you’re not outside getting skin cancer; it’s a good thing.

Make sure that the title of your book is not a heading. ReaderWorks takes the filename and puts it in the table of contents. If you make the title a heading, you’ll have two titles atop your table of contents. (Been there…)

Ready for ReaderWorks

Done? If so, save your file as a Web Page, and check that the Page Title in the Save As panel is what you wish for the name of your book. (You can change this in ReaderWorks, but do it now, ok?)
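For the curious, the Page Title you set in Word ends up in the title element of the saved HTML. Here is a minimal Python sketch of a page built by hand, assuming you already have the body as HTML. The function name is invented, and I’m assuming (not asserting) that a bare-bones page like this is acceptable input.

```python
import html


def wrap_as_web_page(body_html: str, book_title: str) -> str:
    """Build a small HTML page whose <title> carries the book's name.

    The title goes only in <head>; it is deliberately not repeated
    as a heading in the body.
    """
    title = html.escape(book_title)
    return (
        "<html>\n<head>\n"
        f"<title>{title}</title>\n"
        "</head>\n<body>\n"
        f"{body_html}\n"
        "</body>\n</html>\n"
    )
```

Keeping the title out of the body headings is the point: the name appears once, in the head, rather than twice atop the table of contents.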

You are now ready to meet your maker.

Using ReaderWorks

This part’s easy. It’s the formatting in Word that’s tedious and often difficult.

In ReaderWorks, you

  1. Add the file
  2. Set the book’s properties
  3. Make a table of contents, and
  4. Build your eBook.

From the Source Files window, click Add to add your HTML-saved file. Make sure the file isn’t still open in Word.

Next, set properties by clicking on the Properties button at left. Add as much information as you can here. Your eBook may live a long and strange life and future generations might like to know if this is a cookbook or a feminist tract.

Next, Click on Table of Contents and run the easy-to-use TOC Wizard. Heck, run it a few times to see which format you prefer.

Now choose Save As to save your project. That way, if ReaderWorks crashes in the next step (a rare thing), you’ve still got your table of contents and property information.

Finally, choose Make eBook from the File menu to make your eBook as a Reader .lit file.

Did I say "Finally"? Actually, you next copy your eBook to your Pocket PC. (Make sure that Reader is closed when you do this – on the Jornada, go to Today, click the Task Switcher icon, choose "Close Window," then Microsoft Reader. Or do a soft reset after copying your eBook.)

Open Microsoft Reader, choose your book from the Library, and discover what you forgot to do in Word. Go back to Word, etc. Repeat until pleased with what you’ve made.

Comments from Steve Potash

Before sending this off to Chris De Herrera, I ran this piece by Steve Potash, founder and President of OverDrive Systems: the makers of ReaderWorks. (By the way: Nice guy. And I’ve found OverDrive’s support to be excellent.)

Steve had a couple of comments. They are:

1. Reader and ReaderWorks supports CSS (most CSS tags but not all) – great for eliminating the embedded tagging and can be applied against an entire title or library.

2. FrontPage is the preferred editor for our eBook folks - pretty easy for layout.

Both good points. Thanks, Steve.

CSS means "Cascading Style Sheets." You can learn all the nitty-gritty about CSS by reading the W3C Recommendation "Cascading Style Sheets, level 1" at http://www.w3.org/TR/REC-CSS1. When you get done with that, you’ll realize why ReaderWorks doesn’t yet support "all" tags.

A good list of links to more CSS information is at http://webopedia.internet.com/TERM/C/Cascading_Style_Sheets.html
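To make Steve’s first point concrete, here is a small Python sketch that drops one style block into a page’s head, so the formatting lives in one place instead of in tags scattered through the text. The properties shown are ordinary CSS level 1; which of them Reader actually honors is not something this guide establishes, so treat the rules as placeholders. The function and constant names are hypothetical.

```python
# Plain CSS1 rules; whether Reader supports each one is an assumption.
STYLE_BLOCK = """<style type="text/css">
p  { text-indent: 1.5em; margin-top: 0; margin-bottom: 0 }
h3 { font-weight: bold; text-align: center }
</style>"""


def add_stylesheet(page_html: str, style_block: str = STYLE_BLOCK) -> str:
    """Insert a style block just before </head>.

    Pages without a </head> tag are returned unchanged.
    """
    marker = "</head>"
    if marker not in page_html:
        return page_html
    return page_html.replace(marker, style_block + "\n" + marker, 1)
```

Because the rules sit in one block, changing the look of every paragraph in the book means editing one line, then rebuilding.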

FrontPage? Makes sense. If you’re comfortable working in FrontPage, give it a whirl. Report back.

Conclusion

Again, despite these honest remarks, making eBooks with ReaderWorks is a fairly simple and greatly rewarding process. An eBook well-made (ok, even poorly made) is a great gift, to yourself and to anyone who uses what you created.

Feel free to contact me with any corrections to or suggestions for this document. And, since you can’t change a Reader .lit file unless you have the original source files, contact me if you’d like the Word files for any of the books I’ve made. Each can be improved, and I’d love to see that happen.

Have fun!
