This is html2ps 0.1 beta, an HTML-to-PostScript converter.

THIS SOFTWARE IS PROVIDED "AS IS" WITH ABSOLUTELY NO WARRANTY.

The present version of html2ps is written in Perl. Perl is available from
any comp.sources.misc archive.

The latest version of html2ps is available by anonymous ftp from
"ftp://ftp.tdb.uu.se/pub/sources/html2ps/". You can fetch either the tar
file or the individual files. Use binary mode when retrieving the files.

Some PostScript code and some ideas have been taken from the PostScript
generator in NCSA's Mosaic, by Ameet A. Raval & Frans van Hoesel.

The program has been developed and tested on different Suns running
Solaris 2.3 with perl version 5.000, and SunOS 4.1.3 with perl version 4.0.

Author: Jan Karrman, Dept. of Scientific Computing, Uppsala University,
        Sweden, e-mail jan@tdb.uu.se.

Features:

  * Most HTML tags are handled; see Notes/Bugs below for exceptions.

  * Scaling of the text to any size is possible (the line and page breaks
    will of course be adjusted to fit the page).

  * It is possible to change the sizes and styles of all 6 header levels
    individually.

  * The font size used for preformatted text may be changed.

  * The size of the page can be adjusted. The defaults are adapted to the
    A4 paper size.

  * The margin sizes may be changed.

  * Different fonts can be selected. You can easily add new fonts; an
    example is given in the Perl script.

  * Printing in landscape mode is supported.

  * Anchor texts are underlined by default; this can be turned off.

  * No syntax check of the HTML code is done by the converter, but it is
    possible to call an external HTML checker, specified via the command
    line options. The default syntax checker is weblint.

  * Page numbers can be inserted.

  * A heading tag will cause a page break if the text is close to the end
    of a page.

  * Highlighting tags are interpreted additively. For example, the HTML
    code "<B><I>some text</I></B>" would produce bold italic text.
    This can be turned off so that only the innermost tag is interpreted
    (here, the italics).

  * You can force a page break by including the comment in the HTML
    document, at the point where you want the page break. This action is
    not defined in the HTML specification. I would like to have a special
    character (e.g. &page;), ignored by screen browsers, but used to force
    a page break when printing a document.

  * The generated PostScript code is very compact: it will be less than
    the size of the HTML file plus the size of a PostScript header
    (presently about 8 kilobytes).

Notes/Bugs:

  * In-line images are not handled. -- If the ALT attribute of <IMG> is
    present, the corresponding text is written in place of the image. If
    there is no ALT attribute, the text "[IMAGE]" is written (this text
    can be changed via the command line options).

  * The tag is not implemented. -- Ignored.

  * The and associated tags are not implemented. -- Ignored.

  * is not implemented. -- Handled as .

  * The text between two HTML tags or special characters will be converted
    into one PostScript string. If there is a large chunk of text between
    two successive HTML tags (typically between and tags), the length of
    the PostScript string may exceed some limitations in the PostScript
    printer/viewer. As a workaround, you can insert a few HTML comments in
    the document.

  * The string "" in an HTML document terminates the HTML entity, even if
    it is within an <XMP> or <LISTING> element. The tags <XMP>, <LISTING>
    and <PLAINTEXT> are obsolete; you should use <PRE> instead.

  * The PostScript code generated by html2ps is not compliant with the
    Document Structuring Conventions. This is because the line breaking
    (and hence the page breaking) is done within the PostScript code
    itself, so the %%Page and %%Pages comments cannot be generated in
    advance.

  * A line break may occur at the position of a tag. For example, the code
    "It looks <EM>awful</EM>. Hopefully it will be fixed." may give a line
    break between the word "awful" and the period.

ToDo:

Try to fix as much as possible in the Notes/Bugs section. The lack of
support for in-line images is perhaps the major deficiency of html2ps. It
will probably not be implemented in the near future though.

Installation (Unix):

I have only tested html2ps on Unix systems; I would like to hear about your
experiences with installing it on other platforms.

Make the Perl script executable (chmod 755 html2ps). To convert an HTML
file to PostScript, call the script with the HTML file as parameter. Use
the -o option to save the PostScript code to a file, or redirect the output
in standard UNIX manner. For example:

    html2ps test.html > test.ps

If you want to use the syntax check feature, you have to install an HTML
syntax check program. The program called by default is weblint by Neil
Bowers, available via anonymous ftp from ftp.khoros.unm.edu in
/pub/perl/www.

The file html2ps.1 is a manual page; move it to a directory where the
manual pages are kept, and read it with 'man html2ps'. There is also a
plaintext version in the file manpage.txt.

History:

  * 12 Dec 1994: Version 0.1 beta released.
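
Example (workaround for long PostScript strings):

The Notes/Bugs section suggests inserting a few HTML comments when a large
chunk of text between two tags produces an overly long PostScript string.
The following is only a rough sketch of how that could be automated before
running html2ps. It is written in Python and is not part of html2ps; the
function names, the empty comment "<!-- -->" used as separator, and the
limit of 1000 characters are all assumptions chosen for illustration, not
values taken from html2ps or from any particular printer.

```python
import re

# Hypothetical helper, not part of html2ps.
# Matches an HTML comment or any other tag; everything else is text.
TAG_OR_COMMENT = re.compile(r"(<!--.*?-->|<[^>]*>)", re.DOTALL)


def break_run(text, limit, marker):
    """Insert `marker` into `text` so that no piece is longer than `limit`."""
    pieces = []
    while len(text) > limit:
        # Prefer to cut just after the last space or newline within the limit.
        cut = max(text.rfind(" ", 0, limit), text.rfind("\n", 0, limit))
        # No usable whitespace: fall back to a hard cut. This may split a
        # character entity such as &amp;, so a real tool would need more care.
        cut = cut + 1 if cut > 0 else limit
        pieces.append(text[:cut])
        text = text[cut:]
    pieces.append(text)
    return marker.join(pieces)


def split_long_text(html, limit=1000, marker="<!-- -->"):
    """Break up long runs of text between tags with empty HTML comments."""
    parts = TAG_OR_COMMENT.split(html)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            out.append(part)  # a tag or comment: copy unchanged
        else:
            out.append(break_run(part, limit, marker))
    return "".join(out)
```

A script could read the HTML file, pass its contents through
split_long_text(), write the result to a temporary file, and convert that
file with html2ps as usual. Removing the inserted comments gives back the
original text, so the visible content of the document is unchanged.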