SGML is an important system for document storage and interchange, but it has no formatting features of its own. Its companion ISO standard, DSSSL (see http://www.jclark.com/dsssl/), is designed for specifying transformations and formatting, but has not yet been widely implemented. Some SGML authoring systems (e.g., SoftQuad Author/Editor) have formatting abilities, and there are high-end specialist SGML typesetting systems (e.g., Miles33’s Genera). However, the majority of SGML users probably translate the source into the input language of an existing typesetting system when they want to print, and TeX is a good candidate for that target. There are three approaches to writing a translator:
If these packages don’t meet your needs for an average SGML typesetting job, you need the big commercial stuff.
Since HTML is simply an application of SGML, we do not need a specific system for HTML. However, Nathan Torkington developed html2latex from the HTML parser in NCSA’s Xmosaic package. The program takes an HTML file and generates a LaTeX file from it. The conversion code is subject to NCSA restrictions, but the whole source is available on CTAN.
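To give a feel for what an HTML-to-LaTeX translator built on an existing parser involves, here is a minimal sketch in Python. It is not html2latex's code and does not reproduce its behaviour; it uses the html.parser module from Python's standard library, and the names TAG_MAP, escape_tex, HtmlToLatex and html_to_latex, along with the choice of tags handled, are assumptions made purely for illustration.

```python
from html.parser import HTMLParser

# Hypothetical mapping from a few HTML tags to (opening, closing) LaTeX text.
# A real translator would need many more entries and attribute handling.
TAG_MAP = {
    "h1": ("\\section{", "}\n"),
    "h2": ("\\subsection{", "}\n"),
    "p": ("", "\n\n"),
    "em": ("\\emph{", "}"),
    "strong": ("\\textbf{", "}"),
    "ul": ("\\begin{itemize}\n", "\\end{itemize}\n"),
    "li": ("\\item ", "\n"),
}

# Characters that are special to TeX and must be escaped in running text.
SPECIALS = {
    "\\": "\\textbackslash{}",
    "&": "\\&",
    "%": "\\%",
    "$": "\\$",
    "#": "\\#",
    "_": "\\_",
    "{": "\\{",
    "}": "\\}",
    "~": "\\textasciitilde{}",
    "^": "\\textasciicircum{}",
}


def escape_tex(text):
    """Escape TeX special characters one character at a time."""
    return "".join(SPECIALS.get(ch, ch) for ch in text)


class HtmlToLatex(HTMLParser):
    """Collect LaTeX output as the parser reports tags and text."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain text
        # before handle_data sees them.
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Tags without an entry in TAG_MAP are silently ignored.
        if tag in TAG_MAP:
            self.out.append(TAG_MAP[tag][0])

    def handle_endtag(self, tag):
        if tag in TAG_MAP:
            self.out.append(TAG_MAP[tag][1])

    def handle_data(self, data):
        self.out.append(escape_tex(data))


def html_to_latex(html):
    """Translate an HTML fragment into a LaTeX fragment."""
    parser = HtmlToLatex()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(html_to_latex("<h1>Costs &amp; 50%</h1><p>An <em>easy</em> case</p>"))
```

The sketch produces only a document fragment: it emits no preamble, drops any tag it does not recognise, and makes no attempt at tables, images or links, which is where most of the work in a real translator lies.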
Michel Goossens and Janne Saarela published a very useful summary of SGML, and of public domain tools for writing and manipulating it, in TUGboat 16(2).