The Standard Generalized Markup Language (SGML) is a meta-language for writing Document Type Definitions (DTDs). A DTD describes how a conforming document should be marked up: the structural tags that may occur in the document, the order in which they may occur, and a host of other features. As an electronic publisher, OCLC has several tagged data sources for its reference databases. While this tagged text appears to be SGML, it does not always have a DTD, and when it does, it does not always adhere to it. Despite this, OCLC must build data transformations, databases, and interfaces for this tagged text. The SGML Document Grammar Builder project is an ongoing research effort on the manipulation of tagged text. The project has produced a C++ engine library, the Grammar Builder Engine (GB-Engine), that can be used to automatically create reduced structural representations of tagged text (DTDs), translate tagged text, automate database creation, and automate interface design, all from sample tagged text.
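To make the idea of deriving structure from sample tagged text concrete, the following is a minimal sketch in Python. It is not the GB-Engine's algorithm or API, which this abstract does not describe. The function names (`observed_children`, `collapse`, `sketch_dtd`) and the sample record are hypothetical. The sketch records which child tags appear under each element, merges adjacent repeats into `+`, and prints DTD-style element declarations. It assumes text-only leaf elements, ignores attributes and mixed content, and omits SGML tag-minimization indicators.

```python
import re
from collections import defaultdict

# Matches start tags and end tags; attributes are skipped, not analysed.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^>]*>")


def observed_children(text):
    """Map each element name to the set of child-tag sequences seen in the sample."""
    seen = defaultdict(set)
    stack = []  # each entry is [element name, list of child names so far]
    for match in TAG.finditer(text):
        closing, name = match.group(1), match.group(2).lower()
        if not closing:
            if stack:
                stack[-1][1].append(name)
            stack.append([name, []])
            continue
        # Ignore an end tag that has no matching open element.
        if all(open_name != name for open_name, _ in stack):
            continue
        # Pop up to the matching open tag, which tolerates omitted end tags.
        while stack:
            open_name, children = stack.pop()
            seen[open_name].add(tuple(children))
            if open_name == name:
                break
    # Record any elements still open at the end of the input.
    while stack:
        open_name, children = stack.pop()
        seen[open_name].add(tuple(children))
    return dict(seen)


def collapse(sequence):
    """Merge adjacent repeats: ('ti', 'au', 'au') becomes ['ti', 'au+']."""
    parts = []
    for name in sequence:
        if parts and parts[-1][0] == name:
            parts[-1][1] += 1
        else:
            parts.append([name, 1])
    return [name + ("+" if count > 1 else "") for name, count in parts]


def sketch_dtd(text):
    """Return one DTD-style declaration per element observed in the sample."""
    lines = []
    for element, sequences in sorted(observed_children(text).items()):
        # Instances with no children are dropped here if others have children.
        models = sorted({", ".join(collapse(s)) for s in sequences if s})
        if not models:
            model = "(#PCDATA)"
        elif len(models) == 1:
            model = "(" + models[0] + ")"
        else:
            model = "(" + " | ".join("(" + m + ")" for m in models) + ")"
        lines.append(f"<!ELEMENT {element} {model}>")
    return lines


# Hypothetical sample: two records, the first with two authors.
sample = "<rec><ti>A</ti><au>B</au><au>C</au></rec><rec><ti>D</ti><au>E</au></rec>"

if __name__ == "__main__":
    for line in sketch_dtd(sample):
        print(line)
```

The declarations only describe what the sample happened to contain, so the result is a summary of observed structure, not a validated DTD. A fuller approach would also generalize across alternatives, for example by recognizing that `(ti, au)` and `(ti, au+)` can be merged into the second form.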