Type Preservation

Written by Typographica on December 10, 2004

Luc Devroye, noting the inevitable demise of digital languages (i.e. PostScript), proposes a method of storing fonts in text format to “preserve type at a time scale running in tens to hundreds of years”.

12 Comments

  1. Raph Levien says:

    Interesting essay. A few thoughts.

    While I agree with the need for preservation as much as anyone, I’m not at all sure that data encoded into digital form is necessarily so fragile. Yes, there’s quite a bit of digital data from a few decades ago that’s lost now (just try to find an example of a Kingsley/ATF Type3 PostScript font with optical scaling!), but I think several things have changed.

    First and foremost, the free software movement has made significant inroads in the battle between open and proprietary formats. Nearly all data formats now in widespread use can now be read using free software. In the case of font formats, FreeType can read nearly everything produced today. Digital fonts in formats such as S3 may well be lost to the sands of time, however.

    Second, storage technology has gotten a lot better with respect to archiving. For one, CD-R material is much more likely to last a long time (and still be readable) than the various magnetic formats once dominant. Also, the unbelievable abundance of hard disk storage now means that it makes sense to have plenty of copies of files all over the place.

    Third, it’s increasingly easy to find the file you want, thanks to the Internet and search engines.

    Fourth, while professional type designers wishing to make money from their work may hate file sharing, for starving design students and future historians it is absolutely vital.

    That said, I’m not opposed to textual formats. But I do think a bit more thought needs to go into exactly what problems are being solved. Are hints important, for example? A textual description of TT hints doesn’t sound much more practical or useful than working code that can run TT bytecode.

    I can propose an especially straightforward textual description of PostScript fonts. For the metrics, .afm files are probably quite adequate–they are already a standard, and probably self-explanatory enough for our hypothetical future historian. For the outlines, just print a PostScript path description using moveto / lineto / curveto. Anyone with a passing knowledge of late-20th / early-21st century graphics technology should be able to puzzle that out.

    Incidentally, the format I’m designing for Cornu-spiral based outline masters is textual and particularly simple to parse. With luck, that will indeed become the format to supplant Beziers, and Luc can breathe easier about preservation.
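
The textual outline description suggested in comment 1 could look roughly like the sketch below. It is only an illustration in Python: the `Segment` layout, the `path_to_text` helper and the sample contour are hypothetical and not taken from any font tool. Only the operator names (moveto, lineto, curveto, closepath) and the operands-first ordering come from PostScript.

```python
from typing import List, Sequence, Tuple

# Hypothetical in-memory outline: (operator, coordinates) pairs.
Segment = Tuple[str, Sequence[float]]

# Number of coordinates each PostScript path operator takes.
ARITY = {"moveto": 2, "lineto": 2, "curveto": 6, "closepath": 0}


def path_to_text(segments: List[Segment]) -> str:
    """Render an outline as plain text, one PostScript path operator per line."""
    lines = []
    for op, coords in segments:
        if op not in ARITY:
            raise ValueError(f"unknown operator: {op}")
        if len(coords) != ARITY[op]:
            raise ValueError(f"{op} expects {ARITY[op]} numbers, got {len(coords)}")
        # Operands first, operator last, as in PostScript.
        parts = [format(c, "g") for c in coords] + [op]
        lines.append(" ".join(parts))
    return "\n".join(lines)


# Made-up contour, in font units.
contour = [
    ("moveto", (100, 0)),
    ("lineto", (400, 0)),
    ("curveto", (400, 200, 300, 300, 100, 300)),
    ("closepath", ()),
]

print(path_to_text(contour))
```

Each line of the result is readable on its own, which is the property the comment is after: someone who knows what a cubic curve is can reconstruct the outline without any font software.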

  2. 42ndSSD says:

    I don’t see keeping, say, Type 1 fonts around as any different from encoding them in an arbitrary text format. The Type 1 standard is extremely well-documented and likely to remain so; the same is true for OpenType. (Unless everyone unanimously decides to delete every Adobe Type 1 standards document in existence, of course. Then we’re all in trouble.)

    If anyone is seriously worried about current font formats becoming “unreadable”, archive a copy of libfreetype sources along with the relevant Adobe and Micro$hit documentation. That’s going to be more productive in the long run than trying to convert every single freaking font out there into some arbitrary format, because our mythical historian 100 years from now is going to find that one font which never got converted (Comic Sans MS) and will be pulling her hair out ’cause she can’t read it.

    Anyone particularly worried about Amiga bitmap fonts becoming unreadable? No? :)

    Every time I’ve run across a mystery file format from eons ago, it’s always been something which was originally totally undocumented and proprietary. Of course the worst-case examples are things like tapes and disk drives which require hardware to read them; I have a collection of old Adobe fonts on 9-track. Not very useful anymore. (I find it amusing that 6-track, 7-track, 8-track and 9-track are all dead formats now.)

  3. John Butler says:

    I miss my Font Mechanic fonts from Beagle Bros.

  4. 42ndSSD says:

    Oh, does that bring back memories.

    ..though I always seem to get them confused with the Smith Brothers, purveyors of fine cough drops and long beards since 1847. Ask for it by name.

    Hmmm. I wonder if anyone’s written a Font Mechanic to Type 1 converter? (Now there’s an application with a burgeoning market.)

  5. Toby says:

    They are fragile because companies like Adobe insist on “obsoleting” formats like Type1 for marketing reasons. There are no good technical reasons. They EOL’d Type1 a long time ago, it seems hopeless to protest, so Luc has a point, we need to take the matter into our own hands.

  6. Toby – Are you telling us we can’t benefit from a font format with Unicode, cross-platform, and contextual glyph substitution capabilities? Or are you saying that Type 1 fonts won’t exist in the near future because Adobe made the format unusable? I think neither is true.

    Font formats, like any other digital format, fall out of use because better technology comes along. Adobe isn’t going to prohibit font makers from creating and releasing Type 1 fonts, nor have they said they’ll drop support for Type 1 in Adobe apps. The reason OpenType is growing in popularity is simply because it’s superior to Type 1 in nearly every way.

    And surely an even more superior font format will come along in the next 20 years (a number I admit was extracted from my bottom).

  7. John Butler says:

    Toby, what languages do you speak?

  8. Dan Reynolds says:

    I think it was in Prague that I heard Microsoft’s representative say that PostScript was obsolete because it is 8-bit software, and all 8-bit software is/will become obsolete, as 16-bit is now the standard. Or something like that.

    I actually know too little about computers, and couldn’t tell you what the difference between 8-bit and 16-bit software is.

    I think that Luc’s thesis in the above article, that you should print out big, very high quality bitmap renderings of your designs, is in keeping with another of his future-predictions: namely, that someday computer technology will advance to the point where computers will be able to autoscan anything and turn it automatically into a font (with metrics, kerning, and all). Sort of like ScanFont, but futurized. If you make adequate images of your work now, future people will be able to regenerate the font on their own, without having to read your out-dated file formats. At least, this is what I think he thinks.

    Just imagine how much fun future type designers will have reviving FF Meta (or anything else contemporary, for that matter…)

  9. Sergej Malinovski says:

    Everything that is worth preserving is preserved in the hearts and minds of people.

  10. Su says:

    Sergej: Oh, come on.

    (I apologize for the length of this, but I just kept finding more stuff. I finally just gave up.)

    I have to admit I’m baffled. This piece is nice as a bit of thinking-out-loud, but I wouldn’t put too much stock in it. Luc is making some, uh…interesting(that’s it) observations, which seem to stem largely from conflating obsolescence(and not even that, really, but more like being surpassed/replaced) with death.

    It’s perfectly reasonable to claim a file format will outlive another. You might be utterly wrong for preposterous reasons, but that’s another matter. The fact that I can only name maybe ten type formats is probably more than enough counterargument for this idea.

    Rat-like lifecycle: Conflation. The data on any given computer doesn’t have to die with the computer. This is why we back up our information to external formats(like, say, publishing an essay on-line) and copy them to the new one. The claim that future hardware won’t be able to read current media makes the assumption(again) of either some catastrophic event that simultaneously destroys every DVD drive and manufacturing plant on the planet, or that people are, for whatever reason, not transferring their data into the new formats as they come out. So, yes, the computers of 2100 probably won’t read DVDs. But it’s not likely there will be any DVDs or anybody using them then.
    Also, rumors of the death of Abu, my Gateway P2-400, are greatly exaggerated. This is not Logan’s Run. Where’s the citation on this claim?

    Genetic weakness: D’whut?

    So, what will we do with our favorite truetype or postscript files when there is no more software to read or display them?
    Conflation. New software, or a new version of it, does not require you to nullify the existence of the old version. I keep old versions around specifically because newer ones sometimes remove functionality I liked. Last year, I had to hunt down a copy of Freehand 4 or so for PK, who was trying to resurrect some 10-year-old work files that nothing current could open properly.

    because almost every programming language ever invented has bitten the dust.
    This is a meaningless statement. Almost every anything ever invented has bitten the dust. It’s just the way things work. I’ve seen about fifty new programming languages be invented this year. They’ll be dead within months. LISP has been around since 1958 and continues to evolve. The difference again being evolution and not quantum leaps.

    For example, in Computer Modern, we have virtually limitless accuracy in positioning the points. The type 1 version must round to the nearest thousand
    Okay, no. You can’t compare a number with “virtually limitless.” What’s the accuracy of Metafont? There’s also a limit to the amount of difference the eye can discern. Whether PostScript still exceeds that, I don’t know, but “virtually limitless” is only theoretically better.
    At any rate, this is only a loss because nobody has bothered(?) coming up with another format with also-(virtually)-limitless accuracy(You backed up those MetaFont files, right?) But, according to the entire premise here, this is futile, because that’ll be gone in a hundred years anyway. So, let’s just sit in the corner and cry instead.

    And then we get to the good part: Text files will last! Because they have lasted. So far. Now, far be it from me to call bull—-, but if you page up about two screens, you’ll see Mr. Devroye say this is a preposterous claim. I suppose he has exemption. But let’s look at a few glaring omissions, anyway, shall we?

    software-independent one-to-one connection between such files and printed versions
    No, there isn’t. Windows, Unix and the MacOS don’t even use the same characters to denote line breaks. There is, of course, even more inconsistency than that, and the problem of interference(ie: programs can be told to convert tabs to a configurable number of spaces).

    just to be sure, we should also explicitly state how each sequence of 8 bits in a file corresponds to a letter
    …and screw the Asians.
    And for the sake of those who like to talk about eggs and chickens: How is that information going to be stored?

    not at all certain that Bezier curves will still be technologically or mathematically relevant a hundred years from now[…]preserve glyphs as pictures in a format, that unlike jpg or gif, is again universally readable, a true bitmap format
    I’m not a math theorist, but as far as I’m aware mathematical concepts don’t just go away, so I don’t follow why curve information would be a bad idea to put in these files. So let’s use a bitmap representation instead. Of course. And then let’s go up about two screens again to the quibbling over three digits of accuracy in placing points in PostScript versus MetaFont.

    Then let’s take pictures of the fonts, and make huge bitmap files. Because image formats apparently don’t die like type formats do.
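
The accuracy question argued over in comment 10 can be put in numbers. The sketch below assumes the conventional Type 1 em of 1000 units and integer coordinates; the function name and the sample sizes are illustrative only. Snapping a point to that grid moves it by at most half a unit along each axis.

```python
UNITS_PER_EM = 1000  # conventional Type 1 em square (an assumption here)


def max_rounding_error_pt(point_size: float, units_per_em: int = UNITS_PER_EM) -> float:
    """Worst-case per-axis displacement, in points, from snapping a
    coordinate to the nearest integer font unit at a given type size."""
    unit_in_pt = point_size / units_per_em
    return unit_in_pt / 2


for size in (12, 72, 720):
    err_pt = max_rounding_error_pt(size)
    err_mm = err_pt * 25.4 / 72  # 1 pt = 1/72 inch
    print(f"{size:>4} pt type: up to {err_pt:.4f} pt ({err_mm:.4f} mm) per axis")
```

Under these assumptions the worst case at 12 pt is 0.006 pt per axis. Whether that is visible depends on the output device and the size of the type, which is the point the comment raises about the limits of what the eye can discern.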

  11. John Hudson says:

    The Lettererror guys, Erik van Blokland and Just van Rossum, have been working on and with a Unified Font Object (UFO) for some time now. The idea, so far as I understand it, is to have a tool-independent, text-based source file for fonts, which would also address various issues raised by Luc Devroye.

    I spoke briefly, a couple of years ago, with Stan Nelson of the Smithsonian Institute about the documenting of type design in the digital age. My concern was less with the archiving of designs than the documenting of the design process, since I’m very much aware that in most of my work — directly in FontLab — I don’t leave any kind of paper trail. For a large amount of modern type design, we lack any development documentation: no sketches, no working drawings, no revisions, only the final design. So an area that interests me is the idea of tools that capture the changes to outlines, which can be stored and replayed: a sort of saveable undo/redo history.

  12. Thanks for the cue, John. I’ve been working with Just van Rossum and Tal Leming on a Python library, RoboFab, which implements a standardised object model for fonts and glyphs, as well as an XML-based font source format, the Unified Font Object or UFO.

    FontLab can export and import these UFO source files, either as a whole font at once or glyph by glyph. The glyph descriptions in XML can store cubics and quadratics but are independent of the PostScript and TrueType formats. The UFO format is again independent of RoboFab; the format is documented (at the LTR Wiki), and a shorter summary of the format is here.

    Both RoboFab and the format are free, public and documented. I’m quite aware that doesn’t mean much in terms of whether a format will be accepted and used. But rather than from lofty ideals, this stuff exists because we have a need for it, which we meet by writing generic code and using generic formats. RoboFab and UFO allow us to write font, glyph and metrics tools which integrate with a FontLab or FDK based production flow without being totally dependent on FontLab’s selection of dialogs.

    There will always be applications which can read (or even write) older binary font formats. The problem is that binaries do not contain all the data that is needed for a design. Glyph names, commented code for features, interpolation settings, interpolating contour constructions which are merged before they get to the binary. UFO addresses the need to store font and glyph data in a future-proof way. The FontLab source format is NOT public or documented and it is not likely that a third party will reverse engineer that format (or that Yuri will document and publish the format). Same for Fontographer 4 sources – these are such complicated binary messes that only the application which created them can read them. Applications go extinct faster than designs.
    Obviously, XML readers are just code too and therefore exposed to time. But there are more of them, and they are implemented in different languages for different purposes. I’ll take a bet that in 10 years the XML in a UFO will be readable by a then current application.

    We’ve had very good experience with checking the glyph source files into a standard text source server, CVS. Each change to a glyph is accompanied by a short email documenting the change and reasons for it. The history of the glyphset can be reviewed, older versions checked out, current versions compared to older versions. Very basic but very useful stuff. It shouldn’t be too much work (but interesting!) to store the history of a glyph as a small animation.

    The current public release of RoboFab had some unforeseen dependencies on Python 2.3 which FontLab on Windows doesn’t work with (yet). In the upcoming release of RoboFab these dependencies have been addressed. Also new in this release are export and import of PostScript hints (stored with the glyphs), many more font and glyph attributes make the roundtrip to XML, some bugfixes. Out soon.

    Erik
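
Comment 12 argues that a text-based glyph source stays readable because generic XML parsers are everywhere. A minimal sketch of that idea follows, using only Python's standard library. The XML layout shown (element names, attribute names, the sample glyph) is a simplified, hypothetical stand-in and not the documented UFO format, and `read_glyph` is an illustrative helper, not part of RoboFab.

```python
import xml.etree.ElementTree as ET

# Hypothetical glyph source; element and attribute names are made up for illustration.
GLYPH_XML = """\
<glyph name="period">
  <advance width="250"/>
  <outline>
    <contour>
      <point x="75" y="0" type="line"/>
      <point x="175" y="0" type="line"/>
      <point x="175" y="100" type="line"/>
      <point x="75" y="100" type="line"/>
    </contour>
  </outline>
</glyph>
"""


def read_glyph(xml_text: str) -> dict:
    """Parse the glyph text into a plain dictionary of name, width and contours."""
    root = ET.fromstring(xml_text)
    contours = []
    for contour in root.iter("contour"):
        points = [
            (float(p.get("x")), float(p.get("y")), p.get("type"))
            for p in contour.iter("point")
        ]
        contours.append(points)
    advance = root.find("advance")
    width = float(advance.get("width")) if advance is not None else None
    return {"name": root.get("name"), "width": width, "contours": contours}


glyph = read_glyph(GLYPH_XML)
print(glyph["name"], glyph["width"], len(glyph["contours"][0]), "points")
```

Nothing here depends on a font editor. Because the source is plain text, it can also be diffed and versioned like any other text file, as the comment describes doing with CVS.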
