Internet Publishing Handbook - Copyright © 1995 by Mike Franks

CHAPTER 10: Putting It All Together

This chapter profiles and interviews administrators of exemplary and interesting sites from around the world to show you how they've put it all together. The interviews were done by e-mail, telephone, or in person. The profiles are in alphabetical order, but check out Table 10-1 to find summaries for all the sites.

Britannica Online

<http://www.eb.com/>

For a company that has been in business for 225 years, Encyclopaedia Britannica certainly got off the mark quickly when it came to publishing on the Internet. According to Doug Paul, executive vice president and general manager of Britannica Publishing Division, and Anne Long, executive director, electronic products, Britannica first started experimenting with multimedia in 1988, with its creation of Compton's Multimedia CD-ROM. Britannica felt strongly then, as now, that you don't experiment with your flagship product. They sold the Compton's New Media Division in 1993 when they saw the CD-ROM business moving toward entertainment instead of education. The initial text-only CD-ROM version of the Encyclopaedia Britannica ($995) has enjoyed sales stronger than expected and the recently released illustrated version is also proving to be successful.

The Compton's period was a crucial experience, however, in that it made Britannica aware of the value of electronic publishing and brought the company a joint relationship with an advanced programming group in La Jolla, California. These engineers had close ties to high-level academic computer research, including search-and-retrieval development.

In December 1992 Britannica began exploring the possibility of putting the text of the Encyclopaedia Britannica on college and university servers. But they quickly realized that proprietary servers were not the way to go. Their technical advisers then connected them with WAIS, and 18 months later Britannica Online debuted on the Internet. Along the way Britannica garnered the advice of an advisory group of college and university librarians and system administrators.

Britannica is still working out its marketing strategies, adding individual subscriptions and other institutions to what had been only college and university access. It has been using IP address validation but has recently added passwords, which will allow the company to sell individual subscriptions as well as site licenses.

In addition to copyright statements and legal notices scattered here and there, Britannica uses a variety of means to secure its intellectual property rights and those of its third-party providers. For example, it has embedded copyright notices in many of its images instead of putting that information in a line of text beneath the graphic.

Technical Notes

Britannica Online consists of the full range of articles in the print set with a growing number of links (3,000) to outside resources. A section on scuba diving, for example, might link to a scuba-diving home page.

Britannica Online dynamically serves every article. That is, headers and footers are added onto each retrieval, and search terms are bolded in the text. They can also add and suppress links for market and presentation variability without changing the core content.
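
The idea of assembling each page at request time can be illustrated with a minimal Python sketch. It is not Britannica's code: the HEADER and FOOTER strings, the render_article function, and the way outside links are passed in as a separate list are all assumptions made for illustration.

```python
import html
import re

# Hypothetical page furniture; a real site would load its own templates.
HEADER = "<html><body><h1>Example Encyclopedia</h1>"
FOOTER = "<hr><p>Copyright notice goes here.</p></body></html>"


def render_article(body_text, search_terms=(), links=(), show_links=True):
    """Wrap stored article text in a header and footer at request time.

    Search terms are bolded in the text, and the list of outside links can
    be suppressed without touching the stored article itself.
    """
    text = html.escape(body_text)
    # Longest terms first, so a phrase wins over a shorter term it contains.
    terms = sorted((html.escape(t) for t in search_terms if t),
                   key=len, reverse=True)
    if terms:
        pattern = re.compile("|".join(re.escape(t) for t in terms),
                             re.IGNORECASE)
        text = pattern.sub(lambda m: "<b>" + m.group(0) + "</b>", text)
    parts = [HEADER, "<p>" + text + "</p>"]
    if show_links and links:
        items = "".join(
            '<li><a href="%s">%s</a></li>'
            % (html.escape(url, quote=True), html.escape(label))
            for label, url in links
        )
        parts.append("<ul>" + items + "</ul>")
    parts.append(FOOTER)
    return "\n".join(parts)
```

Calling render_article("A red herring is a distraction.", ["red herring"]) returns a full page in which the phrase is wrapped in bold tags; passing show_links=False drops the link list while the article text stays the same.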

Britannica has been using a Sun Microsystems Sparc 10 with 64MB of memory as its server behind a firewall. Their server is running Netscape's Secure Server with WAIS, Inc., server software modified to optimize relevance ranking. They enhanced the WAIS engine particularly in searching for phrases, e.g. red herring.

Advice

Harold Kester, vice president and general manager of the Advanced Technology Group of Encyclopedia Britannica North America, urges entrepreneurs to "understand the utility of what you're doing and develop a business plan around that." He notes that brand name identity is almost always a key to success. But for marketing or delivering products on the Internet, it's probably ten times more important.

Burlington Coat Factory

<http://www.coat.com/>

Burlington Coat Factory's Web site offers company information, a map to its stores, and pictures of some of its products. According to Percy S. Young III, director of store systems:

We had been using the Internet for several years for e-mail and occasional FTP access. When we heard about Mosaic and the Web, we started exploring the Web. In April 1994 I got the idea that Burlington should be able to easily put up a Web presence and should also be able to turn it into a commercial site. In addition, I was interested in the possibilities for internal uses of the Web behind the firewall.

By June I had a functioning server in-house behind the firewall and was creating internal content. By July we had the prototype of the external pages pretty well set up and went to the marketing side to get approval. This was finally set by the end of August, and on September 1, the external server was installed and publicized.

Burlington promoted its site with a coupon good for a $5 discount on a purchase of $50 or more to anyone who filled in an online form commenting on the site. In two and a half months the coupon brought in more than 2,000 responses, which were overwhelmingly positive. At the peak of the offer Burlington was getting more than 100 responses a day and serving an average of 10 documents per visit, Young said.

Burlington plans to use the Net as an active sales site but is waiting for a security protocol to be established. In the meantime the company may use Netscape, "given the wide distribution and acceptance of their browser," Young said.

Actual out-of-pocket costs directly attributable to the Web pages are essentially zero--Burlington's firewall machine and Internet link were already in place and budgeted for other uses, Young said. "We added a one-Gig disk to the machine for the additional storage, so I guess that is extra. The software so far has all been from the Net. We just added a new scanner ($1,000) and are looking at some UNIX-based scanning software ($1,500). In terms of people time, we are looking at about four man months, probably of a senior analyst, about half of the time in learning," Young continued.

Burlington is working on using a Web site to disseminate information to employees; an employee telephone directory and some graphics already are online. Burlington wants to add linked biographies, skill matrices, and weekly status reports and store them as Web documents for easy reference, Young said. The company's long-range goals include integrating its training and documentation on the Web or similar multimedia graphical environments.

Technical Notes

Burlington runs its Web server on a Sun Sparcstation with a firewall and runs the NCSA HTTPD server. The images were scanned on PCs and then moved to UNIX with some hand editing done via simple pixel editors. Burlington is looking at more sophisticated imaging solutions.

All the scripts for forms processing were written in C based on the models provided with the NCSA server. One set of linked entry forms has the first form run a program that creates the next form and so on, passing "session" information along to ensure continuity across four linked entry screens.
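
One common way to pass "session" information from form to form is to copy every answer collected so far into hidden fields on the next form. The sketch below shows that technique in Python rather than C. The text does not say how Burlington's scripts carry the session, so the hidden-field approach, the STEPS list, the /cgi-bin/ paths, and both function names are assumptions.

```python
import html

# Hypothetical field collected on each of the four linked entry screens.
STEPS = ["name", "address", "item", "comments"]


def next_form(step, session):
    """Return the HTML for screen number `step`, carrying earlier answers
    along as hidden fields so the next program sees the whole session."""
    hidden = "".join(
        '<input type="hidden" name="%s" value="%s">\n'
        % (html.escape(key, quote=True), html.escape(value, quote=True))
        for key, value in sorted(session.items())
    )
    field = STEPS[step]
    return (
        '<form method="POST" action="/cgi-bin/step%d">\n' % (step + 1)
        + hidden
        + '%s: <input type="text" name="%s">\n' % (field.capitalize(), field)
        + '<input type="submit" value="Continue">\n</form>'
    )


def handle_submission(step, posted):
    """Process the fields posted from screen `step` (hidden plus new) and
    build either the following screen or a final confirmation."""
    session = dict(posted)
    if step + 1 < len(STEPS):
        return next_form(step + 1, session)
    summary = ", ".join("%s=%s" % (k, session.get(k, "")) for k in STEPS)
    return "Thank you. Received: " + summary
```

Because the whole session travels inside each page, the server keeps no state between requests; the trade-off is that hidden values can be altered by the user and need to be checked again on arrival.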

Advice

Young thinks that anyone who sees the Net as a pot of gold "is probably dreaming right now, but for those who look to learn from their adventures in Web publishing, and for those who seek to further the culture of the 'Net, the minimal costs are far outweighed by the potential future opportunities. . . . By having started early, we hope to keep ahead of the competition."

CareerMosaic

<http://www.careermosaic.com/>

CareerMosaic, launched by Bernard Hodes Advertising, provides employers with a cost-effective way to tell prospective employees about the company. According to Tim Gibbon, executive vice president, and Bruce Moore, director of systems and planning for the Western region, companies can present as much information as they want, including graphics, annual reports, and the full text of a pension plan. With the aid of Bernard Hodes Advertising, companies can communicate their "corporate personality" in hypermedia, using text files, graphics, audio, photographs, and video.

Companies represented include Bellcore, First USA (financial services), the Good Guys! (electronic retail chain), Intel, Intuit, National Semiconductor, NeXT, Sun Microsystems, Symantec, Union Bank, Wells Fargo Bank, and Cedars-Sinai Medical Center of Los Angeles.

Online job application forms are in the works. CareerMosaic's Web site includes an example of the "ultimate résumé," a multimedia extravaganza by Joe Mack. Moore said the most innovative thing CareerMosaic does is publish so that the material is equally accessible from all types of Web browsers. The challenge has been to keep CareerMosaic's Web designers from using all the experimental HTML features that Netscape and others are devising. CareerMosaic needs to ensure that its business is delivered to the largest audience possible.

Technical Notes

To write HTML pages on the fly CareerMosaic created a custom database application built on the 4th Dimension database package running on PowerPCs. CareerMosaic's Web server is a Sun Sparcstation 5 with 64MB of RAM running NCSA HTTPD and a modified freeWAIS. The server is rarely taxed and has about 3 million connections per month.

Advice

Moore said that the most important lesson is not to forget what you know about your own area. Consider your objectives, and don't be confused by the technology. People look for the 100 percent solution, but forget the 80 percent solution that can be implemented quickly.

CICA Shareware Repository

<http://www.cica.indiana.edu/>, <gopher://gopher.cica.indiana.edu>, and <ftp://ftp.cica.indiana.edu>

The Center for Innovative Computer Applications (CICA) at Indiana University houses one of the world's largest collections of Microsoft Windows public domain and shareware applications, tips, utilities, drivers, bitmaps, and such. The center has more than 5,300 files totalling 1.5GB. Its Web site sees approximately 50,000 logins per month requesting more than 70GB of those files, for an average of 2.3GB per day of file transfers. The site is considered so valuable that it is mirrored by at least 15 other sites around the world, including sites in England, Holland, Germany, Finland, Sweden, Switzerland, Israel, Australia, Taiwan, Thailand, Japan, Singapore, and Poland.

According to FTP librarian and administrator Michael Regoli, the site is maintained by volunteers on an old machine. His real job is director of publications for the Organization of American Historians, also at Indiana University. He recounts some problems in running the CICA FTP site:

On the administrative computing side of things, network bandwidth has been the biggest obstacle. People think we are limited to 45 connections [from the Internet] during the day because we haven't the horsepower on the desktop to service that many users. They couldn't be further from the truth. If left unchecked, and wide open, the FTP site could easily consume half, if not more, of Indiana's three T-1 links.

Technical Notes

The FTP server is a Sun Sparcstation 1 with 28MB of RAM. It will be moving to a 90MHz Pentium (PCI bus) with 64MB of RAM and a PCI Ethernet card for faster network access. Regoli said that CICA uses Washington University's FTP daemon (FTPD) for FTP services and has been pleased with its performance. CICA's Gopher server uses GN, and Regoli says it is an intelligent, load-aware server. CICA's Web server runs NCSA HTTPD on a Silicon Graphics Challenge M running Irix 5.3, with 64MB of main memory (85 MIPS).

The bandwidth is effectively increased by the more than 15 mirror sites because they provide an entirely new set of lines into the archive. CICA finds they are extremely important in handling demand.

CMP TechWeb

<http://techweb.cmp.com/techweb/>

CMP publishes 16 computer trade magazines; CMP TechWeb, launched in November 1994, is its platform for interactive publishing, which the company expects to become a big business in itself. TechWeb offers recent headlines, a resource guide for technical job hunters, links to current issues of its magazines, and an extremely useful full-text search of its issues published since January 1994. Since July 31, 1995, TechWeb has required that you register with the company for these services, but all that involves is filling out an online survey. TechWeb also has ads and links to its sponsor sites.

The CMP magazine list includes Communications Week, Comm Week International, Computer Reseller News, Computer Retail Week, Electronic Buyers News, Electronic Engineering Times, Home PC, Information Week, Interactive Age, Internet Business Report, Max CD-ROM, Netguide, Network Computing, OEM Magazine, VAR Business, and Windows Magazine. Most are "controlled subscription" magazines, which means that they are free to people who hold jobs that put them in the target populations the magazines' advertisers want to reach.

Users query TechWeb's WAIS searcher by filling out an online form that permits a user to specify which (or all) of the 16 CMP magazines to search, as well as starting and ending dates, title, author, section, column, and text. The results screen has been formatted nicely--it displays title, source magazine, date, relevance ranking, and file size.

Technical Notes

TechWeb's site runs on UNIX with Netscape and WAIS server software. It has a T-1 connection to the Internet and handles 175,000 connections per week.

Advice

Mitchell York, publishing director of CMP Interactive Media, says:

It's wonderful that literally anyone can now be a publisher with a potential audience of millions. But the fact is most Web sites don't have staying power--that is, the ability to attract one user more than once or twice. . . . It's quite okay for someone to put up a Web site for personal enjoyment, but if the purpose is to build an audience, you're in a different league. In this arena, the key is to have targeted content aimed at an audience that needs the information you supply, and that the information is refreshed often. (In our business, this means at least daily, if not several times a day.)

CyberWire Dispatch

<http://cyberwerks.com/cyberwire> and <gopher://cyberwerks.com/11/cyberwire>

In September 1993 readers of the WELL's Wired conference were greeted with a new topic that audaciously began with:

This is the place for breaking news from the CyberFront. There's no press card issued; you hammer out the copy, jack in and upload. In some cases, you'll read news "as it happens." In other cases, there could be a string of dispatches running throughout the day as an issue or situation develops.

Quick, off-the-cuff, interactive journalism.

You don't need credentials, a degree or editorial approval.

You do need attitude, desire and nerve.

It was followed by the first issue of what became Brock N. Meeks's CyberWire Dispatch with its trademark opening, "Jacking in from . . ." Over subsequent weeks the Dispatch quickly developed a reputation on and off the WELL for brash, occasionally off-color, but always accurate reporting on federal telecommunications policy and the antics of those working to change cyberspace for better or worse.

Meeks, an award-winning veteran journalist who is Washington bureau chief for INTER@CTIVE WEEK, understands better than anyone the speed, reach, and freedom associated with online publishing. CyberWire Dispatch consistently scoops the print media. Although Meeks's early dispatches were used without acknowledgment in the bylined articles of others, the now-copyrighted Dispatch has been cited by national news wires and publications like the Economist.

Dispatch was originally distributed only on the WELL and over the com-priv mailing list. In mid-1994 Meeks asked Liberty Hill Cyberwerks's Eric Theise what it would take to set up a dedicated Dispatch mailing list. Theise, a supporter of Meeks's work, set up a list and searchable Web/Gopher archives of back issues in a matter of days at no charge. The Web site has links to organizations Meeks thinks you should know about, including the Electronic Privacy Information Center, Voters Telecomm Watch, and other organizations working to preserve constitutional freedoms on the Internet.

The e-mail mailing list has about 7,100 subscribers, many of which are local mail exploders (which resend to many other addresses). Dispatch is known to circulate on internal U.S. government e-mail networks, including the Pentagon's.

Technical Notes

Combined Web and Gopher traffic through the Dispatch's archives averages roughly 1,500 hits per day. Because Meeks uses few graphics, most of those hits are document retrievals.

The CyberWire Dispatch archive server runs GN for Gopher/WWW services on Liberty Hill Cyberwerks's main server, an Intel 486/66 with 32MB of RAM, running BSD UNIX.

The mailing list is managed with Majordomo.

Liberty Hill Cyberwerks uses a simple sed script to reformat the mailings into a form usable by the Web and Gopher server. Appropriate links are added by hand to HTML versions of Dispatch.
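
The actual sed script is not reproduced here. As a rough illustration of the same kind of reformatting, the following Python sketch (a different tool from the one Liberty Hill Cyberwerks uses) drops the mail headers, takes the Subject line as the page title, and wraps each paragraph of the body in HTML. The function name and the output layout are assumptions.

```python
import html


def mailing_to_html(raw_message):
    """Turn one plain-text mailing into a simple HTML page.

    Assumes the usual mail layout: header lines, a blank line, then the
    body with paragraphs separated by blank lines.
    """
    head, _, body = raw_message.partition("\n\n")
    title = "Untitled"
    for line in head.splitlines():
        if line.lower().startswith("subject:"):
            title = line.split(":", 1)[1].strip()
    paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
    out = [
        "<html><head><title>%s</title></head><body>" % html.escape(title),
        "<h1>%s</h1>" % html.escape(title),
    ]
    out.extend("<p>%s</p>" % html.escape(p) for p in paragraphs)
    out.append("</body></html>")
    return "\n".join(out)
```

Links to outside organizations would still be added by hand afterward, as the text describes.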

Advice

Theise says:

Never underestimate the amount of time it's going to take to manage a large mailing list. Despite the fact that Majordomo, LISTSERV, and Listproc [three different e-mail list server programs] automate many of the functions needed to keep a list going, you'll be amazed--you'll despair?--over what can happen with a large list. People abandon their e-mail accounts without unsubscribing. Bounced mail comes back with headers that take minutes to decipher, if they're decipherable at all. People try to spam [send unrelated messages to] your list. And, as anyone who's been on a list for more than two days knows, there's a steady stream of people who don't know how to unsubscribe from the list, even if you've told them repeatedly.

Hearts of Space Radio/Records

<http://www.hos.com/>, <gopher://hos.com>, <mailto:info@hos.com>

Liberty Hill Cyberwerks' Eric Theise tells this story about driving from San Francisco to Minneapolis for the 1993 Gopher Developers' Conference:

In the middle of Nebraska in the middle of the night, a National Public Radio affiliate began broadcasting the syndicated Hearts of Space program. After a very long day of country and western and classic rock, the HoS mix of space, ambient, electronic, and ethnic music was a welcome change to these tired ears.

At the end of the program, producer/host Stephen Hill announced that program playlists were available in the WELL's radio conference. . . . As soon as I arrived, I sent him e-mail asking if he'd be interested in making his playlists available beyond the WELL's subscription-only doors.

Within weeks, the duo had opened the Hearts of Space section of the GN-based WELL Gopher, offering a searchable archive of playlists from mid-1990 to the present, a list of the nearly 300 stations carrying the program, a catalog of CD and cassette releases on Hearts of Space's record labels, and a simple "About Hearts of Space" file. They uploaded new playlists every week and began to develop a presence on the Usenet newsgroup rec.music.newage, Theise recalled.

A year after they teamed up, the Web was becoming the information space of choice. After some discussion Liberty Hill Cyberwerks was selected to create and maintain the standalone Hearts of Space Internet site, hos.com. The Gopherspace and listener mailing list came online in fall 1994; early 1995 saw the opening of the Web site and the installation of an e-mail gateway between Hearts of Space's office/studio network and the Internet, Theise said.

From a marketing perspective, one of the most valuable features of the Hearts of Space Web site is the ability to link selections in the Hearts of Space Radio playlists to descriptions, cover art, and sound samples from releases on Hearts of Space's own labels. Hearts of Space is arguably the most active New Age label on Usenet, monitoring rec.music.newage, rec.music.ambient, alt.radio.networks.npr, and other groups for discussion and questions about its programming and artists.

Technical Notes

Marking up five years' worth of playlists by hand would have been impossible, so Theise created a relatively simple awk (UNIX utility) script to recognize HoS releases by title and insert links in addition to the boilerplate HTML. Simple awk and sed scripts were used extensively in the initial formatting of the release sheets for the approximately 70 releases from Hearts of Space, and they still use sed scripts to facilitate the weekly site updates.
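
The approach of recognizing known release titles and inserting links can be sketched as follows. This is Python rather than awk, and it is not Theise's script: the RELEASES table, its titles and paths, and the link_releases function are hypothetical stand-ins.

```python
import html
import re

# Hypothetical catalog: release title -> page describing that release.
RELEASES = {
    "Example Album One": "/releases/example-one.html",
    "Example Album": "/releases/example.html",
}


def link_releases(playlist_text, releases=RELEASES):
    """Wrap a plain-text playlist in boilerplate HTML and turn every
    known release title into a link to its release page."""
    body = html.escape(playlist_text)
    # Longest titles first, so a long title is not split by a shorter one.
    titles = sorted(releases, key=len, reverse=True)
    if titles:
        pattern = re.compile(
            "|".join(re.escape(html.escape(t)) for t in titles))

        def make_link(match):
            title = html.unescape(match.group(0))
            return '<a href="%s">%s</a>' % (releases[title], match.group(0))

        body = pattern.sub(make_link, body)
    return "<html><body><pre>\n%s\n</pre></body></html>" % body
```

Matching on exact titles keeps the script simple, at the cost of missing playlist entries that abbreviate or misspell a release name.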

The Hearts of Space Internet server is an Intel 486/66-based computer with 16MB of RAM. It runs the BSD/OS UNIX operating system, uses Majordomo as its mailing list manager and info@hos.com autoresponder, and GN as its Gopher and Web server. It's a textbook example of using GN to deliver materials via Gopher and the Web and makes fairly extensive use of GN's built-in search facilities so that browsers can search the playlist archives and the recordings catalog. The server is housed at Liberty Hill Cyberwerks and connected to the Internet at T-1 speeds.

As of July 1995 nearly 2,100 listeners receive the weekly playlist via e-mail, and the autoresponder--mentioned at the end of each radio broadcast--averages dozens of inquiries per day. Combined Web and Gopher traffic averages around 1,100 hits per day. Credit card orders for CDs and cassettes are accepted through Web forms and e-mail; customers concerned with security can fax their orders or send them through regular mail.

Hearts of Space's internal QuickMail system exchanges e-mail with the Internet using the StarNine MailLink Remote UUCP gateway, allowing staff to be in communication with artists, radio stations, distributors, and others worldwide.

InfoUCLA

<http://www.ucla.edu/>

UCLA's Web site is similar to many university Web sites in that it provides central access to campus resources. These include the campus e-mail and phone directory, campus map, library hours and policies, library card catalog access, admissions and records information, administrative policies, central computing, associated students, bookstore, and student newspaper, as well as a diverse set of departmental Web, Gopher, and FTP servers. Those departmental sites are rich with content and will become more so as faculty, staff, and students learn new tools. Many research centers are putting at least some of their research material up, and several national journals based at UCLA are thinking about turning electronic.

The interesting thing that UCLA is doing, though, is providing virtually free dial-in SLIP/PPP access to the Internet and e-mail for all students, staff, and faculty. Called BruinOnline (after the campus mascot), this ambitious project started as a reaction to an increase, from one year to the next, of 10,000 undergraduates who wanted e-mail accounts. At the time those were mainframe e-mail accounts, and that huge increase raised worries about its effect on mainframe performance.

The implications of every student's having e-mail access (from campus labs, if a student didn't have a computer at home) are just starting to hit the faculty. The possibilities are fascinating: receiving homework assignments by e-mail, creating local Usenet newsgroups or mailing lists for every class, starting online discussion groups and holding online office hours, building class Web sites, and sending and updating problem sets by e-mail or FTP. Some faculty members are already preparing their lesson plans in HTML, and a Slavic languages professor had his students learn enough HTML to annotate a Russian novel.

Technical Notes

InfoUCLA sits on an IBM RISC System/6000, model 59H, with 256MB of RAM, running NCSA HTTPD, freeWAIS, Gopher, and QI (a campus directory service). This machine also provides Usenet News for the campus.

Internet Movie Database

<http://www.msstate.edu/Movies/>, <http://www.msstate.edu/Movies/alternative_access.html>

The Internet Movie Database WWW server is a repository of movie details, reviews, ratings, and trivia as well as biographies and filmographies of actors and directors. The original Cardiff, Wales, site is mirrored by servers in Australia, the United States, Germany, Japan, Korea, and South Africa.

The Internet Movie Database began in 1989 with periodic postings to the Usenet newsgroup rec.arts.movies. Since then it's grown into an "international volunteer effort whose principal objective is to provide useful and up-to-date movie information freely available on-line, across as many systems and platforms as possible," according to its FAQ. It covers more than 47,000 movies with more than 630,000 filmography entries and is expanding continuously. The database is available via FTP, WWW, and e-mail and grows by about 7,000 filmography additions and 500 new titles per week. It includes more types of movie information than you can imagine in an amazing amount of detail.

All data are contributed freely by movie lovers around the world, and the management is all volunteer. The link to rec.arts.movies and other Usenet Newsgroups continues. The author of the Web interface to the Internet Movie Database (Rob Hartill) was inducted into the WWW Hall of Fame in 1994, and the database itself received an honorable mention in the Best Entertainment category. The highly coordinated international volunteer pursuit of a common interest epitomizes what is possible on the Internet.

The users maintain the database and police its accuracy. Anyone who uses the database can fill out an online form or submit corrections or updates by e-mail. Database managers have special scripts to help them update the database and keep it consistent. One script warns managers when updates are being made to database items that they have marked "complete" or "verified complete." This is a method of ensuring that good data don't get corrupted by accident or malice while relying on user-supplied data and input.
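
The kind of check that warning script performs might look like the Python sketch below. The managers' real scripts are not described in detail, so the data layout (a status per title and a list of pending updates), the PROTECTED set, and the check_updates function are assumptions.

```python
# Status values that mark an entry as finished and worth protecting.
PROTECTED = {"complete", "verified complete"}


def check_updates(status_by_title, updates):
    """Return one warning for each pending update that touches an entry
    a manager has marked "complete" or "verified complete".

    `status_by_title` maps a title to its status string; `updates` is a
    list of (title, field, new_value) tuples submitted by users.
    """
    warnings = []
    for title, field, new_value in updates:
        status = status_by_title.get(title)
        if status in PROTECTED:
            warnings.append(
                "WARNING: %s is marked '%s'; review the change to %s (%r)"
                % (title, status, field, new_value)
            )
    return warnings
```

Updates to unprotected or brand-new entries pass through silently, so a manager only has to review the small number of changes that could damage verified data.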

Hartill, who manages the WWW interface to the database, and Col Needham, overall coordinator and creator of the Internet Movie Database, write:

The database is maintained by a core team of 15 people at the moment. For most of them, it's a case of giving up a few hours of their spare time each week and being able to quickly respond to a problem that needs fixing. In the first four months of 1995 about 1,500 different people contributed new data to the system--some just a few lines and others tens of thousand additions.

Hartill adds, "A few of the managers spend 10 or more hours a week, just trying to stay up to date with submissions, and then there's Col, who probably spends more time working on the database than the rest of us put together."

Technical Notes

The voting system (users get to vote on the ratings of any movie they look up) is coordinated independently of the different mirror sites because it actually predates them. Votes are e-mailed to a vote collection address, and updates are processed weekly and distributed to the mirror sites. Mirroring between the sites is done by simply FTPing the updates.

The original interface with the movie database was developed in UNIX shell script language and over the years was converted to faster C code. When the WWW interface was added, the data were rearranged for optimal speed for the most common types of query, but that meant losing the wider range of query types possible via the traditional interface. The database was also indexed differently to save space and allow for quicker response by the Web server. Special programs were written in C and Perl for different types of queries.

The mail server runs on a Pentium machine running SCO (Santa Cruz Operation, Inc.) UNIX; the main FTP site is a NeXT machine. Cardiff's WWW server runs on a Sparc 5 donated by Sun Microsystems Ltd. Until January 1995 Cardiff's Web server ran on an overused Sparc 10, and serving the Web brought that machine to its knees at busy periods.

All time, computers, and effort are donated. The mail server and the server that collects contributions to the database are provided courtesy of the PC Users Group in England.

From Cardiff alone the database gets more than 60,000 connections a day. The combined number of requests to the database via WWW exceeds 150,000 a day, and with new U.S.-based mirror sites in the pipeline, these figures are sure to grow dramatically. For current statistics at the Cardiff Web site see <http://www.cm.cf.ac.uk/htbin/Graphs/todays_stats>.

Advice

Rob Hartill: Don't be put off by the size of the Internet. It's a friendly place with plenty of people willing to give good advice or offer help. . . . Don't be fooled by the silky smooth duck gliding across the pond--there's a hell of a lot of kicking going on underneath. Things don't always run smoothly.

Col Needham: Start small and keep at it. . . . Be wary of an endless series of e-mail questions from users who can't be bothered to read the FAQ.

Kaleidospace

<http://kspace.com/>, <gopher://gopher.kspace.com>, <ftp://ftp.kspace.com>

This arts-oriented Web, Gopher, and FTP site allows artists in all media to showcase and even sell their work on the Internet. Artists pay a set-up fee of $100 and then either a monthly fee of $25 or a 10 percent commission on their sales. Artists can then announce future exhibits and installations, list their biographies, and allow sample downloads of their music, writings, photographs, paintings, or rotating video clips of their sculpture or ceramics. Also, any Kaleidospace artist may set up a time for online meetings using the Kaleidospeak chat room.
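
To compare the two payment options, here is a small worked example in Python. The fee figures come from the paragraph above; the assumptions that sales are the same every month and that the commission applies to gross sales are mine, as is the first_year_cost function.

```python
SETUP_FEE = 100          # dollars, one time
MONTHLY_FEE = 25         # dollars per month on the flat-fee plan
COMMISSION_PERCENT = 10  # percent of sales on the commission plan


def first_year_cost(monthly_sales, plan):
    """Return an artist's total first-year cost in dollars.

    Assumes `monthly_sales` is constant for all 12 months and that the
    commission is charged on gross sales.
    """
    if plan == "monthly":
        recurring = MONTHLY_FEE * 12
    elif plan == "commission":
        recurring = monthly_sales * COMMISSION_PERCENT * 12 / 100
    else:
        raise ValueError("plan must be 'monthly' or 'commission'")
    return SETUP_FEE + recurring
```

Under these assumptions the two plans cost the same when an artist sells $250 a month, since 10 percent of $250 is $25. Below that level the commission is cheaper; above it the flat monthly fee is.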

Before Kaleidospace had a secure online transaction system running, its procedure was to take orders via online forms or e-mail and then call back to confirm the order and take credit card information. Now Kaleidospace is running Netscape's Commerce Server for all the artists, so it can take secure orders directly over the Internet. The site also supports phone, fax, and e-mail orders and maintains the call-back option for unsecured orders.

Jeannie Novak, Kaleidospace founder and principal programmer and designer, started the Santa Monica, California-based company in January 1994 because, as an independent musician, she wanted an alternative way to distribute her album of classical and acoustic piano music. The Internet, particularly WWW, seemed ideal and she thought other artists would pay to join her. As of July 1995 Kaleidospace had Web pages for more than 200 artists and had moved into Web consulting with nearly 100 commercial clients.

Novak said, "We do ALL the HTML for the artists. We also do all the scanning/digitization, though some artists provide us with predigitized material. This way we create a consistent interface for the users and are able to support artists who don't have computer access (about 40 percent)." Pete Markiewicz, who does general support at Kaleidospace, adds that Novak writes all the HTML by hand but is working in Perl to create auto-entry forms specifically targeted for Kaleidospace Web page formats. They are also setting these up so that client companies can add material to their own servers easily.

Technical Notes

The primary Web server runs on an accelerated Sun Sparc 2 with 64MB of RAM running a modified version of NCSA HTTPD. It was modified to enhance security.

The Netscape Commerce Server runs on a Pentium 90 with 64MB of RAM. The combined Gopher-FTP server was run off a Macintosh Quadra with 24MB of RAM for a year, but Kaleidospace is moving it to a UNIX machine. The Quadra's performance was fine, but the Macintosh is more useful as a development platform. Novak and Markiewicz claimed that Web site development is easier on Macintoshes, so they build sites there and transfer them to UNIX later.

They frequently get several thousand individual users a day (with "hits" in the hundreds of thousands).

The chat room is done with WebChat server software. <http://www.irsociety.com/wbs.html>

Kaleidospace runs a virtual host system for its commercial clients to allow each to have its own host name.

Advice

Jeannie Novak advises:

The main way to cope with high traffic is to increase RAM to 64 or 128 MB. If access becomes higher in the future, we will probably create a single "virtual" server from several machines.

Style and design are much more important than fancy HTML tricks. Make sure you cross-promote and cultivate relationships with other sites on the Internet. And be prepared to work very hard.

Make sure you have focused, real content not available elsewhere. Too many "cybermalls" put up anything they can get money for rather than offering variety within a central theme. The resulting "hodgepodge.coms" are jarring and ultimately irritating to the users. On our own site we work with artists, musicians, writers, CD-ROM authors, performers, filmmakers, animators, software developers, and artisans--but they're ALL independent [meaning not under contract]. Real content on the Internet is always valuable.

Los Angeles Murals Home Page

<http://latino.sscnet.ucla.edu/murals/>

This Web site is attempting to document and display some of the 1,500 murals in Los Angeles. It is a combined effort of the Social and Public Arts Resource Center, the Mural Conservancy of Los Angeles, UCLA Chicano Studies Research Center, Social Sciences Computing at UCLA, and Robin Dunitz, who is contributing the contents of her book, Street Gallery: A Guide to 1,000 L.A. Murals, to the Web site.

Plans include providing a clickable map of Los Angeles, with links to murals by location, artists' biographies, and lists of murals by artist, sponsor, neighborhood, and subject. It will also include related documents by artists and art organizations.

Technical Notes

This site runs on a Sun Sparcstation 10 with 36MB of RAM, running NCSA's HTTPD server.

MicroSemanario

<gopher://gopher.uba.ar/11/microsem>, <ftp://ftp.informatik.uni-muenchen.de/local/rec/argentina/micros>, and <http://www.informatik.uni-muenchen.de/rec/argentina>

MicroSemanario (semanario means weekly in Spanish) was born to serve the many Argentine expatriates around the globe. Because of the country's economic woes, many Argentine scholars and scientists left during the last 30 years to continue their careers in the United States and Europe, with smaller numbers in Brazil, Israel, and Australia. In 1989 and 1990 this situation sharply worsened because of hyperinflation. Because detailed news about Argentina can be difficult to obtain outside of Argentina, especially in Spanish, the School of Sciences (Facultad de Ciencias Exactas) of the University of Buenos Aires decided in November 1990 to start sending out Argentine news by e-mail over the Internet.

This free e-mail newsweekly is divided into sections dealing with politics, economics, society, culture, education, science, and sports. According to Guillermo Gimenez de Castro, originator and director of MicroSemanario, "MicroSemanario is a weekly summary, and as such we take our sources from all the forms of press available: oral, written, and televised. We select the news that seems relevant and write summaries of it. Whenever necessary, we indicate the source. But we DO NOT transcribe the news. We also send news of the academic world and job opportunities, sports scores, and something of the cultural life."

MicroSemanario is distributed through two mailing lists totalling more than 3,000 subscribers and has an estimated readership, including family members and friends of the recipients, of 6,000 to 9,000. Among their subscribers are American and British high schools and Latin American studies departments at universities. MicroSemanario is also read by non-Argentines, who usually have had some relation with Argentina. The size of the weekly edition has grown to roughly 60K so they split it into two parts to avoid e-mail problems.

MicroSemanario gets some support from the university but relies to a large extent on volunteer labor for the writing and editing of the news summaries. The director's job is also unpaid. Current and back issues (more than 200) are archived on the University of Buenos Aires's Gopher site. There are also Web and FTP archives in Germany. (See URLs at beginning of entry.)

According to an article for Internet News by Ricardo Bravo (Centro de Comunicacion Cientifica-Universidad de Buenos Aires) and director Gimenez de Castro, "The main aim of MicroSemanario is not that of journalism, but to provide fellow-countrymen something to stay in touch with their society, alleviating in part the feeling of losing links to Argentina. Moreover, Micro usually provides information on living conditions in Argentina, and attends--as far as possible--to questions of readers wishing to come back."

To subscribe to MicroSemanario send e-mail to majordomo@ccc.uba.ar and in the body of the message put subscribe micro.

Technical Notes

MicroSemanario is sent out entirely in low-end ASCII (32-127) to avoid e-mail translation problems. Accents are omitted, which they say generally causes little confusion. Special characters are replaced with #, and though it doesn't look good, it does cross e-mail systems consistently.
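As a rough illustration of the conversion described above, the following Python sketch drops accents and replaces any remaining character outside printable ASCII with #. The function name to_plain_ascii and the use of Unicode decomposition are assumptions made for this example; the text does not say what tools MicroSemanario's editors actually use.

```python
import unicodedata


def to_plain_ascii(text: str) -> str:
    """Return text restricted to plain 7-bit ASCII.

    Accented letters lose their accents; any other character that is
    not printable ASCII becomes '#'. Newlines are kept so that the
    message layout survives. (Hypothetical helper, for illustration.)
    """
    # NFD splits an accented letter into a base letter plus
    # separate combining marks, e.g. "ó" -> "o" + acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # omit the accent itself
        if ch == "\n" or 32 <= ord(ch) <= 126:
            out.append(ch)  # printable ASCII passes through unchanged
        else:
            out.append("#")  # special character with no plain equivalent
    return "".join(out)


if __name__ == "__main__":
    print(to_plain_ascii("Educación pública"))
```

The sketch stops at code 126 rather than 127 because 127 is a control character (DEL), not a printable one.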

The University of Buenos Aires's Gopher and WWW server runs on a Sparcstation.

Midnight Special Bookstore

<http://msbooks.com/msbooks>

Midnight Special is an independent bookstore in Santa Monica, California, that has its own WWW site. The store rents Web space for $75 per month and is busy converting store sections into Web pages with resident experts, e-mail discussions, book reviews, and employee recommendations. Midnight Special is a social and cultural bookstore that presents "books and ideas to change the world." The store's motto is taken from Bertolt Brecht: "Hungry man, reach for the book: it is a weapon."

Midnight Special's Web site advertises its calendar of events and offers weekly lists of best-sellers. It also sells videos of its regularly scheduled author readings and interviews. Its inventory of 90,000 titles is searchable by title, author, and ISBN, and the search results lead to order forms. The bookstore also sells political t-shirts and posters. The site offers opinions and solicits them from users (just fill out an online form). And if you can't think of anything to say, check out the list of issues, with links to related books.

Tony Cappelli, the bookstore's Webmaster, has even started a Web column called "Tools of Dissent" in which he comments on the latest Internet technology and how it might be used for social and political action. The bookstore turned to the Internet to expand its market after several book superstores moved in down the block from Midnight Special. As Cappelli says, Midnight Special is "using technology in some unbookstore-like ways."

Technical Notes

Midnight Special's WWW server is running NCSA HTTPD on a Pentium computer with 24MB of RAM and BSD UNIX. The Web provider service writes Midnight Special's CGI scripts. Approximately 80 different IP addresses (distinct machines) hit Midnight Special's WWW site, accumulating about 200 hits per day.
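Counts like these are typically derived from the Web server's access log. The Python sketch below assumes the log is in Common Log Format (host, identity, user, bracketed timestamp, request, status, size). The function name and the sample host names are invented for illustration and do not come from Midnight Special's provider.

```python
from collections import defaultdict


def summarize_access_log(lines):
    """Return {day: (hits, distinct_hosts)} for Common Log Format lines."""
    hits = defaultdict(int)
    hosts = defaultdict(set)
    for line in lines:
        parts = line.split()
        # Expect: host ident user [dd/Mon/yyyy:hh:mm:ss zone] "request" ...
        if len(parts) < 4 or not parts[3].startswith("["):
            continue  # skip lines that do not look like log entries
        host = parts[0]
        day = parts[3][1:].split(":", 1)[0]  # "[10/Jul/1995:09:15:02" -> "10/Jul/1995"
        hits[day] += 1
        hosts[day].add(host)
    return {day: (hits[day], len(hosts[day])) for day in hits}


if __name__ == "__main__":
    sample = [
        'a.example.edu - - [10/Jul/1995:09:15:02 -0700] "GET /msbooks HTTP/1.0" 200 2048',
        'a.example.edu - - [10/Jul/1995:09:15:09 -0700] "GET /msbooks/logo.gif HTTP/1.0" 200 5120',
        'b.example.org - - [10/Jul/1995:11:40:51 -0700] "GET /msbooks HTTP/1.0" 200 2048',
    ]
    for day, (hit_count, host_count) in summarize_access_log(sample).items():
        print(day, hit_count, "hits from", host_count, "hosts")
```

A distinct host is only an approximation of a distinct reader, since several people may share one machine or gateway.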

Monster Board

<http://www.monster.com>

The Monster Board is a career development site and a center for human resources information on the World-Wide Web. Anyone may post a résumé to the "Resume On-Ramp" for free. In May 1995 the Monster Board had more than 10,000 résumés and expected that number to skyrocket with its increased investment in equipment, advertising, and a faster Internet feed.

This combination Web-WAIS site is a branch of ADION Human Resource Communications, a New England recruiting and advertising company. Job searches and browsing human resource information are free. Job listings cost money, and companies can pay more for more elaborate Web spaces. Companies are also charged for the right to review the résumé database. These aren't traditional résumés, however. Posting a résumé is a matter of filling in online forms, with questions that include all the traditional résumé subjects like education, work experience, technical skills, willingness to move, and salary requirements. The data go right into database fields that feed the search programs. There is even room for a brief cover letter message. Once users fill out the form, they receive a password good for updating at any time during the next 12 months. After that their résumés are dropped from the database, which helps to ensure a certain currency. Whenever users update their résumé, the expiration date is extended for another 12 months.
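The 12-month expiration rule described above can be sketched in a few lines of Python. Everything here (the Resume class, the function names, the field names) is hypothetical and chosen for illustration; the Monster Board's actual database and form-handling code are not described in enough detail to reproduce.

```python
from dataclasses import dataclass
from datetime import date


def one_year_after(day: date) -> date:
    """Return the same calendar day twelve months later."""
    try:
        return day.replace(year=day.year + 1)
    except ValueError:
        # February 29 has no counterpart in the following year.
        return day.replace(year=day.year + 1, day=28)


@dataclass
class Resume:
    fields: dict   # form answers, e.g. education, skills, salary
    expires: date  # date after which the entry is dropped


def post_resume(fields: dict, today: date) -> Resume:
    """Store the form answers with a 12-month lifetime."""
    return Resume(fields=dict(fields), expires=one_year_after(today))


def update_resume(resume: Resume, changes: dict, today: date) -> None:
    """Apply changes and extend the lifetime by another 12 months."""
    resume.fields.update(changes)
    resume.expires = one_year_after(today)


def drop_expired(resumes: list, today: date) -> list:
    """Keep only the entries that have not yet expired."""
    return [r for r in resumes if r.expires > today]
```

Storing each answer as a separate field, rather than as one block of text, is what lets a search program filter on items such as salary requirements or willingness to move.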

The Monster Board provides companies with two separate résumé-related services:

The colorful monster images, meant to symbolize big and unique ideas, were actually in use by ADION in other projects but carried over well to the Web. ADION also claims that the monsters represent the characteristics its subscriber companies want in their employees: creativity, innovation, communication, energy, and positive results.

John Kirby, project manager for the Monster Board, claims, "The Monster Board will literally change the face of human resources because jobs will be posted to the world, and you'll be collecting applicants from all over the world as well. Applicants can take advantage of the searching capabilities to narrow their focus considerably."

Technical Notes

The Monster Board runs on two Sun Sparc 20s (it started with a Sparc 2) running CERN HTTPD and a custom WAIS gateway based on freeware. It sees 10,000 connections per day and has a 10-megabit Internet feed, upgraded from 56K.

Advice

Kirby says, "Make sure your site is unique. Make sure your idea is unique. If it's commercial, you should question hard if someone is going to buy it online. If it's a side business, don't rely on the Web to make your business for you unless you have a well-defined niche. It will certainly add to your business, though."

1990 Census Lookup

<http://cedr.lbl.gov/cdrom/doc/cdrom.html> and <http://cedr.lbl.gov/cdrom/lookup>

The Lawrence Berkeley Laboratory and University of California at Berkeley are working together to create the world's largest online database of federal government statistics. Lookup is an experimental WWW server for retrieving data from 1990 U.S. Census summary tape files. Lookup uses the University of California's CD-ROM Information System. Lookup source code is publicly available, and development of compatible modules is strongly encouraged.

Although not all levels of census data are available, the Web site offers cross-tabulations of answers to various combinations of census questions by national, state, and urbanized metropolitan regions. The lab's goal is to help others set up similar servers with the same or different data. Lookup is receiving 80,000 URL requests (from 7,000 users) each month (as of April 1995). Usage is doubling every month.

OncoLink

<http://cancer.med.upenn.edu>, <gopher://cancer.med.upenn.edu>

OncoLink is a patient-physician cancer information resource sponsored by the University of Pennsylvania Medical Center and available via Gopher, WWW, and telnet. OncoLink attempts to provide one-stop shopping for the patient, family member, health care provider, researcher, or browser searching for cancer-related information. OncoLink was awarded the 1994 International Best of the Web Award for best professional service. Since its inauguration on March 7, 1994, OncoLink has been accessed 350,000 times from more than 75 countries and averages 4,000 accesses a day.

According to a paper presented to the American Medical Informatics Association in the fall of 1994, the OncoLink Web and Gopher sites were originally ordered around the academic specialties of the faculty contributors. This made sense to the faculty, but users (primarily laypeople who either knew someone with cancer or had it themselves) soon e-mailed urging a different arrangement. They requested, and the center quickly added, a menu of various cancer types, so the layperson can easily find all the relevant material for a particular cancer. This is one example of the need to build feedback mechanisms into even the most thoroughly planned servers. The paper also discusses the center's study of server logs, particularly in terms of what links were being followed and what patterns they revealed. The paper is available at <http://cancer.med.upenn.edu/manuscripts/amia.html>.

Statistical reports through April 30, 1995, show that 83.4 percent of all transactions are from World-Wide Web clients and 16.6 percent from Gopher clients. Approximately one-third of the WWW clients accessing OncoLink appear to be text-only clients. (This might be surprising to those who think that everyone is using the WWW for the graphics.)

OncoLink also has found that whether a link is text or a small image makes a difference in how often that link is picked. The image links appear to be more popular than text links.

Technical Notes

OncoLink runs on a DECstation 5000/25 with 40MB of RAM. The model 25 is roughly a 20 MIPS (millions of instructions per second) machine. OncoLink's Gopher server software is GN, and its WWW server runs NCSA HTTPD.

Advice

According to Dr. Joel W. Goldwein, co-editor in chief of OncoLink, you should be sure to

  1. Clarify your mission.
  2. Develop a strong infrastructure.
  3. Select good people to help.
  4. Turn your e-mail off after 6 p.m. and marry a saint.

On-line Books Page

<http://www.cs.cmu.edu/Web/books.html>

The On-line Books Page Web site, based at Carnegie Mellon University in Pittsburgh, is an index of hundreds of online books. It also points to some common repositories of online books and other documents, including the Project Gutenberg texts, as well as specialty or foreign-language repositories and book catalogs and retailers.

According to site administrator John Ockerbloom,

The On-Line Books Page started in 1993, when the Web was still a novelty in most places. . . . [A staff member] had made up some nice HTML versions of some Project Gutenberg texts, and as part of our initial departmental Web, I made up a page that pointed to his texts and included pointers to a few major book repositories like Gutenberg. Later I noticed that there was no overall list of titles and that this would be useful to avoid having to check each archive individually. So I expanded the listings and eventually added other features like a search capability.

Special exhibits include Banned Books On-line and Celebration of Women Writers.

Technical Notes

The Web servers are Sparcstations running NCSA HTTPD 1.3 (with local modifications). The pages are served from AFS (Andrew File System), which Carnegie Mellon uses as its campuswide filing system.

The On-Line Books Page only indexes the books; all texts are stored elsewhere, so it uses relatively little space.

Advice

Ockerbloom lists the following steps for publishing information:

  1. See what's already out there on the topic. Decide whether you should work on putting up new material, work on indexing or explaining existing material, move into a different or more specialized niche, or some combination of these things.
  2. Find a place to publish your pages. If you can't do it through your school or workplace, many public access providers now give people the opportunity to publish Web pages. Find one that is reasonable (in price and in administration), seems to have an easily reachable Web server, and can provide you the space you need. Note that if you're indexing information, like us, rather than supplying a lot of new material yourself, you don't necessarily need a lot of space.
  3. Construct your pages. Make them well organized and easy for people to find the content they want. You don't need fancy graphics or markup; those often get in the way more than they help.
  4. Tell people and indexes like Yahoo about your pages and commit to maintaining your pages. Respond to e-mail in a timely fashion, and leave some indicator on your pages as to how often they're updated. Word of mouth will do much of the rest; once word gets around people will add their own links to you and suggest additional information you can put on your page.

Palo Alto Real Estate

<http://none.coolware.com/real/realestate.html>

This Web site offers some real estate listings in the Palo Alto area of northern California as well as a map and information about the community. Agents and their telephone numbers are listed; users are asked to tell agents they saw their listings on the Internet. Each house gets its own page with color photos.

Technical Notes

Palo Alto Real Estate runs its Web site from a Sun Sparcstation. The site is accessed about 70,000 times a week.

Advice

Keith Cooley of Coolware, the sponsor of the Palo Alto site, says, "Spend two hours learning HTML and then do it--break the mold and do what feels right."

PhoNETic

<http://www.soc.qc.edu/phonetic/>

PhoNETic is a Web service that converts phone numbers to letters and vice versa. That's all it does, but it gets 3,000 to 6,000 hits per day. Either people just have to see it to believe it, or converting phone numbers to letters is more useful than you might think. This is a textbook case of using CGI scripts (or programs) behind a Web server to perform a task for whoever wants it.
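To give a feel for the conversion itself, here is a minimal Python sketch of both directions. It is not PhoNETic's code. The keypad layout used here, with Q on 7 and Z on 9, and the function names are assumptions for the example.

```python
from itertools import product

# Letters printed on each telephone key (assumed layout).
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Reverse lookup: letter -> digit.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def letters_to_digits(text: str) -> str:
    """Replace each letter with its key; leave other characters alone."""
    return "".join(LETTER_TO_DIGIT.get(ch.upper(), ch) for ch in text)


def digits_to_letters(number: str) -> list:
    """List every letter spelling of a number.

    Characters with no letters (0, 1, hyphens) are kept as they are.
    The list grows quickly: a seven-digit number can have thousands
    of spellings.
    """
    choices = [KEYPAD.get(ch, ch) for ch in number]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(letters_to_digits("1-800-FLOWERS"))  # prints 1-800-3569377
    print(digits_to_letters("23"))
```

Behind a Web server, a CGI program would read the number from the submitted form, call a routine like one of these, and write the result back as an HTML page.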

PhoNETic was designed by Nick Sklavounakis and Nikolay Uglov, LAN/UNIX system administrators of the Department of Sociology at Queens College of the City University of New York. Sklavounakis recounts the origin of PhoNETic: "It was a rare (very rare) day when my assistant and I didn't have much to do. I thought it would be a good idea to experiment with writing CGI scripts and implementing interactive forms on our Web pages. Our little experiment turned out to be quite popular! I am still amazed at the type of attention it has gained."

Technical Notes

PhoNETic runs NCSA HTTPD from a DEC Alpha 3000/300X (175MHz) with 96MB of RAM. It shares Queens College's partial T-1 link to the Internet. The CGI scripts were written in DEC OSF/1-C (standard UNIX C). The graphics for the page were created on a Macintosh Centris 650 using Photoshop 3.0 and a Connectix QuickCam. The day PhoNETic was "Cool Site of the Day" (awarded by Glen Davis at InfiNet), it had more than 70,000 hits. Its server handled the traffic with no trouble.

Advice

Nikolay Uglov: Make sure your server can handle the load in case of a huge success; be friends with your network administrator. Write the CGI script first [and the] HTML script second.

Nick Sklavounakis: I think most important is to determine how much activity you think your server must deal with on a daily basis. Based on that, the appropriate hardware and software can be determined, as well as the type of link to the Internet you will have. At this stage, I wouldn't go any less than a T-1 if you plan to have an active server.

The users will enjoy quick response time, and the server will have a lower overall load (people can connect and disconnect quicker when data is transferred faster). As for hardware, there are many options available now, and for newcomers I would probably recommend looking at the alternatives to operating in a UNIX environment, like using a Macintosh Internet Server, Windows NT, or OS/2.

Playboy

<http://www.playboy.com/>

Playboy Magazine's Web site home page, with its faint background pattern of the Playboy bunny logo, includes excerpts from the magazine, cartoons, and Playmate images, links to Playboy news and announcements, the full text of some interviews from Playboy archives as well as the Playboy Advisor FAQ file. In addition, the Web site accepts online questions to the Playboy Advisor and is conducting an online search for a photo feature of the "Girls of the Internet."

According to Eileen Kent, who started and runs Playboy's Web site, the magazine originally got an Internet connection in order to see what was being said about it on the Playboy newsgroup and mailing list (alt.mag.playboy and playboy-request@lovesexy.com). She says that as soon as she saw WWW she knew it was a natural for Playboy. But just doing an electronic version of the magazine never appealed to her, although Playboy updates the Web site simultaneously with the newsstand version. Playboy has had a "phenomenal response with a great deal of positive feedback from users," Kent says. When Hugh Hefner, founder of Playboy, saw a collection of e-mail that came in from the Web site, he said it reminded him of the letters readers sent when the magazine first started.

Kent says that to her, "The best, the really good stuff [on the Internet] is the homegrown stuff." Mirsky's Worst of the Net <http://turnpike.net/mirsky/Worst.html> is one site she finds particularly hilarious. "WWW is the greatest thing since Gutenberg. As a bottom-up publishing phenomenon it's excellent," says Kent. "This is a shot at getting your stuff out there to the world without having a million dollars."

One concern she has is that Net users accord copyrighted material insufficient respect. She's convinced that "every year a new crop of freshmen come in with Internet accounts, and they aren't getting educated about copyright issues as they should be. They don't seem to be aware that there are laws (in the U.S.) that make application software infringement a felony." Playboy often finds illegal archives of Playboy images and contacts site administrators about these copyright infringements, Kent said.

Playboy has been using Netscape Server and is moving to Netscape Commerce Server. Playboy created its HTML files in-house and gets more than 800,000 hits per day--"and not just downloading the Playmate images," Kent says.

Advice

According to Kent, "Nobody should limit the Net to just thinking of it as a marketing phenomenon. The clutter is terrible. There is too much stuff happening on most Web pages. You need a purity of design that makes it comfortable for the user. There's lots of publications and Web sites [that] aren't used to having to listen. They should be smart enough to listen to their users. The users will tell you what they want."

Project Gutenberg

<http://jg.cso.uiuc.edu/pg/pg_home.html>, <ftp://uiarchive.cso.uiuc.edu/pub/etext/gutenberg>, and <ftp://ftp.etext.org>

The philosophy of Project Gutenberg is to make information, books, and other materials available to the general public in forms that most computers, programs, and people can easily read, use, quote, and search. That means plain vanilla ASCII text, and the folks at Gutenberg are proud of it. They reason that years from now, when programs and systems have changed, plain old ASCII, readable on Macintoshes, PCs, and UNIX systems, will still be usable, if not any prettier than it is today.

Project Gutenberg has been publishing works in the public domain with the help of volunteers since 1971 when Michael Hart, professor of electronic text and executive director of Project Gutenberg at Illinois Benedictine College in Lisle, Illinois, decided to "earn" the equivalent of $100 million-worth of computer time he had been given on a mainframe at the University of Illinois. (See the University of Illinois's Web site above for the full story.) Hart figured there was no "normal computing" he could do that would be worth that amount of money. So he looked for something that computers could do that would have an equivalent value. His answer was the storage, retrieval, and searching of what is on library shelves.

Starting with the U.S. Declaration of Independence, Project Gutenberg has compiled 250 online books entirely through volunteer labor. The goal is 10,000 books by the year 2000. Project Gutenberg selects public domain books in the categories of light literature, heavy literature, and references.

Technical Notes

Project Gutenberg runs a modified version of the NCSA HTTPD server under Mach NextStep v3.0 on an original 68000 NeXT cube. The modifications were designed to provide a firewall that has serving capability similar to what is available in the CERN HTTPD.

The Web site enjoys about 13,000 hits a week, well distributed among all its pages, which is considered a good sign that people are actually reading them and not just browsing. Statistics about machine load are available at <http://jg.cso.uiuc.edu/jg.cso>.

Advice

John M. Koontz (aka Kahn), the system administrator, advises:

Provide good content. Contribute something to the Net soup, don't just add more water.

Courtesy. Consider people who use text browsers or have slow Net connections. Are all those large images REALLY necessary? Also keep in mind that Netscape, although neat, is still nonstandard.

Fresh links. If you provide links to other sites, be sure that they are current and operational. A link to a closed site is no fun. . . . Keep them [the links] to a minimum and keep them good.

Travels with Samantha by Philip Greenspun

<http://www-swiss.ai.mit.edu/samantha/>

Philip Greenspun is a graduate student in computer science at MIT who has a background in advertising photography and a knack for putting words together. In late 1993 he found the World-Wide Web and decided to put the story of his summer travels around the United States on the Internet as a "Web book."

In the literary tradition of John Steinbeck's Travels with Charley, Travels with Samantha is a highly readable chronicle of the places he saw and the people he met, as well as his moving reason for taking the trip. Travels with Samantha includes 250 excellently digitized photographs to go with the stories written on his Macintosh PowerBook (Samantha) while on the road. Travels with Samantha won a 1994 Best of the Web Award and attracts more than 1,000 readers a day.

Since Travels with Samantha, Greenspun has written of his travels to Berlin, Prague, and New Zealand and put up earlier writings on photography and computer science. He even started a legal defense fund for a friend's brother. Because there is no charge to read Travels with Samantha, he was surprised when he started making money. With 2,000 photos scattered among his online stories, he'd inadvertently become a stock photo agency. After several months he'd received five photo requests and one assignment and was about $5,000 richer. Now he's working on software to automate the maintenance of indexed photo archives.

Technical Notes

Greenspun also edits two online journals, Web Tools Review <http://webtools.com/wtr> and Photo Journal <http://photo.net/photo>. He uses the PBMTOOLS graphics package to do his graphic manipulation. His Web server runs on an HP 9000 series UNIX machine with a T3 connection to the backbone and ships out 1.5GB per day answering user requests.
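For a sense of scale, the daily volume can be converted into an average line rate. This short Python calculation assumes 1GB means 10**9 bytes and uses the nominal T-1 rate of 1.544 megabits per second for comparison. The variable and function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
T1_BITS_PER_SECOND = 1_544_000  # nominal T-1 line rate


def average_bits_per_second(bytes_per_day: float) -> float:
    """Spread a daily byte count evenly over 24 hours."""
    return bytes_per_day * 8 / SECONDS_PER_DAY


daily_bytes = 1.5e9  # 1.5GB, assuming 1GB = 10**9 bytes
average = average_bits_per_second(daily_bytes)
share_of_t1 = average / T1_BITS_PER_SECOND

if __name__ == "__main__":
    print(f"average rate: {average / 1000:.0f} kbit/s")
    print(f"share of a T-1: {share_of_t1:.0%}")
```

The average comes to roughly 139 kilobits per second, a small fraction of a T-1. A daily total says nothing about peak hours, though, when many readers request large photographs at once.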

Advice

Greenspun finds that "Web publishers with desktop scanners and 8-bit video boards are the biggest source of ugly images on the Internet today." He recommends using a 24-bit video card for any image work, Kodak PhotoCD as the minimum acceptable quality scan, and a T-1 connection as the minimum speed necessary for serving popular pages with many photographs.

Virtual Shareware Library

<http://www.fagg.uni-lj.si/SHASE/>

Descriptions of more than 100,000 pieces of software are available on the Internet, archived at FTP sites like SimTel, CICA, Hobbes, and SunSite. Typically, these archives are mirrored to reduce the load on the original site. The Virtual Shareware Library (VSL) acts as a catalog by using a search-and-retrieval mechanism (SHASE) behind a WWW server to provide a user-friendly way to select particular sites and then search those software descriptions.

Dr. Ziga Turk (pronounced Zheegah Toork), a member of the faculty of civil engineering at the University of Ljubljana, Slovenia, developed the service out of personal frustration. "The Internet opened up tons of shareware and related software," but he could not tell what was available and what wasn't. "Indexes of CICA and SimTel provided the most vital information," he recalls. "At about the same time I started the WWW server, I found out that Perl is a language everyone uses for CGI scripts. As an exercise in Perl and for my own use I wrote the search engine pointing to CICA and SimTel only."

It was rudimentary but proved popular. "I started to receive mail from all over the world, the great majority of which praised the service. The feedback, 5 to 20 thank you notes a day, was the greatest motivation for developing the service," Turk says. "Heavy usage was the motivation for more improvements. One remaining problem is to verify that the sites storing the files are really there."

Another, more serious problem was a legal matter that arose when some FTP site managers claimed their indexes should not be used in Turk's service without their explicit permission. That forced Turk to close his service for a week or so, until he and the site managers were able to reach a friendly agreement. "The lesson I learned was to ask for permission from the managers of all the other archives (most had been contacted before)," Turk said.

Technical Notes

In Ljubljana the VSL runs on an HP710 workstation running NCSA's HTTPD server. Mirrors run on other, usually better, machines.

When the VSL-related load on the server in Ljubljana approached 50,000 hits a day, users at the university found they had difficulty gaining access to other services run from the same server and to mirrors pumping the database each day, so now the server in Ljubljana does not allow access for the commercial and "unresolved" domains. A growing number of mirror servers are offering exactly the same service (three in North America and two in Europe).
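A restriction of this kind amounts to a simple test on the client's host name. The Python sketch below is one possible reading, not the Ljubljana server's configuration: it assumes "commercial" means names ending in .com and "unresolved" means clients for which only a numeric address is known. The function name is hypothetical.

```python
import ipaddress


def is_allowed(client_host: str) -> bool:
    """Decide whether a client may use the service.

    client_host is the name obtained by reverse lookup, or the bare
    numeric address when no name could be found.
    """
    try:
        ipaddress.ip_address(client_host)
    except ValueError:
        pass  # not a numeric address, so it is a resolved name
    else:
        return False  # "unresolved": no host name for this address
    # Refuse the commercial domain; everything else is let through.
    return not client_host.lower().rstrip(".").endswith(".com")


if __name__ == "__main__":
    for host in ("ftp.uni-lj.si", "shop.example.com", "192.0.2.7"):
        print(host, "allowed" if is_allowed(host) else "refused")
```

Refused users are not locked out of the catalog altogether, since the mirror servers offer the same service.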

Advice

Turk offers the following advice: "First impressions are important. When you announce [your server] make sure it works perfectly. Users will not come the second time if you disappoint them. If you write forms, test them with UNIX Mosaic. It's the most picky about the case of keywords."

