THE FORUM NEWS GATEWAY

README file for version 0.5

This is a beta release! Trust it a little.

The Forum News Gateway puts netnews on the World Wide Web. It's a server that talks to one or more news servers and serves news articles as HTML pages to connecting web clients. It lets users subscribe to groups and keeps track of what's been read, like a traditional newsreader. One advantage is that the news articles themselves can be written in HTML.

The gateway is written in perl. It will probably work only on a unix machine.

The software is copyrighted by Swarthmore College, "all rights reserved." We're allowing free use with no restrictions.

This document explains how to set up and run the gateway. The sections are:

  1. Disclaimers and Acknowledgements
  2. Install, Replacing an Old Version
  3. Install for the First Time
  4. Start Up the Gateway
  5. Try Out the Gateway
  6. Shut Down the Gateway
  7. Write Your Own Welcome Page
  8. Restricting Access
  9. Administration Features
  10. URLs
  11. All Those Data Files
  12. Maintenance and Troubleshooting; Known Bugs

1. Disclaimers and Acknowledgements

The Forum News Gateway used to be called The Perly Gateway, but we changed the name for several reasons. Our new name is less catchy but we hope more informative.

The gateway is a bit slow. It's adequate for modest workgroups, but it can't handle a large number of simultaneous users. Of course, it all depends on how fast your computer is.

In a fit of unaccustomed creativity, Helen Plotkin and Asa Packer thought up the Forum News Gateway. It was written by Jay Scott and Asa Packer. We work at The Geometry Forum, a project in mathematics and education, and we wrote this gateway as a tool for our own project. To find out about our main purpose, see

http://forum.swarthmore.edu

We're eager for feedback. Send bug reports, comments, problems, suggestions, expressions of stunned adoration, and kitchen sinks to me, jay@forum.swarthmore.edu.

Oodles of thanks to our courageous beta testers. Many people have been helpful, but Jean-Philippe Martin-Flatin <syj@ecmwf.co.uk> came up with particularly good suggestions. Mike Baptiste <baptiste@bnr.ca> showed distinguishing valor. The Forum Beta Tester of the Month is Ken Stewart <kens@slick.whoi.com>, who went beyond the call of duty.

2. Install, Replacing an Old Version

If you have an older version of the Forum News Gateway and you want to install the new version, follow the instructions here. If you do not already have the Forum News Gateway, and you're installing it for the first time, skip this section and follow the instructions in the next section.

  1. Unpack the distribution into a fresh directory. This is both easier and safer than trying to install over your existing copy (see the warning below, in step 4).
  2. Edit the "config.pl" file. You can't copy your old one, because the new one may have more stuff in it, but you can read through the old one and make the appropriate changes to the new one. Later, when you've made sure it's working, you can check out any additions to "config.pl" and see whether you want to take advantage of any of them.
  3. If you have edited the welcome page or other files in the HTMLFiles directory, or added new files, copy over the edited or added files from your old version into the new version's HTMLFiles directory. The new version may have more files in the HTMLFiles directory (for example, version 0.4 includes GIF buttons), so you'll have to be careful not to disturb the new files.
  4. To keep user accounts, copy the UserFiles directory from your old version of the Forum News Gateway into the new one. WARNING: the user cache in version 0.5 improves efficiency, but it gives the gateway a greater chance to destroy user account information--especially if there's a bug in the new cache code. Keep the original UserFiles directory around until you're confident the new version is stable on your system.
  5. Fire it up, perhaps on a different port if the old one is still running, and test it out. If all is well, you're ready to switch over for real.
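
Step 4 can be sketched in shell as follows. The function name copy_user_files and the paths are made up for this example. It copies rather than moves, so the old UserFiles directory stays behind as the backup the warning asks for, and it refuses to overwrite anything already in the new installation. If your fresh distribution already contains a UserFiles directory, you would have to deal with that by hand first.

```shell
# Hypothetical helper: copy UserFiles from an old install into a new one.
# Usage: copy_user_files /path/to/old-gateway /path/to/new-gateway
copy_user_files() {
    old=$1
    new=$2
    if [ ! -d "$old/UserFiles" ]; then
        echo "no UserFiles directory in $old" >&2
        return 1
    fi
    if [ -e "$new/UserFiles" ]; then
        echo "$new/UserFiles already exists; not overwriting" >&2
        return 1
    fi
    # -R copies the whole tree, -p preserves modes and timestamps.
    # The original directory is left untouched.
    cp -Rp "$old/UserFiles" "$new/UserFiles"
}
```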

3. Install for the First Time

    The Forum News Gateway is a perl program. The first thing you need to run it is perl. You want perl4; it doesn't work with the new perl5. It may help if you know perl yourself, but this is not essential. Your site probably already has perl, but if not, look at

    http://www.metronet.com/1/perlinfo

    Some testers have reported problems with versions of perl4 earlier than patchlevel 36. It's best to use the latest version. To find out what version of perl you have, type

    perl -version

    If it hasn't already been done, you have to convert your system headers to .ph form before the gateway will run. The h2ph program in the perl distribution handles this.
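
The usual h2ph invocation, as given in perl's own h2ph documentation, is run from the system include directory. Here is a minimal sketch of it. The wrapper name convert_headers and the H2PH override are inventions for this example, and you will normally need permission to write into the perl library directory. Check the manual page that came with your perl4 before relying on it.

```shell
# Convert the system headers to .ph form, following the h2ph manual page:
#     cd /usr/include; h2ph * sys/*
# convert_headers is a made-up wrapper name. H2PH lets you name a specific
# h2ph program (an assumption of this sketch, not a gateway feature).
convert_headers() {
    include_dir=${1:-/usr/include}
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$include_dir" && "${H2PH:-h2ph}" * sys/* )
}
```

Typical use would be just "convert_headers", run by whoever administers perl on your machine.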

    The second thing you need is the Forum News Gateway distribution. It unpacks into a few directories and a collection of perl files.

    Check the file "config.pl". You need to make at least one change, and it may help to look over the other stuff in the file.

4. Start Up the Gateway

    The gateway runs with user permissions. Don't run it with root permissions--it could be a security risk, if there is a bug in the gateway. You may want to set up a separate user account to be the Forum News Gateway account.

    Decide what port you want the gateway to run on. If you choose port 8888, then in unix you might start the gateway by typing

    perl httpd.pl 8888&

    The gateway will print some messages as it starts up. The first time you run it, check for failure messages. If your news server info has a typo, the gateway will print a message and continue, so you have to watch it. In the future, you'll probably want to direct the messages to a log file; a log file is extremely valuable for tracing problems and reporting bugs. Note that the gateway directs messages to standard error, not to standard output.
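
Because the messages go to standard error, a plain ">" redirection captures nothing. Here is a sketch of starting a command in the background with standard error appended to a log file. The function name start_logged and the file name gateway.log are arbitrary choices for this example.

```shell
# Run a command in the background with standard error appended to a log file.
# Usage: start_logged gateway.log perl httpd.pl 8888
start_logged() {
    log=$1
    shift
    # 2>> appends standard error to the log; standard output is left alone.
    "$@" 2>> "$log" &
    echo "started, pid $!"
}
```

Typed directly, the same thing is "perl httpd.pl 8888 2>> gateway.log &". Appending (rather than overwriting) keeps the messages from earlier runs, which helps when you report a bug.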

    On some systems, you may have to fiddle with the header (.ph) files to get it to run. We only require one, "sys/socket.ph" in "common.ph". If you have to add more, "common.ph" is a good place to put them (but you'll have to remember to make the change every time you install a new version).

    If you get error -4, "Couldn't make directory for...", then possibly the gateway does not have a directory permission that it needs. In unix, it needs at least "rx" permission on every directory from "/" all the way down to its own working directory (the gateway uses perl -e and -d operators to check directories as it moves down the tree). One user solved this problem by chowning the gateway's directory to the same user that owned all the directories above, and running it under that user.
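
To find the directory that is causing trouble, you can walk up the tree yourself. The sketch below prints every directory between the one you name and "/" that the current user cannot both read and search. The name check_dir_chain is made up for this example. Run it as the account the gateway runs under, since the answer depends on who is asking (root passes nearly every such test).

```shell
# Print each directory, from the given one up to /, that is missing,
# unreadable, or unsearchable for the current user ("rx" permission).
# check_dir_chain is a made-up name for this sketch.
check_dir_chain() {
    case $1 in
        /*) dir=$1 ;;
        *)  dir=$PWD/$1 ;;
    esac
    status=0
    while :; do
        if [ ! -d "$dir" ] || [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
            echo "problem: $dir"
            status=1
        fi
        [ "$dir" = "/" ] && break
        dir=$(dirname "$dir")
    done
    return "$status"
}
```

For example, "check_dir_chain ." run from the gateway's working directory prints nothing and returns success when every directory on the way up is usable.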

    If you get a perl error in the "config.pl" file, check your typing. If you get a different run-time error, and you want to try to find the cause yourself, you can search the .pl files for the text of the message (the reported error location will be in our generic error routine, which doesn't help you). As our alpha tester said, this is where it's helpful to know perl.

    If you have trouble getting the gateway to run, please drop me some e-mail and let us know what we could do better. We may not be able to help you (or we would have made it more portable in the first place), but if you have suggestions we'll listen. My e-mail address is at the bottom of this file.

    When the gateway admits that it is listening to connections, you can test it.

5. Try Out the Gateway

    Fire up your favorite Web browser--as long as it supports forms and authentication. If the gateway machine is "pinch.me.au" and the port is 8888, then give the browser this URL:

    http://pinch.me.au:8888

    A welcome page should come up. The first thing you should do is to create a newsreading account named "root" to be the gateway administrator. Follow the "Create a newsreading account here." link on the welcome page, and give the user name "root". Pick a good password, because the "root" user has the power to shut down the gateway and change account passwords. Fill in the various fields and click the "Create account and start reading news" button at the bottom.

    Now your browser should ask you for authentication info. Call yourself "root" and give the password. You'll end up at root's home news page.

    Now you can play around. Read some news.

6. Shut Down the Gateway

    As "root", return to the welcome page. Follow the "Server Administration Page" link at the bottom of the welcome page. Only "root" is allowed to follow this link.

    A page with a few links will show up. Follow the "Shut down server" link, and the gateway shuts down.

    I've found that ULTRIX takes several seconds after the gateway closes its local port and shuts down to release the port for use again. I'm told that HP-UX takes several minutes. SunOS may be in between. If your operating system does it too, you'll have to wait a bit after the gateway shuts down before restarting it.

    In unix, it's also possible to shut down the gateway by sending it a signal. This works, but there is some chance that the gateway will be caught in the middle of an operation. The gateway doesn't handle this correctly; there's a small chance that it will get confused and write out inconsistent data files.

7. Write Your Own Welcome Page

    The distribution includes a generic welcome page. You'll want to edit the file "HTMLFiles/RootPage.html" to something you like.

    You may want to mention:

    You can create links from your welcome page (or elsewhere) to URLs of the form "/Files/<filename>". The capital F in "Files" is significant. The gateway looks up these files in the directory HTMLFiles (or whatever $HTMLFilesRoot is set to; see the section "All Those Data Files") and returns them as they are. These pages don't use HTTP authorization, so anyone who can access the welcome page can access the Files pages; they don't need to have a newsreading account.

    The gateway recognizes file suffixes in "/Files/..." URLs to decide what type of file it is serving. As distributed, it recognizes the usual suffixes for GIF, JPEG and text files, and assumes that any other file is an HTML file. If you need other file types, you can edit the routine TranslateSuffixToType in "config.pl".

    While I'm on the subject of "/Files/..." URLs, I should point out that the GIF buttons are in the directory HTMLFiles/Buttons. If you want, you can draw your own buttons to replace the ones we made.

8. Restricting Access

    As distributed, the gateway allows anyone in the world to create a newsreading account. Since the gateway is too slow to handle continents-full of active readers, if you have full Internet access (if you aren't behind a firewall) this is probably not what you want.

    There are two ways to restrict access to your gateway. You can disable account creation, so that only the administrator can create newsreading accounts, and you can allow only authorized sites to connect to the gateway.

    To disable account creation, set $UserAccountCreation = 0 in "config.pl". The administrator, "root", can create accounts by following the "Create user accounts" link on the administration page. If you disable account creation, you'll probably want to remove the no-longer-functional account creation link on the welcome page so that people don't get confused.

    To restrict access to sites that you authorize, rewrite the routine &SiteAuthorized in "config.pl" as explained in the comments there. Site authorization has nothing to do with HTTP authorization; it is entirely internal to the gateway. There are a couple of examples, commented out, which show how to allow access from a single machine or from within a single domain. If you want to do anything more complicated, you'll have to figure it out yourself.

    A client at an unauthorized site is sent the file "Unauthorized.html" no matter what the client asks for. Unauthorized sites can't even see your welcome page. "Unauthorized.html" is in the HTMLFiles directory (or whatever $HTMLFilesRoot is set to; see the next section). You may or may not want to edit it to mention:

9. Administration Features

    The Server Administration Page, at URL "http:/Administration/Index", is accessible only to "root". It includes these links, which only "root" is allowed to follow:

10. URLs

    The gateway responds to these URLs. Case is significant; if you change upper case to lower case in these URLs, or vice versa, the gateway will not find the URL. The /Administration and /Newsreading URLs require authentication; the others do not.

    If a <news.server.name> is omitted (so it looks like "//"), the gateway chooses the first server on the @NNTPServers list in "config.pl".

    In URLs with /<news.group.name>/<article id>, it's sometimes possible to leave out the <news.group.name>, giving //<article id>. The gateway has some glitches in handling these abbreviated URLs, but it mostly works.

    http:/ (same as http:/WelcomePage)

    http:/WelcomePage

    http:/Files/<pathname> (pathname may not contain ..)

    http:/Administration/Index

    http:/Administration/Quit (cause the gateway to exit)

    http:/AccountCreationPage

    http:/Newsreading/HomeNewsPage

    http:/Newsreading/PrefsPages

    http:/Newsreading/KillPages

    http:/Newsreading/AllGroupsPages/<news.server.name>

    http:/Newsreading/AllGroupsPages/<news.server.name>/<news/group/prefix>

    http:/Newsreading/NewsGroupPages/<news.server.name>/<news.group.name>

    http:/Newsreading/ThreadPages/<news.server.name>/<news.group.name>/<canonical subject of thread>

    http:/Newsreading/ArticlePages/<news.server.name>/<news.group.name>/<article id>

    http:/Newsreading/TextArticlePages/<news.server.name>/<news.group.name>/<article id>

    http:/Newsreading/PostArticlePages/<news.server.name>/<news.group.name>/<article id being responded to>
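
To link to one of these pages from elsewhere you need the full URL. The sketch below builds one for a newsgroup page. It assumes that "http:/..." in the list above is shorthand for "http://<gateway host>:<port>/...", as in the "pinch.me.au:8888" example earlier. The function name newsgroup_url is invented for this example.

```shell
# Build the full URL of a newsgroup page on a gateway.
# Usage: newsgroup_url HOST PORT SERVER GROUP
# An empty SERVER produces "//", which tells the gateway to use the
# first server on the @NNTPServers list.
newsgroup_url() {
    printf 'http://%s:%s/Newsreading/NewsGroupPages/%s/%s\n' "$1" "$2" "$3" "$4"
}
```

For instance, newsgroup_url pinch.me.au 8888 "" comp.lang.perl gives a URL ending in "/Newsreading/NewsGroupPages//comp.lang.perl". Remember that case is significant in everything after the port.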

    The gateway also handles news: URLs, converting them to equivalent http: URLs. This may make it possible to use the gateway as an HTTP news proxy. Configuring a browser to actually send news: URLs to the gateway, rather than handling them itself, is way beyond the scope of this README. :-) Here are the equivalences.

    news:
    http:/Newsreading/HomeNewsPage

    news:<group.name>
    http:/Newsreading/NewsGroupPages//<group.name>

    news:<message-id>
    http:/Newsreading/ArticlePages///<article id>

    news:*
    http:/Newsreading/AllGroupsPages/

    news:<group.prefix>.*
    http:/Newsreading/AllGroupsPages//<group/prefix>
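
The table of equivalences can be written out as a small shell function, shown here only to make the mapping concrete. It is not taken from the gateway's code. Two things are my assumptions: anything containing "@" is treated as a message-id, and the dots in a group prefix become slashes, which is how I read the "<group/prefix>" notation above. The function prints the path part only, without the "http:" in front.

```shell
# Map a news: URL to the gateway path from the table of equivalences.
# Assumptions (not from the gateway source): "@" marks a message-id,
# and dots in a group prefix are turned into slashes.
news_to_gateway_path() {
    rest=${1#news:}
    case $rest in
        "")   printf '%s\n' "/Newsreading/HomeNewsPage" ;;
        "*")  printf '%s\n' "/Newsreading/AllGroupsPages/" ;;
        *@*)  printf '%s\n' "/Newsreading/ArticlePages///$rest" ;;
        *.\*) prefix=${rest%.\*}
              printf '%s\n' "/Newsreading/AllGroupsPages//$(printf '%s' "$prefix" | tr . /)" ;;
        *)    printf '%s\n' "/Newsreading/NewsGroupPages//$rest" ;;
    esac
}
```

In every case the news server name is left empty, so the gateway falls back to the first server on the @NNTPServers list.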

11. All Those Data Files

    The gateway creates on the order of a zillion data files.

    • For each news server, there is a file of the tree of newsgroups on that server.
    • For each newsgroup on each server, there is a Thread.DB file with the latest information about the articles in the group.
    • There is a file for the welcome page, and a few others including GIF files for the buttons.
    • For each user, there is a Newsrc and a Kill file.

    Where these files go depends on the setup in file "config.pl". There are several perl variables which contain the names of directories.

    • $PublicDataFilesRoot holds files about the newsgroups and their contents.
    • $ThreadDBFilesDirectoryName, normally underneath $PublicDataFilesRoot, is the root of a directory tree. There is a subdirectory for each news server, and under those are subdirectories for each newsgroup. At the leaves are "Thread.DB" files, each one holding the overview and threading information for a newsgroup.
    • $AllGroupsFilesDirectoryName, normally underneath $PublicDataFilesRoot, holds one file for each news server. The file is named after its news server, and describes the tree of groups available from the server.
    • $HTMLFilesRoot holds HTML files. It must contain the welcome page "RootPage.html", "Unauthorized.html", and the "Buttons". It may also contain other HTML files, which are accessible under URLs of the form "http:/Files/<filename>".
    • $UserFilesDirectoryName holds files with user newsrc and killfile information. It has a subdirectory for each user. Each subdirectory has a file "Kill" with kill and highlight information, and a file "Newsrc" with newsrc and account information.

12. Maintenance and Troubleshooting; Known Bugs

    If the PublicFiles directory (which caches news server and newsgroup information) gets too big, you can delete it. The files will be recreated automatically--this slows things down, but it doesn't hurt.
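
A sketch of checking the size of the cache and then clearing it. The function name clear_public_cache is made up, and the directory is passed explicitly so that nothing gets deleted by accident. This README doesn't say whether it is safe to do this while the gateway is running; doing it while the gateway is shut down is the cautious choice.

```shell
# Report the size of the cache directory, then remove it.
# Usage: clear_public_cache /path/to/gateway/PublicFiles
clear_public_cache() {
    cache=$1
    if [ -z "$cache" ] || [ ! -d "$cache" ]; then
        echo "not a directory: $cache" >&2
        return 1
    fi
    du -sk "$cache"     # size in kilobytes before deletion
    rm -rf "$cache"
}
```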

    There's no provision for removing accounts. You can remove them by hand: Go to the UserFiles directory (or whatever $UserFilesDirectoryName is set to) and delete the user directories you don't want. This works if the users you delete don't happen to be in the user cache (they'll be there if they read news recently enough). To be sure that they aren't, you may want to do this while the gateway is shut down.
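
Removing an account by hand might look like this. The function name remove_account is invented, and the sketch assumes each user's subdirectory is named after the user. It refuses names that are empty or contain a slash, so a typo can't wander outside the UserFiles directory.

```shell
# Remove one user's account directory by hand.
# Usage: remove_account USERFILES_DIR USERNAME
# Best run while the gateway is shut down (see the note on the user cache).
remove_account() {
    userfiles=$1
    user=$2
    case $user in
        ""|*/*|.|..)
            echo "bad user name: $user" >&2
            return 1 ;;
    esac
    if [ ! -d "$userfiles/$user" ]; then
        echo "no such account: $user" >&2
        return 1
    fi
    rm -r "$userfiles/$user"
}
```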

    NNTP servers display an apparently unending variety of pathologies. Some of these the gateway can deal with, and others it can't. For example, our local news server sometimes likes to pretend it has articles with no headers; the gateway correctly rejects them. But when our news server lies about the article numbers of articles it has, there is nothing the gateway can do.

    The gateway behaves ungracefully when one of its news servers crashes and stays down. It tries over and over to open the connection, slowing everything down and making life unbearable for everyone.

    The gateway also has trouble when its NNTP connection is unexpectedly broken. Sometimes it issues a whole stream of error messages, and has trouble recovering. If you frequently get the message "The NNTP connection has broken," you can try reducing $NNTPConnectionSeconds in "config.pl"; this may or may not help. I hope to improve it eventually.

    Twice, I've seen infinite streams of error messages when the gateway somehow messed up in handling a signal. I haven't yet been able to reproduce this bug, so it isn't fixed.

    A rare bug on our system causes the gateway to hang while receiving a request. It appears to be an operating system problem.


    jay@forum.swarthmore.edu

    8 May 1995