Veronica C. Lepore
Distributed Systems
Professor Getschmann
April 1998
AFS (Andrew File System)
Introduction
The advancement of distributed file systems is probably one of the most actively studied areas of distributed operating systems. The Andrew File System (AFS) is a distributed file system that has become a standard in the business and research worlds. This paper gives the reader an overview of AFS. AFS was designed as part of the Andrew project, a combined effort of IBM and a research team from Carnegie-Mellon University. Location transparency and location independence are strengths of the AFS design, which can support a large number of workstations.
Design Purpose of AFS
The Andrew File System (AFS), developed at Carnegie-Mellon University (CMU), Pittsburgh, under sponsorship from IBM, is a distributed network file system that enables files on any AFS machine across the country to be accessed as easily as locally stored files. AFS uses client/server technology. The AFS servers are transparent to users, who use an AFS client to access the information that resides on the servers; the information is accessible just as if it were stored on the user's own AFS client machine. Resources can be shared across both local area and wide area networks. "Since AFS hides the underlying network, working in AFS is just like working on your" ("The Andrew File System.", 1998) AFS client's file system, but with access to many more files. The file system appears to the user as a giant virtual hard disk or mega-directory made up of subdirectories from across the entire continent. AFS is designed to support a large number of networked workstations, providing users, application programs, and system administrators with the benefits of a shared file system. Sun's NFS (Network File System) is similar to AFS; the advantages of using AFS over NFS are discussed later in this paper.
With AFS, almost all of a user's data can be cached on the local disk(s) of the user's workstation. "AFS chooses entire files to be the basic unit of data movement requiring every used file to be copied entirely to the local disk and copied back whenever they are modified" (NHMFL, 1997). Because a cached file can be reused without contacting the server again, caching saves network bandwidth, and AFS is able to provide strong consistency while maintaining good performance. AFS has also been the starting point for later work "at CMU that includes the ability for users to operate in a shared file system while totally disconnected for extended periods of time i.e. portable computers" (NHMFL, 1997). AFS is designed on the assumption that workstations have enough processing power of their own, so work is done on the client rather than on the server whenever possible.
Today, AFS is marketed by Transarc Corporation and has been chosen by the Open Software Foundation as the basis of its Distributed File System (DFS). The Open Software Foundation is "a foundation created by nine computer vendors" (Apollo, DEC, Hewlett-Packard, IBM, Bull, Nixdorf, Philips, Siemens, and Hitachi) "to promote 'Open Computing'" (Többicke, 1996). The purpose of the foundation is to develop a common operating system and interfaces, based on developments of Unix and the X Window System, that will be compatible with a wide range of hardware architectures.
File Structure & Cells
The AFS file system uses a hierarchical tree structure. The top level of the structure, i.e., the worldwide root AFS directory, is called /afs. Subdirectories of the worldwide AFS root directory are called AFS cells, each cell "representing an independently administered portion of file space" (Többicke, 1996). As in a UNIX file system, the cells join under the root /afs directory to form one enormous file system, which eliminates the need to transfer files between machines.
The worldwide AFS system is made up of a number of cells. As an example, suppose the users are members of the cell others.foo.com. Each cell contains a few servers and several clients. Each client has an extra directory, /afs, under which the entire AFS file system resides. Under /afs there is a directory for each defined cell, so the example cell resides in /afs/others.foo.com. This can be shortened to /afs/cell. A bit further down are the user directories. AFS has a mechanism for limiting the disk space that a user's files can occupy.
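The path layout described above can be sketched in a few lines of Python. This is only an illustration: the function name cell_of and the sample user path are hypothetical, and a real AFS client resolves paths inside its own client software rather than by string parsing.

```python
from pathlib import PurePosixPath


def cell_of(path):
    """Return the cell name for a path under /afs, or None if the path is outside AFS."""
    # For example: ('/', 'afs', 'others.foo.com', 'user', 'jdoe')
    parts = PurePosixPath(path).parts
    if len(parts) < 3 or parts[0] != "/" or parts[1] != "afs":
        return None
    return parts[2]


# Hypothetical user directory inside the example cell from the text.
print(cell_of("/afs/others.foo.com/user/jdoe/notes.txt"))  # others.foo.com
print(cell_of("/tmp/scratch"))                             # None
```

The cell name is simply the first path component below /afs, which is why every client sees the same names for the same files.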
Volume
A user's designated portion of the AFS file system is a subdirectory called a volume. An AFS client program mounts volumes after a Kerberos authentication (password) procedure. The files and programs in a volume can be stored on a local AFS server or on a remote AFS file server elsewhere on the Internet. From the user's point of view at the desktop, it seems as if the volume is physically mounted locally. The local volume and other AFS volumes are tracked by the AFS servers, which make them appear as subdirectories of one virtual mega-directory or disk. As a result, an AFS user does not need the FTP program to transfer files between directories in the global AFS file space. Once the proper directory permissions have been set up by a volume's owner, an AFS user can simply use basic UNIX file system commands such as cp to copy files from a permitted volume to a personal AFS volume. An AFS user does not have to know the physical location of the files to be accessed, even though the files could be on any of a number of machines. The user just requests the file by its ordinary UNIX filename and AFS finds the file automatically.
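The idea that the servers track where each volume lives, so that the user never needs to know, can be sketched as a pair of lookup tables. Everything here is an assumption made for illustration: the volume names, server names, mount points, and the functions locate and move_volume are all hypothetical and are not the actual AFS interfaces.

```python
# Hypothetical volume location table: volume name -> file server that stores it.
VOLUME_LOCATIONS = {
    "user.alice": "fs1.others.foo.com",
    "user.bob": "fs2.others.foo.com",
}

# Hypothetical mount points: directory in the AFS tree -> volume mounted there.
MOUNT_POINTS = {
    "/afs/others.foo.com/user/alice": "user.alice",
    "/afs/others.foo.com/user/bob": "user.bob",
}


def locate(path):
    """Return (volume, server) for the longest mount point that is a prefix of path."""
    best = None
    for mount in MOUNT_POINTS:
        if path == mount or path.startswith(mount + "/"):
            if best is None or len(mount) > len(best):
                best = mount
    if best is None:
        raise FileNotFoundError(path)
    volume = MOUNT_POINTS[best]
    return volume, VOLUME_LOCATIONS[volume]


def move_volume(volume, new_server):
    """Moving a volume only updates the location table; user-visible paths stay the same."""
    VOLUME_LOCATIONS[volume] = new_server
```

In this sketch, calling move_volume("user.alice", "fs3.others.foo.com") changes which server locate reports, but the path the user types is unchanged. That is the sense in which the file name is independent of the file's physical location.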
Access Control List
Access to remote cells is granted through an AFS Access Control List (ACL), "an easy-to-use tool that makes it possible to share directories, files, and programs among AFS users" (NHMFL, 1997). Users can also access local cells on their own local area networks.
The access lists, each of which is associated with a particular directory, consist of seven fields, rlidwak: read, lookup, insert, delete, write, administer, and lock. There are four levels: none, read (rl), write (rlidwk), and all. Because ACLs differ from the standard UNIX privilege system, it is quite easy to set up directories that a specified group of people, or single individuals, can access. It is also easy to prohibit people from accessing files.
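The seven rights and four levels can be written down as small lookup tables. The following is a minimal sketch, not the AFS implementation: the function names expand and allowed and the users alice and bob are hypothetical, and the level "all" is assumed to expand to all seven letters.

```python
# The seven AFS access rights named in the text, keyed by their one-letter codes.
RIGHTS = {
    "r": "read",
    "l": "lookup",
    "i": "insert",
    "d": "delete",
    "w": "write",
    "a": "administer",
    "k": "lock",
}

# The four shorthand levels from the text, expanded to their letters.
LEVELS = {
    "none": "",
    "read": "rl",
    "write": "rlidwk",
    "all": "rlidwka",
}


def expand(spec):
    """Turn a level name or a string of right letters into a set of right letters."""
    letters = LEVELS.get(spec, spec)
    unknown = set(letters) - set(RIGHTS)
    if unknown:
        raise ValueError(f"unknown rights: {sorted(unknown)}")
    return set(letters)


def allowed(acl, user, right):
    """Check whether a directory ACL (user -> rights spec) grants one right to a user."""
    return right in expand(acl.get(user, "none"))


acl = {"alice": "all", "bob": "read"}  # hypothetical users
print(allowed(acl, "bob", "r"))  # True
print(allowed(acl, "bob", "w"))  # False
```

A user who does not appear in the list at all is treated as having the level none, which is how access is denied in this sketch.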
The biggest difference between ACLs and traditional UNIX file permissions is that an ACL applies permissions to an entire directory in an AFS volume. Unlike the UNIX chmod command, an ACL cannot be used to apply permissions to individual files within a directory. The ACL of "a parent AFS directory is also automatically inherited by any new child subdirectories created in or moved into the parent directory" (Sandell). The only way to change the ACL of a child subdirectory is to apply a different ACL to it separately.
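The directory-level behavior described above can be illustrated with a toy model. This is a sketch under stated assumptions: the class AclTree and its method names are hypothetical, rights are stored as plain letter strings, and only the creation of new subdirectories is modeled.

```python
import posixpath


class AclTree:
    """Toy directory tree in which every directory carries its own ACL."""

    def __init__(self, root_acl):
        self.acls = {"/": dict(root_acl)}

    def mkdir(self, path):
        """Create a directory; it starts with a copy of its parent's ACL."""
        path = path.rstrip("/")
        parent = posixpath.dirname(path) or "/"
        self.acls[path] = dict(self.acls[parent])

    def set_acl(self, path, user, rights):
        """Change one directory's ACL; existing subdirectories are not affected."""
        self.acls[path][user] = rights

    def rights_for(self, file_path, user):
        """Files have no ACL of their own; the containing directory decides."""
        directory = posixpath.dirname(file_path) or "/"
        return self.acls[directory].get(user, "")


tree = AclTree({"alice": "rlidwka"})           # hypothetical owner
tree.mkdir("/proj")
tree.set_acl("/proj", "bob", "rl")             # bob may read everything in /proj
tree.mkdir("/proj/docs")                       # inherits the entries for alice and bob
tree.set_acl("/proj/docs", "bob", "rlidwk")    # changed separately for the child
```

After these calls, bob has rl on files in /proj and rlidwk on files in /proj/docs, because each directory's own ACL is consulted for the files it contains.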
These rights can be given to other AFS users by the owner of the AFS directory in question. In AFS terms, the user who has the rights is called the possessor.
As a convenient option, the separate access control rights can also be grouped together to form four packages of access control rights: none, read (rl), write (rlidwk), and all (rlidwka).
(Information came from NHMFL, 1997)
Security
AFS provides three predefined system protection groups:
- system:administrators
(This group consists of system administrators, who have all access control rights to all directories in the local cell.)
- system:authuser
(This group consists of all Kerberos-authenticated users in the local cell.)
- system:anyuser
(This group consists of all users, including non-AFS users, whether they are authenticated or not.)
User-defined AFS protection groups are very useful because they allow users to be very selective in their sharing. AFS users can easily share their directories with other individual AFS users, or with a designated group of AFS users. This means that an AFS user can use shared AFS file space to do things like:
(Information came from NHMFL, 1997)
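The way system groups and user-defined groups combine in an access check can be sketched as follows. This is an assumption-laden illustration: the administrator name, the group alice:thesis-readers, its members, and both function names are hypothetical, and the real protection database is not modeled.

```python
# Hypothetical administrators and user-defined group, for illustration only.
ADMINS = {"sysadmin"}
USER_GROUPS = {"alice:thesis-readers": {"bob", "carol"}}


def is_member(group, user, authenticated):
    """Decide group membership using the three system groups described above."""
    if group == "system:anyuser":
        return True                       # everyone, authenticated or not
    if group == "system:authuser":
        return authenticated              # any Kerberos-authenticated user in the cell
    if group == "system:administrators":
        return authenticated and user in ADMINS
    return authenticated and user in USER_GROUPS.get(group, set())


def rights_via_groups(acl, user, authenticated):
    """Union of the rights granted by every ACL entry whose group includes the user."""
    granted = set()
    for group, rights in acl.items():
        if is_member(group, user, authenticated):
            granted |= set(rights)
    return granted


# Anyone may list the directory; only the hypothetical reading group may read files.
acl = {"system:anyuser": "l", "alice:thesis-readers": "rl"}
```

With this ACL, an authenticated bob is granted both r and l, while an unauthenticated visitor is granted only l. This is the kind of selective sharing that user-defined groups make possible.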
Once AFS clients and servers have spread over the Internet, users will be able to mount their AFS disk space from microcomputers and UNIX workstations that are running an AFS client program. For example, a user will be able to mount the AFS disk space on a client such as a Macintosh or a Sun workstation. Regardless of the user's whereabouts, the user's personal AFS data and directories will be easy to access, update, and share from within the user's personalized desktop environment.
Cache
The cache feature of AFS client programs is a major source of strength in a heavy-traffic network environment. When the client requests information from the AFS server, a copy of the requested file is stored in a cache on the client computer. Modifications are made only to the copy, not to the original, which remains unchanged on the server until the copy of the file on the client has been closed or saved. The purpose of this design is to reduce requests to the AFS file server and to lessen the impact on overall network traffic. It also allows more clients per network and more clients per server. The powerful workstations, with larger disk drives and expanded memory, that are connected to distributed file systems are the major reason for "relatively low cost performance enhancement in existing network systems" (Többicke, 1996).
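The caching behavior described above (fetch the whole file once, modify only the local copy, write back on close) can be sketched with two small classes. This is a simplified model, not the AFS cache manager: the class and method names are hypothetical, and the mechanism by which a server tells clients that a cached copy is stale is left out entirely.

```python
class Server:
    """Stands in for an AFS file server: holds the authoritative copy of each file."""

    def __init__(self):
        self.files = {}
        self.fetches = 0  # counts how often a client had to contact the server

    def fetch(self, name):
        self.fetches += 1
        return self.files[name]

    def store(self, name, data):
        self.files[name] = data


class Client:
    """Caches whole files locally and writes them back only on close."""

    def __init__(self, server):
        self.server = server
        self.cache = {}     # name -> local copy of the whole file
        self.dirty = set()  # names modified since they were fetched

    def open(self, name):
        if name not in self.cache:      # cache miss: copy the entire file
            self.cache[name] = self.server.fetch(name)
        return self.cache[name]

    def write(self, name, data):
        self.open(name)
        self.cache[name] = data         # only the local copy changes
        self.dirty.add(name)

    def close(self, name):
        if name in self.dirty:          # push the modified file back to the server
            self.server.store(name, self.cache[name])
            self.dirty.discard(name)
```

Opening the same file twice costs only one fetch, and the server's copy stays unchanged between write and close. Both effects reduce the load on the server and the network.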
NFS versus AFS
NFS clients use Remote Procedure Calls (RPCs) to perform file system operations such as read, write, create, and lookup on files stored by NFS servers. "RPCs work by encoding arguments in a portable format, sending a request and waiting for a reply which is also encoded in a portable format; it retransmits on timeout" (Sandell).
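The request, reply, and retransmit cycle in the quotation can be sketched without a real network. The assumptions are considerable: JSON stands in for the portable encoding (NFS itself uses XDR), the network is simulated by a class that drops the first few requests, and the names LossyTransport, serve, call, and the file motd are all hypothetical.

```python
import json

FILES = {"motd": "hello"}  # hypothetical files held by the server


class LossyTransport:
    """Simulated network that silently drops the first `drops` requests."""

    def __init__(self, handler, drops):
        self.handler = handler
        self.drops = drops
        self.sent = 0

    def send(self, request_bytes):
        self.sent += 1
        if self.sent <= self.drops:
            raise TimeoutError("no reply")
        return self.handler(request_bytes)


def serve(request_bytes):
    """Server side: decode the request, run the procedure, encode the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    name = request["args"]["name"]
    if request["proc"] == "lookup":
        result = name in FILES
    elif request["proc"] == "read":
        result = FILES[name]
    else:
        raise ValueError("unknown procedure: " + request["proc"])
    return json.dumps({"result": result}).encode("utf-8")


def call(transport, proc, args, retries=3):
    """Client side: encode the arguments, send, and retransmit on timeout."""
    request = json.dumps({"proc": proc, "args": args}).encode("utf-8")
    for _ in range(retries + 1):
        try:
            reply = transport.send(request)
        except TimeoutError:
            continue                    # no reply in time: send the same request again
        return json.loads(reply.decode("utf-8"))["result"]
    raise TimeoutError("no reply after %d attempts" % (retries + 1))
```

If the transport drops the first two requests, call still returns the file contents on the third attempt. Every operation, however, pays for at least one round trip to the server.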
AFS has several advantages over NFS. They include:
(Information came from "Andrew File System (AFS) Support." -emphasis added)
Conclusion
Citations
NHMFL. "Using the Andrew File System (AFS)." Florida State University, Tallahassee, Florida. 1997. Online. http://www.nhmfl.gov/csg/helpdesk/afs/afs1.html. Last Modified Tuesday, December 02 1997 04:20.
Sandell, Björn. "AFS -- The Andrew File System." Online. http://niels.che.chalmers.se/inst/phc/GU/CompRes/AFS.html.
Többicke, R. "CERN Andrew File System User's Guide." CERN. 1996. Online. http://consult.cern.ch/writeup/afsguide/main.html. Verified: 15 Jul 1996.
"Andrew File System (AFS) Support." Center for Computational Science. Online. http://amp.nrl.navy.mil/code5595/afs-support/.
"The Andrew File System." Pittsburgh Supercomputing Center (PSC). 1998. Online. http://laguerre.psc.edu/general/filesys/afs/afs.html. Revised: Tuesday, 10-Feb-98 13:34:05 EST.