Ok dudes, grabed this of a CD-ROM disk of articles, explains how the internet worm worked. Chamelion Journal: Communications of the ACM June 1989 v32 n6 p678(10) Title: Crisis and aftermath. (the Internet worm) Author: Spafford, Eugene H. Summary: The Internet computer network was attacked on Nov 2, 1988, by a computer worm. Although the program affected only Sun Microsystems Sun-3 workstations and VAX computers running a variant of version 4 of the Berkeley Unix, the program spread over a huge section of the network. Early the following day a number of methods for containing and eradicating the virus had been discovered and published. It was discovered that the worm exploited flaws in the Unix operating system's security routines and used some of Unix's own utilities to propagate itself. A complete description of the workings of the worm and its methods of entry into Unix systems are discussed. The aftermath of the infection and the motives of Robert T. Morris, its author, are also discussed. Full Text: Crisis and Aftermath On the evening of November 2, 1988 the Internet came under attack from within. Sometime after 5 p.m., a program was executed on one or more hosts connected to the Internet. That program collected host, network, and user information, then used that information to break into other machines using flaws present in those systems' software. After breaking in, the program would replicate itself and the replica would attempt to infect other systems in the same manner. Although the program would only infect Sun Micro-systems' Sun 3 systems and VAX computers running variants of 4 BSD UNIX, the program spread quickly, as did the confusion and consternation of system administrators and users as they discovered the invasion of their systems. The scope of the break-ins came as a great surprise to almost everyone, despite the fact that UNIX has long been known to have some security weaknesses (cf. [4, 12, 13]). The program was mysterious to users at sites where it appeared. Unusual files were left in the /usr/tmp directories of some machines, and strange messages appeared in the log files of some of the utilities, such as the sendmail mail handling agent. The most noticeable effect, however, was that systems became more and more loaded with running processes as they became repeatedly infected. As time went on, some of these machines bacame so loaded that they were unable to continue any processing; some machines failed completely when their swap space or process tables were exhausted. By early Thursday morning, November 3, personnel at the University of California at Berkeley and Massachusetts Institute of Technology (MIT) had "captured" copies of the program and began to analyze it. People at other sites also began to study the program and were developing methods of eradicating it. A common fear was that the program was somehow tampering with system resources in a way that could not be readily detected--that while a cure was being sought, system files were being altered or information destroyed. By 5 a.m. Thursday morning, less than 12 hours after the program was first discovered on the network, the Computer Systems Research Group at Berkeley had developed an interim set of steps to halt its spread. This included a preliminary patch to the sendmail mail agent. The suggestions were published in mailing lists and on the Usenet, although their spread was hampered by systems disconnecting from the Internet to attempt a "quarantine." By about 9 p.m. Thursday, another simple, effective method of stopping the invading program, without altering system utilities, was discovered at Purdue and also widely published. Software patches were posted by the Berkeley group at the same time to mend all the flaws that enabled the program to invade systems. All that remained was to analyze the code that caused the problems and discover who had unleashed the worm--and why. In the weeks that followed, other well-publicized computer break-ins occurred and a number of debates began about how to deal with the individuals staging these invasions. There was also much discussion on the future roles of networks and security. Due to the complexity of the topics, conclusions drawn from these discussions may be some time in coming. The on-going debate should be of interest to computer professionasl everywhere, however. HOW THE WORM OPERATED The worm took advantage of some flaws in standard software installed on many UNIX systems. It also took advantage of a mechanism used to simplify the sharing of resources in local area networks. Specific patches for these flaws have been widely circulated in days since the worm program attached the Internet. Fingerd The finger program is a utility that allows users to obtain information about other users. It is usually used to identify the full name or login name of a user, whether or not a user is currently logged in, and possibly other information about the person such as telephone numbers where he or she can be reached. The fingerd program is intended to run as a daemon, or background process, to service remote requests using the finger protocol. This daemon program accepts connections from remote programs, reads a single line of input, and then sends back output matching the received request. The bug exploited to break fingerd involved overrunning the buffer the daemon used for input. The standard C language I/O library has a few routines that read input without checking for bounds on the buffer involved. In particular, the gets call takes input to a buffer without doing any bounds checking; this was the call exploited by the worm. As will be explained later, the input overran the buffer allocated for it and rewrote the stack frame thus altering the behavior of the program. The gets routine is not the only routine with this flaw. There is a whole family of routines in the C library that may also overrun buffers when decoding input or formatting output unless the user explicitly specifies limits on the number of characters to be converted. Although experienced C programmers are aware of the problems with these routines, they continue to use them. Worse, their format is in some sense codified not only by historical inclusion in UNIX and the C language, but more formally in the forthcoming ANSI language standard for C. The hazard with these calls is that any network server or privileged program using them may possibly be compromised by careful precalculation of the (in)appropriate input. Interestingly, at least two long-standing flaws based on this underlying problem have recently been discovered in standard BSD UNIX commands. Program audits by various individuals have revealed other potential problems, and many patches have been circulated since November to deal with these flaws. Unfortunately, the library routines will continue to be used, and as our memory of this incident fades, new flaws may be introduced with their use. Sendmail The sendmail program is a mailer designed to route mail in a heterogeneous internetwork. The program operates in a number of modes, but the one exploited by the worm involves the mailer operating as a daemon (background) process. In this mode, the program is "listening" on a TCP port (#25) for attempts to deliver mail using the standard Internet protocol, SMTP (Simple Mail Transfer Protocol). When such an attempt is detected, the daemon enters into a dialog with the remote mailer to determine sender, recipient, delivery instructions, and message contents. The bug exploited in sendmail had to do with functionality provided by a debugging option in the code. The worm would issue the DEBUG command to sendmail and then specify a set of commands instead of a user address. In normal operation, this is not allowed, but it is present in the debugging code to allow testers to verify that mail is arriving at a particular site without the need to invoke the address resolution routines. By using this option, testers can run programs to display the state of the mail system without sending mail or establishing a separate login connection. The debug option is often used because of the complexity of configuring sendmail for local conditions, and it is often left turned on by many vendors and site administrators. The sendmail program is of immense importance on most Berkeley-derived (and other) UNIX systems because it handles the complex tasks of mail routing and delivery. Yet, despite its importance and widespread use, most system administrators know little about how it works. Stories are often related about how system administrators will attempt to write new device drivers or otherwise modify the kernel of the operating system, yet they will not willingly attempt to modify sendmail or its configuration files. It is little wonder, then, that bugs are present in sendmail that allow unexpected behavior. Other flaws have been found and reported now that attention has been focused on the program, but it is not known for sure if all the bugs have been discovered and all the patches circulated. Passwords A key attack of the worm involved attempts to discover user passwords. It was able to determine success because the encrypted password of each user was in a publicly readable file. In UNIX systems, the user provides a password at sign-on to verify identity. The password is encrypted using a permuted version of the Data Encryption Standard (DES) algorithm, and the result is compared against a previously encrypted version present in a word-readable accounting file. If a match occurs, access is allowed. No plaintext passwords are contained in the file, and the algorithm is supposedly noninvertible without knowledge of the password. The organization of the passwords in UNIX allows nonprivileged commands to make use of information stored in the accounts file, including authentification schemes using user passwords. However, it also allows an attacker to encrypt lists of possible passwords and then compare them against the actual passwords without calling any system function. In effect, the security of the passwords is provided by the prohibitive effort of trying this approach with all combinations of letters. Unfortunately, as machines get faster, the cost of such attempts decreases. Dividing the task among multiple processors further reduces the time needed to decrypt a password. Such attacks are also made easier when users choose obvious or common words for their passwords. An attacker need only try lists of common words until a match is found. The worm used such an attack to break passwords. It used lists of words, including the standard online dictionary, as potential passwords. It encrypted them using a fast version of the password algorithm and then compared the result against the contents of the system file. The worm exploited the accessibility of the file coupled with the tendency of users to choose common words as their passwords. Some sites reported that over 50 percent of their passwords were quickly broken by this simple approach. One way to reduce the risk of such attacks, and an approach that has already been taken in some variants of UNIX, is to have a shadow password file. The encrypted passwords are saved in a file (shadow) that is readable only by the system administrators, and a privileged call performs password encryptions and comparisons with an appropriate timed delay (0.5 to 1 second, for instance). This would prevent any attempt to "fish" for passwords. Additionally, a threshold could be included to check for repeated password attempts from the same process, resulting in some form of alarm being raised. Shadow password files should be used in combination with encryption rather than in place of such techniques, however, or one problem is simply replaced by a different one (securing the shadow file); the combination of the two methods is stronger than either one alone. Another way to strengthen the password mechanism would be to change the utility that sets user passwords. The utility currently makes a minimal attempt to ensure that new passwords are nontrivial to guess. The program could be strengthened in such a way that it would reject any choice of a word currently in the online dictionary or based on the account name. A related flaw exploited by the worm involved the use of trusted logins. One of the most useful features of BSD UNIX-based networking code is the ability to execute tasks on remote machines. To avoid having to repeatedly type passwords to access remote accounts, it is possible for a user to specify a list of host/login name pairs that are assumed to be "trusted," in the sense that a remote access from that host/login pair is never asked for a password. This feature has often been responsible for users gaining unauthorized access to machines (cf. [11]), but it continues to be used because of its great convenience. The worm exploited the mechanism by locating machines that might "trust" the current machine/login being used by the worm. This was done by examining files that listed remote machine/logins used by the host. Often, machines and accounts are reconfigured for reciprocal trust. Once the worm found such likely candidates, it would attempt to instantiate itself on those machines by using the remote execution facility--copying itself to the remote machines as if it were an authorized user performing a standard remote operation. To defeat such future attempts requires that the current remote access mechanism be removed and possibly replaced with something else. One mechanism that shows promise in this area is the Kerberos authentication server. This scheme uses dynamic session keys that need to be updated periodically. Thus, an invader could not make use of static authorizations present in the file system. High Level Description The worm consisted of two parts: a main program, and a bootstrap or vector program. The main program, once established on a machine, would collect information on other machines in the network to which the current machine could connect. It would do this by reading public configuration files and by running system utility programs that present information about the current state of network connections. It would then attempt to use the flaws described above to establish its bootstrap on each of those remote machines. The worm was brought over to each machine it infected via the actions of a small program commonly referred to as the vector program or as the grappling hook program. Some people have referred to it as the l1.c program, since that is the file name suffix used on each copy. This vector program was 99 lines of C code that would be compiled and run on the remote machine. The source for this program would be transferred to the victim machine using one of the methods discussed in the next section. It would then be compiled and invokedon the victim machine with three command line arguments: the network address of the infecting machine, the number of the network port to connect to on that machine to get copies of the main worm files, and a magic number that effectively acted as a one-time-challenge password. If the "server" worm on the remote host and port did not receive the same magic number back before starting the transfer, it would immediately disconnect from the vector program. This may have been done to prevent someone from attempting to "capture" the binary files by spoofing a worm "server." This code also went to some effort to hide itself, both by zeroing out its argument vector (command line image), and by immediately forking a copy of itself. If a failure occurred in transferring a file, the code deleted all files it had already transferred, then it exited. Once established on the target machine, the bootstrap would connect back to the instance of the worm that originated it and transfer a set of binary files (precompiled code) to the local machine. Each binary file represented a version of the main worm program, compiled for a particular computer architecture and operating system version. The bootstrap would also transfer a copy of itself for use in infecting other systems. One curious feature of the bootstrap has provoked many questions, as yet unanswered: the program had data structures allocated to enable transfer of up to 20 files; it was used with only three. this has led to speculation whether a more extensive version of the worm was planned for a later date, and if that version might have carried with it other command files, password data, or possibly local virus or trojan horse programs. Once the binary files were transferred, the bootstrap program would load and link these files with the local versions of the standard libraries. One after another, these programs were invoked. If one of them ran successfully, it read into its memory copies of the bootstrap and binary files and then deleted the copies on disk. It would then attempt to break into other machines. If none of the linked versions ran, then the mechanism running the bootstrap (a command file or the parent worm) would delete all the disk files created during the attempted infection. Step-by-Step Description This section contains a more detailed overview of how the worm program functioned. The description in this section assumes that the reader is somewhat familiar with standard UNIX commands and with BSD UNIX network facilities. A more detailed analysis of operation and components can be found in [16], with additional details in [3] and [15]. This description starts from the point at which a host is about to be infected. At this point, a worm running on another machine has either succeeded in establishing a shell on the new host and has connected back to the infecting machine via a TCP connection, or it has connected to the SMTP port and is transmitting to the sendmail program. The infection proceeded as follows: 1. A socket was established on the infecting machine for the vector program to connect to (e.g., socket number 32341). A challenge string was constructed from a random number (e.g., 8712440). A file name base was also constructed using a random number (e.g., 14481910). 2. The vector program was installed and executed using one of two methods: a. Acro