RISKS-LIST: RISKS-FORUM Digest  Tuesday 24 January 1989  Volume 8 : Issue 14

        FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS
   ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  Re: Medical Software -- testing and verification (Dave Parnas)
  NSA and the Internet (Vint Cerf)
  Re: Losing systems (Geoff Lane)
  Computing Projects that Failed (Dave Platt)
  Re: Object Oriented Programming (Benjamin Ellsworth)
  Re: Losing Systems (Henry Spencer)
  Computer Emergency Response Team (CERT) (Brian M. Clapper)
  Probability and Product Failure (Geoff Lane) [lack of independence]
  Probabilities and airplanes (Robert Colwell, Mike Olson, Dale Worley)

----------------------------------------------------------------------

Date: Mon, 23 Jan 89 07:48:13 EST
From: parnas@qucis.queensu.ca (Dave Parnas)
Subject: Re: Medical Software (Are computer risks different?) (RISKS-8.7)

In his contribution to RISKS-8.7, Jon Jacky states that the problems of
testing and verification are broadly similar whether or not the machine
includes a computer. I have heard this argument from many "old-time"
engineers and consider it quite false.

In testing conventional (analog) devices, we make use of the fact that the
functions are continuous and that one can put an upper bound on the
derivatives or the frequency spectrum. We use those mathematical properties
in deciding how many tests are required to validate a device. When digital
technology is involved, there are no such limits on the rate of change.
Further, with digital technology, the number of tests required for
black-box testing increases sharply with the lowest known upper bound on
the number of states in the device. If we do "white-box" testing, we can
reduce the number of tests required by exploiting the regularity of the
state space. In practice, that regularity is present and helpful for the
testing of hardware, but not terribly useful for software testing.

In short, the technology being used does make a big difference in testing
and validation. While I agree with Jon's statement that industry practices
in software development are often much worse than for other kinds of
technology, that is not the only explanation of our "special problem". The
technology itself is a great contributor and always will be.

David Parnas, Queen's University, Kingston Ontario
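
For a sense of the numbers, here is a minimal sketch, assuming (purely for
illustration) an analog device whose bounded bandwidth lets a finite set of
samples characterise it, versus exhaustive black-box testing of a small
digital device. The figures and function names are invented for the
example (Python):

    # Rough test-count comparison, analog vs. digital (illustrative only).

    def analog_tests(bandwidth_hz, duration_s, samples_per_cycle=2):
        """Tests needed to characterise a bandlimited continuous device:
        bounded derivatives mean a finite sampling rate suffices."""
        return int(bandwidth_hz * duration_s * samples_per_cycle)

    def digital_tests(state_bits, input_bits):
        """Exhaustive black-box tests for a digital device: every
        reachable state paired with every possible input."""
        return (2 ** state_bits) * (2 ** input_bits)

    print(analog_tests(10_000, 1.0))  # 20000 tests for a 10 kHz device
    print(digital_tests(32, 8))       # about 1.1e12 for a small machine

Even a device with only 32 bits of state and an 8-bit input would need on
the order of 10**12 exhaustive tests, while the analog bound stays in the
tens of thousands.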
------------------------------

Date: 23 Jan 1989 01:11-EST
From: CERF@A.ISI.EDU
Subject: NSA and the Internet

John Gilmore asks why NSA has 5 IMPs if they are NOT monitoring the
Internet. So far as I know, NSA does not have 5 IMPs on the Internet. It
has one, to support Dockmaster. The agency has a variety of internal
networks, of course, but none are likely to be linked to the Internet,
since they are used for classified applications for which the Internet is
not approved.

Does Mr. Gilmore have some evidence he wishes to present that suggests the
NSA is engaging in an unacceptable activity on the Internet?

Vint Cerf

------------------------------

Date: Mon, 23 Jan 89 09:34:58 GMT
From: "Geoff. Lane. Tel UK-061 275 6051"
Subject: Re: Losing systems

In my experience, the single most probable cause of a software project
failing is that the people who started the project have no real idea what
they want in the end. Almost everything else can be coped with, but when
you have to deal with a constant stream of "design changes", not even the
best people with the best equipment can succeed.

Geoff. Lane, UMRCC

------------------------------

Date: Mon, 23 Jan 89 10:28:18 PST
From: Dave Platt
Subject: Computing Projects that Failed

On the subject of computing projects that failed for one reason or another:
I recommend that interested RISKS readers look up some of Bob Glass's books
on this subject. Glass has collected quite a number of case studies,
changed the names to protect the innocent [and the guilty, too], and
organized them into categories according to the primary reason for the
failure (immature technology, wrong technology, mismanagement,
misimplementation, politics, etc.). Some of the stories are roaringly
funny... f'rinstance, the mainframe at "Cornbelt U." that survived a series
of mishaps during installation (including being watered by the University's
lawn sprinklers), only to end up destroying itself (and most of the
building) during an earthquake.

Glass has written half a dozen books on the computing industry (most of
them date back to the '70s and early '80s). The three most applicable to
RISKS issues are: "Computing Projects that Failed", "Computer Messiahs:
More Computing Projects that Failed", and "Computing Catastrophes". [I may
be off a bit in the exact wording of the titles; my copies are at home.]

Based on the recent contributions to RISKS concerning software-project
failures, it sounds to me as if most of the pitfalls that Glass wrote about
back in the '70s are alive and well in the late '80s!

Dave Platt    FIDONET: Dave Platt on 1:204/444    VOICE: (415) 493-8805
UUCP: ...!{ames,sun,uunet}!coherent!dplatt    DOMAIN: dplatt@coherent.com
USNAIL: Coherent Thought Inc., 3350 West Bayshore #205, Palo Alto CA 94303

------------------------------

Date: Mon, 23 Jan 89 13:54:59 pst
From: Benjamin Ellsworth
Subject: Re: Object Oriented Programming

Recently a professor from the local university taught a class on OOP at our
site. During the first lecture, he said that via OOP one can add
functionality to a module without changing the code. I asked incredulously,
"Without changing *any* code?" He said, "Yes." A manager at the class
sagely nodded his head.

I should hope the risks are obvious.

------------------------------

Date: Tue, 24 Jan 89 02:20:32 -0500
From: attcan!utzoo!henry@uunet.UU.NET
Subject: Re: Losing Systems

>The losing systems almost always contain some elements of newness; in fact
>on close inspection they usually contain several such elements...

To quote from John Gall's SYSTEMANTICS: "A complex system that works is
invariably found to have evolved from a simple system that worked."

So perhaps it's not so surprising that a lot of these
done-yet-again-from-scratch systems (how many different county records
systems does the world NEED?!?) fail.

Henry Spencer at U of Toronto Zoology    uunet!attcan!utzoo!henry
henry@zoo.toronto.edu

------------------------------

Date: Tue, 24 Jan 89 10:18:01 EST
From: clapper@NADC.ARPA (Brian M. Clapper)
Subject: Computer Emergency Response Team (CERT)

Excerpted from UNIX Today!, January 23, 1989 (reprinted without permission):

WASHINGTON -- The federal government's newly formed Computer Emergency
Response Team (CERT) is hoping to sign up 100 technical experts to aid in
its battle against computer viruses. CERT, formed last month by the
Department of Defense's Advanced Research Projects Agency (DARPA) ...
expects to sign volunteers from federal, military and civilian agencies to
act as advisors to users facing possible network invasion.

DARPA hopes to sign people from the National Institute of Standards and
Technology, the National Security Agency, the Software Engineering
Institute and other government-funded university laboratories, and even the
FBI.

The standing team of UNIX security experts will replace an ad hoc group
pulled together by the Pentagon last November to deal with the infection of
UNIX systems allegedly brought on by Robert Morris Jr., a government
spokesman said.

CERT's charter will also include an outreach program to help educate users
about what they can do to prevent security lapses, according to Susan
Duncal, a spokeswoman for CERT. The group is expected to produce a
"security audit" checklist to which users can refer when assessing their
network vulnerability. The group is also expected to focus on repairing
security lapses that exist in current UNIX software.

To contact CERT, call the Software Engineering Institute at Carnegie-Mellon
University in Pittsburgh at (412) 268-7090, or use the Arpanet mailbox
address cert@sei.cmu.edu.

------------------------------

Date: Mon, 23 Jan 89 09:17:33 GMT
From: "Geoff. Lane. Tel UK-061 275 6051"
Subject: Probability and Product Failure

Unfortunately, from reports here in Britain after the M1 plane crash, it
appears that there is a real problem with "common mode" failures in
aircraft engines: if one engine fails, the probability of a second failing
during the same flight is much higher than would be expected. The
probabilities of failure are not independent.

(BTW, in "fly by wire" systems they attempt to avoid common-mode errors in
the software by having three independent groups implement the system on
three different types of processor. First, this does NOT eliminate the
problems of errors in the system specification, from which all three
designs are derived. Second, what happens 10 years later when the software
is updated to incorporate new developments -- are three more independent
software houses commissioned to produce the new software, or would this be
done in-house by some part-time students?)

Geoff Lane, UMRCC
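
The effect Lane describes is easy to reproduce with a toy Monte Carlo
model. In the sketch below, every number is an invented illustration, not
real engine data: a rare shared defect (one mechanic, one batch of fuel)
raises all three engines' failure probabilities at once, and the rate of
double failures climbs well above the 3 p**2 that independence predicts
(Python):

    import random

    P_INDEPENDENT  = 1e-3  # per-flight failure probability of one engine
    P_COMMON       = 1e-4  # probability a flight carries a shared defect
    P_GIVEN_COMMON = 0.5   # per-engine failure probability given the defect

    def engines_failed(n_engines=3):
        # One simulated flight: a common cause raises every engine's
        # failure probability simultaneously.
        p = P_GIVEN_COMMON if random.random() < P_COMMON else P_INDEPENDENT
        return sum(random.random() < p for _ in range(n_engines))

    TRIALS = 2_000_000
    doubles = sum(engines_failed() >= 2 for _ in range(TRIALS))
    print("simulated P(>=2 fail):", doubles / TRIALS)
    print("independence predicts:", 3 * P_INDEPENDENT ** 2)

With these made-up numbers the common cause alone contributes about 5e-5 to
the double-failure rate, more than ten times the 3e-6 that the independence
assumption predicts.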
------------------------------

Date: Sun, 22 Jan 89 14:20:00 EST
From: mfci!colwell@uunet.UU.NET (Robert Colwell)
Subject: Probabilities (Re: RISKS-8.12)
Organization: Multiflow Computer Inc., Branford Ct. 06405

There is a definite danger to this analysis, stemming mostly from its
essential correctness. There was a plane within the last two years (if
memory serves) that lost all three of its engines on a flight precisely
because such events are not necessarily independent. It turned out that the
same mechanic had worked on all three engines and made the same mistake on
each (left off an oil seal, I think).

Another example is the nuclear reactor fire of a couple of years ago, where
all the redundant control wiring was for nought because somebody routed it
all through the same conduit, so it was all destroyed at the same time.

One must be extremely careful with abstract analyses like these -- they can
be seductive, and they can lead to unjustified conclusions.

------------------------------

Date: 23 Jan 89 10:05:05 PST (Mon)
From: mao@blia.UUCP (Mike Olson)
Cc: buck@siswat.UUCP, LordBah@cup.portal.com
Subject: real discrete probability and airplanes

as at least two people have pointed out, my analysis of the likelihood of
failure was wrong. i claimed that the probability of two engines failing
out of three was 6(p**2); the correct answer, of course, is 3(p**2). thanks
to A. Lester Buck (siswat!buck) and LordBah@cup.portal.com for pointing out
my error in a way befitting the kinder, gentler nation we now live in.

it's not quite clear what i was computing, but it certainly wasn't
probability. it wasn't even conditional probability, since i got the
independence argument wrong. it's important to remember one of the real
risks of the network -- the potential for embarrassing yourself in front of
hundreds (thousands?) of intelligent people. next time, i check my work.

mike olson, britton lee, inc.    ...!ucbvax!mtxinu!blia!mao

   [Also noted by Mike Wescott (m.wescott@ncrcae.Columbia.NCR.COM) and
   Dale Worley.  PGN]

------------------------------

Date: Mon, 23 Jan 89 10:11:29 EST
From: worley@compass.UUCP (Dale Worley)
Subject: Probability

Actually, given that the probability of an engine failing during the trip,
year, etc. is p, and the probability of it not failing is q = 1 - p, then:

  the probability of 0 engines failing is          q**3
  the probability of exactly 1 engine failing is   3 p q**2
  the probability of exactly 2 engines failing is  3 p**2 q
  the probability of all 3 engines failing is      p**3

Given (we hope!) that p is very small, q is essentially 1, and
p >> p**2 >> p**3, so we can approximate:

  the probability of (at least) 1 engine failing is   3 p
  the probability of (at least) 2 engines failing is  3 p**2

The trouble with "the probability of one engine failing is ... and the
probability of one of the remaining two failing is ..." is that it
double-counts the failures. For instance, the probability of engine A
failing *then* engine B is approximately (1/2) p**2, not p**2 as assumed by
the previous poster -- the other (1/2) p**2 of the time, engine B fails
before engine A.

Dale Worley, Compass, Inc.    compass!worley@think.com
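
Worley's figures are easy to verify mechanically. A minimal check in
Python (p = 0.01 is an arbitrary illustrative value; math.comb supplies the
binomial coefficients):

    from math import comb  # Python 3.8+

    p = 0.01  # arbitrary per-engine failure probability
    q = 1 - p

    # Exact binomial terms for k failures out of 3 independent engines.
    for k in range(4):
        print(f"P(exactly {k} fail) = {comb(3, k) * p**k * q**(3-k):.8f}")

    # The approximations quoted in the message.
    print("P(>=1 fail) =", 1 - q**3, "~ 3p =", 3 * p)
    print("P(>=2 fail) =", 3 * p**2 * q + p**3, "~ 3p**2 =", 3 * p**2)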
------------------------------

End of RISKS-FORUM Digest 8.14
************************