Archive for the ‘computer_law’ Category

An award from ACM

Tuesday, March 2nd, 2010

The Association for Computing Machinery’s Computers & Society Special Interest Group just honored me with their “Person Who Made a Difference” award. Here’s what it’s about:

Making a Difference Award

This award is presented to an individual who is widely recognized for work related to the interaction of computers and society. The recipient is a leader in promoting awareness of ethical and social issues in computing. The recipients of this award and the award itself encourage responsible action by computer professionals.

This is a once-in-a-lifetime award, which goes to (at most) one person per year.

ACM has printed a biographic summary that highlights the work that led to this award. We’ll probably add an interview soon.

The award presentation and talk will probably be at the Computers, Freedom & Privacy conference in San Jose in June.

I’ll probably also be talking about this work at a QAI meeting in Chicago, this May.

A few new articles

Saturday, December 26th, 2009

I finally found some time to update my website, posting links to some more of my papers and presentations.

There are a few themes:

  • Investment modeling as a new exemplar. Software testing helps us understand the quality of the product or service under test. There are generically useful approaches to test design, like quicktests and tours and other basic techniques, but I think we add our greatest value when we apply a deeper understanding of the application. Testing instructors don’t teach this well because it takes so long to build a classroom-wide understanding of an application that is complex enough to be interesting. For the last 14 months, I’ve been exploring investment modeling as a potential exemplar of deep and interesting testing within a field that many people can grasp quickly.
  • Exploratory test automation. I don’t understand why people say that exploratory testing is always manual testing. Doug Hoffman and I have been teaching “high-volume” test techniques for twelve years (I wrote about some of these back in Testing Computer Software) that don’t involve regression testing or test-by-test scripting. We run these to explore new risks; we change our parameters to shift the focus of our search (to try something new or to go further in depth if we’re onto something interesting). This is clearly exploratory, but it is intensely automated. I’m now using investment modeling to illustrate this, and starting to work with Scott Barber to use performance modeling to illustrate it as well. Doug is working through a lot of historical approaches; perhaps the three of us can integrate our work, along with a lot of interesting work published by other folks, into something that more clearly conveys the general idea.
  • Instructional design: Teaching software testing. Rebecca Fiedler, Scott Barber and I have worked through a model for online education in software testing that fosters deeper learning than many other approaches. The Association for Software Testing has been a major testbed for this approach. We’ve also been doing a lot in academic institutions, comparing notes in detail with faculty at other schools.
  • The evolving law of software quality. Federal and state legislatures have failed to adopt laws governing software contracting and software quality. Because of this, American judges have had to figure out for themselves what legal rules should be applied–until Congress or the state legislatures finally get around to giving clear and constitutional guidance to the courts. This spring, the American Law Institute unanimously adopted the Principles of the Law of Software Contracts, which includes some positions that I’ve been advocating for 15 years. The set of papers below includes some discussion of the Principles. In addition, I’m kicking off a wiki-based project to update my book, Bad Software, to give customers good advice about their rights and their best negotiating tactics under the current legal regime. I’ll blog more about this later, looking for volunteers to help update the book.
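As a rough illustration of the high-volume, exploratory automation described in the second theme above, here is a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the function names, the planted defect, and the tolerance are assumptions, not taken from any of the papers listed below. The idea is to run many randomly generated inputs against a slow but trustworthy reference, then change the parameters to shift the focus of the search.

```python
import random


def reference_compound(principal, rate, periods):
    """Slow but easy-to-trust oracle: apply the rate one period at a time."""
    value = principal
    for _ in range(periods):
        value *= 1.0 + rate
    return value


def compound_under_test(principal, rate, periods):
    """Hypothetical implementation under test, with a planted defect."""
    if periods > 360:  # planted defect: silently caps long horizons
        periods = 360
    return principal * (1.0 + rate) ** periods


def explore(trials, max_periods, rate_range, seed=0, tolerance=1e-6):
    """Run many random comparisons; return the inputs where results disagree."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        principal = rng.uniform(1.0, 1e6)
        rate = rng.uniform(*rate_range)
        periods = rng.randint(0, max_periods)
        expected = reference_compound(principal, rate, periods)
        actual = compound_under_test(principal, rate, periods)
        # Compare with a relative tolerance to allow for rounding differences.
        if abs(actual - expected) > tolerance * max(1.0, abs(expected)):
            failures.append((principal, rate, periods))
    return failures


# First pass: a narrow search over short horizons.
first_pass = explore(trials=2000, max_periods=120, rate_range=(0.0, 0.02))
# Second pass: widen one parameter to explore a different risk.
second_pass = explore(trials=2000, max_periods=600, rate_range=(0.0, 0.02))

print(len(first_pass), "disagreements with short horizons")
print(len(second_pass), "disagreements after widening the horizon")
```

No test here is scripted in advance. The tester picks a risk, sets the search parameters, reads the results, and decides where to look next, which is what makes the work exploratory even though the machine runs every test.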

Here’s the list of new stuff:

  1. Cem Kaner, “Exploratory test automation: Investment modeling as an example.” [SLIDES]. ImmuneIT, Amsterdam, October 2009.
  2. Cem Kaner, “Investment modeling: A software engineer’s approach.” [SLIDES]. Colloquium, Florida Institute of Technology, October 2009.
  3. Cem Kaner, “Challenges in the Evolution of Software Testing Practices in Mission-Critical Environments.” [SLIDES]. Software Test & Evaluation Summit/Workshop (National Defense Industrial Association), Reston VA, September 2009.
  4. Cem Kaner, “Approaches to test automation.” [SLIDES]. Research in Motion, Kitchener/Waterloo, September 2009.
  5. Cem Kaner, “Software Testing as a Quality-Improvement Activity” [SLIDES]. Lockheed Martin / IEEE Computer Society Webinar Series, September 2009.
  6. Rebecca L. Fiedler & Cem Kaner, “Putting the context in context-driven testing (an application of Cultural Historical Activity Theory)” [SLIDES]. Conference of the Association for Software Testing. Colorado Springs, CO., July 2009.
  7. Cem Kaner, “Metrics, qualitative measurement, and stakeholder value” [SLIDES]. Tutorial, Conference of the Association for Software Testing. Colorado Springs, CO., July 2009.
  8. Cem Kaner, “The value of checklists and the danger of scripts: What legal training suggests for testers.” [SLIDES]. Conference of the Association for Software Testing. Colorado Springs, CO., July 2009.
  9. Cem Kaner, “New rules adopted for software contracts.” [SLIDES]. Conference of the Association for Software Testing. Colorado Springs, CO., July 2009.
  10. Cem Kaner, “Activities in software testing education: a structure for mapping learning objectives to activity designs.” Software Testing Education Workshop (International Conference on Software Testing), Denver, CO, April 2009.
  11. Cem Kaner, “Plagiarism-detection software: Clashing intellectual property rights and aggressive vendors yield dismaying results.” [SLIDES] [VIDEO]. Colloquium, Florida Institute of Technology, October 2009.
  12. Cem Kaner, “Thinking about the Software Testing Curriculum.” [SLIDES]. Workshop on Integrating Software Testing into Programming Courses, Florida International University, March 2009.
  13. Cem Kaner (initial draft), “Dimensions of Excellence in Research.” Department of Computer Sciences, Florida Institute of Technology, Spring 2009.
  14. Cem Kaner, “Patterns of activities, exercises and assignments.” [SLIDES]. Workshop on Teaching Software Testing, Melbourne FL, January 2009.
  15. Cem Kaner & Rebecca L. Fiedler, “Developing instructor-coached activities for hybrid and online courses.” [SLIDES]. Workshop at Inventions & Impact 2: Building Excellence in Undergraduate Science, Technology, Engineering & Mathematics (STEM) Education, National Science Foundation / American Association for the Advancement of Science, Washington DC, August 2008.
  16. Cem Kaner, Rebecca L. Fiedler, & Scott Barber, “Building a free courseware community around an online software testing curriculum.” [SLIDES]. Poster Session at Inventions & Impact 2: Building Excellence in Undergraduate Science, Technology, Engineering & Mathematics (STEM) Education, National Science Foundation / American Association for the Advancement of Science, Washington DC, August 2008.
  17. Cem Kaner, Rebecca L. Fiedler, & Scott Barber, “Building a free courseware community around an online software testing curriculum.” [SLIDES]. MERLOT conference, Minneapolis, August 2008.
  18. Cem Kaner, “Authentic assignments that foster student communication skills” [SLIDES], Teaching Communication Skills in the Software Engineering Curriculum: A Forum for Professionals and Educators (NSF Award #0722231), Miami University, Ohio, June 2008.
  19. Cem Kaner, “Comments on the August 31, 2007 Draft of the Voluntary Voting System Guidelines.” Submitted to the United States Election Assistance Commission, May 2008.
  20. Cem Kaner and Rebecca L. Fiedler, “A cautionary note on checking software engineering papers for plagiarism.” IEEE Transactions on Education, vol. 51, issue 2, 2008, pp. 184-188.
  21. Cem Kaner, “Software testing as a social science,” [SLIDES] STEP 2000 Workshop on Software Testing, Memphis, May 2008.
  22. Cem Kaner & Stephen J. Swenson, “Good enough V&V for simulations: Some possibly helpful thoughts from the law & ethics of commercial software.” [SLIDES] Simulation Interoperability Workshop, Providence, RI, April 2008.
  23. Cem Kaner, “Improve the power of your tests with risk-based test design.” [SLIDES] QAI QUEST Conference, Chicago, April 2008.
  24. Cem Kaner, “Risk-based testing: Some basic concepts.” [SLIDES] QAI Managers Workshop, QUEST Conference, Chicago, April 2008.
  25. Cem Kaner, “A tutorial in exploratory testing.” [SLIDES] QAI QUEST Conference, Chicago, April 2008.
  26. Cem Kaner, “Adapting Academic Course Materials in Software Testing for Industrial Professional Development.” [SLIDES] Colloquium, Florida Institute of Technology, March 2008.
  27. Cem Kaner, “BBST at AST: Adaptation of a course in black box software testing.” [SLIDES]. Workshop on Teaching Software Testing, Melbourne FL, January 2008.
  28. Cem Kaner, “BBST: Evolving a course in black box software testing.” [SLIDES] BBST Project Advisory Board Meeting, January 2008.

Principles of the Law of Software Contracts approved

Thursday, May 21st, 2009

“The Proposed Final Draft of the Principles of the Law of Software Contracts was approved, subject to the discussion at the meeting and to editorial prerogative. Approval of the draft clears the way for publication of the official text of this project.” (from a report to members on the actions taken this week at the American Law Institute’s annual meeting.)

I’ll talk more about these at CAST, this summer.

This document will influence court decisions across the United States. It is the counterbalance to the Uniform Computer Information Transactions Act, which vastly increased software sellers’ powers and virtually wiped out customers’ ability to hold companies accountable for bad software. UCITA passed in 2 states, then died because it was so widely seen as unbalanced.

The main provisions of the Principles that affect us:

(1) Companies will be required to reveal known defects at time of sale.

(2) Reverse engineering will be more legally defensible. People will now have ALMOST as much right to reverse engineer software, in the United States, as they have for every other kind of product in the US. This brings us closer to international standards, making our development efforts less uncompetitive compared to most other high-tech countries.

I helped write the Principles. I wish I could give you more details about the discussion at the meeting (and will be able to by CAST). Unfortunately, I got a nasty virus last week and could not travel.

One comment. Earlier in the week, there was a lot of baloney on the web about a carefully timed letter from Microsoft and the Linux Foundation that (a) pleaded for delay because they said they needed more time to review the draft and (b) said that the disclosure requirements were very new and onerous.

Actually, the community has been aware of proposals for disclosure since 1995, when Ed Foster published widely-read articles on UCITA (then called Article 2B) in Infoworld, which were followed up by a lot of mass-media attention. There have been several follow-up reports to our community (software development, software testing) since then, including talks that I’ve given at previous CAST meetings.

In terms of awareness by LAWYERS, Microsoft has been involved in the drafting process for UCITA and the ALI Principles for longer than I have (I started working on this in 1995; I think they started in 1989). The Open Source communities have been more variable in their activism on these laws, but several attorneys within that community have been active. More to the point, the Principles specifically exempt open source software from the disclosure rule because the distribution models (and availability of code) are so different from traditional proprietary software. The Microsoft/Linux Foundation letter also complained that the ALI is treating all software transfer as if it were packaged software. This is a criticism that was applied to early drafts of UCITA (which Microsoft, Apple and IBM played heavy roles in writing) but that was pretty much cleared up before UCITA was introduced to state legislatures in 2001. The ALI Principles were started after that, well after everyone in the process understood the variety of distribution models for software. Letters like this make good copy on slashdot and in blogs where authors don’t know much about the law or history of the work they’re blogging about, but as serious criticisms, they seem devoid of merit.

Ed Foster is dead–A great loss for mass-market computing

Tuesday, July 29th, 2008

Ed Foster just died.

Ed was one of the great journalists of Silicon Valley. He listened. He read. He asked probing questions. He changed his mind when the evidence proved him wrong. He understood the computer and information industries from (at least) a dozen perspectives. And he could explain their perspectives to each other.

Ed was part of the heart of Silicon Valley in the early years of the small computer revolution. He was a whirlwind of well-informed enthusiasm. He taught us about the culture and the values of the Valley. Consumers, hackers, publishers, marketers were all entertained and informed by him. Directly and indirectly, he shaped our thinking about the potential of this new technology and the responsibilities of these new technologists within the technology-enthusiast society and the broader American society.

I think I first met Ed in the 1980’s at one of the trade shows, but I didn’t have the privilege of talking with him in depth, then of collaborating with him, until the mid-1990’s.

Ed was the first journalist to publicize a series of projects to rewrite the commercial laws governing computers and software.

These new laws were being presented to people, mainly to the broad legal community (no one else was listening in those early days)(well, almost no one–Ed was listening) as if they were a careful balance of the rights of consumers, small business customers, small software developers, the open source community, and bigger software publishers and hardware makers. On the surface, they looked that way. Beneath that surface was a new legal regime designed to give software publishers, database publishers and large software consulting firms a panoply of new rights and defenses.

I was a newly-graduated lawyer when Ed’s comments alerted me to the new stuff on the horizon. I went to school intending to work on the law of software quality. Following up Ed’s leads shaped my career.

As Ed (with some help from me) caught a glimmer of the scope and significance of these proposals, Ed got to work. He read voraciously. He came to meetings. He interviewed and interviewed and interviewed people. He checked facts. He checked his assumptions and conclusions. He took advice from people who disagreed with him as well as from those who agreed. He wrote with care and credibility. His leadership brought dozens of other journalists into coverage of the nuts and bolts of development of highly technical commercial law–something almost never covered by the press. He helped them understand what they were seeing. He was A Force To Be Reckoned With, not because he represented people with power or money but because he did his homework and knew how to explain what he knew to ordinary readers. The most visible bills of this group were the Uniform Computer Information Transactions Act (which was ultimately rejected by 48 of 50 States) and the Uniform Electronic Transactions Act (which improved tremendously under the bright light of public scrutiny. You might know it better under its federal name, ESIGN. It governs electronic commerce in every State). Without Ed, neither result would have happened.

One of the proposals that Ed embraced with passion was the idea that software companies should be required to disclose their known defects. There’s a natural justice in the idea that a company that knows about a bug but won’t tell its customers about it should be responsible to those customers for any losses caused by the known bug. You can’t find every bug. But if you honestly and effectively disclose, your customers can at least work around the bugs you know about (or buy a product whose bugs are less serious). Ed could make this natural justice clear and obvious. The implementation of the idea (writing it into a set of laws) is complex–you can easily get lost in the difficult details–but Ed could stand above that and remind people why the implementation was worth the effort. I was initially a skeptic–I favored the idea but saw other matters as more critical. Ed turned my priorities around, leading me gradually to understand that there is no real competition and no hope for justice in a marketplace that allows vendors to hide fundamental information from the people who most need it.

The UCITA drafters dismissed this as naive, unreasonable, excessively burdensome, or impossible to do. But now that UCITA has failed, a major new drafting project (the American Law Institute’s Principles of the Law of Software Contracts) has picked up the idea. It will probably appear in serious legislation in 2012 or so, almost twenty years after Ed started explaining it to people. I am sad that Ed will miss seeing this come to fruition. Among his many gifts to American society, this is an important one.

In this decade, Ed has been one of this industry’s extremely few consumer advocates. Since 2000, when consumer protection at the Federal level went dark and State-level protection continued to vanish in the never-ending waves of tax cuts and litigation “reforms,” I have learned more about the pulse of consumer problems from Ed’s alerts than from any other source.

I teach courses on computer law and ethics these days, to budding software engineers. Ed’s work provided perfect starting points for many students. Some probably learned more about professionalism and ethics from Ed’s writing than from anything in their textbooks or my lectures.

Ed wrote as a voice of the conscience of an industry that needs to find its conscience.

I am writing through tears as I say that he will be missed.

— Cem Kaner

Software error and the meltdown of US finances

Thursday, May 22nd, 2008

“LONDON, May 21 (Reuters) – A computer coding error led Moody’s Investors Service to assign incorrect triple-A ratings to a complex debt product that came to mark the peak of the credit boom, the Financial Times said on Wednesday.” (See www.forbes.com/reuters/feeds/reuters/2008/05/21/2008-05-21T075644Z_01_L21551923_RTRIDST_0_MOODYS-CPDOS.html. For more, see blogs.spectrum.ieee.org/riskfactor/2008/05/moodys_rating_bug_gives_credit.html and www.ft.com/cms/s/0/0c82561a-2697-11dd-9c95-000077b07658.html?nclick_check=1 or just search for Moody’s software error.)

This is the kind of stuff David Pels and I expected when we fought the Uniform Computer Information Transactions Act (UCITA) back in the 1990’s. UCITA was written as a shield for large software publishers, consulting firms, and other information publishers. It virtually wiped out liability for defects in information-industry products or services, expanded intellectual property rights well beyond what the Copyright Act and the patent laws provide, and helped companies find ways to expand their power to block reverse engineering of products to check for functional or security defects and to publicly report those defects.

UCITA was ultimately adopted only in Virginia and Maryland, rejected in all other American states, but largely imported into most states by judicial rulings (a fine example of “judicial activism”–imposing rules on a state even after its legislators rejected them. People who still whine about left-wing judicial activism are still stuck in the 1970’s).

David Pels and I wrote a book, “Bad Software,” on the law of software quality circa 1998. It provides a striking contrast between software customers’ rights in the 1990s and the vastly-reduced rights we have come to expect today, along with some background on the UCITA legislation (UCITA was then called “Article 2B”–as part of a failed effort to add a new Article to the Uniform Commercial Code). John Wiley originally published Bad Software, but they have let me post a free copy on the web. You can find it at http://www.badsoftware.com/wiki/

Political activism

Monday, May 19th, 2008

As you’ve been able to see for a while, from his campaign picture on my blog, I support Barack Obama for U.S. president.

This blog isn’t the vehicle that I want to use for political discussion. Instructional theory, engineering practice, and the legal context (the evolution of the law of software quality) yes. Election-year politics, no.

If you’re interested in my more political views, see my new blog at http://my.barackobama.com/page/community/blog/cemkaner

New Discussion List: Review the ALI Principles of the Law of Software Contracts

Tuesday, July 24th, 2007

New discussion group:
http://groups.yahoo.com/group/review_sw_contract_law

At the Conference of the Association for Software Testing, I presented an introduction to the Principles of the Law of Software Contracts, a project-in-the-works of the American Law Institute that I believe will frame this decade’s (2006-2016) debate over the regulation / enforcement of software contracts in the USA. For more information, see my blog, https://kaner.com/?p=19 and my CAST slides, https://kaner.com/pdfs/Law%20of%20Software%20Contracting.pdf

The ALI is open to critical feedback of its drafts. The Principles project will continue for several years.

The ALI is an association of lawyers, not software developers. If you want the law to sensibly guide engineering practice, you’re going to have to help these lawyers figure out what the implications of their proposals are and, where those implications are bad, what tweaks they might make to achieve their objectives more cleanly.

Note: Free Software Contracts are software contracts too. The same laws that govern enforceability of Microsoft’s software licenses govern the free software licenses. If you don’t think that gives people a chance to make mischief for GPL and Creative Commons, well, think again.

My goal is for this list to help people develop feedback memos for review by the ALI drafting committee.

Membership in this list is open, but messages are tightly moderated. I expect most members will be software development practitioners or academics, with a few journalists and lawyers thrown in. I intend a relatively low volume list with zero spam and reasonably collegial discussion. I am not at all opposed to the idea of developing contradictory sets of comments for ALI (one group submits one set, another group submits another set that completely disagrees with the first) but I will push for development of comments that are well reasoned and well supported by facts.

— Cem Kaner

A first look at the proposed Principles of the Law of Software Contracts

Sunday, June 24th, 2007

The American Law Institute (ALI) is writing a new Principles of the Law of Software Contracts (PLSC), to replace the failed Uniform Computer Information Transactions Act (UCITA). I recently attended ALI’s annual meeting, at which we reviewed the Principles, and I am giving a first status report at the Conference of the Association for Software Testing (CAST, July 9-10).
The Principles raise important issues for our community. For example:

  • They apply a traditional commercial rule to software contracting–a vendor who knows of a hidden material defect in its product but sells the product without disclosing the defect is liable to the customer for damage (including expenses) caused by the defect.
    • A defect is hidden if a reasonably observant person in the position of the buyer would not have found it in an inspection that one would expect a reasonably diligent customer to perform under the circumstances.
    • A defect is material if it is serious enough to be considered a significant breach of the contract.

    I think this is an excellent idea. It reflects the fact that no one can know all of the bugs in their product and lets vendors shield themselves from liability for defects they didn’t know about, but it demands that vendors reveal the serious bugs that they know about, so that customers can (a) make better informed purchase decisions and (b) avoid doing things that have serious effects.

I think we could help clarify it:

    • When should we hold a software company responsible for “knowing” about a defect? Is an irreproducible bug “known”? What about a bug that wasn’t found in the lab but a customer reported it? One customer reported it? 5000 customers reported it? How long should we allow the company to investigate the report before saying that they’ve had enough time to be held responsible for it?
    • What counts as disclosure to the customer? The bugs in Firefox and OpenOffice are disclosed in their open-to-the-public bug databases. Is this good enough? Maybe it is for these, but I tried figuring out the bugs published in Apache’s database and, for many reports, had no clue what these people were writing about. Serious problems were reported in ways that tied closely to the implementation and not at all to anything I would know to do or avoid. For their purpose (help the developers troubleshoot the bug), these reports might have been marvelous. But for disclosure to the customer? What should our standards be?
    • What is a material defect anyway? Do the criteria differ depending on the nature of the product? Is a rarely-occurring crash more material in a heart monitor than a word processor? And if a bug corrupts data in a word processor, do we have different expectations from Word, OpenOffice Writer, Wordpad, and my 12-year-old niece’s StudentProjectEditor?
    • What about the idea of security by obscurity? That some security holes won’t be exploited if no one knows about them, and so we should give the vendor a chance to fix some bugs before they disclose them? This is a controversial idea, but there is evidence that at least some problems are much more exploited after they are publicized than before.
  • Another issue is reverse engineering. Historically, reverse engineering of products (all products–hardware, software, chemical, mechanical, whatever) has been fair game. American know-how has a lot of “building a better mousetrap” in it, and to build a better one, you start by reverse engineering the current one. There have been some very recent, very expansive rulings (such as Baystate v Bowers) that have enforced EULA-based restrictions against software reverse engineering, restrictions so broad that they would exclude black box testing (for example, for a magazine review).
    • The Principles lay out several criteria under which unreasonable restrictions could be ruled unenforceable by a judge. To me, these seem like reasonable criteria: a contract clause is unenforceable if it conflicts with federal or state law or public policy (for example, as expressed in the constitution and statutes), is outrageously unfair, or would be considered an abuse of copyright or patent under (mainly) antitrust doctrines.
    • Does it make sense for us to identify solid examples of contexts in which reverse engineering should be permissible (for example, poses absolutely no threat to the vendor’s legitimate interests) and others in which the vendor might have more of a basis and rationale for seeking protection? We can identify these shades of gray with much more experience-based wisdom than lawyers who don’t even know what a loop is, let alone how to code their way out of one.

There are plenty of other issues. CAST will provide an opening forum for discussion–remember, CAST isn’t just a place for presentations. We make time for post-presentation discussion, for hours if the conference participants want to go that long.

Software Customer Bill of Rights

Wednesday, August 27th, 2003

As the software infrastructure has been going through chaos, reporters (and others) have called me several times to ask what our legal rights are now and whether we should all be able to sue Microsoft (or other vendors who ship defective software or software that fails in normal use).

Unfortunately, software customer rights have eroded dramatically over the last ten years. Ten years ago, the United States Court of Appeals for the Third Circuit flatly rejected a software publisher’s attempts to enforce contract terms that it didn’t make available to the customer until after the customer ordered the software, paid for it, and took delivery. Citing sections of the Uniform Commercial Code’s Article 2 (Law of Sales) that every law student works through in tedious detail in their contracts class, the Court said that the contract for sale is formed when the customer agrees to pay and the seller agrees to deliver the product. Terms presented later are proposals for modification to the contract. The customer has the right to keep the product and use it under the original terms, and refuse to accept the new, seller-favorable terms. Other courts (such as the United States Court of Appeals for the First Circuit) cited this case as representative of the mainstream interpretation of Article 2. Under this decision, and several decisions before it, shrinkwrapped contracts and clickwrapped contracts (the ones you have to click “OK” to in order to install the product) would be largely unenforceable.

The software publishing community started aggressively trying to rewrite contract law in about 1988, after the United States Court of Appeals for the Fifth Circuit rejected a shrinkwrapped restriction on reverse engineering. That effort resulted in the Uniform Computer Information Transactions Act and a string of court decisions, starting in 1995, that make it almost impossible to hold a software company liable for defects in its product (unless the defect results in injury or death), even for defects that it knew about when it shipped the product, and also very difficult to hold a mass-market seller liable for false claims about its product. (For background, see InfoWorld and Kaner’s Software Engineering & UCITA in the section on Forcing Products Liability Suits into Arbitration).

So what should we do about this? Some people feel strongly that companies should be held fully accountable for losses caused by their products’ defects.

I’d rather stand back from the current crisis, consider the legal debates over the last 10 years, and make some modest suggestions that could go a long way toward restoring integrity and trust — and consumer confidence, consumer excitement, and sales — in this stalled marketplace.

1. Let the customer see the contract before the sale. It should be easy for customers of mass-market software products and computer information contracts to compare the contract terms for a product, or for competing products, before they download, use, or pay for a product. (NOTE: This is not a radical principle. American buyers of all types of consumer products that cost more than $15 are entitled to see the contract (at a minimum, the warranties in the contract) before the sale).

2. Disclose known defects. The software company or service provider must disclose the defects that it knows about to potential customers, in a way that is likely to be understood by a typical member of the market for that product or service.

3. The product (or information service) must live up to the manufacturer’s and seller’s claims. A statement by the vendor (manufacturer or seller) about the product that is intended to describe the product to potential customers is a warranty, a promise that the product will work as described. Warranties by sellers are defined in UCC Article 2 Section 313. Manufacturer liability is clarified (manufacturers are liable for claims they make in ads and in the manual) in a set of clarifying amendments to Article 2 that have now been approved by the Permanent Editorial Board for the UCC, which will probably be introduced in state legislatures starting early in 2004. In addition, it is a deceptive trade practice in most states (perhaps all) to make claims about the product that are incorrect and make the product more attractive. For example, under the Uniform Deceptive Trade Practices Act, Section 2(5) it is unlawfully deceptive to represent “that goods or services have sponsorship, approval, characteristics, ingredients, uses, benefits, or quantities that they do not have.” UCITA was designed to pull software out of the scope of laws like this, which it did by defining software transactions as neither goods nor services but licenses. We should get rid of this cleverly created ambiguity.

4. User has right to see and approve all transfers of information from her computer. Before an application transmits any data from the user’s computer, the user should have the ability to see what’s being sent. If the message is encrypted, the user should be shown an unencrypted version. On seeing the message, the user should be able to refuse to send it. This may cause the application to cancel a transaction (such as a sale that depends on transmission of a valid credit card number), but transmission of data from the user’s machine without the user’s knowledge or in spite of the user’s refusal should be prosecutable as computer tampering.

5. A software vendor may not block customer from accessing his own data without court approval.

6. A software vendor may not prematurely terminate a license without court approval. The issue of vendor self-help (early termination of a software contract without a supporting court order) was debated at great length through the UCITA process. To turn off a customer’s access to software that runs on the customer’s machine, the vendor should get an injunction (a court order). However, perhaps a vendor should be able to deny a customer access to software running on the vendor’s machine without getting an injunction (though the unfairly-terminated customer should be allowed to get a court order to restore its access.)

7. Mass-market customers may criticize products, publish benchmark study results, and make fair use of a product. Some software licenses bar the customer from publishing criticisms of the product, or publishing comparisons of this product with others or using screenshots or product graphics to satirize or disparage the product or the company. Under the Copyright Act, you are allowed to reproduce part of a copyrighted work in order to criticize it, comment on it, teach from it, and so on. Software publishers shouldn’t be able to use “license” contracts to bar their mass-market customers from the type of free speech that the Federal laws (including the Copyright Act) have consistently protected.

8. The user may reverse engineer the software. Software licenses routinely ban reverse engineering, but American courts routinely say that reverse engineering is fair use, permissible under the Copyright Act. Recently, California courts have started enforcing no-reverse-engineering bans in software licenses. This is a big problem. Software publishers claim that reverse engineering is a way to steal their work. There are many legitimate, important uses of reverse engineering, such as exposing security holes in the software, exposing and fixing bugs (that the manufacturer might not fix because it is unwilling, unable, or no longer in business), exposing copyright violations or fraudulent claims by the manufacturer, or achieving interoperability (making the product work with another product or device). These benefit or protect the customer but do not help anyone unfairly compete with the manufacturer.

9. Mass-market software should be transferable. Under the First Sale Doctrine, someone who buys a copyrighted product (like a book) can lend it, sell it, or give it away without having to get permission of the original publisher or author. Similarly, if you buy a car, you don’t have to get the car manufacturer’s permission to lend, sell, or donate your car. UCITA Section 503(2) allows mass-market software publishers to take away their customers’ rights to transfer software that they’ve paid for. It should not.

10. When software is embedded in a product, the law governing the product should govern the software. Think of the software that controls the fuel injectors in a car. Should the car manufacturer be allowed to license this software instead of supplying it under the basic contract for the sale of the car? (Paper 1) (Paper 2). Under extended pressure from the software industry, the Article 2 amendments specify that software (information) is not “goods” and so is not within the scope of Article 2, even though courts have been consistently applying Article 2 to packaged software transactions since 1970. In the 48 states that have not adopted UCITA, this amendment would mean that there is no law in that state that governs transactions in software. The courts would have to reason by analogy, either to UCITA or to UCC 2 or to something else. When a product includes both hardware (the car) and software (the fuel injector software, braking software, etc.), amended Article 2 allows the court to apply Article 2 to the hardware and other law to the software. Thus different warranty rules could apply, and even though you could sell your car used without paying a fee to the manufacturer, you might not be able to transfer the car’s software without paying that fee. Vendors should not be able to play these kinds of games. “Embedded software” is itself a highly ambiguous term. In those cases in which it is unclear whether software is embedded or not, the law should treat the software as embedded.

SWEBOK Problems, Part 2

Friday, June 27th, 2003

I’m going through my detailed review of SWEBOK, in preparation for the June 30 comment deadline. The bulk of this blog entry is a page-by-page commentary / critique that I will submit to the SWEBOK review. Before that, here are some contextual comments.

Please get involved in this review process, which will close on June 30. Go to www.swebok.org to sign up, download SWEBOK, and submit comments.

Time is short, and you might not be able to read all of SWEBOK in time to submit detailed comments. That’s OK. I recommend that you download it, skim the parts that are most interesting, realize the extent to which it excludes modern methods (such as agile development) and, if this bothers you, you can submit a very simple comment.

You can say something like:

“I have reviewed SWEBOK. I manage software development staff
and play a role in their training and supervision. SWEBOK does not
provide a good basis for the structure or detail of the knowledge
that I want my staff to have. It emphasizes attitudes and practices
that are not helpful on my projects and it downplays or skips
attitudes and practices that I consider essential. I consider this
document fundamentally flawed, and if I could vote to disapprove it,
I would.”

Obviously, you would tailor this to your circumstances.

=======

Overall Concerns with SWEBOK

=======

SWEBOK was created using a strange process. They started with the table of contents of the main software engineering textbooks — as if there is a strong relationship between software engineering as described in textbooks and software engineering as practiced in the field. From there, SWEBOK developed as deltas from these books. SWEBOK is focused on “established traditional practices recommended by many organizations” and is intended to exclude “practices used only for certain types of software” and to exclude “innovative practices tested and used by some organizations and concepts still being developed and testing in research organizations.”

Somehow, we conclude that mutation testing is an established traditional practice that is widely recommended and used, but we exclude scenario testing. We conclude that massive tomes of test documentation are an established traditional practice widely followed, even though rants about bad test documentation are, to say the least, a common theme of comment in the community. And we exclude consideration of requirements analytical techniques (or project context considerations) that might help you make a sensible engineering determination of what types of documentation, at what level of depth, for what target reader, are worth the expense of creating and (possibly) maintaining them.

In the SWEBOK, page IX, we learn that the purpose of SWEBOK is to provide a “consensually-validated characterization.” In this, SWEBOK has failed utterly. Only a few people (about 500) were involved in the project. It alienated leading people, such as Grady Booch, who recently said (in a post to the extremeprogramming listserv on yahoogroups, dated 5/31/2003):

“I was one of those 500 earlier reviewers – and
my comments were entirely negative. The SWEBOK
I reviewed was well-intentioned but misguided,
naive, incoherent, and just flat wrong in so
many dimensions.”

The Association for Computing Machinery was a co-authoring, co-sponsoring organization of SWEBOK at one point. But ACM eventually commissioned task forces to study the document and the rationale underlying the effort, and the result was a deeply critical evaluation and ACM’s withdrawal from the project.

ACM is the largest association of computing professionals in the world. How can it be said, with a straight face, that SWEBOK is a consensually-validated document when the ACM, including leaders of the ACM Special Interest Group in Software Engineering, determine that the approach to creating the document and the result are fundamentally flawed? See http://www.acm.org/serving/se_policy/ for details.

The SWEBOK response (front page of www.swebok.org) was this:

“The following motion was unanimously adopted
on April 18 2001.

“The Industrial Advisory Board of the Guide
to the Software Engineering Body of Knowledge
(SWEBOK) project recognizes that due process
was followed in the development of the Guide
(Trial Version) and endorses the position that
the Guide (Trial Version) is ready for field
trials for a period of two years.”

I love this phrasing. “Due process” has a fine, legalistic, officious ring to it. It sounds good, and (speaking as an attorney who has experience using lawyerly terms like “due process”) it will intimidate or silence some critics. But if your acceptance criterion is consensus, and you have obviously failed to achieve consensus, then a term like “due process” is just so much smoke to confuse the issue. If the process fails to produce the required product, the fact (if it is a fact) that the process was followed doesn’t make the failure a non-failure.

==========

Detailed Evaluation Comments

==========

Here are my page-by-page comments on the testing section of SWEBOK. I have reviewed other parts of SWEBOK and have concerns about them too, but life is short and precious and there is only so much of mine that I am willing to dedicate to a criticism of a fundamentally flawed piece of work.

===========

Page 69. The document praises the role of testing as a preventative technique throughout the lifecycle, but doesn’t consider test-driven development, which I believe is the single most important type of early testing.

============

Page 69. The document defines software testing as follows: “Software testing consists of the dynamic verification of the behavior of a program on a finite series of test cases, suitably selected from the usually infinite executions [sic] domain, against the specified expected behavior.”

In fact, a great deal of testing is done without specifying expected behavior. Here are three examples:

(1) Exploratory testing is done partially to discover the behavior.

(2) Some types of high volume random testing check for indicators of failure without having any model of expected behavior. (It would be ludicrous to say that their model of the expected behavior is that the program will not have memory leaks, stack corruption or other specific defects.)

(3) Most forms of user testing fail to involve comparison to specified behavior, and the user who protests that a certain behavior in a certain context is inappropriate, confusing or unacceptable, might well not be able to articulate her expectations, even after the failure, let alone specify them in advance. (In many cases, expectation is driven by similarity to other experiences and we know from research in cognitive psychology, e.g. from Lee Brooks’ lab at McMaster, that many people would be unable to describe the similarity space that is the basis for their judgments.)

These types of test are widely used by testers, and they have been widely used for decades. Good testing sometimes involves comparison to specified expected behavior, but it often does not.
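Example (2) can be made concrete with a minimal sketch in Python. Everything here is hypothetical: `run_length_encode` stands in for whatever is under test, and the two "symptoms" are only illustrations of failure indicators. The point is that the harness never says what the correct output for a given input is. It only watches for crashes and for signs that something went wrong.

```python
import random


def run_length_encode(text):
    """Hypothetical function under test: collapse runs into (char, count) pairs."""
    pairs = []
    for ch in text:
        if pairs and pairs[-1][0] == ch:
            pairs[-1] = (ch, pairs[-1][1] + 1)
        else:
            pairs.append((ch, 1))
    return pairs


def failure_indicators(text, pairs):
    """Return a list of symptoms. None of them states the expected output."""
    symptoms = []
    if any(count <= 0 for _, count in pairs):
        symptoms.append("non-positive run length")
    if sum(count for _, count in pairs) != len(text):
        symptoms.append("output size disagrees with input size")
    return symptoms


def high_volume_run(trials=10_000, seed=1234):
    """Throw many random inputs at the function and collect anything suspicious."""
    rng = random.Random(seed)  # fixed seed so a suspicious run can be replayed
    failures = []
    for _ in range(trials):
        text = "".join(rng.choice("ab") for _ in range(rng.randint(0, 20)))
        try:
            pairs = run_length_encode(text)
        except Exception as exc:  # a crash is itself a failure indicator
            failures.append((text, [f"crash: {exc!r}"]))
            continue
        symptoms = failure_indicators(text, pairs)
        if symptoms:
            failures.append((text, symptoms))
    return failures
```

Changing `seed`, the alphabet, or the input length shifts the focus of the search without writing a single expected result.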

=============

Page 70. The document provides a laundry list of test techniques with no obvious selection or exclusion principle.

One of the oddities on page 70 is the assertion that “branch coverage is a popular test technique.” HUH? What makes this a technique? You achieve branch coverage by running any group of tests that take the program through each branch. We could achieve this test objective (achieve a certain level of coverage) via scenario tests, domain tests, various other types of tests. We could achieve the objective by running tests at the unit level or the fully integrated system level. SWEBOK says that coverage “should not be considered _per_se_ as the objective of testing.” I share that opinion — it is a poor objective. But it appears to be the objective of many people who drive their testing in order to achieve this result. The fact that the authors of SWEBOK don’t like coverage as an objective doesn’t make it a technique.
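A small sketch may help show why coverage reads as an objective and not a technique. The function, its hand-rolled branch recording, and both test sets below are hypothetical (written in Python for illustration). Two test sets designed with quite different techniques in mind reach exactly the same branch coverage.

```python
ALL_BRANCHES = {"heavy:true", "heavy:false", "express:true", "express:false"}


def shipping_fee(weight_kg, express, taken):
    """Hypothetical function with two decisions; 'taken' records branch outcomes."""
    fee = 5.0
    if weight_kg > 10:
        taken.add("heavy:true")
        fee += 7.5
    else:
        taken.add("heavy:false")
    if express:
        taken.add("express:true")
        fee *= 2
    else:
        taken.add("express:false")
    return fee


def branches_covered(test_inputs):
    """Run a group of tests and report which branch outcomes they exercised."""
    taken = set()
    for weight_kg, express in test_inputs:
        shipping_fee(weight_kg, express, taken)
    return taken


# Designed by looking at the boundary of the weight variable (a domain-style design).
boundary_style_tests = [(10, False), (10.01, True)]

# Designed from stories about customers (a scenario-style design).
scenario_style_tests = [(2, True), (40, False)]
```

Both `branches_covered(boundary_style_tests)` and `branches_covered(scenario_style_tests)` equal `ALL_BRANCHES`. The coverage number says nothing about which technique produced the tests.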

Another strange page 70 assertion is that test techniques used primarily to expose failures are primarily domain testing. SWEBOK says, “These techniques variously attempt to “break” the program, by running one [or more] test[s] drawn from identified classes of (deemed equivalent) executions. The leading principle underlying such techniques is being as much systematic as possible in identifying a representative set of program behaviors (generally in the form of subclasses of the input domain).”

Yes, domain testing is the most commonly described technique in textbooks. It is simple, easy to understand, and easy to teach. But risk-based testing, scenario testing, stress testing, specification-focused testing, high-volume automated testing, state-model-based testing, transaction-flow testing, and heuristic-based exploratory testing are other examples of testing techniques that go after bugs in the product. Why ignore these in favor of domain testing?

Additionally, even though the textbooks most often talk in terms of subclasses of input domains, it is important and fruitful to also analyze the program in terms of its output domains, its interfaces with other devices (disk, printer, etc.) and other processes, and its internal intermediate-result variables. By focusing students (or worse, professionals) on input domains to the exclusion of the others, we virtually blind them to important problems. As the ACM pointed out in its evaluation of SWEBOK, a “body of knowledge” should be focused on competent practice, not on the descriptions in introductory books.
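As a hedged illustration of the output-domain point, consider this Python sketch. The limits and the "report field" are invented for the example. Boundary tests on each input variable, taken one at a time, all look fine, while tests chosen to push the result to its own boundary expose a problem.

```python
MAX_QTY = 999                          # assumed input limit
MAX_PRICE_CENTS = 99_999               # assumed input limit
MAX_PRINTABLE_TOTAL_CENTS = 9_999_999  # assumed width of a report field


def line_total_cents(qty, price_cents):
    """Hypothetical calculation under test."""
    return qty * price_cents


def fits_report_field(total_cents):
    """Output-domain check: can the result be printed in the report field?"""
    return 0 <= total_cents <= MAX_PRINTABLE_TOTAL_CENTS


# Input-domain boundaries: one variable at a limit, the other at a nominal value.
input_boundary_tests = [(1, 500), (MAX_QTY, 500), (10, 1), (10, MAX_PRICE_CENTS)]

# Output-domain boundaries: inputs chosen to drive the result to the field limit.
output_boundary_tests = [(100, 99_999), (101, 99_999), (MAX_QTY, MAX_PRICE_CENTS)]

input_results = [fits_report_field(line_total_cents(q, p)) for q, p in input_boundary_tests]
output_results = [fits_report_field(line_total_cents(q, p)) for q, p in output_boundary_tests]
```

Every input-boundary test fits the field. Two of the three output-boundary tests overflow it, even though each input is individually valid.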

SWEBOK (p. 70) also tells us that to avoid confusing test objectives and techniques, we must clearly distinguish between measures of the thoroughness of testing and measures of the software under test (such as measures of reliability). SWEBOK also tells us that when we conduct testing “in view of a specific purpose”, then that specific purpose is the “test objective.” SWEBOK lists examples of reliability measurement, usability evaluation, and contractor’s acceptance as important examples of objectives. I think those are fine objectives. But if a regulatory requirement specifies that I must achieve a certain type of coverage, and I design tests to meet that requirement, then meeting that coverage target IS my specific purpose for those tests. I can think of several circumstances under which achievement of a level of thoroughness of a certain type of testing IS the specific purpose for running a set of tests. What principled basis does SWEBOK have in (apparently) rejecting these as invalid objectives?

One (failing) rationale for deciding that achieving a certain level of (some type of) coverage is not a valid objective is that we strive to achieve coverage in order to help achieve something else, such as reliability. That sounds good (in spite of the fact that in some situations, we strive to achieve a certain level of coverage primarily in order to be able to say we achieved that level of coverage), but the reasoning generalizes inconveniently. For example, in many organizations, we do usability testing in order to help achieve customer acceptance. So usability evaluation should not be a valid test objective (because in some contexts, coverage is to reliability as, in other contexts, usability is to acceptance). But SWEBOK specifically blesses usability evaluation and contractor (customer) acceptance as valid test objectives.

A test objective is the objective that drives the design and execution of the tests. Different objectives are appropriate in different contexts. SWEBOK has no business dismissing some objectives as non-objectives.

=================

SWEBOK page 70 states that “Software testing is a very expensive and labor-intensive part of development. For this reason, tools are instrumental for automated test execution, test results logging and evaluation, and in general to support test activities. Moreover, in order to enhance cost-effectiveness ratio, a key issue has always been pushing test automation as much as possible.”

The idea that we should be “pushing test automation as much as possible” has been a source of much mischief and misunderstanding. I frequently hear from experienced testers that their highest bug find rates are achieved using manual or computer-assisted one-time-use tests. I don’t believe that it is to our advantage to stop doing this type of testing. Instead, I think we should be “pushing” cost-benefit analysis and implementing automation when it is cost-effective. For additional discussion of cost/benefit analysis for automation, see my papers, Architectures of Test Automation (https://kaner.com/testarch.html) and Avoiding Shelfware: A Manager’s View of Automated GUI Testing (https://kaner.com/pdfs/shelfwar.pdf).

The idea that we are actually automating testing is itself a misconception. Let’s consider the most common form of test “automation”, GUI regression-level “automation”. It involves these tasks:

TASK / DONE BY

Analyze the specification and other docs for ambiguity or other indicators of potential error

–> Done by humans

Analyze the source code for potential errors or other things to test

–> Done by humans

Design test cases

–> Done by humans

Create test data

–> Done by humans

Run the tests the first time

–> Done by humans

Evaluate the first result

–> Done by humans

Report a bug from the first run

–> Done by humans

Debug the tests

–> Done by humans

Save the code

–> Done by humans

Save the results

–> Done by humans

Document the tests

–> Done by humans

Build a traceability matrix (tracing test cases back to specs or requirements)

–> Done by humans or by another tool (not the GUI tool)

Select the test cases to be run

–> Done by humans

Run the tests

–> The Tool does it

Record the results

–> The Tool does it

Evaluate the results

–> The Tool does it, but if there’s an apparent failure, a human re-evaluates the results.

Measure the results (e.g. performance measures)

–> Done by humans or by another tool (not the GUI tool)

Report errors

–> Done by humans

Update and debug the tests

–> Done by humans

When we see how many of the testing-related tasks are being done by people or, perhaps, by other testing tools, we realize that the GUI-level regression test tool doesn’t really automate testing. It just helps a human to do the testing. Rather than calling this “automated testing”, we should call it computer-assisted testing. I am not showing disrespect for this approach by calling it computer-assisted testing. Instead, I’m making a point: there are a lot of tasks in a testing project and we can get help from a hardware or software tool to handle any subset of them. GUI regression test tools handle some of these tasks very well. Other tools or approaches will handle a different subset.

We should use tools in software testing, but we should not strive for complete automation. It is the wrong goal.

======================

Page 71-73 diagrams

These pages provide some diagrams of the structure of the rest of the testing chapter. Several of the items on these pages are troubling, but I’ll refer to them in the context of the more detailed discussions in the rest of the chapter.

=====================

Page 74, definitions of fault, failure and defect.

I don’t disagree with the definitions of fault and failure. However, SWEBOK equates “fault” and “defect”, where “fault” refers to the underlying cause of a malfunction.

I have two objections to the use of the word defect.

(a) First, in use, the word “defect” is ambiguous. For example, as a matter of law, a product is dangerously defective if it behaves in a way that would be unexpected by a reasonable user and that behavior results in injury. This is a failure-level definition of “defect.” Rather than trying to impose precision on a term that is going to remain ambiguous despite IEEE’s best efforts, our technical language should allow for the ambiguity.

(b) Second, the use of the word “defect” has legal implications. While some people advocate that we should use the word “defect” to refer to “bugs”, a bug-tracking database that contains frequent assertions of the form “X is a defect” may severely and unnecessarily damage the defendant software developer/publisher in court. In a suit based on an allegation that a product is defective (such as a breach of warranty suit, or a personal injury suit), the plaintiff must prove that the product is defective. If a problem with the program is labeled “defect” in the bug tracking system, that label is likely to convince the jury that the bug is a defect, even if a more thorough legal analysis would not result in classification of that particular problem as “defect” in the meaning of the legal system.

We should be cautious in the use of the word “defect”, recognize that this word will be interpreted in multiple ways by technical and nontechnical people, and recognize that a company’s use of the word in its engineering documents might unreasonably expose that company to legal liability.

=====================

Page 75, The Oracle Problem

As Doug Hoffman has pointed out, oracles are heuristic. When we use an oracle to determine that a program has passed or failed a test, we are comparing the program to some model or expectation on some number of dimensions. The program can fail on other dimensions that the oracle is blind to. For example, if we use Excel as the oracle for a spreadsheet under development, and evaluate the formula A1+A2, we might set cell A1 to 2 and cell A2 to 3 in both programs and get 5 in both cases. In terms of the oracle, our spreadsheet has passed this test. But suppose the new spreadsheet took 5 hours to evaluate A1+A2. This is unacceptable, but the oracle is oblivious to it.

The characterization of oracles as tools to decide whether a program behaved correctly on a given test, without discussion of the inherent fallibility of all oracles, has led to serious misunderstandings.
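The spreadsheet example can be sketched in a few lines of Python. The function names, the 0.05-second delay (standing in for the 5 hours), and the 0.01-second limit are all assumptions made for illustration. An oracle that compares only values passes the slow program; an oracle that also looks at elapsed time does not. Neither is complete, since each is blind to every dimension it does not check.

```python
import time


def reference_add(a, b):
    """Stands in for the trusted reference program (Excel in the text)."""
    return a + b


def new_add(a, b, delay_seconds=0.05):
    """Stands in for the spreadsheet under development: right answer, far too slow."""
    time.sleep(delay_seconds)  # simulated slowness
    return a + b


def value_oracle(a, b):
    """Compares results only; it cannot see how long the evaluation took."""
    return new_add(a, b) == reference_add(a, b)


def value_and_time_oracle(a, b, max_seconds=0.01):
    """Compares results and elapsed time; still blind to other dimensions."""
    start = time.perf_counter()
    result = new_add(a, b)
    elapsed = time.perf_counter() - start
    return result == reference_add(a, b) and elapsed <= max_seconds
```

With A1 = 2 and A2 = 3, `value_oracle(2, 3)` reports a pass while `value_and_time_oracle(2, 3)` reports a failure for the same program.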

=======================

Page 75, Testability

The third common meaning of testability in practice refers to the extent to which the program is easy to test and the test results are easy to interpret. Thus a highly testable program provides a high level of _control_ (the tester might be able to change data, start the program at any point, etc.) and a high level of _visibility_ (the tester can determine the state of the program, the value of specific variables, etc.)

This is widely used and it guides negotiation among testers and programmers regarding the support for testing that will be designed into a program.
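Here is one hypothetical illustration, in Python, of the kind of support that such a negotiation might produce. The class and method names are invented. The injected clock gives the tester control (time can be set to anything), and `debug_state` gives visibility (the internal values can be read directly).

```python
class FakeClock:
    """Test double for a time source; the tester decides what time it is."""

    def __init__(self, now=0):
        self.now = now

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class SessionTimer:
    """Hypothetical component designed with testing support built in."""

    def __init__(self, timeout_seconds, clock):
        self._timeout = timeout_seconds
        self._clock = clock  # control: the time source is supplied from outside
        self._last_activity = clock()

    def touch(self):
        self._last_activity = self._clock()

    def expired(self):
        return self._clock() - self._last_activity >= self._timeout

    def debug_state(self):
        # visibility: expose internal values so a tester can interpret results
        return {
            "now": self._clock(),
            "last_activity": self._last_activity,
            "timeout": self._timeout,
        }


clock = FakeClock()
timer = SessionTimer(timeout_seconds=30, clock=clock)
clock.advance(29)
before_timeout = timer.expired()
clock.advance(1)
at_timeout = timer.expired()
```

Without the injected clock, the same check would need a real 30-second wait, and without `debug_state` a surprising result would be much harder to interpret.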

=======================

Page 75, Test Levels (Unit, Integration, System)

SWEBOK says “Clearly, unit testing starts after coding is quite mature, for instance after a clean compile.”

This is 100% in disagreement with the practice of test-driven development, which requires the programmer to write a unit test immediately _before_ writing code that will enable the program to pass the test.

I think that test-driven development is the most important advance in the craft of testing of the past 30 years. This, more than any of the other flaws, illustrates the extent to which SWEBOK is blind to modern good practice.

========================

Page 75, Test Levels (Unit, Integration, System)

Much testing now involves API-level driving of a component developed by someone else. I think this is neither unit, nor integration, nor system testing.

========================

Page 75, Test Levels (Unit, Integration, System)

In test-driven development, the programmer implements a test and then writes the code needed for the program to pass the test. (More precisely, implement the test, run the program and see how it fails, write the simplest code that can pass the test, run the program and fix until the program passes the test, then refactor the code and retest.)
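One pass through that cycle might look like the following Python sketch. The function `price_with_tax` and the tiny `run` helper are hypothetical and stand in for a real test framework. The refactor-and-retest step is left out for brevity.

```python
def test_price_with_tax():
    # Step 1: the test is written before the code it exercises exists.
    assert price_with_tax(100, rate_percent=8) == 108


def run(test):
    """Tiny stand-in for a test runner: report how the test ends."""
    try:
        test()
        return "pass"
    except NameError:
        return "fail: not implemented"
    except AssertionError:
        return "fail: wrong result"


# Step 2: run the test and see how it fails (the function does not exist yet).
first_run = run(test_price_with_tax)


# Step 3: write the simplest code that can pass the test.
def price_with_tax(amount, rate_percent):
    return amount + amount * rate_percent // 100


# Step 4: rerun; once this passes, the code can be refactored and retested.
second_run = run(test_price_with_tax)
```

Here `first_run` is a failure and `second_run` is a pass, which is the order the practice requires and the opposite of the order SWEBOK describes.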

The first use of these tests is to guide and check the initial implementation of a few lines of code. In that sense, they are “unit” tests and they are often referred to as unit tests. However, as programming tools evolve, these tests often look at the lines of code in the context of several other features. Using tools like Ward Cunningham’s FIT, for example, programmers might create many “unit” tests that are also “integration” (multi-variable, multi-function) and “system” (check whether an intended benefit will actually be provided to the end user).

I don’t think we should ban the use of the terms “unit”, “integration” and “system.” However, thinking about these as THE THREE LEVELS of testing, as defining the 3 targets of testing, leads to blind spots with respect to the nature of targets and the potential focus of individual tests.

========================

Page 75-77, Objectives of Testing

I think the categorization of testing concepts is strikingly odd. Here, I note the oddness of the list of testing objectives.

SWEBOK lists
– conformance testing, which it equates to functional testing
– reliability testing
– usability testing
– acceptance / qualification testing
– installation testing
– alpha and beta testing
– regression testing
– performance testing
– stress testing
– back-to-back testing
– recovery testing
– configuration testing, and
– usability testing.

This seems like a laundry list. Back-to-back testing looks more like a technique than an objective. Several others of these could be classed in different ways.

More important, what determines inclusion on this list?

For example, I think of objectives of testing as including:
– minimize liability risk
– decision support (help a project manager determine whether
to release the product)
– compliance with regulations or the expectations of a
regulatory inspector (this may or may not involve
conformance with a specification)
– assess and improve safety
– determine the nature of problems likely to arise in long
use of the product
– expose defects
– block premature release of a product
– improve the user experience

and many others.

In looking at the various lists in this document, I cannot divine a principle that governs inclusion versus exclusion.

As a teacher, I think that many things off the list are more important than the things on the list.

===================

Page 76. Regression testing

SWEBOK defines regression as

“the selective retesting of a system or component to verify that modifications have not caused unintended effects. In practice, the idea is to show that previously passed tests still do.” It then refers to “the assurance given by regression testing” [. . .] “that the software’s behavior is unchanged.”

An earlier version of SWEBOK noted that “regression testing” is commonly used to mean retesting the program to determine whether a bug was fixed. This is a popular definition. Why is it excluded?

Another common definition of regression testing is retesting the program to determine whether changes have caused fixed bugs to be re-broken.

If SWEBOK is a description of what is generally known and done, it should not select one definition and objective and exclude other common ones without even mentioning them.

Next, consider the idea that we run a bunch of tests again and again in order to assure ourselves that software behavior is unchanged. The regression test suite is a relatively tiny collection of tests that can only look at a relatively small proportion of the system’s behaviors. Our gamble is that the software’s behavior is not changing in ways missed by the regression tests. I have never seen a convincing theoretical argument that a regression test suite will expose most or all possible behavioral changes of a program and therefore I reject the notion that regression testing provides “assurance.”

The initial definition of regression testing is quite different from the idea of “assuring that system behavior is unchanged.” The definition is “verify that modifications have not caused unintended effects.” Let’s restate this definition in less antiquated terms — let’s talk about RISK instead of VERIFICATION.

That yields the idea that regression testing is done to mitigate the risk of unintended side effects of change.

A risk-based view of regression testing no longer requires us to use the same test, time and again, to study aspects of the program that have been previously tested. You can change data or combine a test with other tests or do other creative things to search for side effects. By varying the tests, you give yourself the chance to find previously-missed bugs — problems that were in the software all along but that you have missed with your tests so far — along with catching some side-effects. You are increasing coverage instead of mindlessly repeating the same old thing.
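A minimal Python sketch of this idea follows. The two functions under test are hypothetical, and the round-trip check is only one of many possible ways to look for side effects. Each run draws fresh data from a logged seed, so the suite keeps exploring new inputs while any failure stays reproducible.

```python
import random


def format_amount(cents):
    """Hypothetical function under test: integer cents to display text."""
    sign = "-" if cents < 0 else ""
    dollars, rest = divmod(abs(cents), 100)
    return f"{sign}{dollars}.{rest:02d}"


def parse_amount(text):
    """Hypothetical function under test: display text back to integer cents."""
    negative = text.startswith("-")
    dollars, rest = text.lstrip("-").split(".")
    cents = int(dollars) * 100 + int(rest)
    return -cents if negative else cents


def varied_regression_run(seed, cases=200):
    """Re-check the round trip with different data on every run."""
    rng = random.Random(seed)  # log the seed so a failing run can be replayed
    failures = []
    for _ in range(cases):
        cents = rng.randint(-10_000_000, 10_000_000)
        if parse_amount(format_amount(cents)) != cents:
            failures.append(cents)
    return failures
```

Calling `varied_regression_run` with a new seed after each change reuses the same check on data the program has not seen before, which is where the chance of catching previously missed bugs comes from.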

And under the risk-based view, you don’t fool yourself or defraud others with the idea that a small set of tests verifies some quality characteristic of the product. On page 75, SWEBOK approvingly noted Dijkstra’s insight that you can show the presence of bugs, but not their absence. This insight is flatly inconsistent with claims that we do any type of testing to “assure” or “verify”.

This conceptual contradiction illustrates the extent to which SWEBOK (as reflected in the testing section) seems to be more like a dumpster of testing concepts than like a conceptually coherent presentation.

As a dumpster (a disorganized collection of a miscellany of concepts) it is problematic because of the number and nature of things that have been kept out of the dumpster.

(NOTE: The analysis of regression testing above is mainly an analysis of system-level regression testing. I very much like the idea of creating an extensive suite of unit-level change-detectors, tests that we mainly create test-first, that cover every line and branch that we write. The difference between unit-level regression tests and system level is cost. The programmer runs the change-detector suite, every time she recompiles the code. If her change breaks the build, she fixes it immediately.

The labor cost associated with an independent tester discovering a regression error (which might be a bug in the program but is very often a test announcing that it must be changed to conform to a revised design) is quite high. Counting all people involved in the process, the time from failure through bug reporting to bug evaluation, prioritization, repair, and retesting will often total to an average of 4 labor-hours or even higher (much higher) in some organizations.

In contrast, with the unit level change-detector suite, the programmer discovers the problem when she compiles the code, and immediately either fixes the code or the test. The labor cost is minimal. The practice is cheap enough that we can use it to support refactoring. The cost associated with traditional system-level regression testing is so high that we could not use it to support refactoring. The high communication cost drives the cost of late change through the roof (one of the factors of the exponential growth curve of the cost of change over project time) whereas the absence of communication cost associated with the unit-level change detectors allows us to make late changes at relatively low cost.

This is an important distinction within the description of a technique that SWEBOK says can be run at the system level or the unit level. We can do regression testing at either level, but their costs, benefits and uses are entirely different. It’s too bad that SWEBOK misses this point.
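
To make the unit-level side of this distinction concrete, here is a minimal sketch of a change-detector suite, written in Python with the standard unittest module. The function apply_discount and its pricing rules are hypothetical, invented only for illustration; the point is that every line and branch has a cheap test the programmer can run on every build.

```python
import unittest


def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical production code: price after a whole-percent discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Integer division means the discount amount rounds down.
    return price_cents - (price_cents * percent) // 100


class ApplyDiscountChangeDetectors(unittest.TestCase):
    """Pin down current behavior so that any change to it is noticed at once."""

    def test_typical_discount(self):
        self.assertEqual(apply_discount(1000, 25), 750)

    def test_boundary_percentages(self):
        self.assertEqual(apply_discount(1000, 0), 1000)
        self.assertEqual(apply_discount(1000, 100), 0)

    def test_discount_amount_rounds_down(self):
        # 999 * 10 // 100 == 99, so the discounted price is 900
        self.assertEqual(apply_discount(999, 10), 900)

    def test_out_of_range_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(1000, 101)
        with self.assertRaises(ValueError):
            apply_discount(1000, -1)


def run_change_detectors() -> bool:
    """Run the suite and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        ApplyDiscountChangeDetectors
    )
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    # A non-zero exit code lets a build script treat a failure as a broken build.
    raise SystemExit(0 if run_change_detectors() else 1)
```

If a later change alters the rounding rule, the rounding test fails on the programmer's own machine, and she decides on the spot whether the code or the test is what needs to change. No bug report, triage, or retest cycle is involved.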

====================

Page 77 Test Techniques

This is another laundry list that excludes important current techniques, includes techniques that seem to be not widely used, and doesn’t expose any principled basis for inclusion or exclusion.

====================

Page 77 “Ad hoc testing”

SWEBOK says this (which it equates to exploratory testing) is the most widely practiced technique, and then advises a more systematic approach and then says that only experts should do ad hoc testing.

This is a blatant admission that the SWEBOK drafters simply don’t understand the most widely practiced approach to testing. SWEBOK cites my book as its source for testing “based on tester’s intuition and experience” and I believe I am the person who coined the term, “exploratory testing”, so let’s look at what this is. (Much of this material was developed by or with James Bach and many other colleagues over a 20-year period.)

First, exploratory testing involves simultaneous design, execution, and learning about the program. Rather than design tests and then run them, you do some testing, learn from it, learn from other sources, and base your design of the next tests on your new insights. Your oracle (set of evaluation criteria) evolves as you learn more.

Second, every competent tester does exploratory testing. If you report a bug and the programmer tells you it was fixed, you do some testing around the fix. One test is the test that exposed the bug in the first place. But if you’re any good, you create additional tests to see if the fix is more general than the specific circumstances reported in the bug report, and to see if there were side effects. These are not pre-planned, pre-specified tests. They are designed, run, evaluated and extended in the moment. I use this testing situation as a basic training ground for junior testers (and classroom students). Surely, it is not something we would leave only to the experts. But SWEBOK tells us that “A more systematic approach is advised” and “ad hoc testing might be useful but only if the tester is really expert!”
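
As a rough illustration of testing around a fix, the sketch below records the probes a tester might try after being told that a bug has been fixed. Everything in it is hypothetical: the function normalize_username, the reported bug (a trailing space was not trimmed), and the specific probes. In practice these would not be written down in advance; each result would suggest the next thing to try.

```python
def normalize_username(raw: str) -> str:
    """Hypothetical fixed code: trim surrounding whitespace, then lowercase."""
    return raw.strip().lower()


# Each probe: (why the tester tried it, input, expected result)
probes = [
    ("the reported failure", "Alice ", "alice"),
    ("is the fix more general? leading space", " Alice", "alice"),
    ("is the fix more general? tab and newline", "\tAlice\n", "alice"),
    ("side effect? internal space must survive", "Al ice", "al ice"),
    ("side effect? already-clean input unchanged", "alice", "alice"),
    ("side effect? whitespace-only input", "   ", ""),
]


def run_probes(probes):
    """Return the probes whose actual result differs from the expected one."""
    failures = []
    for why, raw, expected in probes:
        actual = normalize_username(raw)
        if actual != expected:
            failures.append((why, raw, expected, actual))
    return failures


if __name__ == "__main__":
    for failure in run_probes(probes):
        print("unexpected result:", failure)
```

Only the first probe is the test from the bug report. The rest are the kind of follow-up that a junior tester can learn to design on the spot.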

There is such a thing as systematic exploratory testing.

I’m not trying to write SWEBOK here, but rather to support my assertion that SWEBOK seems to be clueless about a body of work that even it describes as the most widely practiced technique in the field.

If SWEBOK is intended to describe the current body of knowledge and practice in the field, its cluelessness about the most widely practiced approach is inexcusable.

===============

Page 80

My final comment on SWEBOK’s testing section has to do with its comments on test documentation.

“Documentation is an integral part of the formalization of the test process. The IEEE standard for Software Test Documentation [829] provides a good description of test documents and of their relationship with one another and with the testing process. . . . Test documentation should be produced and continually updated, at the same standards as other types of documentation in development.”

This is pious-sounding claptrap, religious doctrine rather than engineering. There is far too much of this in SWEBOK.

Is IEEE Standard 829 a good description?

In my experience, I have seen 829 applied by many commercial software companies or commercial companies that were developing software as part of their support process for their main business. I have never seen a case in which a commercial software application was benefitted more than it was harmed by application of standard 829. Several colleagues of mine have had the same experience. Bach, Pettichord and I discuss the problems in Lessons Learned in Software Testing.

Normally, an engineering body of knowledge includes assertions that are based on theory and tested by experiment. There was no theoretical basis underlying Standard 829. I am not aware of any experimental research of the costs and benefits associated with the application of 829.

Test documentation is expensive.

Testing is subject to a very difficult constraint. We have an infinity of potential tests and a very limited amount of time in which to imagine, create, document, and evaluate the results of running a few of those many possible tests.

Time spent generating paperwork is time not available for test implementation, test tool development, test execution and evaluation.

Good practice, therefore, probably pushes toward cost-benefit evaluation on a case by case basis. If a certain type of document is so valuable for the current project that it is worth taking time away from competing tasks in order to create the document, create the document.

Before we can pronounce that test documentation should be continually updated, we should discover why the test documentation is being created and how it will be used in the future. Maybe updating is called for. Maybe not. Maybe the documentation should be up to the standards of other documentation on the project, but maybe not. It depends on who will use the documents, and for what purpose.

=====================

In Sum

My time is limited.

I could write pages and pages more about the weaknesses of SWEBOK, but I think it would be pointless.

I agree with the ACM appraisal that the SWEBOK started with a fundamentally flawed approach. The result continues to be fundamentally flawed.

The call for comments on SWEBOK asked for appraisal of SWEBOK as it relates to teaching.

I teach courses in software testing. SWEBOK is not a good reference point for them.

SWEBOK’s criteria for inclusion and exclusion of topics are unsatisfactory. Many of the most important topics in my testing courses (such as test-driven development, API-level testing, scenario testing, skilled exploration, the difference in objectives and cost/benefit for unit-level regression suites and system-level regression suites, risk-based testing, a risk-based approach to domain testing instead of the stale, 40-year-old boundary/equivalence approach documented in SWEBOK, effective bug reporting, using requirements analysis techniques to drive decisions about the types of artifacts to be generated, and on and on) are absent from SWEBOK. Much of what is present in SWEBOK is organized strangely, is dated, and many of the techniques (etc.) are marginal in terms of how often they are used and what value they actually provide.

I have appraised SWEBOK against my course notes, which I update regularly (and which I am updating again this summer). My conclusion was that SWEBOK’s flaws are so severe that, on balance, it is a less-than-worthless reference point for discovery of opportunities to improve the notes.

I also teach courses in software metrics.

Measurement, as studied in other fields, normally involves extensive study of validity of measures and threats to validity. One of the most important validity questions is how can we tell whether the measure actually measures what it purports to measure. What model or theory (and associated empirical support) relates the number we obtain (a complexity level of “10”) to the underlying attribute we are trying to measure?

Another critical question involves side effects of measurement. Robert Austin’s book, Measuring and Managing Performance in Organizations, discusses this in detail.

The review of measurement theory in SWEBOK (page 174 and on) skips lightly past these issues and provides a laundry list of metrics, including many that are invalid and unvalidated “metrics” — to the extent that it is clear what attribute they are actually intended to measure. The main value of the SWEBOK treatment of measurement is that it is concise. It makes an excellent “straw man”, something I can hand out and enthusiastically criticize. This is probably not the educational use we would hope to obtain from something that is SUPPOSED TO serve as the basis for a licensing exam.

=============

CLOSING ASSERTION

THERE IS NO BALLOTING PROCESS FOR SWEBOK THIS TIME. IF THERE WAS, I WOULD VOTE THAT SWEBOK SHOULD NOT BE ACCEPTED, WITH OR WITHOUT MODIFICATION.

— Cem Kaner
— Professor of Software Engineering
— Florida Institute of Technology

IEEE’s “Body of Knowledge” for Software Engineering

Tuesday, June 17th, 2003

SOFTWARE ENGINEERING’S “BODY OF KNOWLEDGE”

The IEEE Computer Society has been developing its own statement of the Software Engineering Body of Knowledge (SWEBOK). They are now calling for a review of SWEBOK, which you can participate in at www.swebok.org.
According to their Call for Reviewers (email, May 29, 2003):

“The purpose of the Guide is to characterize the contents of the software engineering discipline, to promote a consistent view of software engineering worldwide, to clarify the place of, and set the boundary of, software engineering with respect to other disciplines, and to provide a foundation for curriculum development and individual licensing material. All deliverables are available without any charge at www.swebok.org.”

SWEBOK pushes the traditional, documentation-heavy approaches. I have read several drafts of it over the years but I chose to not be involved in the official process because I believed that:

  • The document had little merit and probably wouldn’t get much better.
  • My comments wouldn’t have much influence.
  • These grand, in my view highly premature, efforts to standardize and regulate the field come and go but don’t really have enough influence to worry about.

In retrospect, I think that keeping away from SWEBOK was a mistake. I think it has the potential to do substantial harm. I urge you to get involved in the SWEBOK review, make your criticisms clear and explicit, and urge them in writing to abandon this project. Even though this will have little influence with the SWEBOK promoters, it will create a public record of controversy and protest. Because SWEBOK is being effectively pushed as a basis for licensing software engineers and evaluating / accrediting software engineering degree programs, a public record of controversy may play an important role.

LICENSING

Should software engineers be licensed as engineers?
One of the key reasons for the creation of the SWEBOK was to support political moves to license software engineers. This is from the SWEBOK Project Overview:

“A core body of knowledge is pivotal to the development and accreditation of university curricula and the licensing and certification of professionals. Achieving consensus by the profession on a core body of knowledge is a key milestone in all disciplines and has been identified by the Coordinating Committee as crucial for the evolution of software engineering toward a professional status. The Guide to the Software Engineering Body of Knowledge project is an initiative completed under the auspices of this Committee to reach this consensus.”

In a series of studies, the Association for Computing Machinery recommended against licensing. I was a member of one of the ACM’s study panels, the one that considered the relationship between licensing and safety-critical software.
I think that licensing engineers in our profession today is premature and likely to do serious harm to the profession. I don’t say this lightly. I’ve thought about it for a long time, and from many perspectives (I am a full Professor of Software Engineering, an Attorney who has a strong interest in malpractice law, and a person who has almost 20 years of experience in commercial software development (programming, designing user interfaces, testing, tech writing, managing programmers, testers, and writers, consulting, negotiating contracts, etc.)).

SWEBOK

The SWEBOK is written as the basis for licensing exams for professional software engineers. If your state requires you to get a license to practice software engineering (and more will, if they are convinced that they can create fair exams based on a consensus document from the profession), the SWEBOK is the document you will have to study.
If the SWEBOK is the basis for the licensing exam, the practices in the SWEBOK will be treated as the basis for malpractice lawsuits. People who do what is called good practice in SWEBOK will be able to defend their practices in court if they are ever sued for malpractice. People who adopt what might be much better practices, but practices that conflict with the SWEBOK, will risk harsh criticism in court. As the basis for a licensing exam, SWEBOK becomes as close to an Official Statement of the approved practices of the field as a licensed profession is going to get.

So what’s in this SWEBOK?

The IEEE SWEBOK is a statement of “generally accepted practices”, which are defined as “established traditional practices recommended by many organizations.” SWEBOK is NOT a document intended to include “specialized” practices, which are “practices used only for certain types of software” nor for “advanced and research” practices, which are “innovative practices tested and used only by some organizations and concepts still being developed and tested in research organizations.”
I am most familiar with SWEBOK’s treatments of software testing, software quality and metrics. It endorses practices that I consider wastefully bureaucratic, document-intensive, tedious, and in commercial software development, not likely to succeed. These are the practices that some software process enthusiasts have tried and tried and tried and tried and tried to ram down the throats of the software development community, with less success than they would like.
By promoting these document-centered, rigid practices in a document that serves as the basis for licensing of software engineers, the SWEBOK committee can drive adoption of these practices to a much greater degree than practitioners have accepted voluntarily.
The Association for Computing Machinery assessed SWEBOK and concluded it was seriously flawed and that ACM (originally a partner in development of the SWEBOK) should withdraw from the process. SWEBOK was ultimately adopted as a “consensus” document based on votes from fewer than 350 reviewers, in the face of criticism and walkout by the largest association of computing professionals in the world.

IN SUM

Only 500 people participated in the development of SWEBOK and many of them voiced deep criticisms of it. The balloted draft was supported by just over 300 people (of a mere 340 voting). Within this group were professional trainers who stand to make substantial income from pre-licensing-exam and pre-certification-exam review courses, consulting/contracting firms who make big profits from contracts (such as government contracts) that specify gold-plated software development processes (of course you need all this process documentation; the IEEE standards say you need it!), and academics who have never worked on a serious development project. There were also experienced, honest people with no conflicts of interest, but when there are only a few hundred voices, the voices of vested interests can exert a substantial influence on the result.
I don’t see a way to vote on the 2003 version of SWEBOK. If I did, I would urge you to vote NO.
But even though you cannot vote to disapprove this document, you can review it, criticize it, and make clear the extent to which it fails to reflect the better practices in your organization.
To the extent that it is clear that there is no consensus around the SWEBOK, engineering societies will be less likely to rely on it in developing licensing exams (and less likely to push ahead with plans to license software engineers), and judges and juries will be less likely to conclude that “It says so in the SWEBOK. That must be what the best minds in the profession have decided is true.”
Please, go to www.swebok.org ASAP.
Comments at www.swebok.org are welcome until July 1, 2003.

SPAM, Filtering, and Commercial Legislation

Thursday, May 1st, 2003

The Federal Trade Commission is in the midst of hearings on SPAM. Congressional leaders are promising anti-spam legislation. The biggest headline getter at the moment is Charles Schumer’s promise to introduce a national opt-out registry within a few weeks. Actually, such a bill already exists, sponsored by Mark Dayton along with an added bonus, a federal study of software companies’ technical support practices.

I think we need a national opt-out list as a first level of defense in dealing with spam because the more obvious first line of defense, the spam filter used by your ISP or installed on your machine, subjects you to important risks. These risks arise from electronic communications rules in the Uniform Electronic Transactions Act (UETA) and the Uniform Computer Information Transactions Act (UCITA).

The problem is that under these bills, if someone sends you an important legal notice that looks like spam to your ISP’s filter or the one that runs locally in conjunction with your e-mail client, you won’t see it, but you will still be legally accountable for knowing its contents. There are two significant risks here. First, you might ACCIDENTALLY miss some important mail (like a mortgage foreclosure notice). Second, someone who is required by law to send you a notice but doesn’t want you to read it can INTENTIONALLY distribute it in a way that is likely to trigger your spam filter. In both cases, you lose.