WTST 2012: Workshop on Teaching Software Testing

November 27th, 2011

TEACHING SECURITY-RELATED SOFTWARE TESTING

WTST 2012: The 11th Annual Workshop on Teaching Software Testing

January 27-29, 2012

at the Harris Institute for Assured Information

Florida Institute of Technology, Melbourne, Florida

http://www.wtst.org.

Software testing is often described as a central part of software security, but it has a surprisingly small role in security-related curricula. Over the next 5 years, we hope to change this. If there is sufficient interest, we hope to focus WTSTs 2012-2016 on instructional support for teaching security-related testing.

OUR GOALS FOR WTST 2012

  • Survey the domain: What should we consider as part of “security-related software testing”?
  • Cluster the domain: What areas of security-related testing would fit well together in the same course?
  • Characterize some of the key tasks:
    • Some types of work are (or should be) routine. To do them well, an organization needs a clearly defined, repeatable process that is easy to delegate.
    • Other types are cognitively complex. Their broader goals might stay stable, but the details constantly change as circumstances and threats evolve.
    • And other types are centered on creating, maintaining and extending technology, such as tools to support testing.
  • Publish this overview (survey / clustering / characterization)
  • Apply for instructional development grants. We (CSTER) intend to apply for funding. We hope to collaborate with other institutions and practitioners and we hope to foster other collaborations that lead to proposals that are independent of CSTER.

UNDERLYING VIEWPOINT

The Workshop on Teaching Software Testing is concerned with the practical aspects of teaching university-caliber software testing courses to academic or commercial students.

We see software testing as a cognitively complex activity, an active search for quality-related information rather than a tedious collection of routines. We see it as more practical than theoretical, more creative than prescriptive, more focused on investigation than assurance (you can’t prove a system is secure by testing it), more technical than managerial, and more interested in exploring risks than defining processes.

We think testing is too broad an area to cover fully in a single course. A course that tries to teach too much will be too superficial to have any real value. Rather than designing a single course to serve as a comprehensive model, we think the field is better served with several designs for several courses.

We are particularly interested in online courses that promote deeper knowledge and skill. You can see our work on software testing at http://www.testingeducation.org/BBST. Online courses and courseware, especially Creative Commons courseware, make it possible for students to learn multiple perspectives and to study new topics and learn new skills on a schedule that works for them.

WHO SHOULD ATTEND

We invite participation by:

  • academics who have experience teaching courses on testing or security
  • practitioners who teach professional seminars on software testing or security
  • one or two graduate students
  • a few seasoned teachers or testers who are beginning to build their strengths in teaching software testing or security.

There is no fee to attend this meeting. You pay for your seat through the value of your participation. Participation in the workshop is by invitation based on a proposal. We expect to accept 15 participants with an absolute upper bound of 22.

TO APPLY TO ATTEND

Send an email to Cem Kaner (kaner@cs.fit.edu) by December 20, 2011.

Your email should describe your background and interest in teaching software testing or security. What skills or knowledge do you bring to the meeting that would be of interest to the other participants?

If you are willing to make a presentation, send an abstract. Along with describing the proposed concepts and/or activities, tell us how long the presentation will take, any special equipment needs, and what written materials you will provide. Along with traditional presentations, we will gladly consider proposed activities and interactive demonstrations.

We will begin reviewing proposals on December 1. We encourage early submissions. It is unlikely but possible that we will have accepted a full set of presentation proposals by December 20.

Proposals should be between two and four pages long, in PDF format. We will post accepted proposals to http://www.wtst.org.

We review proposals in terms of their contribution to knowledge of HOW TO TEACH software testing and security. We will not accept proposals that present a theoretical advance with weak ties to teaching and application. Presentations that reiterate materials you have presented elsewhere might be welcome, but it is imperative that you identify the publication history of such work.

By submitting your proposal, you agree that, if we accept your proposal, you will submit a scholarly paper for discussion at the workshop by January 15, 2012. Workshop papers may be of any length and follow any standard scholarly style. We will post these at http://www.wtst.org as they are received, for workshop participants to review before the workshop.

HOW THE MEETING WILL WORK

WTST is a workshop, not a typical conference.

  • We will have a few presentations, but the intent of these is to drive discussion rather than to create an archivable publication.
    • We are glad to start from already-published papers, if they are presented by the author and they would serve as a strong focus for valuable discussion.
    • We are glad to work from slides, mindmaps, or diagrams.
  • Some of our sessions will be activities, such as brainstorming sessions, collaborative searching for information, creating examples, evaluating ideas or work products, and lightning presentations (presentations limited to 5 minutes, plus discussion).
  • In a typical presentation, the presenter speaks 10 to 90 minutes, followed by discussion. There is no fixed time for discussion. Past sessions’ discussions have run from 1 minute to 4 hours. During the discussion, a participant might ask the presenter simple or detailed questions, describe consistent or contrary experiences or data, present a different approach to the same problem, or (respectfully and collegially) argue with the presenter.

Our agenda will evolve during the workshop. If we start making significant progress on something, we are likely to stick with it even if that means cutting or timeboxing some other activities or presentations.

Presenters must provide materials that they share with the workshop under a Creative Commons license, allowing reuse by other teachers. Such materials will be posted at http://www.wtst.org.

HOSTS

The hosts of the meeting are:

LOCATION AND TRAVEL INFORMATION

We will hold the meetings at

Harris Center for Assured Information, Room 327

Florida Tech, 150 W University Blvd,

Melbourne, FL

Airport

Melbourne International Airport is 3 miles from the hotel and the meeting site. It is served by Delta Airlines and US Airways. Alternatively, the Orlando International Airport offers more flights and more non-stops but is 65 miles from the meeting location.

Hotel

We recommend the Courtyard by Marriott – West Melbourne located at 2101 W. New Haven Avenue in Melbourne, FL.

Please call 1-800-321-2211 or 321-724-6400 to book your room by January 2. Be sure to ask for the special WTST rate of $89 per night. Tax is an additional 11%.

All reservations must be guaranteed with a credit card by January 2, 2012 at 6:00 pm. If rooms are not reserved, they will be released for general sale. Following that date, reservations can only be made based upon availability.

For additional hotel information, please visit http://www.wtst.org or the hotel website at http://www.marriott.com/hotels/travel/mlbch-courtyard-melbourne-west/

OUR INTELLECTUAL PROPERTY AGREEMENT

We expect to publish some outcomes of this meeting. Each of us will probably have our own take on what was learned. Participants (all people in the room) agree to the following:

  • Any of us can publish the results as we see them. None of us is the official reporter of the meeting unless we decide at the meeting that we want a reporter.
  • Any materials initially presented at the meeting or developed at the meeting may be posted to any of our web sites or quoted in any other of our publications, without further permission. That is, if I write a paper, you can put it on your web site. If you write a problem, I can put it on my web site. If we make flipchart notes, those can go up on the web sites too. None of us has exclusive control over this material. Restrictions of rights must be identified on the paper itself.
    • NOTE: Some papers are circulated that are already published or are headed to another publisher. If you want to limit republication of a paper or slide set, please note the rights you are reserving on your document. The shared license to republish is our default rule, which applies in the absence of an asserted restriction.
  • The usual rules of attribution apply. If you write a paper or develop an idea or method, anyone who quotes or summarizes your work should attribute it to you. However, many ideas will develop in discussion and will be hard (and not necessary) to attribute to one person.
  • Any publication of the material from this meeting will list all attendees as contributors to the ideas published as well as the hosting organization.
  • Articles should be circulated to WTST-2012 attendees before being published when possible. At a minimum, notification of publication will be circulated.
  • Any attendee may request that his or her name be removed from the list of attendees identified on a specific paper.
  • If you have information which you consider proprietary or otherwise shouldn’t be disclosed in light of these publication rules, please do not reveal that information to the group.

ACKNOWLEDGEMENTS

Support for this meeting comes from the Harris Institute for Assured Information at the Florida Institute of Technology, and Kaner, Fiedler & Associates, LLC.

Funding for WTST 1-5 came primarily from the National Science Foundation, under grant EIA-0113539 ITR/SY+PE “Improving the Education of Software Testers.” Partial funding for the Advisory Board meetings in WTST 6-10 came from the National Science Foundation, under grant CCLI-0717613 “Adaptation & Implementation of an Activity-Based Online or Hybrid Course in Software Testing”.

Opinions expressed at WTST or published in connection with WTST do not necessarily reflect the views of NSF.

WTST is a peer conference in the tradition of the Los Altos Workshops of Software Testing.

 

Please update your links to this blog

November 20th, 2011

A few new posts will be coming soon. I’m hoping they won’t be missed.

I moved my blog to http://kaner.com over a year ago. If you’re still linking to the old site, please update your links.

Thanks!

A welcome addition to the scholarship of exploratory software testing

November 16th, 2011

Juha Itkonen will be defending his dissertation on “Empirical Studies on Exploratory Software Testing” this Friday. I haven’t read the entire document, but what I have read looks very interesting.

Juha has been studying real-world approaches to software testing for about a decade (maybe longer–when I met him almost a decade ago, his knowledge of the field was quite sophisticated). I’m delighted to see this quality of academic work and wish him well in his final oral exam.

For a list of soon-to-be-Dr. Itkonen’s publications, see https://wiki.aalto.fi/display/~jitkonen@aalto.fi/Full+list+of+publications.

Emphasis & Objectives of the Test Design Course

October 8th, 2011

Becky and I are getting closer to rolling out Test Design. Here’s our current summary of the course:

Learning Objectives for Test Design

This is an introductory survey of test design. The course introduces students to:

  • Many (over 100) test techniques at a superficial level (what the technique is).
  • A detailed level of familiarity with a few techniques:
    • function testing
    • testing tours
    • risk-based testing
    • specification-based testing
    • scenario testing
    • domain testing
    • combination testing.
  • Ways to compare strengths of different techniques and select complementary techniques to form an effective testing strategy
  • Using the Heuristic Test Strategy Model for specification analysis and risk analysis
  • Using concept mapping tools for test planning.

I’m still spending about 6 hours per night on video edits, but our most important work is on the assessments. To a very large degree, my course designs are driven by my assessments. That’s because there’s such a strong conviction in the education community–which I share–that students learn much more from the assessments (from all the activities that demand that they generate stuff and get feedback on it) than from lectures or informal discussions. The lectures and slides are an enabling backdrop for the students’ activities, rather than the core of the course.

In terms of design decisions, deciding what I will hold my students accountable for knowing requires me to decide what I will hold myself accountable for teaching well.

If you’re intrigued by that way of thinking about course design, check out:

I tested the course’s two main assignments in university classrooms several times before starting on the course slides (and wrote 104 first-draft multiple-guess questions and maybe 200 essay questions). But now that the course content is almost complete, we’re revisiting (and of course rewriting) these materials. In the process, we’ve been gaining perspective.

I think the most striking feature of the new course is its emphasis on content.

Let me draw the contrast with a chart that compares the BBST courses (Foundations, Bug Advocacy, and Test Design) and some other courses still on the drawing boards:

A few definitions:

  • Course Skills: How to be an effective student. Working effectively in online courses. Taking tests. Managing your time.
  • Social Skills: Working together in groups. Peer reviews. Using collaboration tools (e.g. wikis).
  • Learning Skills: How to gather, understand, organize and be able to apply new information. Using lectures, slides, and readings effectively. Searching for supplementary information. Using these materials to form and defend your own opinion.
  • Testing Knowledge: Definitions. Facts and concepts of testing. Structures for organizing testing knowledge.
  • Testing Skills: How to actually do things. Getting better (through practice and feedback) at actually doing them.
  • Computing Fundamentals: Facts and concepts of computer science and computing-relevant discrete mathematics.

As we designed the early courses, Becky Fiedler and I placed a high emphasis on course skills and learning skills. Students needed to (re)learn how to get value from online video instruction, how to take tests, how to give peer-to-peer feedback, etc.

The second course, Bug Advocacy, emphasizes specific testing skills–but the specific skills are the ones involved in bug investigation and reporting. Even though these critical thinking, research, and communication skills have strong application to testing, they are foundational for knowledge-related work.

Test Design is much more about the content (testing knowledge). We survey (depends on how you count) 70 to 150 test techniques. We look for ways to compare and contrast them. We consider how to organize projects around combinations of a few techniques that complement each other (make up for each other’s weaknesses and blindnesses). The learning skills component is active reading. This is certainly generally useful, but its context and application are specification analysis.

Test Design is more like the traditional Software Testing Course firehose. Way too much material in way too little time, with lots of reference material to help students explore the underemphasized parts of the course when they need it on the job.

The difference is that we are relying on the students’ improved learning skills. The assignments are challenging. The labs are still works-in-progress and might not be polished until the third iteration of the course, but labs-plus-assignments bring home a bunch of lessons.

Whether the students’ skills are advanced enough to process over 500 slides efficiently, integrate material across sections, integrate required readings, and apply them to software — all within the course’s 4-week timeframe — remains to be seen.

Learning Objectives of the BBST Courses

September 19th, 2011

As I finish up the post-production and assessment-design for the Test Design course, I’m writing these articles as a set of retrospectives on the instructional design of the series.

For me, this is a transition point. The planned BBST series is complete with lessons to harvest as we create courseware on software metrics, development of skills with specific test techniques, computer-related law/ethics (in progress), cybersecurity, research methods and instrumentation applied to quantitative finance, qualitative research methods, and analysis of requirements.

Instructional Design

I think course design is a multidimensional challenge, focused around seven core questions:

    1. Content: What information, or types of information do I want the students to learn?
    2. Skills: What skills do I want the students to develop?
    3. Level of Learning: What level of depth do I want the students to learn this material at?
    4. Learning activities: How will the course’s activities (including assessments, such as assignments and exams) support the students’ learning?
    5. Instructional Technologies: What technologies will I use to support the course?
    6. Assessment: How will I assess the course: How will I find out what the students have learned, and at what level?
    7. Improvement: How will I use the assessment results to guide my improvement of the course?

This collection of articles will probably skip around in these questions as I take inventory of my last 7 years of notes on online and hybrid course development.

Objectives of the BBST Courses

The changing nature of the objectives of the BBST courses.

Courses differ in more than content. They differ in the other things you hope students learn along with the content.

The BBST courses include a variety of types of assessment: quizzes, labs, assignments and exams. For instructional designers, the advantage of assessments is that we can find out what the students know (and don’t know) and what they can apply.

The BBST courses have gone through several iterations. Becky Fiedler and I used the performance data to guide the evolutions.

Based on what we learned, we place a higher emphasis in the early courses on course skills and learning skills and a greater emphasis in the later courses on testing skills.

A few definitions:

  • Course Skills: How to be an effective student. Working effectively in online courses. Taking tests. Managing your time.
  • Social Skills: Working together in groups. Peer reviews. Using collaboration tools (e.g. wikis).
  • Learning Skills: How to gather, understand, organize and be able to apply new information. Using lectures, slides, and readings effectively. Searching for supplementary information. Using these materials to form and defend your own opinion.
  • Testing Knowledge: Definitions. Facts and concepts of testing. Structures for organizing testing knowledge.
  • Testing Skills: How to actually do things. Getting better (through practice and feedback) at actually doing them.
  • Computing Fundamentals: Facts and concepts of computer science and computing-relevant discrete mathematics.

You can see the evolution of emphasis in the course’s specific learning objectives.

Learning Objectives of the 3-Course BBST Set

  • Understand key testing challenges that demand thoughtful tradeoffs by test designers and managers.
  • Develop skills with several test techniques.
  • Choose effective techniques for a given objective under your constraints.
  • Improve the critical thinking and rapid learning skills that underlie good testing.
  • Communicate your findings effectively.
  • Work effectively online with remote collaborators.
  • Plan investments (in documentation, tools, and process improvement) to meet your actual needs.
  • Create work products that you can use in job interviews to demonstrate testing skill.

Learning Objectives for the First Course (Foundations)

This is the first of the BBST series. We address:

  • How to succeed in online classes
  • Fundamental concepts and definitions
  • Fundamental challenges in software testing

Improve academic skills

  • Work with online collaboration tools
    • Forums
    • Wikis
  • Improve students’ precision in reading
  • Create clear, well-structured communication
  • Provide (and accept) effective peer reviews
  • Cope calmly and effectively with formative assessments (such as tests designed to help students learn).

Learn about testing

  • Key challenges of testing
    • Information objectives drive the testing mission and strategy
    • Oracles are heuristic
    • Coverage is multidimensional
    • Complete testing is impossible
    • Measurement is important, but hard
  • Introduce you to:
    • Basic vocabulary of the field
    • Basic facts of data storage and manipulation in computing
    • Diversity of viewpoints
    • Viewpoints drive vocabulary

Learning Objectives for the Second Course (Bug Advocacy)

Bug reports are not just neutral technical reports. They are persuasive documents. The key goal of the bug report author is to provide high-quality information, well written, to help stakeholders make wise decisions about which bugs to fix.

Key aspects of the content of this course include:

  • Defining key concepts (such as software error, quality, and the bug processing workflow)
  • The scope of bug reporting (what to report as bugs, and what information to include)
  • Bug reporting as persuasive writing
  • Bug investigation to discover harsher failures and simpler replication conditions
  • Excuses and reasons for not fixing bugs
  • Making bugs reproducible
  • Lessons from the psychology of decision-making: bug-handling as a multiple-decision process dominated by heuristics and biases.
  • Style and structure of well-written reports

Our learning objectives include this content, plus improving your abilities / skills to:

  • Evaluate bug reports written by others
  • Revise / strengthen reports written by others
  • Write more persuasively (considering the interests and concerns of your audience)
  • Participate effectively in distributed, multinational workgroup projects that are slightly more complex than the one in Foundations

Learning Objectives for the Third Course (Test Design)

This is an introductory survey of test design. The course introduces students to:

  • Many (nearly 200) test techniques at a superficial level (what the technique is).
  • A detailed level of familiarity with a few techniques:
    • function testing
    • testing tours
    • risk-based testing
    • specification-based testing
    • scenario testing
    • domain testing
    • combination testing.
  • Ways to compare strengths of different techniques and select complementary techniques to form an effective testing strategy
  • Using the Heuristic Test Strategy Model for specification analysis and risk analysis
  • Using concept mapping tools for test planning.

 

A New Course on Test Design: The Bibliography

September 13th, 2011

Back in 2004, I started developing course videos on software testing and computer-related law/ethics. Originally, these were for my courses at Florida Tech, but I published them under a Creative Commons license so that people could incorporate the materials in their own courses.

Soon after that, a group of us (mainly, I think, Scott Barber, Doug Hoffman, Mike Kelly, Pat McGee, Hung Nguyen, Andy Tinkham, and Ben Simo) started planning the repurposing of the academic course videos for professional development. I put together some (failing) prototypes and Becky Fiedler took over the instructional design.

  • We published the first version of BBST-Foundations and taught the courses through AST (Association for Software Testing). It had a lot of rough edges, but people liked it a lot.
  • So Becky and I created course #2, Bug Advocacy, with a lot of help from Scott Barber (and many other colleagues). This was new material, a more professional effort than Foundations, but it took a lot of time.

That took us to a fork in the road.

  • I was working with students on developing skills with specific techniques (I worked with Giri Vijayaraghavan and Ajay Jha on risk-based testing; Sowmya Padmanabhan on domain testing; and several students with not-quite-successful efforts on scenario testing). Sowmya and I proved (not that we were trying to) that developing students’ testing skills was more complex than I’d been thinking. So, Becky and I were rethinking our skills-development-course designs.
  • On the other hand, AST’s Board wanted to pull together a “complete” introductory series in black box testing. Ultimately, we went that way.

The goal was a three-part series:

  1. A reworked Foundations that fixed many of the weaknesses of Version 1. We completed that one about a year ago (Becky and Doug Hoffman were my main co-creators, with a lot of design guidance from Scott Barber).
  2. Bug Advocacy, and
  3. a new course in Test Design.

Test Design is (finally) almost done (many thanks to Michael Bolton and Doug Hoffman). I’ll publish the lectures as we finish post-production on the videos. Lecture 1 should be up around Saturday.

Test Design is a survey course. We cover a lot of ground. And we rely heavily on references, because we sure don’t know everything there is to know about all these techniques.

To support the release of the videos, I’m publishing our references now. (The final course slides will have the references too, but those won’t be done until we complete editing the last video.):

  • As always, it has been tremendously valuable reading books and papers suggested by colleagues and rereading stuff I’ve read before. A lot of careful thinking has gone into the development and analysis of these techniques.
  • As always, I’ve learned a lot from people whose views differ strongly from my own. Looking for the correctness in their views–what makes them right, within their perspective and analysis, even if I disagree with that perspective–is something I see as a basic professional skill.
  • And as always, I’ve not only learned new things: I’ve discovered that several things I thought I knew were outdated or wrong. I can be confident that the video is packed with errors–but plenty fewer than there would have been a year ago and none that I know about now.

So… here’s the reference list. Video editing will take a few weeks to complete–if you think we should include some other sources, please let me know. I’ll read them and, if appropriate, I’ll gladly include them in the list.

Active reading (see also Specification-based testing and Concept mapping)

All-pairs testing

See http://www.pairwise.org/ for more references generally and http://www.pairwise.org/tools.asp for a list of tools.
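For readers new to the technique, here is a rough illustration of what "all pairs" means. It is a sketch only, not drawn from the course or from the tools listed at pairwise.org: a simple greedy selection, with hypothetical parameter names and values, that keeps picking the full combination covering the most not-yet-covered value pairs. Real tools use more refined algorithms and usually produce smaller suites.

```python
from itertools import combinations, product


def pairs_in(test, names):
    """Return the set of (parameter, value) pairs exercised by one test."""
    return {((a, test[a]), (b, test[b])) for a, b in combinations(names, 2)}


def all_pairs(parameters):
    """Greedily build a suite that covers every pair of parameter values.

    `parameters` maps a parameter name to its list of values.
    This is an illustrative greedy heuristic, not an optimal generator.
    """
    names = list(parameters)
    # Every full combination is a candidate test (fine for small examples).
    candidates = [dict(zip(names, values))
                  for values in product(*parameters.values())]
    # Every value pair that some test must exercise.
    remaining = set()
    for candidate in candidates:
        remaining |= pairs_in(candidate, names)

    suite = []
    while remaining:
        # Pick the candidate that covers the most still-uncovered pairs.
        best = max(candidates,
                   key=lambda t: len(pairs_in(t, names) & remaining))
        suite.append(best)
        remaining -= pairs_in(best, names)
    return suite


# Hypothetical configuration space: 3 x 3 x 2 = 18 exhaustive combinations.
parameters = {
    "browser": ["Firefox", "Chrome", "Safari"],
    "os": ["Windows", "macOS", "Linux"],
    "locale": ["en", "de"],
}
suite = all_pairs(parameters)
```

In this example the suite needs at least 9 tests (one for each browser/OS pair) and never more than the 18 exhaustive combinations. The point of the technique is that the gap between those two numbers grows quickly as parameters are added.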

Alpha testing

See references on tests by programmers of their own code, or on relatively early testing by development groups. For a good overview from the viewpoint of the test group, see Schultz, C.P., Bryant, R., & Langdell, T. (2005). Game Testing All in One. Thomson Press.

Ambiguity analysis (See also specification-based testing)

Best representative testing (See domain testing)

Beta testing

Boundary testing (See domain testing)

Bug bashes

Build verification

  • Guckenheimer, S. & Perez, J. (2006). Software Engineering with Microsoft Visual Studio Team System. Addison Wesley.
  • Page, A., Johnston, K., & Rollison, B.J. (2009). How We Test Software at Microsoft. Microsoft Press.
  • Raj, S. (2009). Maximize your investment in automation tools. Software Testing Analysis & Review. http://www.stickyminds.com

Calculations

Note: There is a significant, relevant field: Numerical Analysis. The list here merely points you to a few sources I have personally found helpful, not necessarily to the top references in the field.

Combinatorial testing. See All-Pairs Testing

Concept mapping

  • Hyerle, D.N. (2008, 2nd Ed.). Visual Tools for Transforming Information into Knowledge. Corwin.
  • Margulies, N., & Maal, N. (2001, 2nd Ed.) Mapping Inner Space: Learning and Teaching Visual Mapping. Corwin.
  • McMillan, D. (2010). Tales from the trenches: Lean test case design. http://www.bettertesting.co.uk/content/?p=253
  • McMillan, D. (2011). Mind Mapping 101. http://www.bettertesting.co.uk/content/?p=956
  • Moon, B.M., Hoffman, R.R., Novak, J.D., & Canas, A.J. (Eds., 2011). Applied Concept Mapping: Capturing, Analyzing, and Organizing Knowledge. CRC Press.
  • Nast, J. (2006). Idea Mapping: How to Access Your Hidden Brain Power, Learn Faster, Remember More, and Achieve Success in Business. Wiley.
  • Sabourin, R. (2006). X marks the test case: Using mind maps for software design. Better Software. http://www.stickyminds.com/BetterSoftware/magazine.asp?fn=cifea&id=90

Concept mapping tools:

Configuration coverage

Configuration / compatibility testing

Constraint checks

See our notes in the BBST Foundations presentation of Hoffman’s collection of oracles.

Constraints

Diagnostics-based testing

  • Al-Yami, A.M. (1996). Fault-Oriented Automated Test Data Generation. Ph.D. Dissertation, Illinois Institute of Technology.
  • Kaner, C., Bond, W.P., & McGee, P. (2004). High volume test automation. Keynote address: International Conference on Software Testing Analysis & Review (STAR East 2004). Orlando. http://www.kaner.com/pdfs/HVAT_STAR.pdf (The Telenova and Mentsville cases are both examples of diagnostics-based testing.)

Domain testing

  • Abramowitz & Stegun (1964). Handbook of Mathematical Functions. http://people.math.sfu.ca/~cbm/aands/frameindex.htm
  • Beizer, B. (1990). Software Testing Techniques (2nd Ed.). Van Nostrand Reinhold.
  • Beizer, B. (1995). Black-Box Testing. Wiley.
  • Binder, R. (2000). Testing Object-Oriented Systems: Models, Patterns, and Tools. Addison-Wesley.
  • Black, R. (2009). Using domain analysis for testing. Quality Matters, Q3, 16-20. http://www.rbcs-us.com/images/documents/quality-matters-q3-2009-rb-article.pdf
  • Copeland, L. (2004). A Practitioner’s Guide to Software Test Design. Artech House.
  • Clarke, L.A. (1976). A system to generate test data and symbolically execute programs. IEEE Transactions on Software Engineering, 2, 208-215.
  • Clarke, L. A., Hassell, J., & Richardson, D. J. (1982). A close look at domain testing. IEEE Transactions on Software Engineering, 2, 380-390.
  • Craig, R. D., & Jaskiel, S. P. (2002). Systematic Software Testing. Artech House.
  • Hamlet, D. & Taylor, R. (1990). Partition testing does not inspire confidence. IEEE Transactions on Software Engineering, 16(12), 1402-1411.
  • Hayes, J.H. (1999). Input Validation Testing: A System-Level, Early Lifecycle Technique. Ph.D. Dissertation (Computer Science), George Mason University.
  • Howden, W. E. (1980). Functional testing and design abstractions. Journal of Systems & Software, 1, 307-313.
  • Jeng, B. & Weyuker, E.J. (1994). A simplified domain-testing strategy. ACM Transactions on Software Engineering and Methodology, 3(3), 254-270.
  • Jorgensen, P. C. (2008). Software Testing: A Craftsman’s Approach (3rd ed.). Taylor & Francis.
  • Kaner, C. (2004a). Teaching domain testing: A status report. Paper presented at the Conference on Software Engineering Education & Training. http://www.kaner.com/pdfs/teaching_sw_testing.pdf
  • Kaner, C., Padmanabhan, S., & Hoffman, D. (2012). Domain Testing: A Workbook, in preparation.
  • Myers, G. J. (1979). The Art of Software Testing. Wiley.
  • Ostrand, T. J., & Balcer, M. J. (1988). The category-partition method for specifying and generating functional tests. Communications of the ACM, 31(6), 676-686.
  • Padmanabhan, S. (2004). Domain Testing: Divide and Conquer. M.Sc. Thesis, Florida Institute of Technology. http://www.testingeducation.org/a/DTD&C.pdf
  • Schroeder, P.J. (2001). Black-box test reduction using input-output analysis. Ph.D. Dissertation (Computer Science). Illinois Institute of Technology.
  • Weyuker, E. J., & Jeng, B. (1991). Analyzing partition testing strategies. IEEE Transactions on Software Engineering, 17(7), 703-711.
  • Weyuker, E.J., & Ostrand, T.J. (1980). Theories of program testing and the application of revealing subdomains. IEEE Transactions on Software Engineering, 6(3), 236-245.
  • White, L. J., Cohen, E.I., & Zeil, S.J. (1981). A domain strategy for computer program testing. In Chandrasekaran, B., & Radicchi, S. (Ed.), Computer Program Testing (pp. 103-112). North Holland Publishing.
  • http://www.wikipedia.org/wiki/Stratified_sampling

Dumb monkey testing

  • Arnold, T. (1998), Visual Test 6. Wiley.
  • Nyman, N. (1998). Application testing with dumb monkeys. International Conference on Software Testing Analysis & Review (STAR West).
  • Nyman, N. (2000). Using monkey test tools. Software Testing & Quality Engineering, 2(1), 18-20.
  • Nyman, N. (2004). In defense of monkey testing. http://www.softtest.org/sigs/material/nnyman2.htm

Eating your own dogfood

  • Page, A., Johnston, K., & Rollison, B.J. (2009). How We Test Software at Microsoft. Microsoft Press.

Equivalence class analysis (see Domain testing)

Experimental design

  • Popper, K.R. (2002, 2nd Ed.). Conjectures and Refutations: The Growth of Scientific Knowledge. Routledge.
  • Shadish, W.R., Cook, T.D., & Campbell, D.T. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference, 2nd Ed. Wadsworth.

Exploratory testing

Failure mode analysis: see also Guidewords and Risk-Based Testing.

Feature integration testing

Function testing

Function equivalence testing

  • Hoffman, D. (2003). Exhausting your test options. Software Testing & Quality Engineering, 5(4), 10-11
  • Kaner, C., Falk, J., & Nguyen, H.Q. (2nd Edition, 2000). Testing Computer Software. Wiley.

Functional testing below the GUI

Guerilla testing

  • Kaner, C., Falk, J., & Nguyen, H.Q. (2nd Edition, 2000). Testing Computer Software. Wiley.

Guidewords

Installation testing

Interoperability testing

Load testing

Localization testing

  • Bolton, M. (2006, April). Where in the world? Better Software. http://www.developsense.com/articles/2006-04-WhereInTheWorld.pdf
  • Chandler, H.M. & Deming, S.O. (2nd Ed. in press). The Game Localization Handbook. Jones & Bartlett Learning.
  • Ratzmann, M., & De Young, C. (2003). Galileo Computing: Software Testing and Internationalization. Lemoine International and the Localization Industry Standards Association. http://www.automation.org.uk/downloads/documentation/galileo_computing-software_testing.pdf
  • Savourel, Y. (2001). XML Internationalization and Localization. Sams Press.
  • Singh, N. & Pereira, A. (2005). The Culturally Customized Web Site: Customizing Web Sites for the Global Marketplace. Butterworth-Heinemann.
  • Smith-Ferrier, G. (2006). .NET Internationalization: The Developer’s Guide to Building Global Windows and Web Applications. Addison-Wesley Professional.
  • Uren, E., Howard, R. & Perinotti, T. (1993). Software Internationalization and Localization. Van Nostrand Reinhold.

Logical expression testing

  • Ammann, P., & Offutt, J. (2008). Introduction to Software Testing. Cambridge University Press.
  • Beizer, B. (1990). Software Testing Techniques (2nd Ed.). Van Nostrand Reinhold.
  • Copeland, L. (2004). A Practitioner’s Guide to Software Test Design. Artech House (see Chapter 5 on decision tables).
  • Jorgensen, P. (2008, 3rd Ed.). Software Testing: A Craftsman’s Approach. Auerbach Publications (see Chapter 7 on decision tables).
  • Brian Marick (2000) modeled testing of logical expressions by considering common mistakes in designing/coding a series of related decisions. Testing for Programmers. http://www.exampler.com/testing-com/writings/half-day-programmer.pdf.
  • MULTI. Marick implemented his approach to testing logical expressions in a program, MULTI. Tim Coulter and his colleagues extended MULTI and published it (with Marick’s permission) at http://sourceforge.net/projects/multi/

Long-sequence testing

Mathematical oracle

See our notes in BBST Foundation’s presentation of Hoffman’s collection of oracles.

Numerical analysis (see Calculations)

Paired testing

Pairwise testing (see All-Pairs testing)

Performance testing

Programming or software design

  • Roberts, E. (2005, 20th Anniversary Ed.). Thinking Recursively with Java. Wiley.

Psychological considerations

  • Bendor, J. (2005). The perfect is the enemy of the best: Adaptive versus optimal organizational reliability. Journal of Theoretical Politics. 17(1), 5-39.
  • Rohlman, D.S. (1992). The Role of Problem Representation and Expertise in Hypothesis Testing: A Software Testing Analogue. Ph.D. Dissertation, Bowling Green State University.
  • Teasley, B.E., Leventhal, L.M., Mynatt, C.R., & Rohlman, D.S. (1994). Why software testing is sometimes ineffective: Two applied studies of positive test strategy. Journal of Applied Psychology, 79(1), 142-155.
  • Whittaker, J.A. (2000). What is software testing? And why is it so hard? IEEE Software, Jan-Feb. 70-79.

Quicktests

Random testing

Regression testing

Requirements-based testing

  • Whalen, M.W., Rajan, A., Heimdahl, M.P.E., & Miller, S.P. (2006). Coverage metrics for requirements-based testing. Proceedings of the 2006 International Symposium on Software Testing and Analysis. http://portal.acm.org/citation.cfm?id=1146242
  • Wiegers, K.E. (1999). Software Requirements. Microsoft Press.

Risk-based testing

  • Bach, J. (1999). Heuristic risk-based testing. Software Testing & Quality Engineering. http://www.satisfice.com/articles/hrbt.pdf
  • Bach, J. (2000a). Heuristic test planning: Context model. http://www.satisfice.com/tools/satisfice-cm.pdf
  • Bach, J. (2000b). SQA for new technology projects. http://www.satisfice.com/articles/sqafnt.pdf
  • Bach, J. (2003). Troubleshooting risk-based testing. Software Testing & Quality Engineering, May/June, 28-32. http://www.satisfice.com/articles/rbt-trouble.pdf
  • Becker, S.A. & Berkemeyer, A. (1999). The application of a software testing technique to uncover data errors in a database system. Proceedings of the 20th Annual Pacific Northwest Software Quality Conference, 173-183.
  • Berkovich, Y. (2000). Software quality prediction using case-based reasoning. M.Sc. Thesis (Computer Science). Florida Atlantic University.
  • Bernstein, P.L. (1998). Against the Gods: The Remarkable Story of Risk. Wiley.
  • Black, R. (2007). Pragmatic Software Testing: Becoming an Effective and Efficient Test Professional. Wiley.
  • Clemen, R.T. (1996, 2nd ed.) Making Hard Decisions: An Introduction to Decision Analysis. Cengage Learning.
  • Copeland, L. (2004). A Practitioner’s Guide to Software Test Design. Artech House.
  • DeMarco, T. & Lister, T. (2003). Waltzing with Bears: Managing Risk on Software Projects. Dorset House.
  • Dorner, D. (1997). The Logic of Failure. Basic Books.
  • Gerrard, P. & Thompson, N. (2002). Risk-Based E-Business Testing. Artech House.
  • HAZOP Guidelines (2008). Hazardous Industry Planning Advisory Paper No. 8, NSW Government Department of Planning. http://www.planning.nsw.gov.au/plansforaction/pdf/hazards/haz_hipap8_rev2008.pdf
  • Hillson, D. & Murray-Webster, R. (2007). Understanding and Managing Risk Attitude. (2nd Ed.). Gower. http://www.risk-attitude.com/
  • Hubbard, D.W. (2009). The Failure of Risk Management: Why It’s Broken and How to Fix It. Wiley.
  • Jorgensen, A.A. (2003). Testing with hostile data streams. ACM SIGSOFT Software Engineering Notes, 28(2). http://cs.fit.edu/media/TechnicalReports/cs-2003-03.pdf
  • Jorgensen, A.A. & Tilley, S.R. (2003). On the security risks of not adopting hostile data stream testing techniques. 3rd International Workshop on Adoption-Centric Software Engineering (ACSE 2003), p. 99-103. http://www.sei.cmu.edu/reports/03sr004.pdf
  • Kaner, C. (2008). Improve the power of your tests with risk-based test design. Quality Assurance Institute QUEST conference. http://www.kaner.com/pdfs/QAIriskKeynote2008.pdf
  • Kaner, C., Falk, J., & Nguyen, H.Q. (2nd Edition, 2000a). Testing Computer Software. Wiley.
  • Neumann, P.G. (undated). The Risks Digest: Forum on Risks to the Public in Computers and Related Systems. http://catless.ncl.ac.uk/risks
  • Perrow, C. (1999). Normal Accidents: Living with High-Risk Technologies. Princeton University Press (but read this in conjunction with Robert Hedges’ review of the book on Amazon.com).
  • Petroski, H. (1992). To Engineer is Human: The Role of Failure in Successful Design. Vintage.
  • Petroski, H. (2004). Small Things Considered: Why There is No Perfect Design. Vintage.
  • Petroski, H. (2008). Success Through Failure: The Paradox of Design. Princeton University Press.
  • Pettichord, B. (2001). The role of information in risk-based testing. International Conference on Software Testing Analysis & Review (STAR East). http://www.stickyminds.com
  • Reason, J. T. (1997). Managing the Risks of Organizational Accidents. Ashgate Publishing.
  • Schultz, C.P., Bryant, R., & Langdell, T. (2005). Game Testing All in One. Thomson Press (discussion of defect triggers).
  • Software Engineering Institute’s collection of papers on project management, with extensive discussion of project risks. https://seir.sei.cmu.edu/seir/
  • Weinberg, G. (1993). Quality Software Management. Volume 2: First Order Measurement. Dorset House.

Rounding errors (see Calculations)

Scenario testing (See also Use-case-based testing)

Self-verifying data

Specification-based testing (See also active reading; See also ambiguity analysis)

State-model-based testing

  • Auer, A.J. (1997). State Testing of Embedded Software. Ph.D. Dissertation (Computer Science). Oulun Yliopisto (Finland).
  • Becker, S.A. & Whittaker, J.A. (1997). Cleanroom Software Engineering Practices. IDEA Group Publishing.
  • Buwalda, H. (2003). Action figures. Software Testing & Quality Engineering. March/April 42-27. http://www.logigear.com/articles-by-logigear-staff/245-action-figures.html
  • El-Far, I. K. (1999). Automated Construction of Software Behavior Models. M.Sc. Thesis, Florida Institute of Technology.
  • El-Far, I. K. & Whittaker, J.A. (2001), Model-based software testing, in Marciniak, J.J. (2001). Encyclopedia of Software Engineering, Wiley. http://testoptimal.com/ref/Model-based Software Testing.pdf
  • Jorgensen, A.A. (1999). Software Design Based on Operational Modes. Doctoral Dissertation, Florida Institute of Technology. https://cs.fit.edu/Projects/tech_reports/cs-2002-10.pdf
  • Katara, M., Kervinen, A., Maunumaa, M., Paakkonen, T., & Jaaskelainen, A. (2007). Can I have some model-based GUI tests please? Providing a model-based testing service through a web interface. Conference of the Association for Software Testing. http://practise.cs.tut.fi/files/publications/TEMA/cast07-final.pdf
  • Mallery, C.J. (2005). On the Feasibility of Using FSM Approaches to Test Large Web Applications. M.Sc. Thesis (EECS). Washington State University.
  • Page, A., Johnston, K., & Rollison, B.J. (2009). How We Test Software at Microsoft. Microsoft Press.
  • Robinson, H. (1999a). Finite state model-based testing on a shoestring. http://www.stickyminds.com/getfile.asp?ot=XML&id=2156&fn=XDD2156filelistfilename1%2Epdf
  • Robinson, H. (1999b). Graph theory techniques in model-based testing. International Conference on Testing Computer Software. http://sqa.fyicenter.com/art/Graph-Theory-Techniques-in-Model-Based-Testing.html
  • Robinson, H. Model-Based Testing Home Page. http://www.geocities.com/model_based_testing/
  • Rosaria, S., & Robinson, H. (2000). Applying models in your testing process. Information & Software Technology, 42(12), 815-24. http://www.harryrobinson.net/ApplyingModels.pdf
  • Schultz, C.P., Bryant, R., & Langdell, T. (2005). Game Testing All in One. Thomson Press.
  • Utting, M., & Legeard, B. (2007). Practical Model-Based Testing: A Tools Approach. Morgan Kaufmann.
  • Vagoun, T. (1994). State-Based Software Testing. Ph.D. Dissertation (Computer Science). University of Maryland College Park.
  • Whittaker, J.A. (1992). Markov Chain Techniques for Software Testing and Reliability Analysis. Ph.D. Dissertation (Computer Science). University of Tennessee.
  • Whittaker, J.A. (1997). Stochastic software testing. Annals of Software Engineering, 4, 115-131.

Stress testing

Task analysis (see also Scenario testing and Use-case-based testing)

  • Crandall, B., Klein, G., & Hoffman, R.B. (2006). Working Minds: A Practitioner’s Guide to Cognitive Task Analysis. MIT Press.
  • Diaper, D. & Stanton, N. (2004). The Handbook of Task Analysis for Human-Computer Interaction. Lawrence Erlbaum.
  • Ericsson, K.A. & Simon, H.A. (1993). Protocol Analysis: Verbal Reports as Data (Revised Edition). MIT Press.
  • Gause, D.C., & Weinberg, G.M. (1989). Exploring Requirements: Quality Before Design. Dorset House.
  • Hackos, J.T. & Redish, J.C. (1998). User and Task Analysis for Interface Design. Wiley.
  • Jonassen, D.H., Tessmer, M., & Hannum, W.H. (1999). Task Analysis Methods for Instructional Design.
  • Robertson, S. & Robertson, J. C. (2006, 2nd Ed.). Mastering the Requirements Process. Addison-Wesley Professional.
  • Schraagen, J.M., Chipman, S.F., & Shalin, V.I. (2000). Cognitive Task Analysis. Lawrence Erlbaum.
  • Shepherd, A. (2001). Hierarchical Task Analysis. Taylor & Francis.

Test design / test techniques (in general)

Test idea catalogs

Testing skill

Many of the references in this collection are about the development of testing skill. However, a few papers stand out, to me, as exemplars of papers that focus on activities or structures designed to help testers improve their day-to-day testing skills. We need more of these.

Tours

Usability testing

  • Cooper, A. (2004). The Inmates are Running the Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity. Pearson Education.
  • Cooper, A., Reimann, R. & Cronin, D. (2007). About Face 3: The Essentials of Interaction Design. Wiley.
  • Dumas, J.S. & Loring, B.A. (2008). Moderating Usability Tests: Principles and Practices for Interacting. Morgan Kaufmann.
  • Fiedler, R.L., & Kaner, C. (2009). “Putting the context in context-driven testing (an application of Cultural Historical Activity Theory).” Conference of the Association for Software Testing. http://www.kaner.com/pdfs/FiedlerKanerCast2009.pdf
  • Ives, B., Olson, M.H., & Baroudi, J.J. (1983). The measurement of user information satisfaction. Communications of the ACM, 26(10), 785-793. http://portal.acm.org/citation.cfm?id=358430
  • Krug, S. (2005, 2nd Ed.). Don’t Make Me Think: A Common Sense Approach to Web Usability. New Riders Press.
  • Kuniavsky, M. (2003). Observing the User Experience: A Practitioner’s Guide to User Research. Morgan Kaufmann.
  • Lazar, J., Feng, J.H., & Hochheiser, H. (2010). Research Methods in Human-Computer Interaction. Wiley.
  • Nielsen, J. (1994). Guerrilla HCI: Using discount usability engineering to penetrate the intimidation barrier. http://www.useit.com/papers/guerrilla_hci.html
  • Nielsen, J. (1999). Designing Web Usability. Peachpit Press.
  • Nielsen, J. & Loranger, H. (2006). Prioritizing Web Usability. New Riders Press.
  • Norman, D.A. (2010). Living with Complexity. MIT Press.
  • Norman, D.A. (1994). Things that Make Us Smart: Defending Human Attributes in the Age of the Machine. Basic Books.
  • Norman, D.A. & Draper, S.W. (1986). User Centered System Design: New Perspectives on Human-Computer Interaction. CRC Press.
  • Patel, M. & Loring, B. (2001). Handling awkward usability testing situations. Proceedings of the Human Factors and Ergonomics Society 45th Annual Meeting. 1772-1776.
  • Platt, D.S. (2006). Why Software Sucks. Addison-Wesley.
  • Rubin, J., Chisnell, D. & Spool, J. (2008). Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests. Wiley.
  • Smilowitz, E.D., Darnell, M.J., & Benson, A.E. (1993). Are we overlooking some usability testing methods? A comparison of lab, beta, and forum tests. Proceedings of the Human Factors and Ergonomics Society 37th Annual Meeting, 300-303.
  • Stone, D., Jarrett, C., Woodroffe, M. & Minocha, S. (2005). User Interface Design and Evaluation. Morgan Kaufmann.
  • Tullis, T. & Albert, W. (2008). Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics (Interactive Technologies). Morgan Kaufmann.

Use-case based testing (see also Scenario testing and Task analysis)

  • Adolph, S. & Bramble, P. (2003). Patterns for Effective Use Cases. Addison-Wesley.
  • Alexander, Ian & Maiden, Neil. Scenarios, Stories, Use Cases: Through the Systems Development Life-Cycle.
  • Alsumait, A. (2004). User Interface Requirements Engineering: A scenario-based framework. Ph.D. dissertation (Computer Science), Concordia University.
  • Berger, Bernie (2001) “The dangers of use cases employed as test cases,” STAR West conference, San Jose, CA. http://www.testassured.com/docs/Dangers.htm
  • Charles, F.A. (2009). Modeling scenarios using data. STP Magazine. http://www.quality-intelligence.com/articles/Modelling%20Scenarios%20Using%20Data_Paper_Fiona%20Charles_CAST%202009_Final.pdf
  • Cockburn, A. (2001). Writing Effective Use Cases. Addison-Wesley.
  • Cohn, M. (2004). User Stories Applied: For Agile Software Development. Pearson Education.
  • Collard, R. (July/August 1999). Test design: Developing test cases from use cases. Software Testing & Quality Engineering, 31-36.
  • Hsia, P., Samuel, J., Gao, J., Kung, D., Toyoshima, Y. & Chen, C. (1994). Formal approach to scenario analysis. IEEE Software, 11(2), 33-41.
  • Jacobson, I. (1995). The use-case construct in object-oriented software engineering. In John Carroll (ed.) (1995). Scenario-Based Design. Wiley.
  • Jacobson, I., Booch, G. & Rumbaugh, J. (1999). The Unified Software Development Process. Addison-Wesley.
  • Jacobson, I. & Bylund, S. (2000) The Road to the Unified Software Development Process. Cambridge University Press.
  • Kim, Y. C. (2000). A Use Case Approach to Test Plan Generation During Design. Ph.D. Dissertation (Computer Science). Illinois Institute of Technology.
  • Kruchten, P. (2003, 3rd Ed.). The Rational Unified Process: An Introduction. Addison-Wesley.
  • Samuel, J. (1994). Scenario analysis in requirements elicitation and software testing. M.Sc. Thesis (Computer Science), University of Texas at Arlington.
  • Utting, M., & Legeard, B. (2007). Practical Model-Based Testing: A Tools Approach. Morgan Kaufmann.
  • Van der Poll, J.A., Kotze, P., Seffah, A., Radhakrishnan, T., & Alsumait, A. (2003). Combining UCMs and formal methods for representing and checking the validity of scenarios as user requirements. Proceedings of the South African Institute of Computer Scientists and Information Technologists on Enablement Through Technology. http://dl.acm.org/citation.cfm?id=954014.954021
  • Zielczynski, P. (2006). Traceability from use cases to test cases. http://www.ibm.com/developerworks/rational/library/04/r-3217/

User interface testing

User testing (see beta testing)

  • Albert, W., Tullis, T. & Tedesco, D. (2010). Beyond the Usability Lab: Conducting Large-Scale Online User Experience Studies. Morgan Kaufmann.
  • Wang, E., & Caldwell, B. (2002). An empirical study of usability testing: Heuristic evaluation vs. user testing. Proceedings of the Human Factors and Ergonomics Society 46th Annual Meeting. 774-778.

Testing tours: Research for Best Practices?

June 24th, 2011

A few years ago, Michael Bolton wrote a blog post on “Testing Tours and Dashboards.” Back then, it had recently become fashionable to talk about “tours” as a fundamental class of tool in exploratory testing. Michael reminded readers of the unacknowledged history of this work.

Michael’s post also mentioned that some people were talking about running experiments on testing tours. Around that time, I heard a bunch about experiments that purported to compare testing tours. This didn’t sound like very good work to me, so I ignored it. Bad ideas often enjoy a burst of publicity, but if you leave them alone, they often fade away over time.

This topic has come up again recently, repeatedly. A discussion with someone today motivated me to finally publish a comment.

The comments have come up mainly in discussions or reviews of a test design course that I’m creating. The course (video) starts with a demonstration of a feature tour, followed by an inventory of many of the tours I learned from Mike Kelly, Elisabeth Hendrickson, James Bach and Mike Bolton.

Why, the commentator asks, do I not explain which tours are better and which are worse? After all, hasn’t there been some research that shows that Tour X is better than Tour Y? This continues down one of two tracks:

  • Isn’t it irresponsible to ignore scientific research that demonstrates that some tours are much better than others or that some tours are ineffective?
  • Shouldn’t I be recommending that people do more of this kind of research? Wouldn’t this be a promising line of research? Couldn’t someone propose it to a corporate research department or a government agency that funds research? After all, this could be a scientific way to establish some Best Practices for exploratory testing (use the best tours, skip the worst ones).

This idea, using experiments to rank tours from best to worst, can certainly be made to sound impressive.

I don’t think this is a good idea. I’ll say that more strongly: even though this idea might be seductive to people who have training in empirical methods (or who are easily impressed by descriptions of empirical work), I think it reflects a fundamental lack of understanding of exploratory testing and of touring as a class of exploratory tools.

A tour is a directed search through the program. Find all the capabilities. Find all the claims about the product. Find all the variables. Find all the intended benefits. Find all the ways to get from A to B. Find all the X. Or maybe not ALL, but find a bunch.

This helps the tester achieve a few things:

  1. It creates an inventory of a class of attributes of the product under test. Later, the tester can work through the inventory, testing each one to some intended level of depth. This is what “coverage”-oriented testing is about. You can test N% of the program’s statements, or N% of the program’s features, or N% of the claims made for the product in its documentation–if you can list it, you can test it and check it off the list.
  2. It familiarizes the tester with this aspect of the product. Testing is about discovering quality-related information about the product. An important part of the process of discovery is learning what is in the product, how people can / will use it, and how it works. Tours give us different angles on that multidimensional learning problem.
  3. It provides a way for the tester to explain to someone else what she has studied in the product so far, what types of things she has learned and what basics haven’t yet been explored. In a field with few trustworthy metrics, this gives us a useful basis for reporting progress, especially progress early in the testing effort.
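The inventory idea in the first point ("if you can list it, you can test it and check it off the list") can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TourInventory` class, its method names, and the four example features are hypothetical, not part of any real tool or of the course.

```python
from dataclasses import dataclass, field


@dataclass
class TourInventory:
    """Items found on one tour, plus the subset tested so far (hypothetical helper)."""
    tour: str                                   # e.g. "feature tour", "variable tour"
    items: set = field(default_factory=set)     # everything the tour turned up
    tested: set = field(default_factory=set)    # what has been checked off

    def found(self, item):
        """Record something discovered while touring."""
        self.items.add(item)

    def mark_tested(self, item):
        """Check an item off; only inventoried items can be checked off."""
        if item not in self.items:
            raise KeyError(f"{item!r} is not in the {self.tour} inventory")
        self.tested.add(item)

    def coverage(self):
        """Fraction of inventoried items tested (0.0 for an empty inventory)."""
        return len(self.tested) / len(self.items) if self.items else 0.0

    def remaining(self):
        """Inventoried items not yet tested, in alphabetical order."""
        return sorted(self.items - self.tested)


# Made-up feature list, standing in for the result of a feature tour.
inv = TourInventory("feature tour")
for feature in ["open file", "save file", "print", "export PDF"]:
    inv.found(feature)
inv.mark_tested("open file")
inv.mark_tested("print")
print(f"{inv.coverage():.0%} covered; remaining: {inv.remaining()}")
```

The same structure also serves the third point: the `remaining()` list is a simple progress report on what has and has not been explored yet. Note that the percentage is only as meaningful as the inventory behind it.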

So which tour is better?

From a test-coverage perspective, I think that depends a lot on contract, regulation, and risk.

  1. To the extent that you have to know (and have to be able to tell people in writing) that all the X’s have been tested and all the X’s work, you need to know what all the X’s are and how to find them in the program. That calls for an X-tour. Which tour is better? The one you need for this product. That probably varies from product to product, no?
  2. Some programmers are more likely to make X-bugs than Y-bugs. Some programmers are sloppy about initializing variables. Some are sloppy about boundary conditions. Some are sloppy about thread interactions. Some are good coders but they design user interactions that are too confusing. If I’m testing Joe X’s code, I want to look for X-bugs. If Joe blows boundaries, I want to do a variable tour, to find all the variables so I can test all the boundaries. But if Joe’s problem is incomprehensibility, I want to do a benefit tour, to see what benefits people _should_ get from the program and how hard/confusing it is for users to actually get them. Which tour is better? That depends on which risks we are trying to mitigate, which bugs we are trying to find. And that varies from program to program, programmer to programmer, and on the same project, from time to time.

From a tester-learning perspective, people learn differently from each other.

  1. If I set 10 people the task of learning what’s in a program, how it can be used, and how it works, those people would look for different types of information. They would be confused by different things. They would each find some things more interesting than others. They would already know some things that their colleagues didn’t.
  2. Which tour is better? The tour that helps you learn something you’re trying to learn today. Tomorrow, the best tour will be something different.

Testing is an infinite task. Define “distinct tests” this way: two tests are distinct if each can expose at least one bug that the other would miss. For a non-trivial program, there is an infinite number of distinct potential tests. The central problem of test design is boiling down this infinite set to a (relative to infinity) tiny collection of tests that we will actually use. Each test technique highlights a different subset of this infinity. In effect, each test technique represents a different sampling strategy from the space of possible tests.
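The definition of “distinct tests” above can be written down directly. The sketch below assumes, purely for illustration, that we already know which bugs each test could expose (in real testing we do not, which is the whole difficulty); the test names and bug labels are invented.

```python
def are_distinct(bugs_a, bugs_b):
    """Two tests are distinct if each can expose at least one bug the other would miss."""
    return bool(bugs_a - bugs_b) and bool(bugs_b - bugs_a)


# Hypothetical mapping from test name to the set of bugs that test could expose.
exposes = {
    "t_boundary_low": {"off_by_one", "negative_input"},
    "t_boundary_high": {"off_by_one", "overflow"},
    "t_overflow_only": {"overflow"},
}

# Each boundary test can catch something the other cannot, so they are distinct.
print(are_distinct(exposes["t_boundary_low"], exposes["t_boundary_high"]))

# t_overflow_only exposes nothing that t_boundary_high would miss, so these are not distinct.
print(are_distinct(exposes["t_boundary_high"], exposes["t_overflow_only"]))
```

Under this toy model, a test technique amounts to a rule for choosing which keys of `exposes` to run; different techniques pick different subsets of the same (in reality unbounded) space.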

Tours help the tester gain insight into the multidimensional nature of this complex, infinite space. They help us imagine, envision, and as we gain experience on the tour, prioritize the different sampling strategies we could use when we do more thorough, more intense testing after finishing the tours. So which tour is best? The ones that give the testers more insight and that achieve a greater stretch of the testers’ imagination. For this, some tours will work better for me, others for you.

The best tour isn’t necessarily the one that finds the most bugs. Or covers the most statements (or take your pick of what coverage attribute) of the product. The best tour is the one that helps the individual human tester learn something new and useful. (That’s why we call it exploration. New and useful knowledge.) And that depends on what you already know, on what risks and complexities characterize this program, and what the priorities are in this project.

A tour that is useful to you might be worthless to me. But that doesn’t stop it from being useful for you.

  • Rather than looking for a few “best” tours, I think it would be more interesting to develop guidance on how to do tour X given that you want to. (Do you know how to do a tour of uses of the program that trigger multithreaded operations in the system? Would it be interesting to know how?)
  • Rather than looking for a few “best” tours, I think it would be more interesting to develop a more diverse collection of tours that we can do, with more insight into what each can teach us.
  • Rather than seeking to objectify and quantify tours, I think we should embrace their subjectivity and the qualitative nature of the benefits they provide.

Pseudo-academics and best-practicers trivialize enough aspects of testing. They should leave this one alone.

A new brand of snake oil for software testing

May 19th, 2010

I taught a course last term on Quantitative Investment Modeling in Software Engineering to a mix of undergrad and grad students of computer science, operations research and business. We had a great time, we learned a lot about the market, about modeling, and about automated exploratory testing (more on this type of exploratory testing at this year’s Conference of the Association for Software Testing…)

In the typical undergraduate science curriculum, most of the experimental design we teach to undergraduates is statistical. Given a clearly formulated hypothesis and a reasonably clearly understood oracle, we learn how to design experiments that control for confounding variables, so that we can decide whether our experimental effect was statistically significant. We also teach some instrumentation, but in most cases, the students learn how to use well-understood instruments as opposed to how to appraise, design, develop, calibrate and then apply them.

Our course was not so traditionally structured. In our course, each student had to propose and evaluate an investment strategy. We started with a lot of bad ideas. (Most small investors lose money. One of the attributes of our oracle is, “If it causes you to lose money, it’s probably a bad idea.”) We wanted to develop and demonstrate good ideas instead. We played with tools (some worked better than others) and wrote code to evolve our analytical capabilities, studied some qualitative research methods (hypothesis-formation is a highly qualitative task), ran pilot studies, and then eventually got to the formal-research stages that the typical lab courses start at.

Not surprisingly, the basics of designing a research program took about 1/3 of the course. With another course, I probably could have trained these students to be moderately-skilled EVALUATORS of research articles. (It is common in several fields to see this as a full-semester course in a doctoral program.)

Sadly, few CS doctoral programs (and even fewer undergrad programs) offer courses in the development or evaluation of research, or if they offer them, they don’t require them.

The widespread gap between having a little experience replicating other people’s experiments and seeing some work in a lab, on the one hand, and learning to do and evaluate research on the other hand: this gap is the home court for truthiness. In the world of truthiness, it doesn’t matter whether the evidence in support of an absurd assertion is any good, as long as we can make it look to enough people as though good enough evidence exists. Respectable-looking research from apparently-well-credentialed people is hard for someone to dispute if, like most people in our field, one lacks training in critical evaluation of research.

The new brand of snake oil is “evidence-based” X, such as “evidence-based” methods of instruction or, in a recent proposal, evidence-based software testing. Maybe I’m mistaken in my hunch about what this is about, but the tone of the abstract (and what I’ve perceived in my past personal interactions with the speaker) raises some concerns.

Jon Bach addresses the tone directly. You’ll have to form your own personal assessments of the speaker. But I agree with Jon that this does not sound merely like advocacy of applying empirical research methods to help us improve the practice of testing, an idea that I rather like. Instead, the wording suggests a power play that seems to me to have less to do with research and more to do with the next generation of ISTQB marketing.

So let me talk here about this new brand of snake oil (“Evidence-Based!”), whether it is meant this way by this speaker or not.

The “evidence-based” game is an interesting one to play when most of the people in a community have limited training in research methods or research evaluation. This game has been recently fashionable in American education. In that context, I think it has been of greatest benefit to people who make money selling mediocritization. It’s not clear to me that this movement has added one iota of value to the quality of education in the United States.

In principle, I see 5 problems (or benefits, depending on your point of view). I say, “in principle” because of course, I have no insight into the personal motives and private ideas of Dr. Reid or his colleagues. I am raising a theoretical objection. Whether it is directly applicable to Dr. Reid and ISTQB is something you will have to decide yourself, and these comments are not sufficient to lead you to a conclusion.

  1. It is easy to promote forced results from worthless research when your audience has limited (or no) training in research methods, instrumentation, or evaluation of published research. And if someone criticizes the details of your methods, you can dismiss their criticisms as quibbling or theoretical. Too many people in the audience will be stuck making their decision about the merits of the objection on the personal persuasiveness of the speakers (which snake oil salesmen excel at) rather than on the underlying merits of the research.
  2. When one side has a lot of money (such as, perhaps, proceeds from a certification business), and a plan to use “research” results as a sales tool to make a lot more money, they can invest in “research” that yields promotable results. The work doesn’t have to be competent (see #1). It just has to support a conclusion that fits with the sales pitch.
  3. When the other side doesn’t have a lot of money, when the other side are mainly practitioners (not much time or training to do the research), and when competent research costs a great deal more than trash (see #2 and #5), the debates are likely to be one-sided. One side has “evidence” and if the other side objects, well, if they think the “evidence” is so bad, they should raise a bunch of money and donate a bunch of time to prove it. It’s an opportunity for well-funded con artists to take control of the (apparent) high road. They can spew impressive-looking trash at a rate that cannot possibly be countered by their critics.
  4. It is easy for someone to do “research” as a basis for rebranding and reselling someone else’s ideas. Thus, someone who has never had an original thought in his life can be promoted as the “leading expert” on X by publishing a few superficial studies of it. A certain amount of this goes on already in our field, but largely as idiosyncratic misbehavior by individuals. There is a larger threat. If a training organization will make more money (influence more standards, get its products mandated by more suckers) if its products and services have the support of “the experts”, but many of “the experts” are inconveniently critical, there is great marketing value in a vehicle for papering over the old experts with new-improved experts who have done impressive-looking research that gives “evidence-based” backing to whatever the training organization is selling. Over time, of course, this kind of plagiarism kills innovation by bankrupting the innovators. For companies that see innovation as a threat, however, that’s a benefit, not a problem. (For readers who are wondering whether I am making a specific allegation about any person or organization, I am not. This is merely a hypothetical risk in an academic’s long list of hypothetical risks, for you to think about in your spare time.)
  5. In education, we face a classic qualitative-versus-quantitative tradeoff. We can easily measure how many questions someone gets right or wrong on simplistic tests. We can’t so easily measure how deep an understanding someone has of a set of related concepts or how well they can apply them. The deeper knowledge is usually what we want to achieve, but it takes much more time and much more money and much more research planning to measure it. So instead, we often substitute the simplistic metrics for the qualitative studies. Sadly, when we drive our programs by those simplistic metrics, we optimize to them and we gradually teach to the superficial and abandon the depth. Many of us in the teaching community in the United States believe that over the past few years, this has had a serious negative impact on the quality of the public educational system and that this poses a grave threat to our long-term national competitiveness.

Most computer science programs treat system-level software testing as unfit for the classroom.

I think that software testing can have great value, that it can be very important, and that a good curriculum should have an emphasis on skilled software testing. But the popular mix of ritual, tedium, and moralizing that has been passed off by some people as testing for decades has little to offer our field, and even less for university instruction. I think ISTQB has been masterful at selling that mix. It is easy to learn and easy to certify. I’m sure that a new emphasis, “New! Improved! Now with Evidence!” could market the mix even better. Just as worthless, but with even better packaging.

Conference of the Association for Software Testing

March 18th, 2010

I’ll be keynoting at CAST on investment modeling and exploratory test automation.

In its essence, exploratory testing is about learning new things about the quality of the software under test. Exploratory test automation is about using software to help with that learning.

Testing the value of investment models is an interesting illustration because it is all about the quality of the software, it is intensely automated, and none of the tests are regression tests. This is one illustration of automated exploration. I’ll point to others in my talk and paper.

THERE’S STILL TIME TO SUBMIT YOUR OWN PROPOSAL FOR A PRESENTATION. (The official deadline is March 20, but if you’re a few days late, I bet the conference committee will still read your proposal.)

Here’s the link: http://conferences.associationforsoftwaretesting.org/

What’s different about CAST is that we genuinely welcome discussion and debate. Any participant can ask questions of any speaker. Any participant can state her or his counter-argument to a speaker. We’ll keep taking questions all day. Artificial time limits are not used to cut off discussion. After all, the point of a conference is “conferring.”

See you there…

An award from ACM

March 2nd, 2010

The Association for Computing Machinery’s Computers & Society Special Interest Group just honored me with their “Person Who Made a Difference” award. Here’s what it’s about:

Making a Difference Award

This award is presented to an individual who is widely recognized for work related to the interaction of computers and society. The recipient is a leader in promoting awareness of ethical and social issues in computing. The recipients of this award and the award itself encourage responsible action by computer professionals.

This is a once-in-a-lifetime award, which goes to (at most) one person per year.

ACM has printed a biographic summary that highlights the work that led to this award. We’ll probably add an interview soon.

The award presentation and talk will probably be at the Computers, Freedom & Privacy conference in San Jose in June.

I’ll probably also be talking about this work at a QAI meeting in Chicago, this May.