The task is to develop a 10 KDSI embedded-mode communication product on a microprocessor.
We need to determine the effect of various features on the overall development effort
and cost.
Communication software is generally very complex. The plan is
to use highly capable analysts and programmers.
Can the project cost be reduced by using less expensive personnel, who
would cost only $5000 per month?
$10,000 could buy 96K of memory to replace the 64K. Is it cost-effective?
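One way to explore the personnel question is to compare effort and cost under different cost-driver ratings. The sketch below assumes the Intermediate COCOMO embedded-mode equation MM = 2.8 x KDSI^1.20 x (product of multipliers) and the COCOMO 81 multiplier values for complexity (CPLX) and analyst/programmer capability (ACAP, PCAP). The ratings chosen and the $6000 monthly rate for the more capable staff are assumptions for illustration only; they are not given in these notes.

```python
from math import prod

# Intermediate COCOMO, embedded mode (assumed from Boehm 1981):
# MM = 2.8 * KDSI**1.20 * product of cost-driver multipliers
A_EMBEDDED, B_EMBEDDED = 2.8, 1.20

def effort_mm(kdsi, multipliers):
    """Person-months: nominal effort times the product of the multipliers."""
    return A_EMBEDDED * kdsi ** B_EMBEDDED * prod(multipliers.values())

# Assumed ratings: very high complexity in both cases,
# high versus low analyst and programmer capability.
capable = {"CPLX": 1.30, "ACAP": 0.86, "PCAP": 0.86}
cheaper = {"CPLX": 1.30, "ACAP": 1.19, "PCAP": 1.17}

# Monthly rates: $5000 is from the question; $6000 is a hypothetical rate.
for label, drivers, rate in (("capable", capable, 6000),
                             ("cheaper", cheaper, 5000)):
    mm = effort_mm(10, drivers)
    print(f"{label}: {mm:.1f} MM, cost ${mm * rate:,.0f}")
```

Under these assumed ratings the lower monthly rate is outweighed by the extra effort, which is the kind of trade-off the question is probing.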
The equation is determined through a 4-step process.
Large variations between the Basic COCOMO estimates and the actuals are
mostly eliminated by the use of the cost driver factors in the Intermediate
COCOMO.
The Intermediate COCOMO estimates are within 20% of the actuals 68% of the time.
It was developed to fit the "Rayleigh curve" to data extracted from a large
project.
It is a dynamic multivariable model that assumes a specific distribution of
effort over the life of a software development project.
The mathematics is fairly complex and is not needed for this subject.
Hence the man-month as a unit for measuring the size of a job is a
dangerous and deceptive myth.
Many models were derived from an examination of the empirical statistical
relationships between size, effort, and schedule, using multivariate models
to come up with
A model is most useful for prediction at the beginning of product
development, well before the exact number of lines of code
can be known.
A framework that relies on the functionality of software as derived from the
requirements, i.e. the model is based on an evaluation of several measures of
the domain for which a software project is defined.
Spelling checker specification: the checker accepts as input a document file and an
optional personal dictionary file. The checker lists all words not contained in either
the dictionary or personal dictionary files. The user can query the number of words
processed and the number of spelling errors found at any stage during the
processing.
FP = UFC x TCF
Assume average weights,
UFC = 4A + 5B + 4C + 10D + 7E = 58
Assume the dictionary file and the misspelt word report are considered
complex,
UFC = 4A + (5x2 + 7x1) + 4C + 10D + 10E = 63
Say
- F3, F5, F9, F11, F12, F13 = 0
- F1, F2, F6, F7, F8, F14 = 3
- F4, F10 = 5
TCF = 0.65 + 0.01 (18 + 10) = 0.93
FP = 63 x 0.93 = 59
If each FP takes about 2 person-days ==> this project needs 118 person-days.
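The function point arithmetic can be checked with a short script. This is a sketch under assumptions: A to E are taken to be the counts of external inputs, outputs, inquiries, external files, and internal files, and the counts (2, 3, 2, 2, 1) are chosen because they reproduce UFC = 58 under average weights; the notes do not list them. The ratings sum to 6 x 3 + 2 x 5 = 28, so TCF = 0.65 + 0.28 = 0.93.

```python
# Average weights, as in UFC = 4A + 5B + 4C + 10D + 7E.
AVERAGE_WEIGHTS = {"A": 4, "B": 5, "C": 4, "D": 10, "E": 7}

def unadjusted_fc(counts, weights=AVERAGE_WEIGHTS):
    """Unadjusted function count: weighted sum of the item counts."""
    return sum(weights[kind] * counts[kind] for kind in counts)

def technical_complexity_factor(ratings):
    """TCF = 0.65 + 0.01 * (F1 + ... + F14), each Fi rated 0..5."""
    assert len(ratings) == 14 and all(0 <= f <= 5 for f in ratings)
    return 0.65 + 0.01 * sum(ratings)

# Assumed item counts for the spelling checker (not given in the notes).
counts = {"A": 2, "B": 3, "C": 2, "D": 2, "E": 1}
ufc_average = unadjusted_fc(counts)

# Complex case: one of the three outputs weighted 7, the internal file 10.
ufc_complex = 4 * 2 + (5 * 2 + 7 * 1) + 4 * 2 + 10 * 2 + 10 * 1

ratings = [0] * 6 + [3] * 6 + [5] * 2   # six factors at 0, six at 3, two at 5
tcf = technical_complexity_factor(ratings)
fp = ufc_complex * tcf
print(ufc_average, ufc_complex, round(tcf, 2), round(fp))
```

At roughly 2 person-days per function point, `round(fp) * 2` gives the effort estimate in person-days.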
Chapter 8 - Quality Measurement
References
- Pressman, chapter 17
- Pfleeger, chapter 5.4
- Sommerville, chapter 31.4
- Fenton, chapter 10
Quality == conformance to requirements (Crosby)
Axiom of software development
"Good internal structure" ==> Good external quality
Quality Measurement
- Internal attributes of a product, process, or resource are
those which can be measured purely in terms of the product, process,
or resource itself (see chapter 8 in Tutorial Guide)
- External attributes of a product, process, or resource are
those which can only be measured with respect to how the product, process,
or resource relates to its environment
Internal attributes
- Used for quality control and assurance
- The building blocks for measuring complexity
- Can be measured early during SW projects
- Are crucial for evaluating the efficiency of software methods
- Will also assure
- The external attributes expected by SW users, e.g. reliability,
usability
- The external process attributes expected by managers, e.g. productivity,
cost-effectiveness
Measurement Metrics
- Size
- Crucial inputs to cost and effort prediction systems
- Specification documents, e.g. length, functionality, complexity of
the underlying problem
- Modularity and information flow
- High/medium level design
- Measures various aspects of modularity, e.g. coupling, cohesion, tree
impurity, reuse and information flow
- Control-flow structure
- Detail design
- Identify those programs or modules which are likely to be difficult
to test and maintain
- Data structure
- Detail design
- Identify those programs or modules which are likely to be difficult
to test and maintain
Size
- There is some consensus on measuring the length of programs but not of
specifications or designs
- There is some work on measuring functionality of specifications
- There is very little work on measuring problem complexity other than some
under the subject of computational complexity
Size - Length
A line of code is any line of program text that is not a comment or blank line,
regardless of the number of statements or fragments of statements on the line. This
specifically includes all lines containing program headers, declarations and
executable and non-executable statements.
LOC = NCLOC (non-comment) + CLOC (comment)
Density of comment = CLOC/LOC
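A minimal sketch of these counts, assuming whole-line comments marked by a single prefix (here `#`). A line that holds code followed by a trailing comment is counted as non-comment, and blank lines are counted as neither; both choices are assumptions of this sketch rather than part of the definition above.

```python
def count_lines(source, comment_prefix="#"):
    """Return (NCLOC, CLOC, comment density) for a block of source text."""
    ncloc = cloc = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue                      # blank lines are not counted
        if stripped.startswith(comment_prefix):
            cloc += 1                     # whole-line comment
        else:
            ncloc += 1                    # code, possibly with trailing comment
    loc = ncloc + cloc
    density = cloc / loc if loc else 0.0
    return ncloc, cloc, density

sample = "# setup\nx = 1\n\ny = 2  # note\n# done\n"
print(count_lines(sample))
```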
Size of Specification and Design
- Not easy to have a good measure
- Frequently used in the industry, no. of pages
- Counts of different atomic attributes
- Text - pages, lines, words, ...
- Diagrams - uniform syntax; atomic objects for different types
of diagrams and symbols
DeMarco 1982
| View | Diagram | Atomic object |
| Functional | Data-flow diagram | Bubbles |
| | Data dictionary | Data elements |
| Data | Entity relation diagram | objects, relations |
| State | State transition diagram | states, transitions |
Modularity and Information Flow
A module is a contiguous sequence of program statements, bounded by boundary elements,
having an aggregate identifier. (Yourdon 1979)
- Intra-modular attributes
- Attributes of individual modules
- Inter-modular attributes
- Attributes of a system viewed as a collection of modules with
dependencies
Module Call Graph
A module calls another module
- Module A calls B,C
- Module B calls D
- Module C calls D,E

Module Call Graph: dependencies between data
- Initialize
- If (x<y)
- then A = B
- else A = C
- D = A

Related Metrics
- Views of modularity
- M1 = modules/procedures
- M2 = modules/variables
- Morphology
- Size: no. of nodes, number of arcs
- Depth: length of the longest path from root to a leaf node
- Width: max. number of nodes at any one level
- Arc-to-node ratio: connectivity density measure
- Tree impurity

- The extent to which a graph deviates from being a tree (Fenton
chapter 10.4.4)
- Reuse
- Public reuse: the proportion of a product which was constructed externally
PR = |P2| / (|P1| + |P2|), where |P1| is the size of the new part and
|P2| is the size of the external part
- Private reuse: the extent to which modules within a product are reused within
the same product
r(G) = e - n + 1, where e is the number of edges and n is the number of nodes
- Coupling
- A measure of the degree of interdependence between modules
- No standard numerical characterization of this attribute
- Suggested to use an ordinal scale of measurement
- Cohesion
- An attribute of individual modules, describing their relative
functional strength, i.e. the extent to which the individual module
components are needed to perform the same task
- Information Flow
- Fan-in of a module M is the number of local flows that terminate
at M, plus the number of data structures from which information is
retrieved by M
- Fan-out of a module M is the number of local flows that emanate
from M, plus the number of data structures that are updated by M
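The morphology measures can be tried on the small call graph from the Module Call Graph example (A calls B and C, B calls D, C calls D and E). This sketch assumes an acyclic graph and places each node at the level given by its longest path from the root. The tree impurity formula used, m(G) = 2(e - n + 1) / ((n - 1)(n - 2)), is the one usually attributed to Fenton; it is not stated in these notes, so treat it as an assumption.

```python
calls = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": []}

def morphology(graph, root):
    """Size, depth, width, arc-to-node ratio and tree impurity of a call graph."""
    n = len(graph)
    e = sum(len(targets) for targets in graph.values())
    # Level of a node = length of the longest path from the root (acyclic graph).
    level = {root: 0}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in graph[node]:
            if level.get(child, -1) < level[node] + 1:
                level[child] = level[node] + 1
                stack.append(child)
    depth = max(level.values())
    width = max(sum(1 for v in level.values() if v == d)
                for d in range(depth + 1))
    return {
        "nodes": n,
        "arcs": e,
        "depth": depth,
        "width": width,
        "arc_to_node": e / n,
        "private_reuse": e - n + 1,                           # r(G)
        "tree_impurity": 2 * (e - n + 1) / ((n - 1) * (n - 2)),  # needs n > 2
    }

print(morphology(calls, "A"))
```

A pure tree has e = n - 1, so both r(G) and the impurity are 0; here module D is called from two places, which adds one extra arc.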
Coupling
For a pair of modules, x and y, we have
| Bad | Content, R5(x,y) | x refers to the inside of y |
| | Common, R4(x,y) | x, y refer to the same global data |
| | Control, R3(x,y) | x passes a parameter to y in order to control its behavior |
| | Stamp, R2(x,y) | x, y accept the same record type as a parameter |
| | Data, R1(x,y) | x, y communicate by parameters |
| Good | No coupling, R0(x,y) | x, y have no communication |
A Coupling-Model Graph

- Coupling between modules x and y
- c(x,y) = i + n/(n+1)
- i: the number (0 to 5) of the worst coupling type between x and y
- n: the number of interconnections between x and y
- Global coupling of a system S
- C(S) = median of {c(Di,Dj): 1 <= i < j <= n}
for modules {D1, ..., Dn}.
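The pairwise and global coupling measures translate directly into code. In this sketch the module names and the (i, n) values per pair are hypothetical; pairs that are not listed are assumed to have no coupling (i = 0, n = 0).

```python
from itertools import combinations
from statistics import median

def coupling(i, n):
    """c(x,y) = i + n/(n+1): worst coupling level i, n interconnections."""
    return i + n / (n + 1)

def global_coupling(modules, pairs):
    """C(S): median of c over all unordered pairs of modules."""
    values = [coupling(*pairs.get(frozenset(pair), (0, 0)))
              for pair in combinations(modules, 2)]
    return median(values)

# Hypothetical system of three modules.
modules = ["D1", "D2", "D3"]
pairs = {
    frozenset({"D1", "D2"}): (1, 2),   # data coupling, 2 interconnections
    frozenset({"D1", "D3"}): (3, 1),   # control coupling, 1 interconnection
}
print(global_coupling(modules, pairs))
```

Because n/(n+1) is always below 1, the integer part of c(x,y) still identifies the worst coupling type, while the fraction grows with the number of interconnections.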
Cohesion
| Good | 6, functional | performs a single function |
| | 5, sequential | performs sequential functions |
| | 4, communicational | > 1 function, but these all operate on the same body of data |
| | 3, procedural | > 1 function, and these are related only by a general procedure |
| | 2, temporal | > 1 function, and these are related only by the fact that they must occur within the same time span |
| | 1, logical | > 1 function, and these are related only logically |
| Bad | 0, coincidental | > 1 function, and these are unrelated |
Cohesion ratio = no. of modules having functional cohesion/ total no. of modules
Information Flow
Henry-Kafura, 1981
CM = Length(M) x [fan-in(M) x fan-out(M)]^2
where length is the size of the module M.
Shepperd 1990
CM = [fan-in(M) x fan-out(M)]^2
Word Processor

| Module | FI | FO | (FI x FO)^2 | Length | CM |
| WC | 2 | 2 | 16 | 30 | 480 |
| FD | 2 | 3 | 36 | 11 | 396 |
| CW | 4 | 4 | 256 | 40 | 10240 |
| DR | 2 | 0 | 0 | 23 | 0 |
| GDN | 0 | 2 | 0 | 14 | 0 |
| RD | 3 | 1 | 9 | 28 | 252 |
| FWS | 1 | 2 | 4 | 46 | 184 |
| PW | 2 | 1 | 4 | 29 | 116 |
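The information flow values for the word processor modules can be recomputed from fan-in, fan-out, and length. The sketch below takes those three inputs from the table and applies both formulas; the function names are illustrative only.

```python
# name: (fan_in, fan_out, length), taken from the Word Processor table
modules = {
    "WC": (2, 2, 30), "FD": (2, 3, 11), "CW": (4, 4, 40), "DR": (2, 0, 23),
    "GDN": (0, 2, 14), "RD": (3, 1, 28), "FWS": (1, 2, 46), "PW": (2, 1, 29),
}

def shepperd(fan_in, fan_out):
    """Shepperd's measure: (fan-in x fan-out) squared."""
    return (fan_in * fan_out) ** 2

def henry_kafura(fan_in, fan_out, length):
    """Henry-Kafura measure: length x (fan-in x fan-out) squared."""
    return length * shepperd(fan_in, fan_out)

for name, (fi, fo, length) in modules.items():
    print(f"{name:4} {shepperd(fi, fo):4} {henry_kafura(fi, fo, length):6}")
```

For PW this gives 4 x 29 = 116, and CW stands out with 10240, which marks it as the module to look at first.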
Control-Flow Structure
Directed graphs (flow graphs)
- Nodes correspond to program statements
- Edges correspond to flow of control between the
corresponding statements

Flow graphs
Hierarchical Measures
- Depth of nesting measure
- Number of nodes measure
- Number of edges measure
- Largest prime measure
- Number of occurrences of named primes measure
- Is D-structured measure
Recursive measure: number of simple paths through F for each prime F
(Fenton chapter 10.3)
Data Structures
- Global measures
- Number of variables
- Halstead, u2
- u2 = no. of variables + no. of unique constants
+ no. of labels
- Fail to identify complexity if it is hidden in the data
structure
- Boehm 1981
- D/P = Database size in bytes or characters /
Program size in DSI, where DSI is delivered source
instructions
- Used by the DATA field in the COCOMO model
| DATA | Multiplier |
| Low (D/P < 10) | 0.94 |
| Nominal (10 <= D/P < 100) | 1.00 |
| High (100 <= D/P < 1000) | 1.08 |
| Very high (D/P >= 1000) | 1.16 |
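The DATA table is a simple threshold lookup on the D/P ratio. A minimal sketch, with a hypothetical function name and example sizes:

```python
def data_multiplier(database_bytes, program_dsi):
    """Return the COCOMO DATA effort multiplier for a given D/P ratio."""
    ratio = database_bytes / program_dsi
    if ratio < 10:
        return 0.94   # Low
    if ratio < 100:
        return 1.00   # Nominal
    if ratio < 1000:
        return 1.08   # High
    return 1.16       # Very high

# Hypothetical product: 2,000,000 bytes of data, 10,000 DSI, so D/P = 200 (High).
print(data_multiplier(2_000_000, 10_000))
```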
Complexity
A term to capture the totality of all internal attributes.
- McCabe's Cyclomatic Complexity Measures (1976)
- v(F) = e - n + 2
- It measures the number of linearly independent walks through
a program (flowgraph)
- Halstead's Volume Metric (1977)
- Estimated program length = n1 log2 n1 + n2 log2 n2
- Program volume = N log2 (n1 + n2), where N = N1 + N2
- where n1 is the number of distinct operators that appear in a program
n2 is the number of distinct operands
N1 is the total number of operator occurrences
N2 is the total number of operand occurrences
- Kitchenham 1984
- No better correlation with external measures than a simple count
of LOC.
- Fenton
- Largest prime subgraph within the control flowgraph of a program
- Does not lend itself to measurement on a ratio scale
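Both the cyclomatic number and the Halstead quantities are easy to compute once the counts are known. The sketch below takes the counts as inputs rather than parsing a program; the example figures are hypothetical, and N is taken as N1 + N2.

```python
from math import log2

def cyclomatic(edges, nodes):
    """McCabe: v(F) = e - n + 2 for a single connected flowgraph."""
    return edges - nodes + 2

def halstead(n1, n2, N1, N2):
    """Return (length N, estimated length, volume) from Halstead's counts."""
    N = N1 + N2
    estimated_length = n1 * log2(n1) + n2 * log2(n2)
    volume = N * log2(n1 + n2)
    return N, estimated_length, volume

# A single IF-THEN-ELSE: decision, then, else, join = 4 nodes, 4 edges.
print(cyclomatic(edges=4, nodes=4))

# Hypothetical counts: 4 distinct operators and operands, 8 occurrences each.
print(halstead(n1=4, n2=4, N1=8, N2=8))
```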
Chapter 9 - Fagan Inspection
References
- Pressman, chapter 17.2 - 17.3
- Pfleeger, chapter 7.2
- Sommerville, chapter 24.1
Fagan in the 1970s
He noticed that no inspection of designs was routinely practised during
software development ==>
He developed a set of procedures for inspection of software designs, source code,
test plans and test cases.
He argued that software development should
- Clearly define the programming process as a series of operations,
each with its own exit criteria
- Measure the completeness of the product at any point of its development
by inspections or test
- Use the measurements to control the process
Types of Inspection
- I0, Initial design
- The initial design specification is inspected against the
statement of requirements in order to remove defects and ambiguities, and
ensure that it provides a sound basis for detailed design
- I1, Detail design
- The logic specifications are inspected against the initial design
specification to remove defects and ambiguities and to provide a sound
basis for coding
- I2, Coding
- From source code to testing
- IT1, Test plan preparation
- It is inspected against the functional
specification to ensure that it will accurately and comprehensively
verify the program functions
- IT2, Test case preparation
- The test case listings are inspected against the test plan to ensure that
they match the plan, that proper instructions are provided for carrying
out the tests, and that the test harness or stub code is correct.
Testing
- Testing of software is a means of measuring or assessing the software to
determine its quality
- Testing is dynamic assessment of the software
- Sample input
- Actual outcome compared with expected outcome
Testing and Debugging
- Testing is different from debugging
- Debugging is removal of defects in the software, a correction process
- Testing is an assessment process
- Testing consumes 40-50% of the development effort
Purpose of Testing
- Testing paradox
- To gain confidence: a successful test is one where the software
does what it should
- To reveal errors: a successful test is one that finds an error
- In practice: a mixture of defect-revealing and correct-operation tests
The Testing Team
- Planning
- Moderator
- Schedule activities, distribute materials
- Overview
- Authors and other team members
- Education
- Preparation
- All members
- Familiarization
- Inspection
- Re-work
- Follow-up
- Moderator and authors
- Assure rework is correct, improve development process, improve
inspection efficiency
Necessary Condition for Testing
- A controlled/observed environment, because tests must be exactly
reproducible
- Sample input - test uses only small sample input (limitation)
- Predicted result - the results of a test ideally should be predictable
- The actual output must be comparable with the expected output
Exit Criteria
Exit criteria are used to judge whether a given development phase is complete. They are usually
defined locally by individual organizations. Some examples:
- I0: external specifications are completed
- I1: design specifications must be structured
- I2: module prologue up-to-date and complete
Test Design and Test Execution
When are tests designed
- As late as possible - just before it is needed for test execution - worst
case
- As early as possible - right after the requirement specification is ready - ideal
case
- Strike a good balance
Level of Testing in Software Development
- Unit test
- Function of the basic unit of software is tested in isolation
- performed by programmers who produce the code
- Tests are derived from the detail logic of the unit
- Test is to find errors in the individual units, in either data or logic
- Beware of OO
- Component integration testing
- Test functionality of components formed by the combinations of units
- Two kinds of errors
- Interface between units
- Functions supported by multiple units in the components
- System testing
- After integration testing is completed
- Entire system is tested as a whole
- Requirements specification is used to derive test cases
- Acceptance testing
- Mark the transition from ownership by the developers to ownership by
the users
- Different from development testing
- Executed by the accepting organization
- A demonstration not an error revealing process
- Include testing of the user organization's working practices,
ensure the system is compatible with the business practices
Check Lists
Sets of questions which the testers will ask, and which will prompt them to look
for certain types of defect
- I0, Is the design consistent with the program requirements? External
specifications
- I1, Are all constants defined? Logic
- I2, Is each field initialized properly before its first use? storage,
usage
Defect Recording and Classification
- Severity
- Category
- Type
- The type code as in checklists
Management of Testing
What test metrics are being used?
- Error rates: in design, in test execution, in operations or field use
- Error severity and distribution in different development stages
- Time/cost in test design
- Time/cost in test running
- Time/cost in bug fixing
- Time/cost in inspections or reviews
More Test Metrics
- Defect present/remaining
- An estimate of the no. of defects in a given module prior to
test/still remaining after testing
- Defect density
- Counts of defects in a module/size of the module
- Defect removal efficiency
- Percentage of defects present that is removed at a given
test step
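Defect density and defect removal efficiency are simple ratios. The sketch below uses hypothetical figures and measures module size in KLOC, which is an assumption; the notes only say "size of the module".

```python
def defect_density(defects, size_kloc):
    """Defects per thousand lines of code in a module."""
    return defects / size_kloc

def removal_efficiency(removed, present):
    """Percentage of the defects present that a test step removed."""
    return 100.0 * removed / present

# Hypothetical module: 2.5 KLOC, 40 defects estimated before test,
# 30 of them removed by the unit test step.
print(defect_density(40, 2.5))
print(removal_efficiency(30, 40))
print(40 - 30)   # defects estimated to remain after the step
```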
Management of Testing - continued
What testing policies are used?
- Objectives of testing
- Economic constraints
- Who decides how many, what, and the severity standard
- Documentation of tests
- Quality and acceptance criteria for test documents
- What tools will be available
Cost of testing
- Test script preparation
- Test data set up
- Debugging tests
- Test execution
- Debugging software
- re-test
- Regression
Inspection rates in NCSS/hour
| Inspection Step | I0 | I1 | I2 |
| Overview | 500 | 500 | - |
| Preparation | 200 | 100 | 125 |
| Inspection/testing | 250 | 130 | 150 |
| Rework | - | 50 | 60 |
Effective Testing
- Right balance between the cost of testing and the cost of not testing
- Concentrate on the most difficult and complex part of the system
- Software which is difficult to use or least liked is often the most
error-prone
- The most likely place to find errors is where you have found the most
errors already (DeMarco '82)
Management of Testing - continued
How much testing is enough?
- Stop when you have no more time
- Stop when no error has been discovered in the current test cases
- When enough errors have been found
- Coverage measurement -% of the structural elements of the
software exercised (Graham)
Personnel assessment - Must not be DONE
Who makes the best tester?
- Make it an important/prestigious job
- Make the tester represent the true interests of the user
- Independent test team
- A professional but friendly team
Test Documentation
- Test plan
- Features to be tested and not to be tested
- Test approach
- Pass/fail criteria
- Test deliverables
- Test environment
- Schedule
- Risk
- Test case specification
- Test case id
- Input specification
- Expected outcome including non-functional qualities, e.g.
response time
- Environmental needs
- Procedural requirement: setup, precondition
- Intercase dependencies
- Test procedure specification
- Purpose, reference to test case
- Special requirement, if any
- Procedure steps
- Logging
- Set up
- Start
- Actions
- Measurement taken
- Shut down/restart/stop
- Test log
- Items or cases tested
- Activities/events/time
- Execution description
- Success/failure of the test
- Unexpected events
- Incident report id
- Incident report
- Summary
- Incident description
- Impact on test plan
- Severity assessment
- Test summary report
Test Management Pitfalls
- Absence of policy/strategy/plan
- Inadequate and inefficient test data
- Uneven testing
- Important test omitted
- Too much testing of a simpler area
- Lack of support
- Tool support
- Developer support
- Technical support
Test Quality
How to ensure testing quality?
- Inspection
- Pre-test document: plan, cases, ...
- Test results: log, incident, ...
- Test result analysis
- On-time support
- Enough tool support
- Standby design support
- Use team approach
Static Testing
- Software is examined but not executed to discover programming errors
- Error types
- Dead code
- Infinite loops
- Uninitialized variables
- Unused data values
- Standard violations
- Use compiler-like techniques to build tools that examine code
Black Box (Functional) Test
- Given all functionality defined in the requirements
- Testing to validate each function is fully operational
- Practitioners find the black box approach more useful
- What are the techniques available to design functional tests
Equivalence Partitioning
- Input values which are treated the same way by the software can be
regarded as equivalent.
- e.g. if x > 5 then y := y + 1 else y := y - 1
- Equivalence classes: { x | x > 5 } and { x | x <= 5 }
Boundary Value Analysis
- Values that lie at the edge of an equivalence partition are boundary values
- Boundary values are very sensitive; very often, software treats one as a
value in the wrong equivalence class
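Equivalence partitioning and boundary value analysis together suggest a small test set for the x > 5 example: one representative from each class, plus the values on either side of the boundary. This sketch assumes integer inputs; the function name `adjust` is hypothetical.

```python
def adjust(x, y):
    """The example fragment: if x > 5 then y := y + 1 else y := y - 1."""
    return y + 1 if x > 5 else y - 1

# (input x, expected change in y)
test_cases = [
    (100, +1),   # representative of the class x > 5
    (-100, -1),  # representative of the class x <= 5
    (5, -1),     # boundary value, still in the class x <= 5
    (6, +1),     # first value in the class x > 5
]

for x, expected_delta in test_cases:
    assert adjust(x, 0) == expected_delta, f"failed for x = {x}"
print("all partition and boundary cases pass")
```

A typical defect here is writing `>=` instead of `>`; only the boundary case x = 5 would reveal it, which is why the boundary values are tested separately.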
Other techniques
- Finite State Machine Testing
- Seek to exercise sufficient states and transitions
- Random Testing
- Generate randomly according to user profile
- Error guessing
White Box (Structural) Test
- Assume full knowledge of designs and implementation
- To validate all components work as specified in the
design (documentation)
- Academics tend to place more emphasis on white box testing
Code Structure Coverage
- Run a set of test cases
- Analyze the % of structures that the test cases have exercised
- Module coverage
- Procedure coverage
- Code statement coverage
It was found (Open Univ. '84) that a good set of functional tests often
achieves only 60%-80% code coverage
Branch or Decision Coverage
- IF-THEN-ELSE has 2 branches
- Case statement has multiple branches
- Decision coverage is the % of branches exercised by a test case
- Typical "good" functional test cases only achieve 40% branch
coverage
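Decision coverage can be illustrated by instrumenting a single IF-THEN-ELSE by hand. This is only a sketch; in practice a coverage tool records the branches, and the names below are hypothetical.

```python
ALL_BRANCHES = {"then", "else"}

def adjust(x, y, taken):
    """Instrumented IF-THEN-ELSE that records which branch ran in `taken`."""
    if x > 5:
        taken.add("then")
        return y + 1
    taken.add("else")
    return y - 1

def decision_coverage(inputs):
    """Percentage of branches exercised by running the given inputs."""
    taken = set()
    for x in inputs:
        adjust(x, 0, taken)
    return 100.0 * len(taken) / len(ALL_BRANCHES)

print(decision_coverage([7, 9]))   # both inputs take the same branch
print(decision_coverage([7, 3]))   # one input per branch
```

The first test set executes the function twice yet leaves the else branch untested, which is how functional tests that look thorough can still score low on branch coverage.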
Testing in Maintenance
- Tests performed for
- Planned changes
- Emergency fixes
- Objectives
- Establish confidence in the changed system
- Use a cost-effective approach to test maintenance changes
- Breadth testing: to get a basic level of confidence in the working of the
whole system
- Depth testing: probes the weak area in the system which has a high
risk of generating errors due to the change
- Regression testing: a continuous effort to ensure that the software has
not degraded from its previous quality
Audit of Testing Activities
What should be looked at if you want to audit the testing activities
of the vendor/contractor?
Testing Audit
- Test plan, test case
- Test log, incident report
- Test progress, result analysis
- Test result database
- Regression test
Critique of Testing
The final number of defects is only known at the end of the product
life-cycle, when we assume that any remaining defects can be regarded as
causes of failure.
We want to retrospectively estimate the "defect removal efficiency" of testing
at each step of development ==>
To predict the quality of the current product by analogy with past products
and past tests.
Remus and Zilles 1979
- Assume 2 defect removal steps
- Number of defects after step i:
d_{i+1} = (1 - p_i) d_i + b_i p_i d_i, or
d_{i+1} = (1 - p_i (1 - b_i)) d_i
- d_i is the number of defects existing in the product on entry to
step i
- p_i is the fraction of defects detected at step i
- b_i is the fraction of detected defects that are incorrectly repaired
in step i
- Assuming
- Detection fraction p_i is constant at all steps i
- Bad fix fraction b_i is constant at all steps i
- Defect retention fraction = (1 - p_i (1 - b_i)) = A is a constant
- Let n = total defects inserted at all steps
- k1 = defects detected in step 1 (inspection)
- k2 = defects detected in step 2 (testing)
- d_2 = n - k1 = A n
- d_3 = n - (k1 + k2) = A^2 n
- Therefore, A = k2/k1, and n = k1^2 / (k1 - k2)
- d_3 = A^2 n = k2^2 / (k1 - k2) = number of defects shipped
A note from the authors: n is not measurable during the development process
but it can be measured for products that have expended their useful life
already. These values of n can be used to estimate other quantities .....
and to manage the defect removal process.
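The two-step estimate can be written as a small function. The sketch takes A as the ratio k2/k1, which is what the relations n - k1 = A n and n - (k1 + k2) = A^2 n imply, and it requires k1 > k2 for the estimate to be meaningful. The example counts are hypothetical.

```python
def remus_zilles(k1, k2):
    """Estimate (A, total defects n, defects shipped) from two removal steps.

    k1: defects detected in step 1 (inspection)
    k2: defects detected in step 2 (testing)
    """
    if k1 <= k2:
        raise ValueError("the model needs k1 > k2 (a retention fraction below 1)")
    retention = k2 / k1                 # A
    total = k1 ** 2 / (k1 - k2)         # n
    shipped = k2 ** 2 / (k1 - k2)       # d_3 = A^2 n
    return retention, total, shipped

# Hypothetical counts: 100 defects found in inspection, 50 in testing.
print(remus_zilles(100, 50))
```

With these counts half of the defects survive each step, so 200 were inserted in total and 50 are estimated to remain after both steps.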
Problems with Defect Density Estimates
- Every software product is a "prototype" ==>
No objective assessment of the predictive value of such techniques
seems to be available
- Inspection and testing effort may not be constant ==>
We are certainly not justified in assuming similar efficiency of
defect detection
- The estimation takes no account of product use ==>
Meaning of "product life"?
Testing Standard
- ANSI/IEEE Std 829-1983, Software Testing Documentation
- ANSI/IEEE Std 1008-1987, Software Unit Testing
- ANSI/IEEE Std 1012-1988, Software Verification and Validation Plans
Chapter 10 - Dependability Measurement
References
- Pressman, chapter 17
- Pfleeger, chapter 7.6
- Sommerville, chapter 19-20
- Fenton chapter 8, 11
Failure of Complex Systems
Physical: A hardware component develops a fault. The cause is physical and
the fault appears at a certain time during operation. Repair consists of replacement
of the faulty component to restore the system to its previous good state.
Design: A fault in the design of the system is activated under certain trigger conditions.
The fault may have been present for some time, but latent. Repair consists of a
corrective change to remove the fault.
Dependability
The extent to which the user can justifiably depend on the service
delivered by a system (Laprie 1992)
Dependability is not a single attribute, but consists of
"RAMURSES"
- Reliability
- Availability
- Maintainability
- Usability
- Recoverability
- Safety
- Extendability
- Security
Dependability - Indirect Measures
The probability that the system will deliver a required service, under given
conditions of use, for a given time interval, without
relevant incident.
- Maintainability
- Availability
- Recoverability
- Extendability
Dependability - Direct Measures
Two types
- Event time data: list of inter-event times
- Event count data: counts of events, total operational time, ...
Software Quality Assurance
What is SQA?
- Is SQA an organization?
- Is SQA a philosophy?
- Is SQA a set of activities?
Software Quality Assurance
- Stage one of SQA - 1960-1979
- Starting in the late 1960s, several IBM locations occasionally
used the term software quality assurance. The context was
primarily final product testing
- 1960-1970, requests for proposal for complex military equipment
would have "software quality assurance": conduct a test program
on embedded software
- Influence from DOD - 1974
- DOD Std MIL-S-52779 (AD) "Software Quality Assurance Program
Requirements"
- Require contractors to address the following aspects
- Methods to evaluate design and design documentation
- Documentation of standards
- Library control procedures
- Testing, plans, test results, etc. ...
- 75 SQA Program Requirement
- Forces contractors to identify techniques to deliver a quality product
- Does not address quality improvement
- SQA is basically an extension of testing and review, and
bookkeeping of all the results
- In short, SQA is a policeman or arbiter, not a direct agent to
promote quality
- Stage two of SQA - 1979-1988
- Commissioned by the DOD, a workshop was held to define
a minimum acceptable standard for SQA
- DOD-STD-2168 was released in 1988
- After that, many major defense contractors started to establish an
SQA department, usually under the quality assurance directorate
- Stage three of SQA - 1979-
- Influence from TQM
- Bell Labs: in the ESS program, SQA is "a set of rules and procedures used
to guide the development, administration, maintenance, and
improvement of the ESS program".
- Three distinct meanings of SQA - 80's
- A comprehensive approach to improve software, the process of
programming, and the control of software projects
- Diligent testing
- Verification and validation
- Impact of the new approach: SQA does not assure the quality of software; it
ensures the planning and execution of a quality program
- Emphasis on analyzing the attributes of software structures
- Emphasis on accumulating and analyzing fault data
- Emphasis on accumulating and analyzing process data
- Bellcore SQA criteria (85)
- Software life cycle plan
- Management commitment and organization
- Development support environment
- Documentation
- Verification and validation procedures
- Configuration management
- Problem reporting and corrective action
- Data collection, analysis, and use
- Customer engineering and operations
- Geared more towards the broader sense of SQA
Current Practices
Emphasis on defect removal. Passive methods:
- Requirement and design reviews
- Inspection of pseudo code
- Code coverage analysis
- Code inspection
Emphasis on defect removal. Active methods:
- Functional testing
- Structural testing
SQA role:
- Independent test team
- Independent reviewers
- Oversee review process
- In-house audit team
SQA - besides defect removal
SQA also has effect on:
- Delivery of reliable and maintainable product
- Control of project to reduce risk of late delivery
- General improvement in the quality of future software products
- These goals are driven by customer satisfaction
SQA - also a TQM issue
Also a technology-people-management combination.
TQM: A quality system is the agreed on, companywide and plantwide operation work
structure, documented in effective, integrated technical and managerial procedures,
for guiding the coordinated actions of the people, the machines, and the
information of the company and plant in the best and most practical ways to assure
customer quality satisfaction and economical costs of quality.
SQA Quality Program
- Training programming staff in new techniques, methods, and tool use
- Evaluation of effectiveness of current development methods and tools
- Project, quality program, and test program planning as policy or
standards dictate
- Use of reviews, analysis tools, and tests to find defects at the
earliest possible time
- Library control, change control, distribution, and storage per project
plan and relevant policies or standards
- Recording of all defects data and subsequent analysis of defect, fault
detection and failure modes
- Use of defect data to improve processes
- Generation and analysis of various data for early indication of adverse
product or project control trends
- Gathering, analyzing, and evaluating user feedback
- Survey of potential software vendors and surveillance of their performance
- Objective evaluation of the fidelity with which plans and applicable standards
are followed
- Empowerment of staff to prevent defective code, artifacts of development, and
user documentation from being entered into the system or delivered
SQA Techniques
- Pareto Analysis
- Vilfredo Pareto (1906) published a treatise on the skewed distribution
of wealth among various people
- Albert Endres of IBM found that a remarkably small number of modules in a
large system produced most of the errors
- E.g. 0.7% of the modules account for 15% of the faults
- 10% account for 24%
- Look for trouble-making modules or processes
- Trend Data
- Use trend data to evaluate quality of software product
- Set control curves with the help of past data
- E.g. a code inspection that results in a density of detected errors
greater than a given upper limit suggests disturbingly faulty code;
one with a density smaller than the lower limit may indicate ineffective
testing

- Audits
- To confirm compliance with approved procedures or standard
- Audit is different from review
- It is directed to seeing that things are being done in the correct
way, while reviews focus on the substance of the material or
product under review
- Failure analysis
- Look for common failure modes
- Correlation of failure with operation conditions
- Root cause analysis
- Support quality improvement
- Direct measurement of software
- Measurement of goodness
- Number of defects with severity
- Time required to fix a bug (maintainability)
- Time taken by a user to do an operation (usability)
- Structure measurement
- Complexity of the software: cyclomatic measure
- Module coupling and cohesion
- Error seeding
- Plant A seeds in the software
- B is the number of bugs found
- C is the number of seeds recovered in testing
- Then N = A x B / C is the number of bugs expected in the
software
- Regression
- New code for bug fixing is the most likely place for introduction
of new bugs
- Best case: re-run all test cases in the new load
- Practical case: select a subset that have good coverage to be
re-tested
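The error seeding estimate from the list above, N = A x B / C, can be sketched as follows. The figures are hypothetical, and the estimate is undefined when no seeds have been recovered.

```python
def expected_bugs(seeded, bugs_found, seeds_recovered):
    """Error seeding estimate: N = A x B / C."""
    if seeds_recovered == 0:
        raise ValueError("no seeds recovered yet, so the estimate is undefined")
    return seeded * bugs_found / seeds_recovered

# Hypothetical run: 20 seeds planted, 15 of them recovered, 30 real bugs found.
estimate = expected_bugs(seeded=20, bugs_found=30, seeds_recovered=15)
print(estimate)        # estimated number of bugs in the software
print(estimate - 30)   # estimated number still undiscovered
```

The reasoning is that testing is assumed to find real bugs at the same rate as seeds: recovering 15 of 20 seeds suggests about three quarters of the real bugs have been found.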
SQA Organization
- SQA should be an independent organization
SQA Costs
- Prevention cost: action taken to prevent defects in products
- Appraisal cost: tests, evaluations used to detect defects
- Failure cost: correction of defects
- Cost in general ranges from 2% to 8% of staffing cost
Quality Factors

It provides
- A goal-oriented methodology for measuring software quality
- Another dimension along which software requirements can be addressed
- Identify attributes important in a software product that will have
impact later in the life cycle
- A basis for process improvement
Definitions of Quality Factors

- Correctness: extent to which a program satisfies its specifications and
fulfills the user's mission objectives
- Reliability: extent to which a program can be expected to perform its
intended function with required precision
- Efficiency: the amount of computing resources required by a program
to perform a function
- Integrity: extent to which access to software or data by unauthorized
persons can be controlled
- Usability: effort required to learn, operate, prepare input, and interpret
output of a program
- Maintainability: effort required to locate and fix an error in an operational
program
- Testability: effort required to test a program to ensure it performs
its intended function
- Flexibility: effort required to modify an operational program
- Portability: effort required to transfer a program from one hardware
configuration and/or software system to another
- Reusability: extent to which a program can be used in other applications
- Interoperability: effort required to couple one system with another

SQA Plan
Each project should have a software quality assurance plan (SQAP). The IEEE
standard for SQAPs (1984) specifies the following sections:
- Purpose
- Reference documents
- Management
- Documentation
- Standards, practices and conventions
- Reviews and audits
- Software configuration management
- Problem reporting and corrective action
- Tools, techniques and methodologies
- Code control
- Media control
- Supplier control
- Records collection, maintenance, and retention
SQA Considerations
- SQA organizations are rarely staffed with sufficiently experienced or
knowledgeable people
- The SQA management team is often not capable of negotiating with
development
- Senior management backs development over SQA in a large percentage of
cases
- Many SQA organizations operate without suitably documented and approved
development standards and procedures
- Software development groups rarely produce verifiable quality plans
- SQA should avoid fixing development problems; they are not fire fighters
SQA Staffing
Good people are hard to find without paying enough $$$ !!!
Chapter 11
References
- Pressman, chapter 17
- Pfleeger, chapter 7.6
- Sommerville, chapters 19-20
- Fenton, chapters 8, 11
Software Reliability
Software Reliability Engineering (Musa): apply a statistical approach to
measuring, managing, and containing the risk of software failure, which
never goes to zero.
Ref. Musa, Iannino, Okumoto, Software Reliability: Measurement, Prediction,
Application
The Fundamental Modelling Problem
A prediction system which will allow us to predict the future ($T_i$,
$T_{i+1}$, ...) from the past ($t_0$, $t_1$, ..., $t_{i-1}$) comprises:
- A probabilistic model which specifies the distribution of any subset of
the $T_j$'s conditional on an unknown parameter $p$
- A statistical inference procedure for $p$ involving use of the available data
(realizations of the $T_j$'s)
- A prediction procedure combining the model and the inference procedure to
allow us to make probability statements about future $T_j$'s
Basic Concepts
Fault density: faults removed (after system test) per thousand lines of
delivered source code.
As a measure of reliability this is wrong and misleading: the number of faults
removed has no direct correlation with reliability. Removal of a large number
of faults can indicate either high-quality testing or low-quality coding.
Customer-oriented approach
- Customers are not concerned about the number of faults in the system
- They are concerned about how often failures occur and the cost associated
with them
- What are error, bug, fault and failure?
- Failure: a departure of operation from user expectation
- Fault: a defect in the program that, when executed under particular
conditions, causes a failure (also known as a bug)
- Error: a mistake made by a programmer which creates a fault
- Time measurements in reliability
- Execution time, calendar time, clock (elapsed) time
- Four general ways to characterize failure occurrence
- Time of failure
- Time interval between failures
- Cumulative failures experienced up to a given time
- Failures experienced in a time interval
- These are all random variables
- Failure intensity: the number of failures per unit time
- E.g. 0.001 failure/hour or 1 failure/1000 hours
- Reliability: the probability of failure-free operation of a software system for
a specified time in a specified environment
- E.g. a system may have a reliability of 0.92 for 8 hours when used by
``average users". I.e. on average the system will run without failure in
92 out of 100 eight-hour periods
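Failure intensity and reliability can be linked numerically once a failure process is assumed. The sketch below assumes a constant failure intensity, so that failure-free operation over a period follows an exponential law. That assumption, and the function names, are not taken from the notes.

```python
import math


def reliability(failure_intensity: float, hours: float) -> float:
    """Probability of failure-free operation, assuming a constant failure intensity."""
    return math.exp(-failure_intensity * hours)


def failure_intensity_for(target_reliability: float, hours: float) -> float:
    """Constant failure intensity that gives the target reliability over the period."""
    return -math.log(target_reliability) / hours


print(reliability(0.001, 8))           # close to 0.992
print(failure_intensity_for(0.92, 8))  # close to 0.0104 failures/hour
```

Under this assumption, a reliability of 0.92 over 8 hours corresponds to roughly one failure per 96 hours of operation.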
Failure Behaviour
Affected by two principal factors:
- The number of faults in the software
- The operational profile of execution
There is an inverse relationship between reliability and fault density: through testing and
defect removal, failure intensity (FI) tends to decrease and reliability increases.
Mean Time to Failure
Mean time to failure (MTTF) - the average of the next failure interval
- MTTF is used mainly for hardware failures
- Failure intensity is preferred in software reliability studies
- MTTF and FI are, nonrigorously, inverses of each other
In hardware, where repair and replacement are taken into account, there is also the
mean time between failures (MTBF):
MTBF = MTTF + MTTR (mean time to repair)
Classification of Models
Models can be classified by their mathematical structure, assumptions, parameter
estimation method, ...
- Time domain
- Stochastic/deterministic
- Failure modeling
- Models the manifestations of faults directly
- Repair modeling
- The repair process is modeled, and imperfect and delayed repair
are taken into account
- Additional features
- Whether the structure of the program is considered
- Parameters
- Data types
- Functional form of the derived measures
Reliability Modelling
- A software reliability model specifies the general form of the
dependence of the failure process on the factors concerned in the model
- The aim is to develop a model that predicts failure intensity (reliability) as a
function of execution time
- Parameters of the model can be established by
- Estimation - statistical inference applied to the failure data collected
- Prediction - determination from the properties of the software product
or the development process
- Once the model has been chosen and its parameters estimated, many failure characteristics
can be determined
- Average number of failures experienced at any point in time
- Average number of failures in a time interval
- Failure intensity at any point in time
- Probability distribution of failure intervals
- What constitutes a good software reliability model
- Good predictions of future failure behavior
- Produce useful quantities
- Simple to use
- Widely applicable
- Based on sound assumptions
Goel-Okumoto Model
- The failure process is assumed to be a non-homogeneous Poisson process
- Number of faults in the program, $N$
- A random variable
- Has a Poisson distribution
- $n = \mathrm{mean}(N)$
- $m(u)$ = expected number of faults found by time $u$:
$m(u) = n \left[ 1 - e^{-Zu} \right]$, where
- $m'(u) = Z \, (n - m(u))$ and
- $m(0) = 0$
- $Z$ - a constant of proportionality between the failure rate and $n - m(u)$,
the expected number of faults left
- Assumption: all faults make the same contribution to the failure rate
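The mean-value function and its derivative can be evaluated directly. The following is a minimal sketch with hypothetical parameter values (100 expected faults, Z = 0.05 per hour); the function names are illustrative only.

```python
import math


def expected_faults_found(n: float, z: float, u: float) -> float:
    """m(u) = n * (1 - exp(-Z u)): expected number of faults found by time u."""
    return n * (1.0 - math.exp(-z * u))


def failure_rate(n: float, z: float, u: float) -> float:
    """m'(u) = Z * (n - m(u)): proportional to the expected faults left."""
    return z * (n - expected_faults_found(n, z, u))


# Hypothetical parameters: n = 100 faults, Z = 0.05 per hour.
for u in (0, 10, 50, 200):
    found = expected_faults_found(100, 0.05, u)
    rate = failure_rate(100, 0.05, u)
    print(u, round(found, 1), round(rate, 3))
```

The printed rows show the expected faults found rising towards n while the failure rate decays towards zero.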
Duane's Learning Curve
It was originally used to model the trend of a user's increasing familiarity with
a machine.
Once a fault has been found and removed, it will not recur. No assumption is made
about the repair mechanism.
- Cumulative failure rate
- $q(u) = c(u)/u$ over operational time $u$
- The points $(u, q(u))$ tend to lie on a straight line with negative slope
when plotted on log-log graph paper
- $E(Q(u)) = E(C(u))/u = m(u)/u = a \, u^{b-1}$ for $b < 1$
- $m(u) = a \, u^{b}$
- $b < 1$: reliability growth
- $b = 1$: constant reliability
- $b > 1$: decaying reliability
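The straight line on log-log paper follows from the power law: the cumulative failure rate is a times u to the power (b - 1), so its log-log slope is b - 1. The sketch below checks this numerically, using hypothetical parameters a = 2.0 and b = 0.5.

```python
import math


def cumulative_failures(a: float, b: float, u: float) -> float:
    """m(u) = a * u**b: expected cumulative failures by operational time u."""
    return a * u ** b


def cumulative_failure_rate(a: float, b: float, u: float) -> float:
    """q(u) = m(u) / u = a * u**(b - 1)."""
    return cumulative_failures(a, b, u) / u


def log_log_slope(a: float, b: float, u1: float, u2: float) -> float:
    """Slope of the line through (u1, q(u1)) and (u2, q(u2)) on log-log axes."""
    q1 = cumulative_failure_rate(a, b, u1)
    q2 = cumulative_failure_rate(a, b, u2)
    return (math.log(q2) - math.log(q1)) / (math.log(u2) - math.log(u1))


# Hypothetical parameters: b < 1, so the slope is negative (reliability growth).
print(log_log_slope(2.0, 0.5, 10.0, 1000.0))  # approximately -0.5, i.e. b - 1
```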
Jelinski-Moranda
It models a failure as the manifestation of a fault, and assumes perfect, immediate repair
- $\mathrm{pdf}(t \mid c) = Z (n - c) \, e^{-Z (n - c) t}$
- $n$: number of faults in the program
- $Z$: the rate at which each individual fault is activated
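A small numerical sketch of the Jelinski-Moranda density follows. It reads c as the number of faults already removed, which is an assumption since the notes do not define it. The mean time to the next failure uses the standard fact that an exponential distribution with a given rate has mean 1/rate. All parameter values are hypothetical.

```python
import math


def jm_failure_rate(n: int, z: float, c: int) -> float:
    """Failure rate Z * (n - c) while n - c faults remain."""
    if not 0 <= c < n:
        raise ValueError("c must satisfy 0 <= c < n")
    return z * (n - c)


def jm_pdf(t: float, n: int, z: float, c: int) -> float:
    """Density of the time to the next failure after c faults have been removed."""
    rate = jm_failure_rate(n, z, c)
    return rate * math.exp(-rate * t)


def jm_mean_time_to_failure(n: int, z: float, c: int) -> float:
    """Mean of the exponential inter-failure time, 1 / (Z * (n - c))."""
    return 1.0 / jm_failure_rate(n, z, c)


# Hypothetical values: 100 faults, each activated at 0.01 per hour.
for removed in (0, 50, 90):
    print(removed, jm_mean_time_to_failure(100, 0.01, removed))
```

Each repair lowers the failure rate by the same amount Z, so the expected wait for the next failure lengthens as faults are removed.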
Problems with Early Models
The early models assume that all faults cause failures at the same rate.
This assumption is wrong (Adams 1984).
Two lessons to learn
- Large software contains many small faults
- The manifestation rates of individual faults can differ by many orders
of magnitude
The Bayesian Approach
A measure of our subjective degree of belief about an event
- Manifestation rate of fault i = random variable
- Failure rates can decrease between failures
- If we use a program for a long period and find no faults, then our
subjective belief that the program will not fail increases in
strength
Littlewood-Verrall Model
It represents the inter-failure times directly as random variables
- Assume: after each failure a repair is attempted, and that this changes
the failure rate
- It can allow for imperfect repair, since there is only a certain probability
that the failure rate will decrease
- 3 parameters
- $\alpha$, $\beta_1$, $\beta_2$: estimated by maximum likelihood (ML) from continuous data
- 2 sources of randomness
- input selection, debugging
Littlewood Stochastic Reliability Growth Model
It takes account of the difference in fault manifestation rates by modeling the
individual rates $Z_i$ as independent, identically distributed random variables with a
gamma distribution, $\mathrm{gamd}(z, h, s)$
- Failure rate $R$ = the sum of $(n - c)$ of the random variables, which has a
$\mathrm{gamd}(r, h, (n - c) s)$ distribution
- $r(u \mid c) = (n - c) \, s / (h + u + t)$
- $\mathrm{pdf}(t \mid R_i = r_i) = r_i \, e^{-r_i t}$
- Assumes that the times to failure $T_i$ after the $i$th fix are independent,
exponentially distributed random variables whose rates are themselves random
variables with a gamma distribution
- $\mathrm{pdf}(r_i) = \mathrm{gamd}(r_i, h(i), s)$
- Rate of occurrence of failures
- $r(c) = s / (t + h(c))$
- $h(c)$: scale factor of the gamma distribution
Most Common Models
- Basic execution time model
- Logarithmic Poisson execution time model (Musa, Okumoto 1984)
Tools
- Tools exist to perform the statistical inference for different models
- To produce various prediction results
- SRE Toolkit (Unix or DOS, from AT&T)
- SMERFS - Naval weapon lab
- SQMS - SQT Corp
Basic Execution Time Model
The failure intensity decreases by a constant amount with each failure experienced.

The failure intensity for the basic model as a function of failures experienced is
- $\lambda(\mu) = \lambda_0 \, (1 - \mu / \nu_0)$
- $\mu$: mean failures experienced
- $\lambda_0$: initial failure intensity
- $\lambda$: failure intensity
- $\nu_0$: total failures
- E.g. Assume that a program will experience 100 failures in infinite time. It has now
experienced 50. The initial failure intensity was 10 failures/CPU hr. The
current failure intensity is
- $\lambda(\mu) = \lambda_0 \, (1 - \mu / \nu_0) = 10 \, (1 - 50/100) = 5$ failures/CPU hr
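The worked example can be reproduced with a one-line function. This is a minimal sketch; the function and parameter names are illustrative and spell out the quantities defined above.

```python
def basic_failure_intensity(initial_intensity: float,
                            failures_experienced: float,
                            total_failures: float) -> float:
    """Failure intensity of the basic model after a given number of failures."""
    return initial_intensity * (1.0 - failures_experienced / total_failures)


# Values from the example: 10 failures/CPU hr initially, 50 of 100 failures seen.
print(basic_failure_intensity(10, 50, 100))  # 5.0
```

Because the relationship is linear, every failure experienced (and repaired) lowers the intensity by the same amount, here 0.1 failures/CPU hr.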
Logarithmic Poisson Execution Time Model
The decrement of failure intensity per failure is not constant; it becomes smaller
as more failures are experienced.

The failure intensity for the logarithmic Poisson model is
- $\lambda(\mu) = \lambda_0 \exp(-\theta \mu)$
- $\theta$: failure intensity decay parameter
- E.g. Assume that the initial failure intensity was 10 failures/CPU hr. The
failure intensity decay parameter is 0.02/failure. We will assume that 50
failures have been experienced. The current failure intensity is
- $\lambda(\mu) = \lambda_0 \exp(-\theta \mu) = 10 \exp(-0.02 \times 50) = 3.68$ failures/CPU hr
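The same calculation in code, as a minimal sketch with illustrative function and parameter names:

```python
import math


def log_poisson_failure_intensity(initial_intensity: float,
                                  decay_parameter: float,
                                  failures_experienced: float) -> float:
    """Failure intensity of the logarithmic model after a given number of failures."""
    return initial_intensity * math.exp(-decay_parameter * failures_experienced)


# Values from the example: 10 failures/CPU hr, decay 0.02/failure, 50 failures seen.
print(round(log_poisson_failure_intensity(10, 0.02, 50), 2))  # 3.68
```

The exponential form means the first failures removed reduce the intensity the most, and later ones progressively less.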
Parameter Determination
- Initial failure intensity
- Total failures for the basic model
- The decay parameter for logarithmic model
By estimation: e.g. use maximum likelihood estimation based on the data
set collected in the testing
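As a simpler stand-in for maximum likelihood, the two parameters of the basic model can be estimated with an ordinary least-squares line, because the failure intensity is linear in the number of failures experienced. The sketch below deliberately swaps in least squares for illustration. The function name and the data points are hypothetical.

```python
def fit_basic_model(failures, intensities):
    """Least-squares fit of intensity = intercept + slope * failures.

    Returns (initial failure intensity, total failures), where the initial
    intensity is the intercept and total failures is -intercept / slope.
    """
    count = len(failures)
    mean_x = sum(failures) / count
    mean_y = sum(intensities) / count
    sxx = sum((x - mean_x) ** 2 for x in failures)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(failures, intensities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -intercept / slope


# Hypothetical observations: (failures experienced, observed failure intensity).
observed_failures = [0, 25, 50, 75]
observed_intensity = [10.0, 7.5, 5.0, 2.5]
print(fit_basic_model(observed_failures, observed_intensity))  # approximately (10.0, 100.0)
```

Real failure data is noisy, so the fitted values would only approximate the underlying parameters; maximum likelihood estimation on inter-failure times is the approach the notes name.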
Incorporate Reliability in Life Cycle

Chapter 12 - Safety-Critical Software
References
- Pressman, chapter 9
- Sommerville, chapter 21
- Software Safety in Embedded Computer Systems, CACM Vol. 34, No. 2, Feb 91
Criticisms of Models
- Lack of independence of failures
- Faults can mask one another
- A single fault will manifest itself in a ``burst" failure
- Several faults in a single area will manifest themselves together
- Usage effects
- Imperfect observation and diagnosis of faults
- Imperfect repair
- Errors in previous repairs
- Delay in reporting, diagnosis and repair
- Update
- Real software is enhanced and re-issued frequently
- Discrete data
Software Safety
A software quality assurance activity that focuses on the identification and
assessment of potential hazards that may impact software negatively and cause an
entire system to fail
E.g. A computer-based cruise control for an automobile
- Causes uncontrolled acceleration that cannot be stopped
- Does not respond to depression of brake pedal
- Does not engage when switch is activated
- Slowly loses or gains speed
Software Ultra-Reliability
- Fault avoidance
- Proof of correctness
- Static analysis
- Structured programming
- Fault removal
- Fault tolerance: ``N-version" technique
Hazard Analysis
All hazards are identified: features of the system with a potential for leading
to an accident
- Risks are assessed with regard to their probability of materializing, and
the likely severity of the consequences if they do
- Software hazard analysis: an examination of the risks posed by the
possibility of software failure
Hazard Assessment
- Hazards can be viewed as falling along a continuum in terms of severity
- A threshold
- Several cutoff points
- Assessing their likelihood
- Probabilities of individual events and the overall probability for
the hazard are calculated; this is difficult because sufficient design
information is usually not available
- Qualitative assessments may be sufficient to provide the information needed for
resource allocation during development
Hazard Control
- Eliminate the hazards
- Minimize the hazards
- Include control devices: automatic control, lockouts, lock-ins,
interlocks
Fault-tree Analysis
- A given top-level event
- An accident that we wish to avoid
- All events that could lead up to it are identified
- The causes of them are in turn identified
- Repeat until a sufficient level of detail has been achieved
- A graph/tree of the sequential and concurrent combinations of events that can
lead to a hazardous event
- Failure modes and effect analysis
- Works from bottom-up
- Each way in which each component can fail is examined, together with
its effect on the whole system
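When probabilities can be attached to the basic events, a fault tree can be evaluated from the bottom up. The sketch below assumes that all basic events are independent, which is a strong simplification. The tree and its probabilities are hypothetical.

```python
from math import prod


def and_gate(probabilities):
    """Probability that all input events occur (independent events assumed)."""
    return prod(probabilities)


def or_gate(probabilities):
    """Probability that at least one input event occurs (independence assumed)."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical tree: the top-level event occurs if the sensor fails,
# OR if both the alarm AND its backup fail.
sensor_fails = 0.01
alarm_fails = 0.05
backup_fails = 0.10
top_event = or_gate([sensor_fails, and_gate([alarm_fails, backup_fails])])
print(round(top_event, 5))  # approximately 0.01495
```

In this made-up tree the sensor dominates the result, which is the kind of insight that helps decide where control devices or redundancy are most worthwhile.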
Patient Monitoring System Fault Tree

Process Certification
Faced with the impossibility of achieving and assuring ultra-dependability, the
alternative is to certify the process
- Process certification
- The developer follows a set of guidelines which are essentially a code of good
development practice, and produces evidence for the certifying
authority that this has been done
Chapter 13 - Data Collection
References
- Pressman, chapter 2.2 - 2.6
- Fenton, chapter 6, 8
Guiding Principle of Data Collection
Data should be collected with a clear purpose and a clear idea as to the
precise way in which they will be analyzed so as to yield the desired information
(Moroney 1951)
Six Criteria for Data Collection
- The data must contain information permitting identification of the types of
faults and changes made
- The data must include the cost of making changes and correcting faults
- Data to be collected must be defined as a result of clear specification of
the goals of the study
- Data should include studies of projects from production environments, involving teams
of programmers
- Data analysis should be historical, but data must be collected and validated concurrently
with development
- Data classification schemes to be used must be carefully specified for the sake
of repeatability of the study in the same and different environments
Some Observations
- Most data collection depends at some point on the willingness of people to
provide it
- Under pressure, data collection is always the first thing to be ditched; therefore
schemes which ask for too much are doomed
Data Requirement for Reliability Modeling
Mellor 1986
- Time on all sample systems up to each failure
- Is the failure the first manifestation of a fault or a repeat?
- What are the type of the fault and the severity of the failure?
- In which product, version and module is the fault located?
- Data must be ``clean"
- Apply to a single version of the product
- Run on a defined sample of systems for a given time
- Not be subject to interference from outside events
Common Problems
- Inadequate definition
- Definition of time
- Definition of failure
- Definition of modular structure
- Classification of severity of failure
- Type of fault
- Nature of the products
- Multiple installations
- Multiple products
- Multiple versions of product in field together
- Multiple repair levels
- Module breakdown
- Support activities
- Imperfect diagnosis
- Imperfect repair
- Delay in diagnosis
- Suppression of data
- Data collection
- Omission of running time
- Omission of failure
- Omission of time of failure
- Omission of other essential items
- Wrong classification
- Database integrity
- Cannot correlate running time with failure
- Faults and failures not distinguished and cross-referred
- Missing cross-references
- Confusing classifications
- Incompleteness
- Lack of standard identifier formats
Particular problems
- Lack of running time records
- Often because reporting systems are set up solely
to chase progress on fault clearance rather than to measure reliability
- Omission of occurrence time from failure reports
- Suppression of failure reports
- Use of inappropriate classification
- Failures are not necessarily diagnosed in the order in which they
occurred
- Faults which were manifest on the sample as repeats but not as first
manifestations
Need for Automatic Recording
To overcome these problems, automated collection systems have been developed
- Static
- Kitchenham 1984. A program history record system
- Dynamic
Chapter 14 - Process Maturity Framework
References
- Pfleeger, chapter 11.4
- Sommerville, chapter 18
- Fenton, chapter 13
Process Improvement Movement and CMM
Process Improvement program (Denning)
- Understand the current status of the process
- Develop a vision of the desired process
- Establish a list of required process-improvement actions in
priority order
- Produce a plan to accomplish these actions
- Commit the resources and execute the plan
- Start over at step 1
Some Principles
- Major changes to the software process must start at the top
- Process problems are management's responsibility
- Ultimately, everyone must be involved
- With a more mature process, individual actions are more structured,
efficient, and reinforcing
- Change is continuous. Human intensive processes are never static
- Reactive changes generally make things worse
- Every defect is an improvement opportunity
- Crisis prevention is more important than crisis recovery
- Software process changes won't stick by themselves - in the absence of conscious effort,
human processes behave like entropy
- It takes time, skill, and money to improve the software process
- Dedicating resources to improvement is self evident
Capability Maturity Model
- A framework to improve software process
- A basis for process evaluation and assessment
- CMM consists of five maturity levels
- CMM was developed by SEI (Software Engineering Institute)
- Sponsored by US Defense Department
- Used by DoD as a reference to evaluate tenders from software contractors
History
- Initiated by the DoD in 1986
- Nov. 1986 SEI began developing a process maturity framework
- Sept 1987, SEI released the first version of the framework
- 1990 SEI evolved the framework into CMM 1.0
- 1992 complete review of CMM 1.0
- 1993 CMM 1.1 was released based on the 92 review
Immature Software Organization
- No clearly defined processes to follow
- If processes are specified, they are not rigorously followed
- A reactionary software organization, focused on solving immediate crises
(fire fighting)
- Schedules and budgets are routinely exceeded
- If hard deadlines are imposed, product functionality and quality are often
compromised
- No objective way to judge the quality of product or process
- Quality activities like testing and reviews are often sacrificed to meet
schedules
Mature Software Organization
- Roles and responsibilities are clear within a project and across
organization
- Managers monitor quality of products and processes
- Objective and quantitative basis exist to measure quality
- Schedules and budgets based on historical performance
- Expected results for cost, schedule, functionality, and quality are
usually achieved
- Process very adaptive to new technology
Five Maturity Levels

Level 1: Initial
Behavior characterization:
- No stable environment for developing and maintaining software
- Difficult to meet commitments with an orderly engineering process
- During a crisis, projects typically abandon planned procedures and revert
to coding and testing
- Success depends entirely on having an exceptional manager and a seasoned
development team
- Frequently over budget and behind schedule
- Success depends on individuals, not on the organization
- No guarantee that a successful product can be repeated in the next project
Level 1: Key Challenges
- Project management
- Project planning
- Configuration management
- Software quality assurance
Level 2: Repeatable
Behavior characterization:
- Policies and procedures for managing software projects are established
- Planning and management of new projects are based on experience with
similar projects
- Basic management controls have been installed
- Project managers track costs, schedules, and functionality and identify
problems in meeting commitments when they arise
- Software requirements and work products are baselined
- Project standards are defined, and the organization ensures they are
faithfully followed
- Process capability can be summarized as disciplined because project planning
and tracking are stable and earlier successes can be repeated
- Strong in doing similar work, but faces risk when presented with new
challenges
- Lacks an orderly framework for improvement
Level 2: Key Process Areas
Key: to establish basic project management
- Requirements management
- Software project planning
- Software project tracking and oversight
- Software subcontract management
- Software configuration management
- Software quality assurance
Level 3: Defined
Behavior characterization:
- Typical process for development and maintenance across the organization
is documented (standard software process of the organization)
- Exploits effective software-engineering practices when standardizing its
processes
- A Software Engineering Process Group (SEPG) is responsible for the
organization's process activities
- An organization-wide training program ensures that the staff and managers have
the knowledge and skills required
- Project teams tailor the organization's standard software process to develop
their own defined process
- A well-defined process includes readiness criteria, inputs, standards and
procedures for performing the work, verification mechanisms, outputs,
completion criteria
- The capability can be summarized as standard and consistent because both software engineering
and management activities are stable and repeatable
Level 3: Key Process Areas
Need to establish an infrastructure that institutionalizes effective software
engineering and management processes:
- Organization process focus
- Organization process definition
- Training program
- Integrated software management
- Software product engineering
- Intergroup coordination
- Peer reviews
Level 4: Managed
Behavior characterization:
- Well-defined quantitative and quality goals for products and process set by
the organization
- Productivity and quality are measured for important process activities
across all projects
- Products are of predictably high quality
- An organization-wide process database is established, with resources to analyze
its data and maintain it
- Data in the database is used to evaluate key process activities
- Projects control their products and processes by narrowing the variation in their
performance to fall within acceptable quantitative boundaries
- Capability can be summarized as being quantifiable and predictable
- The organization can identify and address special causes when exceptional
circumstances occur
Level 4: Key Process Areas
To establish a quantitative understanding of both the software process and the
software work products:
- Quantitative process management - control a project's process
performance quantitatively
- Software quality management - quantitative understanding of the quality
of a project's products to achieve specific quality goals
Level 5: Optimizing
Behavior characterization:
- Focus on continuous improvement
- Organization has the means to identify weaknesses and strengthen the process
proactively, with the goal of preventing defects
- Data on process effectiveness is used to perform cost-benefit analysis of new
technologies and propose changes to the process
- Best SE practices are identified and transferred throughout the organization
- Data in the database is used to evaluate key process activities
- Rigorous defect-cause analysis and defect prevention
- Organized efforts to remove waste by changing the common causes of inefficiency
- Capability can be summarized as continuously improving, because the organization
strives to improve the range of its process capability
Level 5: Key Process Areas
To implement continuous and measurable process improvement:
- Defect prevention: identify cause and propose prevention
- Technology change management: identify beneficial new technologies and transfer
them in an orderly manner
- Process change management: continuously improve the processes in an orderly manner
Software Indicators
CMM has a set of software indicators which are consistent with the key practices
(key process areas) in CMM
- Progress
- Effort
- Cost
- Quality
- Stability
- Computer resource utilization
- Training
SEI Process Assessment




Chapter 15 - Maintenance Cost Estimation
References
- Pfleeger, chapter 20
- Pfleeger, chapter 10
- Sommerville, chapter 28
Any reasonably large software product is bound to need maintenance, and the
bear of maintenance cost is lumbering up on all vendors
Two Main Aspects to Modelling Maintenance Cost
- The flow of failure reports from the whole field
- The distribution of cost over failure reports
Maintenance Cost
- McCracken 1980
- Other intangible costs
- Customer dissatisfaction when seemingly legitimate requests for repair
or modification cannot be addressed in a timely manner
- Reduction in overall software quality as a result of changes that
introduce latent errors in the maintained software
- Upheaval caused during development efforts when staff must be pulled to
work on a maintenance task
- Final cost: a dramatic decrease in productivity
An Example - Belady 1972
Total effort expended on maintenance
- $M = P + K e^{(c - d)}$
- P is the productive effort, e.g. 40:1, $40 per line of code
- K is a constant
- c is a measure of complexity that can be attributed to a lack of good design
and documentation
- d is a measure of the degree of familiarity with the software
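The effect of the exponential term is easiest to see numerically. The sketch below evaluates the model, in which total effort is the productive effort plus K times e raised to (c - d), using hypothetical values in arbitrary effort units. The function and parameter names are illustrative.

```python
import math


def maintenance_effort(productive_effort: float, k: float,
                       complexity: float, familiarity: float) -> float:
    """Total maintenance effort: productive effort plus K * exp(c - d)."""
    return productive_effort + k * math.exp(complexity - familiarity)


# Hypothetical values: same system, maintained by familiar and unfamiliar staff.
familiar = maintenance_effort(100.0, 10.0, complexity=2.0, familiarity=2.0)
unfamiliar = maintenance_effort(100.0, 10.0, complexity=2.0, familiarity=0.5)
print(familiar)               # 110.0, since exp(0) = 1
print(unfamiliar > familiar)  # True
```

When familiarity offsets complexity the overhead term stays small; when poorly designed, poorly documented software is handed to unfamiliar staff, the overhead grows exponentially.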
A Second Example - COCOMO
- $E_{\mathrm{maint}} = \mathrm{ACT} \times 2.4 \times \mathrm{KLOC}^{1.05}$
- ACT = annual change traffic
= CI / (number of source code instructions in the system undergoing maintenance)
- CI is the number of source code instructions that are modified or added
during 1 year of maintenance
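A short sketch of the COCOMO maintenance estimate follows. It takes ACT as the fraction of the system's source instructions that are modified or added in a year, which is the usual COCOMO reading and is stated here as an assumption. The system size and change figures are hypothetical.

```python
def annual_change_traffic(instructions_changed: float, total_instructions: float) -> float:
    """Fraction of the system's source instructions modified or added in one year."""
    return instructions_changed / total_instructions


def annual_maintenance_effort(act: float, kloc: float) -> float:
    """Annual maintenance effort: ACT x 2.4 x KLOC^1.05."""
    return act * 2.4 * kloc ** 1.05


# Hypothetical system: 50 KLOC, of which 7,500 instructions change per year.
act = annual_change_traffic(7_500, 50_000)
effort = annual_maintenance_effort(act, 50)
print(act)               # 0.15
print(round(effort, 1))  # effort in the same units as the 2.4 x KLOC^1.05 development estimate
```

The estimate scales the development effort for a system of that size by the share of the code that changes each year.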
General Problems
- Difficult or impossible to trace the evolution of the software through
many versions or releases
- Difficult or impossible to trace the process through which software was
created
- Hard to understand someone else's program
- The previous developers are no longer available
- Documentation does not exist or is not useful
- Most software is not designed for change
Maintainability Metrics
Gilb 1979
- Problem recognition time
- Administrative delay time
- Maintenance tools collection time
- Problem analysis time
- Change specification time
- Active correction time
- Local testing time
- Global testing time
- Maintenance review time
- Total recovery time