A Method for Quantitative Risk Analysis
By James W. Meritt, CISSP
Source: http://csrc.nist.gov/nissc/1999/proceeding/papers/p28.pdf
I. Introduction
There are two primary methods of risk analysis and one hybrid method:
- Qualitative - Improves awareness of Information Systems security problems and
the posture of the system being analyzed.
- Quantitative - Identifies where security controls should be implemented
and the cost envelope within which they should be implemented.
- Hybrid method - A selected combination of these two methods can be used to
implement the components utilizing available information while minimizing the
metrics to be collected and calculated. It is less numerically intensive (and less
expensive) than an in-depth exhaustive analysis.
The first, qualitative analysis, is simpler and widely used. Qualitative analysis helps in
the identification of the assets and resources at risk, vulnerabilities that might allow the
threats to be realized, safeguards already in place and those which may be implemented
to achieve an acceptable level of risk and increase overall awareness. This analysis uses
simple calculations and procedures in which it is not necessary to determine the
dollar value of all assets, the threat frequencies or the implementation costs of the
controls. Quantitative analysis does all of this and also identifies the specific envelope in
which the losses and safeguards exist. It is based substantially on independently
objective processes and metrics, and requires correspondingly more effort both in
determining the cost values and in performing the
calculations. It does, however, present its results in a management-friendly form of
monetary values, percentages, and probabilities. Since Office of Management and
Budget Circular A-130 no longer requires a full-blown risk analysis, the hybrid model
using a facilitated risk analysis process is gaining in popularity due to the reduced cost
and effort required, in spite of not providing the metrics desired for management
decisions.
II. Methodology
1. Scope Statement
The scope statement is your first step. This single statement is what you will give to the
Information Technology Committee meeting recorder and incorporate into the submitted
proposal. The scope statement must be committed to by all concerned.
The scope statement must:
a. Specify exactly what is to be evaluated
b. State what kind of risk analysis will be performed
c. Provide the expected results
For example: "A quantitative risk analysis will be performed on the Glimby information
system to determine what controls, if any, are needed to reduce the risks to the system to
an acceptable level using benefit-cost analysis methodologies for determining applicable
controls."
2. Asset pricing
The information system specified in the scope statement next will be broken down into its
components which will then be individually priced. While it is possible to break down
the system into functional units, I find it much easier to disassemble the overall system
into its tangible components which may be more easily priced. I recommend the
following breakdown:
Network/telecommunications: Modems - This category consists of the various modems
both internal and external. Any system used to connect information systems to
communication lines is contained within this category
Network/telecommunications: Routers - This category contains those items of
information technology which are identified as routers, gateways, hubs or serve a similar
purpose.
Network/telecommunications: Cabling - This category includes special purpose cabling
identified for the information technology but does not include that which is installed as
part of the operating area (e.g. built in).
Network/telecommunications: Other - This category includes those items of
information technology that are used for networking and/or telecommunications but do
not fit within other designated categories. It includes, but is not limited to,
special-purpose communication cards and adapters.
Software: Operating System - This is the programming which enables the information
technology to operate. It is provided by the vendor along with the hardware on which it
operates. Examples are MVS, DOS, and UNIX.
Software: Applications - This category contains those items of software which are
directly necessary for the business operations of the organization. It is usually developed
in-house or under contract and does not contain those items of software directly
necessary for the operations of systems within it.
Software: Other - This includes any programming which is not either identified as a
component of a system Operating System or as one of the primary applications. Typical
examples are provided by third-party vendors.
Equipment: Monitors - This category covers items which are used to display
information from the various units of information technology. It contains, but is not
limited to, stand-alone computer monitors and terminals.
Equipment: Computers - This category includes all information processing equipment
maintained by the organization. It contains, but is not limited to, PCs, front-end
processors, fileservers, mainframe computers and workstations.
Equipment: Printers - This category contains items of information technology used to
impress information upon paper. It includes things such as a variety of printers (varying
from dot matrix through laser printers) and plotters.
Equipment: Other - This category contains items of equipment not covered by other
designated categories. It contains, but obviously is not limited to, such things as memory
cards, disk drives, tape units and power supplies.
Data/Information: System - This category includes that information which is maintained
for the operation of the information system. It includes, but is not limited to, such things
as schedule information, error logs, usage logs, and similar logging data.
Data/Information: Business - This category includes that information maintained for the
business purposes of the overall organization. The system business databases, for
example, would be included in this category.
Data/Information: Other - This category includes all information sources not readily
identifiable as belonging in one of the other two.
Other: Facilities - This may be the entire building itself and its supplied services or
simply the table the system is on. It depends, of course, on the system being analyzed.
Other: Supplies - This includes supplies for the information system. Included are such
things as spare parts, backup components, repair kits, and paper. It does NOT include
supplies for non-IS functions associated with the business.
Other: Documentation - This is the documentation associated with the operation of the
information technology. It does NOT include that documentation which may be present
for non-IS purposes.
Other: Personnel - These are the people who work with the information system in all
capacities. It does not include manning at the organization for non-IS duties. As a
first-order estimate the sum of salaries of all operating personnel may be used, as long as you
remember that there are non-tangible assets such as experience and loyalty which are not
necessarily appropriately priced.
It is a basic axiom that you should not spend more protecting an asset than that asset is
worth. So by completing this exacting process of enumerating and pricing the
components of the information system under consideration you have established a
number of upper thresholds for system safeguards.
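As a rough illustration of how the priced inventory turns into spending ceilings, the sketch below totals a handful of assets by category. The item names and dollar figures are invented placeholders, not values from any real system; only the category names come from the breakdown above.

```python
# Hypothetical priced inventory: (category, item, replacement cost in dollars).
# Every figure here is a made-up placeholder for illustration only.
inventory = [
    ("Equipment: Computers", "file server", 12000),
    ("Equipment: Computers", "workstations", 30000),
    ("Equipment: Printers", "laser printer", 1500),
    ("Software: Applications", "billing application", 40000),
    ("Data/Information: Business", "customer database", 75000),
]


def ceilings_by_category(items):
    """Sum asset prices per category; each sum caps safeguard spending there."""
    totals = {}
    for category, _item, price in items:
        totals[category] = totals.get(category, 0) + price
    return totals


ceilings = ceilings_by_category(inventory)
system_ceiling = sum(ceilings.values())

for category, total in ceilings.items():
    print(f"{category}: spend no more than ${total:,} on safeguards")
print(f"Whole system: spend no more than ${system_ceiling:,}")
```

These ceilings are upper bounds only; the later cost/benefit step is what decides whether a given safeguard is worth buying at all.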
This leaves the intangible assets involved, such as client confidence and experience.
These things, while important, are not so easily priced and will not be included in the
quantitative analysis. It must be remembered, however, that they are present: they will
be considered in deciding which risks have been reduced to acceptable limits and which
controls to select, and they rate a special mention in the final report.
III. Risks
A risk to the information system is something that can, in some way, cause harm or
reduce the operational utility of the system. Threats are those things which may occur
independent of the system under consideration and which may pose the risk.
The threats considered are:
- Power Loss - The loss of the electrical power supply to the information systems.
- Communication Loss - The inability to transfer information to and from the organization through the defined system perimeter.
- Data Integrity Loss - A realized, or perceived possible, alteration of the data and/or information maintained by or consisting of the specified asset.
- Accidental Errors - Improper use of information technology not due to malicious intent but solely through mistaken, incorrect use.
- Computer Virus - A program which spreads by attaching itself to "healthy" programs. After infection, the program may perform a variety of non-desirable functions.
- Abuse of Access Privileges by Employees - Employees are authorized by the Security Policy of the organization, and further narrowed by their job responsibilities, to perform a small selection of functions with the information system. This category covers those acts which may be performed but which are not authorized.
- Natural Disasters - Those occurrences which degrade some aspect of the information system, other than fire and earthquake, and which are not manmade. Examples would be flooding or a tornado.
- Attempted Unauthorized System Access by Outsider - Non-employees, or personnel not contracted to perform work with or on the information system, who are not appropriately authorized and are attempting, but failing, to gain access to the information system.
- Theft or Destruction of Computing Resource - A primary resource of the organization is the computing capability of its information systems. This threat addresses the unauthorized use of this resource and its destruction, through physical or other means.
- Destruction of Data - Information held by an organization is not only that used by its business applications, but includes that used by the systems to operate, manuals, personal experience and other forms. This threat may destroy that information, or simply prevent the organization from using it.
- Abuse of Access Privileges by Other Authorized User - While an employee is authorized, and indeed may be required, to perform many actions using the information system, he or she is limited in what may be done by organizational policy, job restrictions and technological controls. But an authorized user, whether an employee or contractor, may attempt to perform operations which are denied them.
- Successful Unauthorized System Access by Outsider - This covers non-employees and non-contractors using, and possibly destroying, information system resources. "Hackers" fit within this threat description.
- Non-disaster downtime - This covers those times when the information system is unavailable for use for reasons other than disaster. Examples of this would be maintenance, component failure and the system 'crashing'.
- Fire - This ranges from major fires that destroy resources to those which prevent assets from being used for any reason.
- Earthquake - This includes both directly destructive earthquakes and the influences of lesser and distant quakes.
A magazine survey in Information Week (October 1996) asked "What Security Problems
have resulted in financial losses?" Another magazine survey, in InfoSecurity News (May
1997), asked "In the past 12 months, which of the following breaches have you
experienced?" The following statistics are the results of combining the information from
these surveys. Each value is the number of times that particular threat may be expected
to happen in one year.
Power loss: 2.00
Communication Loss: 2.00
Data Integrity Loss: 2.00
Accidental Errors: .72
Computer Virus: .68
Abuse of Access Privileges by Employees: .4
Natural disasters: .29
Attempted Unauthorized System Access by Outsider: .27
Theft or Destruction of Computing Resource: .24
Destruction of Data: .17
Abuse of Access Privileges by Other Authorized User: .09
Successful Unauthorized System Access by Outsider: .08
Non-disaster downtime: .06
Fire: .01
Earthquake: .01
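One way to get a feel for these figures is to invert them. The sketch below assumes each value can be read as a long-run average rate, so its reciprocal is the average number of years between occurrences; that reading is an interpretation added here, and the function name is illustrative.

```python
# Annual threat frequencies as listed above (expected occurrences per year).
annual_frequency = {
    "Power Loss": 2.00,
    "Communication Loss": 2.00,
    "Data Integrity Loss": 2.00,
    "Accidental Errors": 0.72,
    "Computer Virus": 0.68,
    "Abuse of Access Privileges by Employees": 0.4,
    "Natural Disasters": 0.29,
    "Attempted Unauthorized System Access by Outsider": 0.27,
    "Theft or Destruction of Computing Resource": 0.24,
    "Destruction of Data": 0.17,
    "Abuse of Access Privileges by Other Authorized User": 0.09,
    "Successful Unauthorized System Access by Outsider": 0.08,
    "Non-disaster downtime": 0.06,
    "Fire": 0.01,
    "Earthquake": 0.01,
}


def mean_years_between(frequency_per_year):
    """Average interval between occurrences, assuming a steady long-run rate."""
    if frequency_per_year <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_per_year


for threat, frequency in annual_frequency.items():
    interval = mean_years_between(frequency)
    print(f"{threat}: about once every {interval:.1f} years")
```

Read this way, a frequency of 2.00 means roughly one occurrence every six months, while .01 means roughly one per century.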
This leaves some risks for which statistical information is not available but which are
nonetheless important. Examples (not to be taken as exhaustive) are:
Access via "back doors" created by application developers
Improper editing routines for data entry functions
Improper editing routines for external feeds
Timeliness of external feeds
Program bugs
Lack of Change Control Process
Lack of Version Control Process
Corrupted database
Can't accept Year 2000 or greater
Unattended workstations
Hardcopy management
User awareness
Servers unavailable
Wide Area Network unavailable
Lack of proper backups
IV. Exposure/Impact coefficient
Not all assets are equally vulnerable to every risk. The degree to which an identified
asset is vulnerable to a specified risk may vary from totally invulnerable (example:
cabling systems are reasonably invulnerable to computer virus infections) to absolutely
destroyed (example: printers, once stolen, are a total loss).
While the likelihood of a particular threat occurring doesn't change regardless of the
controls implemented for the system, the degree to which the system is vulnerable to the
threat may be reduced. The vulnerability coefficient is therefore the value which may be
modified to compensate for already-implemented controls.
Using the threats identified in (3) above against the asset components recommended in
(2), the following coefficients are recommended as initial values. Please note that they
must be individually investigated and approved or modified to suit local conditions.
Expected losses are based upon the anticipated effect of threats on assets and processes.
The amount of the loss depends on the vulnerability of an asset or process to a given
threat.
The vulnerability factor (0.0 to 1.0) of an asset with respect to a threat is the ratio
of:
(1) the expected loss from a single impact of the threat on the asset to
(2) The loss potential of the asset.
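The ratio just defined can be written as a small helper. This is a minimal sketch; the function name and the printer figures are hypothetical, chosen only to show the arithmetic.

```python
def vulnerability_factor(expected_single_loss, loss_potential):
    """Ratio of the expected loss from one impact to the asset's loss potential."""
    if loss_potential <= 0:
        raise ValueError("loss potential must be positive")
    factor = expected_single_loss / loss_potential
    if not 0.0 <= factor <= 1.0:
        raise ValueError("vulnerability factor must fall between 0.0 and 1.0")
    return factor


# Hypothetical example: a $2,000 printer expected to suffer $600 of damage
# from a single occurrence of some threat.
printer_factor = vulnerability_factor(600, 2000)
print(f"Vulnerability factor: {printer_factor:.2f}")
```

The range check is there because a factor outside 0.0 to 1.0 usually signals that the loss potential was underestimated or that the two figures are in different units.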
Different assets respond differently to different realized threats. The effectiveness of
particular threats against varying assets ranges from "no impact" to "total
replacement necessary" depending upon the sensitivity of the asset and threat under
consideration. What follows is a first-order evaluation of this sensitivity. The threats
considered are REALIZED - they are assumed to have occurred. The likelihood of a
particular threat occurring is considered separately and will not be contained within this
evaluation. The SLE (single loss expectancy) is given as a value between 0 and 1 and
represents the damage to the asset. Sample values are:
Value  Description
0      The asset is resistant to the threat and no damage results from the
       realization of the threat.
.3     When the threat occurs, there is usually no damage resulting but it is
       possible that catastrophic damage will result, requiring total replacement.
.5     When the threat occurs it is equally likely that no damage will result or
       that total replacement will be necessary. All outcomes between these
       extremes are equally likely.
.7     After a threat is successfully executed, the affected system will usually
       require replacement. On occasion - if you are lucky - it will escape total
       damage, possibly escaping damage entirely.
1      When the threat is realized, total replacement of the identified asset is
       the only outcome possible.
See Appendix A for sample values. These values were arrived at in consultation with
numerous subject-matter experts.
V. Group Evaluation
A group evaluation meeting of the stakeholders (those with a vested interest in the
system) should be held next. This meeting includes individuals who have
knowledge of the various components in, threats to and vulnerabilities of the system, as
well as those with management/operations responsibilities, to help in the determination
of overall acceptability. This structured meeting is much like that advanced by the hybrid
facilitated risk analysis methodology developed initially by the Computer Security
Institute.
The meeting provides a common focus on the methodology among the various
stakeholders and business units. It is up to you to maintain an open and balanced flow
through the meeting by providing clearly defined roles and responsibilities for the various
members and to identify or verify:
a. Assets
b. Risks
c. Exposure/Impact
d. Acceptability determination
The overall meeting time should be approximately 4 hours. The facilitator (you) should
already have the numbers ready and waiting with just a few blanks to be filled in. If
possible, disseminate the information a day or more earlier when you give out the agenda
and attendee list so that the participants may research any value they question or need to
provide as you rapidly go through the form.
Be careful not to get out of scope by addressing areas of concern or disagreements that are
not relevant to the scope statement (see section II, item 1). For anything that may come up
that is plainly important but out of scope, write it down for later consideration and move on.
Make special note of any area that is determined to already be at or near acceptable risk
levels, as determined by the appropriate members. By removing acceptable items from the
list and concentrating on those near acceptability you may narrow the focus of your
post-meeting calculations.
VI. Calculation
The values were entered into a simple spreadsheet which contained the assets on one
axis, the threats on the other and the vulnerability coefficient at their intersection. The
second sheet contained a table with the same threat and asset axes, but each intersection
point was the product of the value of the particular asset, the annual frequency of the
threat, and the vulnerability of that specific asset to that specific threat.
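The two sheets can be sketched with nested dictionaries. The asset prices and vulnerability coefficients below are illustrative placeholders (they are not the Appendix A values); the three threat frequencies are taken from the survey list in section III, and the function name is hypothetical.

```python
# Hypothetical asset prices in dollars (placeholders for illustration).
asset_value = {"Computers": 50000, "Printers": 2000, "Cabling": 3000}

# Annual threat frequencies from the survey figures in section III.
threat_frequency = {"Power Loss": 2.00, "Computer Virus": 0.68, "Fire": 0.01}

# First sheet: vulnerability coefficient at each asset/threat intersection.
# These coefficients are illustrative guesses and must be set locally.
vulnerability = {
    "Computers": {"Power Loss": 0.1, "Computer Virus": 0.3, "Fire": 1.0},
    "Printers": {"Power Loss": 0.0, "Computer Virus": 0.0, "Fire": 1.0},
    "Cabling": {"Power Loss": 0.0, "Computer Virus": 0.0, "Fire": 0.7},
}


def expected_loss_sheet(values, frequencies, coefficients):
    """Second sheet: asset value x annual frequency x vulnerability per cell."""
    return {
        asset: {
            threat: values[asset] * frequencies[threat] * coefficients[asset][threat]
            for threat in frequencies
        }
        for asset in values
    }


sheet = expected_loss_sheet(asset_value, threat_frequency, vulnerability)

for asset, row in sheet.items():
    for threat, loss in row.items():
        print(f"{asset:10s} {threat:15s} ${loss:,.2f} per year")
```

Each cell is an expected annual loss in dollars for one asset/threat pair, which is what the analysis step sums along rows and columns.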
VII. Analysis
a. Across asset - A sum of the product of terms was made across each asset for all
threats. This summation gives the expected annual loss to that asset for all threats.
Looking across these terms will show you which assets need the most protection (the
most can be lost there).
b. Across risk - A sum of the product of terms was made across each threat for all assets.
This summation gives the expected annual loss due to that threat to the entire system.
Looking across these terms will show you which threats most need to be guarded
against (the most can be lost to them).
The sum of all the values in (a) should be the same as the sum of all values in (b). This
value is the Annual Loss Expectancy (ALE) for the entire system to all threats.
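The two summations and the consistency check between them might look like the following. The expected-loss table is a hypothetical example of a finished second sheet, and the function names are illustrative.

```python
# Hypothetical second-sheet values: expected annual loss in dollars for each
# asset/threat pair (placeholder figures, not results from a real analysis).
expected_loss = {
    "Computers": {"Power Loss": 10000.0, "Computer Virus": 10200.0, "Fire": 500.0},
    "Printers": {"Power Loss": 0.0, "Computer Virus": 0.0, "Fire": 20.0},
    "Cabling": {"Power Loss": 0.0, "Computer Virus": 0.0, "Fire": 21.0},
}


def loss_by_asset(sheet):
    """(a) Sum across all threats for each asset."""
    return {asset: sum(row.values()) for asset, row in sheet.items()}


def loss_by_threat(sheet):
    """(b) Sum across all assets for each threat."""
    totals = {}
    for row in sheet.values():
        for threat, loss in row.items():
            totals[threat] = totals.get(threat, 0.0) + loss
    return totals


per_asset = loss_by_asset(expected_loss)
per_threat = loss_by_threat(expected_loss)

# Both totals must agree; the shared value is the ALE for the whole system.
ale = sum(per_asset.values())
if abs(ale - sum(per_threat.values())) > 1e-6:
    raise ValueError("asset totals and threat totals disagree; check the sheet")

worst_asset = max(per_asset, key=per_asset.get)
worst_threat = max(per_threat, key=per_threat.get)
print(f"ALE: ${ale:,.2f}")
print(f"Asset needing most protection: {worst_asset}")
print(f"Threat to guard against most: {worst_threat}")
```

A mismatch between the two totals is a quick signal of a data-entry error in the sheet, such as a missing cell.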
VIII. Controls
Controls are those things which are implemented to prevent the exposure to the threat in
the first place, detect if the threat has been realized against the system, mitigate the
impact of the threat against the system or to recover/restore the system. Examples of
possible controls are:
Develop, document, and test backup procedures
Develop, document, and test continuity of operations procedures
Implement an access control mechanism
Implement user authentication mechanism
Implement encryption mechanism
Implement a configuration management process for software
Implement a version control process for documentation
Prepare, distribute, and maintain plans, instructions, guidance, and standard operating
procedures concerning the security of the system operation
Develop user documentation on proper use of the system
Conduct training on the proper and secure use of the system
Implement mechanisms to monitor, report and audit activities identified as requiring
independent review
Implement operations controls to ensure proper separation of data
Negotiate maintenance or supplier agreements to facilitate the continued secure
operational status of the system
Ensure consultation with Facilities Management to implement physical security
controls to protect the system
Ensure appropriate personnel security investigations are performed
Implement QA performance monitoring/testing procedures
Ensure LAN Administration installs the corporate standard anti-viral software
throughout the system
Train backup personnel
Request management support to ensure the cooperation and coordination of various
business units
Production Management controls such as search and remove processes to ensure data
stores are clean
Time requirements will be tracked for technical maintenance
Document possible security exposures such as program access "backdoors"
Backup sites (hot or cold sites) locations and requirements determined and
appropriately implemented
Test for Y2K compliance
Please note that this list consists of EXAMPLES and is not exhaustive.
The controls, once the candidates have been identified or discarded by the group meeting,
must be analyzed. Two recommended methods are:
a. Cost/Benefit ratio - Where the reduction in the ALE when applying the
particular control across all assets and all threats is compared to the local cost for
implementing the control. Another basic axiom is that you should not invest more
in the control (cost) than it would be worth (benefit). Hence, no control which has
a cost/benefit ratio of 1 or above should even be a candidate for implementation.
This affords yet another upper threshold that you could not get without
performing a quantitative analysis.
b. Risks/Control (e.g. "bang per buck") - Where the risks reduced by the control
are enumerated (either by count or by summed ALE reduction) and compared to the
control (counted either as unity or at its implementation cost). This is a method utilized
in qualitative analysis that may be used or enhanced by quantitative methods.
These ratios are then compared to determine which - if any - of the controls may best be
implemented.
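A minimal sketch of both comparisons follows. The control names, costs and ALE figures are hypothetical, and it assumes each cost is expressed on the same annual basis as the ALE; the class and attribute names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Control:
    name: str
    cost: float        # local implementation cost (assumed annualized)
    ale_before: float  # system ALE without the control
    ale_after: float   # system ALE with the control applied

    def __post_init__(self):
        if self.cost <= 0:
            raise ValueError("cost must be positive")

    @property
    def benefit(self) -> float:
        """Reduction in ALE across all assets and threats."""
        return self.ale_before - self.ale_after

    @property
    def cost_benefit_ratio(self) -> float:
        """Method (a): cost divided by benefit; 1 or above is rejected."""
        if self.benefit <= 0:
            return float("inf")
        return self.cost / self.benefit

    @property
    def reduction_per_dollar(self) -> float:
        """Method (b), cost-weighted form: ALE reduction per dollar spent."""
        return self.benefit / self.cost


SYSTEM_ALE = 20741.0  # hypothetical system-wide ALE before any new control

candidates = [
    Control("Anti-virus software", 2000.0, SYSTEM_ALE, 12000.0),
    Control("Uninterruptible power supply", 5000.0, SYSTEM_ALE, 11741.0),
    Control("Hot backup site", 60000.0, SYSTEM_ALE, 20341.0),
]

# Keep only controls that cost less than they save, best ratio first.
ranked = sorted(
    (c for c in candidates if c.cost_benefit_ratio < 1),
    key=lambda c: c.cost_benefit_ratio,
)

for control in ranked:
    print(
        f"{control.name}: ratio {control.cost_benefit_ratio:.2f}, "
        f"${control.reduction_per_dollar:.2f} saved per dollar"
    )
```

In this made-up data the hot backup site is dropped because it costs far more than the loss it prevents, which is the cutoff the cost/benefit axiom describes.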
IX. Conclusion
It is possible to answer management's questions, which inevitably turn to costs and other
metrics. While there are a number of intangible assets and unquantifiable risks, it is
possible to perform some basic mathematical calculations yielding results which may
give firm guidance towards improving the security of any system.