LeRoy A. Stone, Ph.D., (Forensic Dip.) ABPP
Note – Since the PSSPQ test has recently attracted a good
deal of interest, to the degree that it can now be considered a commercially
viable product, it seems timely to introduce the PHI test to
those who "click on to" Dr. Stone’s Web Site (http://www.home.earthlink.net/~lastone2/home.html)
and who have read or otherwise become familiar with his Web Site sub-pages
that focus upon the PSSPQ test (i.e., http://www.home.earthlink.net/~lastone2/psspq.html).
The major purpose of the following presentation is to introduce the PHI
test and also provide some detailed description of the PSSPQ as it represents
that from which the PHI was
Introduction

The tale is told of how Diogenes (412?-323 BC) devoted almost his entire life, traveling almost constantly, to searching by the light of a lamp for the so-called honest man. This story has apparently been told for a couple of thousand years. If my memory serves me right, Diogenes was never successful in the task that consumed almost his entire life. Is this tale a metaphor for the situation in real life? Is there such a thing as an honest man and, if there is, is there any way to identify him (or her)?

Throughout history, around the world, different means have been described for identifying lying and for overcoming the other impediments to determining whether an individual is honest. Horrible procedures have been used to determine whether persons are guilty or not, whether they are good or evil (such as being a witch or warlock), whether they are truthful or not, and to settle other dichotomies regarding the integrity, honesty, goodness, trustworthiness, etc. of mankind.

Presently, we (in our Western society) also employ a wide variety of techniques to help us identify a person whom we can trust when we evaluate people for employment, for getting bonded, for obtaining security clearances, etc. What quickly comes to mind is polygraphing, psychological testing/interviewing, complicated vetting techniques, various levels of background investigation, recommendations and evaluative impressions of other people, etc. Interestingly enough, all of these techniques/procedures are utilized by the U.S. Government when evaluating its citizens for high-level security clearances. For lower-level clearances, generally only one or two of the listed procedures are employed.

What Type of Psychologists?

When persons are evaluated because
they are being processed for high-level security clearances, any involvement
of psychologists in those evaluations has usually
(in the past) been limited to clinical or
counseling psychologists. These particular types of psychologists
are typically trained to work (i.e., make diagnoses
One problem with most of these "fake-good" scales (also sometimes known as "lie scales") is that they do not usually work very well. Persons of average or lower intellect are the ones who are generally ‘caught,’ whereas high-intellect individuals can normally see through the purpose of many of the items in these so-called lie scales.

Psychologists in the specialty area of industrial
and organizational psychology (who are generally regarded as being trained
quite differently than clinical/counseling psychologists) have, in the past
decade or so, assisted in the development of what are generally known
as integrity tests. These integrity tests have
been quite successfully marketed, especially to human resources and security
offices in the private business/industry labor market. However, in the
past several years most users of these tests have become disenchanted
and disappointed with the results obtained from them.
With some of those who have been administered these integrity tests, the
results obtained are generally believed to be somewhat correct and valid.
However, depending upon the characteristics of the populations from which
those being tested come, many potentially "bad" or unsatisfactory
employees simply are not spotted or identified on the basis of their test performances.
In other words, many people (e.g., job applicants or others being evaluated
for some type of position of trust) were simply found to be able to ‘beat’
these integrity tests. Another problem, and not a small one, was
that use of the integrity tests sometimes could be challenged using the
Americans for
Does this mean that use of psychologists
(of any kind or type) in the evaluation of persons to diagnose honesty,
trustworthiness, possession of high integrity, etc. is something that should
be done away with or minimized? No, not at all – if the matter of
mental health is something that can affect whether a person can be trusted
or not, then the use of clinical/counseling psychologists, with their mental
health/illness detection tests, can prove to be of great value in certain
respects. However, they should be limited to evaluation of mental
health and not of trustworthiness, in its totality. It should be noted
that mental health may be one possible component of trustworthiness, but
this concept is more complex than a mental health evaluation might reveal.
The so-called integrity tests leave a great deal to be desired unless their
users are tasked with evaluating the integrity of persons of quite low
intellect. With more intelligent
persons responding to them, these tests simply do not have sufficient validity
for use in the real world.

Psychological Validity Evaluations

One of the major problems with the
integrity tests is that they have had little validity evidence supporting
them except for face or content validity. If an individual answers
an item that asks about committing past thefts from his employer and he
indicates that he had stolen something of value from a past employer, we
then decide that such is true without actually knowing whether it truly
happened or not. A number of different causes might explain why someone
would admit to thievery when in fact no such
behavior had actually taken place: lack of the required reading skills,
eyesight problems, a simple misunderstanding of how the item was written,
English being the individual’s second language, mental health
difficulties, etc. Matters of these kinds are very frequently
encountered, and any of them might cause someone to respond to a question or
statement in an invalid fashion. This type
However, the greatest source of difficulty
with integrity tests is that, so far, it has proved very difficult
to demonstrate empirical or predictive validity
for tests of this type. One of the major difficulties has been the
definition of what group of people can be utilized to meet the definition
of an honest man, or one who unquestionably represents an adequate standard
of good integrity. Police officers come to mind, but they can very quickly
be ruled out simply because of the wide understanding, in just about all
cultures, of the immorality, dishonesty, and law-breaking that
are reported, almost daily in the popular press, about this
particular occupational group. Unfortunately, the
so-called "bad cop" is a widely held concept. The same can be
said for those in the clergy, especially recently in the sexual offense
domain, which seems to have been an ever-recurring story line that is seen,
heard and read about in our popular media. Just about a
One of the major problems in the creation of an honesty or integrity test is the establishment of definitional criteria for what constitutes an honest individual. As suggested in the previous paragraph, one could merely assume that certain occupational and social groups are mainly made up of honest and trustworthy people. However, this is a rather dangerous and easily challenged position to take. Which defined group of people can be generally believed to be honest and trustworthy, and could therefore be employed as a criterion-defining sample for such a construct?

A New and More Narrow Definition for Trustworthiness

Dr. LeRoy A. Stone may have found the solution
to the problem described in the preceding paragraph. He was employed
as a senior clinical psychologist for a decade and a half in this country’s
largest intelligence agency; in his last eight years of employment in this
agency he was promoted to the position of
In our country there is a group of individuals
who are quite small in number and who have been the focus of the most stringent
initial and then ongoing investigation and monitoring program imaginable,
which is concerned with the human characteristics of honesty and trustworthiness.
Most Americans are very familiar with the terms "secret" and "top secret,"
and that they are descriptive of levels for security clearances, but very
few have ever encountered or heard of "Sensitive Compartmented Information"
(or SCI) in association with the Top Secret concept. Top Secret –
Sensitive Compartmented Information (or TS-SCI) is just about the highest
security clearance access level granted by the USA Government.
It is difficult to obtain the current number of USA citizens who hold TS-SCI
access but back in the late-1980s, the Washington Post (i.e., June 8, 1986
issue, A Section, page 18) wrote that, as of March 1985 there were only
98,715 persons within the
The process followed in initially obtaining
TS-SCI access for any given individual is rather long and expensive.
Actually, only a small fraction of those who are initially
considered for employment that requires the granting of
this particular clearance are ever successful in obtaining the employment.
It is not at all unusual for the entire investigation process to take two
years or so. There are some cases in which almost three years were
required before a final decision to grant or not grant TS-SCI access to
the involved individual(s) could be made by the Government. The process
involves extensive psychological testing including psychological interview,
polygraph
Individuals who have been granted TS-SCI access are reinvestigated every five years. Reinvestigations involve polygraph examinations, security interviews, and complete, updated background investigations that require investigating agents to have face-to-face interviews with neighbors, work associates, and other possibly significant persons in the employee’s life. If an individual having TS-SCI access has lived in various locations during the previous five-year period, interviews with pertinent persons are carried out with respect to all locations. This of course is also true of the initial (i.e., pre-employment) interviews carried out prior to obtaining TS-SCI access. If an individual has lived in a number of states (or even in foreign countries), persons associated with each of those locations are interviewed regarding the subject individual. As part of the reinvestigation process an updated agency records check is also carried out; much of it involves matters of financial credit status and law enforcement records.

If an individual obtains TS-SCI access and
successfully maintains this clearance status, it is safe to conclude that
the subject individual is, for all practical purposes, an honest and trustworthy
person. There really is no other clearly defined group of USA citizens
in our society who have undergone such extensive and expensive scrutiny
(which is constantly ongoing) regarding
For the very first time in the construction of a psychological test focused
on trustworthiness, a criterion sample representing
the possession of high levels of honesty and trustworthiness has been employed.
The possession of high levels of honesty and trustworthiness has been operationally
defined by the fact that the employed sample(s) possessed TS-SCI access
status (as granted by the Government of the USA). In other words,
the Diogenesian tool used to find "an honest man" was whether he or she
possessed TS-SCI access status. Of course, we are aware that even
the possession of TS-SCI access status does not totally or absolutely guarantee
complete honesty or trustworthiness. In fact, during the past couple
of decades, several dozen USA citizens have been caught as traitorous spies,
and a couple of them did possess (or had in the past possessed) TS-SCI access. Still,
such TS-SCI access status probably represents the very best that an open
society can do to designate the possession of a high level of honesty and
trustworthiness. No other honesty/trustworthiness measuring psychological
test, currently on the market, has had honesty and
Establishment of Trustworthiness Norms for the PHI

The male (N = 114) and female (N = 92) samples were obtained over a period of a couple of years. All were civilian employees who held TS-SCI access and all volunteered to respond to the PHI in an anonymous fashion. Anonymity was promised for a couple of reasons. One reason was that it facilitated volunteering for this project. The major reason, though, was that anonymity promoted more honest and candid responses to the PHI items. Upon initial contact, these employees were requested to complete the answer sheet on their own time, not while at work, and to then return the answer sheet and the list of 50 items.

The mean age for the male sample was 37.07 (SD = 9.35); for the female sample the mean age was a little younger,
29.65 (SD = 6.78). The age range for males was from 20 to 55 years;
for the females the range was more restricted, from 21 to 43 years.
For males the mean number of years of formal education was 15.91 years
(SD = 2.72); for females their mean number of years of formal
As noted in the above paragraph,
the PHI was responded to by each subject in non-group situations;
it is believed that this ‘private’ responding to PHI items also helped
to guarantee anonymity. In general, almost none of the subject employees
asked any clarifying questions pertaining to any of the PHI items.
The great majority of the subjects, when asked, indicated that it
took no more than about 10 minutes for them to
The PHI item responses from the 206 subjects (males = 114, females = 92) were scored so that scale scores for the seven PHI sub-scales were determined. For each of these PHI sub-scales a t-test was computed to examine the difference between the male and female mean values. The computed t values, along with the associated degrees of freedom and probability values, are shown in Table 1. [Note – the male and female mean values, along with the standard deviations, for each of these seven PHI sub-scale comparisons are purposely omitted here as they are regarded as being proprietary information having commercial value.]
_______________________________________________________________________
Table 1. Degrees of Freedom and Associated t Values for Differences
Scales    t    df    p
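The gender comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the source does not say which form of t-test was used, so a pooled-variance (Student's) independent-samples test is assumed here, and the scores are made-up numbers, not PHI data.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Student's independent-samples t statistic and its degrees of freedom.

    Assumes equal population variances (an assumption, not stated in the source).
    """
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance weights each group's sample variance by its own df.
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical sub-scale scores for two small groups (not PHI data).
males = [3, 5, 4, 6, 2]
females = [2, 3, 1, 4, 2]
t, df = pooled_t(males, females)
```

Under this pooled form, samples of 114 males and 92 females would give 114 + 92 - 2 = 204 degrees of freedom for each comparison.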
These gender difference results suggest that it would be unwise to combine the male and female PHI data for at least three of the PHI scorings; namely, those for Scales a. and e., as well as for the Total (summation of Scales a. – f.) Score. However, there is nothing here that would suggest any problem with combining the male and female normative data for the following PHI sub-scales: b., c., d., f., and g. As a result, an individual’s eight PHI scorings (for Scales a. – g. plus the Total Score) are transformable and reportable as eight different T-scores (having a mean of 50 and an SD of 10). Males and females will have their own T-scores for Scales a. and e., as well as for the Total Score, computed using gender-based norms. Actually, the Total Score for the PHI will prove to be of very little value, as its equivalent Total Score for the PSSPQ has been shown to possess little diagnostic or predictive value. T-scores for the remaining five PHI scales are computed using combined male/female normative data. A raw score conversion table (i.e., from PHI sub-scale scores to T-scores) has been developed; because it is considered proprietary information, it is not presented here.

As noted in an earlier paragraph, highly
favorable reliabilities have been repeatedly found and reported for the
PSSPQ, the antecedent psychological test from which the PHI was developed.
Test-retest reliability for the PSSPQ has been estimated at about 0.94,
and there is very good reason to believe that this may actually have
been a low estimate, because the sample employed was most
likely not overly interested in the task at hand, which
could only have lowered any estimation of
Similar reliability estimation results have now been obtained for the PHI itself. Using PHI data obtained from the already described normative sample of 206 civilian government employees, all holding TS-SCI access, several estimates of reliability for this instrument were computed. Coefficient alpha, a generalization of the Kuder-Richardson formula 20 for estimating an internal consistency form of reliability, was computed for each of the sub-scales of the PHI. Coefficients alpha for these sub-scales are as follows: a. Undesirable Character Traits, 0.65; b. Financial Irresponsibility, 0.38; c. Alcohol Abuse, 0.82; d. Illegal Drugs & Drug Abuse, 0.41; e. Record of Law Violations, 0.44; f. Security/Confidentiality Violations, 0.56; and g. LIE, 0.78. Product-moment correlation coefficients between the sub-scale scores and the Total Score of the PHI were also calculated as another form of internal consistency; these correlation coefficients are as follows: a. 0.81, b. 0.62, c. 0.91, d. 0.41, e. 0.67, f. 0.75, and g. 0.88. Those scales that had the greater number of items did correlate the highest with the Total Score measure, as should be expected. For each PHI sub-scale, the items were divided into two groups of equal size in all possible combinations. For each split, a product-moment correlation coefficient was computed; using Fisher r to z transformations, these correlations were averaged, and the average was regarded as a type of split-half reliability estimate for the sub-scale. These split-half estimates were corrected for length using the Spearman-Brown formula. The corrected split-half reliabilities for the PHI sub-scales are as follows: a. 0.53, b. 0.51, c. 0.80, d. 0.50, e. 0.44, f. 0.70, and g. 0.78.
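The three computations named above (coefficient alpha, Fisher r to z averaging, and the Spearman-Brown length correction) can be sketched as follows. This is a minimal illustration using the textbook formulas; the function names and the sample data are hypothetical, and nothing here reproduces the PHI items or the author's actual procedure.

```python
from math import atanh, tanh
from statistics import mean, pvariance

def cronbach_alpha(items):
    """Coefficient alpha.

    items: one list of scores per item, each with one entry per respondent.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

def fisher_mean_r(rs):
    """Average correlations (each strictly between -1 and 1) via Fisher's z."""
    return tanh(mean([atanh(r) for r in rs]))

def spearman_brown(r_half):
    """Step a half-test correlation up to the full test length."""
    return 2 * r_half / (1 + r_half)

# Hypothetical two-item scale answered by four respondents.
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 4, 6, 8]])
# Hypothetical split-half correlations from different item splits.
corrected = spearman_brown(fisher_mean_r([0.40, 0.50, 0.60]))
```

Averaging in the z metric rather than averaging the raw correlations avoids the bias that comes from the bounded, skewed sampling distribution of r.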
Of these split-half reliability estimates, three were statistically significant at the .001 level, one at the .02 level, two at the .05 level, and one at the .10 level (this lowest level of statistical significance was associated with Scale e, which has only three items, hardly enough for this kind of reliability estimation approach).

Another form of internal consistency was calculated; this was simply the average item/sub-scale score correlation for each PHI sub-scale. The item/sub-scale score correlation coefficients were averaged using the Fisher r to z transformation. These average item/sub-scale score correlation coefficients, for each PHI sub-scale, are as follows: a. 0.51, b. 0.53, c. 0.77, d. 0.51, e. 0.48, f. 0.80, and g. 0.55.

Some of the above reliability estimations
for the PHI sub-scales appear
Another one of the major problems may have
to do with the possibility that some of the PHI sub-scales, even though
they may only contain several items, may be measuring markedly multidimensional
qualities. This has been shown to be the case with the PSSPQ.
Many of the PSSPQ sub-scales have been shown, through use of factor analysis,
to measure multidimensional characteristics. An additional psychometric
matter which may be causing some of the reliability estimations to be lower
than one would like to see is that most of the sub-scales are in fact extremely
brief and contain only a few items. Scale f., for example, contains
only three items. The longest sub-scale of the PHI is
An Attempt to Obtain/Use Another Even More
This perhaps even more trustworthy
group, from the same very ‘sensitive’ federal agency, was composed of both
male (N = 43) and female (N = 20) employees who had volunteered and applied
for a new employment assignment in a very special program. In order
to be considered for entry into the program, already employed persons
had to be polygraphed again and have their backgrounds systematically
studied again. In addition, just about the employee’s
entire family (the only exception being young children) who would
accompany the employee on a ‘permanent change of station’ assignment (which
normally lasted two to three years) was also evaluated. After every assignment was completed,
then the whole entire evaluation/vetting
The relatively small cadre of agency employees in this rather special personal standards/requirements program most certainly were not perfect people; they were not 100% honest, nor did they show 100% perfect integrity. However, it could be argued that they were about as trustworthy as one could be so defined, based upon a broad, continuously ongoing, expensive, and repeated evaluation system that classified them as being trustworthy in a very general sense. When the agency was ‘recruiting’ candidates (usually from its already employed ranks) for this special program, it explicitly indicated that only those with exceptionally ‘clean’ and ‘straight’ personal backgrounds were encouraged to apply. Therefore, it could be assumed that the agency only wanted to recruit the ‘best of the best’ (i.e., those with the highest levels of trustworthiness) for entry into this special assignment program.

The PHI was administered to 63 agency employees
who had been through the above described, elaborate, trustworthiness-evaluation
process and who already had been actively working in the program, many
of them for a number of years. As a consequence of many of them having
been officially assigned in the program for a number of years, many had
completed multiple changes of
In this sample of 63 special program employees, most (i.e., 68%) were males and only 32% were females. Because of the reduced number of females (about a two to one ratio, which reflected the makeup of the relatively small special project work force), some attempt was made to statistically test for differences between the male and female PHI sub-scale means. It was not anticipated that the very same arrangement of statistically significant and non-significant gender differences would be found as was originally seen with the first normative group (i.e., N = 206), but it was. However, upon further reflection, this seemed not overly surprising. When the PHI sub-scale means based upon this smaller group (i.e., N = 63) were compared to the sub-scale means computed from the larger group (i.e., N = 206), the result initially surprised the developers of the PHI. All of these sub-scale means (separate male and female means where the gender difference had previously been found statistically significant; a single combined mean where it had not) were compared to the sub-scale means based upon the earlier, larger standardization group.

The findings from this comparison were not at all expected. What was expected were different intensity scorings from the employees who had been assigned to a very special project group and who (along with most of their immediate family) had originally gone through some sort of super type evaluation for trustworthiness, and who went through reevaluations quite similar to the original ‘entry-to-project’ evaluations every time they were reassigned to a new location.
What actually was seen was that the sub-scale means obtained by this special project group of employees were basically the same as those earlier obtained by the larger group of government employees who had been granted TS-SCI access status. However, the more we thought about this, the less logical our original expectation of differing scorings appeared. Possession of TS-SCI security clearance status really did not represent much less a level of trustworthiness than what was expected of those employees in the very sensitive special project. Only two major differences really exist. First, the special project employees additionally had most of their immediate families go through psychological interviews/evaluations, whereas ‘regular’ agency employees do not normally expose their immediate family members to most of the evaluation steps when they are originally processed for TS-SCI access. Second, employees in the special project were normally completely reevaluated about every three years (i.e., when reassigned to a new location), as compared to every five years for all other agency employees holding TS-SCI security clearances.

The PHI sub-scale data obtained from these two groups suggest that even though this Government agency has its special project employees go through more frequent and sometimes more thorough trustworthiness evaluation than it requires of its more regular employees, who also have been granted TS-SCI access, the two groups are just about equally trustworthy. In fact, with respect to trustworthiness, the smaller special project group would appear to be, as far as can be assumed from the PHI data, merely a representative sample from the larger employee grouping.

As a result of this conclusion, the PHI data from the special project group were combined with the earlier obtained PHI data.
These combined data (i.e., now based upon an N = 269) now represent the standardization norms for the PHI. For most PHI sub-scales only a single mean was computed and presented as normative data; as noted earlier, for a couple of the sub-scales, mean scores for both males and females are presented. Again, because of the proprietary value of these sub-scale means and their associated standard deviations, they are not presented here. This decision is based upon a belief that this information possesses commercial value.

With what has been described in the previous pages of this presentation, it can be noted that for the very first time in the construction of an honesty or integrity type test, a criterion sample representing what can be argued to be high levels of honesty and trustworthiness has been employed. Although the term "trustworthiness" has previously been used with some commercially offered integrity tests, Dr. Stone would like to redefine the term when used with tests of this type. If a test is actually empirically validated using groups of people as the criterion measure, then it should be referred to as a ‘trustworthiness’ test. If a test’s validity has been mainly defined in terms of simply ‘face’ or ‘content’ validity, then the test should be regarded as an integrity test. By this definition, only the PHI, and of course the PSSPQ from which it derived, are tests of trustworthiness.

Many of the so-called integrity tests that have been commercially successful were originally justified only by face validity.
Only after they had been in use for some time did correlational research begin to appear that related integrity test scores, obtained by persons prior to their being hired, to matters such as later becoming some type of problem employee, length of employment, emerging alcohol problems, etc. In this fashion, these integrity tests did show some association with whether employees were later regarded as being good or bad; such could be regarded as a type of construct validity. Unfortunately, most of the reported associations between integrity test scores and later incidences of problem employee difficulties are not very high or useful, although they have usually been reported as being statistically significant. Not infrequently in applied psychology, statistically significant relationships are found between variables of interest, yet the relationships are so low that they end up being seen as just about useless in ‘real world’ considerations. It is this type of understanding that can be inferred from the several reported meta-analytic studies of the so-called validity of integrity tests, the greatest number having been reported in the 1990s.

Getting back to the PHI, it was this group (N = 63) of very special agency employees (i.e., those who had been extensively evaluated and who had been assigned to a special, ‘highly sensitive’ program for at least a couple of years) whom Dr. Stone chose to serve as one of the empirical standardization groups for the Probity/Honesty Inventory (PHI). The PHI was intended to be a test that measures (or predicts) degree of trustworthiness. Scores from the very special employee group served as a norm or ‘anchor’ that defined an exceptionally high degree of trustworthiness. How did Dr. Stone know that this group was trustworthy? The answer is very simple. The U.S. Government defined the group as possessing exceptionally high trustworthiness, and this very-hard-to-accomplish regard was based upon past and ever-ongoing evaluative investigations that made (or make) use of just about all that can be done (in the ‘real world’) to accomplish such a goal. In other words, according to U.S. Government national security standards, this particular group of employees could be believed to be characterized by an unusually high degree of trustworthiness.

Therefore, administering any test or questionnaire to a sufficiently sized sample from this particular occupational group would establish, for that test or questionnaire, something that could be regarded as representative of the type of response that might be expected from highly trustworthy people. This is consistent with what "norms" represent for a standardized psychological test. If someone not in that particular occupational group "took" the test or questionnaire in question, then that person’s results could be compared, for evaluative purposes, to the average (or other parameter) scores obtained by the group. If his scores closely resembled the group’s mean score, it could be said that, with respect to the test scores, his looked like those obtained from testing a group of highly trustworthy people. Conversely, if his score was markedly deviant from the group’s mean score (with the direction of the deviation being taken into account), then it could be said that, based on his test performance, he did not appear to resemble, in a general sense, those in the group. All of this would have even more meaning if it were understood that just about all of the items of the test in question, based upon their content alone, could be conceptualized as pertaining to the trustworthiness concept.
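The comparison of an individual's score against a norm group can be illustrated with the T-score transformation mentioned earlier (mean of 50, SD of 10). The sketch below is only an assumption-laden example: the norm mean and SD are invented placeholders, since the actual PHI norms are proprietary and not given here.

```python
def t_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a T-score (mean 50, SD 10) against a norm group."""
    return 50 + 10 * (raw - norm_mean) / norm_sd

# Hypothetical norm values for one sub-scale; the actual PHI norms are proprietary.
NORM_MEAN, NORM_SD = 10.0, 4.0

def deviation_in_sd(raw):
    """Signed distance from the norm-group mean, in norm-group SD units."""
    return (t_score(raw, NORM_MEAN, NORM_SD) - 50) / 10

close_to_group = t_score(10, NORM_MEAN, NORM_SD)  # at the group mean
deviant = deviation_in_sd(6)                      # one SD from the group mean
```

Keeping the sign of the deviation matters because, as the text notes, the direction of a departure from the group mean has to be taken into account when interpreting it.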
In this type of arrangement, the test in question, in this situation the PHI, could be said to possess at least two quite different types of validity. It, like the integrity tests, possesses face or content validity; but much better, it can also be regarded as possessing empirical validity. In this latter validity arrangement, it can be used to determine whether a test-taker’s score, based on an empirical comparison, resembles those of the type or kind of persons who made up the validity criterion group. In the case of the PHI, this criterion group was the government employees in the very special group described in some of the previous paragraphs. This group could rather easily be conceptualized as being trustworthy, and as being almost continuously examined for this very characteristic. It is hoped and fully expected that future research efforts to show construct validity for the PHI will be very successful. Some limited early attempts to show construct validity have so far been quite successful.
Well then, just exactly what is this so-called test of trustworthiness, the PHI? Actually, it is just a portion of another slightly lengthier test that has been repeatedly researched regarding reliability and validity determinations. The 50-item PHI is a moderately shortened version of the Personnel Security Standards Psychological Questionnaire (PSSPQ), about which a number of Web-site pages have been written. Some of these pages can be found at the following Web addresses: http://www.home.earthlink.net/~lastone2/psspq.html
Although a number of additional Web pages have been devoted to presentation and discussion of the PSSPQ, the pages listed above can be regarded as the most ‘major’ ones. For anyone who may want to view all of the Web pages that are focused upon the PSSPQ, the reader is encouraged to enter "psspq" (including the quotation marks) into the Google search engine.
Basically, the PSSPQ can be regarded as a very well researched and well developed psychological test. It has been shown to possess, at favorable levels, the following types of reliability: test-retest and internal consistency (i.e., Kuder-Richardson Formula 20). The following types of validity have been
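For readers unfamiliar with it, the Kuder-Richardson Formula 20 named above estimates internal consistency for items scored 0/1. The sketch below applies the standard formula to a made-up response matrix; the data and the function name `kr20` are hypothetical, and the population variance of total scores is used, which is one common convention rather than a documented detail of the PSSPQ analyses.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson Formula 20 for a list of respondents' 0/1 item scores."""
    n_people = len(responses)
    n_items = len(responses[0])
    # For each item: p = proportion scored 1, q = 1 - p; accumulate p * q.
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(person[i] for person in responses) / n_people
        sum_pq += p * (1 - p)
    # Variance of respondents' total scores (population form, an assumption).
    total_variance = pvariance([sum(person) for person in responses])
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Hypothetical responses: four respondents by three dichotomous items.
hypothetical_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(hypothetical_responses))
```

Values closer to 1 indicate that the items tend to be answered consistently with one another.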
It is important to understand just how the PSSPQ was shortened to become the PHI. To explain how and why this was accomplished, it should be noted that the PSSPQ was composed of 11 different scales plus one LIE (i.e., positive dissimulation) scale. All of the scales, except for the LIE Scale, were designed to be based upon the 11 adjudication concerns that are considered in the processing of an individual who has been nominated for possibly being granted a Top Secret – Sensitive Compartmented Information (TS-SCI, which is a very high) security clearance by the U.S. Government. These adjudication standards or concerns were originally stated in a government document titled Director of Central Intelligence Directive 1/14 (or DCID 1/14), which was replaced, with very few changes, in 1998 by DCID 6/4. The 22 items that were omitted when creating the PHI all came from the elimination of five complete PSSPQ scales.
As a briefer form of the PSSPQ, the PHI was also developed so as to be relevant for assessment of honesty and trustworthiness in the general employment-applicant sector. The PHI was not designed for use in evaluating persons being processed for security clearance status consideration, as was the PSSPQ. The PHI test consists of seven of the original PSSPQ scales. These seven are:
Most of the original PSSPQ scales listed above were modified a bit from the form that they had in the PSSPQ. Some items in the scales were modified very slightly and others were modified to whatever extent was needed to make them more subject relevant. A few new items were created, all of them for the LIE Scale (which now has 13 items instead of the 10 in the PSSPQ). The numbers of items in the six remaining scales are (same scale designations as above): a. (nine items), b. (six items), c. (six items), d. (six items), e. (seven items), and f. (three items). The three items comprising scale f. were modified so as to make this brief scale sensitive to something other than what it was in the PSSPQ. In the PHI, these three items now measure violations of employment confidentiality instead of only past governmental security violation matters. This newly focused scale is now more appropriately named the Security/Confidentiality Violations Scale. All of the other items (in scales a., b., c., d., and e.) were almost unchanged from their representation in the PSSPQ. As one can see from this description of the items that comprise the PHI, it is rather incorrect to simply describe the PHI as a shortened version of the PSSPQ; it is actually essentially a new test. However, it is similar enough to the PSSPQ, with respect to most of its items and general purpose, that much of the reliability and validity determinations that have been made for the PSSPQ can perhaps be somewhat generalized to the PHI.
As indicated earlier, the number of items that comprise this substantially revised PHI test is 50. Therefore this revision of the PSSPQ into a somewhat new test, now known as the PHI test, is only about two-thirds the length of the PSSPQ. As a consequence, its administration and scoring times are reduced proportionally. These 50 items generally take only about 8-15 minutes to fully respond to. Actually, some fast readers have only required about five
Therefore, with the exception of only three additional, new items, the remainder of the PHI (i.e., the other 47 items) comes almost entirely from the well-researched PSSPQ, which has been shown to possess excellent predictive validity and good reliability in the screening of candidates for high-level security clearances. Some of these 47 ‘original’ items were slightly or mildly modified to make them more suitable for general use outside of security clearance adjudication matters. When in the PSSPQ, these 47 items had been ‘proven’ to be up to the task for which they were created. In the PHI, they are entirely suitable.
A factor analysis of the seven PHI sub-scales was carried out in an attempt to explore whether some complex factor structure might be present. The correlation matrix of the sub-scales’ inter-correlation coefficients was submitted to a principal components analysis, with the limiting eigenvalue set at unity. The resulting communalities ranged from 0.56 to 0.94 (average was 0.71). The three factors obtained were rotated following a varimax strategy. The first and largest factor was a bit bipolar; the largest positive loadings (in descending order) were with the Undesirable Character, Financial
What the above principal components analysis can be believed to show is that the PHI sub-scales basically measure three quite independent constructs. It would appear that what is mainly measured is some sort of generalized character concept exemplified by an undesirable character display, alcohol abuse, a history of law violations, as well as difficulties with past security violations. Interestingly enough, admissions of such past behaviors seem to be negatively influenced by denial and deception. A second construct measured by the PHI seems to be mainly centered on alcohol abuse, and admission of such behavior also seems to be negatively influenced by denial and deception. A third construct appears to solely involve problems showing a history of financial irresponsibility and, surprisingly, admissions of such problems seem not to be influenced by denial and deception. This reported principal components analysis of the PHI can be regarded as showing a type of factorial validity for this instrument. The emerging factor structure
was quite
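As a small worked illustration of the communality figures reported in the principal components analysis: when components are extracted from a correlation matrix, a sub-scale's communality is the sum of its squared loadings on the retained components, and the mean communality equals the proportion of total variance those components account for. The loadings in the sketch below are hypothetical (the actual PHI loadings are not reproduced here); only the average communality of 0.71 and the count of seven sub-scales come from the text.

```python
def communality(loadings):
    """Sum of squared loadings of one variable on the retained components."""
    return sum(value ** 2 for value in loadings)


# Hypothetical loadings of one sub-scale on three components; not PHI results.
hypothetical_loadings = [0.6, 0.3, 0.5]
print(round(communality(hypothetical_loadings), 2))

# From the text: seven sub-scales with an average communality of 0.71.
n_subscales = 7
average_communality = 0.71

# Each standardized sub-scale contributes one unit of variance, so the retained
# components jointly account for roughly this many of the seven units.
variance_accounted_for = average_communality * n_subscales
print(round(variance_accounted_for, 2))
```

Read this way, the reported average implies that the three retained components account for roughly 71 percent of the variance in the seven sub-scale scores.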
How the PHI and the PSSPQ Differ from Other
As previously mentioned throughout this presentation, both the PHI and the PSSPQ represent psychometric testing instruments, argued to test for degree of honesty, integrity, or similar synonyms, that for the first time have been constructed using a bona fide criterion sample representing the possession of high levels of what the test was designed to measure. With these two tests, the possession of high levels of trustworthiness has been operationally defined by the fact that the employed sample(s) possessed TS-SCI access status (as granted by the U.S. Government). In other words, the Diogenesian tool used to find "an honest man" was simply whether he or she possessed TS-SCI access status. In the case of the PHI, one of the criterion-defining groups was a rather rare group whose members possessed TS-SCI access status and then, on top of this extremely high security clearance status, had also been evaluated and reassigned to a very special program that involved even higher security clearance status, above (or on top of) the TS-SCI level. Of course, we are fully aware that even the possession of TS-SCI access status does not totally or absolutely guarantee complete honesty or trustworthiness. However, can anyone come up with any better groups to use in trying to operationally define honesty, integrity, trustworthiness, etc.?
For anyone familiar with the structure of U.S. Government security clearances, there are a number of clearances above the TS-SCI access level; however, they are granted to only very small groups of people at the very highest levels in our government, many times with fewer than 10 or 20 people involved. These are for the very top White House, DoD, and Congressional civilian officials along with the top generals/admirals. 
Those in what has been described in this presentation as the special program had to already possess TS-SCI status, along with some very needed skills, in order to even be considered for processing for possible entry into the program. If successful, the candidates for entry into this special program are then granted even ‘higher than TS-SCI’ clearances. If persons in this very special program, in a very sensitive governmental agency, cannot be considered very trustworthy, then it is difficult, in a real-world situation, to come up with any identifiable group that could be considered as having this distinction.
Summary Comment
Both the PHI and the PSSPQ test instruments have been described and discussed in this presentation, hopefully so as to communicate the unusual nature of the validity criteria for both tests. Although both tests are in fact rather similar in form (they actually share a large number of the same items), they were designed for very different purposes. The PSSPQ was designed and successfully validated to accurately predict, for persons who might be processed for possible granting of high-level security clearance status, who would and who would not be successful in finally being granted such clearances. In contrast, the PHI was designed to establish a response standard (i.e., standardized norms) for trustworthiness based upon a large database constructed using test responses obtained from a group of employees of a very
_______________________________________________________________ For those readers who might wish to communicate with Dr. Stone regarding the PHI test, his Email address is: lastone2@earthlink.net.
|