LeRoy A. Stone, Ph.D.; (Forensic Diplomate) ABFP, ABPP
This is an announcement that the PHI test is now available for use. Its initial development began in the mid-1980s and was completed recently. It was conceived as an integrity-type test: its items ask about matters of integrity and honesty, with a focus upon alcohol and drug use history, legal violations, financial irresponsibility, security violations, and matters involving undesirable character formation. The PHI also includes a LIE scale with psychometric properties not shared by most scales that have been touted as measuring dissimulation attempts; this is because scores on it are unrelated to degree of intelligence. The PHI consists of only 50 rather special multiple-choice items. They are special in that they are based upon an item model previously labeled “Scalogram Analysis” (by Professor Louis Guttman). This item paradigm is unusually “powerful,” and the result, in the case of the PHI test, is that its 50 items are equal to what normally is measured by 250 true/false type items. Administration of the PHI can be done individually or in group mode; it seldom requires more than 10-15 minutes. Reading-level testing has been done with the PHI, and it appears that a reading skill level of about the 7th or 8th grade is required for the PHI to be responded to in a valid fashion. A number of different scales can be scored by hand or by computer. The PHI is scored for six ‘trustworthiness’ sub-scales and the just-mentioned LIE scale. Areas that are not inquired about, although they undoubtedly are associated with the general concepts of integrity and trustworthiness, are those clearly focused upon mental health and counseling/psychotherapy, and other areas that are ‘protected’ by the Americans with Disabilities Act of 1990.
In a number of respects, the PHI is not greatly different in form from a number of other existing integrity tests. PHI research already completed and reported has produced results that support a contention of acceptable reliability for the test. Acceptable reliability is demanded of a psychological test if it is to be ‘accepted,’ especially in commercial matters. Reliability pertains to the stability of what is being measured by a test. For example, suppose a weight scale shows that an individual weighs 150 pounds; the person steps off and back on, and the scale now shows 275 pounds; the person steps off and back on once more, and the scale now shows 55 pounds. This would suggest that the scale measures weight in a highly unreliable fashion. However, if the scale shows that the individual’s weight is consistently the same under the conditions just described, then the scale could be regarded as producing a reliable measure of weight. Research indicates that the PHI produces reliable measurement. Actually, the other available integrity tests would appear to show similar levels of reliability. Where the PHI test can be argued to be superior to other available integrity tests is with the measurement concept of validity. Validity, when considered with regard to psychological tests, is basically the answer to the question: does the test measure what it purports to measure? When this question is applied to just about all of the currently available integrity tests, the answer is not very clear. A number of the tests look like they measure integrity, as they consist of items whose content pertains to matters generally regarded as having some association with concepts of honesty and integrity. This is known as ‘content validity’; with many tests it is a desirable characteristic, and with others just the opposite.
A good example of this latter idea is the many constructed LIE or dissimulation scales; for this testing purpose, the last thing in the world a test builder would want is a LIE scale composed of items whose content looks like LIE scale items. It can be said that the PHI test (except for its LIE scale) possesses excellent content validity. What the PHI’s many integrity-test competitors do not have, when their validity is being evaluated, is a normative or criterion group, upon which the test was standardized, that can be easily and convincingly argued to possess the characteristic the test purports to measure. The PHI test was empirically validated using a very clearly defined group of 269 persons, all of whom held just about the highest security clearance status granted by the U.S. Government. In fact, 63 of the 269 had been granted an even higher clearance status, as they were assigned to an extremely sensitive intelligence agency project that required it. All of these people (males = 157 and females = 112) originally had to go through security clearance evaluation and adjudication processing that involved extremely extensive background investigations, polygraph examinations, psychological testing, and all types of ‘vetting’ examinations in order to obtain employment accompanied by the granting of Top Secret/Sensitive Compartmented Information (or TS-SCI) access. Every five years thereafter, this group of agency employees had to undergo a similar level of reevaluation using most of the evaluation tools and techniques just mentioned. The sub-group of 63, in fact, had to undergo even more stringent evaluations for trustworthiness, and normally more frequently than the five-year period mentioned for the other 206 members of the trustworthiness criterion group.
A much more extensive description of the matters described above can be found at: http://www.home.earthlink.net/~lastone2/trustworthinesstests.html. A psychologist colleague of Dr. Stone asked him what the costs to the Government might be for processing a person up to the granting of TS-SCI security clearance status. Such costs have been estimated: for some individuals the cost may be as low as a couple of thousand dollars; for others, the costs are easily over $25,000.00. Several years ago, a Government publication stated that about $10,000.00 could be a fair estimate of the average cost of processing up to the granting of TS-SCI access. If this figure is used, then the cost to the U.S. Government of all the processing and work required just to initially grant 269 individuals TS-SCI access would be about $2,700,000.00 (269 × $10,000.00 = $2,690,000.00). This would have to be considered a very conservative monetary estimate, as most of those in the group had been through repeated reevaluations (i.e., once every five years). Also, the security clearance evaluations of the special sub-group of 63 persons, being even more extensive and with more frequent reevaluations, were more costly to conduct. It would probably be very safe to conclude that the total cost to the U.S. Government of all the security investigations regarding the trustworthiness of these 269 persons would be at least about five million dollars, more likely closer to ten million dollars, and perhaps even more. A question can be raised at this point: if it cost the U.S. Government (which is actually us!) many millions of dollars to evaluate whether the group of 269 persons were ‘trustworthy’ or not, then why not rely upon the final adjudicated decision that they indeed were?
After all, the evaluation process made use of just about every widely accepted means or ‘tool’ for evaluating a person’s character. If someone could suggest a valid source of evaluative information that is not already employed in evaluating persons for potential TS-SCI access, the U.S. Government would most certainly be eager to learn of it. In other words, these 269 people, each possessing TS-SCI clearance status, can easily, and really without any significant argument, be accepted as a group that could be used as a normative standard for the characteristic of trustworthiness. If this last statement can be accepted as valid, then scores obtained by this group on tests (or sub-scales) should be considered to represent the test-taking behavioral patterns of trustworthy persons, especially if the content of the tests (or sub-scales) mainly involves trustworthiness-type matters. Mean (or average) scores on the tests (or sub-scales) could justifiably be used to define what a group of highly trustworthy persons would tend, or be expected, to obtain when administered these tests (or sub-scales). If the content of the tests (or sub-scales) mainly involves trustworthiness, honesty, or integrity, then it would be logical to assume that this group’s mean (and standard deviation) scoring values could be used as a type of base rate for defining a high level of trustworthiness or the like.
If a person (who was not in the group of 269) took the same tests (or sub-scales) and obtained scores that were markedly different (in a direction that, according to the content of the tests, suggested a lessened degree of trustworthiness), then it would be safe to assume that this individual could be regarded as being “not very trustworthy.” Such an evaluative decision could be justified or backed up by comparing the individual’s scores with the expected scores obtained by the criterion group of 269 persons, who as a group can be defined as highly trustworthy people. For the very first time, an integrity-type test, i.e., the PHI, can be considered a valid test, based upon the presence and use of a criterion group whose members all possess a high degree of trustworthiness. Their high degree of trustworthiness was defined by their all possessing TS-SCI access status, a classification based upon governmental processing that cost millions and millions of dollars to conduct and upon a final adjudicative decision that drew on just about all the information that, in a ‘real world sense,’ could possibly be obtained. When a person is administered the PHI, his/her scores on the six sub-scales, all focused on trustworthiness matters (i.e., Undesirable Character Traits, Financial Irresponsibility, Alcohol Abuse, Illegal Drugs and Drug Abuse, Record of Law Violations, and Security Violations), are compared to the standardized normative scoring information obtained by administering the PHI to the criterion group of 269 persons, all defined as being highly trustworthy. The same is done with scores from the seventh sub-scale, the LIE sub-scale. Interpretation of an individual’s PHI scores is facilitated by use of the T-score paradigm, in which the mean value is equal to 50 and the standard deviation has a value of 10.
For example, if an individual obtains sub-scale scores equal to the criterion group’s mean scores on all the sub-scales, then his/her T-scores on all the sub-scales would be 50. In contrast, if an individual obtains scores on the sub-scales that are all two standard deviations below the criterion group’s mean scores, then his/her T-scores on all the sub-scales would be 30. Naturally, scores obtained on all the PHI test’s sub-scales have to be interpreted based not only on the magnitude of the scores but also on the profile or configuration of the sub-scale scores (i.e., the six trustworthiness scales, the LIE scale, the Total score [which is merely the summation of the six trustworthiness sub-scales], and some developed score ratios and summations). One approach that facilitates interpretation of an obtained set of PHI sub-scale scores is to compute a scatter or deviation index value. A scatter/deviation index is merely the summation of the absolute deviations (on the T-score scale) of the obtained scores from the mean value of 50. For example, if an individual obtained the following seven sub-scale T-scores: 70, 60, 20, 50, 75, 35, and 62, then his/her scatter index would be equal to 112 (i.e., 20+10+30+0+25+15+12 = 112). The greater the scatter/deviation index, the greater the overall psychometric distance from the average PHI performance of the trustworthiness criterion group. The greater the distance, the less likely it is that the set of scores being interpreted is comparable to the normative scoring information based upon the criterion group’s testing results. If one tested individual showed a much smaller scatter index than a second individual, then the first individual could be psychometrically interpreted as likely being the more trustworthy of the two.
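The T-score conversion and the scatter/deviation index described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the PHI's actual scoring program: the function names are made up for the example, and the normative mean and standard deviation used below are hypothetical placeholders, since the real criterion-group values are not given here.

```python
def t_score(raw, norm_mean, norm_sd):
    """Convert a raw sub-scale score to a T-score (mean 50, SD 10)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


def scatter_index(t_scores, center=50.0):
    """Sum of absolute deviations of T-scores from the normative mean of 50."""
    return sum(abs(t - center) for t in t_scores)


# Hypothetical normative values for one sub-scale (NOT actual PHI norms).
NORM_MEAN = 12.0
NORM_SD = 3.0

# A raw score equal to the criterion-group mean maps to T = 50.
print(t_score(12.0, NORM_MEAN, NORM_SD))  # 50.0

# A raw score two standard deviations below the mean maps to T = 30.
print(t_score(6.0, NORM_MEAN, NORM_SD))  # 30.0

# The seven T-scores from the worked example in the text.
print(scatter_index([70, 60, 20, 50, 75, 35, 62]))  # 112.0
```

Because the index sums absolute deviations, a score far above 50 and a score far below 50 both add to the distance from the criterion group rather than cancelling each other out.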
Another productive way of interpreting PHI sub-scale scores is to compute the mean of the T-scores for the six trustworthiness scales and compare this mean T-score with the LIE sub-scale T-score (normally this is accomplished using a ratio format). If the ratio of the trustworthiness mean T-score to the LIE sub-scale T-score is less than unity, then a “fake-good” response style can be believed to have significantly influenced responses to the items of the trustworthiness sub-scales. The smaller this ‘less-than-unity’ ratio value is, the greater the degree to which a “faking-good” response style can be inferred to have affected the individual’s test-taking. In contrast, this ratio, when greater than unity, can be believed to suggest that the test-taker was being relatively honest and candid in his/her responses to the PHI items in general. The higher the ‘greater-than-unity’ ratio value is, the greater the degree of honesty in his/her responses that can be inferred. Research has shown that the LIE scale can be considered relatively independent of intelligence. What this means is that, as a test of positive dissimulation (or “faking-good”), it is quite unlike most of the other lie-type scales contained in many of the most popular psychometric tests of personality constructs: directional response to it has been found not to be related to level of intellect. Smarter individuals seem not to be able to ‘see through’ the scale and determine what it was designed to measure any better than less intelligent people. Because of this aspect of the LIE scale, its scores can prove to be a very valuable tool when attempting to place individuals on a trustworthiness dimension. Too high a score (i.e., perhaps any score more than one standard deviation above the mean seen in the normative data, which would be anything above a T-score of 60) should definitely raise a red flag.
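The ratio of the trustworthiness mean T-score to the LIE T-score, described at the start of this passage, might be computed as in the following sketch. The function names and the wording of the returned labels are assumptions made for illustration; the text does not say how a ratio of exactly unity should be read, so the sketch treats it as uninformative.

```python
from statistics import mean


def trust_to_lie_ratio(trust_t_scores, lie_t_score):
    """Ratio of the mean of the six trustworthiness T-scores to the LIE T-score."""
    if len(trust_t_scores) != 6:
        raise ValueError("expected six trustworthiness T-scores")
    if lie_t_score <= 0:
        raise ValueError("LIE T-score must be positive")
    return mean(trust_t_scores) / lie_t_score


def describe_ratio(ratio):
    """Label a ratio following the less-than/greater-than-unity rule in the text."""
    if ratio < 1.0:
        return "possible fake-good response style"
    if ratio > 1.0:
        return "relatively honest and candid"
    # Exactly 1.0 is not covered by the text; this label is an assumption.
    return "no indication either way"


# Hypothetical profiles: average trustworthiness scores with a high or low LIE score.
high_lie = trust_to_lie_ratio([50, 50, 50, 50, 50, 50], 65)
low_lie = trust_to_lie_ratio([50, 50, 50, 50, 50, 50], 40)

print(round(high_lie, 3), describe_ratio(high_lie))
print(low_lie, describe_ratio(low_lie))
```

With identical trustworthiness scores, the ratio falls below unity only because the LIE T-score rises above the trustworthiness mean, which is the situation the text associates with "faking good."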
T-scores below 60 can be interpreted as suggesting that the individual was open and candid in his/her self-descriptions. T-scores below 40, although perhaps suggesting the same conclusion, should be considered very carefully, as what they suggest is that the individual is even more open and candid than was the normative group, which defined very high levels of trustworthiness. This warning should definitely apply if the T-score is below 30; such a score most likely suggests that the individual has “seen through” the LIE scale and has attempted, in an invalid fashion, to appear very open and candid in his/her self-descriptions. Direct interpretation of the scores on the six individual trustworthiness scales definitely should be attempted. For example, suppose an individual’s T-scores on five of the six sub-scales were all 50 but his T-score on the Illegal Drugs and Drug Abuse sub-scale was 75; this single score might provide sufficient information upon which to base an ‘unfavorable’ decision regarding this individual. However, before making such a decision, it would be wise first to discuss this individual’s past drug involvement and use with him. It is possible that the drug use problem had some type of medical origin and that he was successfully treated for it long ago. Non-response to one or more items definitely should be followed up by asking the individual how and why the particular item(s) were not responded to. Simple carelessness most likely will be his/her explanation, but once in a while a very purposeful omission will be encountered, and such omissions are usually very fruitful to follow up with the individual. One type of scoring interpretation that has yet to be explored is the configural or profile type. However, it is anticipated that not much interpretative value will be found using such a format.
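Returning to the LIE-scale cutoffs given earlier (above 60, below 60, below 40, and below 30), the banding can be written out as a small lookup. This is a sketch of one way to encode those cutoffs, not an official scoring rule: the function name and label wording are invented for the example, and the handling of a T-score of exactly 60, which the text leaves open, is an assumption.

```python
def lie_scale_band(lie_t_score):
    """Map a LIE sub-scale T-score to the interpretive band suggested in the text."""
    if lie_t_score > 60:
        return "red flag: score is too high"
    if lie_t_score < 30:
        return "suspect: individual may have seen through the LIE scale"
    if lie_t_score < 40:
        return "candid, but consider very carefully"
    # 40 up to 60; treating exactly 60 as candid is an assumption.
    return "open and candid"


for t in (72, 55, 38, 25):
    print(t, "->", lie_scale_band(t))
```

The check for scores below 30 comes before the check for scores below 40, so the stronger warning takes precedence for very low scores.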
Profile interpretation was examined with the Personnel Security Standards Psychological Questionnaire (PSSPQ), and although some rather low but statistically significant correlation coefficients were found between some specific configural patterns and subsequent success or failure in being granted high-level security clearance status, not much additional predictive power was obtained by their use. Nevertheless, even with this early understanding that configural analysis of profiles of PHI sub-scale scores may not produce any major diagnostic information regarding trustworthiness, some further research along these lines is planned. By now, readers of this presentation
have been made aware that, although certain types of scores from most available
integrity tests may be interpreted as possibly diagnostic of future
problem employees, there really has been no way to compare such scores
to actual normative scoring information from a sizeable group of people
who unquestionably could be regarded as possessing a high degree of trustworthiness.
Only the PHI has this latter suggested advantage. Obtained scores,
on all seven of the sub-scales, as well as various combinations of these
scorings, can be compared and contrasted with standardized normative information
involving a real world definition of the concept hopefully being measured.
Integrity tests other than the PHI are capable only of comparing
obtained scores to normative scaled scoring data that are simply nothing
more than collections of data from (sometimes) large numbers of persons
who were tested simply because they were applicants for employment or something
similar. These other integrity tests ARE NOT empirically validated
using a group (or groups) of persons who legitimately, and without argument, can be
believed to possess a high degree of trustworthiness.
Note - Another psychological test, developed by Dr.
Stone and having some relationship with the PHI, is the Personnel Security
Standards Psychological Questionnaire (PSSPQ). It is capable of predicting
quite accurately (i.e., with better than 95% accuracy) whether persons
facing the evaluation/adjudication process leading to possible granting
of high-level security clearance status will be successful or not.
Copyrighted by Dr. L. A. Stone, May 2003