Personnel Security Standards Psychological Questionnaire (PSSPQ)
Note – In recent months, Dr. LeRoy A. Stone, the developer and current commercial offerer of this test, has been repeatedly asked, mostly by trained psychologists, to provide the psychometric statistical information that describes past reliability and validity estimations for the PSSPQ. Although the several other Web pages that describe the PSSPQ, most of which have been on the Web for over a year, provide a good deal of both the historical and the technical background of this test, no in-depth description of the reliability/validity determinations has yet been presented. Communicating this information is the raison d’être of the current set of Web sub-pages, which is devoted to an adequate description of PSSPQ reliability/validity estimation determinations. Due to the extreme length of the following presentation, it is suggested that readers print out a ‘hard copy’ to facilitate study and understanding.
As indicated in the Note-paragraph above,
additional information regarding the PSSPQ’s estimations of reliability
and validity has been mainly requested by psychologists, especially those
with strong interests and backgrounds in psychometrics. To satisfy these
requests, the following set of Web sub-pages is presented. These pages
will most likely present a kind of technical information that will
not be understood by a goodly number of people, as it has been written
especially for those with sufficient psychometric expertise.
However, for those with true or bona fide interest in the PSSPQ
and possible usage of it, a reading of the following pages should be considered
to be essential. The contents of the following pages can be considered
to provide a sound basis for truly believing that the PSSPQ can accomplish
what it has been described as being capable of doing. The
PSSPQ is, without question, a highly reliable and valid psychometric testing
instrument!
Estimations of Reliability
Test-Retest
Internal Consistency
Based upon their MMPI scores (each subject was also administered this test in addition to the then-beta version of the PSSPQ), this sample appeared to be rather representative of the general population, as their mean MMPI scale scores were close to recognized norm values. In the studied sample of 102 individuals, 61 were eventually successful in obtaining TS-SCI access; 41 therefore were not successful in obtaining this access. This particular sample was especially well suited for the validity testing of the PSSPQ. Based upon PSSPQ data from this standardization group, some internal consistency forms of reliability were determined. The internal consistency forms (i.e., based upon Kuder-Richardson models) were employed with the PSSPQ and its scales, and these estimation attempts produced very encouraging and supportive results. When the five-level item responses are ‘collapsed’ to only two response levels, Kuder-Richardson Formula 20 reliability estimates for the Close Relatives and Associates, Sexual Considerations, Undesirable Character Traits, Financial Irresponsibility, Alcohol Abuse, Illegal Drugs and Drug Abuse, Emotional and Mental Disorders, Record of Law Violations, and Security Violations scales, the PSSPQ Total Score, and the LIE Scale are as follows (respectively): 0.93, 0.90, 0.93, 0.80, 0.94, 0.96, 0.91, 0.80, 0.93, 0.97, and 0.86. Again, it should be noted that these Kuder-Richardson reliability estimates are based upon the sample of contractor employees who were, at the time, being processed and considered for possible security clearance status.
Split-Half
The reliability estimations for the
remaining sub-scales were lower but were (with one exception) statistically
significant at the .001 level (the exception was significant at the .01
level). The exception was the three-item Security Violations
Scale, and it is believed that this scale was much too short for
appropriate utilization of the split-half reliability estimation paradigm.
Of course it was impossible to utilize this reliability estimation paradigm
with the two PSSPQ sub-scales which involved only one item each. In general,
utilization of split-half methodology has resulted in quite favorable reliability
estimations of the PSSPQ scales. Again, it should be noted that these estimations
were computed from the contractor employee (N = 102) sample.
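For readers who wish to see the mechanics, the following is a minimal sketch of a split-half estimate corrected for scale length. It assumes an odd-even item split and the Spearman-Brown correction; the presentation does not state which split or which correction was actually used, and the function names and toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Step a half-scale correlation up to full scale length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one row per respondent, one column per scale item."""
    n_items = len(item_scores[0])
    if n_items < 2:
        # A one-item scale cannot be split into two halves.
        raise ValueError("split-half estimation needs at least two items")
    # Assumption: odd-numbered items form one half, even-numbered the other.
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson(odd_half, even_half))


# Toy data only (4 respondents, 4 items); not PSSPQ data.
toy_scores = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 0, 1, 0], [1, 1, 0, 0]]
toy_estimate = split_half_reliability(toy_scores)
```

With a three-item scale, an odd-even split leaves halves of two items and one item, which illustrates why so short a scale is poorly suited to this paradigm.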
The Reliability Estimations Hold Up in Cross-Validation Research
When the PSSPQ was cross-validated with
a larger and independent sample (i.e., N = 179), approximately
one year following the initial validation (i.e., with an N
of 102), reliability estimations of the kinds described in the several
preceding paragraphs (with the exception of the test-retest model) were
again computed. Basically, reliability estimations of just about the same size
were obtained. Just about all of the differences between
comparable reliability coefficients were about what one would expect on
the basis of chance alone. Taken together, the test administration data
from both the original and the cross-validation samples, when submitted
to reliability estimation paradigms of the internal consistency (i.e.,
Kuder-Richardson Formula 20) and split-half (when corrected for scale length)
types, support an inference of highly acceptable test reliability.
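As an illustration of the internal consistency model named above, here is a minimal sketch of Kuder-Richardson Formula 20 applied to five-level responses that have first been collapsed to two levels. The collapsing cutoff, the function names, and the toy data are assumptions made for illustration; the presentation does not say how the five levels were collapsed.

```python
from statistics import pvariance


def dichotomize(response, cutoff=3):
    """Collapse a five-level response (1..5) to 0/1.

    The cutoff of 3 is an assumption, not taken from the PSSPQ.
    """
    return 1 if response >= cutoff else 0


def kr20(binary_items):
    """Kuder-Richardson Formula 20.

    binary_items: one row per respondent, one 0/1 score per item.
    KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores).
    """
    n_resp = len(binary_items)
    k = len(binary_items[0])
    totals = [sum(row) for row in binary_items]
    total_var = pvariance(totals)  # population variance, matching p*q
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in binary_items) / n_resp
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Toy five-level responses (4 respondents, 3 items); not PSSPQ data.
toy_responses = [[5, 4, 1], [1, 2, 1], [4, 5, 4], [2, 3, 2]]
toy_binary = [[dichotomize(r) for r in row] for row in toy_responses]
toy_kr20 = kr20(toy_binary)
```

The formula needs at least two items and some variance in the total scores, so it cannot be applied to a single-item scale.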
Estimations of Validity
Face/Content Validity
Empirical [Prediction] Validity
In attempting to ascertain how PSSPQ results might be used to maximize accuracy in predicting whether individuals would or would not be granted TS-SCI access, it was discovered that using specific sets of PSSPQ items, rather than the originally constructed PSSPQ scales, achieved significantly higher levels of prediction success. For example, it was found that an optimally weighted sum of information, based on a specific set of 25 PSSPQ items, correlated extremely highly with the validity criterion of whether the involved individuals (N = 102) would or would not be granted TS-SCI access. The multiple correlation coefficient in this 25-predictor-item situation was 0.79 (p < .001). It has been found that the number of PSSPQ items can be as low as 15 and the validity coefficient still remains quite high (e.g., above 0.73). When the 25 most predictive PSSPQ items are considered, they allow for extremely accurate, early-on predictions of whether an individual will eventually be granted or not granted TS-SCI access. For example, with the studied 102 personnel, when the response information for the 25 items was entered into a developed multiple regression equation (or a simple discriminant function model), it allowed for quite accurate, early predictions of what actually later occurred. Out of the 102 persons, only eight errors of prediction occurred. Seven of the errors might be regarded as being of the ‘false positive’ type, in that these seven were predicted to obtain TS-SCI clearances and in fact were not successful in this regard. Only one error of prediction of the other type occurred, in that the involved individual was predicted (based only on information from the 25 PSSPQ items) to not obtain TS-SCI access and in fact was successful in this regard.
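The error counts just described can be tallied as a small confusion matrix. The sketch below uses only figures stated in this presentation (61 successful, 41 unsuccessful, seven false positives, one false negative); the function and variable names are hypothetical.

```python
def tally_predictions(n_selected, n_not_selected, false_positives, false_negatives):
    """Build confusion-matrix counts and overall accuracy from error counts."""
    true_positives = n_selected - false_negatives       # predicted and granted
    true_negatives = n_not_selected - false_positives   # predicted and not granted
    total = n_selected + n_not_selected
    accuracy = (true_positives + true_negatives) / total
    return {
        "true_positives": true_positives,
        "true_negatives": true_negatives,
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "accuracy": accuracy,
    }


# Figures reported for the initial validation sample (N = 102).
result = tally_predictions(
    n_selected=61, n_not_selected=41, false_positives=7, false_negatives=1
)
accuracy_percent = round(result["accuracy"] * 100, 1)
```

This gives 94 correct predictions out of 102. Because the counts come from the same sample on which the regression weights were derived, the figure describes fit to that sample rather than performance on new cases.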
The empirically determined accuracy rate therefore was quite high; 92.2% accuracy was achieved in correctly predicting whether an individual would or would not eventually be granted high-level clearance status. If one makes use of additional statistical prediction information, such as the standard error of the multiple regression, then even better predictions can be made. For example, no errors of prediction were noted for those individuals who achieved prediction scores at or above the mean prediction score for ‘successful’ individuals; also, no errors of prediction were noted for those who achieved prediction scores at or below the mean prediction score for ‘unsuccessful’ individuals. Only two errors of prediction occurred for individuals whose scores on the multiple regression prediction scale were within one standard error of their respective group means. Therefore, almost all (six out of eight) errors of prediction occurred for individuals who were almost midrange between the ‘successful’ and ‘unsuccessful’ means on the multiple regression prediction scale. The prediction accuracy described in the preceding paragraphs can be more completely and dramatically understood by inspecting a tabular presentation of this statistical prediction information. Such is shown below:
[Table not reproduced here: prediction scale scores for the "Not Selected" and "Selected" groups.]
Inspection of the above tabular presentation clearly shows that the prediction scale (based upon the multiple regression prediction model involving 25 PSSPQ items) ranged from a possible low of 0.75 up to a possible high of 2.49. The mean score on this scale for the "Not Selected" group (which consisted of those 41 individuals who were not successful in later obtaining high-level security clearance status) was 1.00; the mean score on this same scale for the "Selected" group (consisting of those 61 individuals who were later successful in obtaining their hoped-for security clearances) was 2.00. It can be easily seen that the higher and the lower scores on this prediction scale predicted, with just about perfect accuracy, whether the associated individuals would or would not eventually be granted TS-SCI access. A small number of prediction errors (eight in total) occurred only for those individuals who had scores in a rather narrow mid-range segment of the scale.
Readers of this presentation (most likely, psychometrically oriented psychologists) should be aware that the obtained 0.79 predictive validity correlation coefficient is unusually high for any psychological test used in a ‘real world’ situation. As has been discussed by Cronbach (1960), applied psychologists have almost "abandoned their insistence on validity coefficients of .70 or .80 for all tests" (p. 349), as such are very seldom encountered. He noted that, based on 30 years of practical testing experience, "we cannot obtain such standards." He used this as an introduction to the idea that validity coefficients "as low as .30 are of definite value" (p.
349). In this regard, one is reminded that Strong (1943) [one of the "greats" in the field of psychometrics] commented that he had observed that test critics who are hostile towards the idea of lower value validity coefficients (i.e., correlations) are quite willing to accept information of no greater dependability "when (they) play golf or employ a physician." According to Strong, the correlation of golf scores between the first and second 18 holes in championship play is about .30, and the reliability of medical diagnosis is near .40 (p. 55).
To further show that PSSPQ scorings are valid in a predictive sense, it should be noted that the multiple correlation between the set of 25 PSSPQ items and the interviewing psychologists’ dichotomous recommendations was 0.718 (p < .001), and that with the chief psychologist’s later dichotomous recommendations was just barely higher, 0.725 (p < .001). The particular 25 PSSPQ items, along with their multiple-regression B weights, will not be described in this presentation, as this particular prediction information is considered to be proprietary and will not be publicly shared. However, it can be stated that the selected 25 items were rather representative of the total 72 items, although not all of the PSSPQ scales had items in this item collection. A principal components factor analysis was done with the 25-item collection and did contribute some further understanding of what was being measured by this particular collection of items. It was interesting to note that two of the 25 items did not come from PSSPQ scales based upon DCID 6/4 adjudication concern areas, but rather came from the 10-item LIE Scale.
Cross Validation
Then a multiple regression model was computed using all of the PSSPQ item information from all 102 subjects. This model was then used to predict adjudication status for each of the 102 studied subjects, and the prediction was compared with their actual adjudication status with respect to TS-SCI access. For only two of the last obtained 52 subjects did their predicted status change when it was based upon the multiple regression model computed using all 102 subjects’ information. For one of these two subjects, his predicted status, based upon a prediction model not employing his scores in the development of the model, was that he would not be successful in obtaining TS-SCI access, whereas when employing the prediction model based upon all 102 subjects, he was predicted to obtain TS-SCI access. Exactly the opposite prediction results occurred with the other subject. As noted earlier, this form of cross-validation has been referred to as a "folding-over technique" and is especially useful in multiple regression (or discriminant function) situations where N size is not overly large. It should be noted that both of the above described individuals, who changed predicted status when the prediction model included their own scores, had positions on the prediction scale that were most central in the midrange region of the scale continuum. Such results strongly support the contention that, with the limitation of only having employed a sample size of 102 studied subjects, very favorable cross-validation has been demonstrated.
Approximately a year or so later, the PSSPQ was administered to another independent sample of contractor personnel who were also in the process of being evaluated for potential TS-SCI clearance status. Basically, the demographic description of this group was somewhat similar to that noted for the initial validation group.
One difference was that the level of non-success in later being granted access was lower than was noted with the initial validation group. For the earlier group, it was roughly 40% (41 of 102), whereas with this second group (N = 179) it was quite a bit less, about 24%. When a multiple correlation coefficient (or simple discriminant function coefficient) was computed for the "best" 25 PSSPQ items and the success/failure determination regarding the granting of high-level security clearance status, it was found to be only slightly less than the numerical value of the original validity coefficient; the cross-validation coefficient was 0.74, as compared to the original validation coefficient of 0.79. In terms of predictive validity, the PSSPQ can be regarded as showing excellent cross-validation results.
Construct Validity
Factorial Validity
This factor analysis of the PSSPQ scales (which reflect the DCID 6/4 adjudication area concerns) clearly shows that "suitability-nonsuitability" for TS-SCI access is not a simple construct but is something which must be understood multidimensionally. A separate research investigation was completed a few years back that employed this PSSPQ data and was focused upon study of the DCID 1/14 (i.e., an earlier version of the now-current DCID 6/4) adjudication area concerns. This ‘other’ research employed a principal components factor analysis paradigm. In this other research, almost the same factorial type results were obtained. In the principal factors analysis, the principal factors Factor I was almost identical to the principal components Factor III; the principal factors Factor III was almost identical to the principal components Factor I; the principal factors Factor III was almost identical to the principal components Factor II; and the principal factors Factor IV bore some major similarity to the principal components Factor IV. Communalities in the principal components analysis were uniformly higher than in the principal factors analysis. In the generalized paradigm of factorial validity, the obtained factor loadings (actually, from two different factor analyses of basically the same data) can be considered to represent validity coefficients with respect to the obtained factors.
Although no real description will be attempted in the current presentation, factor analyses (i.e., using the principal components model, followed by varimax rotations) of the 10 PSSPQ scales (after omitting the LIE Scale and the single-item Loyalty Scale) have been accomplished. In other words, for each of the 10 scales, the items comprising the scale have been factor analyzed. As a result, the factor structure for each of the involved PSSPQ scales has been obtained.
In general, the results have proven to be readily interpretable and therefore have enlarged our understanding of the very specific nature of behaviors and misbehaviors associated with DCID 6/4 adjudication concerns.
Another area of PSSPQ research that has
been extensive, but which is not going to be described here, is that involving
the correlation of many PSSPQ scorings (e.g., items scores, scale scores,
factor scores, etc.) with other test scores (e.g., MMPI, verbal intelligence
measures, etc.) as well as with many biographic/demographic variables (e.g.,
age, gender, education levels, etc.). These many hundreds of correlations
have been entered into extensive matrices and have been subjected to a
number of multidimensional analyses (e.g., factor and principal components
analyses, canonical analyses, discriminant analyses, etc.). These research
endeavors have led to a very large number of understandings of what is
measured by the PSSPQ. It can be comfortably stated that many of these
understandings lend a good deal of further
support to construct validity arguments for the PSSPQ.
Some of this additional research, which has extensively explored
correlations among a great many variables (biographic, demographic, psychometric
testing variables, etc.), also adds credence to the factorial
validity arguments.
What About the Reliability of the Adjudication Decisions Pertaining to the Granting or Non-Granting of High-Level Security Clearance Status?
Readers of this presentation of PSSPQ development, which is mainly focused upon reliability/validity concerns, who truly have a psychometric background and who are not youngsters in the field, likely have some familiarity with the theory of measurement error (as found in Gulliksen, 1950 or Nunnally, 1967) and will quite easily understand what is argued in the following paragraphs. If one is willing to assume that, with any two sets of measures, errors from each set are uncorrelated between sets and that error on either set is uncorrelated with ‘true’ scores, then the measurement model for the correction for attenuation can be developed. Students of theories of measurement error are quite familiar with the fact that, from the classic formula for computing the correction for attenuation, another formula can be derived provided that one makes the assumption that the correlation between ‘true’ scores (i.e., without measurement error being involved) on the two variables involved is equal to unity. This particular formula is:
r12 = (√r11)(√r22)
In other words, the upper-limit size of the correlation between variables 1 and 2 (e.g., the PSSPQ and the decision to grant or not grant TS-SCI security clearance status) is equal to the product of the square root of the reliability of variable 1 and the square root of the reliability of variable 2. In this fashion, it can be seen that a correlation coefficient between any two variables is a function of the reliabilities of the involved variables.
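The derivation referred to in this paragraph can be written out as a short sketch. The symbol for the correlation between ‘true’ scores, written here as r with subscript T1T2, is notation introduced for illustration and does not appear in the original presentation.

```latex
\begin{align*}
% Classic correction for attenuation: the correlation between true scores
r_{T_1 T_2} &= \frac{r_{12}}{\sqrt{r_{11}}\,\sqrt{r_{22}}} \\
% Assume the true scores correlate perfectly (r_{T_1 T_2} = 1)
1 &= \frac{r_{12}}{\sqrt{r_{11}}\,\sqrt{r_{22}}}
\quad\Longrightarrow\quad
r_{12} = \sqrt{r_{11}}\,\sqrt{r_{22}} \\
% Because r_{T_1 T_2} cannot exceed 1, this value is a ceiling in general
r_{12} &\le \sqrt{r_{11}}\,\sqrt{r_{22}}
\end{align*}
```

The last line is what makes the product of the two square roots an upper limit: any true-score correlation below unity makes the observed correlation smaller than that product.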
One of the major uses of the above shown formula is that its elements can be rearranged so as to allow for the estimation of a reliability coefficient for one set of measurements if the correlation coefficient between the two measurement sets is known or established in some fashion and if the reliability coefficient for the other set of measurements has been established. Such is the situation with our PSSPQ data at the present time. A correlation coefficient of 0.79 (i.e., the multiple correlation coefficient) has been described as the relationship between a weighted sum of information associated with the so-called ‘best’ PSSPQ items and the final adjudication decision (made by the U.S. Government) to grant or not grant high-level security clearances. Also developed have been two different estimates of the reliability of the PSSPQ total score measure; i.e., 0.94 based on the test-retest paradigm and 0.97 based on the Kuder-Richardson internal consistency model. If one accepts that the value of 0.95 (approximately the average of the two reliability estimates) can represent a reliability coefficient for the PSSPQ, then the heretofore unknown and never-before estimated reliability for Government-made adjudication decisions regarding the granting or non-granting of TS-SCI access can now be known (or at least argued). When 0.95 (the PSSPQ’s averaged reliability estimation) and 0.79 (the multiple R involving 25 of the PSSPQ’s items with the final adjudication decision) are entered into the formula given above, the up-to-now unknown reliability for the adjudication decision variable (favorable or unfavorable) can be estimated. This reliability value, in the present situation, can be easily shown to be equal to 0.656, which is not an overly impressive reliability coefficient.
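The rearrangement described above can be checked with a few lines of arithmetic. This is a minimal sketch using the two figures given in the text (0.79 and 0.95); the function name is hypothetical. Solving r12 = (√r11)(√r22) for r22 gives r22 = r12² / r11.

```python
from math import sqrt


def implied_criterion_reliability(r12, r11):
    """Solve r12 = sqrt(r11) * sqrt(r22) for r22.

    r12: observed correlation between test and criterion.
    r11: reliability of the test.
    Assumes the 'true' scores correlate perfectly, as in the text.
    """
    return r12 ** 2 / r11


r12 = 0.79  # multiple R between the 25 items and the adjudication decision
r11 = 0.95  # PSSPQ reliability value used in the text
r22 = implied_criterion_reliability(r12, r11)

# Consistency check: the two square roots should multiply back to r12.
recovered_r12 = sqrt(r11) * sqrt(r22)
```

The computed value is about 0.657, which agrees with the reported 0.656 apart from rounding in the last digit. Using the exact mean of 0.94 and 0.97 (0.955) instead of 0.95 would give a slightly lower figure, about 0.654.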
However, for a measurement that is only dichotomous in kind, a reliability estimate of this size is not surprising to those who have studied judgmental reliability for diagnostic and prognostic evaluations regarding human beings. It can be noted that the estimated reliability coefficient of 0.656 can be thought of only as an ‘upper-limit’ value, as this estimate was obtained by having to assume that the correlation of ‘true’ PSSPQ scores with the ‘true’ adjudication decisions was perfect and equal to unity. Psychometrically speaking, it is very likely that the actual reliability of the Governmentally made adjudication decision is lower, and possibly much lower, than the 0.656 value presented and discussed here. The major conclusion which can be made, based upon the information and logic described in the preceding couple of paragraphs, is that if one wishes to increase the correlational prediction accuracy of the PSSPQ with respect to its ability to predict subsequent adjudication decisions, then it would be far more productive to attempt to increase the reliability of the adjudication decision-making than to attempt to change, modify, or add to the PSSPQ itself. All of the psychometric evidence points to a rather clear conclusion that the PSSPQ possesses excellent reliability. If there is a reliability problem in what is being measured, it is much more a problem with the Governmentally made adjudication decision variable. This is what has been observed frequently
in the past in the applied areas of psychology. When using well-constructed
psychometric tests in applied situations, it is also the rule and not the
exception, that the tests possess far superior reliability than do the
involved validity criterion measures themselves. A good example is the
correlational relationship found between extremely well developed
and constructed mental ability tests and school grades. Academic grading
is notorious for possessing very poor measurement reliability.
Another example that has been well discussed in the industrial psychology
literature relates to the poor reliability of the job interview type situation
(e.g., Guilford, 1959; Ulrich & Trumbo, 1965). The advice in this type
of situation is not that we need to improve psychological tests in order
to more accurately predict final adjudication decisions, but rather that what
is needed are better adjudication decision-making procedures which can
allow for improved reliability of such decision making. Readers of this
paper are urged to NOT tell or suggest to Dr. Stone (the developer and
current purveyor of the PSSPQ) that more research should be done so that
this testing instrument should be improved, added-to, or in some way changed.
If one wants to improve the accuracy of using PSSPQ scorings to predict
success/failure of individuals to be granted or not granted TS-SCI clearance
status, the way to increase the prediction accuracy is very simple – IMPROVE
THE RELIABILITY OF THE GOVERNMENTAL DECISION MAKING PROCEDURE FOR ADJUDICATING
DECISIONS TO GRANT OR NOT GRANT HIGH-LEVEL SECURITY CLEARANCE STATUS!!!
References
Cronbach, L. J. (1960). Essentials of psychological testing (2nd
ed.). New York: Harper &
Addendum
Based upon the content of the above Web pages,
along with the format design of the Web Site of which they are a component
part, it is expected that most readers of these particular pages got to
them using a link from one of the many subsections in Dr. Stone’s overall
Web Site that pertain to the PSSPQ. However, for those readers who got
to the above Web pages (counted by Microsoft Word as about 16
or 17 pages) in some fashion other than directly through a link on one
of Dr. Stone’s pages, some links to these ‘other pages’ within
Dr. Stone’s Web Site that pertain to the PSSPQ and its use are provided. Links to a
good number of these pages that involve presentation and discussion of
content that pertains to the PSSPQ test are as follows: