
2004 State Estimates of Substance Use
(from the 2003-2004 National Surveys on Drug Use & Health)


Appendix A: State Estimation Methodology

This report includes estimates of 22 substance use measures (see Section A.1) based on the combined data from the 2003 and 2004 National Surveys on Drug Use and Health (NSDUHs). In addition to the 21 substance use measures for which age group–specific State estimates were produced and documented in the 2003 State report (Wright & Sathe, 2005), this report introduces a new measure, past year nonmedical pain reliever use. Also included are estimates of change between the 2002-2003 and 2003-2004 State estimates. This report is similar to the 2000 and 2001 State reports (Wright, 2002a, 2002b, 2003a, 2003b), which contained age group–specific State estimates obtained by pooling 1999–2000 and 2000–2001 National Household Survey on Drug Abuse (NHSDA) data, respectively. The 2001 State report also contained estimates of change between the 1999–2000 and 2000–2001 data for the 12 common substance use measures. As discussed in Chapter 1 (Section 1.1), several changes were introduced to the survey in 2002; thus, estimates for 2001 and prior years are not comparable with estimates from 2002 and later years.

The survey-weighted hierarchical Bayes (SWHB) methodology used in the production of State estimates from the 1999–2003 surveys also was used in the production of the 2003-2004 State estimates. The SWHB methodology is described in Appendix E of the 2001 State report (Wright, 2003b) and by Folsom, Shah, and Vaish (1999). The list of predictors used in the 2003-2004 small area estimation (SAE) modeling is given in Section A.2. The methodology used to select relevant predictors remains similar to the one used in prior years and is described briefly in Section A.3. The goals of SAE modeling, the general model description, and the implementation of SAE modeling remain the same and are described in Appendix E of the 2001 State report (Wright, 2003b). At the end of this appendix, tables showing the 2002, 2003, 2004, pooled 2002-2003, and pooled 2003-2004 survey response rates are included (Tables A.1 to A.12).

Small area estimates obtained using the SWHB methodology are design consistent (i.e., for States with large sample sizes, the small area estimates are close to the robust design-based estimates). When the State small area estimates are aggregated using the appropriate population totals, the resulting national small area estimates are very close to the national design-based estimates. However, for numerous reasons (including internal consistency), it is desirable to have national small area estimates exactly match the national design-based estimates. Beginning in 2002, exact benchmarking was introduced as described in Section A.4. The definition and explanation of the formula used in estimating the marijuana incidence rate are given in Section A.5.

Included in this report for the first time are estimates of underage (ages 12 to 20) alcohol use and binge alcohol use. For all other outcomes, the age groups of interest were 12 to 17, 18 to 25, and 26 or older. Because alcohol consumption is expected to differ significantly within the 18 to 25 age group, given that alcohol use becomes legal at age 21, it was decided that it would be useful to produce small area estimates for persons aged 12 to 20. A short description of the methodology used to produce underage drinking estimates is given in Section A.6.

Section A.7 discusses how serious psychological distress (SPD) estimates were produced. The methodology used to produce estimates of change between the 2002-2003 and the 2003-2004 State estimates is described in Section A.8.

A.1 Variables Modeled

The 2004 NSDUH data were pooled with the 2003 NSDUH data, and age group–specific State estimates for 22 binary (0,1) outcome variables were produced and presented in this report. The estimates of change between the 2002-2003 and 2003-2004 State estimates also were produced for the following outcomes:

  1. past month use of any illicit drug,
  2. past year use of marijuana,
  3. past month use of marijuana,
  4. perception of great risk of smoking marijuana once a month,
  5. average annual rate of first use of marijuana,
  6. past month use of any illicit drug other than marijuana,
  7. past year use of cocaine,
  8. past year nonmedical use of pain relievers,
  9. past month use of alcohol,
  10. past month binge alcohol use,
  11. perception of great risk of having five or more drinks of an alcoholic beverage once or twice a week,
  12. past month use of any tobacco product,
  13. past month use of cigarettes,
  14. perception of great risk of smoking one or more packs of cigarettes per day,
  15. past year alcohol dependence or abuse,
  16. past year alcohol dependence,
  17. past year any illicit drug dependence or abuse,
  18. past year any illicit drug dependence,
  19. past year dependence on or abuse of any illicit drug or alcohol,
  20. needing but not receiving treatment for illicit drug problems in the past year,
  21. needing but not receiving treatment for alcohol problems in the past year, and
  22. past year serious psychological distress (SPD).

A.2 Predictors Used in Mixed Logistic Regression Models

Local area data used as potential predictor variables in the mixed logistic regression models were obtained from several sources, including Claritas Inc., the U.S. Bureau of the Census, the Federal Bureau of Investigation (FBI) (Uniform Crime Reports), the Health Resources and Services Administration (Area Resource File), the Substance Abuse and Mental Health Services Administration (SAMHSA) (National Survey of Substance Abuse Treatment Services [N-SSATS]), and the National Center for Health Statistics (mortality data). The major sources and the potential data items used in the modeling are provided in the following text and lists.

The following lists provide the specific independent variables that were potential predictors in the models.

Claritas Data
Description | Level
% Population aged 0–19 in block group | Block group
% Population aged 20–24 in block group | Block group
% Population aged 25–34 in block group | Block group
% Population aged 35–44 in block group | Block group
% Population aged 45–54 in block group | Block group
% Population aged 55–64 in block group | Block group
% Population aged 65+ in block group | Block group
% Blacks in block group | Block group
% Hispanics in block group | Block group
% Other race in block group | Block group
% Whites in block group | Block group
% Males in block group | Block group
% Females in block group | Block group
% American Indian, Eskimo, Aleut in tract | Tract
% Asian, Pacific Islander in tract | Tract
% Population aged 0–19 in tract | Tract
% Population aged 20–24 in tract | Tract
% Population aged 25–34 in tract | Tract
% Population aged 35–44 in tract | Tract
% Population aged 45–54 in tract | Tract
% Population aged 55–64 in tract | Tract
% Population aged 65+ in tract | Tract
% Blacks in tract | Tract
% Hispanics in tract | Tract
% Other race in tract | Tract
% Whites in tract | Tract
% Males in tract | Tract
% Females in tract | Tract
% Population aged 0–19 in county | County
% Population aged 20–24 in county | County
% Population aged 25–34 in county | County
% Population aged 35–44 in county | County
% Population aged 45–54 in county | County
% Population aged 55–64 in county | County
% Population aged 65+ in county | County
% Blacks in county | County
% Hispanics in county | County
% Other race in county | County
% Whites in county | County
% Males in county | County
% Females in county | County

2000 Census Data
Description | Level
% Population who dropped out of high school | Tract
% Housing units built in 1940–1949 | Tract
% Persons 16–64 with a work disability | Tract
% Hispanics who are Cuban | Tract
% Females 16 years or older in labor force | Tract
% Females never married | Tract
% Females separated/divorced/widowed/other | Tract
% One-person households | Tract
% Female head of household, no spouse, child ≤ 18 | Tract
% Males 16 years or older in labor force | Tract
% Males never married | Tract
% Males separated/divorced/widowed/other | Tract
% Housing units built in 1939 or earlier | Tract
Average persons per room | Tract
% Families below poverty level | Tract
% Households with public assistance income | Tract
% Housing units rented | Tract
% Population 9–12 years of school, no high school diploma | Tract
% Population 0–8 years of school | Tract
% Population with associate's degree | Tract
% Population some college and no degree | Tract
% Population with bachelor's, graduate, professional degree | Tract
Median rents for rental units | Tract
Median value of owner-occupied housing units | Tract
Median household income | Tract

Uniform Crime Report Data
Description | Level
Drug possession arrest rate | County
Drug sale/manufacture arrest rate | County
Drug violations' arrest rate | County
Marijuana possession arrest rate | County
Marijuana sale/manufacture arrest rate | County
Opium cocaine possession arrest rate | County
Opium cocaine sale/manufacture arrest rate | County
Other drug possession arrest rate | County
Other dangerous non-narcotics arrest rate | County
Serious crime arrest rate | County
Violent crime arrest rate | County
Driving under influence arrest rate | County

Other Categorical Data
Description | Source | Level
=1 if Hispanic, =0 otherwise | Sample | Person
=1 if non-Hispanic Black, =0 otherwise | Sample | Person
=1 if non-Hispanic Other, =0 otherwise | Sample | Person
=1 if male, =0 if female | Sample | Person
=1 if MSA with 1 million +, =0 otherwise | 2000 Census | County
=1 if MSA with <1 million, =0 otherwise | 2000 Census | County
=1 if Non-MSA Urban, =0 otherwise | 2000 Census | Tract
=1 if Urban Area, =0 if Rural Area | 2000 Census | Tract
=1 if no Cubans in tract, =0 otherwise | 2000 Census | Tract
=1 if no arrests for dangerous non-narcotics, =0 otherwise | UCR | County

Miscellaneous Data
Variable Description | Source | Level
Alcohol death rate, underlying cause | NCHS-ICD-10 | County
Cigarettes death rate, underlying cause | NCHS-ICD-10 | County
Drug death rate, underlying cause | NCHS-ICD-10 | County
Alcohol treatment rate | N-SSATS (formerly called UFDS) | County
Alcohol and drug treatment rate | N-SSATS (formerly called UFDS) | County
Drug treatment rate | N-SSATS (formerly called UFDS) | County
% Families below poverty level | ARF | County
Unemployment rate | ARF | County
Per capita income (in thousands) | ARF | County
Average suicide rate (per 10,000) | ARF | County
Food stamp participation rate | Census Bureau | County
Single state agency maintenance of effort | National Association of State Alcohol and Drug Abuse Directors (NASADAD) | State
Block grant awards | SAMHSA | State
Cost of Services Factor Index | SAMHSA | State
Total Taxable Resources Per Capita Index | U.S. Department of Treasury | State

A.3 Selection of Independent Variables for the Models

State estimates for past year nonmedical use of pain relievers (ANLYR) were not produced in prior years. Hence, to be consistent with the other outcomes, the fixed-effect predictors for ANLYR were selected using the pooled 2002-2003 NSDUH data. These fixed-effect predictors were selected based on the steps detailed in Section A.3 of Wright and Sathe (2005), and their updated versions were used to produce the 2003-2004 State estimates for ANLYR. For all the other outcome variables, no new variable selection was done: the updated versions of the fixed-effect predictors that were used in modeling the 2002-2003 data were used to model the 2003-2004 data. Because the interest was in estimating change between the 2002-2003 and 2003-2004 State estimates, the same set of fixed-effect predictors was used for producing both sets of estimates.

A.4 Benchmarking the Age Group–Specific Small Area Estimates

The self-calibration built into the SWHB solution ensures that the population-weighted average of the State small area estimates will closely match the national design-based estimates. Given the self-calibration ensured by the SWHB solution, for State reports prior to 2002, the standard Bayes prescription was followed; specifically, the posterior mean was used for the SAE point estimate, and the tail percentiles of the posterior distribution were used for the prediction interval limits.

Singh and Folsom (2001) extended Ghosh's (1992) results on constrained Bayes estimation to include exact benchmarking to design-based national estimates. In the simplest version of this constrained Bayes solution, where only the design-based mean is imposed as a benchmarking constraint, each of the State-by-age group small area estimates (for 2003-2004) is adjusted by adding the common factor $\delta_a = D_a - P_a$, where $D_a$ is the design-based national prevalence estimate and $P_a$ is the population-weighted mean of the State small area estimates ($P_{sa}$) for age group $a$. The exactly benchmarked small area estimates for State $s$ and age group $a$ then are given by $\theta_{sa} = P_{sa} + \delta_a$. Experience with such additive adjustments suggests that the resulting exactly benchmarked State small area estimates always will be between 0 and 100 percent because the SWHB self-calibration ensures that the adjustment factor is small relative to the size of the State-level small area estimates.
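The additive adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the State labels, and all numbers are hypothetical, and the sketch covers only the simplest mean-constrained case, not the full constrained Bayes procedure of Singh and Folsom (2001).

```python
from typing import Dict


def benchmark_additive(state_estimates: Dict[str, float],
                       state_populations: Dict[str, float],
                       national_design_estimate: float) -> Dict[str, float]:
    """Add the common factor delta_a = D_a - P_a to every State estimate."""
    total_pop = sum(state_populations[s] for s in state_estimates)
    # P_a: population-weighted mean of the State small area estimates.
    p_a = sum(state_estimates[s] * state_populations[s]
              for s in state_estimates) / total_pop
    # delta_a: gap between the design-based national estimate and P_a.
    delta_a = national_design_estimate - p_a
    return {s: p_sa + delta_a for s, p_sa in state_estimates.items()}


# Hypothetical inputs for one age group (proportions, not percentages).
estimates = {"State A": 0.080, "State B": 0.100, "State C": 0.065}
populations = {"State A": 2_000_000, "State B": 5_000_000, "State C": 1_000_000}
benchmarked = benchmark_additive(estimates, populations,
                                 national_design_estimate=0.092)
```

After the shift, the population-weighted mean of the benchmarked values equals the national design-based figure, which is the property that exact benchmarking is meant to deliver.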

Relative to the Bayes posterior mean, these benchmark-constrained State small area estimates are biased by the common additive adjustment factor. Therefore, the posterior mean-squared error for each benchmarked State small area estimate has the square of this adjustment factor added to its posterior variance. To achieve the desirable feature of exact benchmarking, this constrained Bayes adjustment factor was implemented for the State-by-age group small area estimates. The associated credible intervals can be recentered at the benchmarked small area estimates on the logit scale with the symmetric interval end points based on the posterior root mean-squared errors. The adjusted 95 percent prediction intervals (PIs) $(Lower_{sa}, Upper_{sa})$ are defined below:

$$Lower_{sa} = \frac{\exp(L_{sa})}{1 + \exp(L_{sa})} \quad \text{and} \quad Upper_{sa} = \frac{\exp(U_{sa})}{1 + \exp(U_{sa})},$$

where

$$L_{sa} = \log\left[\frac{\theta_{sa}}{1 - \theta_{sa}}\right] - 1.96\sqrt{MSE_{sa}},$$

$$U_{sa} = \log\left[\frac{\theta_{sa}}{1 - \theta_{sa}}\right] + 1.96\sqrt{MSE_{sa}}, \text{ and}$$

$$MSE_{sa} = \left(\log\left[\frac{P_{sa}}{1 - P_{sa}}\right] - \log\left[\frac{\theta_{sa}}{1 - \theta_{sa}}\right]\right)^2 + \text{posterior variance of } \log\left[\frac{P_{sa}}{1 - P_{sa}}\right].$$

The associated posterior coverage probabilities for these benchmarked intervals are very close to the prescribed 0.95 value because the State small area estimates have posterior distributions that can be approximated exceptionally well by a Gaussian distribution.
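As a rough illustration of the interval formulas in this section, the following Python sketch computes the adjusted limits for a single State-by-age group cell. The function names and the input values are hypothetical; the posterior variance of the logit would come from the SWHB model output, which is not reproduced here.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    """Inverse logit; equal to exp(x) / (1 + exp(x))."""
    return 1.0 / (1.0 + math.exp(-x))


def benchmarked_interval(p_sa: float, theta_sa: float,
                         post_var_logit: float, z: float = 1.96):
    """Return (lower, upper) limits recentered at the benchmarked estimate.

    p_sa           -- posterior mean before benchmarking
    theta_sa       -- exactly benchmarked estimate
    post_var_logit -- posterior variance of logit(p_sa) (assumed given)
    """
    # MSE on the logit scale: squared shift plus posterior variance.
    mse = (logit(p_sa) - logit(theta_sa)) ** 2 + post_var_logit
    half_width = z * math.sqrt(mse)
    lower = expit(logit(theta_sa) - half_width)
    upper = expit(logit(theta_sa) + half_width)
    return lower, upper


# Hypothetical cell: unbenchmarked 10.0%, benchmarked 10.1%.
lower, upper = benchmarked_interval(0.100, 0.101, post_var_logit=0.04)
```

Because the interval is symmetric on the logit scale, the back-transformed limits are generally not symmetric around the benchmarked estimate on the probability scale.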

A.5 Calculation of Average Annual Incidence of Marijuana Use

Incidence rates typically are calculated as the number of new initiates of a substance during a period of time (such as in the past year) divided by an estimate of the number of person years of exposure (in thousands). The incidence definition used in this report employs a simpler form of the at-risk-population based on the model-based methodology. This model-based average annual incidence rate is defined as follows:

Average annual incidence rate = {(Number of marijuana initiates in past 24 months) /
[(Number of marijuana initiates in past 24 months * 0.5) +
Number of persons who never used marijuana
]} / 2.

In this report, the incidence rate is expressed as a percentage or rate per 100 person years of exposure. Note that this estimate uses a 2–year time period to accumulate incidence cases from each annual survey. By assuming further that the distribution of first use for the incidence cases is uniform across the 2–year interval, the total number of person years of exposure is 1 year on average for the incidence cases plus 2 years for all the "never users" at the end of the time period. This approximation to the person years of exposure permits one to recast the incidence rate as a function of two population prevalence rates, namely, the fraction of persons who first used marijuana in the past 2 years and the fraction who had never used marijuana. Both of these prevalence estimates were estimated using the SWHB estimation approach.
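The person-years argument above can be checked with a short Python sketch. The function names and the counts are hypothetical; the second function simply restates the same rate in terms of the two population fractions, since the common population size cancels.

```python
def average_annual_incidence(initiates_24mo: float, never_used: float) -> float:
    """Average annual incidence rate as a proportion per person year."""
    # Half-weighting the initiates and then dividing by 2 is the same as
    # counting 1 person year per initiate and 2 per never user.
    return (initiates_24mo / (0.5 * initiates_24mo + never_used)) / 2.0


def incidence_from_fractions(frac_initiated: float, frac_never: float) -> float:
    """Same rate written with population fractions instead of counts."""
    return frac_initiated / (frac_initiated + 2.0 * frac_never)


# Hypothetical counts: 60,000 initiates in the past 24 months,
# 700,000 persons who never used.
rate = average_annual_incidence(60_000, 700_000)
rate_percent = 100.0 * rate  # expressed per 100 person years of exposure
```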

The count of persons who first used marijuana in the past 2 years is based on a "moving" 2–year period that ranges over 3 calendar years. Subjects were asked when they first used marijuana. If a person indicated first use of marijuana between the day of the interview and 2 years prior, the person was included in the count. Thus, it is possible for a person interviewed in the first part of 2004 to indicate first use as early as the first part of 2002 or as late as the first part of 2004. Similarly, a subject interviewed in the last part of 2004 could indicate first use as early as the last part of 2002 or as late as the last part of 2004. Therefore, in the 2004 survey, the reported period of first use ranged from early 2002 to late 2004 and was "centered" in 2003. About half of the 12 to 17 year olds who reported first use in the past 24 months reported first use in 2003, while a quarter each reported first use in 2002 and 2004. Persons who responded in 2004 that they had never used marijuana were included in the count of "never used." Similarly, reports of first use in the past 24 months from the 2003 survey ranged from early 2001 to late 2003 and were centered in 2002. About half of the 12 to 17 year olds who reported first use in the past 24 months reported first use in 2002, while a quarter each reported first use in 2001 and 2003. Note that only incidence rates for marijuana use are provided in this report.
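The half-and-quarter split described above is what one would expect if first use is uniform within each respondent's 2-year window and interviews are spread evenly across the survey year. The second condition is an assumption made only for this illustration, and the function names are hypothetical. A minimal Python sketch averages the calendar-year overlap of the moving window over interview dates:

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def expected_year_shares(survey_year: int, n_grid: int = 1000,
                         window_years: float = 2.0) -> dict:
    """Expected share of past-24-month initiates by calendar year of first use."""
    shares = {}
    for i in range(n_grid):
        # Interview date as a decimal year, evenly spaced over the survey year.
        interview = survey_year + (i + 0.5) / n_grid
        window_start = interview - window_years
        for year in range(survey_year - 2, survey_year + 1):
            # Fraction of this respondent's window that falls in `year`.
            frac = overlap(window_start, interview, year, year + 1) / window_years
            shares[year] = shares.get(year, 0.0) + frac / n_grid
    return shares


shares_2004 = expected_year_shares(2004)
```

Under these assumptions the middle calendar year receives about one half of the initiates and each outer year about one quarter.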

A.6 Underage Drinking

To obtain small area estimates for persons aged 12 to 20 for past month alcohol and binge alcohol use, a separate set of models was fit for these two outcomes for the 12 to 17 age group and the 18 to 20 age group. New variable selection (using the same methodology as described in Section A.3) was done for the 18 to 20 age group. Even though separate models were fit for the 12 to 17 age group along with the 18 to 20 year olds, no new variable selection was done for the 12 to 17 age group. Model-based estimates for persons aged 12 to 20 were produced by taking the population-weighted average of the individual age group (12 to 17 and 18 to 20) estimates. Estimates for underage drinking for past month alcohol and binge alcohol use were benchmarked to match national design-based estimates for that age group using the process described in Section A.4. Estimates of change between 2002-2003 and 2003-2004 underage drinking State estimates also are presented in this report.
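The combination step for the 12 to 20 age group is a population-weighted average of the two component estimates. A minimal Python sketch, with a hypothetical function name and made-up inputs, is shown here; benchmarking to the national design-based estimate would be applied afterward as a separate step.

```python
def combine_age_groups(est_12_17: float, pop_12_17: float,
                       est_18_20: float, pop_18_20: float) -> float:
    """Population-weighted average of the 12-17 and 18-20 estimates."""
    total_pop = pop_12_17 + pop_18_20
    return (est_12_17 * pop_12_17 + est_18_20 * pop_18_20) / total_pop


# Hypothetical State: 17% past month alcohol use among 500,000 persons
# aged 12 to 17 and 50% among 250,000 persons aged 18 to 20.
underage_estimate = combine_age_groups(0.17, 500_000, 0.50, 250_000)
```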

A.7 Serious Psychological Distress

In 2004, SPD was measured using the K6 screening instrument for nonspecific psychological distress (Furukawa, Kessler, Slade, & Andrews, 2003; Kessler et al., 2003). In previous NSDUH reports, the K6 scale was referred to as a measure of serious mental illness (SMI). SMI was first measured by the National Household Survey on Drug Abuse (NHSDA) in 2001 for all persons aged 18 or older. SAMHSA's official definition of adults with SMI, based on a notice published in the Federal Register (SAMHSA, Center for Mental Health Services, 1993), is as follows:

Pursuant to section 1912(c) of the Public Health Service Act, adults with serious mental illness (SMI) are persons: (1) age 18 and over and (2) who currently have, or at any time during the past year, had a diagnosable mental, behavioral, or emotional disorder of sufficient duration to meet diagnostic criteria specified within DSM-IV or their ICD-9-CM equivalent (and subsequent revisions) with the exception of DSM-IV "V" codes, substance use disorders, and developmental disorders, which are excluded, unless they co-occur with another diagnosable serious mental illness. (3) That has resulted in functional impairment which substantially interferes with or limits one or more major life activities.

In prior NSDUH reports, the K6 scale was used to measure SMI according to the above definition. The K6 consists of six questions that ask respondents how frequently they experienced symptoms of psychological distress during the 1 month in the past year when they were at their worst emotionally. The use of this scale for SMI was based on a methodological study designed to evaluate several screening scales for measuring SMI in NSDUH. These scales consisted of a truncated version of the World Health Organization (WHO) Composite International Diagnostic Interview Short Form (CIDI-SF) scale (Kessler, Andrews, Mroczek, Üstün, & Wittchen, 1998), the K10/K6 scale of nonspecific psychological distress (Furukawa et al., 2003), and the WHO Disability Assessment Schedule (WHO-DAS) (Rehm et al., 1999).

In the 2003 NSDUH, the mental health module contained a truncated version of the CIDI-SF scale, the K10/K6 scale, and the WHO-DAS scale to mirror the questions used by Kessler et al. (2003). Thus, the module contained a broad array of questions about mental health (i.e., panic attacks, depression, mania, phobias, generalized anxiety, posttraumatic stress disorder, and use of mental health services) that preceded the K6 items, and the four extra questions in the K10 scale were interspersed among the items in the K6 scale. To create a score, the responses to the six items on the K6 scale were each coded from 0 to 4. Summing across the six responses resulted in a score with a range from 0 to 24. Respondents with a total score of 13 or greater were classified as having a past year SMI. This cutpoint was chosen to equalize false positives and false negatives.
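The scoring rule just described (six items coded 0 to 4, summed to a 0 to 24 total, with a cutpoint of 13) can be sketched as follows. The function and constant names are hypothetical, and the sketch does not cover how individual survey responses are mapped to the 0 to 4 codes.

```python
from typing import Sequence

K6_CUTPOINT = 13  # total score at or above this value meets the cutpoint


def k6_total(item_scores: Sequence[int]) -> int:
    """Sum six K6 item codes, each expected to be an integer from 0 to 4."""
    if len(item_scores) != 6:
        raise ValueError("the K6 has exactly six items")
    if any(score not in range(5) for score in item_scores):
        raise ValueError("each item must be coded 0, 1, 2, 3, or 4")
    return sum(item_scores)


def meets_cutpoint(item_scores: Sequence[int]) -> bool:
    """True when the K6 total is 13 or greater."""
    return k6_total(item_scores) >= K6_CUTPOINT
```

For example, `meets_cutpoint([2, 2, 2, 2, 2, 3])` evaluates to `True` because the total is 13, while six responses of 2 give a total of 12 and do not meet the cutpoint.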

In the 2004 NSDUH, however, the sample of respondents aged 18 or older was split evenly between the "long-form" module, which included all items in the mental health module used in the 2003 NSDUH (sample A), and a "short-form" module consisting only of the K6 items (sample B). The short-form version was introduced to reduce interview time, removing questions that were not needed for estimation of SMI, and to provide space for a new module on depression. Inclusion of the long-form version in half of the sample was to measure the impact on the K6 responses of changing the context of the K6.

Results from the 2004 NSDUH showed large differences at the national level between the two samples in both the K6 total score and the proportion of respondents with a K6 total score of 13 or greater. These differences were most pronounced in the 18 to 25 age group. These differences suggested that the K6 scale was not context-independent; that is, respondents appeared to respond to the K6 items differently depending on whether the scale was preceded by a broad array of other mental health questions. There were other concerns as well. For example, the face validity of the K6 scale suggests that it may be more useful as a measure of psychological distress or of affective-mood and anxiety-type disorders. A direct consequence of these concerns was that a decision was made that the K6 would no longer be used to measure SMI. However, the K6 data are still useful as an indicator of psychological distress (see Section B.4.4 of OAS, 2005c).

The 2004 national SPD estimates therefore were based only on data from sample A (respondents who got the long-form module). For the purpose of producing State-level estimates for this report, however, an adjusted measure of SPD, which was produced for the entire sample of respondents aged 18 or older in 2004, was used, and SMI data from 2003 were pooled with data using this adjusted measure of SPD from 2004. The adjustment made to the SPD score on the short-form module is described in brief here.

A logistic regression model was used to estimate differences between the short- and long-form SPD prevalence rates (i.e., propensities). Several demographic and drug use covariates were included in the model, and it was found that the propensities varied according to race/ethnicity and age group. Five propensity strata based on race/ethnicity and age group were constructed from the results of this analysis. Tests suggested that a gross adjustment approach might be more appropriate than an item-based adjustment approach. As a consequence, the cumulative distribution function (CDF) (gross) adjustment method was applied within each of the five propensity strata, and the method appeared to work quite well in adjusting the marginal estimates of a number of important demographic and drug use variables. This method also was shown to be fairly robust to the way the propensity strata were defined.

Consideration also was given to the use of this logistic regression model to provide adjustments, as well as to provide propensity estimates. However, although this approach was useful for estimating propensities, it was not useful in determining how to adjust individual short-form respondents' SPD prevalence rates to match those of long-form respondents within covariate profiles. Using this approach, the only way to match prevalence rates would be to use long-form prevalence estimates in place of short-form prevalence estimates within covariate profiles. This is equivalent to discarding all short-form data after the logistic regression model has been fitted. A similar argument applies to the use of polytomous logistic regression models to estimate differences between short- and long-form SPD scores.

Before the CDF adjustment method was developed, consideration also was given to ad hoc adjustments to differences between short- and long-form SPD scores within covariate profiles, estimated from, say, polytomous regression models. For example, if the average difference between short- and long-form SPD scores for a particular covariate profile (e.g., white females aged 12 to 17 in the West) was 1.7, then all short-form SPD scores would be reduced by that amount in the profile. However, there are a couple of problems with this ad hoc approach. First, this approach is equivalent to shifting the entire distribution of short-form scores to the left, creating a set of adjusted values ranging from –1.7 to 22.3 instead of 0 to 24. Second, although this approach might force SPD scores to match on average within a profile, there is no guarantee that they would match at the SPD cutpoint of 13, which defines prevalence rates. A variation to this approach would be to multiply short-form scores by a factor that forced the scores to match on average, but this is equivalent to rescaling the short-form distribution so that all scores are shrunk toward zero. Neither of these ad hoc methods was optimal. Hence, the CDF adjustment method was used to transform the distribution of scores obtained from the short-form module to match that of the long-form module such that the distributional properties of the SPD scores from the short-form module matched the distributional properties of the SPD scores from the long-form module, without the scores matching exactly.
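To make the idea of matching distributions rather than shifting or rescaling scores more concrete, the following Python sketch shows a generic empirical quantile-matching step. It is not the procedure documented by Aldworth et al. (2005): it ignores survey weights and propensity strata, the function names are hypothetical, and the tie-handling rule (map to the smallest long-form score whose cumulative share reaches that of the short-form score) is an assumption made for illustration.

```python
from bisect import bisect_left
from collections import Counter
from typing import List, Sequence


def empirical_cdf(scores: Sequence[int], max_score: int = 24) -> List[float]:
    """Cumulative share of respondents at or below each score 0..max_score."""
    counts = Counter(scores)
    n = len(scores)
    cdf, running = [], 0
    for value in range(max_score + 1):
        running += counts.get(value, 0)
        cdf.append(running / n)
    return cdf


def cdf_match(short_scores: Sequence[int], long_scores: Sequence[int],
              max_score: int = 24) -> List[int]:
    """Map each short-form score to a long-form score at the same quantile."""
    short_cdf = empirical_cdf(short_scores, max_score)
    long_cdf = empirical_cdf(long_scores, max_score)
    adjusted = []
    for score in short_scores:
        target = short_cdf[score]
        # First long-form score whose cumulative share is >= target.
        adjusted.append(bisect_left(long_cdf, target))
    return adjusted


# Tiny made-up samples within a single stratum.
adjusted = cdf_match([3, 5, 9, 14, 20], [1, 4, 7, 12, 18])
```

Unlike a constant shift, this kind of mapping keeps every adjusted score within the original 0 to 24 range and aligns the share of respondents at or above any chosen cutpoint, including 13, up to the granularity of the integer scores.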

Adjusted short-form SPD scores and prevalence rates (based on the CDF adjustment method) were not used to derive national estimates for the 2004 survey. National estimates used a much finer categorization for some of the demographic and substance use variables than were used in the creation of the adjusted SPD outcome, and at these finer categorizations some notable discrepancies were observed between adjusted short-form and corresponding long-form prevalence rates. For this reason, national estimates of SPD scores and prevalence rates were derived from only long-form data.

Adjusted short-form SPD scores and prevalence rates were used to derive State-level estimates based on pooled 2003 and 2004 survey data. Because State-level estimates used a much coarser categorization of demographic and substance use variables than national estimates, the problem of discrepancies observed at the finer categorization of national estimates did not occur. In addition, unlike national estimates, which were based on large sample sizes, State-level estimates were typically based on small sample sizes. Hence, it was necessary to use all the data available, including the adjusted short-form data. For details on how the CDF adjustment was implemented, see Aldworth, Chromy, Foster, Heller, and Novak (2005).

A.8 Measuring Change in State Estimates between 2002-2003 and 2003-2004

The estimates of change between State estimates displayed in Appendix C are based on the 2002 through 2004 NSDUHs. The State estimates for 2002-2003 are the previously published model-based small area estimates (see Wright & Sathe, 2005). The State estimates for 2003-2004 are the small area estimates given in Appendix B. The moving average State prevalence estimates for the overlapping 2002-2003 and 2003-2004 time periods were obtained from independent applications of RTI's SWHB methodology; that is, the 2003-2004 models were fit independently of the previously fitted 2002-2003 models. This independent analysis approach was followed because there was no desire to revise the previously published estimates. Moreover, the same fixed predictor variables were used in the 2002-2003 and 2003-2004 models, but annual updates were made when more current versions became available. The age group–specific fixed predictor variables were defined at five levels (namely, person-level, 2000 decennial census block group-level, tract-level, county-level, and State-level). Also, each age group model had 51 State-level random effects and 300 substate region–level random effects.

To estimate change in State estimates, let image representing pisa(1) and image representing pisa(2) denote 2002-2003 and 2003-2004 prevalence rates, respectively, for State-s and age group-a. The change between image representing pisa(1) and image representing pisa(2) is defined in terms of the log-odds ratio (lorsa) as opposed to the simple difference because the posterior distribution of the lorsa is closer to Gaussian than the posterior distribution of the simple difference (image representing pisa(2)image representing pisa(1)). The lorsa is defined as

Equation A-1.     D

The p value given in the Appendix C tables is computed to test the null hypothesis of no change (i.e., image representing pisa(2) = image representing pisa(1) or equivalently lorsa = 0. An estimate of lorsa is given by

Equation A-2,     D

where the psa(1) are previously published 2002-2003 State estimates and the psa(2) are the 2003-2004 State estimates presented in this report (see Appendix B). To compute the variance of The estimate of the log-odds ratio, lor hat, sub s and a, i.e., Variance v of the estimate of the log-odds ratio, lor hat, sub s and a, let Theta 1 hat is defined as the ratio of p 1 sub s and a and 1 minus p 1 sub s and a and Theta 2 hat is defined as the ratio of p 2 sub s and a and 1 minus  p 2 sub s and a, then

Equation A-7,     D

where covariance between the logarithm of Theta 1 hat and the logarithm of Theta 2 hat. denotes the covariance between logarithm of Theta 1 hat and logarithm of Theta 2 hat. This covariance is defined in terms of the associated correlation as follows:

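Equation A-11:     $$\mathrm{cov}(\log\hat{\theta}_1, \log\hat{\theta}_2) = \mathrm{corr}(\log\hat{\theta}_1, \log\hat{\theta}_2)\,\sqrt{\mathrm{var}(\log\hat{\theta}_1)\,\mathrm{var}(\log\hat{\theta}_2)}.$$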
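The variance calculation can be sketched as follows, combining the variance of a difference of two correlated quantities with the covariance expressed through the correlation. The function name `variance_of_lor` and all input values are hypothetical; in the report, the two variances come from the published prediction-interval calculations and the correlation from the simultaneous model.

```python
import math


def variance_of_lor(var_log_theta1: float, var_log_theta2: float, corr: float) -> float:
    """Variance of the estimated log-odds ratio.

    var_log_theta1, var_log_theta2: variances of the two log-odds estimates.
    corr: correlation between the two log-odds estimates.
    """
    if var_log_theta1 < 0.0 or var_log_theta2 < 0.0:
        raise ValueError("variances must be non-negative")
    if not -1.0 <= corr <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    # Covariance written in terms of the correlation.
    cov = corr * math.sqrt(var_log_theta1 * var_log_theta2)
    # Variance of (log theta2 - log theta1).
    return var_log_theta1 + var_log_theta2 - 2.0 * cov


# Hypothetical inputs (illustrative only).
v = variance_of_lor(var_log_theta1=0.04, var_log_theta2=0.04, corr=0.5)
print(f"variance of the estimated log-odds ratio: {v:.4f}")
```

Because the 2002-2003 and 2003-2004 periods overlap, a positive correlation is plausible, and a positive correlation makes the variance of the change smaller than the sum of the two separate variances.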

Note that $\mathrm{var}(\log\hat{\theta}_1)$ and $\mathrm{var}(\log\hat{\theta}_2)$, used here to calculate $v(\widehat{\mathrm{lor}}_{sa})$, are the same variances used in calculating the previously published 2002-2003 prediction intervals (PIs) and the 2003-2004 PIs given in this report, respectively.

The correlation between $\log\hat{\theta}_1$ and $\log\hat{\theta}_2$ was obtained by simultaneously modeling the 2002, 2003, and 2004 NSDUH data. This simultaneous modeling approach was adopted based on the results of the validation study (see Appendix E, Section E.2, of Wright, 2003b) conducted for measuring change between the 1999–2000 and 2000–2001 State estimates. For this simultaneous model, four age groups by 3 years (i.e., 12 subpopulation-specific models) were fitted, each with its own set of fixed and random effects. In this case, the general covariance matrices for the State and substate random effects were 12 by 12 matrices corresponding to the 12-element (age group by year) vectors of random effects. Note that the survey-weighted Bernoulli-type log likelihood employed in the SWHB methodology was appropriate for this simultaneous model because the 12 age group by year subpopulations were nonoverlapping. The correlation $\mathrm{corr}(\log\hat{\theta}_1, \log\hat{\theta}_2)$ was approximated by the correlation calculated using the posterior distributions of $\log[\pi_{sa}^{(1)} / (1 - \pi_{sa}^{(1)})]$ and $\log[\pi_{sa}^{(2)} / (1 - \pi_{sa}^{(2)})]$ from the simultaneous model.
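The final approximation step can be illustrated with a short sketch: given paired posterior draws of the two prevalence rates, the correlation is computed on the log-odds scale. The draws below are simulated stand-ins generated from a shared random component, not output from the SWHB simultaneous model, and the helper names are hypothetical.

```python
import math
import random


def logit(p: float) -> float:
    """Log-odds of a proportion in (0, 1)."""
    return math.log(p / (1.0 - p))


def pearson(x, y) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_of_logits(draws1, draws2) -> float:
    """Correlation between log-odds of paired posterior draws."""
    return pearson([logit(p) for p in draws1], [logit(p) for p in draws2])


# Simulated stand-in for paired posterior draws (assumption: a shared
# component induces positive correlation between the two periods).
rng = random.Random(0)
shared = [rng.gauss(0.0, 0.15) for _ in range(2000)]
draws1 = [1.0 / (1.0 + math.exp(-(-2.2 + s + rng.gauss(0.0, 0.1)))) for s in shared]
draws2 = [1.0 / (1.0 + math.exp(-(-2.0 + s + rng.gauss(0.0, 0.1)))) for s in shared]

corr = correlation_of_logits(draws1, draws2)
print(f"correlation on the log-odds scale: {corr:.3f}")
```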

To calculate the p value for testing the null hypothesis of no change ($\mathrm{lor}_{sa} = 0$), it was assumed that $\widehat{\mathrm{lor}}_{sa}$ follows a normal distribution with mean zero and variance $v(\widehat{\mathrm{lor}}_{sa})$. Then, the p value = $P[Z \ge \mathrm{abs}(z)]$, where $Z$ is a standard normal random variate, $z = \widehat{\mathrm{lor}}_{sa} / \sqrt{v(\widehat{\mathrm{lor}}_{sa})}$, and $\mathrm{abs}(z)$ denotes the absolute value of $z$.
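A minimal sketch of this calculation, using only the Python standard library, is shown below. It implements the tail probability exactly as written above, P[Z ≥ abs(z)]; a conventional two-sided test would double this quantity, which is an observation made here rather than something the text states. The function names and input values are hypothetical.

```python
import math


def standard_normal_upper_tail(z: float) -> float:
    """P[Z >= z] for a standard normal variate Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p_value_no_change(lor_hat: float, variance: float) -> float:
    """Tail probability P[Z >= abs(z)] with z = lor_hat / sqrt(variance)."""
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    z = lor_hat / math.sqrt(variance)
    return standard_normal_upper_tail(abs(z))


# Hypothetical inputs (illustrative only).
p = p_value_no_change(lor_hat=0.2048, variance=0.04)
print(f"p value: {p:.4f}")
```

Because the statistic enters only through its absolute value, the result does not depend on whether the change is an increase or a decrease.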

Table A.1 Sample Sizes, Weighted Screening and Interview Response Rates, and Population Estimates, by State, for Persons Aged 12 or Older: 2002
State | Total Selected DUs | Total Eligible DUs | Total Completed Screeners | Weighted DU Screening Response Rate | Total Selected | Total Responded | Population Estimate | Weighted Interview Response Rate | Weighted Overall Response Rate
Overall 178,013 150,162 136,349 90.72% 80,581 68,126 235,143,245 78.56% 71.27%
Alabama 2,403 2,028 1,852 91.31% 1,103 960 3,686,602 81.85% 74.74%
Alaska 2,408 1,898 1,751 92.13% 1,067 915 496,025 82.05% 75.59%
Arizona 2,346 1,908 1,770 92.66% 1,078 924 4,361,020 79.66% 73.81%
Arkansas 2,540 2,102 2,005 95.28% 1,054 877 2,216,033 76.09% 72.50%
California 8,425 7,601 6,816 89.60% 4,363 3,599 28,231,483 74.93% 67.14%
Colorado 2,099 1,827 1,664 91.01% 1,087 914 3,655,496 81.67% 74.32%
Connecticut 2,718 2,440 2,227 91.44% 1,188 977 2,827,588 76.73% 70.16%
Delaware 2,585 2,116 1,908 89.64% 1,159 964 665,926 78.55% 70.42%
District of Columbia 3,701 3,100 2,608 84.08% 979 864 482,635 84.79% 71.29%
Florida 10,742 8,622 7,723 89.47% 4,340 3,653 13,832,088 77.23% 69.10%
Georgia 2,206 1,896 1,660 87.50% 1,066 897 6,842,168 77.76% 68.04%
Hawaii 2,276 1,942 1,759 90.38% 1,111 925 962,485 76.50% 69.14%
Idaho 2,033 1,634 1,515 92.80% 1,052 907 1,074,515 82.81% 76.86%
Illinois 9,263 8,181 6,986 85.45% 4,613 3,729 10,258,735 75.32% 64.36%
Indiana 2,261 1,961 1,856 94.61% 1,123 945 5,019,711 77.60% 73.42%
Iowa 2,252 1,939 1,835 94.68% 1,028 894 2,440,614 84.42% 79.93%
Kansas 1,933 1,683 1,579 93.86% 1,041 898 2,202,285 81.96% 76.92%
Kentucky 2,641 2,273 2,155 94.79% 1,098 909 3,395,143 79.55% 75.41%
Louisiana 2,189 1,816 1,701 93.64% 1,070 930 3,607,669 84.44% 79.07%
Maine 2,828 2,290 2,082 90.85% 1,017 906 1,104,764 87.35% 79.36%
Maryland 1,984 1,801 1,610 89.42% 1,039 919 4,449,299 81.71% 73.07%
Massachusetts 2,567 2,216 1,930 86.95% 1,142 916 5,387,071 71.93% 62.55%
Michigan 9,820 8,073 7,414 91.75% 4,432 3,792 8,255,399 81.82% 75.06%
Minnesota 2,173 1,895 1,765 93.09% 996 873 4,154,504 83.23% 77.48%
Mississippi¹ 2,261 1,750 1,508 86.58% 988 839 2,307,320 77.37% 66.99%
Missouri 2,725 2,236 2,098 93.87% 1,039 890 4,656,459 82.05% 77.02%
Montana 2,772 2,174 2,057 94.64% 1,075 914 759,543 81.98% 77.58%
Nebraska 1,954 1,746 1,652 94.59% 1,042 891 1,411,983 82.01% 77.57%
Nevada¹ 2,534 2,069 1,956 94.67% 1,147 954 1,742,004 73.54% 69.62%
New Hampshire 2,597 2,154 1,966 91.27% 1,092 910 1,065,165 78.10% 71.28%
New Jersey 2,554 2,290 2,042 89.28% 1,065 854 7,075,581 74.61% 66.61%
New Mexico¹ 1,950 1,586 1,236 77.38% 794 674 1,500,281 81.83% 63.32%
New York 10,480 9,032 7,516 83.31% 4,615 3,716 15,882,822 73.14% 60.94%
North Carolina 2,289 1,940 1,792 92.57% 1,046 902 6,726,205 80.99% 74.98%
North Dakota 2,307 1,873 1,770 94.52% 1,011 913 527,574 84.91% 80.26%
Ohio 9,194 7,970 7,476 93.76% 4,221 3,554 9,369,125 78.58% 73.68%
Oklahoma 2,300 1,932 1,791 92.64% 1,100 922 2,822,615 78.63% 72.84%
Oregon 2,456 2,158 2,019 93.43% 1,071 917 2,916,974 80.74% 75.44%
Pennsylvania 10,104 8,482 7,710 90.86% 4,251 3,606 10,298,942 79.56% 72.29%
Rhode Island 2,458 2,117 1,883 89.14% 1,107 925 896,699 74.12% 66.07%
South Carolina 2,332 1,824 1,729 94.77% 1,091 913 3,371,646 80.90% 76.67%
South Dakota 2,053 1,717 1,632 95.03% 1,013 914 619,768 86.83% 82.52%
Tennessee 2,732 2,357 2,212 92.82% 1,057 920 4,766,688 83.26% 77.28%
Texas 7,730 6,408 5,960 93.05% 4,212 3,649 17,207,615 82.73% 76.98%
Utah 1,487 1,336 1,264 94.52% 990 889 1,807,003 84.94% 80.29%
Vermont 2,410 1,914 1,803 94.36% 1,013 896 525,061 88.02% 83.06%
Virginia 2,426 2,104 1,873 89.03% 1,069 884 5,862,299 75.20% 66.95%
Washington 2,454 2,002 1,832 91.35% 1,079 901 4,962,300 78.20% 71.44%
West Virginia 2,763 2,299 2,169 94.33% 1,059 898 1,527,885 79.91% 75.38%
Wisconsin 2,152 1,709 1,587 92.87% 1,029 887 4,511,335 82.44% 76.56%
Wyoming 2,146 1,741 1,645 94.49% 1,059 907 413,099 79.40% 75.02%
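In the rows above, the weighted overall response rate matches the product of the weighted screening response rate and the weighted interview response rate, up to rounding. The table itself does not state this relationship, so the sketch below should be read as a consistency check on three rows copied from Table A.1 rather than as the survey's documented definition; the function name is hypothetical.

```python
def overall_response_rate(screening_pct: float, interview_pct: float) -> float:
    """Product of two response rates given in percent, returned in percent."""
    return screening_pct * interview_pct / 100.0


# (screening %, interview %, published overall %) copied from Table A.1.
rows = {
    "Overall": (90.72, 78.56, 71.27),
    "Alabama": (91.31, 81.85, 74.74),
    "Alaska": (92.13, 82.05, 75.59),
}

for state, (screening, interview, published) in rows.items():
    computed = overall_response_rate(screening, interview)
    print(f"{state}: computed {computed:.2f}%, published {published:.2f}%")
```

Small discrepancies in other rows would be expected, because the published component rates are themselves rounded to two decimal places.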

Table A.2 Sample Sizes, Weighted Interview Response Rates, and Population Estimates, by State and Three Age Groups: 2002
State | 12–17: Total Selected | 12–17: Total Responded | 12–17: Population Estimate | 12–17: Weighted Interview Response Rate | 18–25: Total Selected | 18–25: Total Responded | 18–25: Population Estimate | 18–25: Weighted Interview Response Rate | 26+: Total Selected | 26+: Total Responded | 26+: Population Estimate | 26+: Weighted Interview Response Rate
Overall 26,230 23,659 24,753,586 89.99% 27,216 23,271 31,024,280 85.16% 27,135 21,196 179,365,379 75.81%
Alabama 361 331 378,922 92.11% 370 324 497,362 86.86% 372 305 2,810,318 79.54%
Alaska 393 353 70,050 90.00% 353 305 58,061 85.24% 321 257 367,914 79.65%
Arizona 360 330 477,791 91.87% 346 303 593,368 86.21% 372 291 3,289,861 76.81%
Arkansas 385 340 232,228 88.68% 287 256 299,329 89.70% 382 281 1,684,476 71.97%
California 1,439 1,304 3,119,651 90.54% 1,459 1,224 3,910,445 83.32% 1,465 1,071 21,201,387 70.93%
Colorado 349 309 386,275 88.67% 380 317 488,328 82.92% 358 288 2,780,893 80.55%
Connecticut 369 335 297,332 90.70% 423 341 314,467 82.08% 396 301 2,215,789 74.39%
Delaware 392 350 64,655 88.74% 344 285 87,670 83.05% 423 329 513,601 76.54%
District of Columbia 354 326 33,553 91.52% 284 256 73,858 89.63% 341 282 375,224 83.16%
Florida 1,335 1,213 1,332,058 91.10% 1,523 1,317 1,526,407 86.35% 1,482 1,123 10,973,623 74.40%
Georgia 339 309 740,287 91.81% 332 281 931,197 85.79% 395 307 5,170,684 74.28%
Hawaii 337 306 106,624 92.14% 351 300 123,983 85.94% 423 319 731,877 72.94%
Idaho 346 314 128,019 89.27% 348 302 162,155 87.73% 358 291 784,341 80.82%
Illinois 1,475 1,304 1,081,426 88.16% 1,620 1,301 1,366,021 79.82% 1,518 1,124 7,811,288 72.73%
Indiana 351 323 537,937 90.92% 415 346 699,137 84.53% 357 276 3,782,636 74.38%
Iowa 343 312 247,154 91.07% 315 278 348,675 89.36% 370 304 1,844,784 82.50%
Kansas 324 301 242,248 93.27% 374 321 316,706 86.26% 343 276 1,643,332 79.59%
Kentucky 376 325 317,845 84.53% 342 288 457,462 84.10% 380 296 2,619,836 78.11%
Louisiana 344 311 408,864 91.56% 359 310 533,943 86.92% 367 309 2,664,863 82.83%
Maine 337 310 107,138 92.04% 336 295 128,854 88.23% 344 301 868,772 86.65%
Maryland 376 346 472,125 91.83% 331 302 525,127 90.68% 332 271 3,452,047 78.58%
Massachusetts 402 353 502,081 87.86% 350 285 670,475 84.04% 390 278 4,214,516 68.13%
Michigan 1,458 1,301 892,683 89.81% 1,570 1,371 1,078,221 87.65% 1,404 1,120 6,284,494 79.57%
Minnesota 318 289 447,909 90.45% 352 317 564,444 90.66% 326 267 3,142,151 80.71%
Mississippi¹ 342 312 257,043 91.28% 314 274 346,485 87.36% 332 253 1,703,792 72.96%
Missouri 364 328 489,034 90.34% 335 289 621,802 85.99% 340 273 3,545,624 80.20%
Montana 383 348 82,057 91.77% 309 262 101,662 85.48% 383 304 575,825 80.05%
Nebraska 353 317 152,803 90.07% 327 280 202,014 86.69% 362 294 1,057,166 79.90%
Nevada¹ 396 359 182,000 91.12% 356 308 208,607 86.18% 395 287 1,351,398 69.19%
New Hampshire 344 300 112,627 88.19% 405 343 126,521 84.89% 343 267 826,017 75.60%
New Jersey 324 290 712,611 89.35% 383 308 775,060 79.98% 358 256 5,587,910 71.75%
New Mexico¹ 235 213 176,221 89.25% 296 250 207,372 85.15% 263 211 1,116,688 80.02%
New York 1,426 1,241 1,564,858 86.12% 1,649 1,344 2,026,299 80.59% 1,540 1,131 12,291,665 70.20%
North Carolina 354 325 677,525 89.91% 341 292 866,820 84.88% 351 285 5,181,860 79.25%
North Dakota 357 337 54,725 94.54% 332 307 81,994 92.38% 322 269 390,856 81.86%
Ohio 1,358 1,221 991,716 89.83% 1,429 1,224 1,217,589 85.83% 1,434 1,109 7,159,820 75.66%
Oklahoma 362 308 305,129 84.00% 385 333 408,904 85.11% 353 281 2,108,583 76.37%
Oregon 354 322 297,634 90.31% 361 308 379,401 85.13% 356 287 2,239,939 78.69%
Pennsylvania 1,395 1,243 1,025,357 89.15% 1,489 1,293 1,270,338 86.58% 1,367 1,070 8,003,247 77.15%
Rhode Island 365 334 83,814 91.12% 357 306 124,681 84.64% 385 285 688,204 70.20%
South Carolina 339 304 336,271 90.47% 412 343 458,511 82.93% 340 266 2,576,865 79.24%
South Dakota 359 343 70,145 95.94% 320 286 89,870 89.15% 334 285 459,753 85.02%
Tennessee 381 352 472,625 91.52% 260 228 610,807 87.69% 416 340 3,683,257 81.42%
Texas 1,347 1,224 2,004,787 90.81% 1,427 1,251 2,477,451 87.79% 1,438 1,174 12,725,377 80.50%
Utah 316 309 227,575 97.46% 324 289 363,300 88.95% 350 291 1,216,128 81.15%
Vermont 339 312 53,892 92.84% 367 314 68,583 86.88% 307 270 402,586 87.51%
Virginia 297 278 600,443 93.43% 412 341 728,869 83.24% 360 265 4,532,987 71.75%
Washington 298 264 530,187 86.66% 361 304 640,479 84.62% 420 333 3,791,634 76.00%
West Virginia 339 305 139,243 89.85% 336 292 193,439 87.55% 384 301 1,195,204 77.58%
Wisconsin 317 280 482,456 87.97% 380 338 613,508 87.26% 332 269 3,415,371 80.85%
Wyoming 323 295 45,958 91.71% 385 339 58,222 88.37% 351 273 308,919 75.91%

Table A.3 Sample Sizes, Weighted Screening and Interview Response Rates, and Population Estimates, by State, for Persons Aged 12 or Older: 2003
State | Total Selected DUs | Total Eligible DUs | Total Completed Screeners | Weighted DU Screening Response Rate | Total Selected | Total Responded | Population Estimate | Weighted Interview Response Rate | Weighted Overall Response Rate
Overall 170,762 143,485 130,605 90.72% 81,631 67,784 237,682,009 77.39% 70.21%
Alabama 2,071 1,712 1,558 91.14% 1,029 879 3,699,723 79.60% 72.55%
Alaska 2,314 1,814 1,666 91.97% 1,098 883 505,278 75.00% 68.98%
Arizona 2,159 1,757 1,662 94.64% 1,057 897 4,473,518 81.20% 76.85%
Arkansas 2,258 1,850 1,767 95.53% 1,092 922 2,228,670 79.84% 76.27%
California 7,687 6,858 6,015 86.86% 4,471 3,600 28,673,990 73.76% 64.07%
Colorado 2,225 1,855 1,709 92.06% 1,103 911 3,701,560 78.79% 72.53%
Connecticut 2,623 2,288 2,073 90.56% 1,128 933 2,880,493 76.25% 69.06%
Delaware 2,419 1,936 1,774 91.59% 1,105 911 671,922 75.12% 68.80%
District of Columbia 3,692 3,078 2,576 83.69% 1,116 949 476,873 80.38% 67.27%
Florida 10,451 8,453 7,575 89.77% 4,414 3,541 14,145,707 73.68% 66.14%
Georgia 2,112 1,734 1,612 92.81% 1,088 902 6,951,437 79.46% 73.74%
Hawaii 2,259 1,953 1,767 90.25% 1,142 928 1,013,259 73.21% 66.07%
Idaho 1,998 1,596 1,509 94.45% 1,112 912 1,099,895 77.63% 73.32%
Illinois 9,163 8,128 6,803 83.45% 4,652 3,711 10,319,948 74.36% 62.05%
Indiana 2,046 1,741 1,637 94.11% 1,082 903 5,049,910 79.37% 74.69%
Iowa 2,035 1,829 1,721 94.16% 993 884 2,448,928 85.81% 80.79%
Kansas 2,042 1,744 1,638 93.94% 1,041 875 2,209,221 81.11% 76.20%
Kentucky 2,266 1,991 1,878 94.25% 1,102 908 3,381,254 75.69% 71.34%
Louisiana 2,084 1,757 1,637 93.12% 1,095 943 3,618,197 81.80% 76.17%
Maine 2,827 2,240 2,045 91.21% 1,094 928 1,113,100 82.07% 74.86%
Maryland 1,899 1,673 1,475 88.04% 1,000 863 4,510,290 82.58% 72.70%
Massachusetts 2,413 2,129 1,878 88.16% 1,220 964 5,377,359 75.04% 66.16%
Michigan 9,000 7,447 6,709 90.14% 4,353 3,667 8,316,442 79.06% 71.26%
Minnesota 2,029 1,801 1,673 92.73% 1,052 909 4,193,331 82.14% 76.17%
Mississippi 2,196 1,732 1,650 95.33% 1,078 899 2,311,859 78.81% 75.13%
Missouri 2,495 2,042 1,912 93.64% 1,105 932 4,683,914 81.99% 76.77%
Montana 2,384 1,871 1,766 94.40% 1,068 911 767,946 79.57% 75.12%
Nebraska 1,996 1,716 1,622 94.51% 1,071 918 1,418,952 79.62% 75.25%
Nevada 2,071 1,751 1,663 94.91% 1,072 902 1,818,116 79.78% 75.71%
New Hampshire 2,015 1,688 1,568 92.94% 1,112 910 1,082,138 76.29% 70.90%
New Jersey 2,564 2,287 1,981 86.56% 1,126 883 7,118,305 72.97% 63.17%
New Mexico 2,260 1,822 1,740 95.42% 1,132 944 1,520,180 77.03% 73.50%
New York 9,973 8,575 7,205 83.97% 4,609 3,634 15,948,708 71.96% 60.42%
North Carolina 2,239 1,852 1,753 94.65% 1,086 904 6,805,722 79.21% 74.98%
North Dakota 2,072 1,714 1,619 94.57% 977 867 525,140 87.43% 82.69%
Ohio 8,874 7,690 7,246 94.17% 4,313 3,559 9,433,820 75.91% 71.49%
Oklahoma 2,455 1,972 1,812 91.80% 1,042 871 2,846,785 78.62% 72.17%
Oregon 2,102 1,853 1,760 94.94% 1,095 912 2,970,969 79.79% 75.75%
Pennsylvania 9,866 8,252 7,482 90.76% 4,214 3,572 10,356,055 80.56% 73.12%
Rhode Island 2,255 1,991 1,772 88.58% 1,141 914 903,348 75.20% 66.61%
South Carolina 2,205 1,807 1,723 95.45% 1,109 920 3,384,520 79.64% 76.02%
South Dakota 2,154 1,749 1,660 94.78% 980 881 621,498 86.26% 81.76%
Tennessee 2,290 1,978 1,864 94.27% 1,004 856 4,823,157 79.89% 75.32%
Texas 7,901 6,466 6,079 94.03% 4,231 3,566 17,432,369 79.14% 74.42%
Utah 1,623 1,392 1,325 95.14% 995 898 1,816,737 87.98% 83.71%
Vermont 2,638 2,047 1,909 93.19% 1,092 917 530,133 79.87% 74.43%
Virginia 2,168 1,908 1,667 87.33% 1,076 907 5,951,031 78.61% 68.65%
Washington 2,475 2,033 1,920 94.43% 1,128 941 5,053,331 78.65% 74.28%
West Virginia 2,923 2,384 2,236 93.83% 1,058 871 1,534,650 78.86% 74.00%
Wisconsin 2,282 1,793 1,655 92.28% 1,046 887 4,546,217 77.76% 71.76%
Wyoming 2,214 1,756 1,659 94.48% 1,032 885 416,105 84.33% 79.67%

Table A.4 Sample Sizes, Weighted Interview Response Rates, and Population Estimates, by State and Three Age Groups: 2003
State | 12–17: Total Selected | 12–17: Total Responded | 12–17: Population Estimate | 12–17: Weighted Interview Response Rate | 18–25: Total Selected | 18–25: Total Responded | 18–25: Population Estimate | 18–25: Weighted Interview Response Rate | 26+: Total Selected | 26+: Total Responded | 26+: Population Estimate | 26+: Weighted Interview Response Rate
Overall 25,387 22,696 24,995,357 89.57% 27,259 22,941 31,728,286 83.47% 28,985 22,147 180,958,366 74.63%
Alabama 324 297 382,688 92.61% 394 340 501,543 86.10% 311 242 2,815,492 76.33%
Alaska 348 298 68,750 86.80% 378 314 67,522 82.66% 372 271 369,006 71.30%
Arizona 346 314 493,252 91.48% 377 317 611,163 84.15% 334 266 3,369,104 78.82%
Arkansas 352 320 233,744 91.18% 356 301 304,728 85.42% 384 301 1,690,198 77.24%
California 1,381 1,236 3,161,827 89.71% 1,463 1,195 3,928,708 81.65% 1,627 1,169 21,583,456 69.91%
Colorado 327 292 385,020 88.53% 379 305 499,513 79.29% 397 314 2,817,027 77.43%
Connecticut 313 279 292,982 88.47% 423 353 331,774 83.64% 392 301 2,255,738 73.62%
Delaware 344 305 68,298 88.69% 373 315 89,106 84.55% 388 291 514,518 71.54%
District of Columbia 370 326 32,832 88.64% 373 326 73,453 87.28% 373 297 370,589 78.33%
Florida 1,377 1,203 1,360,537 87.23% 1,418 1,171 1,626,149 81.73% 1,619 1,167 11,159,021 71.02%
Georgia 342 308 756,648 88.43% 323 267 959,782 84.93% 423 327 5,235,007 77.32%
Hawaii 388 353 100,981 90.91% 329 275 121,594 83.63% 425 300 790,684 69.33%
Idaho 331 299 128,037 90.50% 348 287 166,977 81.40% 433 326 804,881 74.87%
Illinois 1,423 1,238 1,083,365 86.69% 1,537 1,242 1,395,959 81.48% 1,692 1,231 7,840,623 71.43%
Indiana 338 308 545,217 90.65% 365 292 710,330 79.87% 379 303 3,794,364 77.73%
Iowa 329 304 245,539 89.91% 333 292 353,759 87.71% 331 288 1,849,631 84.81%
Kansas 317 280 240,109 87.93% 363 309 322,145 84.48% 361 286 1,646,967 79.40%
Kentucky 349 306 337,609 86.98% 349 293 451,685 83.75% 404 309 2,591,960 72.97%
Louisiana 353 321 405,066 92.36% 382 335 541,507 86.50% 360 287 2,671,623 79.32%
Maine 345 304 110,584 87.73% 388 330 132,168 86.27% 361 294 870,349 80.84%
Maryland 318 292 481,268 90.86% 280 237 547,577 83.87% 402 334 3,481,445 81.21%
Massachusetts