Counting Together or Separately? : The Hazard of Aggregating Asian American Groups within Research Using Ethnic Geography’s Dissimilarity Index

Ethan Yorgason

doi:10.22776/kgs.2020.55.1.43

Research Article

Journal of the Korean Geographical Society. 29 February 2020. 43~65
https://doi.org/10.22776/kgs.2020.55.1.43

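Korean title (translated): Together or Separate? : The Problem of Aggregating Asian American Groups in Research Using Ethnic Geography’s Dissimilarity Index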

Ethan Yorgason^a^*

^aAssociate Professor, Department of Geography, Kyungpook National University

^{* Corresponding Author}

License (open-access):

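This journal was published in 2019 with support from the National Research Foundation of Korea, funded by the Korean government (Ministry of Education).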

ABSTRACT

This paper explores the impact of aggregating Asian nationality groups within the United States on the dissimilarity index, a commonly used measure of distributional unevenness and segregation within ethnic geography and demography. Most use of the index combines all Asian groups together when comparing their level of segregation within the USA to other minority groups, such as blacks and Hispanics. The article argues that this aggregating convention is deeply problematic. I make this argument in part by comparing an aggregated and four disaggregated Asian groups (Korean, Chinese, Japanese, and Filipino) through five quantitative, empirical explorations using US Census data: 1) temporal trends since 1970; 2) the role of different areal units, disaggregated by spatial scale; 3) variation of dissimilarity-index values by both regional size and percentage of minority group; 4) regression analysis of various independent variables that contribute to variation in dissimilarity-index values; and 5) mapping the variation of these values by US state. In addition, I demonstrate that, because of the phenomenon of ecological correlation, an aggregated Asian-dissimilarity-index value will almost always misrepresent the values of the individual, disaggregated Asian groups.

Keywords

ethnic aggregation

Asian Americans

dissimilarity index

segregation

ecological correlation

MAIN

1. Introduction
2. The Dissimilarity Index and the Tendency to Aggregate Asians
3. Four East Asian Ethnic Groups in the United States
4. To Aggregate or Disaggregate?
1) A special type of ecological correlation
2) Empirical evidence
3) Exploration I: Average values and temporal change
4) Exploration II: Region-unit scales and unit disaggregation
5) Exploration III: Influence of regional size and ethnic proportion
6) Exploration IV: Relative influence of explanatory factors according to regression
7) Exploration V: Influence of geography (states)
5. Discussion and Conclusion

1. Introduction

Much social science research on the USA combines people with ancestry from all Asian nationalities together as a single group. When the issue involves Asians as objects of non-Asian action, this decision makes a certain kind of sense. In particular, American society’s racializing actions frequently erase distinctions between Asian groups. However, when Asian people are active subjects, national ancestral background may be much more important. The groups have quite different immigration histories; carry diverse cultural, linguistic, political, and religious heritage; and typically retain strong national identity, usually over and above any broad Asian American racial/ethnic identity. Analyses on related topics may obscure as much as they illuminate if they combine Asians into a single category. This distinction between Asian Americans as objects and subjects is often blurry, however (Aspinall, 2003; 2005). Many issues of ethnicity involve a complex mix of self- and socially-assigned identity. As Aspinall (2009) argues, the scholar’s task is finding an appropriate tradeoff between the validity provided by fine ethnic granularity and the utility offered by coarser aggregation.

The present research focuses narrowly on one way in which Asian Americans are arguably both subjects and objects. This is the geographical issue of residential patterns (Wang, 2011). Yet stated thus, the issue is still too large. To be more specific, I focus on a single measure to discuss ethnic residential geography, one commonly used within demographic research: the Dissimilarity Index. Scholars of segregation, primarily, use this index. Though not without nuance at times, the literature most frequently implies that the index measures society’s discrimination toward minority groups rather than reflecting group members’ choices. In other words, most use of the index treats Asian Americans as objects rather than subjects. Unsurprisingly, then, research typically aggregates Asian groups together rather than disaggregating by nationality. Is aggregation actually the best choice? While qualitative judgment is also relevant, this article seeks answers primarily through quantitative explorations of the index itself. It thereby builds on the issue of ethnic aggregation raised in demographic research by the geographer Pablo Mateos (Mateos et al., 2009; see also Mateos, 2011).

Leading to the conclusion that researchers should disaggregate Asian Americans more often, the paper first introduces the dissimilarity index and specifies its common (mostly aggregating) usage. Next I briefly review the differing historical geographies of the four Asian American nationality groups this research focuses on (see also White et al., 2003). The article’s largest section addresses the question of whether to aggregate or not. It begins with a theoretical subsection illustrating a type of ecological fallacy that can arise when using an aggregated Asian group dissimilarity-index value. Then, after introducing the data used, it addresses the question empirically through five explorations, each comparing an aggregated Asian grouping (along with other ethnic groups, for context) to the disaggregated groups. The final section summarizes and interprets results and implications. Though not the key objective, the paper also describes dissimilarity-index patterns among the four Asian American groups in more historical and geographical detail than exists elsewhere in the literature.¹⁾

2. The Dissimilarity Index and the Tendency to Aggregate Asians

The dissimilarity index (or D) is the most commonly used of the many indexes within segregation research. D has served as a measure of urban ethnic segregation since even before Duncan and Duncan's (1955) seminal paper on segregation indexes, published more than half a century ago. Duncan and Duncan identified the mathematical relationships between several segregation indexes, showing that while they differed in important ways, several built off of D’s basic calculation. Since that time, many more computationally and geographically sophisticated segregation statistics have been developed by scholars such as Johnston et al. (2001; 2003); Horn (2005); 이상일 (2007); 박윤환 (2011); 최은진 and 김의준 (2011); Harris (2014; 2016; 2017); and Catney (2017). These examples, as well as many others, address how best to measure segregation and which index(es) to use. However, that important issue is not the central question in the present research. My goal here is not to improve upon the dissimilarity index, but rather to explore an under-appreciated implication of its use. I acknowledge that the index is limited and problematic. Other indexes should supplement D to fully understand segregation. Yet D remains a staple of research on residential segregation. This is in part because its interpretation and calculation are both relatively simple. It is thus the measure most likely to influence non-specialists (policy makers, politicians, the media, etc.). Thus my key concern is one that has not yet received much attention in the literature: the consequences of ethnic aggregation within dissimilarity-index research.

The index measures just one aspect of segregation: evenness/unevenness among ethnic groups in residential distribution.²⁾ A simple, though not quite technically correct, way to conceive of D is that it builds on a comparison of the percentage of an ethnic group living within smaller, geographically delineated units to the percentage of that same ethnic group living within a larger region (the nomenclature varies between studies). When the collective proportions of the group in the region’s small units differ substantially from the overall proportion in the large region, D rises. Segregation is then labeled high. Conversely, when the small units’ proportions differ very little from the larger region’s proportion, then D is low (low segregation). Technically this description is not quite correct. As seen in the equation below, the index actually utilizes the difference between two slightly different proportions: the percentage of members of the ethnic group within the region who reside in a smaller unit (X = members of the ethnic group within the region; x_i = members of the ethnic group within any particular smaller unit; thus the percentage = x_i/X), and the percentage of all non-ethnic group members (or comparison ethnic group members) within the region residing in that same unit (y_i/Y). The absolute values of the differences are summed across the region’s units. The resulting value is then multiplied by 50 so that it can be interpreted as a percentage, ranging between zero and 100. (Many studies multiply by 0.5, in which case D varies between zero and one.) Thus:

$$D=50\ast\sum_{i=1}^n\left|\left(\frac{x_i}X\right)-\left(\frac{y_i}Y\right)\right|$$

The common interpretation is that D is the percentage of minority (X) group members who would need to move to other units within the region in order for that group to become evenly spread among Y-group members.
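The calculation in the equation above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in this research; the function name and the three-unit example counts are hypothetical.

```python
from typing import Sequence


def dissimilarity_index(x: Sequence[float], y: Sequence[float]) -> float:
    """Return D on the 0-100 scale for one region.

    x[i] is the X-group (minority) count in unit i; y[i] is the Y-group
    (comparison) count in the same unit.
    """
    if len(x) != len(y):
        raise ValueError("x and y must describe the same units")
    total_x, total_y = sum(x), sum(y)
    if total_x == 0 or total_y == 0:
        raise ValueError("D is undefined when either group has no members")
    # Sum the absolute differences between each unit's share of the two groups.
    return 50 * sum(abs(xi / total_x - yi / total_y) for xi, yi in zip(x, y))


# Hypothetical region with three units (illustrative counts only).
minority = [30, 10, 0]
comparison = [100, 100, 200]
print(round(dissimilarity_index(minority, comparison), 1))  # 50.0
```

In this made-up region, half of the X-group would have to change units for the two distributions to match, which is the interpretation given above.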

Most US research using the dissimilarity index focuses on “racial” segregation in urban areas. The Chicago School of urban sociology conceived of cities ecologically, transferring concepts such as competition for resources, community, and segregation from the natural world (Shlay and Balzarini, 2015). The field came to see the metropolitan area as a natural unit of analysis and thus, perhaps, the functional region par excellence. For early-twentieth-century US cities, scholars’ segregation focus applied primarily to the separation of recent European arrivals of new national backgrounds in the US context from longer established white Americans on the one hand, and blacks from whites on the other hand. Over time the newer European ethnicities integrated into white America, but black/white separation persisted. In this milieu, and as social researchers developed tools for measuring segregation quantitatively (Duncan and Duncan, 1955; Peach, 2009), scholars increasingly regarded segregation as a problem rather than a natural, often positive feature of community building (as they had earlier). Thus “race” began to take priority over “nationality” or “ethnicity.”³⁾ After the 1960s researchers gradually incorporated Hispanics and Asians into segregation studies as well, particularly due to their increasing migration following the 1965 Immigration and Nationality Act. Segregation was thus seen as something that whites, as the majority, do to non-whites. For a few decades, (especially quantitative) analysts regarded Asians primarily as a racialized Other, subjected to but not fully active participants in processes of segregation. Unsurprisingly, most applications of the dissimilarity index concentrate on three main “racialized” groups—blacks, Hispanics (despite their apparently pan-racial character), and Asians—and compare their distribution to (usually non-Hispanic) whites.⁴⁾

As a result, concern for differences within these broad groups emerged only recently. Logan and Zhang (2013); the related project “Diversity within diversity” (n.d.); and Iceland et al. (2014) are the best examples in relation to Asians. The first two calculate D values for six Asian groups, along with Asians generally, for the 1990, 2000, and 2010 censuses, while the last explores two segregation indexes (one of which is D) for 1980-2010 across six Asian groups (see also White et al. [2003], who calculated and analyzed 1990/91 D values for five Asian groups in both the USA and Canada). Together these analyses measure D across the various groups both for the USA as a whole and in each US metropolitan statistical area. Logan and Zhang’s average D scores for the four Asian groups this paper focuses on are reported in Table 1. The numbers range within what is generally considered to be mid- or mid-to-low levels of unevenness, with Chinese highest and Japanese lowest. Japanese and Filipino values fell substantially over the 20 years, while the other groups’ values decreased only slightly. It is important to note Logan and Zhang’s input decisions:⁵⁾ whites as the Y (comparison)-group, MSAs (Metropolitan Statistical Areas) as the key region scale with census tracts as the unit scale, averages weighted for population, inclusion of those identifying with the groups as part of multiple races in addition to those identifying with single races,⁶⁾ and the “Asian” category incorporating every Asian ethnicity. Iceland et al.’s somewhat different choices resulted in slightly different D values and patterns, but the overall dissimilarity levels and ordering of groups were similar.

Table 1. Dissimilarity Index Values for US Asians, 1990-2010, According to Logan and Zhang.

Year	Korean	Chinese	Japanese	Filipino	Asian
1990	46.6	50.9	40.9	49.7	41.6
2000	46.8	49.8	35.9	45.7	41.6
2010	45.8	48.7	33.6	42.1	40.7

Source: Logan and Zhang (2013, 9).

Ellis et al. (2004) also notably disaggregate Asian groups. They use the dissimilarity index and other measures to address the differential segregation at home and work of foreign-born Chinese, Koreans, and Filipinos (among others) in metropolitan Los Angeles. But most other dissimilarity index research aggregates Asians. This includes reports showing, for example, that: Asian dissimilarity from whites in approximately 20 large metropolitan areas generally hovered between about 45 and 50 in 1980 (Denton and Massey, 1988, 810-13); Asian D in relation to whites in 2000 averaged about 43 (unweighted) in metropolitan areas, with blacks at 59 and Hispanics at 44 (31, 46, and 36 respectively for cities over 25,000 people) (Frey and Myers, 2005, 38, 48); and Asian D values (in relation to whites, over virtually all US metropolitan areas) decreased from approximately 59 in 1980 to about 52 in 1990 and 2000, while rising back to about 58 by 2010 (blacks and Hispanics showed a steadier drop, from about 78 to 64 and 50 to 46, respectively; Intrator et al., 2016, 51). As noted in endnote 5, different input decisions led to different D values.

This tendency to aggregate Asians (and Hispanics, etc.) has thus only very recently and modestly been questioned within US dissimilarity-index research. Moreover, most of the related analysis is practical rather than theoretical. It is based on the simple, though important, idea that aggregation obscures variation among the various Asian groups. However, this is a critique that applies to all aggregation and is not specific to segregation. Mateos has led the way in problematizing aggregation in segregation research (Mateos et al., 2009; Mateos, 2011; 2014). His main concern is how differing specific aggregations lead to different conclusions about segregation. In other words, for example, whether Pakistanis, Indians, Vietnamese, and Chinese are aggregated into the same category or different categories affects interpretation of the problem of segregation in aggregated ethnic groupings (Asians, South Asians, East Asians, etc.). However, while this critique is an important step forward, it does not directly address the issue of aggregation versus disaggregation. In order to do so, the present research also draws additional inspiration from another related literature, one that similarly does not quite get to the question of aggregation vs. disaggregation. This is the literature within geography related to the Modifiable Areal Unit Problem (MAUP), represented best by Openshaw (1983).

Most research relating to the MAUP understandably focuses on areal units. One implication is that dissimilarity-index values are vulnerable to differing areal classifications. The way areal unit boundaries are constituted affects (often modestly, sometimes strongly) the D values obtained. This boundary phenomenon is well known. But work by Wong (2003), Simpson (2007) and Lloyd (2016), among others, adds scale as a second MAUP dimension to the context of segregation. Their ideas helped in formulating the claims about aggregation and disaggregation presented here. Wong, in particular, argues that D values at different nested scales are not just different from one another but hierarchically related, such that within the same region (except in extremely uncommon cases), D values calculated at a higher nested unit scale (such as census tracts) can never exceed D values at a lower nested unit scale (such as census block groups). In virtually all realistic cases the former will be lower.
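The nested-scale relationship attributed to Wong can be illustrated with a small sketch. The county below is entirely hypothetical (six block groups nested in three tracts), and the helper names are introduced here for illustration only. Merging fine units into their parent units sums the counts before the absolute differences are taken, so deviations of opposite sign within a tract cancel out.

```python
def dissimilarity_index(x, y):
    """D on the 0-100 scale; x and y are per-unit counts for the two groups."""
    total_x, total_y = sum(x), sum(y)
    return 50 * sum(abs(xi / total_x - yi / total_y) for xi, yi in zip(x, y))


def merge_units(counts, parents):
    """Sum fine-unit counts into their parent (coarser) units."""
    merged = {}
    for count, parent in zip(counts, parents):
        merged[parent] = merged.get(parent, 0) + count
    return [merged[parent] for parent in sorted(merged)]


# Hypothetical county: six block groups nested in three tracts.
tract_of_block_group = ["T1", "T1", "T2", "T2", "T3", "T3"]
x_block_groups = [40, 0, 10, 10, 0, 20]
y_block_groups = [100, 300, 200, 200, 250, 150]

d_fine = dissimilarity_index(x_block_groups, y_block_groups)
d_coarse = dissimilarity_index(
    merge_units(x_block_groups, tract_of_block_group),
    merge_units(y_block_groups, tract_of_block_group),
)
print(round(d_fine, 1), round(d_coarse, 1))
```

With these invented counts the tract-scale value falls well below the block-group value, consistent with the hierarchical relationship described above.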

But this insight has never, to the best of my knowledge, been explicitly applied and adapted to ethnic categorization. Mateos et al. (2009) thus correctly claim that the problem of ethnic classification/aggregation is similar to the issue of the MAUP’s geographic categorization. However, they emphasize only one aspect of the MAUP: the boundary aspect. Similar problems in both ethnic and spatial categorization exist, they argue, because: a) the boundaries of both are ultimately somewhat arbitrary, and b) different boundaries of both types create different results. But the MAUP also relates to scale. This paper is most explicitly concerned to adapt insights related to scale from the MAUP to ethnic categorization, and specifically to the issue of aggregation vs. disaggregation. It seems to have been forgotten or overlooked that Openshaw built many of his insights about the MAUP from the broader issue of ecological correlation within social science (Openshaw, 1984). Ecological correlation and the related ecological fallacy are most often used to discuss the hazards of applying group-level patterns to individual-level action. But as I argue in Subsection 4.1, the concepts also apply to calculations of the dissimilarity index and other related segregation indices.

3. Four East Asian Ethnic Groups in the United States

I have thus far argued that good philosophical reasons exist for disaggregating Asian groups within dissimilarity-index research. These relate, as alluded to in Section 1, to seeing people in these groups as subjects in addition to objects within residential decision-making. Nevertheless, the existing research, summarized in Section 2, only hints, in either theory or practice, at the extent to which disaggregation is useful. While I build below on hitherto undeveloped insights from work on the MAUP, the dissimilarity index, and aggregation within segregation research, this paper’s goal is not just conceptual but also observational (I here follow the spirit of Openshaw [1984] and Holt et al. [1996], who call for empirical and not just theoretical discussions of problems involving ecological correlation within areal units). Thus Section 4 not only shows why aggregation is theoretically problematic, but also quantitatively explores the results of aggregating Asian groups’ D values that differ from one another. But first, in order to set a context for those distributional differences, this section briefly summarizes the historical geographies of four Asian ethnicities within the United States.

Chinese were the first Asians to go to the United States in large numbers, from the early 1850s to early 1880s. Two key migratory streams went to California and the American West more generally starting in the 1850s for the Gold Rush and railroad construction, and to Hawaii (though not then a state or even US territory) for sugar plantation labor. The 1882 Chinese Exclusion Act closed off most of this migration, however, helping create a racialized, “gatekeeping” regime of US immigration law and cultural attitudes (Lee, 2002). In Hawaii, Japanese laborers quickly replaced the Chinese stream. Many Japanese stayed in Hawaii, but a large cohort eventually moved to California and other western states. This Japanese flow began to be cut off in 1907 with the Gentlemen’s Agreement between the USA and Japan. However, migration of wives and children continued into the early 1920s. The 1924 Immigration Act included an Asian Exclusion Act which virtually eliminated subsequent Japanese immigration (Suzuki, 2002; Daniels, 1992). Filipinos replaced Japanese on Hawaii’s plantations, especially between 1907 and 1929. US colonial control of the Philippines exempted Filipinos from the Asian Exclusion Act, though culturally they experienced much anti-Asian racism (Hinnershitz, 2013). Migration to Hawaii and eventually other US states and territories (primarily in the West, particularly Alaska and California) slowed greatly with the 1934 Philippine Independence Act. Deportations of some Filipinos commenced in the 1930s (Lee, 2002), but provisions for Filipinos serving in the US Navy and for family members kept 1934-1965 migration a bit higher than from other Asian countries (Roces, 2015). Koreans, for their part, migrated in this early period only in small numbers. This was primarily a few thousand men to Hawaiian plantations in the first decade of the 1900s, followed a bit later by brides. For most of the four decades until 1965, migration from these four Asian countries essentially stopped, although limited exceptions existed (Patterson, 1988; Min and Kim, 2013). Regionally, most of the pre-1965 migration centered on the American West, particularly Hawaii (per capita) and California (in absolute numbers).

The 1965 US Immigration and Nationality Act changed ethnically and racially discriminatory formulas in place since the 1920s (Holland, 2007). East Asian immigrants arrived in much greater numbers thereafter than they had for decades. US migration from three of these countries rose substantially, though the cultural context was not substantially more welcoming (Shimpi and Zirkel, 2012). Migrants sought primarily economic and educational opportunities. Koreans showed the sharpest percentage increase, given their very low initial numbers, though their rate of increase has slowed below Chinese and Filipino rates since the 1990s. Many well-educated migrants turned, often by necessity, to small entrepreneurship in large cities (Reimer, 2007). In recent years, many Koreans have migrated for educational opportunities (Logan and Zhang, 2013). Chinese and Filipinos also migrated in large absolute numbers. Philippine migration, primarily male oriented earlier, now heavily involved females, especially in professional occupations such as health care (Tyner, 1999; Roces, 2015). Chinese migrants also included large proportions of professionals (Holland, 2007). Far fewer Japanese came. Most US Japanese today are third, fourth, or even fifth generation Americans. Many Koreans are first or second generation, while the Chinese and Filipinos have more bifurcated (pre-1930 and post-1965 arriving) cohorts.

The distribution of each group has its own peculiarities, but a general Western focus persists today. However, recent immigration patterns also strongly prioritize large cities generally (Roseman, 2002). Figure 1 shows the groups’ relative distribution within the 50 states. The figure uses locational quotients: the proportion of a state’s population composed of the ethnic group divided by the proportion of the nation’s population composed of that same ethnic group.⁷⁾ Distribution of all individuals within these four groups is aggregated as “Asians” and also mapped. Blacks are likewise included for comparison. Unsurprisingly, most have high values in western states. Hawaii has far and away the greatest concentration of Asians; the Japanese locational quotient there is an astoundingly high 62. Low values cluster in the lower Mississippi Valley among a few other (mostly more rural) states. For blacks, the comparison group, the Deep South is relatively high, with low values in northern parts of New England, the Great Plains, and Intermountain West. The 2010 US Census counted, out of 309 million total people, 3.3 million Chinese, 2.6 million Filipinos, 1.4 million Koreans, and 763,000 Japanese.⁸⁾

http://static.apub.kr/journalsite/sites/kwra/2019-052-11/N0200521101/images/geo_55_01_04_F1.jpg

Figure 1. Relative Ethnic Concentration by State, 2010, For Six US Ethnic Categories. Values represent locational quotients. The “Asian” category here means only people within four disaggregated Asian national groups. Source: US Census.
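The locational quotient mapped in Figure 1 can be sketched as follows. The national totals are the 2010 figures quoted in the text; the state counts are hypothetical placeholders, not census values, and the function name is introduced here for illustration.

```python
def locational_quotient(group_in_state, state_total, group_in_nation, nation_total):
    """State share of the group divided by the national share of the group."""
    state_share = group_in_state / state_total
    national_share = group_in_nation / nation_total
    return state_share / national_share


# National totals from the text: 763,000 Japanese among 309 million people.
JAPANESE_US = 763_000
TOTAL_US = 309_000_000

# Hypothetical state: 5,000 group members among 1,000,000 residents.
lq = locational_quotient(5_000, 1_000_000, JAPANESE_US, TOTAL_US)
print(round(lq, 2))
```

A value of 1 means the group's share of the state matches its national share; values above 1 indicate relative concentration.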

4. To Aggregate or Disaggregate?

1) A special type of ecological correlation

While the locational quotient is one way to conceive of ethnic distribution, the remainder of this paper uses several empirical explorations based on the dissimilarity index exclusively to assess the utility of aggregating or disaggregating Asian groups. A fundamental problem immediately confronts aggregation, however, one that the existing literature has not explicitly identified.⁹⁾ This is the mathematical property inherent in the index that an aggregated value is not necessarily (and in practice is extremely seldom) a good representation of the averages of the disaggregated groups.¹⁰⁾ The index does not function as many simpler statistics do when ethnic aggregation is involved. With average income, for example, the aggregated value is the weighted average of the individual groups. If the disaggregated groups are similarly sized, we correctly surmise that some of the disaggregated groups’ averages fall above the aggregated average, while other groups’ averages lie a similar collective distance below that aggregated average. That is not true for the dissimilarity index, however. As such, it can perhaps be considered a special case within ecological correlation (Openshaw, 1984; Vogt, 1993). The term ecological correlation refers to correlations within data gathered at the group level rather than the individual level. For a simple example, home value and net worth are likely highly-correlated variables. The level of correlation (within a county, for instance) could be calculated based on individual-level data or on census tract-level averages, among other possibilities. The principle of ecological correlation shows that the values of the two correlations are unlikely to be identical; it is, rather, an ecological fallacy to expect them to be. Since the D calculation, like correlation, is based on sums of deviations from means, the principle of ecological correlation applies to differently aggregated levels of data collection. This means that expecting aggregated D values to adequately represent the D values of the disaggregated groups is a type of ecological fallacy.¹¹⁾ The problem is perhaps best understood through an example.
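Before turning to that example, the home value and net worth illustration can be made concrete with a small sketch. All figures are invented (nine households in three tracts, values in thousands of dollars), and the Pearson helper is written out so the block needs only the standard library.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical (home value, net worth) pairs for households in three tracts.
households = {
    "tract_a": [(100, 50), (110, 30), (120, 40)],
    "tract_b": [(200, 90), (210, 110), (220, 70)],
    "tract_c": [(300, 150), (310, 130), (320, 170)],
}

# Individual-level correlation: every household is one observation.
pairs = [pair for tract in households.values() for pair in tract]
r_individual = pearson_r([h for h, _ in pairs], [w for _, w in pairs])

# Tract-level (ecological) correlation: each tract's averages are one observation.
tract_home = [mean([h for h, _ in tract]) for tract in households.values()]
tract_worth = [mean([w for _, w in tract]) for tract in households.values()]
r_tract = pearson_r(tract_home, tract_worth)

print(round(r_individual, 3), round(r_tract, 3))
```

In this invented data set the tract-level correlation is stronger than the household-level one because averaging removes the variation within each tract. Neither value is wrong; they answer different questions.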

Imagine four small counties (regions) identical in three ways (Figure 2): a) each contains four census tracts (units) with 4,000 people in each census tract; b) each of the four Asian groups comprises one percent of the county’s 16,000 total population (160 people); c) all other residents are white. In County A, all the Asians live in one census tract. Thus their D values in relation to one another are all zero. In relation to whites, their D values are each 78.1. This unrealistic case, in which all D values between Asian groups are zero, is one of the few ways the aggregated Asian D value can equal the average of the disaggregated Asian D values. The aggregated Asian value can never exceed the average of the disaggregated values, however. In most cases, rather, the aggregated value is substantially lower than the average of the disaggregated values. County B shows the extreme case, where all D values between Asian groups are 100, since each Asian group resides entirely separate from the others. Each disaggregated group has a D value of 75 in relation to whites, but the aggregated Asian value in relation to whites is zero.
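Counties A and B are specified fully enough in the text to be recomputed. The sketch below does so with whites as the Y-group; the helper functions and the data layout are assumptions made for illustration, not the author's code. Counties C and D are omitted because their tract-level counts appear only in Figure 2.

```python
GROUPS = ("Korean", "Chinese", "Japanese", "Filipino")


def dissimilarity_index(x, y):
    """D on the 0-100 scale; x and y are per-tract counts for the two groups."""
    total_x, total_y = sum(x), sum(y)
    return 50 * sum(abs(xi / total_x - yi / total_y) for xi, yi in zip(x, y))


def tract(white, **asian_counts):
    """Build one tract's counts; Asian groups not named default to zero."""
    counts = dict.fromkeys(GROUPS, 0)
    counts.update(asian_counts)
    counts["white"] = white
    return counts


def county_report(tracts):
    """Return (per-group D values, aggregated Asian D), each against whites."""
    whites = [t["white"] for t in tracts]
    per_group = {g: dissimilarity_index([t[g] for t in tracts], whites) for g in GROUPS}
    combined = [sum(t[g] for g in GROUPS) for t in tracts]
    return per_group, dissimilarity_index(combined, whites)


# County A: all 640 Asians share one tract of 4,000 people.
county_a = [
    tract(3360, Korean=160, Chinese=160, Japanese=160, Filipino=160),
    tract(4000), tract(4000), tract(4000),
]
# County B: each Asian group lives alone (with whites) in its own tract.
county_b = [
    tract(3840, Korean=160), tract(3840, Chinese=160),
    tract(3840, Japanese=160), tract(3840, Filipino=160),
]

for name, county in (("A", county_a), ("B", county_b)):
    per_group, aggregated = county_report(county)
    average = sum(per_group.values()) / len(per_group)
    print(name, round(average, 1), round(aggregated, 1))
```

The output pairs each county's average disaggregated value with its aggregated value, matching the figures given in the text: 78.1 and 78.1 for County A, 75 and zero for County B.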

Counties A and B point to the fact that the aggregated D value for Asians complexly incorporates average values of the disaggregated Asian groups as well as the groups’ dissimilarity from each other.¹²⁾ In other words, it is composed of both within- and between-group variation. The more the Asian groups’ residential patterns differ from one another, the larger the gap is between the average D of the disaggregated groups and the aggregated Asian value. County C shows that it is possible for the aggregated D value to exceed one or more of the disaggregated groups’ values (Filipinos in this scenario). But this aggregated value still represents the average of the disaggregated groups poorly: in this case 15.5 vs. 53.9, respectively. Counties A, B, and C posit unlikely scenarios to illustrate how aggregated D values operate. County D represents a more realistic case. Here the aggregated D value is nearly 10 points lower than the average of the disaggregated values: 26.0 vs. 35.0. Thus this example illustrates the larger point: an aggregated Asian D value gives no indication of how well or poorly it is representing the disaggregated Asian groups’ individual values.¹³⁾
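The claim that the aggregated value cannot exceed the average of the disaggregated values can be made explicit with a short derivation. It is a sketch under one assumption taken from Figure 2: every D value is calculated against the same Y-group (whites). The group subscript k and the weights w_k are notation introduced here for illustration, not the paper's. Let x_ik be the members of Asian group k in unit i, X_k that group's regional total, and w_k = X_k/X its share of the aggregated Asian population, so that the w_k sum to one. The aggregated share in each unit is then a weighted average of the group shares:

$$\frac{x_i}{X}=\sum_{k}w_k\,\frac{x_{ik}}{X_k}$$

Substituting into the formula for D and applying the triangle inequality within each unit gives:

$$D_{\mathrm{agg}}=50\sum_{i=1}^n\left|\sum_{k}w_k\left(\frac{x_{ik}}{X_k}-\frac{y_i}{Y}\right)\right|\leq50\sum_{i=1}^n\sum_{k}w_k\left|\frac{x_{ik}}{X_k}-\frac{y_i}{Y}\right|=\sum_{k}w_kD_k$$

Equality holds only when, in every unit, no two groups deviate from the Y-group share in opposite directions, as in County A. With equally sized groups, w_k = 1/4 and the right-hand side is the simple average of the four disaggregated values. When the Y-group differs from one calculation to the next (for example, when it is everyone outside the X-group), this argument does not carry over directly.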

http://static.apub.kr/journalsite/sites/kwra/2019-052-11/N0200521101/images/geo_55_01_04_F2.jpg

Figure 2. Varying Results of Ethnic Aggregation in Hypothetical Dissimilarity Index Calculations. D values calculated with whites as the Y-group. Each of the four hypothetical cases represents a county of 16,000 people, with four census tracts of 4,000 people each, composed of ethnicities at the indicated numbers.

2) Empirical evidence

This ecological correlation problem may be dispositive toward the view that in order to understand individual Asian ethnicities’ residential unevenness, disaggregation is necessary. An aggregated Asian D value, quite simply, tells us little about the individual groups’ scores. Nevertheless, additional empirical investigation adds to the picture of how much information is lost or retained through aggregation. This section undertakes five such explorations; in some of these cases, the aggregated Asian value more closely represents the patterns of the disaggregated groups.

First, though, a note about the data is necessary. In 1970, the U.S. decennial census began separately counting the four Asian groups highlighted here. Data from that census and the four subsequent decennial censuses are used, as made available through the IPUMS National Historical Geographic Information System (Manson et.al., 2017). Not all five explorations below used 1970’s and 1980’s information, however. Some problems exist with those particular data, as explained below. This project aims to understand how the dissimilarity index varies across the groups in as many situations as possible. So I employed counties as the main regional scale. Counties produce many more cases than the more-commonly-used metropolitan areas; there are approximately 3,000 in the United States. Census tracts are the primary unit scale; an average tract is approximately 4,000 people. The key ethnic groups are Koreans, Chinese, Japanese, Filipino, plus an aggregated “Asian” group which here consists of those four ethnicities but not other Asian groups. Explorations also include other ethnic groups for comparison, most frequently blacks, in an attempt to determine if empirical “markers” exist that set Asian ethnicity apart from other types of ethnicity.¹⁴⁾ The total non-ethnic-group population-rather than non-Hispanic whites, as in many studies-is used as the Y-group population. My interest is in Asian groups’ distribution in relation to the rest of the American population rather than to just whites. Problems calculating or interpreting the dissimilarity index inhere in certain cases; eliminated cases thus include zero X- or Y-group populations, single-unit regions, and regions with number of units exceeding X-group population (Voas & Williamson, 2000). I also included only people who claimed single-group ethnicity for X-group calculations. 
Finally, I did not weight (by population) in order to average across regions, as some research does; I am more interested in how the D values vary across different types of places rather than the average group member’s situation (on the hazards of identifying areal data with individual characteristics, see Openshaw, 1984; on ecological modeling more generally, see Vogt, 2007).
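The calculation and case-exclusion rules described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the paper: the function name, the input layout (one X-group count and one total population per unit), and the example county are all hypothetical, and D is reported on the 0-100 scale used in the figures.

```python
from typing import Optional, Sequence


def dissimilarity(x_counts: Sequence[int], totals: Sequence[int]) -> Optional[float]:
    """Return D (0-100) for one region, or None if the case is excluded.

    x_counts[i] is the X-group population of unit i and totals[i] is that
    unit's total population. The Y group is everyone who is not in X.
    """
    if len(x_counts) != len(totals):
        raise ValueError("x_counts and totals must have the same length")
    y_counts = [t - x for x, t in zip(x_counts, totals)]
    x_total, y_total = sum(x_counts), sum(y_counts)
    n_units = len(x_counts)

    # Exclusion rules named in the text.
    if x_total == 0 or y_total == 0:   # empty X or Y group
        return None
    if n_units < 2:                    # single-unit region
        return None
    if n_units > x_total:              # more units than X-group members
        return None

    # Standard index: half the summed absolute difference in unit shares.
    half_sum = 0.5 * sum(abs(x / x_total - y / y_total)
                         for x, y in zip(x_counts, y_counts))
    return 100.0 * half_sum


# Hypothetical county: four tracts of 4,000 people each.
county_x = [120, 40, 20, 20]
county_totals = [4000, 4000, 4000, 4000]
print(dissimilarity(county_x, county_totals))
```

Returning `None` for excluded cases keeps them out of any later unweighted average across regions.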

3) Exploration I: Average values and temporal change

The first exploration has the most cases: 78,509. It utilizes all five censuses, the aggregated Asian category and four disaggregated Asian groups, plus four comparison ethnicities. It uses the county-census tract and five other region-unit combinations: state-census tract, nation-census tract, state-county, nation-county, and nation-state; nevertheless, almost 95% of D values came from the county-census tract scale. This analysis simply shows ethnicities’ average-D-value variation by census year (Figure 3). Note, first, the overall average for each group, designated in the legend. As anticipated, the aggregated overall Asian average (32.9) does not represent the disaggregated groups well (40.4 – 48.0).¹⁵⁾ Many studies regard blacks as the most segregated group in the United States. But while blacks do indeed have a higher D value than aggregated Asians, this figure shows that their overall average is lower than that of each disaggregated Asian group. In fact, the aggregated Asian average is closer to the American Indian, black, and white averages than to any of the disaggregated Asian groups’ averages.

http://static.apub.kr/journalsite/sites/kwra/2019-052-11/N0200521101/images/geo_55_01_04_F3.jpg

Figure 3.

Dissimilarity Index Values by Ethnicity and Year, 1970-2010, for Nine US Ethnic Categories. Total ethnicity average indicated parenthetically in the legend. Source: US Census.

Among the Asian groups, Chinese values are particularly high over this five-census period. While average Korean, Japanese, and Filipino values are all within 2.5 D points of each other, the Chinese value (48.0) is more than 5 points higher than any of these groups. It is difficult to know exactly why this last value is so high; the following should be considered as merely informed speculation, which may be pursued through further research. Chinese rank high among Asian groups in growth rate and in share of foreign-born within the US, while simultaneously having relatively low incomes (Logan and Zhang, 2013). This relatively large cohort of recently arrived and less well-off people may lead to stronger concentration in poorer neighborhoods. The institutional presence of Chinatowns in many cities (an inheritance from the early decades of Chinese migration and a phenomenon lacking as widespread a parallel among other Asian groups) may also pull people toward certain neighborhoods (White et al., 2003). Additionally, the larger number of Chinese compared to other Asian groupings may increase the benefits of ethnic clustering by providing a stronger traditional cultural presence than is available to other groups. The Chinese may also lack the strong drivers that lead other groups toward greater integration with the larger US population: for Filipinos, language and cultural similarities due to US colonization; for many Japanese, long US residence; and for Koreans, the strong emphasis on education and the tendency to live in relatively wealthier neighborhoods in relation to their own modest income (Reimer, 2007; Logan and Zhang, 2013).¹⁶⁾

Figure 3’s most visibly notable pattern relates to 1980, and possibly 1970. The 1970 and 1980 data are both incomplete, omitting several, typically smaller, counties. The problem is most acute for 1970. This may push the calculated average 1970 D values up, as smaller counties tend toward lower scores. Nevertheless, analysis the author has reported elsewhere, which compares these data to data utilizing a common set of counties between 1970 and 2010 (Yorgason, 2019), suggests that the 1970 impact is likely worth no more than 3 or 4 points.

The 1980 problem is more substantial, however. Its data derive from a 17% sample rather than, like the other years, full counts. Sampling has been found to upwardly bias D values for groups with low ethnic percentages (Ransom, 2000; Napierala and Denton, 2017). Since black and white populations are many times larger than any of the other groups analyzed here, this, as well as substantial real change in those two groups, explains why only their values fell between 1970 and 1980. This sampling phenomenon may also explain why the increase in the (larger) aggregated Asian group was less in 1980 (10.8) than that of any of the (smaller) disaggregated Asian groups (12.5 – 19.7). The smallest groups, Hawaiians¹⁷⁾ and the four disaggregated Asian ethnicities, show the greatest 1970 – 1980 increase. However, perhaps the most telling comparison is between the American Indian and aggregated Asian categories. Because American Indians were a smaller group than aggregated Asians (Asians had nearly twice the population), their average D might have been expected to rise more quickly in 1980 if that group was subject to the same processes as the aggregated Asians. But the Asian score increased more. This likely suggests that the Asian influx of recent immigrants at that time (15 years after the immigration law changed), whether for reasons of discrimination or proactive attempts to live among the same ethnicity, meant that Asians retained relatively high values from 1970 into 1980, before D values dropped in 1990.¹⁸⁾ Nonetheless, the overall 1970 – 2010 aggregated Asian drop (9.8) does not wildly misrepresent the average fall among the disaggregated groups (11.3). Of course, the relatively large Filipino decline over these years (20.2) and the low Japanese decrease (2.6) are obscured by aggregation. Japanese may have already reached a distributional plateau by those years. Leading segregation theorists often emphasize different generations’ varying experiences.
That generational difference may explain the rapid and continuing fall of Filipino D values and the relative stability of Japanese scores. But Korean values, which track more closely with Japanese, especially over the past three censuses, do not fit this explanation well. The timing of their migration is more similar to Filipinos (for analysis of the relationship between country of nativity and segregation among Asian subgroups, see Iceland et al., 2011).
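The upward sampling bias attributed above to the 1980 data can be illustrated with a small simulation. The sketch below is not the census sampling design: it assumes, purely for illustration, that every person is sampled independently at a 17% rate, and it uses a hypothetical region in which the group is spread perfectly evenly (so the full-count D is 0). All function names, tract sizes, and group sizes are invented.

```python
import random
from typing import List


def dissimilarity(x: List[int], y: List[int]) -> float:
    """D on a 0-100 scale from unit counts of the X group and the Y group."""
    x_total, y_total = sum(x), sum(y)
    return 50.0 * sum(abs(a / x_total - b / y_total) for a, b in zip(x, y))


def sample_count(n: int, rate: float, rng: random.Random) -> int:
    """People retained when each of n people is sampled independently."""
    return sum(1 for _ in range(n) if rng.random() < rate)


def mean_sampled_d(x_per_tract: int, tract_size: int, n_tracts: int,
                   rate: float, reps: int, rng: random.Random) -> float:
    """Average sampled D for a perfectly even region (full-count D = 0)."""
    results = []
    for _ in range(reps):
        xs = [sample_count(x_per_tract, rate, rng) for _ in range(n_tracts)]
        ys = [sample_count(tract_size - x_per_tract, rate, rng)
              for _ in range(n_tracts)]
        if sum(xs) == 0:  # group sampled away entirely; case would be excluded
            continue
        results.append(dissimilarity(xs, ys))
    return sum(results) / len(results)


rng = random.Random(1980)  # arbitrary seed so the sketch is repeatable
small_group = mean_sampled_d(5, 1000, 30, 0.17, 10, rng)    # 0.5% of each tract
large_group = mean_sampled_d(300, 1000, 30, 0.17, 10, rng)  # 30% of each tract
print(f"small group: {small_group:.1f}  large group: {large_group:.1f}")
```

Both groups have a true D of zero here, so any positive value is produced by sampling alone; under these assumptions the inflation is far larger for the small group.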

4) Exploration II: Region-unit scales and unit disaggregation

While the 1970 and 1980 data help us understand change over time despite their problematic nature, all subsequent explorations use only 1990 – 2010 data, a period when D values shifted only modestly, as those explorations do not highlight temporal trends. The second exploration draws on David Wong’s (2003) work on disaggregating nested-hierarchical geographies. Basically, Wong mathematically demonstrates that we can calculate each type of unit’s contribution to regional D values through comparison of the D values calculated at the different unit scales. Figure 4 illustrates. Part A shows average D values for all US counties with census tracts as the unit scale. It exhibits close similarity with the overall averages reported in Figure 3, since nearly 95% of all of Figure 3’s cases were at this county-census tract region-unit scale. Part B gives average D values with state as the regional scale. Part B’s and A’s patterns of relative ethnic position differ somewhat. Most notably, unlike within counties, blacks are distributed most unevenly within states. Whites also have relatively higher dissimilarity in states. Hawaiians show the opposite: relatively high dissimilarity within counties but not states. The four disaggregated Asian groups’ patterns show little difference in Parts A & B, however. Here, again, we see that the aggregated Asian average provides a misleading representation of the four Asian ethnicities, if judging by the overall average, though not quite as severely in Part B as A.

http://static.apub.kr/journalsite/sites/kwra/2019-052-11/N0200521101/images/geo_55_01_04_F4.jpg

Figure 4.

Average D Values within Three Unit Scales, Disaggregated by Unit Scales, 1990-2010, for Nine US Ethnic Categories. A) County as region scale, with census (T)ract as unit scale; B) State as region scale with (C)ounty and census (T)ract as unit scales; C) Nation as region scale with (S)tate, (C)ounty and census (T)ract as unit scales. Source: US Census.

However, a better way to judge aggregation’s impact within this exploration, since we already know aggregation represents averages poorly, is to determine how closely the percentages contributed by the various unit scales within the aggregated Asian value match the percentages contributed among the four disaggregated groups. In other words, the contribution of counties and census tracts (and states, when the region scale is nation) can be disaggregated. These disaggregations, following Wong’s methodology, are shown in Parts B & C. Based on the values in Part B, for example, out of Koreans’ total D of 49.3 within states, 61% is contributed by counties and 39% by census tracts. According to this test, the aggregated Asian pattern is not so different from those of the disaggregated groups. Nevertheless, it falls at the extreme of the groups rather than in the middle (county/census-tract percentages within states: A: 64/36; K: 61/39; C: 64/36; J: 57/43; F: 63/37).
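One simple reading of this nested comparison can be sketched as follows. The sketch assumes that the county share is the D value computed with counties as units divided by the D value computed with census tracts as units, with the remainder attributed to tracts; this is an interpretation of the description above, not a reproduction of Wong’s (2003) formulation, which should be consulted for the actual method. The state, its counties, and all counts are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Tract = Tuple[str, int, int]  # (county id, X-group count, Y-group count)


def dissimilarity(units: List[Tuple[int, int]]) -> float:
    """D on a 0-100 scale from (X count, Y count) pairs."""
    x_total = sum(x for x, _ in units)
    y_total = sum(y for _, y in units)
    return 50.0 * sum(abs(x / x_total - y / y_total) for x, y in units)


def scale_shares(tracts: List[Tract]) -> Dict[str, float]:
    """Split a state's tract-level D into county and tract contributions.

    Assumed rule: county share = D(counties as units) / D(tracts as units).
    Tracts nest inside counties, so the county-level D cannot exceed the
    tract-level D. A tract-level D of zero is not handled in this sketch.
    """
    by_county = defaultdict(lambda: [0, 0])
    for county, x, y in tracts:
        by_county[county][0] += x
        by_county[county][1] += y
    d_county = dissimilarity([(x, y) for x, y in by_county.values()])
    d_tract = dissimilarity([(x, y) for _, x, y in tracts])
    return {
        "d_county_units": d_county,
        "d_tract_units": d_tract,
        "county_share": d_county / d_tract,
        "tract_share": (d_tract - d_county) / d_tract,
    }


# Hypothetical state with two counties of two tracts each.
state = [("A", 60, 940), ("A", 20, 980), ("B", 10, 990), ("B", 10, 990)]
shares = scale_shares(state)
print(shares)
```

Under this reading, the Korean figures quoted above would correspond to roughly 30 of the 49.3 D points arising between counties and roughly 19 within them.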

With states as regions, only modest differences across the nine ethnic categories are apparent. But as Part C shows, the differences are larger with nation as region. For example, dissimilarity among states (as the unit scale) overwhelmingly (87%) contributes to Hawaiians’ dissimilarity across the nation. This is unsurprising; most live in Hawaii and a few other states. Blacks and whites, on the other hand, are more evenly distributed across state populations. Compared to other ethnicities, counties and census tracts therefore contribute more to their unevenness across the nation. How does the aggregated Asian category represent the disaggregated Asian groups at this regional scale of nation? Not badly. Not only is the aggregated Asian average (D = 60.1) close to the disaggregated Asian groups’ averages (K: 60.6, C: 65.2, J: 65.5, F: 64.0), but the aggregated Asian disaggregated-unit-contribution percentages also fall within the disaggregated groups’ percentages (A: 74/12/14; K: 59/19/22; C: 67/14/19; J: 81/7/12; F: 81/9/10). Of course, if we only use aggregation, we might never suppose that Korean patterns here in Part C differ substantially from Japanese and Filipino patterns (which are more similar to Hawaiian patterns) and perhaps more nearly resemble black and white patterns. That is, compared to the other Asian groups, Koreans have proven relatively more willing to distribute themselves across the 50 states. This has perhaps been enabled by their quite small early presence and rapid growth in an era of easier transportation and information exchange, along with their relatively strong focus on educational opportunities, which are broadly distributed across US states.

5) Exploration III: Influence of regional size and ethnic proportion

The final three explorations use only the county-census tract region-unit scale, once again for 1990-2010, so as to compare relatively similar areal units. These data encompass well over 5,000 cases for each ethnic group. The explorations build on awareness within the literature that several factors associate with variation in D values. Given space limitations, only a few variables are explored here. This third exploration looks at two in particular detail. First, dissimilarity-index research has long recognized that larger regions have larger average D values than smaller regions (for example, Simpson, 2007). I use units per region to compare the effect of the region’s size on D scores for the disaggregated and the aggregated Asian groups. Figure 5 indicates that D values generally rise with increasing number of units in the region. This pattern holds across most ethnic groups. But the increase for blacks over the size range is much more rapid than for other ethnicities (almost 35 points). The gain is milder for the aggregated Asians (about 15 points) and American Indians (12 points), which, by-and-large, have similar slopes.

Leaving aside differences in averages, how well does the aggregated Asian group represent the disaggregated Asian ethnicities in terms of shape/slope of the curve? To assess this, I first calculated the difference between each group and the aggregated Asian group at each size level. Then, by calculating the standard deviation of those values for each group, we can determine the closeness of their slope to the aggregated Asian slope. According to this measure, the aggregated Asian group’s slope best represents the Korean slope (st. dev. = 0.6; for comparison, for American Indians the standard deviation is 1.1). However, the Asian slope does not represent the other Asian groups as well (Chinese = 3.5, Japanese = 2.0, Filipino = 2.2; for comparison, black = 5.6). This misrepresentation seems systematic for these three groups; the largest problems come in regions with about 6-13 units as well as those with over 125 units. Stated somewhat differently, Chinese and Filipino D values essentially stop rising once a region exceeds about 8 units, while Japanese values increase only extremely modestly. Thus the aggregated Asian slope is closer to the slope of another ethnic group (American Indians) than to three of its four constituent ethnicities. Since the aggregated Asian slope also systematically misrepresents those three groups’ slopes, aggregation’s utility here is rather questionable.
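The slope-closeness measure described above can be sketched as follows. The curves are hypothetical values invented for illustration, not figures from the data, and the choice of the population standard deviation (`statistics.pstdev`) rather than the sample version is an assumption, since the text does not specify which was used.

```python
from statistics import pstdev
from typing import List


def slope_closeness(group_curve: List[float], reference_curve: List[float]) -> float:
    """Standard deviation of the point-by-point gap between two D curves.

    A result of 0 means the curves are parallel (a constant gap), however
    far apart their averages are; larger values mean the shapes diverge.
    """
    gaps = [g - r for g, r in zip(group_curve, reference_curve)]
    return pstdev(gaps)


# Hypothetical average D values at four region-size levels.
aggregated = [20.0, 25.0, 30.0, 35.0]
parallel_group = [30.0, 35.0, 40.0, 45.0]  # higher, but same shape
flat_group = [40.0, 40.0, 40.0, 40.0]      # does not rise with region size

print(slope_closeness(parallel_group, aggregated))
print(slope_closeness(flat_group, aggregated))
```

Because the measure ignores the constant part of the gap, it isolates differences in shape from the differences in average level already discussed in Exploration I.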

http://static.apub.kr/journalsite/sites/kwra/2019-052-11/N0200521101/images/geo_55_01_04_F5.jpg

Figure 5.

Relationship between D Values and Number of Units per Region, 1990-2010, for Seven US Ethnic Categories at County-Census Tract Region-Unit Scale. Source: US Census.

A second variable affecting D values is the region’s minority proportion (Johnston et al., 2004) (Figure 6). The overall pattern here is complex. D values generally increase from the lowest minority proportions, peak at one ethnic member per between 2,500 and 3,000 people, slowly fall again to their lowest values at about 1 per 400, and then rise a bit thereafter.¹⁹⁾ According to the standard deviation test employed above, the Asian curve more closely resembles that of another ethnicity (blacks, in this case: 2.4) than that of any of its four constituent ethnicities. The aggregated Asian curve again most closely matches the Korean curve among the four Asian ethnicities (2.9), followed closely by Filipinos (2.9), Japanese (3.5), and Chinese (3.7).²⁰⁾ However, the misrepresentation through aggregation here is less systematic than for regional size. It is most pronounced at the highest and lowest proportions, and less problematic in the middle ranges. We might conclude that the aggregated Asian values here do not poorly represent the constituent ethnicities’ average slope. But aggregation does obscure differences among the constituent groups: Korean D values are lowest at about 1 per 600 people, Chinese at about 1 per 400, and Japanese and Filipino at about 1 per 200.

http://static.apub.kr/journalsite/sites/kwra/2019-052-11/N0200521101/images/geo_55_01_04_F6.jpg

Figure 6.

Relationship between D Values and Proportion of Ethnic Group within the Region, 1990-2010, for Seven US Ethnic Categories at County-Census Tract Region-Unit Scale. Proportion of ethnic group measured as one out of indicated number of people on the x-axis. Source: US Census.

6) Exploration IV: Relative influence of explanatory factors according to regression

The next exploration employs the same data as Exploration III. It creates a multiple linear regression model for each ethnic group to account for D. Based on my research reported elsewhere (Yorgason, 2019), dummy independent variables are census year and US state; continuous independent variables include units per region and ethnic proportion (as in Exploration III), as well as total regional population, ethnic regional population, and population per unit. These continuous variables were also grouped into categories of regional size (units and total population) and regional ethnicity (regional ethnicity population and proportion), as well as a category combining all the continuous variables together. Table 2 shows the decrease in adjusted R² values when factors are removed individually from the regression equations. This is an approximate, if imperfect, proxy for the importance of each variable in explaining D values.
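The removal procedure can be sketched in plain Python: fit the full model, refit without one variable (or category of variables), and record the fall in adjusted R². The sketch is deliberately simplified. It uses only two continuous predictors and eight invented cases, whereas the models described above also include year and state dummies; the function names and all numbers are hypothetical.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a * beta = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - tail) / m[r][r]
    return beta


def adjusted_r2(rows: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """Fit ordinary least squares with an intercept; return adjusted R²."""
    design = [[1.0] + list(r) for r in rows]
    n, k = len(design), len(design[0])  # k includes the intercept
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * v for d, v in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / n
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


def drop_when_removed(rows, y, columns: List[int]) -> float:
    """Decrease in adjusted R² after deleting the given predictor columns."""
    kept = [[v for j, v in enumerate(r) if j not in columns] for r in rows]
    return adjusted_r2(rows, y) - adjusted_r2(kept, y)


# Invented cases: units per region, ethnic proportion (%), and D value.
units = [2, 5, 9, 14, 20, 27, 35, 44]
proportion = [3, 1, 4, 1, 5, 9, 2, 6]
d_values = [21, 22, 25, 27, 31, 35, 37, 43]
rows = list(zip(units, proportion))

drop_units = drop_when_removed(rows, d_values, [0])
drop_prop = drop_when_removed(rows, d_values, [1])
print(f"units: {drop_units:.3f}  proportion: {drop_prop:.3f}")
```

Because adjusted R² penalizes each additional predictor, removing a variable that explains almost nothing can raise it slightly, which is how a small negative entry in such a table can arise.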

Table 2. Change in Linear Regression Adjusted R² with the Individual Removal of Independent Variables and Variable Categories, 1990-2010, at County-Census Tract Scale, for Six US Ethnic Categories.

Variable or category removed	Korean	Chinese	Japanese	Filipino	Asian	Black
Overall adj. R² (full model)	.235	.289	.269	.268	.226	.400
Year	.000	.042	.002	.048	.010	.029
States	.074	.072	.117	.049	.080	.087
All continuous variables	.098	.098	.060	.099	.115	.205
Regional size	.055	.035	.028	.050	.041	.044
Units	.046	.025	.028	.028	.038	.036
Total population	.013	.002	.015	.015	.018	.018
Regional ethnicity	.036	.035	.003	.038	.028	.038
Ethnicity population	.005	.000	.001	.001	.000	.008
Ethnicity proportion	.009	.030	-.001	.026	.025	.033
Population/unit	.002	.026	.003	.003	.000	.003

All models significant at .000 level. All figures in the body of the table indicate the decrease in Adj. R² when categories removed from the models.²²⁾ Source: US Census.

As shown by the relatively low overall adjusted R² values for each ethnic group, much is “unexplained” by the models. The regressions account somewhat better for D variation within counties for blacks than for any Asian group. Nevertheless, a few conclusions regarding aggregation can be drawn. First, the overall adjusted R² for aggregated Asians does not fit the disaggregated groups well. Second, however, the aggregated Asian values (as seen through changes to adjusted R²) provide adequate averages of the disaggregated Asian groupings across many variables. The aggregated Asian values fail to represent the groups closely only for ethnicity proportion, population per unit, the all-continuous-variables category, and perhaps year. Third, as with previous explorations, even adequate averages often mask significant variation among Asian groups. Supporting Figure 3, year affects Filipino and Chinese D values much more strongly than Korean or Japanese. Japanese and Filipinos are relatively far apart on states’ effects. The continuous variables together affect Japanese D scores less than those of other groups.²¹⁾ Regional size is somewhat more important for Koreans and Filipinos. Regional ethnicity matters little to Japanese results. And Chinese stand out for the importance of population per unit. Most of these effects are not large. Nevertheless, except for the category of all continuous variables, all these differences noted between Asian groups are larger than the differences between the aggregated Asians and blacks. In other words, these regressions may indicate that important differences exist between Asian groups. But aggregation covers up those differences.

7) Exploration V: Influence of geography (states)

The final exploration maps the effect of states on D, based on each state’s B values in Exploration IV’s regressions (Figure 7). The higher the expected D value for the state’s counties after regression, the darker the state appears. Differences can exceed 20 points. The maps illustrate several points. For one thing, state patterns clearly differ between blacks and aggregated Asians. The highest D values for blacks form a funnel from north to south, with the states surrounding the Great Lakes as the wide section and the lower Mississippi area as the narrow section. High values for aggregated Asians, by contrast, reside particularly in Appalachia and the Deep South/eastern Southwest. Asians have substantially lower values in the Upper Midwest, while the Deep South is relatively low for blacks. The two groups share relatively low D values in New England and the West, though values for Asians are somewhat lower in the latter.

Additionally, patterns differ more between blacks and aggregated Asians than between aggregated and disaggregated Asians. On this count, aggregation does not mislead. While differences exist by nationality, the aggregated Asian map represents overall patterns well. For example, Hawaii and Nevada (and Colorado and West Coast states, to a lesser extent) have low D values among both aggregated Asians and most of the disaggregated groups. The aggregated Asian map also usefully represents high scores among disaggregated groups in the South generally, Northern Appalachia, and the geographic outlier of North Dakota. Nevertheless, aggregation here once again obscures differences among disaggregated Asians. The Japanese, perhaps, have the most distinctive geographic patterns. Interestingly, Japanese D values in the West, except for Hawaii, are even lower than those of other Asian groups. And unlike for other Asian groups, North Dakota is also relatively low for Japanese. On the other hand, Japanese have relatively high D scores in several high-population-concentration states in the Northeast, such as New York, New Jersey, and Pennsylvania. And while Appalachia shows high values for all Asian groups, it does so especially for the Japanese. Other distinctive patterns include: Koreans’ relatively low Midwest and Northeast and high Deep South scores; low Florida values for Japanese and Chinese; and high Alaska scores for Koreans and Filipinos. Overall, however, this exploration of state geography shows less damage from aggregation compared to most of the other explorations.

http://static.apub.kr/journalsite/sites/kwra/2019-052-11/N0200521101/images/geo_55_01_04_F7.jpg

Figure 7.

Post-Regression Average D Values by State, 1990-2010 for Six US Ethnic Categories at the County-Census Tract Scale. D values represent B scores from regression. Source: US Census.

5. Discussion and Conclusion

Please recall, as noted in Section 2, that the dissimilarity index measures only one aspect of segregation: unevenness of ethnic distribution within areal units (“regions” in this paper). The five explorations above produce quantitative results about how aggregation affects our understanding of this unevenness among Asian groups. Nevertheless, evaluation of the impact of aggregating within dissimilarity-index research is necessarily subjective. Table 3 summarizes the explorations. The table notes key tests performed, a subjective evaluation of how closely the aggregation represents the disaggregated groups’ values, whether the tests show closer correspondence of the aggregated Asian average to the disaggregated Asian groups or instead to altogether different ethnicities, and important inter-Asian differences that aggregation cannot capture.

Table 3. Aggregation of Asian Groups within the Dissimilarity Index: Summary of Explorations.

Exploration	Key Test	Representativeness	Ethnic Similarity	Key Inter-Asian Disparities Concealed
I	A) Aggregated average B) Temporal trends	A) Poor B) Adequate	A) Other B) Asians	A) High Chinese D value B) Stability of Japanese and decrease in Filipino D values, 1970-2010
II	Disaggregated region-unit scale percentages	Good	Asians	Low Korean contribution of states to D values at the region scale of nation
III	Influence of regional size and ethnic proportion on D	Poor/Adequate	Other	D values lowest at different ethnic proportions for different Asian groups
IV	Comparative impact of independent variables on D	Adequate/Good	Mostly Asians	Multiple modest differences
V	Geography of D	Good	Asians	Japanese geographic uniqueness

Within Exploration I (Figure 3), the overall aggregated Asian average D value across all the data available over five censuses poorly represents individual Asian groups’ D values; it is closer to three of the four non-Asian groups calculated. Although three of the four disaggregated Asian groups have similar D values, this (or any) aggregation cannot hint that Chinese values are significantly higher. Additionally, Exploration I’s temporal trend line for the aggregated Asians reveals neither the general stability of Japanese D values across the five censuses nor the strong decrease in the Filipino line; it does adequately fit overall temporal trends among the disaggregated groups, however. Aggregation’s problems are less severe in Exploration II (Figure 4), which calculated the relative amount of dissimilarity contributed by the various unit scales at the three region scales. With the nation as the region scale, in particular, aggregated Asian patterns differed substantially from the non-Asian groups and represented the disaggregated Asian groups quite well. Nevertheless, aggregation obscured the Korean pattern of relatively low state-unit contribution to nation-region dissimilarity.

Exploration III (Figures 5 and 6) charted D values’ variation according to units per region and ethnic proportion. With the former, the aggregated value conformed well to the Korean pattern, but less so to those of other Asian groups. Overall, the curve paralleled the American Indian trend line better than Chinese, Japanese, or Filipino lines. For ethnic proportion, however, the aggregated Asian trend better matched the pattern for blacks than for any of the four individual Asian ethnicities. While aggregation usefully shows a dip in D values at certain ethnic proportions, one that obtains in all Asian groups, aggregation cannot reveal that the location of this dip differs somewhat among the groups. Nor can it show that, unlike for most other ethnicities, Chinese, Japanese, and Filipino D values essentially do not rise after a certain point with increasing regional size. Exploration IV (Table 2) used regression to compare the relative impact of various independent variables on D values. With some exceptions, the aggregation represented the disaggregated groups closely. Nevertheless, the differences between disaggregated Asian groups are often larger than the difference between aggregated Asians and blacks. Among numerous, though generally small, differences concealed through aggregation, perhaps most significant is the high importance of states for variation in D values among Japanese. Exploration V (Figure 7) confirms this pattern; the Japanese map has relatively more extreme values than the other ethnicities. In fact, among Asian groups the geography of D values seems to differ most between Japanese and Koreans. Nevertheless, the distinction here between Asian ethnicities (including aggregated Asians) and blacks is quite apparent. Thus a clear Asian ethnic marker exists in state patterns. Additionally, the aggregated Asian category fairly represents the combined patterns of the disaggregated groups.
Overall then, across the five explorations, the empirical evidence relating to the utility and problems of aggregation in dissimilarity-index research is mixed.

Nonetheless, I recommend disaggregation, wherever possible and practical, for two major reasons. First, aggregation has an inherent problem that cannot be overcome. Unless one also disaggregates, one cannot know which differences between disaggregated groups are being concealed. This is particularly important in residential decisions, where, as argued above, people of Asian ethnicities in the United States likely operate not only as objects but also active subjects. Asians may at times be acted upon as racialized objects of segregation by other Americans in ways that structure residential options beyond their full control. But they also have agency within this structure, and likely use it in part by valuing ethnic connections. These ties are far more apt to prioritize specific Asian nationality than pan-Asian American identity, I believe. Only disaggregation opens a window onto this agency.

Second, one element of the empirical evidence supersedes the others. This is the fact that, because of the way D is calculated, the most basic aggregated Asian average value creates an ecological correlation problem and in all probability represents disaggregated groups’ D values poorly. This paper illustrates this fact hypothetically and verifies it empirically. This matters more than any of the other evidence because few consumers of dissimilarity-index results ever see more than these simple averages. This article’s additional explorations, though important, I believe, likely only interest specialists in residential demography. Most discussions of D, especially in public policy, only pay attention to basic averages. So when a key conclusion of the simple, aggregated average is that Asians are less segregated than blacks in American society, important nuance is missing. Many might think that is the end of the story. What is obscured is that the Asian D value is low partly because the constituent Asian groups’ residential patterns are dissimilar from one another, and that these constituent groups individually exhibit higher D values than blacks across most counties.
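The mechanism can be seen in a deliberately stylized case, separate from the paper’s own hypothetical illustration. The sketch below assumes a region of four tracts with 1,000 residents each and two invented groups that live in entirely different tracts; the group names and counts are hypothetical, and D is computed against the total non-group population, as in the explorations above.

```python
from typing import List


def dissimilarity(x_counts: List[int], totals: List[int]) -> float:
    """D on a 0-100 scale, with Y defined as everyone not in the X group."""
    y_counts = [t - x for x, t in zip(x_counts, totals)]
    x_total, y_total = sum(x_counts), sum(y_counts)
    return 50.0 * sum(abs(x / x_total - y / y_total)
                      for x, y in zip(x_counts, y_counts))


totals = [1000, 1000, 1000, 1000]
group_a = [40, 0, 40, 0]   # lives only in tracts 1 and 3
group_b = [0, 40, 0, 40]   # lives only in tracts 2 and 4
combined = [a + b for a, b in zip(group_a, group_b)]

d_a = dissimilarity(group_a, totals)
d_b = dissimilarity(group_b, totals)
d_combined = dissimilarity(combined, totals)
print(f"A: {d_a:.1f}  B: {d_b:.1f}  aggregated: {d_combined:.1f}")
```

In this extreme case each group is markedly uneven on its own, yet the aggregated category is perfectly even, because the two groups’ concentrations offset each other once their counts are added together.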

Ideally, dissimilarity-index researchers should both aggregate and disaggregate. For one thing, doing so will illuminate particularities of groups typically discussed jointly: high average Chinese D values, Korean region-unit scale patterns that resemble those of blacks and whites nearly as closely as those of other Asians, or extremely geographically differentiated Japanese values, for example. I have attempted in this paper to relate several of these peculiarities to the different migration histories of the various groups. Nonetheless, since disaggregated analysis of several types has been performed for the first time in this research, it proved more difficult to account for other patterns. Hopefully, this paper provides impetus to look more deeply into these unexplained differences. Secondly, if done carefully, employing both aggregation and disaggregation can also hint toward whether, how strongly, and at what rate different Asian nationality groups are creating a larger pan-Asian American identity. Using both, additionally, may help elucidate the shifting balance between minority groups as objects and subjects within residential segregation. Ultimately, incorporating both aggregation and disaggregation in dissimilarity-index research helps us to better understand how category construction affects D values. Too often such values can appear to be straightforward and generally comparable.

All of these objectives, incidentally, are also facilitated through greater balance between using whites and other non-X populations as the Y population in dissimilarity-index research. The present research goes somewhat against convention by not using whites as the Y-comparison group. While white/non-white segregation continues to be an issue, I believe that understanding of the distribution of ethnic groups within the United States will benefit from moving away from the white-centered view of segregation that has dominated historical understanding of the phenomenon.

This paper argues that both geographic (regions and units) and ethnic categorization affect results and require interpretive contextualization. In this respect, different patterns may be found at higher and lower scales of a nested relationship, as work on ecological correlations has shown. Within dissimilarity-index research the role of ethnic nesting has been overlooked, and this oversight easily leads to an ecological fallacy in interpreting D values. It is not just an issue of which ethnicities are included in aggregation, but also the different results that exist at the different scales of the nesting. This article has shown that larger-scale aggregation of ethnic groups does more than obscure important differences between the groups, though that fact is important to note. Because of the way D is calculated, aggregation also makes some large- (ethnic-) scale values meaningless when applied to the smaller groups. It is hoped that further research will build on and make appropriate use of these insights with other geographic and ethnic cases.

Notes

1) The focus of this study is on four Asian-American groups and the consequences of aggregation of those groups. Many Asian groups are omitted from this analysis, but could also be dealt with in a similar manner. Likewise, the ideas in this article could be directed toward other broad ethnic/racial groups and their subgroups: in the US context, especially Hispanics, blacks, and American Indians, but even Pacific Islanders and others. This paper’s specific ethnic choices relate to extended-period data availability, the need within dissimilarity-index analysis for groups of a certain minimum size, the broadly comparable size of the different subgroups utilized, researcher interest, and the desire to not overwhelm the narrative (and figures) with too much data. While the author suspects that similar arguments and conclusions would apply to other cases, it is hoped that empirical research (beyond what is pointed to here) will be undertaken to verify and compare the conclusions drawn here.

2) Iceland et al. (2002, 119) concisely identify the five generally accepted aspects of residential segregation: “... evenness involves the differential distribution of the subject population, exposure measures potential contact, concentration refers to the relative amount of physical space occupied, centralization indicates the degree to which a group is located near the center of an ... area, and clustering measures the degree to which minority group members live disproportionately in contiguous areas.”

3) Even the U.S. Census Bureau participated in this racializing process. When it finally began to recognize distinctions between various Asian groups with the 1970 decennial census, it categorized Chinese, Japanese, and Koreans, for example, as races rather than ethnicities (it used the latter for the finer categories of white ancestral heritage). However, “race” is a profoundly problematic term. I prefer to use “ethnicity” to refer to key American social groupings, though that term has weaknesses as well. I will use “nationality” or “race” hereafter only when necessary to make my points clearly and will use “ethnicity” otherwise. In the United States context the distinction between the terms is often blurry (Robbin, 1999; Mateos, 2014).

4) It is noteworthy that Duncan and Duncan (1955) already recognized problems of the definition of spatial units (and noted their indebtedness to geographers on this issue, an issue that would be developed later through the work of Openshaw [1984] and others through the Modifiable Areal Unit Problem) within segregation indexes and their interpretation. Yet their paper did not anticipate the issue dealt with here: the problematic construction of ethnic categories.

5) Dissimilarity index values depend heavily on researcher choices: of geographical units, ethnic categories, comparison groups, etc. Thus while general patterns may hold from one study to another with different choices (though not always), precise values should not be strictly compared.
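As a minimal illustration of this dependence on areal units, the following Python sketch assumes the standard Duncan and Duncan (1955) formulation, D = 1/2 Σ|x_i/X − y_i/Y|, reported on a 0-to-1 scale rather than 0-to-100. The function name and all counts are hypothetical and are not drawn from this study's data.

```python
def dissimilarity(x, y):
    """Index of dissimilarity between two sets of counts over the same areal units (0 to 1)."""
    total_x, total_y = sum(x), sum(y)
    return 0.5 * sum(abs(xi / total_x - yi / total_y) for xi, yi in zip(x, y))


# Hypothetical counts for two groups in four small areal units.
group_x = [90, 10, 10, 90]
group_y = [10, 90, 90, 10]

# Same people, but neighbouring units merged pairwise into two larger units.
coarse_x = [group_x[0] + group_x[1], group_x[2] + group_x[3]]
coarse_y = [group_y[0] + group_y[1], group_y[2] + group_y[3]]

d_fine = dissimilarity(group_x, group_y)      # strongly uneven at the fine scale
d_coarse = dissimilarity(coarse_x, coarse_y)  # perfectly even at the coarse scale

print(f"four units: D = {d_fine:.3f}; two merged units: D = {d_coarse:.3f}")
```

In this contrived case the same population yields a D of about 0.8 with four units and 0 with two, which is why only broad patterns, not precise values, should be compared across studies that draw units differently.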

6) The option of identifying with multiple races rather than just a single race was available to U.S. Census respondents starting in 2000.

7) See also Brown and Chung (2006) for a different use of the locational quotient within segregation research.

8) These figures omit those claiming multiple-race identity; including them adds 21% to Chinese and Korean counts, 31% to Filipino, and 71% to Japanese. The differences between groups likely stem from much longer average residence among those of Japanese ancestry as well as, perhaps, easier linguistic and inter-marriage assimilation for Filipinos.

9) Perhaps this lack of recognition is partly due to the fact that the dissimilarity index is mathematically size-invariant. In other words, increasing the minority numbers within regions and units by any particular multiplier will not change results. Results reported by Logan and Zhang (2013) and by Iceland et al. (2014) accord with the ideas presented in this section, though neither paper conceptually explains the issue. Logan and Zhang make two assumptions about the conditions under which a pan-ethnic Asian category would be helpful. First, they correctly posit, Asian nationalities would have low dissimilarity values in relation to each other. They secondly assume that a useful pan-ethnic Asian category would have a D value around which the constituent Asian ethnicities’ D values vary. As explained here, this second condition is technically possible, but extremely unlikely in practice, since it requires fully even distributions among the Asian groups. Thus their argument points in the right direction but does not fully explain why the aggregated value is never likely to be a useful representation of the disaggregated values.
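The size-invariance property can be checked with a short Python sketch. It assumes the standard formulation D = 1/2 Σ|x_i/X − y_i/Y| on a 0-to-1 scale; the function name, the multiplier, and the counts are hypothetical illustrations rather than values from this study.

```python
def dissimilarity(x, y):
    """Index of dissimilarity between two sets of counts over the same areal units (0 to 1)."""
    total_x, total_y = sum(x), sum(y)
    return 0.5 * sum(abs(xi / total_x - yi / total_y) for xi, yi in zip(x, y))


# Hypothetical counts in four areal units.
minority = [10, 40, 30, 20]
comparison = [400, 300, 200, 100]

# Multiply every minority count by the same factor: each unit's share of the
# minority total is unchanged, so D should not move.
tripled_minority = [3 * count for count in minority]

d_original = dissimilarity(minority, comparison)
d_tripled = dissimilarity(tripled_minority, comparison)

print(f"original: {d_original:.3f}; tripled minority: {d_tripled:.3f}")
```

Because D depends only on each unit's share of a group's total, the group's absolute size drops out of the calculation, which may be one reason the consequences of combining groups of different sizes and distributions have drawn little attention.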

10) It is not exactly the same issue that Peach (2009) identifies with the multigroup entropy index, whose measures of diversity can be manipulated higher or lower depending on whether one aggregates. It is perhaps closer to the set of problems identified by Holt et al. (1996) and by Wong (2003), though their analyses (rather differently) target spatial aggregation rather than the ethnic aggregation focused on here. But similar principles apply in that aggregation affects the index’s outcome.

11) This may apply more broadly, since several segregation indices, like D, use some form of deviations from means, summed over areal units (for descriptions of major indices, see Iceland et al., 2002). For example, like the dissimilarity index, the Gini coefficient measures unevenness within areal units, and ranges between zero and one for low-to-high levels of unevenness. For the hypothetical example in Figure 2D below, it returns values of 41.0, 37.9, 44.2, and 33.1 for Koreans, Chinese, Japanese, and Filipinos, respectively. But its aggregated Asian value is a not-very-representative 32.1.

12) Significantly, Iceland et al. (2014) found that disaggregated Asian groups’ dissimilarity from one another within US metropolitan areas is virtually as high as from whites.

13) A high aggregated value does tell us more than a low one in one way, however. The former implies that the average of the disaggregated values is very likely somewhat higher (and cannot be lower).
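A small Python sketch may make this bound concrete. It assumes the standard formulation D = 1/2 Σ|x_i/X − y_i/Y| (0-to-1 scale) and, importantly, a single fixed comparison population Y for every group; under that assumption the triangle inequality guarantees that the aggregated D cannot exceed the population-weighted mean of the subgroup values. The sketch does not model a non-X comparison group that changes with each X, and the group labels and counts are hypothetical.

```python
def dissimilarity(x, y):
    """Index of dissimilarity between two sets of counts over the same areal units (0 to 1)."""
    total_x, total_y = sum(x), sum(y)
    return 0.5 * sum(abs(xi / total_x - yi / total_y) for xi, yi in zip(x, y))


# Hypothetical subgroups, each concentrated in a different areal unit.
groups = {
    "A": [80, 10, 5, 5],
    "B": [5, 80, 10, 5],
    "C": [5, 5, 10, 80],
}
comparison = [250, 250, 250, 250]  # fixed Y population, evenly spread (assumption)

d_by_group = {name: dissimilarity(counts, comparison) for name, counts in groups.items()}

# Aggregate the subgroups unit by unit, as a pan-ethnic category would.
aggregated = [sum(unit_counts) for unit_counts in zip(*groups.values())]
d_aggregated = dissimilarity(aggregated, comparison)

# Population-weighted mean of the disaggregated values.
total = sum(aggregated)
weighted_mean = sum(sum(counts) / total * d_by_group[name] for name, counts in groups.items())

print(f"aggregated D = {d_aggregated:.3f}; weighted mean of subgroup D = {weighted_mean:.3f}")
```

Here each subgroup is highly uneven, yet their concentrations partly offset one another once combined, so the aggregated value falls well below every subgroup's value.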

14) Analysis does not include Hispanics, or any of that broad category’s subgroups, despite their conventional inclusion in cross-ethnic segregation analysis. Part of that decision stemmed from wanting to compare Asians to other “smaller” ethnic categories such as American Indians and Hawaiians/Polynesians. Part relates to the problematic way Hispanics are regarded within American society and the US Census: a single ethnicity that consists of multiple races and even more nationality backgrounds. Rather, analysis focused on ethnic categories the US Census identifies as separate “races” (despite the problematic nature of that term). Nevertheless, including Hispanics in an analysis such as this should be fruitful for future research.

15) The aggregate average also is typically lower for aggregated Hispanics than for individual Hispanic groups, according to the calculations of Iceland et al. (2014). They found, for example, that in 2010, the average aggregated Hispanic D value in relation to non-Hispanic whites as the Y-group (under somewhat different data decisions than the present study) was 49.4. All the individual Hispanic groups had higher values: Mexicans 50.3, Puerto Ricans 51.9, Cubans 59.7, Salvadorans 65.4, and Dominicans 72.4 (a similar pattern also obtained for Hispanics with the other years and Y-groups they calculated).

16) This project’s different inputs likely led to the relatively high Japanese values here compared to Logan and Zhang’s (2013) analysis (Table 1). This may relate partly to higher multiple-ethnicity identity among Japanese than other Asian groups. Japanese might also have relatively different metropolitan and non-metropolitan dissimilarity profiles compared to others (Logan and Zhang included those reporting multiple ethnicity in 2000 and 2010 and focused exclusively on metropolitan areas). The fact that Logan and Zhang weighted averages by MSA ethnic population may also reduce scores. Japanese Americans are unusually concentrated in a few areas, including some with quite low Japanese D values. These include Orange County, California; Santa Clara County, California; King County, Washington; and San Diego County, California. Interestingly, however, the values for their two highest-population counties (Honolulu County, Hawaii, and Los Angeles County, California) are not particularly low.

17) This label is an abbreviation for Hawaiians and other Polynesians.

18) A rough estimate, extrapolated from the comparison groups, is that approximately 12 D-points of the 1980 increase among the four disaggregated nationalities are due to the sampling issue (perhaps a bit less for aggregated Asians). If accurate, only Japanese showed more than a modest or negligible increase in real unevenness in 1980.

19) Black and American Indian curves result from relatively few observations at lower proportions—since categories were based on Asian groups’ distributions—and may not be fully reliable in that range.

20) The reasons that Korean values are closest to, and Chinese values farthest from, Asian patterns according to this standard-deviation test for both regional size and ethnic percentage are unclear to the author and would require further research to explain.

21) Since the continuous variables’ effects on D were non-linear, SPSS curve transformations created more approximately linear relationships. Each transformation (for each variable-ethnicity pairing) was done separately. Readers may contact the author for additional details on regression procedures and results.

22) The causes of the extreme Japanese effects (high for states, and low for the continuous variables), as well as other major differences between Asian groups in this table, are unclear and deserve further study. One might hypothesize that they relate to the relatively long tenure of Japanese in the USA as well as, perhaps, their strong western concentration, but the mechanisms mediating between these factors and these effects are far from clear.

References

Aspinall, P. J., 2003, Who is Asian? A category that remains contested in population and health research, Journal of Public Health Medicine, 55(2), 91-97.

10.1093/pubmed/fdg021 (PMID 12848395)

Aspinall, P. J., 2005, The operationalization of race and ethnicity concepts in medical classification systems: issues of validity and utility, Health Informatics Journal, 11(4), 259-274.

10.1177/1460458205055688

Aspinall, P. J., 2009, The future of ethnicity classifications, Journal of Ethnic and Migration Studies, 35(9), 1417-1437.

10.1080/13691830903125901

Brown, L. A. and Chung, S.-Y., 2006, Spatial segregation, segregation indices and the geographical perspective, Population, Space and Place, 12, 797-817.

10.1002/psp.403

Catney, G., 2017, Towards an enhanced understanding of ethnic group geographies using measures of clustering and unevenness, The Geographical Journal, 183, 71-83.

10.1111/geoj.12162

Daniels, R., 1992, Asian America: Chinese and Japanese in the United States since 1850, University of Washington Press, Seattle.

Denton, N. A. and Massey, D. S., 1988, Residential segregation of blacks, Hispanics, and Asians by socioeconomic status and generation, Social Science Quarterly, 69, 797-817.

Diversity within diversity, n.d., Diversity and disparities, US2010 project, Spatial structures in the social sciences, Brown University, https://s4.ad.brown.edu/Projects/Diversity/DDhab/Default.aspx

Duncan, O. D. and Duncan, B., 1955, A methodological analysis of segregation indexes, American Sociological Review, 20(2), 210-217.

10.2307/2088328

Ellis, M., Wright, R. and Parks, V., 2004, Work together, live apart? Geographies of racial and ethnic segregation at home and at work, Annals of the Association of American Geographers, 94(3), 620-637.

10.1111/j.1467-8306.2004.00417.x

Frey, W. H. and Myers, D., 2005, Racial segregation in U.S. metropolitan areas and cities, 1990-2000: patterns, trends and explanations, Report 05-573, Population Studies Center, University of Michigan Institute for Social Research.

Harris, R., 2014, Measuring changing ethnic separations in England: a spatial discontinuity approach, Environment and Planning A, 46, 2243-2261.

10.1068/a130021p

Harris, R., 2016, Measuring segregation as a spatial optimisation problem, revisited: a case study of London, 1991-2001, International Journal of Geographical Information Science, 30(3), 474-493.

10.1080/13658816.2015.1032973

Harris, R., 2017, Measuring the scales of segregation: looking at the residential separation of white British and other schoolchildren in England using a multilevel index of dissimilarity, Transactions of the Institute of British Geographers, 42, 432-444.

10.1111/tran.12181

Hinnershitz, S., 2013, Review of Baldoz, Rick, The Third Asiatic Invasion: Empire and Migration in Filipino America, 1898-1946, H-Empire, H-Net Reviews, May, http://www.h-net.org/reviews/showrev.php?id=38974.

Holland, K. M., 2007, A history of Chinese immigration in the United States and Canada, American Review of Canadian Studies, 37(2), 150-160.

10.1080/02722010709481851

Holt, D., Steel, D., Tranmer, M. and Wrigley, N., 1996, Aggregation and ecological effects in geographically based data, Geographical Analysis, 28(3), 244-261.

10.1111/j.1538-4632.1996.tb00933.x

Horn, A., 2005, Measuring multi-ethnic spatial segregation in South African cities, South African Geographical Journal, 87(1), 58-72.

10.1080/03736245.2005.9713827

Iceland, J., Weinberg, D. H. and Steinmetz, E., 2002, Racial and ethnic residential segregation in the United States: 1980-2000, U.S. Census Bureau, Census Special Report, CENSR-3, U.S. Government Printing Office, Washington, DC.

Iceland, J., Mateos, P. and Sharp, G., 2011, Ethnic residential segregation by nativity in Great Britain and the United States, Journal of Urban Affairs, 33(4), 409-429.

10.1111/j.1467-9906.2011.00555.x (PMID 25392601; PMCID PMC4225714)

Iceland, J., Weinberg, D. H. and Hughes, L., 2014, The residential segregation of detailed Hispanic and Asian groups in the United States: 1980-2010, Demographic Research, 31(20), 593-624.

10.4054/DemRes.2014.31.20 (PMID 26097412; PMCID PMC4472438)

Intrator, J., Tannen, J. and Massey, D. S., 2016, Segregation by race and income in the United States 1970-2010, Social Science Research, 60, 45-60.

10.1016/j.ssresearch.2016.08.003 (PMID 27712688; PMCID PMC5117629)

Johnston, R., Forrest, J. and Poulsen, M., 2001, Sydney's ethnic geography: new approaches to analysing patterns of residential concentration, Australian Geographer, 32(2), 149-162.

10.1080/00049180123731

Johnston, R. J., Poulsen, M. F. and Forrest, J., 2003, The ethnic geography of New Zealand: a decade of growth and change, 1991-2001, Asia Pacific Viewpoint, 44(2), 109-130.

10.1111/1467-8373.00188

Johnston, R., Poulsen, M. and Forrest, J., 2004, The comparative study of ethnic residential segregation in the USA, 1980-2000, Tijdschrift voor Economische en Sociale Geografie, 95(5), 550-569.

10.1111/j.0040-747X.2004.00339.x

Lee, E., 2002, The Chinese exclusion example: race, immigration, and American gatekeeping, 1882-1924, Journal of American Ethnic History, 21(3), 36-62.

Lloyd, C. D., 2016, Spatial scale and small area population statistics for England and Wales, International Journal of Geographical Information Science, 30, 1187-1206.

10.1080/13658816.2015.1111377

Logan, J. R. and Zhang, W., 2013, Separate but equal: Asian nationalities in the U.S., US2010 Project.

Manson, S., Schroeder, J., Van Riper, D. and Ruggles, S., 2017, IPUMS national historical geographic information system: version 12.0 [dataset], University of Minnesota, Minneapolis, http://doi.org/10.18128/D050.V12.0

Mateos, P., Singleton, A. and Longley, P., 2009, Uncertainty in the analysis of ethnicity classifications: issues of extent and aggregation of ethnic groups, Journal of Ethnic and Migration Studies, 35(9), 1437-1460.

10.1080/13691830903125919

Mateos, P., 2011, Uncertain segregation: the challenge of defining and measuring ethnicity in segregation studies, Built Environment, 37(2), 226-238.

10.2148/benv.37.2.226

Mateos, P., 2014, Ethnicity, language and populations, in Mateos, P. (ed.), Names, Ethnicity and Populations: Tracing Identity in Space, Springer, Berlin, 9-27.

10.1007/978-3-642-45413-4_2

Min, P. G. and Kim, C., 2013, Growth and settlement patterns of Korean Americans, in Min, P. G. (ed.), Koreans in North America: Their Twenty-First Century Experiences, Lexington Books, Lanham, Maryland, 35-56.

Napierala, J. and Denton, N., 2017, Measuring residential segregation with the ACS: how the margin of error affects the dissimilarity index, Demography, 54, 285-309.

10.1007/s13524-016-0545-z (PMID 28105579; PMCID PMC5315421)

Openshaw, S., 1984, Ecological fallacies and the analysis of areal census data, Environment and Planning A, 16, 17-31.

10.1068/a160017 (PMID 12265900)

Openshaw, S., 1983, The Modifiable Areal Unit Problem (Concepts and Techniques in Modern Geography), Geo Books, Norwich.

Patterson, W., 1988, The Korean Frontier in America: Immigration to Hawaii, 1896-1910, University of Hawaii Press, Honolulu.

Peach, C., 2009, Slippery segregation: discovering or manufacturing ghettos, Journal of Ethnic and Migration Studies, 35(9), 1381-1395.

10.1080/13691830903125885

Ransom, M. R., 2000, Sampling distributions of segregation indices, Sociological Methods & Research, 28(4), 454-475.

10.1177/0049124100028004003

Reimer, D. G., 2007, Korean culture and entrepreneurship, in Miyares, I. M. and Airriess, C. A. (eds.), Contemporary Ethnic Geographies in America, Rowman & Littlefield, Lanham, Maryland, 233-250.

Robbin, A., 1999, The problematic status of U.S. statistics on race and ethnicity: an 'imperfect representation of reality,' Journal of Government Information, 26(5), 467-483.

10.1016/S1352-0237(99)00078-7

Roces, M., 2015, Filipina/o migration to the United States and the remaking of gender narratives, 1906-2010, Gender & History, 27(1), 190-206.

10.1111/1468-0424.12097

Roseman, C. C., 2002, The changing ethnic map of the United States, in Berry, K. A. and Henderson, M. L. (eds.), Geographical Identities of Ethnic America: Race, Space, and Place, University of Nevada Press, Reno, 15-37.

Shlay, A. B. and Balzarini, J., 2015, Urban sociology, in Wright, J. D. (ed.), International Encyclopedia of the Social & Behavioral Sciences, vol. 24, 2nd Edition, Elsevier, Oxford, 926-933.

10.1016/B978-0-08-097086-8.32164-X

Shimpi, P. M. and Zirkel, S., 2012, One hundred and fifty years of 'the Chinese question': an intergroup relations perspective on immigration and globalization, Journal of Social Issues, 68(3), 534-558.

10.1111/j.1540-4560.2012.01762.x

Simpson, L., 2007, Ghettos of the mind: the empirical behavior of indices of segregation and diversity, Journal of the Royal Statistical Society. Series A (Statistics in Society), 170(2), 405-424.

10.1111/j.1467-985X.2007.00465.x

Suzuki, M., 2002, Selective immigration and ethnic economic achievement: Japanese Americans before World War II, Explorations in Economic History, 39, 254-281.

10.1006/exeh.2002.0785

Tyner, J. A., 1999, The global context of gendered labor migration from the Philippines to the United States, American Behavioral Scientist, 42(4), 671-689.

10.1177/00027649921954417

Voas, D. and Williamson, P., 2000, The scale of dissimilarity: concepts, measurement and an application to the socio-economic variation across England and Wales, Transactions of the Institute of British Geographers, 25, 465-481.

10.1111/j.0020-2754.2000.00465.x

Vogt, W. P., 1993, Dictionary of Statistics and Methodology: A Nontechnical Guide for the Social Sciences, Sage, Newbury Park, California.

Vogt, W. P., 2007, Quantitative Research Methods for Professionals, Pearson, Boston.

Wang, Y., 2011, From structure to agency: essays on the spatial analysis of residential segregation, Ph.D. Dissertation, University of Southern California.

White, M. J., Fong, E. and Cai, Q., 2003, The segregation of Asian-origin groups in the United States and Canada, Social Science Research, 32, 148-167.

10.1016/S0049-089X(02)00023-6

Wong, D. W. S., 2003, Spatial decomposition of segregation indices: a framework toward measuring segregation at multiple levels, Geographical Analysis, 35(3), 179-194.

10.1111/j.1538-4632.2003.tb01109.x

Yorgason, E., 2019, The dissimilarity index's overlooked variables: population size, scale, and proportion within ethnic residential segregation, 지역과 세계, 43(3), 293-325.

10.33071/ssricb.43.3.201912.293

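박윤환, 2011, A study of the residential segregation of the poor and foreign residents: a case study of Seoul [in Korean], 서울도시연구, 103-122.

이상일, 2007, A spatial statistical approach to residential differentiation (I): developing a spatial separation measure [in Korean], 대한지리학회지 (Journal of the Korean Geographical Society), 42(4), 616-631.

최은진·김의준, 2011, Residential segregation of foreign migrants in Seoul by nationality of origin [in Korean], 都市行政學報, 24(4), 85-107.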

Journal of the Korean Geographical Society ISSN:1225-6633(Print) 대한지리학회지
