Glossary
An API (application programming interface) is a way for a person or group to access the functionality of a software or hardware platform in order to build their own applications. The Kaleidocade Indicators Framework (KIF) API provides a web service that enables administrative users to embed query results, such as dynamically updated maps, in other web applications.
An indicator is a statistical measure that provides insight into broader patterns of change. Indicators are typically used comparatively, by judging a quantitative measurement against a reference point: a historical measure, an ideal, or a comparable area or institution. KIF makes it easy for users to assemble customized collections of indicators and to develop comparisons between geographic areas.
Geocoding refers to the GIS operation that assigns latitude and longitude coordinates to a geographic reference (street address, location-based data, etc.), which can then be displayed as a feature on a map or used for analytical purposes, such as identifying corresponding geographies like census tracts or police districts. KIF uses geocoding to simplify the data retrieval process by enabling users to enter an address and be provided with a list of matching data sets from which to build indicator collections. KIF's framework is built flexibly so that clients can employ the geocoding service that is best suited to their needs.
Batch geocoding is the process whereby the geocoding of large sets of data is automated. When dealing with large quantities of data, information must be encoded in a standard format so that the records can be parsed correctly and the addresses sent to the geocoding service. If clients choose to enable the batch geocoding functionality in KIF, end-users can upload their own data sets formatted as Excel spreadsheets or comma-delimited (CSV) files.
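The batch workflow described above can be sketched roughly as follows. This is a minimal illustration, not KIF's implementation: the function batch_geocode, the "address" column name, and the lookup-table geocoder fake_geocode are all hypothetical stand-ins, and a real deployment would call whichever geocoding service the client has chosen.

```python
import csv
import io
from typing import Callable, Iterable, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


def batch_geocode(
    rows: Iterable[dict],
    geocode: Callable[[str], Optional[Coordinates]],
    address_field: str = "address",  # assumed column name
) -> list:
    """Geocode every row; unmatched rows are kept with empty coordinates."""
    results = []
    for row in rows:
        address = (row.get(address_field) or "").strip()
        coords = geocode(address) if address else None
        results.append({
            **row,
            "lat": coords[0] if coords else None,
            "lon": coords[1] if coords else None,
        })
    return results


# Stand-in geocoder for illustration only: a lookup table, not a real service.
FAKE_GAZETTEER = {"1 Main St": (40.0, -75.0)}


def fake_geocode(address: str) -> Optional[Coordinates]:
    return FAKE_GAZETTEER.get(address)


# A comma-delimited upload in a standard layout: one record per line.
SAMPLE_CSV = "id,address\n1,1 Main St\n2,Unknown Rd\n"
geocoded = batch_geocode(csv.DictReader(io.StringIO(SAMPLE_CSV)), fake_geocode)
```

Keeping unmatched records in the output, rather than dropping them, lets the person who uploaded the file see which addresses need to be corrected.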
The census tract is one of the most commonly employed levels of geographical analysis, for several reasons. The boundaries of tracts are intended to be relatively permanent, making them a stable unit for tabulating and comparing census data across years. Tracts are subdivisions of counties or county equivalents, and their boundaries typically follow permanent, visible features such as streets, roads, highways, rivers, canals, railroads, and power lines. The population within a census tract is meant to represent a relatively homogeneous group, ideally comprising about 4,000 people and 1,500 housing units. There are approximately 50,000 census tracts in the U.S., each represented by a 4-digit number and an optional 2-digit suffix that is appended when population changes require a single tract to be subdivided into new tracts.
Census block groups are subdivisions of census tracts and are the smallest unit for which the Census Bureau publishes sample data, that is, data collected from only a fraction of households. Each block group is identified by a single digit, which is also the first digit in the ID of every block located within it. There are roughly 210,000 block groups in the United States, with an optimum size of 1,500 people.
The census block is the finest scale at which the U.S. Census Bureau releases 100-percent population data, that is, data collected from every household. As their name implies, blocks frequently correspond to city blocks in urban areas, but in less densely populated areas they may be bounded by streets, streams, or the boundaries of other legal or statistical entities. Each block is identified by a 4-digit number whose first digit represents the block group in which the block is located. There are approximately 39 blocks per block group, with a total of 8.2 million blocks in the United States. The population of census blocks varies widely.
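Because the first digit of a block number names its block group, the group can be recovered from the block number alone. The short sketch below illustrates this; the function name block_group_of is hypothetical and is not part of KIF.

```python
def block_group_of(block_number: str) -> str:
    """Return the block group digit for a 4-digit census block number."""
    if len(block_number) != 4 or not (block_number.isascii() and block_number.isdigit()):
        raise ValueError(f"expected a 4-digit block number, got {block_number!r}")
    # The leading digit of the block number is the block group identifier.
    return block_number[0]


# Blocks 3000 and 3014 belong to block group 3; block 1002 belongs to group 1.
groups = [block_group_of(b) for b in ("3000", "3014", "1002")]
```

Block numbers are handled as strings rather than integers so that the fixed 4-digit width is preserved.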

A single census tract is represented by the yellow area. The subdivisions within the tract are census block groups. Census blocks are further subdivisions within census block groups; their boundaries are not publicly available.

Left: The data are visualized with a quantile class scheme. From yellow to red, the classes are as follows: 0-6, 6-8, 8-11, 11-14, 17-70.
Right: The same data are visualized with an equal interval class scheme. From yellow to red, the classes are as follows: 0-14, 14-28, 28-42, 42-56, 56-70.

When visualizing data, different class break schemes can produce vastly different representations of the same information. KIF enables users to create choropleth maps of indicators with customizable class breaks and color ramps.
  • Equal interval classes are a method of subdividing data such that the value ranges for each class are equal in size (e.g. 1-5, 6-10 and 11-15). With equal interval breaks, some classes may contain very few elements, while others may contain many.
  • Quantile classes are another method of subdividing data, such that the number of elements within each class is equal (e.g. a set of 12 elements broken into 3 groups of 4); the breaks between classes may occur at radically different intervals.
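The difference between the two schemes can be made concrete with a small sketch. The functions below are illustrative only and are not KIF's implementation; in particular, there are several conventions for placing quantile cut points, and this one simply uses the largest value in each equally sized group as the upper bound of its class.

```python
def equal_interval_breaks(values, n_classes):
    """Class boundaries that split the value range into equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(n_classes + 1)]


def quantile_breaks(values, n_classes):
    """Class boundaries that put (as nearly as possible) equal counts in each class."""
    ordered = sorted(values)
    n = len(ordered)
    breaks = [ordered[0]]
    for i in range(1, n_classes + 1):
        # Index of the last element of the i-th group: ceil(n * i / n_classes) - 1.
        last = (n * i + n_classes - 1) // n_classes - 1
        breaks.append(ordered[last])
    return breaks


def class_counts(values, breaks):
    """Count the values in each class; a value goes in the first class whose upper bound it does not exceed."""
    counts = [0] * (len(breaks) - 1)
    for v in values:
        for i in range(len(counts)):
            if v <= breaks[i + 1]:
                counts[i] += 1
                break
    return counts


# Twelve values with one extreme outlier.
VALUES = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 70]

equal = equal_interval_breaks(VALUES, 3)      # [1.0, 24.0, 47.0, 70.0]
quant = quantile_breaks(VALUES, 3)            # [1, 4, 8, 70]
equal_counts = class_counts(VALUES, equal)    # [11, 0, 1]
quant_counts = class_counts(VALUES, quant)    # [4, 4, 4]
```

With a single outlier, the equal interval scheme places eleven of the twelve values in one class and leaves another empty, while the quantile scheme fills every class evenly at the cost of a final class that spans 8 to 70.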
Geographic aggregation is the process of summarizing spatial data by larger areal units. Often, data is collected as points or addresses (think of the locations of crime incidents or the households surveyed in the decennial census) but is only made available to the public in aggregated form, as summary statistics at the level of police precincts, census tracts, or even counties. This may be done to preserve confidentiality, so that individuals are not identifiable, or simply to cut down on data management costs. To simplify the process of creating reports within KIF, users are asked to select a geographic area of interest, and KIF then presents the data available at that level of aggregation. KIF also provides some simple analysis functions that allow users to assemble data from existing geographies, such as searches within a given radius or visual overlay of alternate boundaries.
The Modifiable Areal Unit Problem (MAUP) characterizes a potential source of error when spatial data is aggregated to bounded areal units, such as census tracts or police districts. Because the data in question are generally points distributed over space, if the zones of aggregation (our "areal units") are drawn differently (that is, they are arbitrarily "modifiable"), then substantially different patterns may emerge. These patterns may simply be artifacts of the aggregation process; alternatively, real patterns in the data may be obscured by the process of aggregation. Unfortunately, researchers are often constrained by the geographic scale at which data is available (see Geographic aggregation).
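As a rough illustration of aggregation, the sketch below counts points per square grid cell. Real areal units such as tracts or precincts are irregular polygons and would require a point-in-polygon test; the square grid, the function name aggregate_to_grid, and the sample coordinates are simplifying assumptions made for this example.

```python
from collections import Counter


def aggregate_to_grid(points, cell_size, origin=(0.0, 0.0)):
    """Count points per square cell; cells are keyed by (column, row)."""
    counts = Counter()
    for x, y in points:
        col = int((x - origin[0]) // cell_size)
        row = int((y - origin[1]) // cell_size)
        counts[(col, row)] += 1
    return dict(counts)


# Four point locations, for example incident sites, in arbitrary map units.
POINTS = [(0.5, 0.5), (1.5, 0.5), (1.6, 0.4), (2.5, 2.5)]
summary = aggregate_to_grid(POINTS, cell_size=1.0)
```

Only the per-cell totals survive in the summary; the individual locations can no longer be recovered from it, which is what makes aggregation useful for protecting confidentiality.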
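The effect can be reproduced with a one-dimensional toy example: the same points are counted in zones of identical width, and only the position of the zone boundaries changes. The data and the function counts_per_zone are invented for illustration.

```python
from collections import Counter


def counts_per_zone(positions, zone_width, offset=0.0):
    """Count points per zone along a line; zone k covers [offset + k*w, offset + (k+1)*w)."""
    counts = Counter()
    for p in positions:
        counts[int((p - offset) // zone_width)] += 1
    return dict(counts)


# Seven point locations along a street, in arbitrary units.
POSITIONS = [0.9, 1.1, 1.2, 1.8, 2.1, 2.2, 3.9]

# Zoning A: boundaries at 0, 2, 4.
zoning_a = counts_per_zone(POSITIONS, zone_width=2.0, offset=0.0)  # {0: 4, 1: 3}

# Zoning B: same zone width, boundaries shifted by one unit to -1, 1, 3, 5.
zoning_b = counts_per_zone(POSITIONS, zone_width=2.0, offset=1.0)  # {-1: 1, 0: 5, 1: 1}
```

Zoning A suggests that the points are spread almost evenly between two zones, while zoning B suggests a strong cluster in the middle zone, even though the underlying points are identical.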

Top: Geographic aggregation allows detailed data (represented here as points) to be summarized by area (squares).
Bottom: When the boundaries of areal units are altered, dramatically different patterns may appear to emerge from the same data. Here, the number represents the average number of points per square.

Copyright (c) 2008 - 2010, Avencia Incorporated. All rights reserved.