What is data? How does it relate to statistics?
The term data refers to qualitative or quantitative attributes of a variable or set of variables. Data (plural of "datum") are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction, from which information and then knowledge are derived. Raw data, i.e. unprocessed data, are collections of numbers, characters, images or other outputs from devices that convert physical quantities into symbols.
Statistics is the study of the collection, organization, and interpretation of data. It covers every stage of that process, including planning data collection through the design of surveys and experiments.
Regression With Social Data: Modeling Continuous and Limited Response Variables / Alfred DeMaris. Hoboken, NJ : Wiley-Interscience, c2004.
HON HA31.3 .D46 2004
Using Statistics: A Gentle Introduction [Electronic Resource] / Gordon Rugg. Maidenhead, England ; New York : McGraw Hill/Open University Press, 2007.
Statistical Methods Of Analysis / Chin Long Chiang Hoboken, NJ : Wiley-Interscience, c204.
Where Do I Start?
Practical First Steps
1. Define Topic and Unit of Analysis
2. Create a specific statement of exactly the kind of data you need.
3. Identify Data Sources: Locate government agencies, organizations, and Honnold Library or Claremont Colleges commercial subscriptions.
4. Review Literature: Search the major bibliographic databases in your field.
5. Restricted Data: Some data supplied by data archives such as ICPSR are restricted. Restricted data require special security to ensure protection of confidential material. If you identify restricted data that you wish to request, please contact firstname.lastname@example.org for help.
Define Unit and Geographic Level of Analysis
Unit of coverage
- Individual Level
- Institutional Level: company, health facility, school
- Production Level: automobiles, commodities
Geographic level
- Local: city agencies
- National: federal agencies, research centers
- International: international organizations
Note: Not all data is available at the geographic level you need. Some data is only available at the state or county level.
Define Frequency & Time Series
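Do you need annual, quarterly, monthly, or daily data? Note: Some frequencies may need to be calculated from data published at another frequency, for example annual figures derived from monthly data.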
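As a rough illustration of calculating a frequency that a source does not publish, the sketch below rolls monthly values up to quarterly and annual totals. The monthly figures and the function names `to_quarterly` and `to_annual` are invented for this example. The sketch also assumes the values are flows (such as sales) that can be summed; stocks or rates would more likely call for an average or an end-of-period value.

```python
from collections import defaultdict

# Hypothetical monthly observations keyed by (year, month).
# The numbers are made up purely for illustration.
monthly = {(2020, month): 10 * month for month in range(1, 13)}


def to_quarterly(monthly_values):
    """Sum monthly values into (year, quarter) totals."""
    totals = defaultdict(int)
    for (year, month), value in monthly_values.items():
        quarter = (month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        totals[(year, quarter)] += value
    return dict(totals)


def to_annual(monthly_values):
    """Sum monthly values into one total per year."""
    totals = defaultdict(int)
    for (year, _month), value in monthly_values.items():
        totals[year] += value
    return dict(totals)


quarterly = to_quarterly(monthly)
annual = to_annual(monthly)
```

Going the other way (for example, from annual to monthly) cannot be done by simple aggregation and usually requires interpolation or additional source data.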
Series of measurements over regular intervals of time
- Cross-sectional: data collected at the same point in time for several individuals.
- Longitudinal/Panel: data collected at a sequence of time points for each of a sample of individuals.
- Time Series: data collected at a sequence of time points, usually at a uniform frequency.
- Pooled cross-sectional time series: a mixture of time series data and cross-sectional data.
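One way to see how the four structures above differ is to look at which individuals are observed in which time periods. The sketch below is a simple heuristic, not a formal definition. The function name `classify` and the sample records are hypothetical. It also assumes that a panel follows the same individuals in every period, while pooled data draws different individuals across periods. It does not check whether the periods are evenly spaced.

```python
def classify(records):
    """Guess the data structure from (individual_id, period) pairs."""
    records = list(records)
    if not records:
        raise ValueError("no observations to classify")

    # Group the individuals observed in each time period.
    units_by_period = {}
    for unit, period in records:
        units_by_period.setdefault(period, set()).add(unit)
    all_units = set().union(*units_by_period.values())

    if len(units_by_period) == 1:
        # One point in time: cross-sectional if several individuals.
        return "cross-sectional" if len(all_units) > 1 else "single observation"
    if len(all_units) == 1:
        # One individual followed over several periods.
        return "time series"
    # Several individuals and several periods.
    same_units_each_period = all(
        units == all_units for units in units_by_period.values()
    )
    if same_units_each_period:
        return "longitudinal/panel"
    return "pooled cross-sectional time series"


# Hypothetical examples: individuals "a".."d", periods given as years.
survey_one_year = [("a", 2020), ("b", 2020), ("c", 2020)]
one_city_over_time = [("a", 2018), ("a", 2019), ("a", 2020)]
same_people_each_year = [("a", 2019), ("b", 2019), ("a", 2020), ("b", 2020)]
new_sample_each_year = [("a", 2019), ("b", 2019), ("c", 2020), ("d", 2020)]
```

Calling `classify` on each of the four sample lists returns the cross-sectional, time series, longitudinal/panel, and pooled labels respectively.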