Facilitators • Ed Nelson – CSU Fresno ednelson@csufresno.edu • Jim Ross – CSU Bakersfield jross@csub.edu • Campus representatives for the Social Sciences Research and Instructional Council (SSRIC)
4.
Social Science Research and Instructional Council (SSRIC) • Discipline council for the social sciences made up of representatives from each campus in the CSU. List of campus representatives can be found at http://www.ssric.org/reps • Promotes use of data analysis in research and teaching • Website is at http://www.ssric.org
5.
Social Science Data Bases • The SSRIC helps maintain and promote the use of the social science data bases in the CSU • Data bases include: – Inter-university Consortium for Political and Social Research (ICPSR) – The Field Institute – The Roper Center for Public Opinion Research • We’ll explore these data bases and how to use them at the workshop tomorrow
6.
Agenda for the Introductory SPSS Workshop • Overview of SPSS • A brief tour • Transforming data – Recode – Compute – Select If • Univariate analysis – Frequencies – Descriptives – Explore • A look ahead at the intermediate workshop
7.
Overview of SPSS • SPSS is a statistical package for beginning, intermediate, and advanced data analysis • Other statistical packages include SAS and Stata • Online statistical packages that don’t require site licenses include SDA
8.
Text – SPSS for Windows Version 13: A Basic Tutorial • Authors: Linda Fiddler (Bakersfield), Laura Hecht (Bakersfield), Ed Nelson (Fresno), Elizabeth Nelson (Fresno), Jim Ross (Bakersfield) • Available from McGraw-Hill Custom Publishing. Call 800-338-3987 to order. Request ISBN 0-07-353671-7 • Available on the web at http://www.csub.edu/~jross/projects/spss/. The data set for this workshop can be downloaded at this site
9.
Current Version of SPSS • Current version is 14.0 • Text is for version 13.0 • Text is revised every other version
Opening SPSS • Go to Start and find SPSS for Windows • Click on SPSS 13.0 for Windows to open • You’ll need to update your SPSS license every year (or your school technician will do it for you)
12.
Creating Your Own Data File • We’re not going to go through how you would create your own data file. It would take too long. But you can go to ch. 2 in the text for a thorough discussion. (Note: the slides for creating your own data file are “hidden” in this PowerPoint presentation.) • It involves creating: – Variable names – Variable labels – Value labels – Missing values
13.
Opening an Existing File • Usually you will want to open a data set that you got from someplace else such as: – ICPSR – Field Institute – Roper Center • These files will usually come in one of these forms: – SPSS portable file – SPSS data file – Raw data file with an SPSS syntax file – Raw data file without a syntax file
14.
Opening a Portable File • Click on the open yellow folder to open a new file • Change file type to .por • Browse to where the portable file you want to open is located and double click on that file
15.
Opening a Data File • Click on the open yellow folder to open a new file • Change file type to .sav • Browse to where the data file you want to open is located and double click on that file • We’re going to use the data set that comes with the text – gss02a.sav. You can download it from the web site that has the text -- http://www.csub.edu/~jross/projects/spss/
16.
Opening a Raw Data File with an SPSS Syntax File • Sometimes you will need to open a raw data file (ASCII or text) and there will be an accompanying SPSS syntax file • You will need to modify the “File Handle” and “Save Outfile” commands • See http://www.icpsr.umich.edu/help/newuser.html#05 for more information • You may need help doing this. Feel free to contact your campus SSRIC representatives or the facilitators for this workshop
17.
Opening a Raw Data File Without an SPSS Syntax File • If you don’t have an SPSS syntax file you will have to use the codebook that came with the data and create your own syntax file • You may need help doing this. Feel free to contact your campus SSRIC representatives or the facilitators for this workshop
18.
What’s Next? • Now you know how to open an existing SPSS portable or data file • Let’s do a quick overview of SPSS and then we’ll learn how to transform variables
19.
A Brief Tour of SPSS (see ch. 1 in text, pp. 5-10) • Frequencies – Analyze/Descriptive Statistics/Frequencies – Select ABANY and move it to the big box and click on OK • Crosstabs – Analyze/Descriptive Statistics/Crosstabs – Move ABANY to the “Row” box – Move SEX to the “Column” box – Click on “Cells” and select “Column” percents – Click on OK
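To see what “Column” percents mean in the Crosstabs step, here is a minimal sketch in Python (not SPSS syntax). The cases, the labels, and the function name column_percents are made up for illustration and are not taken from gss02a.sav. Each cell is shown as a percent of its column total, so each column adds up to 100.

```python
from collections import Counter

# Hypothetical (abany, sex) pairs; not taken from gss02a.sav.
cases = [
    ("yes", "male"), ("no", "male"), ("yes", "male"), ("no", "male"),
    ("yes", "female"), ("no", "female"), ("no", "female"),
    ("no", "female"), ("yes", "female"),
]


def column_percents(pairs):
    """Return {(row, column): percent of that column's cases}."""
    cell_counts = Counter(pairs)
    column_totals = Counter(column for _, column in pairs)
    return {
        (row, column): 100.0 * count / column_totals[column]
        for (row, column), count in cell_counts.items()
    }


percents = column_percents(cases)
for (row, column), pct in sorted(percents.items()):
    print(f"{row:>3} / {column:<6} {pct:5.1f}%")
```

Because the percents are computed within each column, you compare them across the columns (here, males versus females) to see whether the groups differ.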
20.
A Brief Tour Continued • Comparing means – Analyze/Compare Means/Means – Move AGEKDBRN and EDUC into the “Dependent List” box – Move SEX to the “Independent List” box – Click on OK
21.
A Brief Tour Continued • Correlations – Analyze/Correlate/Bivariate – Move EDUC, MAEDUC, and PAEDUC into the “Variables” box – Click on OK
22.
A Brief Tour Continued • Scatterplots – Graphs/Scatter/Dot – Click on “Simple Scatter” and then on “Define” – Move EDUC into the “Y Axis” box – Move PAEDUC into the “X Axis” box – Click on OK
23.
Transforming Data (see ch. 3 in text) • We can transform variables by recoding, which means combining categories of an existing variable into fewer categories • We can transform variables by creating new variables out of existing variables • We can select particular cases and analyze only those cases • We can do other things, like weighting cases, that we’re not going to talk about in this workshop. (Note: the slides for weighting data are “hidden” in this PowerPoint presentation.)
24.
Recoding Variables • Recoding into different variables • Recoding into the same variable • We recommend recoding into different variables rather than into the same variable, because recoding into the same variable overwrites the original values
25.
Recoding into Different Variables • Click on “Transform” and then on “Recode” and then on “into different variables” • Select the variable you want to recode • Start by giving the new variable a new name and assigning a variable label to the new variable. Click on “Change”
26.
Recoding AGE into AGE1 • Recode AGE into four categories and give it the name of AGE1 – Click on “Old and New Values” • Use “Range” (fourth option down) to recode as follows. Remember to click on “Add” after entering each recode – 18 to 29 = 1 – 30 to 49 = 2 – 50 to 69 = 3 – 70 to 89 = 4
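The range recode above can be sketched in Python (not SPSS syntax) to make the logic explicit. The function name recode_age and the example ages are made up for illustration. Each range includes both of its end points, and an age that falls in none of the ranges gets no value, shown here as None.

```python
# Ranges from the slide: (low, high, new value), both ends inclusive.
AGE_RANGES = [(18, 29, 1), (30, 49, 2), (50, 69, 3), (70, 89, 4)]


def recode_age(age):
    """Return the AGE1 category for an age, or None if no range applies."""
    if age is None:
        return None
    for low, high, new_value in AGE_RANGES:
        if low <= age <= high:
            return new_value
    return None  # stands in for a missing value on the new variable


ages = [18, 29, 30, 49, 50, 69, 70, 89, None]  # made-up example values
age1 = [recode_age(a) for a in ages]
print(age1)
```

Checking the end points of every range, as the example values do, is a quick way to confirm that no age was left out or put in the wrong category.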
27.
Recoding Options • When you click on “Old and New Values” there will be seven options • For most recoding you will only have to use two of these options – The first option from the top allows you to recode a single value into a new value – The fourth option from the top allows you to recode a range of values from X to Y into a new value
28.
Assign Value Labels to the Four Categories of AGE1 • Go into “Variable View” • Find the variable AGE1 (should be at the bottom of the list of variables) • Click in the “Values” column and then click on the small gray box • Enter the value labels • Click on OK
29.
Exercises for Recoding • INCOME98 is total family income. Do a frequency distribution to see what it looks like before recoding • Recode into 4 categories and call this new variable INCOME1. Use the following categories: under $20K, $20K to under $40K, $40K to under $60K, and $60K and over • Add the value labels • Run a frequency distribution for INCOME1 and check to make sure that you recoded it correctly by comparing the unrecoded and recoded frequency distributions
30.
More Exercises for Recoding • Now recode INCOME98 again and call the new variable INCOME2 • This time use 8 categories: under $10K, $10K to under $20K, $20K to under $30K, $30K to under $40K, $40K to under $50K, $50K to under $60K, $60K to under $75K, and $75K and over • Add the value labels • Run a frequency distribution for INCOME2 and check to make sure that you recoded it correctly by comparing the unrecoded and recoded frequency distributions
31.
Creating a New Variable with Compute • Let’s create a new variable called ABORTION, which is the sum of the seven abortion variables • Click on “Transform” and then on “Compute” • Enter the new variable name (ABORTION) into the “Target Variable” box • Enter the formula for this new variable into the “Numeric Expression” box • Click on OK
32.
Dealing with Missing Data • If there is missing data for any of these variables (ABANY to ABSINGLE), the new variable ABORTION will be assigned a system missing value • What do we do if we want to allow no more than two missing values? • Let’s compute the mean instead: the sum of the valid abortion values divided by the number of variables with valid information • But let’s allow at most two variables with missing values
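Why one missing answer wipes out the whole sum can be sketched in Python (not SPSS syntax). The function name sum_or_missing and the answer values are made up for illustration, and None stands in for a missing value.

```python
def sum_or_missing(values):
    """Add the values; return None (missing) if any value is missing."""
    if any(v is None for v in values):
        return None
    return sum(values)


complete = [1, 2, 1, 1, 2, 1, 2]     # hypothetical answers, all valid
one_gap = [1, 2, None, 1, 2, 1, 2]   # same case with one missing answer

print(sum_or_missing(complete))
print(sum_or_missing(one_gap))
```

A case with even one missing answer drops out of any analysis of the summed variable, which is the reason for switching to a mean that tolerates a few missing values.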
33.
Dealing with Missing Data Continued • Click on “Reset” to erase what is currently in the “Compute Variable” box • Click on “Statistical” in the “Function Group” box • Then double click on “Mean” in the “Function and Special Variables” box • In the “Target Variable” box, enter the name of the new variable. Let’s call it ABORMEAN • In the “Numeric Expression” box, you should see “MEAN(?,?)”
34.
Dealing with Missing Data Continued • Replace the “?,?” with the variables you want to include so it reads “MEAN(abany,abdefect,abhlth,abnomore,abpoor,abrape,absingle)” • Insert .5 following MEAN so it reads “MEAN.5”. This indicates that you want at least five variables with valid information • Click on OK
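The idea behind “MEAN.5” can be sketched in Python (not SPSS syntax). The function name mean_min_valid and the answer values are made up for illustration, and None stands in for a missing value. The mean is taken over the valid values only, and the result is missing when fewer than the required number of values are valid.

```python
def mean_min_valid(values, min_valid):
    """Mean of the non-missing values, or None if too few are valid."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


two_missing = [1, 2, None, 1, 2, None, 2]        # 5 valid answers
three_missing = [1, None, None, 1, 2, None, 2]   # 4 valid answers

print(mean_min_valid(two_missing, 5))
print(mean_min_valid(three_missing, 5))
```

With seven variables, requiring five valid values is the same as allowing at most two missing values.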
35.
Exercises for Compute • There are five variables that measure tolerance for letting someone speak in your community who may have different views than your own: SPKATH, SPKCOM, SPKHOMO, SPKMIL, and SPKRAC • For each of these variables, 1 means they would allow such a person to speak and 2 means they would not allow it
36.
Exercises for Compute Continued • Create a new variable (call it SPEAK) which is the sum of these five variables • Run a frequency distribution for SPEAK • What do the values in this new variable tell us?
37.
More Exercises for Compute • Now let’s create a variable called SPKMEAN which allows for one of the five variables (SPKATH to SPKRAC) to be missing • What happens if there is more than one variable with a missing value? • How does SPSS calculate the new variable if there is only one variable with a missing value?
38.
Using Select Cases to Select Specific Cases for Analysis • Let’s select only Protestants for further analysis • Click on “Data” and then on “Select Cases” • Click on “If condition is satisfied” and then on the “If” button below it • Select the variable RELIG and move it into the box on the right • In this box, enter the expression “relig = 1” • Click on “Continue” and on OK
39.
Using Select Cases Continued • Now let’s select Protestants who are under 35 years old • Enter the expression “relig = 1” as you did before • Use & for “and”. Enter “age < 35” so the expression reads “relig = 1 & age < 35” • Click on OK
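The combined condition can be sketched in Python (not SPSS syntax). The cases and the function name is_selected are made up for illustration. A case is kept only when both parts of the condition are true, and this sketch assumes that a case with a missing age (None) is not selected.

```python
# Hypothetical cases; relig = 1 is the Protestant code used on the slide.
cases = [
    {"id": 1, "relig": 1, "age": 28},
    {"id": 2, "relig": 1, "age": 52},
    {"id": 3, "relig": 2, "age": 30},
    {"id": 4, "relig": 1, "age": 34},
    {"id": 5, "relig": 1, "age": None},
]


def is_selected(case):
    """True when relig = 1 and age < 35; a missing age never qualifies."""
    return (
        case["relig"] == 1
        and case["age"] is not None
        and case["age"] < 35
    )


selected_ids = [case["id"] for case in cases if is_selected(case)]
print(selected_ids)
```

Case 2 fails the age test, case 3 fails the religion test, and case 5 has no age to test, so only cases 1 and 4 remain.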
40.
Exercises for Select If • Select all males (1 on the variable SEX) and do a frequency distribution for the variable FEAR (afraid to walk alone at night in the neighborhood) • Now select all females (2 on the variable SEX) and run a frequency distribution for FEAR • Are males or females more fearful of walking alone at night?
41.
More Exercises for Select If • Now let’s select males under age 35 and run a frequency distribution for FEAR • Do the same thing for females under 35 • Are males or females under 35 more fearful of walking alone at night?
42.
Important Note on Using Select Cases • When you are finished using “Select Cases” and want to revert to using all the cases, be sure to click on Data/Select Cases and select “All cases”. Then click on OK • If you don’t do this, you will continue to use only those cases you last selected
43.
Univariate Analysis • Now that we know how to open existing files and transform variables, we’re ready to begin analyzing data • Univariate analysis refers to analyzing variables one at a time
Frequencies • Go to Analyze/Descriptive Statistics/Frequencies • Select ABANY and AGE and click on OK
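What a frequency distribution reports can be sketched in Python (not SPSS syntax). The codes and the function name frequency_table are made up for illustration and are not taken from gss02a.sav. For each value, the table gives the number of cases and the percent of all cases.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percent) rows sorted by value."""
    counts = Counter(values)
    total = len(values)
    return [
        (value, count, 100.0 * count / total)
        for value, count in sorted(counts.items())
    ]


abany = [1, 2, 2, 1, 2, 1, 2, 2]  # hypothetical codes for eight cases
table = frequency_table(abany)
for value, count, percent in table:
    print(f"{value}  {count:3d}  {percent:5.1f}%")
```

A variable with only a few values gives a short table like this one; a variable like AGE gives one row per age, which is why recoding into categories is often useful first.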
46.
Bar Charts • Bar charts – click on Analyze/Descriptive Statistics/Frequencies • Click on “Charts” • Select “Bar Charts” and click on “Continue” and then on OK • Do you think bar charts are appropriate for both ABANY and AGE?
47.
Histograms • Click on Analyze/Descriptive Statistics/Frequencies • Click on “Charts” • Select “Histograms” and click on “Continue” and then on OK • Do you think histograms are appropriate for both ABANY and AGE? • Which do you think is the most appropriate chart (bar chart or histogram) for ABANY and for AGE?
48.
Statistics • Click on Analyze/Descriptive Statistics/Frequencies • Click on “Statistics” • Select the statistics you want and click on “Continue” and then on OK
49.
Exercises for Frequencies • There are seven variables dealing with abortion: ABANY, ABDEFECT, ABHLTH, ABNOMORE, ABPOOR, ABRAPE, and ABSINGLE • Run a frequency distribution for each variable • Get a bar chart for each variable • Compare and contrast how people answered these seven questions
50.
More Exercises for Frequencies • Run the frequency distribution for AGE • Get a histogram for AGE • Compute the following statistics for AGE: – Mean – Median – Standard deviation – Percentiles – 25th, 50th, and 75th
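The statistics in this exercise can be sketched in Python (not SPSS syntax) using the standard statistics module. The ages are made up for illustration. The standard deviation here is the sample version (dividing by n - 1), and the percentiles use the module's default rule; statistical packages differ in how they interpolate percentiles, so small differences from the SPSS output are possible on real data.

```python
import statistics

ages = [20, 30, 40, 50, 60, 70, 80]  # made-up ages, not from gss02a.sav

mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
sd_age = statistics.stdev(ages)              # sample standard deviation
quartiles = statistics.quantiles(ages, n=4)  # 25th, 50th, 75th percentiles

print(mean_age, median_age, round(sd_age, 2), quartiles)
```

The 50th percentile is the median, so those two numbers should always agree.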
51.
Descriptives • Click on Analyze/Descriptive Statistics/Descriptives • Select AGE and EDUC • Click on “Options” and select the statistics you want and then click on “Continue” and OK
52.
Exercises for Descriptives • Use Descriptives to compute the following statistics for AGE – Mean – Standard deviation – Variance – Skewness – Kurtosis
53.
More Exercises for Descriptives • Use Descriptives to compute the mean for EDUC, MAEDUC, PAEDUC • Who has the most education – respondents or their parents? • Who has the most education – mothers or fathers?
54.
Explore • Click on Analyze/Descriptive Statistics/Explore • Select EDUC and put it in the “Dependent List” • In the Display box on the lower left, click on “Both” • Click on OK
55.
Selecting Statistics for Explore • Click on Analyze/Descriptive Statistics/Explore • Click on “Statistics” and select the statistics you want • Click on “Continue” and then OK
56.
Selecting Plots for Explore • Click on “Plots” • Select the plots you want • Click on “Continue” and then OK
57.
Exercises for Explore • Use Explore to get the following statistics and plots for the variables EDUC, PAEDUC, and MAEDUC – Descriptives – Outliers – Stem-and-leaf plot – Histogram – Boxplot • First select “Factor levels together” and run it • Then select “Dependents together” and run it again • What’s the difference?
58.
Intermediate Workshop for SPSS • In the next workshop we’ll look at different types of statistical analysis you can do in SPSS – Cross tabulations (ch. 5) – Comparing means (ch. 6) – Correlation and regression (ch. 7) – Multivariate analysis (ch. 8) • Cross tabulations • Multiple regression – Presenting your data – charts and tables (ch. 9)