Point and click may be most useful if
Syntax programming may be most useful if
For this course, I recommend you become familiar with both methods, but use syntax for your statistical operations. Having a "dossier" of syntax commands will allow you to run multivariate statistics on more than one data set.
Below are common syntax commands for data management:
RECODE var1 (3=1) (2=2) (1=3) INTO rvar1.
This recodes var1 into a different variable, rvar1.
RECODE var1 var2 (3=1) (2=2) (1=3) INTO rvar1 rvar2.
This recodes several variables at once.
Example:
RECODE YEARHOSP (85=1) (86=1) (87=1) (88=1) (89=1) (90=2) (91=2) (92=2) (93=2) (94=2) INTO DECADE.
EXECUTE.
VALUE LABELS DECADE 1 '1980s' 2 '1990s'.
The THRU keyword recodes every value from one value through another. Note that, unlike the list above, the ranges below also cover the years 80 to 84 and 95 to 99:
RECODE YEARHOSP (80 THRU 89=1) (90 THRU 99=2) INTO DECADE.
EXECUTE.
VALUE LABELS DECADE 1 '1980s' 2 '1990s'.
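To check a recode, it can help to list the original and the new variable side by side. The following is a minimal sketch on made-up inline data (the five YEARHOSP values are hypothetical, not from the course data set); it uses the optional ELSE keyword to make explicit that every value outside the two ranges becomes system-missing.

```spss
* Hypothetical inline data with one admission year per value.
DATA LIST FREE / YEARHOSP.
BEGIN DATA
85 89 90 94 79
END DATA.

* Recode the two ranges and send every other value to system-missing.
RECODE YEARHOSP (85 THRU 89=1) (90 THRU 94=2) (ELSE=SYSMIS) INTO DECADE.
VALUE LABELS DECADE 1 '1980s' 2 '1990s'.

* LIST is a procedure, so the pending RECODE is carried out here.
LIST VARIABLES=YEARHOSP DECADE.
```

In the listing, the case with YEARHOSP 79 should show a missing value for DECADE, because it falls outside both ranges.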
COMPUTE var1 = var2.
COMPUTE var1 = MEAN(a1 TO a6).
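The MEAN function averages whatever valid values a case has, and a1 TO a6 refers to all variables from a1 through a6 in file order. The sketch below, on hypothetical inline data with -9 used as an assumed missing code, contrasts this with the MEAN.n form, which returns a result only if at least n of the arguments are valid. The names amean and amean4 are invented for illustration.

```spss
* Hypothetical inline data with six items per case and -9 as a missing code.
DATA LIST FREE / a1 a2 a3 a4 a5 a6.
BEGIN DATA
1 2 3 4 5 6
1 2 -9 -9 -9 -9
END DATA.
MISSING VALUES a1 TO a6 (-9).

* Mean of all valid items, even if only a few are present.
COMPUTE amean = MEAN(a1 TO a6).

* Mean only for cases with at least four valid items.
COMPUTE amean4 = MEAN.4(a1 TO a6).

LIST VARIABLES=amean amean4.
```

The second case has only two valid items, so amean is based on those two values while amean4 is left missing.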
IF is a command used to transform a variable only for the cases that meet a logical condition. Note that each IF command tests one logical expression and makes one assignment, so several rules require several IF commands.
IF (var1 > 2 OR var1 < 1) var1 = var2.
Example:
COMPUTE NUMREHAB = 0.
IF ((rhb1days GE 1) AND (rhb2days EQ 0)) NUMREHAB = 1.
IF ((rhb2days GE 1) AND (rhb3days EQ 0)) NUMREHAB = 2.
IF ((rhb3days GE 1) AND (rhb4days EQ 0)) NUMREHAB = 3.
IF ((rhb4days GE 1) AND (rhb5days EQ 0)) NUMREHAB = 4.
IF (rhb5days GE 1) NUMREHAB = 5.
(Note: you will have to run these commands and then go to your data window and choose "Run Pending Transformations" in the "Transform" pull-down menu, or end the block with an EXECUTE command.)
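The same count can also be written as one DO IF ... ELSE IF ... END IF structure, which stops at the first condition that is true. This is only a sketch of an alternative, not a drop-in replacement: it assumes that rehab stays are filled in order (a second stay never exists without a first) and that none of the five variables is missing, since a missing condition makes SPSS skip the whole structure for that case. The inline data are hypothetical.

```spss
* Hypothetical inline data giving days in each of up to five rehab stays.
DATA LIST FREE / rhb1days rhb2days rhb3days rhb4days rhb5days.
BEGIN DATA
10 0 0 0 0
12 7 3 0 0
0 0 0 0 0
END DATA.

* Test the highest stay first and stop at the first true condition.
DO IF (rhb5days GE 1).
COMPUTE NUMREHAB = 5.
ELSE IF (rhb4days GE 1).
COMPUTE NUMREHAB = 4.
ELSE IF (rhb3days GE 1).
COMPUTE NUMREHAB = 3.
ELSE IF (rhb2days GE 1).
COMPUTE NUMREHAB = 2.
ELSE IF (rhb1days GE 1).
COMPUTE NUMREHAB = 1.
ELSE.
COMPUTE NUMREHAB = 0.
END IF.

* LIST reads the data, so no separate EXECUTE is needed.
LIST.
```

For these three cases NUMREHAB should come out as 1, 3 and 0.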
Example:
MISSING VALUES GCSER DRSATDC INDEPLEV (999).
This declares 999 as a missing value for these three variables.
MISSING VALUES ACUTDAYS (0, 999).
This declares both 0 and 999 as missing values for this variable.
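The effect of declaring missing values is easiest to see by running the same statistic before and after. The sketch below uses five made-up ACUTDAYS values (hypothetical, not from the course data).

```spss
* Hypothetical inline data where 0 and 999 are codes, not real lengths of stay.
DATA LIST FREE / ACUTDAYS.
BEGIN DATA
5 10 999 0 15
END DATA.

* Before the declaration all five values enter the mean.
DESCRIPTIVES VARIABLES=ACUTDAYS.

* After the declaration only 5, 10 and 15 are treated as valid.
MISSING VALUES ACUTDAYS (0, 999).
DESCRIPTIVES VARIABLES=ACUTDAYS.
```

The first table should report a mean of 205.8 based on five cases, the second a mean of 10 based on three.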
When variables are compared to numbers or other variables, the following keywords or signs can be used:
Key | Symbol | Meaning |
---|---|---|
EQ | = | Equal to |
NE | ~= | Not equal to |
GE | >= | Greater than or equal to |
GT | > | Greater than |
LE | <= | Less than or equal to |
LT | < | Less than |
Several conditions (comparisons) may be concatenated by AND (symbol: &) and/or OR (symbol: |). If two conditions are concatenated by AND, the whole expression is true only if both conditions are met. If two conditions are concatenated by OR, the whole expression is true if at least one of the conditions is met. In addition, you may specify, instead of a condition having to be met, a condition that must NOT (symbol: ~) be met. AND, OR and NOT are called logical operators.
It is highly recommended to use parentheses to make the order in which clauses are evaluated explicit. For instance, you may be looking for single mothers in your data. You first will check whether a person is female (say, gender EQ 1) and whether she is never married, divorced or widowed (say, famst EQ 3, 4 or 5); and if this is true, you have to check whether the number of children in the household is greater than 0 (nkids GT 0). Now if you write:
WRONG:
IF (gender EQ 1 AND famst EQ 3 OR famst EQ 4 OR famst EQ 5
AND nkids GT 0) singlemo = 1.
you will, for example, code as single mothers all people who are divorced, regardless of whether they have children and regardless of whether they are female. This is because AND is evaluated before OR, so the expression is read as (gender EQ 1 AND famst EQ 3) OR famst EQ 4 OR (famst EQ 5 AND nkids GT 0), and the IF clause becomes true as soon as one of the conditions concatenated by OR is true, such as the condition famst EQ 4. The right way to get what you want is:
IF (gender EQ 1 AND (famst EQ 3 OR famst EQ 4 OR famst EQ 5)
AND nkids GT 0) singlemo = 1.
Here, all conditions concatenated by OR are counted, as it were, as one condition, and this condition is linked to the other conditions by AND.
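The difference between the two versions can be checked on a few made-up cases. The sketch below uses hypothetical inline data and invented variable names (wrong, singlemo2), and also shows the built-in ANY function as a more compact way to write the group of OR conditions.

```spss
* Hypothetical inline data with gender (1 is female), famst and nkids.
DATA LIST FREE / gender famst nkids.
BEGIN DATA
1 4 2
2 4 0
1 3 0
1 2 1
END DATA.

* Without parentheses every case with famst 4 qualifies.
COMPUTE wrong = 0.
IF (gender EQ 1 AND famst EQ 3 OR famst EQ 4 OR famst EQ 5 AND nkids GT 0) wrong = 1.

* With parentheses the OR group counts as one condition.
COMPUTE singlemo = 0.
IF (gender EQ 1 AND (famst EQ 3 OR famst EQ 4 OR famst EQ 5) AND nkids GT 0) singlemo = 1.

* The same test written with the ANY function.
COMPUTE singlemo2 = 0.
IF (gender EQ 1 AND ANY(famst, 3, 4, 5) AND nkids GT 0) singlemo2 = 1.

LIST.
```

Only the first case is a female, unmarried person with children, so singlemo and singlemo2 should be 1 for that case alone, while wrong is also set to 1 for the second and third cases.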