Notice some outliers or problematic cases in your dataset and want a shorthand way to quickly remove them while also keeping a record of which cases you removed? No problem. There are several ways to approach this in SPSS syntax, and the saved syntax file doubles as your record of what was dropped.
If it is just one or a few numerical cases, then a great shorthand is:
SELECT IF VARNAME <> CASE.
SELECT IF (VARNAME ne CASE).
With this syntax, replace VARNAME with the identifying variable (i.e., the variable that identifies the case you want to remove) and CASE with the specific value of that variable. For instance, if your VARNAME is ID and the CASE you want to drop is 653, then your syntax would look like this:
SELECT IF ID <> 653.
SELECT IF (ID ne 653).
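One caution with the commands above: SELECT IF permanently drops the unselected cases from the active dataset once the transformation runs, and a case with a missing ID is dropped too, because the comparison is neither true nor false for it. If you would rather see exactly what is about to go before it goes, here is a minimal sketch. The flag name drop_flag is my own placeholder, not anything built into SPSS.

```spss
* drop_flag is a hypothetical variable name: 1 = case to remove, 0 = case to keep.
COMPUTE drop_flag = (ID = 653).
EXECUTE.

* TEMPORARY limits the next SELECT IF to the LIST procedure only.
TEMPORARY.
SELECT IF (drop_flag = 1).
LIST VARIABLES=ID.

* This SELECT IF is permanent and keeps only the unflagged cases.
SELECT IF (drop_flag = 0).
EXECUTE.
```

The LIST output in the Viewer then serves as a record of the IDs that were removed.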
If you have a few cases rather than just one, the latter syntax may be more efficient to use. For example, imagine you also have cases 155, 374, and 416 you want to remove. Here is what the syntax would look like:
SELECT IF (ID ne 653 and ID ne 155 and ID ne 374 and ID ne 416).
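When the list of IDs grows, chaining "and" conditions gets long. One alternative, sketched here with the same four IDs, is the ANY function, which is true when its first argument equals any of the values that follow it.

```spss
* Keep only cases whose ID is not one of the listed values.
SELECT IF (NOT ANY(ID, 653, 155, 374, 416)).
EXECUTE.
```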
You can also use the exact same syntax with string variables by adding straight single quotes (' ') around the entry that identifies the case you want to remove. For example:
SELECT IF NAME <> 'Dave'.
SELECT IF (NAME ne 'Dave').
SELECT IF (NAME ne 'Dave' and NAME ne 'Bob' and NAME ne 'Bill').
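Keep in mind that string comparisons in SPSS are case-sensitive, so 'Dave' will not match 'dave' or 'DAVE'. If your data entry was inconsistent, one option is to compare an uppercased version of the value. This is only a sketch, assuming NAME is a string variable as in the example above.

```spss
* UPCASE converts the value to upper case for the comparison only, and NAME itself is unchanged.
SELECT IF (UPCASE(NAME) ne 'DAVE' and UPCASE(NAME) ne 'BOB' and UPCASE(NAME) ne 'BILL').
EXECUTE.
```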
If you have a large dataset and want to remove a good chunk of cases – say you have a number of cases that are missing on a key variable – then you can use the following syntax:
SELECT IF (not missing(VARNAME)).
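If the rule is instead that a case must be complete on several key variables, the NMISS function counts how many of the listed variables are missing for each case. A sketch, where VAR1, VAR2, and VAR3 are placeholder names for your own numeric variables:

```spss
* Keep only cases with no missing values across the three placeholder variables.
SELECT IF (NMISS(VAR1, VAR2, VAR3) = 0).
EXECUTE.
```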
You may come across circumstances where you need to get more creative with your case removal syntax, but in general these are the basic approaches you’ll most often use. As I come across new strategies for removing cases in SPSS, I will be sure to add them to this post for reference.
[More to come]