Functions for handling missing values in "timeSeries" objects.

Usage

# S3 method for timeSeries
na.omit(object, method = c("r", "s", "z", "ir", "iz", "ie"), 
    interp = c("before", "linear", "after"), FUN, ...)

Arguments

object

an object of class "timeSeries".

method

the method of handling NAs, see section ‘Details’.

interp

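the interpolation method used when method is "ir", "iz" or "ie"; one of "before", "linear" or "after". With "linear" the NAs are interpolated linearly, with "before" the previous value in the column is carried forward, and "after" presumably takes the next value instead. To replace NAs with zeros use method = "z"; to replace them with a column statistic such as the mean or the median use argument FUN.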

FUN

a function or a name of a function, such as "mean" or median. FUN is applied to the non-NA values in each column to determine the replacement value. The call looks like FUN(coli, na.rm = TRUE), so FUN should have argument na.rm. All arguments except object are ignored if FUN is specified.

...

arguments to be passed to the function as.matrix.
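The description of FUN above can be pictured with a small base R sketch. The helper fill_na_by_column is hypothetical and is not part of the package; it only mimics the documented call FUN(coli, na.rm = TRUE) on a plain matrix, and the actual method may differ in details.

```r
# Hypothetical helper (not from the timeSeries package): replace the NAs in
# each column by FUN applied to that column's non-NA values.
fill_na_by_column <- function(m, FUN) {
  FUN <- match.fun(FUN)                 # accepts a function or its name
  for (i in seq_len(ncol(m))) {
    coli <- m[, i]
    if (anyNA(coli)) {
      coli[is.na(coli)] <- FUN(coli, na.rm = TRUE)
      m[, i] <- coli
    }
  }
  m
}

m <- matrix(c(1, NA, 3,
              10, 20, NA), ncol = 2)

fill_na_by_column(m, "mean")    # NAs become 2 (column 1) and 15 (column 2)
fill_na_by_column(m, median)    # same values here, since each column has two observations

# Any function with an na.rm argument works, e.g. a trimmed mean.
fill_na_by_column(m, function(x, na.rm) mean(x, trim = 0.10, na.rm = na.rm))
```

Because the replacement is computed per column, a function without an na.rm argument would fail in this sketch, which matches the requirement stated for FUN.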

Details

Functions for handling missing values in "timeSeries" objects and in objects which can be transformed into a vector or a two dimensional matrix.

For na.omit, argument method specifies how to handle NAs. It can be one of the following strings:

method = "s"

na.rm = FALSE, skip, i.e. do nothing,

method = "r"

remove NAs,

method = "z"

substitute NAs by zeros,

method = "ir"

interpolate NAs and remove NAs at the beginning and end of the series,

method = "iz"

interpolate NAs and substitute NAs at the beginning and end of the series,

method = "ie"

interpolate NAs and extrapolate NAs at the beginning and end of the series.
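The difference between the three interpolating methods is easiest to see on a single column. The following base R sketch uses stats::approx to imitate the behaviour described above; it is an illustration under assumptions, not the package's implementation. In particular, treating "extrapolate" as repeating the nearest observed value (rule = 2) is an assumption.

```r
# One column with a leading NA, an interior gap and a trailing NA.
y   <- c(NA, 1, NA, NA, 4, NA)
pos <- seq_along(y)
ok  <- which(!is.na(y))

# Interior interpolation; positions outside the observed range stay NA.
fill <- function(interp) {
  switch(interp,
    linear = approx(ok, y[ok], xout = pos, method = "linear")$y,
    before = approx(ok, y[ok], xout = pos, method = "constant", f = 0)$y,
    after  = approx(ok, y[ok], xout = pos, method = "constant", f = 1)$y)
}

lin <- fill("linear")   # NA 1 2 3 4 NA
bef <- fill("before")   # NA 1 1 1 4 NA
aft <- fill("after")    # NA 1 4 4 4 NA

# Handling of the NAs left at the beginning and end of the series:
ir <- lin[!is.na(lin)]               # "ir": remove them        -> 1 2 3 4
iz <- replace(lin, is.na(lin), 0)    # "iz": substitute zeros   -> 0 1 2 3 4 0
ie <- approx(ok, y[ok], xout = pos,  # "ie": extrapolate (assumed: nearest value)
             method = "linear", rule = 2)$y   #                 -> 1 1 2 3 4 4
```

For an object with several columns, removing the remaining NAs means dropping whole rows, so "ir" can shorten the series whenever any column starts or ends with an NA.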

Note

When dealing with daily data sets, there exists another function alignDailySeries which can handle missing data in un-aligned calendrical "timeSeries" objects.

Additional remarks by GNB:

removeNA(x) is equivalent to na.omit(x) or na.omit(x, method = "r").

interpNA can be replaced by a call to na.omit with argument method equal to "ir", "iz", or "ie", and argument interp equal to the method argument of interpNA (note that the defaults are not the same).

substituteNA(x, type = "zeros") is equivalent to na.omit(x, method = "z"). For other values of type one can use argument FUN, as in na.omit(x, FUN = "mean").
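The equivalences listed in these remarks can be collected in a short migration sketch. It assumes the timeSeries package is installed and uses a small made-up matrix; the variable names are illustrative only. Since the defaults of interpNA and na.omit differ, interp is passed explicitly.

```r
library(timeSeries)

# Small example data with one interior NA in the first column.
X <- matrix(c(1, NA, 3, 4,
              10, 20, 30, 40), ncol = 2)
Xts <- timeSeries(X)

r  <- na.omit(Xts)                  # in place of removeNA()
z  <- na.omit(Xts, method = "z")    # in place of substituteNA(type = "zeros")
mn <- na.omit(Xts, FUN = "mean")    # in place of substituteNA(type = "mean")
md <- na.omit(Xts, FUN = "median")  # in place of substituteNA(type = "median")

# In place of interpNA(method = "linear"); state interp explicitly.
li <- na.omit(Xts, method = "ir", interp = "linear")
```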

A final remark: the three deprecated functions are non-generic. removeNA(x) is completely redundant as it simply calls na.omit. The other two however may be useful for matrix-like objects. Please inform the maintainer of the package if you use them on objects other than from class "timeSeries" and wish them kept in the future.

References

Troyanskaya O., Cantor M., Sherlock G., Brown P., Hastie T., Tibshirani R., Botstein D., Altman R.B. (2001). Missing value estimation methods for DNA microarrays. Bioinformatics 17, 520-525.

Examples

X <- matrix(rnorm(100), ncol = 5)  # Create a Matrix X
X[3, 5] <- NA                      # Replace a Single NA Inside
X[17, 2:4] <- c(NA, NA, NA)        # Replace Three in a Row Inside
X[13:15, 4] <- c(NA, NA, NA)       # Replace Three in a Column Inside
X[11:12, 5] <- c(NA, NA)           # Replace Two at the Right Border
X[20, 1] <- NA                     # Replace One in the Lower Left Corner
X
#>             [,1]        [,2]        [,3]        [,4]       [,5]
#>  [1,] -1.2188515 -1.51319841 -0.18022159 -0.58244338  1.1745123
#>  [2,] -1.2900417 -0.43898655 -0.66582924 -0.74490561 -1.1151520
#>  [3,]  0.4222867 -1.32640195  1.31724791 -1.50875006         NA
#>  [4,] -0.1030881 -1.19271949  0.13421979 -0.95380354 -0.3747833
#>  [5,]  0.5258783 -1.13269775  0.33373548  0.13156962 -1.2167619
#>  [6,]  0.4992021 -0.71374675  1.42513695 -0.10488850 -1.6879300
#>  [7,]  1.2296179  0.97129225 -0.66687363 -1.29914179 -0.8430539
#>  [8,]  0.4359482  0.11081679 -0.15419999 -1.81072734  1.3052824
#>  [9,] -0.7221102  1.14315208  0.39575880  0.34617192  0.2354969
#> [10,]  0.8468178 -0.79418025 -0.28903724  0.30310787  0.7766441
#> [11,]  1.9957190 -0.09344266 -1.03946394  0.62199075         NA
#> [12,]  0.8602705 -0.04080370  0.91980757 -0.07572521         NA
#> [13,] -0.1299832 -0.76676070  1.18297818          NA -0.4331491
#> [14,] -0.2294296  2.04819505 -0.06661768          NA  0.7639455
#> [15,]  0.1092238 -0.70144007  0.69121516          NA  0.1272981
#> [16,]  0.7522100  0.65586854 -1.28738212  1.42095840 -2.3297949
#> [17,]  0.9254184          NA          NA          NA -0.9856772
#> [18,] -0.2917223  1.94466287 -0.68995732 -0.37824285  0.4704501
#> [19,] -0.4709085  0.66014159 -0.53506265  0.07207710  2.2224636
#> [20,]         NA  0.37914099 -1.39575652  1.05144407 -0.1124329
Xts <- timeSeries(X)  # convert X to timeSeries Xts

## remove rows with NAs
na.omit(Xts)
#>  
#>             SS.1       SS.2       SS.3       SS.4       SS.5
#>  [1,] -1.2188515 -1.5131984 -0.1802216 -0.5824434  1.1745123
#>  [2,] -1.2900417 -0.4389865 -0.6658292 -0.7449056 -1.1151520
#>  [3,] -0.1030881 -1.1927195  0.1342198 -0.9538035 -0.3747833
#>  [4,]  0.5258783 -1.1326977  0.3337355  0.1315696 -1.2167619
#>  [5,]  0.4992021 -0.7137467  1.4251369 -0.1048885 -1.6879300
#>  [6,]  1.2296179  0.9712922 -0.6668736 -1.2991418 -0.8430539
#>  [7,]  0.4359482  0.1108168 -0.1542000 -1.8107273  1.3052824
#>  [8,] -0.7221102  1.1431521  0.3957588  0.3461719  0.2354969
#>  [9,]  0.8468178 -0.7941803 -0.2890372  0.3031079  0.7766441
#> [10,]  0.7522100  0.6558685 -1.2873821  1.4209584 -2.3297949
#> [11,] -0.2917223  1.9446629 -0.6899573 -0.3782428  0.4704501
#> [12,] -0.4709085  0.6601416 -0.5350626  0.0720771  2.2224636

## Substitute NA's with zeros or column means (formerly substituteNA())
na.omit(Xts, method = "z")
#>  
#>             SS.1        SS.2        SS.3        SS.4       SS.5
#>  [1,] -1.2188515 -1.51319841 -0.18022159 -0.58244338  1.1745123
#>  [2,] -1.2900417 -0.43898655 -0.66582924 -0.74490561 -1.1151520
#>  [3,]  0.4222867 -1.32640195  1.31724791 -1.50875006  0.0000000
#>  [4,] -0.1030881 -1.19271949  0.13421979 -0.95380354 -0.3747833
#>  [5,]  0.5258783 -1.13269775  0.33373548  0.13156962 -1.2167619
#>  [6,]  0.4992021 -0.71374675  1.42513695 -0.10488850 -1.6879300
#>  [7,]  1.2296179  0.97129225 -0.66687363 -1.29914179 -0.8430539
#>  [8,]  0.4359482  0.11081679 -0.15419999 -1.81072734  1.3052824
#>  [9,] -0.7221102  1.14315208  0.39575880  0.34617192  0.2354969
#> [10,]  0.8468178 -0.79418025 -0.28903724  0.30310787  0.7766441
#> [11,]  1.9957190 -0.09344266 -1.03946394  0.62199075  0.0000000
#> [12,]  0.8602705 -0.04080370  0.91980757 -0.07572521  0.0000000
#> [13,] -0.1299832 -0.76676070  1.18297818  0.00000000 -0.4331491
#> [14,] -0.2294296  2.04819505 -0.06661768  0.00000000  0.7639455
#> [15,]  0.1092238 -0.70144007  0.69121516  0.00000000  0.1272981
#> [16,]  0.7522100  0.65586854 -1.28738212  1.42095840 -2.3297949
#> [17,]  0.9254184  0.00000000  0.00000000  0.00000000 -0.9856772
#> [18,] -0.2917223  1.94466287 -0.68995732 -0.37824285  0.4704501
#> [19,] -0.4709085  0.66014159 -0.53506265  0.07207710  2.2224636
#> [20,]  0.0000000  0.37914099 -1.39575652  1.05144407 -0.1124329
na.omit(Xts, FUN = "mean")
#>  
#>             SS.1        SS.2        SS.3        SS.4       SS.5
#>  [1,] -1.2188515 -1.51319841 -0.18022159 -0.58244338  1.1745123
#>  [2,] -1.2900417 -0.43898655 -0.66582924 -0.74490561 -1.1151520
#>  [3,]  0.4222867 -1.32640195  1.31724791 -1.50875006 -0.1189790
#>  [4,] -0.1030881 -1.19271949  0.13421979 -0.95380354 -0.3747833
#>  [5,]  0.5258783 -1.13269775  0.33373548  0.13156962 -1.2167619
#>  [6,]  0.4992021 -0.71374675  1.42513695 -0.10488850 -1.6879300
#>  [7,]  1.2296179  0.97129225 -0.66687363 -1.29914179 -0.8430539
#>  [8,]  0.4359482  0.11081679 -0.15419999 -1.81072734  1.3052824
#>  [9,] -0.7221102  1.14315208  0.39575880  0.34617192  0.2354969
#> [10,]  0.8468178 -0.79418025 -0.28903724  0.30310787  0.7766441
#> [11,]  1.9957190 -0.09344266 -1.03946394  0.62199075 -0.1189790
#> [12,]  0.8602705 -0.04080370  0.91980757 -0.07572521 -0.1189790
#> [13,] -0.1299832 -0.76676070  1.18297818 -0.21945678 -0.4331491
#> [14,] -0.2294296  2.04819505 -0.06661768 -0.21945678  0.7639455
#> [15,]  0.1092238 -0.70144007  0.69121516 -0.21945678  0.1272981
#> [16,]  0.7522100  0.65586854 -1.28738212  1.42095840 -2.3297949
#> [17,]  0.9254184 -0.04216359 -0.03001590 -0.21945678 -0.9856772
#> [18,] -0.2917223  1.94466287 -0.68995732 -0.37824285  0.4704501
#> [19,] -0.4709085  0.66014159 -0.53506265  0.07207710  2.2224636
#> [20,]  0.2182346  0.37914099 -1.39575652  1.05144407 -0.1124329
na.omit(Xts, FUN = "median")
#>  
#>             SS.1        SS.2        SS.3        SS.4       SS.5
#>  [1,] -1.2188515 -1.51319841 -0.18022159 -0.58244338  1.1745123
#>  [2,] -1.2900417 -0.43898655 -0.66582924 -0.74490561 -1.1151520
#>  [3,]  0.4222867 -1.32640195  1.31724791 -1.50875006 -0.1124329
#>  [4,] -0.1030881 -1.19271949  0.13421979 -0.95380354 -0.3747833
#>  [5,]  0.5258783 -1.13269775  0.33373548  0.13156962 -1.2167619
#>  [6,]  0.4992021 -0.71374675  1.42513695 -0.10488850 -1.6879300
#>  [7,]  1.2296179  0.97129225 -0.66687363 -1.29914179 -0.8430539
#>  [8,]  0.4359482  0.11081679 -0.15419999 -1.81072734  1.3052824
#>  [9,] -0.7221102  1.14315208  0.39575880  0.34617192  0.2354969
#> [10,]  0.8468178 -0.79418025 -0.28903724  0.30310787  0.7766441
#> [11,]  1.9957190 -0.09344266 -1.03946394  0.62199075 -0.1124329
#> [12,]  0.8602705 -0.04080370  0.91980757 -0.07572521 -0.1124329
#> [13,] -0.1299832 -0.76676070  1.18297818 -0.09030686 -0.4331491
#> [14,] -0.2294296  2.04819505 -0.06661768 -0.09030686  0.7639455
#> [15,]  0.1092238 -0.70144007  0.69121516 -0.09030686  0.1272981
#> [16,]  0.7522100  0.65586854 -1.28738212  1.42095840 -2.3297949
#> [17,]  0.9254184 -0.09344266 -0.15419999 -0.09030686 -0.9856772
#> [18,] -0.2917223  1.94466287 -0.68995732 -0.37824285  0.4704501
#> [19,] -0.4709085  0.66014159 -0.53506265  0.07207710  2.2224636
#> [20,]  0.4222867  0.37914099 -1.39575652  1.05144407 -0.1124329

## Substitute NA's with a trimmed mean
na.omit(Xts, FUN = function(x, na.rm) mean(x, trim = 0.10, na.rm = na.rm))
#>  
#>             SS.1        SS.2        SS.3        SS.4       SS.5
#>  [1,] -1.2188515 -1.51319841 -0.18022159 -0.58244338  1.1745123
#>  [2,] -1.2900417 -0.43898655 -0.66582924 -0.74490561 -1.1151520
#>  [3,]  0.4222867 -1.32640195  1.31724791 -1.50875006 -0.1276874
#>  [4,] -0.1030881 -1.19271949  0.13421979 -0.95380354 -0.3747833
#>  [5,]  0.5258783 -1.13269775  0.33373548  0.13156962 -1.2167619
#>  [6,]  0.4992021 -0.71374675  1.42513695 -0.10488850 -1.6879300
#>  [7,]  1.2296179  0.97129225 -0.66687363 -1.29914179 -0.8430539
#>  [8,]  0.4359482  0.11081679 -0.15419999 -1.81072734  1.3052824
#>  [9,] -0.7221102  1.14315208  0.39575880  0.34617192  0.2354969
#> [10,]  0.8468178 -0.79418025 -0.28903724  0.30310787  0.7766441
#> [11,]  1.9957190 -0.09344266 -1.03946394  0.62199075 -0.1276874
#> [12,]  0.8602705 -0.04080370  0.91980757 -0.07572521 -0.1276874
#> [13,] -0.1299832 -0.76676070  1.18297818 -0.22296711 -0.4331491
#> [14,] -0.2294296  2.04819505 -0.06661768 -0.22296711  0.7639455
#> [15,]  0.1092238 -0.70144007  0.69121516 -0.22296711  0.1272981
#> [16,]  0.7522100  0.65586854 -1.28738212  1.42095840 -2.3297949
#> [17,]  0.9254184 -0.07859440 -0.03527544 -0.22296711 -0.9856772
#> [18,] -0.2917223  1.94466287 -0.68995732 -0.37824285  0.4704501
#> [19,] -0.4709085  0.66014159 -0.53506265  0.07207710  2.2224636
#> [20,]  0.2023988  0.37914099 -1.39575652  1.05144407 -0.1124329

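## interpolate NA's linearly (formerly interpNA())
## NOTE: X is a plain matrix, not a "timeSeries", so the na.omit(X, ...) calls
## below appear to dispatch to the default na.omit method, which only drops
## rows containing NAs (hence the "na.action" attribute in the output).
## Pass Xts instead of X to use the "timeSeries" method.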
na.omit(X, method = "ir", interp = "linear")
#>             [,1]       [,2]       [,3]       [,4]       [,5]
#>  [1,] -1.2188515 -1.5131984 -0.1802216 -0.5824434  1.1745123
#>  [2,] -1.2900417 -0.4389865 -0.6658292 -0.7449056 -1.1151520
#>  [3,] -0.1030881 -1.1927195  0.1342198 -0.9538035 -0.3747833
#>  [4,]  0.5258783 -1.1326977  0.3337355  0.1315696 -1.2167619
#>  [5,]  0.4992021 -0.7137467  1.4251369 -0.1048885 -1.6879300
#>  [6,]  1.2296179  0.9712922 -0.6668736 -1.2991418 -0.8430539
#>  [7,]  0.4359482  0.1108168 -0.1542000 -1.8107273  1.3052824
#>  [8,] -0.7221102  1.1431521  0.3957588  0.3461719  0.2354969
#>  [9,]  0.8468178 -0.7941803 -0.2890372  0.3031079  0.7766441
#> [10,]  0.7522100  0.6558685 -1.2873821  1.4209584 -2.3297949
#> [11,] -0.2917223  1.9446629 -0.6899573 -0.3782428  0.4704501
#> [12,] -0.4709085  0.6601416 -0.5350626  0.0720771  2.2224636
#> attr(,"na.action")
#> [1] 20 17 13 14 15  3 11 12
#> attr(,"class")
#> [1] "omit"
na.omit(X, method = "iz", interp = "linear")
#>             [,1]       [,2]       [,3]       [,4]       [,5]
#>  [1,] -1.2188515 -1.5131984 -0.1802216 -0.5824434  1.1745123
#>  [2,] -1.2900417 -0.4389865 -0.6658292 -0.7449056 -1.1151520
#>  [3,] -0.1030881 -1.1927195  0.1342198 -0.9538035 -0.3747833
#>  [4,]  0.5258783 -1.1326977  0.3337355  0.1315696 -1.2167619
#>  [5,]  0.4992021 -0.7137467  1.4251369 -0.1048885 -1.6879300
#>  [6,]  1.2296179  0.9712922 -0.6668736 -1.2991418 -0.8430539
#>  [7,]  0.4359482  0.1108168 -0.1542000 -1.8107273  1.3052824
#>  [8,] -0.7221102  1.1431521  0.3957588  0.3461719  0.2354969
#>  [9,]  0.8468178 -0.7941803 -0.2890372  0.3031079  0.7766441
#> [10,]  0.7522100  0.6558685 -1.2873821  1.4209584 -2.3297949
#> [11,] -0.2917223  1.9446629 -0.6899573 -0.3782428  0.4704501
#> [12,] -0.4709085  0.6601416 -0.5350626  0.0720771  2.2224636
#> attr(,"na.action")
#> [1] 20 17 13 14 15  3 11 12
#> attr(,"class")
#> [1] "omit"
na.omit(X, method = "ie", interp = "linear")
#>             [,1]       [,2]       [,3]       [,4]       [,5]
#>  [1,] -1.2188515 -1.5131984 -0.1802216 -0.5824434  1.1745123
#>  [2,] -1.2900417 -0.4389865 -0.6658292 -0.7449056 -1.1151520
#>  [3,] -0.1030881 -1.1927195  0.1342198 -0.9538035 -0.3747833
#>  [4,]  0.5258783 -1.1326977  0.3337355  0.1315696 -1.2167619
#>  [5,]  0.4992021 -0.7137467  1.4251369 -0.1048885 -1.6879300
#>  [6,]  1.2296179  0.9712922 -0.6668736 -1.2991418 -0.8430539
#>  [7,]  0.4359482  0.1108168 -0.1542000 -1.8107273  1.3052824
#>  [8,] -0.7221102  1.1431521  0.3957588  0.3461719  0.2354969
#>  [9,]  0.8468178 -0.7941803 -0.2890372  0.3031079  0.7766441
#> [10,]  0.7522100  0.6558685 -1.2873821  1.4209584 -2.3297949
#> [11,] -0.2917223  1.9446629 -0.6899573 -0.3782428  0.4704501
#> [12,] -0.4709085  0.6601416 -0.5350626  0.0720771  2.2224636
#> attr(,"na.action")
#> [1] 20 17 13 14 15  3 11 12
#> attr(,"class")
#> [1] "omit"
   
## take previous values in a column
na.omit(X, method = "ir", interp = "before")
#>             [,1]       [,2]       [,3]       [,4]       [,5]
#>  [1,] -1.2188515 -1.5131984 -0.1802216 -0.5824434  1.1745123
#>  [2,] -1.2900417 -0.4389865 -0.6658292 -0.7449056 -1.1151520
#>  [3,] -0.1030881 -1.1927195  0.1342198 -0.9538035 -0.3747833
#>  [4,]  0.5258783 -1.1326977  0.3337355  0.1315696 -1.2167619
#>  [5,]  0.4992021 -0.7137467  1.4251369 -0.1048885 -1.6879300
#>  [6,]  1.2296179  0.9712922 -0.6668736 -1.2991418 -0.8430539
#>  [7,]  0.4359482  0.1108168 -0.1542000 -1.8107273  1.3052824
#>  [8,] -0.7221102  1.1431521  0.3957588  0.3461719  0.2354969
#>  [9,]  0.8468178 -0.7941803 -0.2890372  0.3031079  0.7766441
#> [10,]  0.7522100  0.6558685 -1.2873821  1.4209584 -2.3297949
#> [11,] -0.2917223  1.9446629 -0.6899573 -0.3782428  0.4704501
#> [12,] -0.4709085  0.6601416 -0.5350626  0.0720771  2.2224636
#> attr(,"na.action")
#> [1] 20 17 13 14 15  3 11 12
#> attr(,"class")
#> [1] "omit"
na.omit(X, method = "iz", interp = "before")
#>             [,1]       [,2]       [,3]       [,4]       [,5]
#>  [1,] -1.2188515 -1.5131984 -0.1802216 -0.5824434  1.1745123
#>  [2,] -1.2900417 -0.4389865 -0.6658292 -0.7449056 -1.1151520
#>  [3,] -0.1030881 -1.1927195  0.1342198 -0.9538035 -0.3747833
#>  [4,]  0.5258783 -1.1326977  0.3337355  0.1315696 -1.2167619
#>  [5,]  0.4992021 -0.7137467  1.4251369 -0.1048885 -1.6879300
#>  [6,]  1.2296179  0.9712922 -0.6668736 -1.2991418 -0.8430539
#>  [7,]  0.4359482  0.1108168 -0.1542000 -1.8107273  1.3052824
#>  [8,] -0.7221102  1.1431521  0.3957588  0.3461719  0.2354969
#>  [9,]  0.8468178 -0.7941803 -0.2890372  0.3031079  0.7766441
#> [10,]  0.7522100  0.6558685 -1.2873821  1.4209584 -2.3297949
#> [11,] -0.2917223  1.9446629 -0.6899573 -0.3782428  0.4704501
#> [12,] -0.4709085  0.6601416 -0.5350626  0.0720771  2.2224636
#> attr(,"na.action")
#> [1] 20 17 13 14 15  3 11 12
#> attr(,"class")
#> [1] "omit"
na.omit(X, method = "ie", interp = "before")
#>             [,1]       [,2]       [,3]       [,4]       [,5]
#>  [1,] -1.2188515 -1.5131984 -0.1802216 -0.5824434  1.1745123
#>  [2,] -1.2900417 -0.4389865 -0.6658292 -0.7449056 -1.1151520
#>  [3,] -0.1030881 -1.1927195  0.1342198 -0.9538035 -0.3747833
#>  [4,]  0.5258783 -1.1326977  0.3337355  0.1315696 -1.2167619
#>  [5,]  0.4992021 -0.7137467  1.4251369 -0.1048885 -1.6879300
#>  [6,]  1.2296179  0.9712922 -0.6668736 -1.2991418 -0.8430539
#>  [7,]  0.4359482  0.1108168 -0.1542000 -1.8107273  1.3052824
#>  [8,] -0.7221102  1.1431521  0.3957588  0.3461719  0.2354969
#>  [9,]  0.8468178 -0.7941803 -0.2890372  0.3031079  0.7766441
#> [10,]  0.7522100  0.6558685 -1.2873821  1.4209584 -2.3297949
#> [11,] -0.2917223  1.9446629 -0.6899573 -0.3782428  0.4704501
#> [12,] -0.4709085  0.6601416 -0.5350626  0.0720771  2.2224636
#> attr(,"na.action")
#> [1] 20 17 13 14 15  3 11 12
#> attr(,"class")
#> [1] "omit"


## examples with X (which is a matrix, not "timeSeries")
## (these examples are not run automatically as these functions are
## deprecated.) 
if(FALSE){
## Remove Rows with NAs
removeNA(X)
   
## substitute NA's by zeros or column means
substituteNA(X, type = "zeros")
substituteNA(X, type = "mean")
   
## interpolate NA's linearly
interpNA(X, method = "linear")
# Note the corner missing value cannot be interpolated!
   
## take previous values in a column
interpNA(X, method = "before")
# Also here, the corner value is excluded
}