It seems that ROMM within addNoise is implemented in a way that does not preserve sample means. Below I suggest how to fix this and speed up the calculations considerably by using the methodology of a recent paper. See https://github.com/olangsrud/RegSDC (hopefully soon on CRAN). Below, I refer to the functions in that package.
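library(sdcMicro)  # provides addNoise and the testdata data set
library(RegSDC)    # provides RegSDCromm and RegSDCgen
data(testdata)
y <- testdata[sample(NROW(testdata), 100), c("expend", "income", "savings")]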
addNoise(y, method = "ROMM")$xm
# An almost identical method (read about the sequential phenomenon in the paper for the minor differences) is
RegSDCromm(y, lambda = 0.001, ensureIntercept = FALSE)
# This can be viewed as a high-speed version of the current implementation in addNoise.
# Sample means are preserved by the default method, where ensureIntercept = TRUE.
# Other values of lambda may be used.
RegSDCromm(y, lambda = 0.001)
# This is equivalent to calling a more general function
RegSDCgen(y, lambda = 0.001, makeunique = TRUE)
# The parameter makeunique is of minor importance, but it must be TRUE if exact distributional behaviour
# is important (sampling from RegSDCromm several times). So setting makeunique to FALSE can be OK.
# Feel free to import/wrap functions from RegSDC within sdcMicro.
# However, this line
RegSDCgen(y, lambda = 0.001, makeunique = FALSE)
# can be implemented without using RegSDC by
lambda <- 0.001
y <- as.matrix(y)
Mean <- function(x) t(matrix(colMeans(x), ncol(x), nrow(x)))
qr1 <- qr(y - Mean(y))
qr1Q <- qr.Q(qr1)
z <- qr1Q + lambda * matrix(rnorm(length(qr1Q)), nrow(y))
qr2 <- qr(z - Mean(z))
Mean(y) + qr.Q(qr2) %*% qr.R(qr1)
# Here Mean can be replaced in several ways. The difference from the result using RegSDCgen is at the
# level of numerical precision (use set.seed to see).
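The claim that the QR-based steps above preserve sample means can be checked numerically. The following is only a sketch under assumptions: it uses simulated data rather than testdata, base R only, and the helper names `center_cols` and `romm_sketch` are hypothetical (they are not part of RegSDC or sdcMicro). It wraps the same steps in a function and compares column means and the covariance matrix before and after perturbation.

```r
# Hypothetical helper: subtract the column means from every row.
center_cols <- function(x) sweep(x, 2, colMeans(x))

# Hypothetical wrapper around the QR-based steps shown above.
romm_sketch <- function(y, lambda = 0.001) {
  y <- as.matrix(y)
  qr1 <- qr(center_cols(y))            # QR of the centered data
  q1 <- qr.Q(qr1)
  z <- q1 + lambda * matrix(rnorm(length(q1)), nrow(y))  # perturb Q
  qr2 <- qr(center_cols(z))            # re-orthogonalize after centering
  means <- matrix(colMeans(y), nrow(y), ncol(y), byrow = TRUE)
  means + qr.Q(qr2) %*% qr.R(qr1)      # add the original means back
}

# Simulated data (an assumption; not the sdcMicro testdata).
set.seed(1)
y_sim <- matrix(rnorm(300, mean = 50, sd = 10), nrow = 100, ncol = 3)
y_pert <- romm_sketch(y_sim, lambda = 0.001)

# Both differences should be at the level of numerical precision.
mean_diff <- max(abs(colMeans(y_pert) - colMeans(y_sim)))
cov_diff <- max(abs(cov(y_pert) - cov(y_sim)))
print(c(mean_diff = mean_diff, cov_diff = cov_diff))
```

The means are preserved because the columns of the second Q factor come from a centered matrix and therefore sum to zero. The covariance matrix is preserved because that Q factor has orthonormal columns, so the cross-product of the perturbed centered data equals that of the original centered data.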