From 5b95ad35b2706ba4e29c85d167baf5553b93fcd2 Mon Sep 17 00:00:00 2001 From: "Jose R. Zubizarreta" Date: Mon, 26 Jun 2023 21:10:02 +0000 Subject: [PATCH] version 0.5.1 --- DESCRIPTION | 17 +- MD5 | 34 +- NAMESPACE | 16 +- R/bmatch.r | 881 ++++++++------- R/cardmatch.r | 113 +- R/constraintmatrix.r | 908 +++++++-------- R/nmatch.r | 1836 ++++++++++++++++--------------- R/oneprob_profmatch.r | 98 +- R/problemparameters_cardmatch.r | 1 + R/problemparameters_profmatch.R | 3 +- R/relaxation_b.r | 100 +- R/relaxation_n.r | 122 +- man/bmatch.Rd | 2 +- man/cardmatch.Rd | 8 +- man/designmatch-package.Rd | 126 +-- man/distmatch.Rd | 4 +- man/nmatch.Rd | 6 +- man/profmatch.Rd | 4 +- 18 files changed, 2345 insertions(+), 1934 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index 83c8214..8e928ec 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,17 +1,16 @@ Package: designmatch Type: Package Title: Matched Samples that are Balanced and Representative by Design -Version: 0.4.1 -Date: 2022-08-25 +Version: 0.5.1 +Date: 2023-06-21 Author: Jose R. Zubizarreta , Cinar Kilcioglu , Juan P. Vielma , Eric R. Cohn Maintainer: Jose R. Zubizarreta -Depends: R (>= 3.2), lattice, MASS, slam, Rglpk -Enhances: gurobi, Rcplex, Rmosek, Rsymphony -SystemRequirements: GLPK library package (e.g., libglpk-dev on - Debian/Ubuntu) +Depends: R (>= 3.2), lattice, MASS, slam, highs +Enhances: gurobi, Rcplex, Rglpk, Rmosek, Rsymphony License: GPL-2 | GPL-3 -Description: Includes functions for the construction of matched samples that are balanced and representative by design. Among others, these functions can be used for matching in observational studies with treated and control units, with cases and controls, in related settings with instrumental variables, and in discontinuity designs. Also, they can be used for the design of randomized experiments, for example, for matching before randomization. By default, 'designmatch' uses the 'GLPK' optimization solver, but its performance is greatly enhanced by the 'Gurobi' optimization solver and its associated R interface. For their installation, please follow the instructions at and . We have also included directions in the gurobi_installation file in the inst folder. +Description: Includes functions for the construction of matched samples that are balanced and representative by design. Among others, these functions can be used for matching in observational studies with treated and control units, with cases and controls, in related settings with instrumental variables, and in discontinuity designs. Also, they can be used for the design of randomized experiments, for example, for matching before randomization. By default, 'designmatch' uses the 'highs' optimization solver, but its performance is greatly enhanced by the 'Gurobi' optimization solver and its associated R interface. For their installation, please follow the instructions at and . We have also included directions in the gurobi_installation file in the inst folder. NeedsCompilation: no -Packaged: 2022-08-25 19:15:48 UTC; eric +Packaged: 2023-06-26 19:11:34 UTC; eric Repository: CRAN -Date/Publication: 2022-08-25 22:22:40 UTC +Date/Publication: 2023-06-26 22:10:02 UTC +RoxygenNote: 7.2.3 diff --git a/MD5 b/MD5 index e5c9787..ef86d6c 100644 --- a/MD5 +++ b/MD5 @@ -1,10 +1,10 @@ -1913c0b2bfe68cd7e30b4473718ee292 *DESCRIPTION -a84e461889448bab6f00dc036a173876 *NAMESPACE +33685adfb5060742d93ab1f1bfa579c2 *DESCRIPTION +45cba158a6e3de9c2156a518f2f24628 *NAMESPACE 4035589e4087aa8c0dfc97e90db340f8 *R/absstddif.r 7d1a85944bcd2594e3ab13ffdd80bd1e *R/addcalip.r -06d2cede573309b27980bbb8869ad01e *R/bmatch.r -53910a26ffdd46067f0ff93d9694b58d *R/cardmatch.r -1ccbbc3d8e654fcd4366a8c93350178b *R/constraintmatrix.r +9017e1fba5bece7acf56712cef12c2d8 *R/bmatch.r +9dc5166095b7987dbf4e742345004242 *R/cardmatch.r +9e77b9b929f571ca8ac335edda5e67f1 *R/constraintmatrix.r ddd36bec26100d3064a1bf106c36740c *R/distmat.r 692b745a194a270376faf4842b958a02 *R/distmatch.r f562ca6fc19a2951c9fcaaa01cae81dd *R/ecdfplot.r @@ -12,32 +12,32 @@ d41d8cd98f00b204e9800998ecf8427e *R/errorhandling.r a7c14294e78b350eca6b9b036f9f0817 *R/finetab.r c98b06b08fd9add6edc0464b9d8d3443 *R/loveplot.r d3df5386a7329cd2c93dedac69a83db4 *R/meantab.r -4a4e30c24c6373c9118b32245b3f29fc *R/nmatch.r -a63de79ad34c200384f40be979e2bfb1 *R/oneprob_profmatch.r +09803a6063703ebc862e15f714aeda64 *R/nmatch.r +c445bfdc992990fc1986f8d2bf524573 *R/oneprob_profmatch.r 98f291ae461a0b1c18ea5d8858b3eb22 *R/pairsplot.r 36d380e4bd2fb8bbd551af173cc12056 *R/problemparameters.r -6c4827264e52d44760c865663f2e6c2d *R/problemparameters_cardmatch.r -100040b1be9171a41dbf7c7329ab22a2 *R/problemparameters_profmatch.R +90e08362114fb82097302f29ef213616 *R/problemparameters_cardmatch.r +60d3a3e224db7c5adf9e7789fa981e5e *R/problemparameters_profmatch.R 4ef6dd019eb18dbc4f060d44a730d440 *R/profmatch.r -faaeb63e221a183bf338d8005151bb03 *R/relaxation_b.r -1ee8331d65f0dab43a41bd5a4740fcdf *R/relaxation_n.r +ea52925286223eec1f44292880f1ca6c *R/relaxation_b.r +85f4ef21f5d27230f788812ed84162e9 *R/relaxation_n.r 8a6974d5a5f551af3ca1e1be7d555be6 *R/smahal.r 0187616874e4a366b4ba006ded775e2d *data/germancities.rda 7c8f4dadefa6fd57a06a23067605dda0 *data/lalonde.rda fca1c5c2ab15ec2207f7f77919cca494 *inst/gurobi_installation.txt 8b5ab702a3ca62c67934d34220c09ab6 *inst/symphony_installation.txt a8685672a86c1c9e7c630b430a193f5f *man/absstddif.Rd -d6b93eea5adcb9f75f4b1ffdc0733073 *man/bmatch.Rd -8a815d1326851ed95222c480ebb02ebd *man/cardmatch.Rd -f57efa7f8b3a56947c842360a6be4c4b *man/designmatch-package.Rd +24064d46ad3c8d8c0ba66b1d077ce560 *man/bmatch.Rd +332017726f0e529aa0af7ec67bd736dd *man/cardmatch.Rd +e61d158873b0ed5f8370adc0a8a7c7fa *man/designmatch-package.Rd 9d510e34095ba20a26ae6ed0e3b7807a *man/distmat.Rd -14e2735656fdb26f4990db6dc593066e *man/distmatch.Rd +af456f0b2fbc3bb7d9c7feeefd3087fe *man/distmatch.Rd 1d59c6df9d58dd6a7d6d7e94a0a535b8 *man/ecdfplot.Rd e7f0f69f8b70ba4d92f21b1d93d60bec *man/finetab.Rd 1498da5b6612d10d4077748d27e1dd75 *man/germancities.Rd e98a127c9df8177312c4cbb48b34dfab *man/lalonde.Rd e6792c32417b2ef50459d550ed90b83f *man/loveplot.Rd 1616542174f147bf821c0ca73161fb10 *man/meantab.Rd -52194ac8da397357463a7f0fd66a4d8b *man/nmatch.Rd +0dc408d2fd1a672d9f05fe8342c42d47 *man/nmatch.Rd 54aa121631d4cc210c88409123f0cbec *man/pairsplot.Rd -e29657b717ad90002275a1a20f7d5049 *man/profmatch.Rd +7048e0d3e5062792f73e8ad0a4923aed *man/profmatch.Rd diff --git a/NAMESPACE b/NAMESPACE index 1b7ea18..b0fd401 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -1,14 +1,24 @@ -# Default NAMESPACE created by R # Remove the previous line if you edit this file # Export all names -exportPattern(".") +export(absstddif) +export(bmatch) +export(cardmatch) +export(distmat) +export(distmatch) +export(ecdfplot) +export(finetab) +export(loveplot) +export(meantab) +export(nmatch) +export(pairsplot) +export(profmatch) # Import all packages listed as Imports or Depends import( + highs, lattice, MASS, - Rglpk, slam ) diff --git a/R/bmatch.r b/R/bmatch.r index 63efc50..8c37a15 100644 --- a/R/bmatch.r +++ b/R/bmatch.r @@ -1,402 +1,481 @@ -#! bmatch -bmatch = function(t_ind, dist_mat = NULL, subset_weight = NULL, n_controls = 1, total_groups = NULL, - mom = NULL, - ks = NULL, - exact = NULL, - near_exact = NULL, - fine = NULL, - near_fine = NULL, - near = NULL, - far = NULL, - #use_controls = NULL, - solver = NULL) { - - use_controls = NULL - - if (is.null(mom)) { - mom_covs = NULL - mom_tols = NULL - mom_targets = NULL - } else { - mom_covs = mom$covs - mom_tols = mom$tols - mom_targets = mom$targets - } - - if (is.null(ks)) { - ks_covs = NULL - ks_n_grid = 10 - ks_tols = NULL - } else { - ks_covs = ks$covs - ks_n_grid = ks$n_grid - ks_tols = ks$tols - } - - if (is.null(exact)) { - exact_covs = NULL - } else { - exact_covs = exact$covs - } - - if (is.null(near_exact)) { - near_exact_covs = NULL - near_exact_devs = NULL - } else { - near_exact_covs = near_exact$covs - near_exact_devs = near_exact$devs - } - - if (is.null(fine)) { - fine_covs = NULL - } else { - fine_covs = fine$covs - } - - if (is.null(near_fine)) { - near_fine_covs = NULL - near_fine_devs = NULL - } else { - near_fine_covs = near_fine$covs - near_fine_devs = near_fine$devs - } - - if (is.null(near)) { - near_covs = NULL - near_pairs = NULL - near_groups = NULL - } else { - near_covs = near$covs - near_pairs = near$pairs - near_groups = near$groups - } - - if (is.null(far)) { - far_covs = NULL - far_pairs = NULL - far_groups = NULL - } else { - far_covs = far$covs - far_pairs = far$pairs - far_groups = far$groups - } - - if (is.null(solver)) { - solver = 'glpk' - t_max = 60 * 15 - approximate = 1 - } else { - t_max = solver$t_max - approximate = solver$approximate - trace = solver$trace - round_cplex = solver$round_cplex - solver = solver$name - } - - - - #! CALL ERROR HANDLING - - if (is.null(subset_weight)) { - subset_weight = 0 - } - - #! Generate the parameters - cat(format(" Building the matching problem..."), "\n") - prmtrs = .problemparameters(t_ind, dist_mat, subset_weight, n_controls, total_groups, - mom_covs, mom_tols, mom_targets, - ks_covs, ks_n_grid, ks_tols, - exact_covs, - near_exact_covs, near_exact_devs, - fine_covs, - near_fine_covs, near_fine_devs, - near_covs, near_pairs, near_groups, - far_covs, far_pairs, far_groups, - use_controls, - approximate) - n_t = prmtrs$n_t - n_c = prmtrs$n_c - - cvec = prmtrs$cvec - Amat = prmtrs$Amat - bvec = prmtrs$bvec - ub = prmtrs$ub - sense = prmtrs$sense - vtype = prmtrs$vtype - c_index = prmtrs$c_index - - #! Find matches and calculate the elapsed time - #! Gurobi - if (solver == "gurobi") { - #library(gurobi) - if (requireNamespace('gurobi', quietly=TRUE)) { - cat(format(" Gurobi optimizer is open..."), "\n") - model = list() - model$obj = cvec - model$A = Amat - model$sense = rep(NA, length(sense)) - model$sense[sense=="E"] = '=' - model$sense[sense=="L"] = '<=' - model$sense[sense=="G"] = '>=' - model$rhs = bvec - model$vtypes = vtype - model$ub = ub - - t_lim = list(TimeLimit = t_max, OutputFlag = trace) - - cat(format(" Finding the optimal matches..."), "\n") - ptm = proc.time() - out = gurobi::gurobi(model, t_lim) - time = (proc.time()-ptm)[3] - - if (out$status == "INFEASIBLE") { - cat(format(" Error: problem infeasible!"), "\n") - obj_total = NA - obj_dist_mat = NA - t_id = NA - c_id = NA - group_id = NA - time = NA - } - - if (out$status == "OPTIMAL" || out$status == "TIME_LIMIT") { - - if (out$status == "OPTIMAL") { - cat(format(" Optimal matches found"), "\n") - } - - else { - cat(format(" Time limit reached, best suboptimal solution given"), "\n") - } - - if (approximate == 1) { - rel = .relaxation_b(n_t, n_c, out$x, dist_mat, subset_weight, "gurobi", round_cplex, trace) - out$x = rel$sol - out$objval = rel$obj - time = time + rel$time - } - - #! Matched units indexes - t_id = unique(sort(rep(1:n_t, n_c))[out$x[1:(n_t*n_c)]==1]) - c_id = (c_index+n_t)[out$x[1:(n_t*n_c)]==1] - - #! Group (or pair) identifier - group_id_t = 1:(length(t_id)) - group_id_c = sort(rep(1:(length(t_id)), n_controls)) - group_id = c(group_id_t, group_id_c) - - #! Optimal value of the objective function - obj_total = out$objval - - if (!is.null(dist_mat)) { - obj_dist_mat = sum(c(as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE)) * out$x[1:(n_t*n_c)]==1)) - } else { - obj_dist_mat = NULL - } - } - } else { - stop('Required solver not installed') - } - - } - - #! CPLEX - else if (solver == "cplex") { - #library(Rcplex) - if (requireNamespace('Rcplex', quietly=TRUE)) { - cat(format(" CPLEX optimizer is open..."), "\n") - cat(format(" Finding the optimal matches..."), "\n") - ptm = proc.time() - out = Rcplex::Rcplex(cvec, Amat, bvec, ub = ub, sense = sense, vtype = vtype, n = 1, - control = list(trace = trace, round = round_cplex, tilim = t_max)) - time = (proc.time()-ptm)[3] - - if (out$status==108) { - cat(format(" Error: time limit exceeded, no integer solution!"), "\n") - obj_total = NA - obj_dist_mat = NA - t_id = NA - c_id = NA - group_id = NA - time = NA - } else if (is.na(out$obj)) { - cat(format(" Error: problem infeasible!"), "\n") - obj_total = NA - obj_dist_mat = NA - t_id = NA - c_id = NA - group_id = NA - time = NA - } - - if (!is.na(out$obj)) { - cat(format(" Optimal matches found"), "\n") - - if (approximate == 1) { - rel = .relaxation_b(n_t, n_c, out$xopt, dist_mat, subset_weight, "cplex", round_cplex, trace) - out$xopt = rel$sol - out$obj = rel$obj - time = time + rel$time - } - - - #! Matched controls indexes - t_id = unique(sort(rep(1:n_t, n_c))[out$xopt[1:(n_t*n_c)]==1]) - c_id = (c_index+n_t)[out$xopt[1:(n_t*n_c)]==1] - - #! Group (or pair) identifier - group_id_t = 1:(length(t_id)) - group_id_c = sort(rep(1:(length(t_id)), n_controls)) - group_id = c(group_id_t, group_id_c) - - #! Optimal value of the objective function - obj_total = out$obj - - if (!is.null(dist_mat)) { - obj_dist_mat = sum(c(as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE)) * out$xopt[1:(n_t*n_c)]==1)) - } else { - obj_dist_mat = NULL - } - - } - } else { - stop('Required solver not installed') - } - - } - - #! GLPK - else if (solver == "glpk") { - #library(Rglpk) - cat(format(" GLPK optimizer is open..."), "\n") - dir = rep(NA, length(prmtrs$sense)) - dir[prmtrs$sense=="E"] = '==' - dir[prmtrs$sense=="L"] = '<=' - dir[prmtrs$sense=="G"] = '>=' - - bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), - upper = list(ind=c(1:length(ub)), val=ub)) - - cat(format(" Finding the optimal matches..."), "\n") - ptm = proc.time() - out= Rglpk_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = vtype, max = FALSE) - time = (proc.time()-ptm)[3] - - if (out$status!=0) { - cat(format(" Error: problem infeasible!"), "\n") - obj_total = NA - obj_dist_mat = NA - t_id = NA - c_id = NA - group_id = NA - time = NA - } - - if (out$status==0) { - cat(format(" Optimal matches found"), "\n") - - if (approximate == 1) { - rel = .relaxation_b(n_t, n_c, out$solution, dist_mat, subset_weight, "glpk", round_cplex, trace) - out$solution = rel$sol - out$optimum = rel$obj - time = time + rel$time - } - - - #! Matched controls indexes - t_id = unique(sort(rep(1:n_t, n_c))[out$solution[1:(n_t*n_c)]==1]) - c_id = (c_index+n_t)[out$solution[1:(n_t*n_c)]==1] - - #! Group (or pair) identifier - group_id_t = 1:(length(t_id)) - group_id_c = sort(rep(1:(length(t_id)), n_controls)) - group_id = c(group_id_t, group_id_c) - - #! Optimal value of the objective function - obj_total = out$optimum - - if (!is.null(dist_mat)) { - obj_dist_mat = sum(c(as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE)) * out$solution[1:(n_t*n_c)]==1)) - } else { - obj_dist_mat = NULL - } - - } - } - - #! Symphony - else { - #library(Rsymphony) - if (requireNamespace('Rsymphony', quietly=TRUE)) { - cat(format(" Symphony optimizer is open..."), "\n") - - dir = rep(NA, length(prmtrs$sense)) - dir[prmtrs$sense=="E"] = '==' - dir[prmtrs$sense=="L"] = '<=' - dir[prmtrs$sense=="G"] = '>=' - - bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), - upper = list(ind=c(1:length(ub)), val=ub)) - - cat(format(" Finding the optimal matches..."), "\n") - ptm = proc.time() - out= Rsymphony::Rsymphony_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = vtype, max = FALSE, time_limit = t_max) - time = (proc.time()-ptm)[3] - - if (out$status==228) { - cat(format(" Error: problem exceeded the time limit and no feasible solution is found!"), "\n") - obj_total = NA - obj_dist_mat = NA - t_id = NA - c_id = NA - group_id = NA - time = NA - } - else if (out$status!=0) { - cat(format(" Error: problem infeasible!"), "\n") - obj_total = NA - obj_dist_mat = NA - t_id = NA - c_id = NA - group_id = NA - time = NA - } - - if (out$status==0) { - cat(format(" Optimal matches found"), "\n") - - if (approximate == 1) { - rel = .relaxation_b(n_t, n_c, out$solution, dist_mat, subset_weight, "symphony", round_cplex, trace) - out$solution = rel$sol - out$objval = rel$obj - time = time + rel$time - } - - #! Matched controls indexes - t_id = unique(sort(rep(1:n_t, n_c))[out$solution[1:(n_t*n_c)]==1]) - c_id = (c_index+n_t)[out$solution[1:(n_t*n_c)]==1] - - #! Group (or pair) identifier - group_id_t = 1:(length(t_id)) - group_id_c = sort(rep(1:(length(t_id)), n_controls)) - group_id = c(group_id_t, group_id_c) - - #! Optimal value of the objective function - obj_total = out$objval - - if (!is.null(dist_mat)) { - obj_dist_mat = sum(c(as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE)) * out$solution[1:(n_t*n_c)]==1)) - } else { - obj_dist_mat = NULL - } - - } - } else { - stop('Required solver not installed') - } - - } - #! Output - return(list(obj_total = obj_total, obj_dist_mat = obj_dist_mat, - t_id = t_id, c_id = c_id, group_id = group_id, time = time, status=out$status)) +#! bmatch +bmatch = function(t_ind, dist_mat = NULL, subset_weight = NULL, n_controls = 1, total_groups = NULL, + mom = NULL, + ks = NULL, + exact = NULL, + near_exact = NULL, + fine = NULL, + near_fine = NULL, + near = NULL, + far = NULL, + #use_controls = NULL, + solver = NULL) { + + use_controls = NULL + + if (is.null(mom)) { + mom_covs = NULL + mom_tols = NULL + mom_targets = NULL + } else { + mom_covs = mom$covs + mom_tols = mom$tols + mom_targets = mom$targets + } + + if (is.null(ks)) { + ks_covs = NULL + ks_n_grid = 10 + ks_tols = NULL + } else { + ks_covs = ks$covs + ks_n_grid = ks$n_grid + ks_tols = ks$tols + } + + if (is.null(exact)) { + exact_covs = NULL + } else { + exact_covs = exact$covs + } + + if (is.null(near_exact)) { + near_exact_covs = NULL + near_exact_devs = NULL + } else { + near_exact_covs = near_exact$covs + near_exact_devs = near_exact$devs + } + + if (is.null(fine)) { + fine_covs = NULL + } else { + fine_covs = fine$covs + } + + if (is.null(near_fine)) { + near_fine_covs = NULL + near_fine_devs = NULL + } else { + near_fine_covs = near_fine$covs + near_fine_devs = near_fine$devs + } + + if (is.null(near)) { + near_covs = NULL + near_pairs = NULL + near_groups = NULL + } else { + near_covs = near$covs + near_pairs = near$pairs + near_groups = near$groups + } + + if (is.null(far)) { + far_covs = NULL + far_pairs = NULL + far_groups = NULL + } else { + far_covs = far$covs + far_pairs = far$pairs + far_groups = far$groups + } + + if (is.null(solver)) { + t_max = 60 * 15 + approximate = 1 + solver = "highs" + } else { + t_max = solver$t_max + approximate = solver$approximate + trace = solver$trace + round_cplex = solver$round_cplex + solver = solver$name + } + + + + #! CALL ERROR HANDLING + + if (is.null(subset_weight)) { + subset_weight = 0 + } + + #! Generate the parameters + cat(format(" Building the matching problem..."), "\n") + prmtrs = .problemparameters(t_ind, dist_mat, subset_weight, n_controls, total_groups, + mom_covs, mom_tols, mom_targets, + ks_covs, ks_n_grid, ks_tols, + exact_covs, + near_exact_covs, near_exact_devs, + fine_covs, + near_fine_covs, near_fine_devs, + near_covs, near_pairs, near_groups, + far_covs, far_pairs, far_groups, + use_controls, + approximate) + n_t = prmtrs$n_t + n_c = prmtrs$n_c + + cvec = prmtrs$cvec + Amat = prmtrs$Amat + bvec = prmtrs$bvec + ub = prmtrs$ub + sense = prmtrs$sense + vtype = prmtrs$vtype + c_index = prmtrs$c_index + + #! Find matches and calculate the elapsed time + #! Gurobi + if (solver == "gurobi") { + #library(gurobi) + if (requireNamespace('gurobi', quietly=TRUE)) { + cat(format(" Gurobi optimizer is open..."), "\n") + model = list() + model$obj = cvec + model$A = Amat + model$sense = rep(NA, length(sense)) + model$sense[sense=="E"] = '=' + model$sense[sense=="L"] = '<=' + model$sense[sense=="G"] = '>=' + model$rhs = bvec + model$vtypes = vtype + model$ub = ub + + t_lim = list(TimeLimit = t_max, OutputFlag = trace) + + cat(format(" Finding the optimal matches..."), "\n") + ptm = proc.time() + out = gurobi::gurobi(model, t_lim) + time = (proc.time()-ptm)[3] + + if (out$status == "INFEASIBLE") { + cat(format(" Error: problem infeasible!"), "\n") + obj_total = NA + obj_dist_mat = NA + t_id = NA + c_id = NA + group_id = NA + time = NA + } + + if (out$status == "OPTIMAL" || out$status == "TIME_LIMIT") { + + if (out$status == "OPTIMAL") { + cat(format(" Optimal matches found"), "\n") + } + + else { + cat(format(" Time limit reached, best suboptimal solution given"), "\n") + } + + if (approximate == 1) { + rel = .relaxation_b(n_t, n_c, out$x, dist_mat, subset_weight, "gurobi", round_cplex, trace) + out$x = rel$sol + out$objval = rel$obj + time = time + rel$time + } + + #! Matched units indexes + t_id = unique(sort(rep(1:n_t, n_c))[out$x[1:(n_t*n_c)]==1]) + c_id = (c_index+n_t)[out$x[1:(n_t*n_c)]==1] + + #! Group (or pair) identifier + group_id_t = 1:(length(t_id)) + group_id_c = sort(rep(1:(length(t_id)), n_controls)) + group_id = c(group_id_t, group_id_c) + + #! Optimal value of the objective function + obj_total = out$objval + + if (!is.null(dist_mat)) { + obj_dist_mat = sum(c(as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE)) * out$x[1:(n_t*n_c)]==1)) + } else { + obj_dist_mat = NULL + } + } + } else { + stop('Required solver not installed') + } + + } + + #! CPLEX + else if (solver == "cplex") { + #library(Rcplex) + if (requireNamespace('Rcplex', quietly=TRUE)) { + cat(format(" CPLEX optimizer is open..."), "\n") + cat(format(" Finding the optimal matches..."), "\n") + ptm = proc.time() + out = Rcplex::Rcplex(cvec, Amat, bvec, ub = ub, sense = sense, vtype = vtype, n = 1, + control = list(trace = trace, round = round_cplex, tilim = t_max)) + time = (proc.time()-ptm)[3] + + if (out$status==108) { + cat(format(" Error: time limit exceeded, no integer solution!"), "\n") + obj_total = NA + obj_dist_mat = NA + t_id = NA + c_id = NA + group_id = NA + time = NA + } else if (is.na(out$obj)) { + cat(format(" Error: problem infeasible!"), "\n") + obj_total = NA + obj_dist_mat = NA + t_id = NA + c_id = NA + group_id = NA + time = NA + } + + if (!is.na(out$obj)) { + cat(format(" Optimal matches found"), "\n") + + if (approximate == 1) { + rel = .relaxation_b(n_t, n_c, out$xopt, dist_mat, subset_weight, "cplex", round_cplex, trace) + out$xopt = rel$sol + out$obj = rel$obj + time = time + rel$time + } + + + #! Matched controls indexes + t_id = unique(sort(rep(1:n_t, n_c))[out$xopt[1:(n_t*n_c)]==1]) + c_id = (c_index+n_t)[out$xopt[1:(n_t*n_c)]==1] + + #! Group (or pair) identifier + group_id_t = 1:(length(t_id)) + group_id_c = sort(rep(1:(length(t_id)), n_controls)) + group_id = c(group_id_t, group_id_c) + + #! Optimal value of the objective function + obj_total = out$obj + + if (!is.null(dist_mat)) { + obj_dist_mat = sum(c(as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE)) * out$xopt[1:(n_t*n_c)]==1)) + } else { + obj_dist_mat = NULL + } + + } + } else { + stop('Required solver not installed') + } + + } + + #! GLPK + else if (solver == "glpk") { + #library(Rglpk) + if (requireNamespace('Rglpk', quietly = TRUE)) { + cat(format(" GLPK optimizer is open..."), "\n") + dir = rep(NA, length(prmtrs$sense)) + dir[prmtrs$sense=="E"] = '==' + dir[prmtrs$sense=="L"] = '<=' + dir[prmtrs$sense=="G"] = '>=' + + bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), + upper = list(ind=c(1:length(ub)), val=ub)) + + cat(format(" Finding the optimal matches..."), "\n") + ptm = proc.time() + out= Rglpk::Rglpk_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = vtype, max = FALSE) + time = (proc.time()-ptm)[3] + + if (out$status!=0) { + cat(format(" Error: problem infeasible!"), "\n") + obj_total = NA + obj_dist_mat = NA + t_id = NA + c_id = NA + group_id = NA + time = NA + } + + if (out$status==0) { + cat(format(" Optimal matches found"), "\n") + + if (approximate == 1) { + rel = .relaxation_b(n_t, n_c, out$solution, dist_mat, subset_weight, "glpk", round_cplex, trace) + out$solution = rel$sol + out$optimum = rel$obj + time = time + rel$time + } + + + #! Matched controls indexes + t_id = unique(sort(rep(1:n_t, n_c))[out$solution[1:(n_t*n_c)]==1]) + c_id = (c_index+n_t)[out$solution[1:(n_t*n_c)]==1] + + #! Group (or pair) identifier + group_id_t = 1:(length(t_id)) + group_id_c = sort(rep(1:(length(t_id)), n_controls)) + group_id = c(group_id_t, group_id_c) + + #! Optimal value of the objective function + obj_total = out$optimum + + if (!is.null(dist_mat)) { + obj_dist_mat = sum(c(as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE)) * out$solution[1:(n_t*n_c)]==1)) + } else { + obj_dist_mat = NULL + } + + } + } + else { + stop('Required solver not installed') + } + } + + + #! HiGHS + else if (solver == "highs"){ + #library(highs) + cat(format(" HiGHS optimizer is open..."), "\n") + lhs = rep(-Inf, length(sense)) + rhs = rep(Inf, length(sense)) + lhs[sense == "G"] = bvec[sense == "G"] + rhs[sense == "L"] = bvec[sense == "L"] + lhs[sense == "E"] = bvec[sense == "E"] + rhs[sense == "E"] = bvec[sense == "E"] + + types = vtype + types[types=="B"] = "I" + + cat(format(" Finding the optimal matches..."), "\n") + ptm = proc.time() + out = highs_solve(L = cvec, + lower = 0, + upper = ub, + A = Amat, + lhs = lhs, + rhs = rhs, + types = types, + control = (highs_control(time_limit = t_max))) + time = (proc.time()-ptm)[3] + + if (out$status == 7 | out$status == 13){ + if (out$status == 7){ + cat(format(" Optimal matches found"), "\n") + } + else if (out$status == 13){ + cat(format(" Time limit reached!"), "\n") + } + if (approximate == 1) { + rel = .relaxation_b(n_t, n_c, out$primal_solution, dist_mat, subset_weight, "highs", round_cplex, trace) + out$primal_solution = rel$sol + out$objval = rel$obj + time = time + rel$time + } + + #! Matched units indexes + t_id = unique(sort(rep(1:n_t, n_c))[round(out$primal_solution[1:(n_t*n_c)], 1e-10)==1]) + c_id = (c_index+n_t)[round(out$primal_solution[1:(n_t*n_c)], 1e-10)==1] + + #! Group (or pair) identifier + group_id_t = 1:(length(t_id)) + group_id_c = sort(rep(1:(length(t_id)), n_controls)) + group_id = c(group_id_t, group_id_c) + + #! Optimal value of the objective function + obj_total = out$objval + + if (!is.null(dist_mat)) { + obj_dist_mat = sum(c(as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE)) * round(out$primal_solution[1:(n_t*n_c)], 1e-10)==1)) + } else { + obj_dist_mat = NULL + } + } + else if (out$status == 8) { + cat(format(" Error: problem infeasible!"), "\n") + obj_total = NA + id = NA + time = NA + } + else{ + outmessage = paste0(" Error: HiGHS solver message: ", out$status_message) + cat(format(outmessage), "\n") + obj_total = NA + id = NA + time = NA + } + } + + #! Symphony + else { + #library(Rsymphony) + if (requireNamespace('Rsymphony', quietly=TRUE)) { + cat(format(" Symphony optimizer is open..."), "\n") + + dir = rep(NA, length(prmtrs$sense)) + dir[prmtrs$sense=="E"] = '==' + dir[prmtrs$sense=="L"] = '<=' + dir[prmtrs$sense=="G"] = '>=' + + bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), + upper = list(ind=c(1:length(ub)), val=ub)) + + cat(format(" Finding the optimal matches..."), "\n") + ptm = proc.time() + out= Rsymphony::Rsymphony_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = vtype, max = FALSE, time_limit = t_max) + time = (proc.time()-ptm)[3] + + if (out$status==228) { + cat(format(" Error: problem exceeded the time limit and no feasible solution is found!"), "\n") + obj_total = NA + obj_dist_mat = NA + t_id = NA + c_id = NA + group_id = NA + time = NA + } + else if (out$status!=0) { + cat(format(" Error: problem infeasible!"), "\n") + obj_total = NA + obj_dist_mat = NA + t_id = NA + c_id = NA + group_id = NA + time = NA + } + + if (out$status==0) { + cat(format(" Optimal matches found"), "\n") + + if (approximate == 1) { + rel = .relaxation_b(n_t, n_c, out$solution, dist_mat, subset_weight, "symphony", round_cplex, trace) + out$solution = rel$sol + out$objval = rel$obj + time = time + rel$time + } + + #! Matched controls indexes + t_id = unique(sort(rep(1:n_t, n_c))[out$solution[1:(n_t*n_c)]==1]) + c_id = (c_index+n_t)[out$solution[1:(n_t*n_c)]==1] + + #! Group (or pair) identifier + group_id_t = 1:(length(t_id)) + group_id_c = sort(rep(1:(length(t_id)), n_controls)) + group_id = c(group_id_t, group_id_c) + + #! Optimal value of the objective function + obj_total = out$objval + + if (!is.null(dist_mat)) { + obj_dist_mat = sum(c(as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE)) * out$solution[1:(n_t*n_c)]==1)) + } else { + obj_dist_mat = NULL + } + + } + } else { + stop('Required solver not installed') + } + + } + #! Output + return(list(obj_total = obj_total, obj_dist_mat = obj_dist_mat, + t_id = t_id, c_id = c_id, group_id = group_id, time = time, status=out$status)) } \ No newline at end of file diff --git a/R/cardmatch.r b/R/cardmatch.r index cabbfa1..d2d3975 100644 --- a/R/cardmatch.r +++ b/R/cardmatch.r @@ -22,9 +22,9 @@ cardmatch = function(t_ind, mom = NULL, fine = NULL, solver = NULL) { } if (is.null(solver)) { - solver = 'glpk' t_max = 60 * 15 approximate = 1 + solver = "highs" } else { t_max = solver$t_max approximate = solver$approximate @@ -158,43 +158,118 @@ cardmatch = function(t_ind, mom = NULL, fine = NULL, solver = NULL) { #! GLPK else if (solver == "glpk") { #library(Rglpk) - cat(format(" GLPK optimizer is open..."), "\n") - dir = rep(NA, length(prmtrs$sense)) - dir[prmtrs$sense=="E"] = '==' - dir[prmtrs$sense=="L"] = '<=' - dir[prmtrs$sense=="G"] = '>=' + if (requireNamespace('Rglpk', quietly = TRUE)) { + cat(format(" GLPK optimizer is open..."), "\n") + dir = rep(NA, length(prmtrs$sense)) + dir[prmtrs$sense=="E"] = '==' + dir[prmtrs$sense=="L"] = '<=' + dir[prmtrs$sense=="G"] = '>=' + + cat(format(" Finding the optimal matches..."), "\n") + ptm = proc.time() + out= Rglpk::Rglpk_solve_LP(cvec, Amat, dir, bvec, types = vtype, max = TRUE) + time = (proc.time()-ptm)[3] + + if (out$status!=0) { + cat(format(" Error: problem infeasible!"), "\n") + obj_total = NA + obj_dist_mat = NA + t_id = NA + c_id = NA + group_id = NA + time = NA + } + + if (out$status==0) { + cat(format(" Optimal matches found"), "\n") + + #! Matched units indexes + t_id = (1:n_dec_vars)[t_ind==1 & out$solution==1] + c_id = (1:n_dec_vars)[t_ind==0 & out$solution==1] + + #! Group (or pair) identifier + group_id = c(1:(length(t_id)), 1:(length(c_id))) + + #! Optimal value of the objective function + obj_total = out$optimum + + } + } + else { + stop('Required solver not installed') + } + } + + + #! HiGHS + else if (solver == "highs"){ + #library(highs) + cat(format(" HiGHS optimizer is open..."), "\n") + lhs = rep(-Inf, length(sense)) + rhs = rep(Inf, length(sense)) + lhs[sense == "G"] = bvec[sense == "G"] + rhs[sense == "L"] = bvec[sense == "L"] + lhs[sense == "E"] = bvec[sense == "E"] + rhs[sense == "E"] = bvec[sense == "E"] + + types = vtype + types[types=="B"] = "I" cat(format(" Finding the optimal matches..."), "\n") ptm = proc.time() - out= Rglpk_solve_LP(cvec, Amat, dir, bvec, types = vtype, max = TRUE) + out = highs_solve(L = cvec, + lower = 0, + upper = 1, + A = Amat, + lhs = lhs, + rhs = rhs, + types = types, + maximum = TRUE, + control = (highs_control(time_limit = t_max))) time = (proc.time()-ptm)[3] - if (out$status!=0) { + if (out$status == 7){ + cat(format(" Optimal matches found"), "\n") + + #! Matched units indexes + t_id = (1:n_dec_vars)[t_ind==1 & round(out$primal_solution, 1e-10)==1] + c_id = (1:n_dec_vars)[t_ind==0 & round(out$primal_solution, 1e-10)==1] + + #! Group (or pair) identifier + group_id = c(1:(length(t_id)), 1:(length(c_id))) + + #! Optimal value of the objective function + obj_total = out$objective_value + } + else if (out$status == 8) { cat(format(" Error: problem infeasible!"), "\n") obj_total = NA - obj_dist_mat = NA - t_id = NA - c_id = NA - group_id = NA + id = NA time = NA } - - if (out$status==0) { - cat(format(" Optimal matches found"), "\n") + else if (out$status == 13) { + cat(format(" Time limit reached!"), "\n") #! Matched units indexes - t_id = (1:n_dec_vars)[t_ind==1 & out$solution==1] - c_id = (1:n_dec_vars)[t_ind==0 & out$solution==1] + t_id = (1:n_dec_vars)[t_ind==1 & round(out$primal_solution, 1e-10)==1] + c_id = (1:n_dec_vars)[t_ind==0 & round(out$primal_solution, 1e-10)==1] #! Group (or pair) identifier group_id = c(1:(length(t_id)), 1:(length(c_id))) #! Optimal value of the objective function - obj_total = out$optimum - + obj_total = out$objective_value + } + else{ + outmessage = paste0(" Error: HiGHS solver message: ", out$status_message) + cat(format(outmessage), "\n") + obj_total = NA + id = NA + time = NA } } + #! Symphony else { #library(Rsymphony) diff --git a/R/constraintmatrix.r b/R/constraintmatrix.r index 9963a44..16cb8ab 100644 --- a/R/constraintmatrix.r +++ b/R/constraintmatrix.r @@ -1,454 +1,456 @@ -#! Builds the constraint matrix -.constraintmatrix = function(t_ind, n_controls, total_groups, - mom_covs, mom_tols, mom_targets, - ks_covs, ks_covs_aux, ks_n_grid, ks_tols, - exact_covs, - near_exact_covs, near_exact_devs, - fine_covs, - near_fine_covs, near_fine_devs, - near_covs, near_pairs, near_groups, - far_covs, far_pairs, far_groups, - use_controls, - approximate) { - - #! Number of treated units, number of controls - n_t = sum(t_ind) - n_c = length(t_ind)-n_t - - #! Total number of units - n_tot = n_t*n_c - - #! Build parts of the constraint matrix - #! Part 1 - if (approximate == 1 | n_controls == 1) { - row_ind_1 = sort(rep(1:n_t, n_c)) - col_ind_1 = 1:n_tot - ones_1 = rep(1, n_tot) - } - else { - row_ind_1 = c(sort(rep(1:n_t, n_c)), 1:n_t) - col_ind_1 = 1:(n_tot+n_t) - ones_1 = c(rep(1, n_tot), rep(-1*n_controls, n_t)) - } - - #! Part 2 - row_ind_2 = sort(rep(1:n_c, n_t))+n_t - col_ind_2 = rep(seq(1, n_t*n_c, n_c), n_c)+(sort(rep(1:n_c, n_t))-1) - ones_2 = rep(1, n_tot) - #! Current max row index - row_ind_cur = max(row_ind_2) - - #! Parts 3 and 4: moments and K-S - mom_ks_covs = NULL - if ((!is.null(mom_covs) & is.null(mom_targets)) | !is.null(ks_covs)) { - row_ind_3.4 = 0 - #! Number of moment covariates - n_mom_covs = 0 - if(!is.null(mom_covs) & is.null(mom_targets)) { - n_mom_covs = ncol(mom_covs) - } - #! Number of K-S covariates - n_ks_covs = 0 - if(!is.null(ks_covs)) { - n_ks_covs = ncol(ks_covs) - } - # Bind moment and K-S covariates - if(!is.null(mom_covs) & is.null(mom_targets) & is.null(ks_covs_aux)) { - mom_ks_covs = mom_covs - mom_ks_tols = mom_tols - } - if((is.null(mom_covs) & is.null(mom_targets)) & !is.null(ks_covs_aux)) { - mom_ks_covs = ks_covs_aux - mom_ks_tols = NA - for (i in 1:ncol(ks_covs)) { - mom_ks_tols = c(mom_ks_tols, rep(ks_tols[i], ks_n_grid[i]), rep(0, max(ks_n_grid)-ks_n_grid[i])) - } - mom_ks_tols = mom_ks_tols[-1] - } - if((!is.null(mom_covs) & is.null(mom_targets)) & !is.null(ks_covs_aux)) { - mom_ks_covs = cbind(mom_covs, ks_covs_aux) - mom_ks_tols = mom_tols - for (i in 1:ncol(ks_covs)) { - mom_ks_tols = c(mom_ks_tols, rep(ks_tols[i], ks_n_grid[i]), rep(0, max(ks_n_grid)-ks_n_grid[i])) - } - } - } - if (!is.null(mom_ks_covs)) { - n_mom_ks_covs = ncol(mom_ks_covs) - if ((!is.null(mom_tols) & is.null(mom_targets)) | !is.null(ks_tols)) { - row_ind_3.4 = sort(rep(1:(2*n_mom_ks_covs)+n_t+n_c, n_tot)) - } - col_ind_3.4 = NA - mom_ks_vals_3.4 = NA - j = 1 - k = 0 - for (i in 1:n_mom_ks_covs) { - if (n_mom_covs != 0 & i <= n_mom_covs) { - if ((!is.null(mom_tols) & is.null(mom_targets)) | !is.null(ks_tols)) { - col_ind_3.4 = c(col_ind_3.4, rep(1:n_tot, 2)) - } - } - if (n_ks_covs != 0 & i > n_mom_covs) { - if ((!is.null(mom_tols) & is.null(mom_targets)) | !is.null(ks_tols)) { - col_ind_3.4 = c(col_ind_3.4, rep(1:n_tot, 2)) - k = k+1 - if (k >= max(ks_n_grid)) { - j = j+1 - k = 0 - } - } - } - temp_mean_1 = rep(mom_ks_covs[t_ind==0, i], n_t)-(mom_ks_covs[t_ind==1, i])[sort(rep(1:n_t, n_c))] - if ((!is.null(mom_tols) & is.null(mom_targets)) | !is.null(ks_tols)) { - temp_mean_2 = temp_mean_1-(mom_ks_tols[i]*rep(1, n_t*n_c)) - temp_mean_3 = -temp_mean_1-(mom_ks_tols[i]*rep(1, n_t*n_c)) - } - mom_ks_vals_3.4 = c(mom_ks_vals_3.4, temp_mean_2, temp_mean_3) - if (i == 1) { - col_ind_3.4 = col_ind_3.4[-1] - mom_ks_vals_3.4 = mom_ks_vals_3.4[-1] - } - } - #! Current max row index - row_ind_cur = max(row_ind_3.4) - } - - #! Moment target part - rows_target = NULL - cols_target = NULL - vals_target = NULL - if (!is.null(mom_covs) & !is.null(mom_targets)) { - n_mom_covs = ncol(mom_covs) - rows_target = sort(rep(1:(4*n_mom_covs)+row_ind_cur, n_tot)) - - for (i in 1:n_mom_covs) { - cols_target = c(cols_target, rep(1:n_tot, 4)) - temp_treatment_1 = (mom_covs[t_ind==1, i])[sort(rep(1:n_t, n_c))] - (mom_targets[i] + mom_tols[i]) - temp_treatment_2 = -1*(mom_covs[t_ind==1, i])[sort(rep(1:n_t, n_c))] + (mom_targets[i] - mom_tols[i]) - temp_control_1 = rep(mom_covs[t_ind==0, i], n_t) - (mom_targets[i] + mom_tols[i]) - temp_control_2 = -1*rep(mom_covs[t_ind==0, i], n_t) + (mom_targets[i] - mom_tols[i]) - vals_target = c(vals_target, temp_treatment_1, temp_treatment_2, temp_control_1, temp_control_2) - } - row_ind_cur = max(rows_target) - } - - - #! Part 5: exact - rows_exact = NULL - cols_exact = NULL - vals_exact = NULL - if (!is.null(exact_covs)) { - n_exact_cats = ncol(exact_covs) - for (i in 1:n_exact_cats) { - rows_exact = c(rows_exact, rep(row_ind_cur+i, n_t*n_c)) - cols_exact = c(cols_exact, 1:(n_t*n_c)) - dist_exact_cov = abs(outer(exact_covs[t_ind==1, i], exact_covs[t_ind==0, i], "-")) - dist_exact_cov = t(dist_exact_cov) - vals_exact = c(vals_exact, as.vector(dist_exact_cov)) - } - row_ind_5 = rows_exact - col_ind_5 = cols_exact - exact_vals_5 = vals_exact - row_ind_cur = max(row_ind_5) - } - - #! Part 6: near-exact - rows_near_exact = NULL - cols_near_exact = NULL - vals_near_exact = NULL - if (!is.null(near_exact_covs)) { - n_near_exact_cats = ncol(near_exact_covs) - for (i in 1:n_near_exact_cats) { - rows_near_exact = c(rows_near_exact, rep(row_ind_cur+i, n_t*n_c)) - cols_near_exact = c(cols_near_exact, 1:(n_t*n_c)) - dist_near_exact_cov = abs(outer(near_exact_covs[t_ind==1, i], near_exact_covs[t_ind==0, i], "-")) - dist_near_exact_cov = t(dist_near_exact_cov) - vals_near_exact = c(vals_near_exact, as.vector(dist_near_exact_cov)) - } - row_ind_6 = rows_near_exact - col_ind_6 = cols_near_exact - near_exact_vals_6 = vals_near_exact - row_ind_cur = max(row_ind_6) - } - - #! Part 7: fine - bvec_7 = NULL - rows_fine = NULL - cols_fine = NULL - vals_fine = NULL - if (!is.null(fine_covs)) { - #! Transform fine_covs to a matrix of binary inds. for each cat. of each fine balancing covariate - fine_covs_2 = rep(NA, nrow(fine_covs)) - n_fine_covs = ncol(fine_covs) - j = 1 - for (i in 1:n_fine_covs) { - aux = factor(fine_covs[, i]) - fine_covs_2 = cbind(fine_covs_2, diag(nlevels(aux))[aux,]) - if (j == 1) { - fine_covs_2 = fine_covs_2[, -1] - } - j = j+1 - } - n_fine_cats = ncol(fine_covs_2) - for (i in 1:n_fine_cats) { - rows_fine = c(rows_fine, rep(row_ind_cur+i, n_t*n_c)) - cols_fine = c(cols_fine, 1:(n_t*n_c)) - dist_fine_cov = outer(fine_covs_2[t_ind==1, i], fine_covs_2[t_ind==0, i], "-") - dist_fine_cov = t(dist_fine_cov) - vals_fine = c(vals_fine, as.vector(dist_fine_cov)) - } - row_ind_7 = rows_fine - col_ind_7 = cols_fine - fine_vals_7 = vals_fine - bvec_7 = rep(0, n_fine_cats) - row_ind_cur = max(row_ind_7) - } - - #! Part 8: near-fine - bvec_8 = NULL - rows_near_fine = NULL - cols_near_fine = NULL - vals_near_fine = NULL - if (!is.null(near_fine_covs)) { - #! Transform fine_covs to a matrix of binary inds. for each cat. of each fine balancing covariate - near_fine_covs_2 = rep(NA, nrow(near_fine_covs)) - n_near_fine_covs = ncol(near_fine_covs) - j = 1 - for (i in 1:n_near_fine_covs) { - near_aux = factor(near_fine_covs[, i]) - near_fine_covs_2 = cbind(near_fine_covs_2, diag(nlevels(near_aux))[near_aux,]) - if (j == 1) { - near_fine_covs_2 = near_fine_covs_2[, -1] - } - j = j+1 - } - n_near_fine_cats = ncol(near_fine_covs_2) - for (i in 1:n_near_fine_cats) { - rows_near_fine = c(rows_near_fine, rep(row_ind_cur+i, n_t*n_c)) - cols_near_fine = c(cols_near_fine, 1:(n_t*n_c)) - dist_near_fine_cov = outer(near_fine_covs_2[t_ind==1, i], near_fine_covs_2[t_ind==0, i], "-") - dist_near_fine_cov = t(dist_near_fine_cov) - vals_near_fine = c(vals_near_fine, as.vector(dist_near_fine_cov)) - } - row_ind_cur = max(rows_near_fine) - for (i in 1:n_near_fine_cats) { - rows_near_fine = c(rows_near_fine, rep(row_ind_cur+i, n_t*n_c)) - } - row_ind_8 = rows_near_fine - col_ind_8 = c(cols_near_fine, cols_near_fine) - near_fine_vals_8 = c(vals_near_fine, vals_near_fine) - bvec_8 = rep(0, n_near_fine_cats) - row_ind_cur = max(row_ind_8) - } - - #! Part 9: Far - rows_ind_far_pairs = list() - if (!is.null(far_covs)) { - row_ind_9 = NULL - col_ind_9 = NULL - far_cov_vals_9 = NULL - n_far_covs = ncol(far_covs) - for (j in 1:n_far_covs) { - far_cov = far_covs[,j] - #! Far on average constraints - if (!is.null(far_groups)) { - far_group = far_groups[j] - row_ind_far_all = sort(c(rep(row_ind_cur+1, n_tot))) - col_ind_far_all = rep(1:n_tot, 1) - temp_mean_3 = (-rep(far_cov[t_ind==0], n_t)+((far_cov[t_ind==1])[sort(rep(1:n_t, n_c))]))-(far_group*rep(1, n_t*n_c)) - vals_far_all = c(temp_mean_3) - row_ind_cur = max(row_ind_far_all) - } - #! Far on all pairs constraints - if (!is.null(far_pairs)) { - far_pair = far_pairs[j] - aux = abs(outer(far_cov[t_ind==1], far_cov[t_ind==0], FUN = "-")) - temp = as.vector(matrix(t(aux), nrow = 1, byrow = TRUE)) - cols_ind_far_pairs = which(temp0) { - rows_ind_far_pairs[[j]] = row_ind_cur+(1:length(cols_ind_far_pairs)) - vals_far_pairs = rep(1, length(cols_ind_far_pairs)) - row_ind_cur = max(rows_ind_far_pairs[[j]]) - } - if (length(cols_ind_far_pairs)==0) { - cols_ind_far_pairs = NULL - rows_ind_far_pairs[[j]] = -1 - vals_far_pairs = NULL - } - } - #! Put together - if (!is.null(far_groups) && is.null(far_pairs)) { - row_ind_9 = c(row_ind_9, row_ind_far_all) - col_ind_9 = c(col_ind_9, col_ind_far_all) - far_cov_vals_9 = c(far_cov_vals_9, vals_far_all) - } - if (is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] != -1) { - row_ind_9 = c(row_ind_9, rows_ind_far_pairs[[j]]) - col_ind_9 = c(col_ind_9, cols_ind_far_pairs) - far_cov_vals_9 = c(far_cov_vals_9, vals_far_pairs) - } - if (!is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] != -1) { - row_ind_9 = c(row_ind_9, row_ind_far_all, rows_ind_far_pairs[[j]]) - col_ind_9 = c(col_ind_9, col_ind_far_all, cols_ind_far_pairs) - far_cov_vals_9 = c(far_cov_vals_9, vals_far_all, vals_far_pairs) - } - if (!is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] == -1) { - row_ind_9 = c(row_ind_9, row_ind_far_all) - col_ind_9 = c(col_ind_9, col_ind_far_all) - far_cov_vals_9 = c(far_cov_vals_9, vals_far_all) - } - } - } - - #! Part 10: Near - rows_ind_near_pairs = list() - if (!is.null(near_covs)) { - row_ind_10 = NULL - col_ind_10 = NULL - near_cov_vals_10 = NULL - n_near_covs = ncol(near_covs) - for (j in 1:n_near_covs) { - near_cov = near_covs[,j] - #! Near on average constraints - if (!is.null(near_groups)) { - near_group = near_groups[j] - row_ind_near_all = sort(c(rep(row_ind_cur+1, n_tot))) - col_ind_near_all = rep(1:n_tot, 1) - temp_mean_4 = (-rep(near_cov[t_ind==0], n_t)+((near_cov[t_ind==1])[sort(rep(1:n_t, n_c))]))-(near_group*rep(1, n_t*n_c)) - vals_near_all = c(temp_mean_4) - row_ind_cur = max(row_ind_near_all) - } - #! Near on all pairs constraints - if (!is.null(near_pairs)) { - near_pair = near_pairs[j] - aux = abs(outer(near_cov[t_ind==1], near_cov[t_ind==0], FUN = "-")) - temp = as.vector(matrix(t(aux), nrow = 1, byrow = TRUE)) - cols_ind_near_pairs = which(temp>near_pair) - if (length(cols_ind_near_pairs)>0) { - rows_ind_near_pairs[[j]] = row_ind_cur+(1:length(cols_ind_near_pairs)) - vals_near_pairs = rep(1, length(cols_ind_near_pairs)) - row_ind_cur = max(rows_ind_near_pairs[[j]]) - } - if (length(cols_ind_near_pairs)==0) { - cols_ind_near_pairs = NULL - rows_ind_near_pairs[[j]] = -1 - vals_near_pairs = NULL - } - } - #! Put together - if (!is.null(near_groups) && is.null(near_pairs)) { - row_ind_10 = c(row_ind_10, row_ind_near_all) - col_ind_10 = c(col_ind_10, col_ind_near_all) - near_cov_vals_10 = c(near_cov_vals_10, vals_near_all) - } - if (is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] != -1) { - row_ind_10 = c(row_ind_10, rows_ind_near_pairs[[j]]) - col_ind_10 = c(col_ind_10, cols_ind_near_pairs) - near_cov_vals_10 = c(near_cov_vals_10, vals_near_pairs) - } - if (!is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] != -1) { - row_ind_10 = c(row_ind_10, row_ind_near_all, rows_ind_near_pairs[[j]]) - col_ind_10 = c(col_ind_10, col_ind_near_all, cols_ind_near_pairs) - near_cov_vals_10 = c(near_cov_vals_10, vals_near_all, vals_near_pairs) - } - if (!is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] == -1) { - row_ind_10 = c(row_ind_10, row_ind_near_all) - col_ind_10 = c(col_ind_10, col_ind_near_all) - near_cov_vals_10 = c(near_cov_vals_10, vals_near_all) - } - } - } - - # Part 11: use controls - if (!is.null(use_controls)) { - use_controls = use_controls[(n_t+1):(n_t+n_c)] - use_controls_aux = rep(use_controls, n_t) - col_ind_11 = (1:n_tot)[use_controls_aux==1] - row_ind_11 = rep(row_ind_cur+1, length(col_ind_11)) - use_controls_vals_11 = rep(1, length(col_ind_11)) - - row_ind_cur = max(row_ind_11) - } - - # Part 12: total_groups - if (!is.null(total_groups)) { - row_ind_12 = rep(row_ind_cur+1, n_t*n_c) - col_ind_12 = 1:(n_t*n_c) - ones_12 = rep(1, n_t*n_c) - - row_ind_cur = max(row_ind_12) - } - - #! Put all the parts of the constraint matrix together - #! Parts 1 and 2 - row_ind = c(row_ind_1, row_ind_2) - col_ind = c(col_ind_1, col_ind_2) - vals = c(ones_1, ones_2) - #! Parts 3 and 4 - if (!is.null(mom_ks_covs)) { - row_ind = c(row_ind, row_ind_3.4) - col_ind = c(col_ind, col_ind_3.4) - vals = c(vals, mom_ks_vals_3.4) - } - #! Part 3b - if (!is.null(mom_covs) & !is.null(mom_targets)) { - row_ind = c(row_ind, rows_target) - col_ind = c(col_ind, cols_target) - vals = c(vals, vals_target) - } - #! Part 5 - if (!is.null(exact_covs)) { - row_ind = c(row_ind, row_ind_5) - col_ind = c(col_ind, col_ind_5) - vals = c(vals, exact_vals_5) - } - #! Part 6 - if (!is.null(near_exact_covs)) { - row_ind = c(row_ind, row_ind_6) - col_ind = c(col_ind, col_ind_6) - vals = c(vals, near_exact_vals_6) - } - #! Part 7 - if (!is.null(fine_covs)) { - row_ind = c(row_ind, row_ind_7) - col_ind = c(col_ind, col_ind_7) - vals = c(vals, fine_vals_7) - } - #! Part 8 - if (!is.null(near_fine_covs)) { - row_ind = c(row_ind, row_ind_8) - col_ind = c(col_ind, col_ind_8) - vals = c(vals, near_fine_vals_8) - } - #! Part 9 - if (!is.null(far_covs)) { - row_ind = c(row_ind, row_ind_9) - col_ind = c(col_ind, col_ind_9) - vals = c(vals, far_cov_vals_9) - } - #! Part 10 - if (!is.null(near_covs)) { - row_ind = c(row_ind, row_ind_10) - col_ind = c(col_ind, col_ind_10) - vals = c(vals, near_cov_vals_10) - } - #! Part 11 - if (!is.null(use_controls)) { - row_ind = c(row_ind, row_ind_11) - col_ind = c(col_ind, col_ind_11) - vals = c(vals, use_controls_vals_11) - } - #! Part 12 - if (!is.null(total_groups)) { - row_ind = c(row_ind, row_ind_12) - col_ind = c(col_ind, col_ind_12) - vals = c(vals, ones_12) - } - - aux = cbind(row_ind, col_ind, vals)[order(col_ind), ] - cnstrn_mat = simple_triplet_matrix(i = aux[, 1], j = aux[, 2], v = aux[, 3]) - - #! Output - return(list(cnstrn_mat = cnstrn_mat, bvec_7 = bvec_7, bvec_8 = bvec_8, rows_ind_far_pairs = rows_ind_far_pairs, rows_ind_near_pairs = rows_ind_near_pairs)) - +#! Builds the constraint matrix +.constraintmatrix = function(t_ind, n_controls, total_groups, + mom_covs, mom_tols, mom_targets, + ks_covs, ks_covs_aux, ks_n_grid, ks_tols, + exact_covs, + near_exact_covs, near_exact_devs, + fine_covs, + near_fine_covs, near_fine_devs, + near_covs, near_pairs, near_groups, + far_covs, far_pairs, far_groups, + use_controls, + approximate) { + + #! Number of treated units, number of controls + n_t = sum(t_ind) + n_c = length(t_ind)-n_t + + #! Total number of units + n_tot = n_t*n_c + + #! Build parts of the constraint matrix + #! Part 1 + if (approximate == 1 | n_controls == 1) { + row_ind_1 = sort(rep(1:n_t, n_c)) + col_ind_1 = 1:n_tot + ones_1 = rep(1, n_tot) + } + else { + row_ind_1 = c(sort(rep(1:n_t, n_c)), 1:n_t) + col_ind_1 = 1:(n_tot+n_t) + ones_1 = c(rep(1, n_tot), rep(-1*n_controls, n_t)) + } + + #! Part 2 + row_ind_2 = sort(rep(1:n_c, n_t))+n_t + col_ind_2 = rep(seq(1, n_t*n_c, n_c), n_c)+(sort(rep(1:n_c, n_t))-1) + ones_2 = rep(1, n_tot) + #! Current max row index + row_ind_cur = max(row_ind_2) + + #! Parts 3 and 4: moments and K-S + mom_ks_covs = NULL + if ((!is.null(mom_covs) & is.null(mom_targets)) | !is.null(ks_covs)) { + row_ind_3.4 = 0 + #! Number of moment covariates + n_mom_covs = 0 + if(!is.null(mom_covs) & is.null(mom_targets)) { + n_mom_covs = ncol(mom_covs) + } + #! Number of K-S covariates + n_ks_covs = 0 + if(!is.null(ks_covs)) { + n_ks_covs = ncol(ks_covs) + } + # Bind moment and K-S covariates + if(!is.null(mom_covs) & is.null(mom_targets) & is.null(ks_covs_aux)) { + mom_ks_covs = mom_covs + mom_ks_tols = mom_tols + } + if((is.null(mom_covs) & is.null(mom_targets)) & !is.null(ks_covs_aux)) { + mom_ks_covs = ks_covs_aux + mom_ks_tols = NA + for (i in 1:ncol(ks_covs)) { + mom_ks_tols = c(mom_ks_tols, rep(ks_tols[i], ks_n_grid[i]), rep(0, max(ks_n_grid)-ks_n_grid[i])) + } + mom_ks_tols = mom_ks_tols[-1] + } + if((!is.null(mom_covs) & is.null(mom_targets)) & !is.null(ks_covs_aux)) { + mom_ks_covs = cbind(mom_covs, ks_covs_aux) + mom_ks_tols = mom_tols + for (i in 1:ncol(ks_covs)) { + mom_ks_tols = c(mom_ks_tols, rep(ks_tols[i], ks_n_grid[i]), rep(0, max(ks_n_grid)-ks_n_grid[i])) + } + } + } + if (!is.null(mom_ks_covs)) { + n_mom_ks_covs = ncol(mom_ks_covs) + if ((!is.null(mom_tols) & is.null(mom_targets)) | !is.null(ks_tols)) { + row_ind_3.4 = sort(rep(1:(2*n_mom_ks_covs)+n_t+n_c, n_tot)) + } + col_ind_3.4 = NA + mom_ks_vals_3.4 = NA + j = 1 + k = 0 + for (i in 1:n_mom_ks_covs) { + if (n_mom_covs != 0 & i <= n_mom_covs) { + if ((!is.null(mom_tols) & is.null(mom_targets)) | !is.null(ks_tols)) { + col_ind_3.4 = c(col_ind_3.4, rep(1:n_tot, 2)) + } + } + if (n_ks_covs != 0 & i > n_mom_covs) { + if ((!is.null(mom_tols) & is.null(mom_targets)) | !is.null(ks_tols)) { + col_ind_3.4 = c(col_ind_3.4, rep(1:n_tot, 2)) + k = k+1 + if (k >= max(ks_n_grid)) { + j = j+1 + k = 0 + } + } + } + temp_mean_1 = rep(mom_ks_covs[t_ind==0, i], n_t)-(mom_ks_covs[t_ind==1, i])[sort(rep(1:n_t, n_c))] + if ((!is.null(mom_tols) & is.null(mom_targets)) | !is.null(ks_tols)) { + temp_mean_2 = temp_mean_1-(mom_ks_tols[i]*rep(1, n_t*n_c)) + temp_mean_3 = -temp_mean_1-(mom_ks_tols[i]*rep(1, n_t*n_c)) + } + mom_ks_vals_3.4 = c(mom_ks_vals_3.4, temp_mean_2, temp_mean_3) + if (i == 1) { + col_ind_3.4 = col_ind_3.4[-1] + mom_ks_vals_3.4 = mom_ks_vals_3.4[-1] + } + } + #! Current max row index + row_ind_cur = max(row_ind_3.4) + } + + #! Moment target part + rows_target = NULL + cols_target = NULL + vals_target = NULL + if (!is.null(mom_covs) & !is.null(mom_targets)) { + n_mom_covs = ncol(mom_covs) + rows_target = sort(rep(1:(4*n_mom_covs)+row_ind_cur, n_tot)) + + for (i in 1:n_mom_covs) { + cols_target = c(cols_target, rep(1:n_tot, 4)) + temp_treatment_1 = (mom_covs[t_ind==1, i])[sort(rep(1:n_t, n_c))] - (mom_targets[i] + mom_tols[i]) + temp_treatment_2 = -1*(mom_covs[t_ind==1, i])[sort(rep(1:n_t, n_c))] + (mom_targets[i] - mom_tols[i]) + temp_control_1 = rep(mom_covs[t_ind==0, i], n_t) - (mom_targets[i] + mom_tols[i]) + temp_control_2 = -1*rep(mom_covs[t_ind==0, i], n_t) + (mom_targets[i] - mom_tols[i]) + vals_target = c(vals_target, temp_treatment_1, temp_treatment_2, temp_control_1, temp_control_2) + } + row_ind_cur = max(rows_target) + } + + + #! Part 5: exact + rows_exact = NULL + cols_exact = NULL + vals_exact = NULL + if (!is.null(exact_covs)) { + n_exact_cats = ncol(exact_covs) + for (i in 1:n_exact_cats) { + rows_exact = c(rows_exact, rep(row_ind_cur+i, n_t*n_c)) + cols_exact = c(cols_exact, 1:(n_t*n_c)) + dist_exact_cov = abs(outer(exact_covs[t_ind==1, i], exact_covs[t_ind==0, i], "-")) + dist_exact_cov = t(dist_exact_cov) + vals_exact = c(vals_exact, as.vector(dist_exact_cov)) + } + row_ind_5 = rows_exact + col_ind_5 = cols_exact + exact_vals_5 = vals_exact + row_ind_cur = max(row_ind_5) + } + + #! Part 6: near-exact + rows_near_exact = NULL + cols_near_exact = NULL + vals_near_exact = NULL + if (!is.null(near_exact_covs)) { + n_near_exact_cats = ncol(near_exact_covs) + for (i in 1:n_near_exact_cats) { + rows_near_exact = c(rows_near_exact, rep(row_ind_cur+i, n_t*n_c)) + cols_near_exact = c(cols_near_exact, 1:(n_t*n_c)) + dist_near_exact_cov = abs(outer(near_exact_covs[t_ind==1, i], near_exact_covs[t_ind==0, i], "-")) + dist_near_exact_cov = t(dist_near_exact_cov) + vals_near_exact = c(vals_near_exact, as.vector(dist_near_exact_cov)) + } + row_ind_6 = rows_near_exact + col_ind_6 = cols_near_exact + near_exact_vals_6 = vals_near_exact + row_ind_cur = max(row_ind_6) + } + + #! Part 7: fine + bvec_7 = NULL + rows_fine = NULL + cols_fine = NULL + vals_fine = NULL + if (!is.null(fine_covs)) { + #! Transform fine_covs to a matrix of binary inds. for each cat. of each fine balancing covariate + fine_covs_2 = rep(NA, nrow(fine_covs)) + n_fine_covs = ncol(fine_covs) + j = 1 + for (i in 1:n_fine_covs) { + aux = factor(fine_covs[, i]) + fine_covs_2 = cbind(fine_covs_2, diag(nlevels(aux))[aux,]) + if (j == 1) { + fine_covs_2 = fine_covs_2[, -1] + } + j = j+1 + } + n_fine_cats = ncol(fine_covs_2) + for (i in 1:n_fine_cats) { + rows_fine = c(rows_fine, rep(row_ind_cur+i, n_t*n_c)) + cols_fine = c(cols_fine, 1:(n_t*n_c)) + dist_fine_cov = outer(fine_covs_2[t_ind==1, i], fine_covs_2[t_ind==0, i], "-") + dist_fine_cov = t(dist_fine_cov) + vals_fine = c(vals_fine, as.vector(dist_fine_cov)) + } + row_ind_7 = rows_fine + col_ind_7 = cols_fine + fine_vals_7 = vals_fine + bvec_7 = rep(0, n_fine_cats) + row_ind_cur = max(row_ind_7) + } + + #! Part 8: near-fine + bvec_8 = NULL + rows_near_fine = NULL + cols_near_fine = NULL + vals_near_fine = NULL + if (!is.null(near_fine_covs)) { + #! Transform fine_covs to a matrix of binary inds. for each cat. of each fine balancing covariate + near_fine_covs_2 = rep(NA, nrow(near_fine_covs)) + n_near_fine_covs = ncol(near_fine_covs) + j = 1 + for (i in 1:n_near_fine_covs) { + near_aux = factor(near_fine_covs[, i]) + near_fine_covs_2 = cbind(near_fine_covs_2, diag(nlevels(near_aux))[near_aux,]) + if (j == 1) { + near_fine_covs_2 = near_fine_covs_2[, -1] + } + j = j+1 + } + n_near_fine_cats = ncol(near_fine_covs_2) + for (i in 1:n_near_fine_cats) { + rows_near_fine = c(rows_near_fine, rep(row_ind_cur+i, n_t*n_c)) + cols_near_fine = c(cols_near_fine, 1:(n_t*n_c)) + dist_near_fine_cov = outer(near_fine_covs_2[t_ind==1, i], near_fine_covs_2[t_ind==0, i], "-") + dist_near_fine_cov = t(dist_near_fine_cov) + vals_near_fine = c(vals_near_fine, as.vector(dist_near_fine_cov)) + } + row_ind_cur = max(rows_near_fine) + for (i in 1:n_near_fine_cats) { + rows_near_fine = c(rows_near_fine, rep(row_ind_cur+i, n_t*n_c)) + } + row_ind_8 = rows_near_fine + col_ind_8 = c(cols_near_fine, cols_near_fine) + near_fine_vals_8 = c(vals_near_fine, vals_near_fine) + bvec_8 = rep(0, n_near_fine_cats) + row_ind_cur = max(row_ind_8) + } + + #! Part 9: Far + rows_ind_far_pairs = list() + if (!is.null(far_covs)) { + row_ind_9 = NULL + col_ind_9 = NULL + far_cov_vals_9 = NULL + n_far_covs = ncol(far_covs) + for (j in 1:n_far_covs) { + far_cov = far_covs[,j] + #! Far on average constraints + if (!is.null(far_groups)) { + far_group = far_groups[j] + row_ind_far_all = sort(c(rep(row_ind_cur+1, n_tot))) + col_ind_far_all = rep(1:n_tot, 1) + temp_mean_3 = (-rep(far_cov[t_ind==0], n_t)+((far_cov[t_ind==1])[sort(rep(1:n_t, n_c))]))-(far_group*rep(1, n_t*n_c)) + vals_far_all = c(temp_mean_3) + row_ind_cur = max(row_ind_far_all) + } + #! Far on all pairs constraints + if (!is.null(far_pairs)) { + far_pair = far_pairs[j] + aux = abs(outer(far_cov[t_ind==1], far_cov[t_ind==0], FUN = "-")) + temp = as.vector(matrix(t(aux), nrow = 1, byrow = TRUE)) + cols_ind_far_pairs = which(temp0) { + rows_ind_far_pairs[[j]] = row_ind_cur+(1:length(cols_ind_far_pairs)) + vals_far_pairs = rep(1, length(cols_ind_far_pairs)) + row_ind_cur = max(rows_ind_far_pairs[[j]]) + } + if (length(cols_ind_far_pairs)==0) { + cols_ind_far_pairs = NULL + rows_ind_far_pairs[[j]] = -1 + vals_far_pairs = NULL + } + } + #! Put together + if (!is.null(far_groups) && is.null(far_pairs)) { + row_ind_9 = c(row_ind_9, row_ind_far_all) + col_ind_9 = c(col_ind_9, col_ind_far_all) + far_cov_vals_9 = c(far_cov_vals_9, vals_far_all) + } + if (is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] != -1) { + row_ind_9 = c(row_ind_9, rows_ind_far_pairs[[j]]) + col_ind_9 = c(col_ind_9, cols_ind_far_pairs) + far_cov_vals_9 = c(far_cov_vals_9, vals_far_pairs) + } + if (!is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] != -1) { + row_ind_9 = c(row_ind_9, row_ind_far_all, rows_ind_far_pairs[[j]]) + col_ind_9 = c(col_ind_9, col_ind_far_all, cols_ind_far_pairs) + far_cov_vals_9 = c(far_cov_vals_9, vals_far_all, vals_far_pairs) + } + if (!is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] == -1) { + row_ind_9 = c(row_ind_9, row_ind_far_all) + col_ind_9 = c(col_ind_9, col_ind_far_all) + far_cov_vals_9 = c(far_cov_vals_9, vals_far_all) + } + } + } + + #! Part 10: Near + rows_ind_near_pairs = list() + if (!is.null(near_covs)) { + row_ind_10 = NULL + col_ind_10 = NULL + near_cov_vals_10 = NULL + n_near_covs = ncol(near_covs) + for (j in 1:n_near_covs) { + near_cov = near_covs[,j] + #! Near on average constraints + if (!is.null(near_groups)) { + near_group = near_groups[j] + row_ind_near_all = sort(c(rep(row_ind_cur+1, n_tot))) + col_ind_near_all = rep(1:n_tot, 1) + temp_mean_4 = (-rep(near_cov[t_ind==0], n_t)+((near_cov[t_ind==1])[sort(rep(1:n_t, n_c))]))-(near_group*rep(1, n_t*n_c)) + vals_near_all = c(temp_mean_4) + row_ind_cur = max(row_ind_near_all) + } + #! Near on all pairs constraints + if (!is.null(near_pairs)) { + near_pair = near_pairs[j] + aux = abs(outer(near_cov[t_ind==1], near_cov[t_ind==0], FUN = "-")) + temp = as.vector(matrix(t(aux), nrow = 1, byrow = TRUE)) + cols_ind_near_pairs = which(temp>near_pair) + if (length(cols_ind_near_pairs)>0) { + rows_ind_near_pairs[[j]] = row_ind_cur+(1:length(cols_ind_near_pairs)) + vals_near_pairs = rep(1, length(cols_ind_near_pairs)) + row_ind_cur = max(rows_ind_near_pairs[[j]]) + } + if (length(cols_ind_near_pairs)==0) { + cols_ind_near_pairs = NULL + rows_ind_near_pairs[[j]] = -1 + vals_near_pairs = NULL + } + } + #! Put together + if (!is.null(near_groups) && is.null(near_pairs)) { + row_ind_10 = c(row_ind_10, row_ind_near_all) + col_ind_10 = c(col_ind_10, col_ind_near_all) + near_cov_vals_10 = c(near_cov_vals_10, vals_near_all) + } + if (is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] != -1) { + row_ind_10 = c(row_ind_10, rows_ind_near_pairs[[j]]) + col_ind_10 = c(col_ind_10, cols_ind_near_pairs) + near_cov_vals_10 = c(near_cov_vals_10, vals_near_pairs) + } + if (!is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] != -1) { + row_ind_10 = c(row_ind_10, row_ind_near_all, rows_ind_near_pairs[[j]]) + col_ind_10 = c(col_ind_10, col_ind_near_all, cols_ind_near_pairs) + near_cov_vals_10 = c(near_cov_vals_10, vals_near_all, vals_near_pairs) + } + if (!is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] == -1) { + row_ind_10 = c(row_ind_10, row_ind_near_all) + col_ind_10 = c(col_ind_10, col_ind_near_all) + near_cov_vals_10 = c(near_cov_vals_10, vals_near_all) + } + } + } + + # Part 11: use controls + if (!is.null(use_controls)) { + use_controls = use_controls[(n_t+1):(n_t+n_c)] + use_controls_aux = rep(use_controls, n_t) + col_ind_11 = (1:n_tot)[use_controls_aux==1] + row_ind_11 = rep(row_ind_cur+1, length(col_ind_11)) + use_controls_vals_11 = rep(1, length(col_ind_11)) + + row_ind_cur = max(row_ind_11) + } + + # Part 12: total_groups + if (!is.null(total_groups)) { + row_ind_12 = rep(row_ind_cur+1, n_t*n_c) + col_ind_12 = 1:(n_t*n_c) + ones_12 = rep(1, n_t*n_c) + + row_ind_cur = max(row_ind_12) + } + + #! Put all the parts of the constraint matrix together + #! Parts 1 and 2 + row_ind = c(row_ind_1, row_ind_2) + col_ind = c(col_ind_1, col_ind_2) + vals = c(ones_1, ones_2) + #! Parts 3 and 4 + if (!is.null(mom_ks_covs)) { + row_ind = c(row_ind, row_ind_3.4) + col_ind = c(col_ind, col_ind_3.4) + vals = c(vals, mom_ks_vals_3.4) + } + #! Part 3b + if (!is.null(mom_covs) & !is.null(mom_targets)) { + row_ind = c(row_ind, rows_target) + col_ind = c(col_ind, cols_target) + vals = c(vals, vals_target) + } + #! Part 5 + if (!is.null(exact_covs)) { + row_ind = c(row_ind, row_ind_5) + col_ind = c(col_ind, col_ind_5) + vals = c(vals, exact_vals_5) + } + #! Part 6 + if (!is.null(near_exact_covs)) { + row_ind = c(row_ind, row_ind_6) + col_ind = c(col_ind, col_ind_6) + vals = c(vals, near_exact_vals_6) + } + #! Part 7 + if (!is.null(fine_covs)) { + row_ind = c(row_ind, row_ind_7) + col_ind = c(col_ind, col_ind_7) + vals = c(vals, fine_vals_7) + } + #! Part 8 + if (!is.null(near_fine_covs)) { + row_ind = c(row_ind, row_ind_8) + col_ind = c(col_ind, col_ind_8) + vals = c(vals, near_fine_vals_8) + } + #! Part 9 + if (!is.null(far_covs)) { + row_ind = c(row_ind, row_ind_9) + col_ind = c(col_ind, col_ind_9) + vals = c(vals, far_cov_vals_9) + } + #! Part 10 + if (!is.null(near_covs)) { + row_ind = c(row_ind, row_ind_10) + col_ind = c(col_ind, col_ind_10) + vals = c(vals, near_cov_vals_10) + } + #! Part 11 + if (!is.null(use_controls)) { + row_ind = c(row_ind, row_ind_11) + col_ind = c(col_ind, col_ind_11) + vals = c(vals, use_controls_vals_11) + } + #! Part 12 + if (!is.null(total_groups)) { + row_ind = c(row_ind, row_ind_12) + col_ind = c(col_ind, col_ind_12) + vals = c(vals, ones_12) + } + + aux = cbind(row_ind, col_ind, vals)[order(col_ind), ] + + aux = aux[(aux[, 3] != 0),] + cnstrn_mat = simple_triplet_matrix(i = aux[, 1], j = aux[, 2], v = aux[, 3]) + + #! Output + return(list(cnstrn_mat = cnstrn_mat, bvec_7 = bvec_7, bvec_8 = bvec_8, rows_ind_far_pairs = rows_ind_far_pairs, rows_ind_near_pairs = rows_ind_near_pairs)) + } \ No newline at end of file diff --git a/R/nmatch.r b/R/nmatch.r index cda874d..f2d51ad 100644 --- a/R/nmatch.r +++ b/R/nmatch.r @@ -1,880 +1,956 @@ -nmatch = function(dist_mat, subset_weight = NULL, total_pairs = NULL, - mom = NULL, - exact = NULL, - near_exact = NULL, - fine = NULL, - near_fine = NULL, - near = NULL, - far = NULL, - solver = NULL) { - - if (is.null(mom)) { - mom_covs = NULL - mom_tols = NULL - mom_targets = NULL - } else { - mom_covs = mom$covs - mom_tols = mom$tols - mom_targets = mom$targets - } - - if (is.null(exact)) { - exact_covs = NULL - } else { - exact_covs = exact$covs - } - - if (is.null(near_exact)) { - near_exact_covs = NULL - near_exact_devs = NULL - } else { - near_exact_covs = near_exact$covs - near_exact_devs = near_exact$devs - } - - if (is.null(fine)) { - fine_covs = NULL - } else { - fine_covs = fine$covs - } - - if (is.null(near_fine)) { - near_fine_covs = NULL - near_fine_devs = NULL - } else { - near_fine_covs = near_fine$covs - near_fine_devs = near_fine$devs - } - - if (is.null(near)) { - near_covs = NULL - near_pairs = NULL - near_groups = NULL - } else { - near_covs = near$covs - near_pairs = near$pairs - near_groups = near$groups - } - - if (is.null(far)) { - far_covs = NULL - far_pairs = NULL - far_groups = NULL - } else { - far_covs = far$covs - far_pairs = far$pairs - far_groups = far$groups - } - - if (is.null(solver)) { - solver = 'glpk' - t_max = 60 * 15 - approximate = 1 - } else { - t_max = solver$t_max - approximate = solver$approximate - trace = solver$trace - round_cplex = solver$round_cplex - solver = solver$name - } - - cat(format(" Building the matching problem..."), "\n") - - #! Total number of units - n_tot = nrow(dist_mat) - - #! Total number of decision variables - n_dec = (n_tot*(n_tot-1))-sum(1:(n_tot-1)) - - if (is.null(subset_weight)) { - subset_weight = 0 - } - - #! cvec - cvec = t(dist_mat)[lower.tri(dist_mat)]-(subset_weight*rep(1, n_dec)) - - #! Amat - #rows_far_all = NULL - #cols_far_all = NULL - #vals_far_all = NULL - #rows_far_pairs = NULL - #cols_far_pairs = NULL - #vals_far_pairs = NULL - #rows_near_all = NULL - #cols_near_all = NULL - #vals_near_all = NULL - #rows_near_pairs = NULL - #cols_near_pairs = NULL - #vals_near_pairs = NULL - rows_far= NULL - cols_far = NULL - vals_far = NULL - rows_near = NULL - cols_near = NULL - vals_near = NULL - rows_mom = NULL - cols_mom = NULL - vals_mom = NULL - rows_exact = NULL - cols_exact = NULL - vals_exact = NULL - rows_near_exact = NULL - cols_near_exact = NULL - vals_near_exact = NULL - rows_fine = NULL - cols_fine = NULL - vals_fine = NULL - rows_near_fine = NULL - cols_near_fine = NULL - vals_near_fine = NULL - rows_n = NULL - cols_n = NULL - vals_n = NULL - rows_target = NULL - cols_target = NULL - vals_target = NULL - - #! Nonbipartite matching constraints - rows_nbm = sort(rep(1:n_tot, n_tot-1)) - temp = matrix(0, nrow = n_tot, ncol = n_tot) - temp[lower.tri(temp)] = 1:n_dec - temp = temp+t(temp) - diag(temp) = NA - cols_nbm = as.vector(t(temp)) - cols_nbm = cols_nbm[!is.na(cols_nbm)] - vals_nbm = rep(1, (n_tot-1)*n_tot) - row_count = max(rows_nbm) - - #! Far constraints - rows_ind_far_pairs = list() - if (!is.null(far_covs)) { - rows_far = NULL - cols_far = NULL - vals_far = NULL - n_far_covs = ncol(far_covs) - for (j in 1:n_far_covs) { - far_cov = far_covs[,j] - #! Far on average constraints - if (!is.null(far_groups)) { - far_group = far_groups[j] - row_ind_far_all = rep(row_count+1, n_dec) - col_ind_far_all = rep(1:n_dec, 1) - i_ind = rep(1:(n_tot-1), (n_tot-1):1) - aux = matrix(rep(1:n_tot, n_tot), nrow = n_tot, byrow = F) - j_ind = aux[lower.tri(aux)] - vals_far_all = far_cov[i_ind]-far_cov[j_ind]-(far_group*rep(1, n_dec)) - row_count = max(row_ind_far_all) - } - #! Far on all pairs constraints - if (!is.null(far_pairs)) { - far_pair = far_pairs[j] - aux = abs(outer(far_cov, far_cov, FUN = "-")) - temp = as.vector(matrix(t(aux)[lower.tri(aux)], nrow = 1, byrow = TRUE)) - cols_ind_far_pairs = which(temp0) { - rows_ind_far_pairs[[j]] = row_count+(1:length(cols_ind_far_pairs)) - vals_far_pairs = rep(1, length(cols_ind_far_pairs)) - row_count = max(rows_ind_far_pairs[[j]]) - } - if (length(cols_ind_far_pairs)==0) { - cols_ind_far_pairs = NULL - rows_ind_far_pairs[[j]] = -1 - vals_far_pairs = NULL - } - } - #! Put together - if (!is.null(far_groups) && is.null(far_pairs)) { - rows_far = c(rows_far, row_ind_far_all) - cols_far = c(cols_far, col_ind_far_all) - vals_far = c(vals_far, vals_far_all) - } - if (is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] != -1) { - rows_far = c(rows_far, rows_ind_far_pairs[[j]]) - cols_far = c(cols_far, cols_ind_far_pairs) - vals_far = c(vals_far, vals_far_pairs) - } - if (!is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] != -1) { - rows_far = c(rows_far, row_ind_far_all, rows_ind_far_pairs[[j]]) - cols_far = c(cols_far, col_ind_far_all, cols_ind_far_pairs) - vals_far = c(vals_far, vals_far_all, vals_far_pairs) - } - if (!is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] == -1) { - rows_far = c(rows_far, row_ind_far_all) - cols_far = c(cols_far, col_ind_far_all) - vals_far = c(vals_far, vals_far_all) - } - } - } - - #! Far on average constraints - #if (!is.null(far_covs)) { - # rows_far_all = rep(row_count+1, n_dec) - # cols_far_all = 1:n_dec - # i_ind = rep(1:(n_tot-1), (n_tot-1):1) - # aux = matrix(rep(1:n_tot, n_tot), nrow = n_tot, byrow = F) - # j_ind = aux[lower.tri(aux)] - # vals_far_all = far_covs[i_ind]-far_covs[j_ind]-(far_groups*rep(1, n_dec)) - # row_count = max(rows_far_all) - #} - - #! Far on all pairs constraints - #if (!is.null(far_covs)) { - # aux = abs(outer(far_covs, far_covs, FUN = "-")) - # temp = aux[lower.tri(aux)] - # cols_far_pairs = which(tempnear_pair) - if (length(cols_ind_near_pairs)>0) { - rows_ind_near_pairs[[j]] = row_count+(1:length(cols_ind_near_pairs)) - vals_near_pairs = rep(1, length(cols_ind_near_pairs)) - row_count = max(rows_ind_near_pairs[[j]]) - } - if (length(cols_ind_near_pairs)==0) { - cols_ind_near_pairs = NULL - rows_ind_near_pairs[[j]] = -1 - vals_near_pairs = NULL - } - } - #! Put together - if (!is.null(near_groups) && is.null(near_pairs)) { - rows_near = c(rows_near, row_ind_near_all) - cols_near = c(cols_near, col_ind_near_all) - vals_near = c(vals_near, vals_near_all) - } - if (is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] != -1) { - rows_near = c(rows_near, rows_ind_near_pairs[[j]]) - cols_near = c(cols_near, cols_ind_near_pairs) - vals_near = c(vals_near, vals_near_pairs) - } - if (!is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] != -1) { - rows_near = c(rows_near, row_ind_near_all, rows_ind_near_pairs[[j]]) - cols_near = c(cols_near, col_ind_near_all, cols_ind_near_pairs) - vals_near = c(vals_near, vals_near_all, vals_near_pairs) - } - if (!is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] == -1) { - rows_near = c(rows_near, row_ind_near_all) - cols_near = c(cols_near, col_ind_near_all) - vals_near = c(vals_near, vals_near_all) - } - } - } - - #! Near on average constraints - #if (!is.null(near_covs)) { - # rows_near_all = rep(row_count+1, n_dec) - # cols_near_all = 1:n_dec - # i_ind = rep(1:(n_tot-1), (n_tot-1):1) - # aux = matrix(rep(1:n_tot, n_tot), nrow = n_tot, byrow = F) - # j_ind = aux[lower.tri(aux)] - # vals_near_all = near_covs[i_ind]-near_covs[j_ind]-(near_groups*rep(1, n_dec)) - # row_count = max(rows_near_all) - #} - - #! Near on all pairs constraints - #if (!is.null(near_covs)) { - # aux = abs(outer(near_covs, near_covs, FUN = "-")) - # temp = aux[lower.tri(aux)] - # cols_near_pairs = which(temp>near_pairs) - # rows_near_pairs = row_count+(1:length(cols_near_pairs)) - # vals_near_pairs = rep(1, length(cols_near_pairs)) - # row_count = max(rows_near_pairs) - #} - #! Near on all pairs constraints - #if (!is.null(near_covs)) { - # for (i in ncol(near_covs)) { - # aux = abs(outer(near_covs[i, ], near_covs[i, ], FUN = "-")) - # temp = aux[lower.tri(aux)] - # cols_near_pairs = c(cols_near_pairs, which(temp>near_pairs[i])) - # rows_near_pairs = c(rows_near_pairs, row_count+(1:length(which(temp=' - model$rhs = bvec - model$vtypes = var_type - model$ub = ub - - t_lim = list(TimeLimit = t_max, OutputFlag = trace) - - ptm = proc.time() - out = gurobi::gurobi(model, t_lim) - time = (proc.time()-ptm)[3] - - # Output - if (out$status == "INFEASIBLE") { - cat(format(" Error: problem infeasible!"), "\n") - obj_val = NA - obj_dist_mat = NA - id_1 = NA - id_2 = NA - group_id = NA - time = NA - } - - if (out$status == "OPTIMAL" || out$status == "TIME_LIMIT") { - if (out$status == "OPTIMAL") { - cat(format(" Optimal matches found"), "\n") - } - else { - cat(format(" Time limit reached, best suboptimal solution given"), "\n") - } - - if (approximate == 1) { - rel = .relaxation_n(n_tot, out$x, dist_mat, subset_weight, "gurobi", round_cplex, trace) - out$x = rel$sol - out$objval = rel$obj - time = time + rel$time - } - - i_ind = rep(1:(n_tot-1), (n_tot-1):1) - aux = matrix(1:n_tot, nrow = n_tot, ncol = n_tot) - j_ind = aux[lower.tri(aux)] - - group_1 = i_ind[out$x==1] - group_2 = j_ind[out$x==1] - max_groups = apply(cbind(group_1, group_2), 1, max) - - id_1 = group_1[max_groups<=n_tot] - id_2 = group_2[max_groups<=n_tot] - - #! Group identifier - group_id_1 = 1:(length(id_1)) - group_id_2 = 1:(length(id_2)) - group_id = c(group_id_1, group_id_2) - - obj_val = out$objval - obj_dist_mat = sum(t(dist_mat)[lower.tri(dist_mat)] * out$x) - } - } else { - stop('suggested package not installed') - } - } - - if (solver=="glpk") { - #library("Rglpk") - cat(format(" GLPK optimizer is open..."), "\n") - dir = rep(NA, length(sense)) - dir[sense=="E"] = '==' - dir[sense=="L"] = '<=' - dir[sense=="G"] = '>=' - bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), - upper = list(ind=c(1:length(ub)), val=ub)) - ptm = proc.time() - out = Rglpk_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = var_type, max = FALSE) - time = (proc.time()-ptm)[3] - - # Output - if (out$status!=0) { - cat(format(" Error: problem infeasible!"), "\n") - obj_val = NA - obj_dist_mat = NA - id_1 = NA - id_2 = NA - group_id = NA - time = NA - } - - if (out$status==0) { - cat(format(" Optimal matches found"), "\n") - - if (approximate == 1) { - rel = .relaxation_n(n_tot, out$solution, dist_mat, subset_weight, "glpk", round_cplex, trace) - out$solution = rel$sol - out$optimum = rel$obj - time = time + rel$time - } - - i_ind = rep(1:(n_tot-1), (n_tot-1):1) - aux = matrix(1:n_tot, nrow = n_tot, ncol = n_tot) - j_ind = aux[lower.tri(aux)] - - group_1 = i_ind[out$solution==1] - group_2 = j_ind[out$solution==1] - max_groups = apply(cbind(group_1, group_2), 1, max) - - id_1 = group_1[max_groups<=n_tot] - id_2 = group_2[max_groups<=n_tot] - - #! Group identifier - group_id_1 = 1:(length(id_1)) - group_id_2 = 1:(length(id_2)) - group_id = c(group_id_1, group_id_2) - - obj_val = out$optimum - obj_dist_mat = sum(t(dist_mat)[lower.tri(dist_mat)] * out$solution) - } - } - - if (solver=="symphony") { - #library("Rsymphony") - if (requireNamespace('Rsymphony', quietly = TRUE)) { - cat(format(" Symphony optimizer is open..."), "\n") - dir = rep(NA, length(sense)) - dir[sense=="E"] = '==' - dir[sense=="L"] = '<=' - dir[sense=="G"] = '>=' - bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), - upper = list(ind=c(1:length(ub)), val=ub)) - ptm = proc.time() - out = Rsymphony::Rsymphony_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = var_type, max = FALSE, time_limit = t_max) - time = (proc.time()-ptm)[3] - - # Output - if (out$status==228) { - cat(format(" Error: problem exceeded the time limit and no feasible solution is found!"), "\n") - obj_val = NA - obj_dist_mat = NA - id_1 = NA - id_2 = NA - group_id = NA - time = NA - } - else if (out$status!=0) { - cat(format(" Error: problem infeasible!"), "\n") - obj_val = NA - obj_dist_mat = NA - id_1 = NA - id_2 = NA - group_id = NA - time = NA - } - - if (out$status==0) { - cat(format(" Optimal matches found"), "\n") - - if (approximate == 1) { - rel = .relaxation_n(n_tot, out$solution, dist_mat, subset_weight, "symphony") - out$solution = rel$sol - out$objval = rel$obj - time = time + rel$time - } - - i_ind = rep(1:(n_tot-1), (n_tot-1):1) - aux = matrix(1:n_tot, nrow = n_tot, ncol = n_tot) - j_ind = aux[lower.tri(aux)] - - group_1 = i_ind[out$solution==1] - group_2 = j_ind[out$solution==1] - max_groups = apply(cbind(group_1, group_2), 1, max) - - id_1 = group_1[max_groups<=n_tot] - id_2 = group_2[max_groups<=n_tot] - - #! Group identifier - group_id_1 = 1:(length(id_1)) - group_id_2 = 1:(length(id_2)) - group_id = c(group_id_1, group_id_2) - - obj_val = out$objval - obj_dist_mat = sum(t(dist_mat)[lower.tri(dist_mat)] * out$solution) - } - } else { - stop('suggested package not installed') - } - } - - return = list(obj_total = obj_val, obj_dist_mat = obj_dist_mat, id_1 = id_1, id_2 = id_2, - group_id = group_id,time = time) -} +nmatch = function(dist_mat, subset_weight = NULL, total_pairs = NULL, + mom = NULL, + exact = NULL, + near_exact = NULL, + fine = NULL, + near_fine = NULL, + near = NULL, + far = NULL, + solver = NULL) { + + if (is.null(mom)) { + mom_covs = NULL + mom_tols = NULL + mom_targets = NULL + } else { + mom_covs = mom$covs + mom_tols = mom$tols + mom_targets = mom$targets + } + + if (is.null(exact)) { + exact_covs = NULL + } else { + exact_covs = exact$covs + } + + if (is.null(near_exact)) { + near_exact_covs = NULL + near_exact_devs = NULL + } else { + near_exact_covs = near_exact$covs + near_exact_devs = near_exact$devs + } + + if (is.null(fine)) { + fine_covs = NULL + } else { + fine_covs = fine$covs + } + + if (is.null(near_fine)) { + near_fine_covs = NULL + near_fine_devs = NULL + } else { + near_fine_covs = near_fine$covs + near_fine_devs = near_fine$devs + } + + if (is.null(near)) { + near_covs = NULL + near_pairs = NULL + near_groups = NULL + } else { + near_covs = near$covs + near_pairs = near$pairs + near_groups = near$groups + } + + if (is.null(far)) { + far_covs = NULL + far_pairs = NULL + far_groups = NULL + } else { + far_covs = far$covs + far_pairs = far$pairs + far_groups = far$groups + } + + if (is.null(solver)) { + t_max = 60 * 15 + approximate = 1 + solver = "highs" + } else { + t_max = solver$t_max + approximate = solver$approximate + trace = solver$trace + round_cplex = solver$round_cplex + solver = solver$name + } + + cat(format(" Building the matching problem..."), "\n") + + #! Total number of units + n_tot = nrow(dist_mat) + + #! Total number of decision variables + n_dec = (n_tot*(n_tot-1))-sum(1:(n_tot-1)) + + if (is.null(subset_weight)) { + subset_weight = 0 + } + + #! cvec + cvec = t(dist_mat)[lower.tri(dist_mat)]-(subset_weight*rep(1, n_dec)) + + #! Amat + #rows_far_all = NULL + #cols_far_all = NULL + #vals_far_all = NULL + #rows_far_pairs = NULL + #cols_far_pairs = NULL + #vals_far_pairs = NULL + #rows_near_all = NULL + #cols_near_all = NULL + #vals_near_all = NULL + #rows_near_pairs = NULL + #cols_near_pairs = NULL + #vals_near_pairs = NULL + rows_far= NULL + cols_far = NULL + vals_far = NULL + rows_near = NULL + cols_near = NULL + vals_near = NULL + rows_mom = NULL + cols_mom = NULL + vals_mom = NULL + rows_exact = NULL + cols_exact = NULL + vals_exact = NULL + rows_near_exact = NULL + cols_near_exact = NULL + vals_near_exact = NULL + rows_fine = NULL + cols_fine = NULL + vals_fine = NULL + rows_near_fine = NULL + cols_near_fine = NULL + vals_near_fine = NULL + rows_n = NULL + cols_n = NULL + vals_n = NULL + rows_target = NULL + cols_target = NULL + vals_target = NULL + + #! Nonbipartite matching constraints + rows_nbm = sort(rep(1:n_tot, n_tot-1)) + temp = matrix(0, nrow = n_tot, ncol = n_tot) + temp[lower.tri(temp)] = 1:n_dec + temp = temp+t(temp) + diag(temp) = NA + cols_nbm = as.vector(t(temp)) + cols_nbm = cols_nbm[!is.na(cols_nbm)] + vals_nbm = rep(1, (n_tot-1)*n_tot) + row_count = max(rows_nbm) + + #! Far constraints + rows_ind_far_pairs = list() + if (!is.null(far_covs)) { + rows_far = NULL + cols_far = NULL + vals_far = NULL + n_far_covs = ncol(far_covs) + for (j in 1:n_far_covs) { + far_cov = far_covs[,j] + #! Far on average constraints + if (!is.null(far_groups)) { + far_group = far_groups[j] + row_ind_far_all = rep(row_count+1, n_dec) + col_ind_far_all = rep(1:n_dec, 1) + i_ind = rep(1:(n_tot-1), (n_tot-1):1) + aux = matrix(rep(1:n_tot, n_tot), nrow = n_tot, byrow = F) + j_ind = aux[lower.tri(aux)] + vals_far_all = far_cov[i_ind]-far_cov[j_ind]-(far_group*rep(1, n_dec)) + row_count = max(row_ind_far_all) + } + #! Far on all pairs constraints + if (!is.null(far_pairs)) { + far_pair = far_pairs[j] + aux = abs(outer(far_cov, far_cov, FUN = "-")) + temp = as.vector(matrix(t(aux)[lower.tri(aux)], nrow = 1, byrow = TRUE)) + cols_ind_far_pairs = which(temp0) { + rows_ind_far_pairs[[j]] = row_count+(1:length(cols_ind_far_pairs)) + vals_far_pairs = rep(1, length(cols_ind_far_pairs)) + row_count = max(rows_ind_far_pairs[[j]]) + } + if (length(cols_ind_far_pairs)==0) { + cols_ind_far_pairs = NULL + rows_ind_far_pairs[[j]] = -1 + vals_far_pairs = NULL + } + } + #! Put together + if (!is.null(far_groups) && is.null(far_pairs)) { + rows_far = c(rows_far, row_ind_far_all) + cols_far = c(cols_far, col_ind_far_all) + vals_far = c(vals_far, vals_far_all) + } + if (is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] != -1) { + rows_far = c(rows_far, rows_ind_far_pairs[[j]]) + cols_far = c(cols_far, cols_ind_far_pairs) + vals_far = c(vals_far, vals_far_pairs) + } + if (!is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] != -1) { + rows_far = c(rows_far, row_ind_far_all, rows_ind_far_pairs[[j]]) + cols_far = c(cols_far, col_ind_far_all, cols_ind_far_pairs) + vals_far = c(vals_far, vals_far_all, vals_far_pairs) + } + if (!is.null(far_groups) && !is.null(far_pairs) && rows_ind_far_pairs[[j]] == -1) { + rows_far = c(rows_far, row_ind_far_all) + cols_far = c(cols_far, col_ind_far_all) + vals_far = c(vals_far, vals_far_all) + } + } + } + + #! Far on average constraints + #if (!is.null(far_covs)) { + # rows_far_all = rep(row_count+1, n_dec) + # cols_far_all = 1:n_dec + # i_ind = rep(1:(n_tot-1), (n_tot-1):1) + # aux = matrix(rep(1:n_tot, n_tot), nrow = n_tot, byrow = F) + # j_ind = aux[lower.tri(aux)] + # vals_far_all = far_covs[i_ind]-far_covs[j_ind]-(far_groups*rep(1, n_dec)) + # row_count = max(rows_far_all) + #} + + #! Far on all pairs constraints + #if (!is.null(far_covs)) { + # aux = abs(outer(far_covs, far_covs, FUN = "-")) + # temp = aux[lower.tri(aux)] + # cols_far_pairs = which(tempnear_pair) + if (length(cols_ind_near_pairs)>0) { + rows_ind_near_pairs[[j]] = row_count+(1:length(cols_ind_near_pairs)) + vals_near_pairs = rep(1, length(cols_ind_near_pairs)) + row_count = max(rows_ind_near_pairs[[j]]) + } + if (length(cols_ind_near_pairs)==0) { + cols_ind_near_pairs = NULL + rows_ind_near_pairs[[j]] = -1 + vals_near_pairs = NULL + } + } + #! Put together + if (!is.null(near_groups) && is.null(near_pairs)) { + rows_near = c(rows_near, row_ind_near_all) + cols_near = c(cols_near, col_ind_near_all) + vals_near = c(vals_near, vals_near_all) + } + if (is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] != -1) { + rows_near = c(rows_near, rows_ind_near_pairs[[j]]) + cols_near = c(cols_near, cols_ind_near_pairs) + vals_near = c(vals_near, vals_near_pairs) + } + if (!is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] != -1) { + rows_near = c(rows_near, row_ind_near_all, rows_ind_near_pairs[[j]]) + cols_near = c(cols_near, col_ind_near_all, cols_ind_near_pairs) + vals_near = c(vals_near, vals_near_all, vals_near_pairs) + } + if (!is.null(near_groups) && !is.null(near_pairs) && rows_ind_near_pairs[[j]] == -1) { + rows_near = c(rows_near, row_ind_near_all) + cols_near = c(cols_near, col_ind_near_all) + vals_near = c(vals_near, vals_near_all) + } + } + } + + #! Near on average constraints + #if (!is.null(near_covs)) { + # rows_near_all = rep(row_count+1, n_dec) + # cols_near_all = 1:n_dec + # i_ind = rep(1:(n_tot-1), (n_tot-1):1) + # aux = matrix(rep(1:n_tot, n_tot), nrow = n_tot, byrow = F) + # j_ind = aux[lower.tri(aux)] + # vals_near_all = near_covs[i_ind]-near_covs[j_ind]-(near_groups*rep(1, n_dec)) + # row_count = max(rows_near_all) + #} + + #! Near on all pairs constraints + #if (!is.null(near_covs)) { + # aux = abs(outer(near_covs, near_covs, FUN = "-")) + # temp = aux[lower.tri(aux)] + # cols_near_pairs = which(temp>near_pairs) + # rows_near_pairs = row_count+(1:length(cols_near_pairs)) + # vals_near_pairs = rep(1, length(cols_near_pairs)) + # row_count = max(rows_near_pairs) + #} + #! Near on all pairs constraints + #if (!is.null(near_covs)) { + # for (i in ncol(near_covs)) { + # aux = abs(outer(near_covs[i, ], near_covs[i, ], FUN = "-")) + # temp = aux[lower.tri(aux)] + # cols_near_pairs = c(cols_near_pairs, which(temp>near_pairs[i])) + # rows_near_pairs = c(rows_near_pairs, row_count+(1:length(which(temp=' + model$rhs = bvec + model$vtypes = var_type + model$ub = ub + + t_lim = list(TimeLimit = t_max, OutputFlag = trace) + + ptm = proc.time() + out = gurobi::gurobi(model, t_lim) + time = (proc.time()-ptm)[3] + + # Output + if (out$status == "INFEASIBLE") { + cat(format(" Error: problem infeasible!"), "\n") + obj_val = NA + obj_dist_mat = NA + id_1 = NA + id_2 = NA + group_id = NA + time = NA + } + + if (out$status == "OPTIMAL" || out$status == "TIME_LIMIT") { + if (out$status == "OPTIMAL") { + cat(format(" Optimal matches found"), "\n") + } + else { + cat(format(" Time limit reached, best suboptimal solution given"), "\n") + } + + if (approximate == 1) { + rel = .relaxation_n(n_tot, out$x, dist_mat, subset_weight, "gurobi", round_cplex, trace) + out$x = rel$sol + out$objval = rel$obj + time = time + rel$time + } + + i_ind = rep(1:(n_tot-1), (n_tot-1):1) + aux = matrix(1:n_tot, nrow = n_tot, ncol = n_tot) + j_ind = aux[lower.tri(aux)] + + group_1 = i_ind[out$x==1] + group_2 = j_ind[out$x==1] + max_groups = apply(cbind(group_1, group_2), 1, max) + + id_1 = group_1[max_groups<=n_tot] + id_2 = group_2[max_groups<=n_tot] + + #! Group identifier + group_id_1 = 1:(length(id_1)) + group_id_2 = 1:(length(id_2)) + group_id = c(group_id_1, group_id_2) + + obj_val = out$objval + obj_dist_mat = sum(t(dist_mat)[lower.tri(dist_mat)] * out$x) + } + } else { + stop('suggested package not installed') + } + } + + if (solver=="glpk") { + #library("Rglpk") + if (requireNamespace('Rglpk', quietly = TRUE)) { + cat(format(" GLPK optimizer is open..."), "\n") + dir = rep(NA, length(sense)) + dir[sense=="E"] = '==' + dir[sense=="L"] = '<=' + dir[sense=="G"] = '>=' + bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), + upper = list(ind=c(1:length(ub)), val=ub)) + ptm = proc.time() + out = Rglpk::Rglpk_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = var_type, max = FALSE) + time = (proc.time()-ptm)[3] + + # Output + if (out$status!=0) { + cat(format(" Error: problem infeasible!"), "\n") + obj_val = NA + obj_dist_mat = NA + id_1 = NA + id_2 = NA + group_id = NA + time = NA + } + + if (out$status==0) { + cat(format(" Optimal matches found"), "\n") + + if (approximate == 1) { + rel = .relaxation_n(n_tot, out$solution, dist_mat, subset_weight, "glpk", round_cplex, trace) + out$solution = rel$sol + out$optimum = rel$obj + time = time + rel$time + } + + i_ind = rep(1:(n_tot-1), (n_tot-1):1) + aux = matrix(1:n_tot, nrow = n_tot, ncol = n_tot) + j_ind = aux[lower.tri(aux)] + + group_1 = i_ind[out$solution==1] + group_2 = j_ind[out$solution==1] + max_groups = apply(cbind(group_1, group_2), 1, max) + + id_1 = group_1[max_groups<=n_tot] + id_2 = group_2[max_groups<=n_tot] + + #! Group identifier + group_id_1 = 1:(length(id_1)) + group_id_2 = 1:(length(id_2)) + group_id = c(group_id_1, group_id_2) + + obj_val = out$optimum + obj_dist_mat = sum(t(dist_mat)[lower.tri(dist_mat)] * out$solution) + } + } + else { + stop('suggested package not installed') + } + } + + if (solver=="symphony") { + #library("Rsymphony") + if (requireNamespace('Rsymphony', quietly = TRUE)) { + cat(format(" Symphony optimizer is open..."), "\n") + dir = rep(NA, length(sense)) + dir[sense=="E"] = '==' + dir[sense=="L"] = '<=' + dir[sense=="G"] = '>=' + bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), + upper = list(ind=c(1:length(ub)), val=ub)) + ptm = proc.time() + out = Rsymphony::Rsymphony_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = var_type, max = FALSE, time_limit = t_max) + time = (proc.time()-ptm)[3] + + # Output + if (out$status==228) { + cat(format(" Error: problem exceeded the time limit and no feasible solution is found!"), "\n") + obj_val = NA + obj_dist_mat = NA + id_1 = NA + id_2 = NA + group_id = NA + time = NA + } + else if (out$status!=0) { + cat(format(" Error: problem infeasible!"), "\n") + obj_val = NA + obj_dist_mat = NA + id_1 = NA + id_2 = NA + group_id = NA + time = NA + } + + if (out$status==0) { + cat(format(" Optimal matches found"), "\n") + + if (approximate == 1) { + rel = .relaxation_n(n_tot, out$solution, dist_mat, subset_weight, "symphony") + out$solution = rel$sol + out$objval = rel$obj + time = time + rel$time + } + + i_ind = rep(1:(n_tot-1), (n_tot-1):1) + aux = matrix(1:n_tot, nrow = n_tot, ncol = n_tot) + j_ind = aux[lower.tri(aux)] + + group_1 = i_ind[out$solution==1] + group_2 = j_ind[out$solution==1] + max_groups = apply(cbind(group_1, group_2), 1, max) + + id_1 = group_1[max_groups<=n_tot] + id_2 = group_2[max_groups<=n_tot] + + #! Group identifier + group_id_1 = 1:(length(id_1)) + group_id_2 = 1:(length(id_2)) + group_id = c(group_id_1, group_id_2) + + obj_val = out$objval + obj_dist_mat = sum(t(dist_mat)[lower.tri(dist_mat)] * out$solution) + } + } else { + stop('suggested package not installed') + } + } + + return = list(obj_total = obj_val, obj_dist_mat = obj_dist_mat, id_1 = id_1, id_2 = id_2, + group_id = group_id,time = time) +} diff --git a/R/oneprob_profmatch.r b/R/oneprob_profmatch.r index 469f168..8f7c442 100644 --- a/R/oneprob_profmatch.r +++ b/R/oneprob_profmatch.r @@ -4,10 +4,10 @@ mom_targets <- mom$targets if (is.null(solver)) { - solver = 'glpk' t_max = 60 * 15 approximate = 1 - } else { + solver = "highs" + }else { t_max = solver$t_max approximate = solver$approximate trace = solver$trace @@ -126,34 +126,100 @@ #! GLPK else if (solver == "glpk") { #library(Rglpk) - cat(format(" GLPK optimizer is open..."), "\n") - dir = rep(NA, length(prmtrs$sense)) - dir[prmtrs$sense=="E"] = '==' - dir[prmtrs$sense=="L"] = '<=' - dir[prmtrs$sense=="G"] = '>=' + if (requireNamespace('Rglpk', quietly=TRUE)) { + cat(format(" GLPK optimizer is open..."), "\n") + dir = rep(NA, length(prmtrs$sense)) + dir[prmtrs$sense=="E"] = '==' + dir[prmtrs$sense=="L"] = '<=' + dir[prmtrs$sense=="G"] = '>=' + + cat(format(" Finding the optimal matches..."), "\n") + ptm = proc.time() + out= Rglpk::Rglpk_solve_LP(cvec, Amat, dir, bvec, types = vtype, max = TRUE) + time = (proc.time()-ptm)[3] + + if (out$status!=0) { + cat(format(" Error: problem infeasible!"), "\n") + obj_total = NA + obj_dist_mat = NA + id = NA + time = NA + } + + if (out$status==0) { + cat(format(" Optimal matches found"), "\n") + + #! Matched units indexes + id = (1:n_dec_vars)[t_ind==1 & out$solution==1] + + #! Optimal value of the objective function + obj_total = out$optimum + + } + } + else { + stop('Required solver not installed') + } + } + + #! HiGHS + else if (solver == "highs"){ + #library(highs) + cat(format(" HiGHS optimizer is open..."), "\n") + lhs = rep(-Inf, length(sense)) + rhs = rep(Inf, length(sense)) + lhs[sense == "G"] = bvec[sense == "G"] + rhs[sense == "L"] = bvec[sense == "L"] + lhs[sense == "E"] = bvec[sense == "E"] + rhs[sense == "E"] = bvec[sense == "E"] + + types = vtype + types[types=="B"] = "I" + cat(format(" Finding the optimal matches..."), "\n") ptm = proc.time() - out= Rglpk_solve_LP(cvec, Amat, dir, bvec, types = vtype, max = TRUE) + out = highs_solve(L = cvec, + lower = 0, + upper = 1, + A = Amat, + lhs = lhs, + rhs = rhs, + types = types, + maximum = TRUE, + control = (highs_control(time_limit = t_max))) time = (proc.time()-ptm)[3] - if (out$status!=0) { + if (out$status == 7){ + cat(format(" Optimal matches found"), "\n") + + #! Matched units indexes + id = (1:n_dec_vars)[round(out$primal_solution, 1e-10)==1] + + #! Optimal value of the objective function + obj_total = out$objective_value + } + else if (out$status == 8) { cat(format(" Error: problem infeasible!"), "\n") obj_total = NA - obj_dist_mat = NA id = NA time = NA } - - if (out$status==0) { - cat(format(" Optimal matches found"), "\n") + else if (out$status == 13) { + cat(format(" Time limit reached!"), "\n") #! Matched units indexes - id = (1:n_dec_vars)[t_ind==1 & out$solution==1] + id = (1:n_dec_vars)[round(out$primal_solution, 1e-10)==1] #! Optimal value of the objective function - obj_total = out$optimum - + obj_total = out$objective_value + } + else{ + outmessage = paste0(" Error: HiGHS solver message: ", out$status_message) + cat(format(outmessage), "\n") + obj_total = NA + id = NA + time = NA } } diff --git a/R/problemparameters_cardmatch.r b/R/problemparameters_cardmatch.r index baabe2b..149ba86 100644 --- a/R/problemparameters_cardmatch.r +++ b/R/problemparameters_cardmatch.r @@ -104,6 +104,7 @@ } #! aux = cbind(rows_ind, cols_ind, vals)[order(cols_ind), ] + aux = aux[(aux[, 3] != 0),] Amat = simple_triplet_matrix(i = aux[, 1], j = aux[, 2], v = aux[, 3]) #! Constraint vector, bvec diff --git a/R/problemparameters_profmatch.R b/R/problemparameters_profmatch.R index 980f132..fa4f084 100644 --- a/R/problemparameters_profmatch.R +++ b/R/problemparameters_profmatch.R @@ -1,4 +1,4 @@ -#! Generate the parameters for cardmatch +#! Generate the parameters for profmatch .problemparameters_profmatch = function(mom_covs, mom_tols, mom_targets) { if (is.vector(mom_covs) == TRUE){ @@ -42,6 +42,7 @@ vals = c(vals_mom) aux = cbind(rows_ind, cols_ind, vals)[order(cols_ind), ] + aux = aux[(aux[, 3] != 0),] Amat = simple_triplet_matrix(i = aux[, 1], j = aux[, 2], v = aux[, 3]) #! Constraint vector, bvec diff --git a/R/relaxation_b.r b/R/relaxation_b.r index 4f939ba..da6b332 100644 --- a/R/relaxation_b.r +++ b/R/relaxation_b.r @@ -34,6 +34,58 @@ #### SOLVER PART ####### + if (solver == "highs"){ + #library(highs) + cat(format(" HiGHS optimizer is open..."), "\n") + lhs = rep(-Inf, length(sense)) + rhs = rep(Inf, length(sense)) + lhs[sense == "G"] = bvec[sense == "G"] + rhs[sense == "L"] = bvec[sense == "L"] + lhs[sense == "E"] = bvec[sense == "E"] + rhs[sense == "E"] = bvec[sense == "E"] + + types = vtype + types[types=="B"] = "I" + + cat(format(" Finding the optimal matches..."), "\n") + ptm = proc.time() + out = highs_solve(L = cvec, + lower = 0, + upper = 1, + A = Amat, + lhs = lhs, + rhs = rhs, + types = types, + maximum = TRUE) + time = (proc.time()-ptm)[3] + + if (out$status == 7 | out$status == 13){ + if (out$status == 7){ + cat(format(" Optimal matches found"), "\n") + } + else if (out$status == 13){ + cat(format(" Time limit reached!"), "\n") + } + sol = out$primal_solution + if(is.null(dist_mat)) { + obj = sum(-1*(rep(1, n_tot)) * (round(sol, 1e-10)==1)) + } else { + obj = sum((as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE))-(subset_weight*rep(1, n_t*n_c)))*(round(sol, 1e-10)==1)) + } + } + else if (out$status == 8) { + cat(format(" Error: problem infeasible!"), "\n") + obj = 0 + sol = NULL + } + else{ + outmessage = paste0(" Error: HiGHS solver message: ", out$status_message) + cat(format(outmessage), "\n") + obj = 0 + sol = NULL + } + } + if (solver == "cplex"){ #library(Rcplex) if (requireNamespace('Rcplex', quietly = TRUE)) { @@ -137,31 +189,35 @@ # GLPK if (solver == "glpk") { #library(Rglpk) - dir = rep(NA, length(sense)) - dir[sense=="E"] = '==' - dir[sense=="L"] = '<=' - dir[sense=="G"] = '>=' - - bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), - upper = list(ind=c(1:length(ub)), val=ub)) - - ptm = proc.time() - out= Rglpk_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = vtype, max = TRUE) - time = (proc.time()-ptm)[3] - - if (out$status!=0) { - cat(format(" Error: problem infeasible!"), "\n") - obj = 0 - sol = NULL - } else { - sol = out$solution - if(is.null(dist_mat)) { - obj = sum(-1*(rep(1, n_tot)) * out$solution) + if (requireNamespace('Rglpk', quietly = TRUE)) { + dir = rep(NA, length(sense)) + dir[sense=="E"] = '==' + dir[sense=="L"] = '<=' + dir[sense=="G"] = '>=' + + bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), + upper = list(ind=c(1:length(ub)), val=ub)) + + ptm = proc.time() + out= Rglpk::Rglpk_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = vtype, max = TRUE) + time = (proc.time()-ptm)[3] + + if (out$status!=0) { + cat(format(" Error: problem infeasible!"), "\n") + obj = 0 + sol = NULL } else { - obj = sum((as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE))-(subset_weight*rep(1, n_t*n_c)))*out$solution) + sol = out$solution + if(is.null(dist_mat)) { + obj = sum(-1*(rep(1, n_tot)) * out$solution) + } else { + obj = sum((as.vector(matrix(t(dist_mat), nrow = 1, byrow = TRUE))-(subset_weight*rep(1, n_t*n_c)))*out$solution) + } } } - + else { + stop('suggested package not installed') + } } return(list(sol = sol, obj = obj, time = time)) diff --git a/R/relaxation_n.r b/R/relaxation_n.r index f73f3ea..fc6047c 100644 --- a/R/relaxation_n.r +++ b/R/relaxation_n.r @@ -84,6 +84,47 @@ } } + if (solver == "highs") { + #library(highs) + lhs = rep(-Inf, length(sense)) + rhs = rep(Inf, length(sense)) + lhs[sense == "G"] = bvec[sense == "G"] + rhs[sense == "L"] = bvec[sense == "L"] + lhs[sense == "E"] = bvec[sense == "E"] + rhs[sense == "E"] = bvec[sense == "E"] + + types = vtype + types[types=="B"] = "I" + + cat(format(" Finding the optimal matches..."), "\n") + ptm = proc.time() + out = highs_solve(L = cvec, + lower = 0, + upper = ub, + A = Amat, + lhs = lhs, + rhs = rhs, + types = types, + maximum = TRUE) + time = (proc.time()-ptm)[3] + if (out$status == 8) { + cat(format(" Error: problem infeasible!"), "\n") + obj = 0 + sol = NULL + } + + else if (out$status == 7 | out$status == 13){ + if (out$status == 7){ + cat(format(" Optimal matches found"), "\n") + } + else if (out$status == 13){ + cat(format(" Time limit reached!"), "\n") + } + sol = out$primal_solution + obj = sum((t(dist_mat)[lower.tri(dist_mat)]-(subset_weight*rep(1, n_dec))) * (round(out$primal_solution, 1e-10) == 1)) + } + } + if (solver == "symphony") { #library(Rsymphony) if (requireNamespace('Rsymphony', quietly = TRUE)) { @@ -115,46 +156,51 @@ # GLPK if (solver == "glpk") { #library(Rglpk) - #dir = rep(NA, length(sense)) - #dir[sense=="E"] = '==' - #dir[sense=="L"] = '<=' - #dir[sense=="G"] = '>=' - # - #bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), - # upper = list(ind=c(1:length(ub)), val=ub)) - # - #ptm = proc.time() - #out= Rglpk_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = vtype, max = TRUE) - #time = (proc.time()-ptm)[3] - # - #if (out$status!=0) { - # cat(format(" Error: problem infeasible!"), "\n") - # obj = 0 - # sol = NULL - #} else { - # sol = out$solution - # obj = sum((t(dist_mat)[lower.tri(dist_mat)]-(subset_weight*rep(1, n_dec))) * out$solution) - #} - ptm = proc.time() - Amat = matrix(0, nrow = n_tot, ncol = n_tot) - res = matrix(0, nrow = n_tot, ncol = n_tot) - - Amat[upper.tri(Amat)] = coef - - max_edge = max(Amat) - while (max_edge > 0) { - row = which(Amat == max_edge, arr.ind=TRUE)[1,1] - col = which(Amat == max_edge, arr.ind=TRUE)[1,2] - res[row,col] = 1 - Amat[row,] = 0 - Amat[,row] = 0 - Amat[col,] = 0 - Amat[,col] = 0 + if (requireNamespace('Rglpk', quietly = TRUE)) { + #dir = rep(NA, length(sense)) + #dir[sense=="E"] = '==' + #dir[sense=="L"] = '<=' + #dir[sense=="G"] = '>=' + # + #bound = list(lower = list(ind=c(1:length(ub)), val=rep(0,length(ub))), + # upper = list(ind=c(1:length(ub)), val=ub)) + # + #ptm = proc.time() + #out= Rglpk::Rglpk_solve_LP(cvec, Amat, dir, bvec, bounds = bound, types = vtype, max = TRUE) + #time = (proc.time()-ptm)[3] + # + #if (out$status!=0) { + # cat(format(" Error: problem infeasible!"), "\n") + # obj = 0 + # sol = NULL + #} else { + # sol = out$solution + # obj = sum((t(dist_mat)[lower.tri(dist_mat)]-(subset_weight*rep(1, n_dec))) * out$solution) + #} + ptm = proc.time() + Amat = matrix(0, nrow = n_tot, ncol = n_tot) + res = matrix(0, nrow = n_tot, ncol = n_tot) + + Amat[upper.tri(Amat)] = coef + max_edge = max(Amat) + while (max_edge > 0) { + row = which(Amat == max_edge, arr.ind=TRUE)[1,1] + col = which(Amat == max_edge, arr.ind=TRUE)[1,2] + res[row,col] = 1 + Amat[row,] = 0 + Amat[,row] = 0 + Amat[col,] = 0 + Amat[,col] = 0 + max_edge = max(Amat) + } + sol = res[upper.tri(res)] + obj = sum((t(dist_mat)[lower.tri(dist_mat)]-(subset_weight*rep(1, n_dec))) * sol) + time = (proc.time()-ptm)[3] + } + else { + stop('suggested package not installed') } - sol = res[upper.tri(res)] - obj = sum((t(dist_mat)[lower.tri(dist_mat)]-(subset_weight*rep(1, n_dec))) * sol) - time = (proc.time()-ptm)[3] } diff --git a/man/bmatch.Rd b/man/bmatch.Rd index c0a82d9..e6fd2cd 100644 --- a/man/bmatch.Rd +++ b/man/bmatch.Rd @@ -86,7 +86,7 @@ Optimization solver parameters: a list with four objects, \code{solver = list(name = name, t_max = t_max, approximate = 1, round_cplex = 0,}\cr \code{ trace_cplex = 0)}. -\code{solver} is a string that determines the optimization solver to be used. The options are: \code{cplex}, \code{glpk}, \code{gurobi} and \code{symphony}. The default solver is \code{glpk} with \code{approximate = 1}, so that by default an approximate solution is found (see \code{approximate} below). For an exact solution, we strongly recommend using \code{cplex} or \code{gurobi} as they are much faster than the other solvers, but they do require a license (free for academics, but not for people outside universities). Between \code{cplex} and \code{gurobi}, note that installing the R interface for \code{gurobi} is much simpler. +\code{solver} is a string that determines the optimization solver to be used. The options are: \code{cplex}, \code{glpk}, \code{gurobi}, \code{highs}, and \code{symphony}. The default solver is \code{highs} with \code{approximate = 1}, so that by default an approximate solution is found (see \code{approximate} below). For an exact solution, we strongly recommend using \code{cplex} or \code{gurobi} as they are much faster than the other solvers, but they do require a license (free for academics, but not for people outside universities). Between \code{cplex} and \code{gurobi}, note that installing the R interface for \code{gurobi} is much simpler. \code{t_max} is a scalar with the maximum time limit for finding the matches. This option is specific to \code{cplex} and \code{gurobi}. If the optimal matches are not found within this time limit, a partial, suboptimal solution is given. diff --git a/man/cardmatch.Rd b/man/cardmatch.Rd index b9c266d..e50b664 100644 --- a/man/cardmatch.Rd +++ b/man/cardmatch.Rd @@ -20,7 +20,7 @@ \code{mom = list(covs = mom_covs, tols = mom_tols, targets = mom_targets)}. -\code{mom_covs} is a matrix where each column is a covariate whose mean is to be balanced. \code{mom_tols} is a vector of tolerances for the maximum difference in means for the covariates in \code{mom_covs}. \code{mom_targets} is a vector of target moments (e.g., means) of a distribution to be approximated by matched sampling. If \code{mom_targets} is specified, then \code{profmatch} will select units from each treatment group so that each matched group differs at most by \code{mom_tols} units from the respective moments in \code{mom_targets}. As a result, the matched groups will differ at most \code{mom_tols * 2} from each other. Under certain assumptions, \code{mom_targets} can be used for constructing a representative matched sample. The lengths of \code{mom_tols} and \code{mom_target} have to be equal to the number of columns of \code{mom_covs}. Note that the columns of \code{mom_covs} can be transformations of the original covariates to balance higher order single-dimensional moments like variances and skewness, and multidimensional moments such as correlations (Zubizarreta 2012).} +\code{mom_covs} is a matrix where each column is a covariate whose mean is to be balanced. \code{mom_tols} is a vector of tolerances for the maximum difference in means for the covariates in \code{mom_covs}. \code{mom_targets} is a vector of target moments (e.g., means) of a distribution to be approximated by matched sampling. If \code{mom_targets} is specified, then \code{cardmatch} will select units from each treatment group so that each matched group differs at most by \code{mom_tols} units from the respective moments in \code{mom_targets}. As a result, the matched groups will differ at most \code{mom_tols * 2} from each other. Under certain assumptions, \code{mom_targets} can be used for constructing a representative matched sample. The lengths of \code{mom_tols} and \code{mom_target} have to be equal to the number of columns of \code{mom_covs}. Note that the columns of \code{mom_covs} can be transformations of the original covariates to balance higher order single-dimensional moments like variances and skewness, and multidimensional moments such as correlations (Zubizarreta 2012).} \item{fine}{Fine balance parameters: a list with one argument, @@ -34,9 +34,9 @@ Optimization solver parameters: a list with four objects, \code{solver = list(name = name, t_max = t_max, approximate = 1, round_cplex = 0,}\cr \code{ trace_cplex = 0)}. -\code{solver} is a string that determines the optimization solver to be used. The options are: \code{cplex}, \code{glpk}, \code{gurobi} and \code{symphony}. The default solver is \code{glpk} with \code{approximate = 1}, so that by default an approximate solution is found (see \code{approximate} below). For an exact solution, we strongly recommend using \code{cplex} or \code{gurobi} as they are much faster than the other solvers, but they do require a license (free for academics, but not for people outside universities). Between \code{cplex} and \code{gurobi}, note that installing the R interface for \code{gurobi} is much simpler. +\code{solver} is a string that determines the optimization solver to be used. The options are: \code{cplex}, \code{glpk}, \code{gurobi}, \code{highs}, and \code{symphony}. The default solver is \code{highs} with \code{approximate = 1}. For an exact solution, we strongly recommend using \code{cplex} or \code{gurobi} as they are much faster than the other solvers, but they do require a license (free for academics, but not for people outside universities). Between \code{cplex} and \code{gurobi}, note that installing the R interface for \code{gurobi} is much simpler. -\code{t_max} is a scalar with the maximum time limit for finding the matches. This option is specific to \code{cplex} and \code{gurobi}. If the optimal matches are not found within this time limit, a partial, suboptimal solution is given. +\code{t_max} is a scalar with the maximum time limit for finding the matches. This option is specific to \code{cplex}, \code{gurobi}, and \code{highs}. If the optimal matches are not found within this time limit, a partial, suboptimal solution is given. \code{approximate} is a scalar that determines the method of solution. If \code{approximate = 1} (the default), an approximate solution is found via a relaxation of the original integer program. This method of solution is faster than \code{approximate = 0}, but some balancing constraints may be violated to some extent. This option works only with \code{n_controls = 1}, i.e. pair matching. @@ -119,7 +119,7 @@ fine = list(covs = fine_covs) # Solver options t_max = 60*5 -solver = "glpk" +solver = "highs" approximate = 0 solver = list(name = solver, t_max = t_max, approximate = approximate, round_cplex = 0, trace = 0) diff --git a/man/designmatch-package.Rd b/man/designmatch-package.Rd index d24940b..69c6c2e 100644 --- a/man/designmatch-package.Rd +++ b/man/designmatch-package.Rd @@ -1,63 +1,63 @@ -\name{designmatch-package} -\alias{designmatch-package} -\alias{designmatch} -\docType{package} -\title{ -Optimal Matched Design of Randomized Experiments and Observational Studies -} -\description{ -\code{designmatch} includes two functions for the construction of matched samples that are balanced and representative by design. These two functions are \code{bmatch} and \code{nmatch} for bipartite and nonbipartite matching, respectively. Both functions include options for directly balancing means, higher order moments, and distributions of the observed covariates. In both \code{bmatch} and \code{nmatch}, an integer programming (IP) problem is solved. This IP problem either minimizes the total sum of covariate distances between matched units, maximizes the total number of matched units, or optimizes a combination of the two, subject to matching and covariate balancing constraints. In order to solve these problems, four different optimization solvers can be used: CPLEX, GLPK, Gurobi, and Symphony. By default, both \code{bmatch} and \code{nmatch} solve a relaxation of these integer programs using GLPK, which runs quickly but may violate to some extent some of the balancing constraints. If the user wants to solve for an exact solution of the program, we strongly recommend using either CPLEX or Gurobi, which are much faster but require a license (free for academic users) and special installation (see the installation instructions). Between the two, Gurobi is considerably easier to install. Among others, \code{designmatch} can be used for matching in treatment-control as well as case-control observational studies; observational studies with instrumental variables and discontinuity designs; and for the design of randomized experiments, for example for matching before randomization. The package also includes functions for assessing covariate balance in the matched samples. -} -\details{ -\tabular{ll}{ -Package: \tab designmatch\cr -Type: \tab Package\cr -Version: \tab 0.2\cr -Date: \tab 2016-08-09\cr -License: \tab GPL-2 | GPL-3\cr -} -} -\author{ - Jose R. Zubizarreta , Cinar Kilcioglu . - -Maintainer: Jose R. Zubizarreta , Cinar Kilcioglu . -} -\references{ - Greevy, R., Lu, B., Silber, J. H., and Rosenbaum, P. R. (2004), "Optimal Multivariate -Matching Before Randomization," \emph{Biostatistics}, 5, 263-275. - - Hsu. J., Zubizarreta, J. R., Small, D. S., and Rosenbaum, P. R. (2015), "Strong Control of the Family-Wise Error Rate in Observational Studies that Discover Effect Modification by Exploratory Methods," \emph{Biometrika}, 102, 767-782. - - Keele, L., Titiunik, R., and Zubizarreta, J. R., (2015), "Enhancing a Geographic Regression Discontinuity Design Through Matching to Estimate the Effect of Ballot Initiatives on Voter Turnout," \emph{Journal of the Royal Statistical Society: Series A}, 178, 223-239. - - Kilcioglu, C., and Zubizarreta, J. R., (2016), "Maximizing the Information Content of a Balanced Matched Sample in a Study of the Economic Performance of Green Buildings," working paper. - - Lu, B., Greevy, R., Xu, X., and Beck C. (2011), "Optimal Nonbipartite Matching and its Statistical Applications," \emph{The American Statistician}, 65, 21-30. - - Rosenbaum, P. R. (2010), \emph{Design of Observational Studies}, Springer. - - Rosenbaum, P. R. (2012), "Optimal Matching of an Optimally Chosen Subset in Observa- -tional studies," \emph{Journal of Computational and Graphical Statistics}, 21, 57-71. - - Yang, D., Small, D., Silber, J. H., and Rosenbaum, P. R. (2012), "Optimal Matching With Minimal Deviation From Fine Balance in a Study of Obesity and Surgical Outcomes," \emph{Biometrics}, 68, 628-636. - - Yang. F., Zubizarreta, J. R., Small, D. S., Lorch, S. A., and Rosenbaum, P. R. (2014), "Dissonant Conclusions When Testing the Validity of an Instrumental Variable," \emph{The American Statistician}, 68, 253-263. - - Zou, J., and Zubizarreta, J. R., (2015) "Covariate Balanced Restricted Randomization: Optimal Designs, Exact Tests, and Asymptotic Results," working paper. - - Zubizarreta, J. R., Reinke, C. E., Kelz, R. R., Silber, J. H., and Rosenbaum, P. R. (2011), "Matching for Several Sparse Nominal Variables in a Case-Control Study of Readmission Following Surgery," \emph{The American Statistician}, 65, 229-238. - - Zubizarreta, J. R. (2012), "Using Mixed Integer Programming for Matching in an Observational Study of Kidney Failure after Surgery," \emph{Journal of the American Statistical Association}, 107, 1360-1371. - - Zubizarreta, J. R., Paredes, R. D., and Rosenbaum, P. R. (2014), "Matching for Balance, Pairing for Heterogeneity in an Observational Study of the Effectiveness of For-profit and Not-for-profit High Schools in Chile," \emph{Annals of Applied Statistics}, 8, 204-231. -} -%~~ Optionally other standard keywords, one per line, from file KEYWORDS in the R documentation ~~ -%~~ directory ~~ -%\keyword{ package } -%\seealso{ -%~~ Optional links to other man pages, e.g. ~~ -%~~ \code{\link[:-package]{}} ~~ -%} -%\examples{ -%~~ simple examples of the most important functions ~~ -%} +\name{designmatch-package} +\alias{designmatch-package} +\alias{designmatch} +\docType{package} +\title{ +Optimal Matched Design of Randomized Experiments and Observational Studies +} +\description{ +\code{designmatch} includes two functions for the construction of matched samples that are balanced and representative by design. These two functions are \code{bmatch} and \code{nmatch} for bipartite and nonbipartite matching, respectively. Both functions include options for directly balancing means, higher order moments, and distributions of the observed covariates. In both \code{bmatch} and \code{nmatch}, an integer programming (IP) problem is solved. This IP problem either minimizes the total sum of covariate distances between matched units, maximizes the total number of matched units, or optimizes a combination of the two, subject to matching and covariate balancing constraints. In order to solve these problems, four different optimization solvers can be used: CPLEX, GLPK, Gurobi, HiGHS, and Symphony. By default, both \code{bmatch} and \code{nmatch} solve a relaxation of these integer programs using HiGHS, which runs quickly but may violate to some extent some of the balancing constraints. If the user wants to solve for an exact solution of the program, we strongly recommend using either CPLEX or Gurobi, which are much faster but require a license (free for academic users) and special installation (see the installation instructions). Between the two, Gurobi is considerably easier to install. Among others, \code{designmatch} can be used for matching in treatment-control as well as case-control observational studies; observational studies with instrumental variables and discontinuity designs; and for the design of randomized experiments, for example for matching before randomization. The package also includes functions for assessing covariate balance in the matched samples. +} +\details{ +\tabular{ll}{ +Package: \tab designmatch\cr +Type: \tab Package\cr +Version: \tab 0.2\cr +Date: \tab 2016-08-09\cr +License: \tab GPL-2 | GPL-3\cr +} +} +\author{ + Jose R. Zubizarreta , Cinar Kilcioglu . + +Maintainer: Jose R. Zubizarreta , Cinar Kilcioglu . +} +\references{ + Greevy, R., Lu, B., Silber, J. H., and Rosenbaum, P. R. (2004), "Optimal Multivariate +Matching Before Randomization," \emph{Biostatistics}, 5, 263-275. + + Hsu. J., Zubizarreta, J. R., Small, D. S., and Rosenbaum, P. R. (2015), "Strong Control of the Family-Wise Error Rate in Observational Studies that Discover Effect Modification by Exploratory Methods," \emph{Biometrika}, 102, 767-782. + + Keele, L., Titiunik, R., and Zubizarreta, J. R., (2015), "Enhancing a Geographic Regression Discontinuity Design Through Matching to Estimate the Effect of Ballot Initiatives on Voter Turnout," \emph{Journal of the Royal Statistical Society: Series A}, 178, 223-239. + + Kilcioglu, C., and Zubizarreta, J. R., (2016), "Maximizing the Information Content of a Balanced Matched Sample in a Study of the Economic Performance of Green Buildings," working paper. + + Lu, B., Greevy, R., Xu, X., and Beck C. (2011), "Optimal Nonbipartite Matching and its Statistical Applications," \emph{The American Statistician}, 65, 21-30. + + Rosenbaum, P. R. (2010), \emph{Design of Observational Studies}, Springer. + + Rosenbaum, P. R. (2012), "Optimal Matching of an Optimally Chosen Subset in Observa- +tional studies," \emph{Journal of Computational and Graphical Statistics}, 21, 57-71. + + Yang, D., Small, D., Silber, J. H., and Rosenbaum, P. R. (2012), "Optimal Matching With Minimal Deviation From Fine Balance in a Study of Obesity and Surgical Outcomes," \emph{Biometrics}, 68, 628-636. + + Yang. F., Zubizarreta, J. R., Small, D. S., Lorch, S. A., and Rosenbaum, P. R. (2014), "Dissonant Conclusions When Testing the Validity of an Instrumental Variable," \emph{The American Statistician}, 68, 253-263. + + Zou, J., and Zubizarreta, J. R., (2015) "Covariate Balanced Restricted Randomization: Optimal Designs, Exact Tests, and Asymptotic Results," working paper. + + Zubizarreta, J. R., Reinke, C. E., Kelz, R. R., Silber, J. H., and Rosenbaum, P. R. (2011), "Matching for Several Sparse Nominal Variables in a Case-Control Study of Readmission Following Surgery," \emph{The American Statistician}, 65, 229-238. + + Zubizarreta, J. R. (2012), "Using Mixed Integer Programming for Matching in an Observational Study of Kidney Failure after Surgery," \emph{Journal of the American Statistical Association}, 107, 1360-1371. + + Zubizarreta, J. R., Paredes, R. D., and Rosenbaum, P. R. (2014), "Matching for Balance, Pairing for Heterogeneity in an Observational Study of the Effectiveness of For-profit and Not-for-profit High Schools in Chile," \emph{Annals of Applied Statistics}, 8, 204-231. +} +%~~ Optionally other standard keywords, one per line, from file KEYWORDS in the R documentation ~~ +%~~ directory ~~ +%\keyword{ package } +%\seealso{ +%~~ Optional links to other man pages, e.g. ~~ +%~~ \code{\link[:-package]{}} ~~ +%} +%\examples{ +%~~ simple examples of the most important functions ~~ +%} diff --git a/man/distmatch.Rd b/man/distmatch.Rd index cc3621c..0d7d9ee 100644 --- a/man/distmatch.Rd +++ b/man/distmatch.Rd @@ -25,7 +25,7 @@ Optimization solver parameters: a list with four objects, \code{solver = list(name = name, t_max = t_max, approximate = 1, round_cplex = 0,}\cr \code{ trace_cplex = 0)}. -\code{solver} is a string that determines the optimization solver to be used. The options are: \code{cplex}, \code{glpk}, \code{gurobi} and \code{symphony}. The default solver is \code{glpk} with \code{approximate = 1}, so that by default an approximate solution is found (see \code{approximate} below). For an exact solution, we strongly recommend using \code{cplex} or \code{gurobi} as they are much faster than the other solvers, but they do require a license (free for academics, but not for people outside universities). Between \code{cplex} and \code{gurobi}, note that installing the R interface for \code{gurobi} is much simpler. +\code{solver} is a string that determines the optimization solver to be used. The options are: \code{cplex}, \code{glpk}, \code{gurobi}, \code{highs}, and \code{symphony}. The default solver is \code{highs} with \code{approximate = 1}, so that by default an approximate solution is found (see \code{approximate} below). For an exact solution, we strongly recommend using \code{cplex} or \code{gurobi} as they are much faster than the other solvers, but they do require a license (free for academics, but not for people outside universities). Between \code{cplex} and \code{gurobi}, note that installing the R interface for \code{gurobi} is much simpler. \code{t_max} is a scalar with the maximum time limit for finding the matches. This option is specific to \code{cplex} and \code{gurobi}. If the optimal matches are not found within this time limit, a partial, suboptimal solution is given. @@ -104,7 +104,7 @@ fine = list(covs = fine_covs) # Solver options t_max = 60*5 -solver = "glpk" +solver = "highs" approximate = 0 solver = list(name = solver, t_max = t_max, approximate = approximate, round_cplex = 0, trace = 0) diff --git a/man/nmatch.Rd b/man/nmatch.Rd index c7c0f9c..6b743e9 100644 --- a/man/nmatch.Rd +++ b/man/nmatch.Rd @@ -71,7 +71,7 @@ where \code{fine_covs} is a matrix where each column is a nominal covariate for \code{solver = list(name = name, t_max = t_max, approximate = 1, round_cplex = 0,}\cr \code{ trace_cplex = 0)}. -\code{solver} is a string that determines the optimization solver to be used. The options are: \code{cplex}, \code{glpk}, \code{gurobi} and \code{symphony}. The default solver is \code{glpk} with \code{approximate = 1}, so that by default an approximate solution is found (see \code{approximate} below). For an exact solution, we strongly recommend using \code{cplex} or \code{gurobi} as they are much faster than the other solvers, but they do require a license (free for academics, but not for people outside universities). Between \code{cplex} and \code{gurobi}, note that the installation of the \code{gurobi} interface for R is much simpler. +\code{solver} is a string that determines the optimization solver to be used. The options are: \code{cplex}, \code{glpk}, \code{gurobi}, \code{highs}, and \code{symphony}. The default solver is \code{highs} with \code{approximate = 1}, so that by default an approximate solution is found (see \code{approximate} below). For an exact solution, we strongly recommend using \code{cplex} or \code{gurobi} as they are much faster than the other solvers, but they do require a license (free for academics, but not for people outside universities). Between \code{cplex} and \code{gurobi}, note that the installation of the \code{gurobi} interface for R is much simpler. \code{t_max} is a scalar with the maximum time limit for finding the matches. This option is specific to \code{cplex} and \code{gurobi}. If the optimal matches are not found within this time limit, a partial, suboptimal solution is given. @@ -149,7 +149,7 @@ tional studies," \emph{Journal of Computational and Graphical Statistics}, 21, 5 ## (see Rosenbaum 2012 and Zubizarreta et al. 2013 for a discussion). ## Here the balance requirements are mean and fine balance for ## different covariates. We require 50 pairs to be matched. -## Again, the solver used is glpk with the approximate option. +## Again, the solver used is HiGHS with the approximate option. ## Matrix of covariates #X_mat = cbind(age, education, black, hispanic, married, nodegree, re74, re75) @@ -171,7 +171,7 @@ tional studies," \emph{Journal of Computational and Graphical Statistics}, 21, 5 ## Solver options #t_max = 60*5 -#solver = "glpk" +#solver = "highs" #approximate = 1 #solver = list(name = solver, t_max = t_max, approximate = approximate, round_cplex = 0, #trace_cplex = 0) diff --git a/man/profmatch.Rd b/man/profmatch.Rd index 76e145c..4b4a42e 100644 --- a/man/profmatch.Rd +++ b/man/profmatch.Rd @@ -29,9 +29,9 @@ Optimization solver parameters: a list with four objects, \code{solver = list(name = name, t_max = t_max, approximate = 1, round_cplex = 0,}\cr \code{ trace_cplex = 0)}. -\code{solver} is a string that determines the optimization solver to be used. The options are: \code{cplex}, \code{glpk}, \code{gurobi} and \code{symphony}. The default solver is \code{glpk} with \code{approximate = 1}, so that by default an approximate solution is found (see \code{approximate} below). For an exact solution, we strongly recommend using \code{cplex} or \code{gurobi} as they are much faster than the other solvers, but they do require a license (free for academics, but not for people outside universities). Between \code{cplex} and \code{gurobi}, note that installing the R interface for \code{gurobi} is much simpler. +\code{solver} is a string that determines the optimization solver to be used. The options are: \code{cplex}, \code{glpk}, \code{gurobi}, \code{highs}, and \code{symphony}. The default solver is \code{highs} with \code{approximate = 1}. For an exact solution, we strongly recommend using \code{cplex} or \code{gurobi} as they are much faster than the other solvers, but they do require a license (free for academics, but not for people outside universities). Between \code{cplex} and \code{gurobi}, note that installing the R interface for \code{gurobi} is much simpler. -\code{t_max} is a scalar with the maximum time limit for finding the matches. This option is specific to \code{cplex} and \code{gurobi}. If the optimal matches are not found within this time limit, a partial, suboptimal solution is given. +\code{t_max} is a scalar with the maximum time limit for finding the matches. This option is specific to \code{cplex}, \code{gurobi}, and \code{highs}. If the optimal matches are not found within this time limit, a partial, suboptimal solution is given. \code{approximate} is a scalar that determines the method of solution. If \code{approximate = 1} (the default), an approximate solution is found via a relaxation of the original integer program. This method of solution is faster than \code{approximate = 0}, but some balancing constraints may be violated to some extent. This option works only with \code{n_controls = 1}, i.e. pair matching.