Package: opencesp 0.4.0

opencesp: Generation and Evaluation of Synthetic Tabular Datasets

Tools developed as part of the Open-CESP (Centre de recherche en Epidémiologie et Santé des Populations) initiative to generate and evaluate synthetic datasets for statistical disclosure control. They include tools for investigating the risk-utility trade-off achievable with a given synthesis method, as well as statistical tools for estimating (conditional) probability distributions. The ultimate aim is to help researchers and statisticians disseminate open research data.

Authors: Rémy Chapelle [aut, cre], Centre de recherche en Epidémiologie et Santé des Populations [cph]

opencesp_0.4.0.tar.gz
opencesp_0.4.0.zip (r-4.7), opencesp_0.4.0.zip (r-4.6), opencesp_0.4.0.zip (r-4.5)
opencesp_0.4.0.tgz (r-4.6-x86_64), opencesp_0.4.0.tgz (r-4.6-arm64), opencesp_0.4.0.tgz (r-4.5-x86_64), opencesp_0.4.0.tgz (r-4.5-arm64)
opencesp_0.4.0.tar.gz (r-4.7-arm64), opencesp_0.4.0.tar.gz (r-4.7-x86_64), opencesp_0.4.0.tar.gz (r-4.6-arm64), opencesp_0.4.0.tar.gz (r-4.6-x86_64)
opencesp_0.4.0.tgz (r-4.6-emscripten)
manual.pdf | manual.html
card.svg | card.png
opencesp/json (API)
NEWS

# Install 'opencesp' in R:
install.packages('opencesp', repos = c('https://rchapelle.r-universe.dev', 'https://cloud.r-project.org'))
Uses libs:
  • c++ – GNU Standard C++ Library v3

This package does not link to any GitHub/GitLab/R-Forge repository. No issue tracker or development information is available.

cpp

Score: 1.70 | Exports: 35 | Dependencies: 66

Last updated from: 0d6fed2d6c. Checks: 13 OK. Indexed: yes.

Target | Result | Time
linux-devel-arm64 | OK | 131
linux-devel-x86_64 | OK | 128
source / vignettes | OK | 204
linux-release-arm64 | OK | 144
linux-release-x86_64 | OK | 130
macos-release-arm64 | OK | 155
macos-release-x86_64 | OK | 203
macos-oldrel-arm64 | OK | 186
macos-oldrel-x86_64 | OK | 297
windows-devel | OK | 91
windows-release | OK | 117
windows-oldrel | OK | 93
wasm-release | OK | 117

Exports: adaptive_matches_prop, ASDED, avatarize, CCM_RS, CCM_SR, cor_F1, CRM_RS, CRM_SR, dcr, dep_order, GCAP, get_prec, GTCAP, hellinger_distance, impute_rf, ind_blocks, interval_overlap, LCM, matches_prop, mean_hellinger, outlier_coverage, outlier_learning_factor, PCD_cat, PCD_num, pgb, pgb_control, pgb_cvh, pMSE, pMSE_cp, predict_cde_pgb, predict_cde_pgb_raw, resample, round_synth, tsAUC, univ_att_prob

Dependencies: backports, bit, bit64, boot, broom, cli, clipr, cluster, codetools, cpp11, crayon, dplyr, fastmap, forcats, foreach, generics, glmnet, glue, haven, hms, iterators, jomo, lattice, lifecycle, lme4, magrittr, MASS, Matrix, mice, minqa, mitml, nlme, nloptr, nnet, numDeriv, ordinal, pan, PCAmixdata, pillar, pkgconfig, prettyunits, progress, purrr, R6, randomForest, rbibutils, Rcpp, RcppEigen, Rdpack, readr, reformulas, rlang, rpart, shape, stringi, stringr, survival, tibble, tidyr, tidyselect, tzdb, ucminf, utf8, vctrs, vroom, withr

Readme and manuals

Help Manual

Help page | Topics
opencesp: Generation and Evaluation of Synthetic Tabular Datasets | opencesp-package, opencesp
Adaptive proportion of matches | adaptive_matches_prop
Average squared differences between empirical distributions | ASDED
Avatarization of a dataset | avatarize
Cross-classification metric in the real-synthetic order | CCM_RS
Cross-classification metric in the synthetic-real order | CCM_SR
Membership F1 metric | cor_F1
Cross-regression metric in the real-synthetic order | CRM_RS
Cross-regression metric in the synthetic-real order | CRM_SR
Distance to the closest record | dcr
Dependency order in a data set | dep_order
Generalized Correct Attribution Probability (GCAP) | GCAP
Precision of numeric variables | get_prec
Generalized Targeted Correct Attribution Probability (GTCAP) | GTCAP
Estimated Hellinger distance | hellinger_distance
Random-forest imputation | impute_rf
Independent blocks of variables | ind_blocks
Overlap of confidence intervals | interval_overlap
Log-cluster metric | LCM
Proportion of matches | matches_prop
Mean Hellinger distance | mean_hellinger
Outlier coverage | outlier_coverage
Outlier learning factor | outlier_learning_factor
Pairwise correlation difference adapted for categorical data | PCD_cat
Pairwise correlation difference | PCD_num
Parallel gradient boosting | pgb
Controls for PGB fits | pgb_control
Parallel gradient boosting with cross-validation | pgb_cvh
Propensity score mean-squared error | pMSE
Heuristics for the complexity parameter of tree-based estimation of propensity scores | pMSE_cp
Predict conditional densities from PGB models | predict_cde_pgb
Piecewise-constant conditional densities from PGB models | predict_cde_pgb_raw
Predict method for PGB models | predict.pgb
Sample conditionally on vector length | resample
Round synthetic numeric variables to the precision observed in the original data | round_synth
ts-AUC | tsAUC
Univariate correct attribution probability | univ_att_prob
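
Two of the help topics listed above, "Distance to the closest record" and "Estimated Hellinger distance", name quantities that are easy to illustrate in a few lines. The following is a minimal base-R sketch of the textbook definitions only. It does not call opencesp, and it is not the package's implementation: the function names dcr_sketch and hellinger_sketch, their arguments, and the toy data are hypothetical, and the signatures of the package's own dcr() and hellinger_distance() are not given in this listing, so they may well differ (see manual.pdf for the actual usage).

```r
# Illustrative base-R sketches; NOT the opencesp implementations.
# All names below (dcr_sketch, hellinger_sketch, real, synth) are hypothetical.

# Distance to the closest record: for each synthetic row, the Euclidean
# distance to its nearest row in the real data (numeric columns only).
dcr_sketch <- function(real, synth) {
  real <- as.matrix(real)
  synth <- as.matrix(synth)
  apply(synth, 1, function(s) {
    # Subtract the synthetic row from every real row, then take row norms.
    min(sqrt(rowSums(sweep(real, 2, s)^2)))
  })
}

# Hellinger distance between the empirical distributions of two
# categorical vectors, computed over the union of observed levels.
hellinger_sketch <- function(x, y) {
  lev <- union(unique(x), unique(y))
  p <- table(factor(x, levels = lev)) / length(x)
  q <- table(factor(y, levels = lev)) / length(y)
  sqrt(sum((sqrt(p) - sqrt(q))^2) / 2)
}

# Toy data: the second synthetic row duplicates a real row.
real  <- data.frame(a = c(0, 1, 2), b = c(0, 1, 2))
synth <- data.frame(a = c(0, 2), b = c(1, 2))

dcr_sketch(real, synth)                     # distances 1 and 0
hellinger_sketch(c("u", "u"), c("v", "v"))  # disjoint support gives 1
```

A distance of 0 in the first result flags a synthetic record that exactly reproduces a real one, which is the kind of disclosure risk such a metric is meant to surface. The Hellinger distance ranges from 0 (identical distributions) to 1 (no shared support).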