Package: proBatch 2.0.0

Yuliya Burankova

proBatch: Tools for Diagnostics and Corrections of Batch Effects in Proteomics

These tools facilitate the analysis and correction of batch effects in high-throughput experiments. The package was developed primarily for mass-spectrometry proteomics (DIA/SWATH) but, with minor adaptations, can be applied to most omics data. It contains functions for diagnostics (proteome/genome-wide and feature-level), correction (normalization and batch effect correction), and quality control. Approaches based on non-linear fitting are also included to handle complex, mass-spectrometry-specific signal drifts.
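To make the idea of correcting a non-linear signal drift concrete, here is a minimal base-R sketch for a single feature measured across one batch. It uses `stats::loess` on simulated data; the variable names, the simulated drift, and the `span` value are illustrative assumptions, and this is not the package's own implementation.

```r
# Simulated log-intensities of one feature across 40 runs of a single batch.
# All values here are made up for illustration.
set.seed(42)
order_in_batch <- 1:40
drift <- 0.8 * sin(order_in_batch / 12)        # smooth, non-linear drift
intensity <- 20 + drift + rnorm(40, sd = 0.1)  # true level + drift + noise

# Fit a LOESS curve of intensity against running order.
fit <- loess(intensity ~ order_in_batch, span = 0.75)
trend <- predict(fit, newdata = data.frame(order_in_batch = order_in_batch))

# Remove the fitted trend while keeping the feature's overall level.
corrected <- intensity - trend + mean(trend)

# The spread across runs should shrink once the drift is removed.
c(before = sd(intensity), after = sd(corrected))
```

In a real data set the same fit would typically be repeated per feature and per batch, with the span chosen to suit the number of runs in each batch.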

Authors: Jelena Cuklina [aut], Chloe H. Lee [aut], Patrick Pedrioli [aut], Olga Zolotareva [aut], Yuliya Burankova [cre]

proBatch_2.0.0.tar.gz
proBatch_2.0.0.zip (r-4.7), proBatch_2.0.0.zip (r-4.6), proBatch_2.0.0.zip (r-4.5)
proBatch_2.0.0.tgz (r-4.6-any), proBatch_2.0.0.tgz (r-4.5-any)
proBatch_2.0.0.tar.gz (r-4.7-any), proBatch_2.0.0.tar.gz (r-4.6-any)
proBatch_2.0.0.tgz (r-4.6-emscripten)
manual.pdf | manual.html
card.svg | card.png
proBatch/json (API)
NEWS

# Install 'proBatch' in R:
install.packages('proBatch', repos = c('https://bioc-release.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/freddsle/probatch/issues

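Datasets: example_ecoli_data, example_peptide_annotation, example_proteome, example_proteome_matrix, example_sample_annotation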

On Bioconductor: proBatch-2.1.0 (bioc 3.24), proBatch-2.0.0 (bioc 3.23)

batcheffect, normalization, preprocessing, software, massspectrometry, proteomics, qualitycontrol, visualization

5.73 score, 88 downloads, 1 mention, 86 exports, 161 dependencies

Last updated from: 06dc66163e (on RELEASE_3_23). Checks: 1 NOTE, 9 OK. Indexed: no.

Target                  Result  Time
bioc-checks             NOTE    265
linux-devel-x86_64      OK      637
source / vignettes      OK      441
linux-release-x86_64    OK      568
macos-release-arm64     OK      341
macos-oldrel-arm64      OK      314
windows-devel           OK      461
windows-release         OK      432
windows-oldrel          OK      421
wasm-release            OK      225

Exports: adjust_batch_trend_df, adjust_batch_trend_dm, calculate_feature_CV, calculate_peptide_corr_distr, calculate_PVCA, calculate_sample_corr_distr, center_feature_batch_means_df, center_feature_batch_means_dm, center_feature_batch_medians_df, center_feature_batch_medians_dm, check_sample_consistency, convert_annotation_classes, correct_batch_effects_df, correct_batch_effects_dm, correct_with_ComBat_df, correct_with_ComBat_dm, correct_with_removeBatchEffect_dm, create_peptide_annotation, date_to_sample_order, dates_to_posix, define_sample_order, fit_nonlinear, generate_colors_for_numeric, get_chain, get_operation_log, guess_factor_columns_if_needed, handle_factor_numeric_overlap, handle_missing_values, log_transform_df, log_transform_dm, long_to_matrix, matrix_to_long, merge_rare_levels, normalize_data_df, normalize_data_dm, normalize_sample_medians_df, normalize_sample_medians_dm, pb_add_level, pb_aggregate_level, pb_as_long, pb_as_wide, pb_assay_matrix, pb_current_assay, pb_eval, pb_filterNA, pb_infIsNA, pb_nNA, pb_pipeline_name, pb_register_step, pb_transform, pb_zeroIsNA, plot_boxplot, plot_corr_matrix, plot_CV_distr, plot_CV_distr.df, plot_heatmap_diagnostic, plot_heatmap_generic, plot_hierarchical_clustering, plot_iRT, plot_NA_density, plot_NA_frequency, plot_NA_heatmap, plot_PCA, plot_peptide_corr_distribution, plot_peptide_corr_distribution.corrDF, plot_peptides_of_one_protein, plot_protein_corrplot, plot_PVCA, plot_PVCA.df, plot_sample_corr_distribution, plot_sample_corr_distribution.corrDF, plot_sample_corr_heatmap, plot_sample_mean, plot_single_feature, plot_spike_in, plot_split_violin_with_boxplot, plot_with_fitting_curve, prepare_PVCA_df, ProBatchFeatures, ProBatchFeatures_from_long, quantile_normalize_df, quantile_normalize_dm, sample_annotation_to_colors, unlog_df, unlog_dm, warn_unmapped_columns

Dependencies: abind, affy, affyio, annotate, AnnotationDbi, AnnotationFilter, askpass, backports, base64enc, BH, Biobase, BiocBaseUtils, BiocGenerics, BiocManager, BiocParallel, Biostrings, bit, bit64, blob, boot, bslib, cachem, checkmate, cli, clue, cluster, codetools, colorspace, corrplot, cpp11, crayon, crosstalk, curl, data.table, DBI, DelayedArray, digest, doParallel, dplyr, dynamicTreeCut, edgeR, evaluate, farver, fastcluster, fastmap, fontawesome, foreach, foreign, formatR, Formula, fs, futile.logger, futile.options, genefilter, generics, GenomicRanges, ggfortify, ggplot2, glue, gridExtra, gtable, highr, Hmisc, htmlTable, htmltools, htmlwidgets, httr, igraph, impute, IRanges, isoband, iterators, jquerylib, jsonlite, KEGGREST, knitr, labeling, lambda.r, later, lattice, lazyeval, lifecycle, limma, lme4, locfit, lubridate, magrittr, MASS, Matrix, MatrixGenerics, matrixStats, memoise, mgcv, mime, minqa, MsCoreUtils, MultiAssayExperiment, nlme, nloptr, nnet, openssl, otel, pheatmap, pillar, pkgconfig, plotly, plyr, png, preprocessCore, promises, ProtGenerics, purrr, pvca, QFeatures, R6, rappdirs, rbibutils, RColorBrewer, Rcpp, RcppEigen, Rdpack, reformulas, reshape2, rlang, rmarkdown, rpart, RSQLite, rstudioapi, S4Arrays, S4Vectors, S7, sass, scales, Seqinfo, snow, SparseArray, statmod, stringi, stringr, SummarizedExperiment, survival, sva, sys, tibble, tidyr, tidyselect, timechange, tinytex, utf8, vctrs, viridis, viridisLite, vsn, wesanderson, WGCNA, withr, xfun, XML, xtable, XVector, yaml

proBatch

Rendered from proBatch.Rmd using knitr::rmarkdown on Jun 03 2026.

Last update: 2026-03-10
Started: 2019-09-30

ProBatchFeatures

Rendered from proBatchFeatures.Rmd using knitr::rmarkdown on Jun 03 2026.

Last update: 2026-03-10
Started: 2025-09-28

Readme and manuals

Help Manual

Help page | Topics
Subset `ProBatchFeatures` objects without dropping metadata. | ProBatchFeatures-subset [,ProBatchFeatures,ANY,ANY,ANY-method
Calculate CV distribution for each feature | calculate_feature_CV
Calculate peptide correlation between and within peptides of one protein | calculate_peptide_corr_distr
Calculate variance distribution by variable | calculate_PVCA calculate_PVCA.default calculate_PVCA.ProBatchFeatures
Calculates correlation for all pairs of the samples in data matrix, labels as replicated/same_batch/unrelated in output columns (see "Value"). | calculate_sample_corr_distr
Check if sample annotation is consistent with data matrix and join the two | check_sample_consistency
Convert factor and numeric columns | convert_annotation_classes
Batch correction of normalized data | adjust_batch_trend_df adjust_batch_trend_dm center_feature_batch_means_df center_feature_batch_means_dm center_feature_batch_medians_df center_feature_batch_medians_dm correct_batch_effects correct_batch_effects_df correct_batch_effects_dm correct_with_ComBat_df correct_with_ComBat_dm
Batch effect correction with removeBatchEffect from limma | correct_with_removeBatchEffect_dm
Prepare peptide annotation from long format data frame | create_peptide_annotation
Convert date/time to POSIXct and rank samples by it | date_to_sample_order
Convert date/time to POSIXct | dates_to_posix
Defining sample order internally | define_sample_order
Example multi-center DIA LFQ E. coli proteomics (DIA-NN) | example_ecoli_data
Peptide annotation data | example_peptide_annotation
Example protein data in long format | example_proteome
Example protein data in matrix | example_proteome_matrix
Sample annotation data version 1 | example_sample_annotation
Plotting peptide measurements | feature_level_diagnostics plot_iRT plot_peptides_of_one_protein plot_single_feature plot_spike_in plot_with_fitting_curve
Fit a non-linear trend (currently optimized for LOESS) | fit_nonlinear
Retrieve operation chain as vector or single string "combat_on_mediannorm_on_log" | get_chain
Access the operation log (structured) | get_operation_log
Guess factors if numeric columns were not provided | guess_factor_columns_if_needed
Handle factor columns that are duplicated in numeric_columns | handle_factor_numeric_overlap
Handle missing values in a data matrix | handle_missing_values
Long to wide data format conversion | long_to_matrix
Wide to long conversion | matrix_to_long
Data normalization methods | normalize normalize_data_df normalize_data_dm normalize_sample_medians_df normalize_sample_medians_dm quantile_normalize_df quantile_normalize_dm
Add a new level from an external matrix and link to an existing assay | pb_add_level
Aggregate features (e.g., peptide -> protein) and store as new level | pb_aggregate_level
Get current assay as LONG (via proBatch::matrix_to_long) | pb_as_long
Get an assay matrix (wide) | pb_as_wide
Convenience accessor for assay matrix by name/index (returns the 'intensity' assay) | pb_assay_matrix
Current (latest) assay name | pb_current_assay
Evaluate a pipeline and return the matrix, without storing | pb_eval
Apply `QFeatures` missing-data helpers to stored assays | pb_filterNA pb_infIsNA pb_missing_helpers pb_nNA pb_zeroIsNA
Pretty pipeline name derived from the assay | pb_pipeline_name
Allow to register/override steps at runtime (e.g., map "combat" -> proBatch::combat_dm) | pb_register_step
Compute a pipeline and optionally store only the final result | pb_transform
Visualise correlation matrix | plot_corr_matrix
Plot CV distribution to compare various steps of the analysis | plot_CV_distr
Plot the distribution (boxplots) of per-batch per-step CV of features | plot_CV_distr.df
Plot the heatmap of samples (cols) vs features (rows) | plot_heatmap_diagnostic plot_heatmap_diagnostic.default plot_heatmap_diagnostic.ProBatchFeatures
Plot the heatmap | plot_heatmap_generic plot_heatmap_generic.default plot_heatmap_generic.ProBatchFeatures
cluster the data matrix to visually inspect which confounder dominates | plot_hierarchical_clustering plot_hierarchical_clustering.default plot_hierarchical_clustering.ProBatchFeatures
Plot intensity density by missingness | plot_NA_density plot_NA_density.default plot_NA_density.ProBatchFeatures
Plot missing-value frequency distribution | plot_NA_frequency plot_NA_frequency.default plot_NA_frequency.ProBatchFeatures
Plot missing-value heatmap(s) | plot_NA_heatmap plot_NA_heatmap.default plot_NA_heatmap.ProBatchFeatures
plot PCA plot | plot_PCA plot_PCA.default plot_PCA.ProBatchFeatures
Create violin plot of peptide correlation distribution | plot_peptide_corr_distribution plot_peptide_corr_distribution.corrDF
Peptide correlation matrix (heatmap) | plot_protein_corrplot
Plot variance distribution by variable | plot_PVCA plot_PVCA.default plot_PVCA.ProBatchFeatures
plot PVCA, when the analysis is completed | plot_PVCA.df
Create violin plot of sample correlation distribution | plot_sample_corr_distribution plot_sample_corr_distribution.corrDF
Sample correlation matrix (heatmap) | plot_sample_corr_heatmap
Plot per-sample mean or boxplots for initial assessment | plot_boxplot plot_boxplot.default plot_boxplot.ProBatchFeatures plot_sample_mean plot_sample_mean.default plot_sample_mean.ProBatchFeatures plot_sample_mean_or_boxplot
Plot split violin plot (convenient to compare distribution before and after) | plot_split_violin_with_boxplot
prepare the weights of Principal Variance Components | prepare_PVCA_df prepare_PVCA_df.default prepare_PVCA_df.ProBatchFeatures
proBatch: A package for diagnostics and correction of batch effects, primarily in proteomics | proBatch-package proBatch
Construct a ProBatchFeatures object from a wide matrix + sample annotation. | ProBatchFeatures
Construct from LONG df via proBatch::long_to_matrix | ProBatchFeatures_from_long
ProBatchFeatures: QFeatures subclass with operation log, levels/pipelines, and lazy storage | ProBatchFeatures-class
Generate colors for sample annotation | sample_annotation_to_colors sample_annotation_to_colors.default sample_annotation_to_colors.ProBatchFeatures
Functions to log transform raw data before normalization and batch correction | log_transform_df log_transform_dm log_transform_dm.default log_transform_dm.ProBatchFeatures transform_raw_data unlog_df unlog_dm unlog_dm.default unlog_dm.ProBatchFeatures
Warn about unmapped columns | warn_unmapped_columns
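
The help topics above mention log transformation, sample-median normalization, and an operation chain written as a single string such as "combat_on_mediannorm_on_log". The base-R sketch below shows one plausible reading of those ideas on a toy matrix. It does not call any proBatch function. The step labels, the offset of 1, and the reversed "_on_" naming are assumptions inferred from the example string, not the package's documented behaviour.

```r
# Toy raw intensity matrix: features in rows, samples in columns.
raw <- matrix(c(100, 200, 400,    # sample s1
                150, 300, 600,    # sample s2
                 80, 160, 320),   # sample s3
              nrow = 3,
              dimnames = list(paste0("pep", 1:3), paste0("s", 1:3)))

steps <- character(0)  # hypothetical record of the operations applied so far

# Step 1: log2 transform (the offset of 1 is an illustrative choice).
log_m <- log2(raw + 1)
steps <- c(steps, "log")

# Step 2: shift each sample so that its median equals the global median.
sample_medians <- apply(log_m, 2, median, na.rm = TRUE)
global_median <- median(log_m, na.rm = TRUE)
norm_m <- sweep(log_m, 2, sample_medians - global_median)
steps <- c(steps, "mediannorm")

# Name the result by reading the steps from latest to earliest.
chain_name <- paste(rev(steps), collapse = "_on_")
chain_name
```

Adding a third step labelled "combat" to `steps` would give the string quoted in the `get_chain` help title under the same naming assumption.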