MiscMetabar 0.16.8

accu_plot(), dist_pos_control(), hill_curves_pq() and tsne_pq() no longer error under recent R-devel: the otu_table is now coerced to a base matrix before being passed to vegan functions (renyi(), renyiaccum(), specaccum(), vegdist()), which previously triggered an assignment of an object of class "numeric" is not valid for @'.Data' error because plain as.matrix() does not strip the S4 otu_table class.
Rarefaction across the package is now performed by an internal R-version-robust reimplementation rather than phyloseq::rarefy_even_depth(), whose replace = FALSE code path errors with invalid 'length.out' value under recent R-devel (phyloseq issue #1753). The reimplementation is bit-identical to phyloseq::rarefy_even_depth() for the same seed, depth and replace value (and is more correct in the degenerate case where a retained sample has a single read). This affects rarefy_pq(), adonis_pq(), adonis_rarperm_pq(), hill_test_rarperm_pq(), hill_pq(), biplot_pq(), ggvenn_pq(), upset_pq(), ggaluv_pq() and ggscatt_pq().
rarefy_pq() gains a replace argument (default FALSE, sampling without replacement) and accepts seed = FALSE to leave the random number generator untouched, mirroring phyloseq::rarefy_even_depth().
Further check-time reductions for CRAN compliance: additional examples are now wrapped in \dontrun{} (kept for documentation, not run during checks).
Added file-level skip_on_cran() to some heavy test files.
Example speed-ups for some functions.
Fix several bugs with Windows paths by quoting system call arguments with shQuote().
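The shQuote() fix above can be illustrated with base R alone. This is a minimal sketch rather than the package's own code: the path and the command line are hypothetical, and only the quoting behaviour of shQuote() is shown.

```r
# Hypothetical Windows-style path containing spaces (not taken from the package).
fastq_path <- "C:/Users/My Name/reads/sample 1.fastq"

# Pasted as-is, the shell would split the path at every space.
unsafe_cmd <- paste("cutadapt", fastq_path)

# shQuote() wraps the argument so that it reaches the program as one token.
# type = "cmd" follows the Windows convention, type = "sh" the POSIX one.
quoted_cmd <- shQuote(fastq_path, type = "cmd")
quoted_sh <- shQuote(fastq_path, type = "sh")

safe_cmd <- paste("cutadapt", quoted_cmd)
print(safe_cmd)
```

When type is omitted, shQuote() picks the convention of the current platform, which is usually what a system call needs.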

MiscMetabar 0.16.6

verify_tax_table() is now ~10× faster on full-size taxonomy tables.
divent_hill_matrix_pq() no longer recomputes the per-sample positive-subset (x <- x[x > 0]) once per Hill order. The loop is now sample-outer / q-inner, so each row is sliced once. Numeric output is bitwise-identical. Speeds up every Hill-diversity computation in the package: hill_pq(), hill_bar_pq(), hill_tuckey_pq(), profile_hill_pq(), psmelt_samples_pq(), plot_refseq_extremity_pq(), and the *_rarperm_pq family.
circle_pq() replaces a nested pbapply(., 2, pbtapply(., group, sum)) over the OTU table with two rowsum() calls. On data_fungi (1420 taxa × 185 samples) the example dropped from ~18 s to ~1.8 s (≈ 10× faster). Output unchanged.
format2dada2(fasta_db = …), hill_acc_pq(type = "sample"), adonis_rarperm_pq() are also faster.
New pkgdown article: vignettes/articles/timing.Rmd documents wall-clock cost of the main functions on data_fungi and data_fungi_mini, with a CSV refreshed by inst/benchmark/function_timings.R.
Pkgdown articles use fewer permutations / simulations to keep the site build under a few minutes.
Reduced R CMD check time to stay within CRAN’s 10-minute budget. Examples for verify_tax_table(), adonis_pq(), plot_SCBD_pq(), multipatt_pq(), hill_pq(), plot_tsne_pq(), upset_test_pq(), summary_plot_pq(), ggvenn_pq(), plot_refseq_pq(), plot_seq_ratio_pq(), plot_refseq_extremity_pq(), glmutli_pq(), adonis_rarperm_pq(), lefser_pq(), var_par_pq(), var_par_rarperm_pq(), taxa_only_in_one_level(), distri_1_taxa(), accu_plot_balanced_modality(), multi_biplot_pq(), tax_bar_pq(), plot_var_part_pq(), track_wkflow(), reorder_taxa_pq() and transform_pq() now use data_fungi_mini (137 samples × 45 taxa) instead of the full data_fungi (185 samples × 1420 taxa), keeping behaviour identical while running much faster.
hill_acc_pq(), iNEXT_pq(), format2dada2() and hill_test_rarperm_pq() examples moved from \donttest{} to \dontrun{}. These functions are inherently CPU-bound (sample-based accumulation, fasta reformatting, permutation × rarefaction × q-loop) and were the largest individual contributors to the 10-min CRAN budget. Their behaviour is documented in the corresponding vignettes.
verify_pq() example switched from data_fungi to data_fungi_mini (82 s → < 5 s).
Tests: plot_LCBD_pq() / LCBD_pq() smoke tests in tests/testthat/test_figures_beta_div.R lowered nperm from 100 to 9 (they only assert return class, not numeric stability).
Function defaults (nperm, n_permutations) are unchanged.
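The circle_pq() speed-up described above relies on rowsum() doing a grouped sum in a single vectorised call. The following base-R sketch compares it with a per-taxon tapply() on a toy matrix. The object names are hypothetical, and the code illustrates the idea only; it is not the package's implementation.

```r
set.seed(42)

# Toy OTU table: 6 taxa (rows) x 4 samples (columns).
otu <- matrix(
  rpois(6 * 4, lambda = 5),
  nrow = 6,
  dimnames = list(paste0("taxon", 1:6), paste0("s", 1:4))
)

# Hypothetical grouping of the four samples into two modalities.
sample_group <- c("A", "A", "B", "B")

# Slow pattern: one tapply() call per taxon.
slow <- t(apply(otu, 1, function(x) tapply(x, sample_group, sum)))

# Fast pattern: rowsum() sums the rows of t(otu), i.e. the samples, by group.
fast <- t(rowsum(t(otu), group = sample_group))

# Both approaches yield the same taxa x group matrix of summed counts.
stopifnot(all(slow == fast))
```

rowsum() sorts the groups by default, which matches the ordering produced by tapply() here.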

MiscMetabar 0.16.5

funguild_assign() and rotl_pq() examples now use \dontrun{} instead of \donttest{}. Both examples call external APIs (www.stbates.org and the Open Tree of Life respectively) that are not always reachable during CRAN’s --run-donttest check, causing spurious ERRORs.
verify_tax_table()’s introductory example was moved inside the existing \donttest{} block. The call against the full data_fungi dataset took ~70 s, which triggered the CRAN “examples > 5 s” NOTE on every check.
XVector removed from DESCRIPTION Imports. It was declared but never imported in NAMESPACE or used directly; Biostrings already loads it transitively. CRAN flagged this as “Namespace in Imports field not imported from”.
Bibliography: corrected the DOI for Taberlet et al. (2012) “Environmental DNA” in paper/bibliography.bib, paper/paper.bib, and the two vignettes/*.bib files (was the journal ISSN landing 10.1002/(issn)2637-4943, now the paper DOI 10.1111/j.1365-294X.2012.05542.x). README.md and the pkgdown site regenerate accordingly.

MiscMetabar 0.16.3

verify_tax_table() now recognises non-breaking space (U+00A0) and other Unicode separators (em space, ideographic space, …) as border / internal whitespace. Previously the detection regex ^\s|\s$ (TRE) and the stripping call trimws() only handled ASCII [ \t\r\n], so taxonomic values padded with NBSP — common in spreadsheet- or copy-paste-derived metadata — were silently kept as e.g. "Archaeospora ", causing duplicate genera and broken grouping downstream. Detection now uses grepl("^[\\s\\p{Z}]|[\\s\\p{Z}]$", val, perl = TRUE) and stripping uses gsub("^[\\s\\p{Z}]+|[\\s\\p{Z}]+$", "", val, perl = TRUE). Both clean_pq(..., tax_remove_border_spaces = TRUE) and clean_pq(..., tax_remove_all_space = TRUE) benefit from the fix.
verify_tax_table() gains a new check for invisible / unusual characters in taxonomic values: anything in Unicode category \p{C} (control / format / surrogate / private use / unassigned) or any \p{Z} separator other than plain ASCII space or tab. Typical offenders are NBSP (U+00A0), zero-width space (U+200B), zero-width joiner (U+200D) and C0 control characters. Three new parameters drive the check: detect_invisible_chars (default TRUE, warns when verbose = TRUE), replace_invisible_chars (default FALSE, requires modify_phyloseq = TRUE to strip), and invisible_chars_replacement (default ""). Warnings/messages report each offending value with the hexadecimal code points of the offending characters so the user can see what is hiding inside the string.
clean_pq() gains tax_replace_invisible_chars (default FALSE) which forwards to verify_tax_table() and strips invisible characters from the cleaned tax_table.
CRAN resubmission. Fixes the incoming-checks failure reported for 0.16.2: write_pq() no longer passes a DNAStringSet refseq slot directly to utils::write.table() — sequences are now coerced via as.character() first. This avoids dispatching to as.data.frame,XStringSet-method from R-devel’s data.frame(), which now forwards an internal validRN = FALSE argument that the XStringSet method’s .local does not accept.
Biostrings is now an Imports (moved from Suggests), so that the XVector classes stored in data/data_fungi*.rda are covered by MiscMetabar’s recursive strong dependency graph.
Replaced an unreachable ggstatsplot link in NEWS.md (www.indrapatil.com) with the CRAN page.
clean_pq() gains four FALSE-by-default toggles to apply verify_tax_table() modifications on the cleaned tax_table: remove_border_spaces (trim leading/trailing whitespace), remove_all_space (replace internal whitespace via replace_space_with, default "_"), replace_to_NA (set values matching unwanted_tax_patterns to NA; accepts a custom pattern vector), and redundant_suffix (drop redundant "_sp" tips where the genus is already filled; accepts a custom suffix string such as "_var"). Toggles can be enabled independently or combined in a single call; each modification emits a message and nothing fires when all toggles are FALSE.
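The whitespace handling described for verify_tax_table() can be reproduced on a toy vector. The sketch below reuses the two regular expressions quoted above; the taxonomic values are made up, and the code-point report is only one possible way to display hidden characters, not necessarily the format the package prints.

```r
# Toy taxonomic values: trailing NBSP (U+00A0), leading em space (U+2003), clean value.
tax_values <- c("Archaeospora\u00a0", "\u2003Glomus", "Rhizophagus")

# trimws() with its default whitespace class [ \t\r\n] leaves both characters in place.
ascii_trimmed <- trimws(tax_values)

# Detection with the Unicode separator category \p{Z} (requires perl = TRUE).
has_border_space <- grepl("^[\\s\\p{Z}]|[\\s\\p{Z}]$", tax_values, perl = TRUE)

# Stripping with the same character class.
unicode_trimmed <- gsub("^[\\s\\p{Z}]+|[\\s\\p{Z}]+$", "", tax_values, perl = TRUE)

# Hexadecimal code points of NBSP and zero-width space, for reporting.
code_points <- sprintf("U+%04X", utf8ToInt("\u00a0\u200b"))

print(has_border_space)
print(unicode_trimmed)
print(code_points)
```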

MiscMetabar 0.16.2

MiscMetabar 0.16.1

cutadapt_remove_primers() gains a cutadapt_args parameter (default "") to pass additional arguments directly to cutadapt, such as "-e 0.01" to lower the maximum error rate from the cutadapt default of 10% to 1%.

MiscMetabar 0.15.2

hill_test_rarperm_pq(): fixed default type from "non-parametrique" to "nonparametric" to match the documented valid values and avoid confusion.
hill_test_rarperm_pq(): fixed example that incorrectly passed p.val = 0.9 (not a valid parameter); it now uses p_val_signif = 0.9 as intended.
ggstatsplot 1.0.0 compatibility notes: ggstatsplot 1.0.0 removes var.equal, nboot, and effsize.type from ggbetweenstats(); if you were passing these through ... to ggbetween_pq() or hill_test_rarperm_pq(), they will now be silently ignored. The palette argument now requires "package::palette" format (e.g. palette = "ggthemes::gdoc"), and the separate package argument has been removed from ggstatsplot.
hill_bar_pq() gains five parameters: error_fun (a function returning c(lower, upper) bounds, enabling asymmetric intervals such as quantile ranges; default mean ± SE), error_fun_lab (caption label; default "mean ± SE"), error_bar_alpha (transparency of the secondary top-half error bar drawn over jittered points; default 0.35), point_alpha (transparency of jittered data points; default 0.7), and letters_below_bar (when TRUE, compact letters are placed below the x-axis at a fixed position, giving a clean layout independent of data spread; default FALSE). Groups with NA values in the grouping variable now receive "n.d." letters when Tukey HSD is run, instead of being silently dropped.
umap_pq() no longer emits a tibble .name_repair deprecation warning when using pkg = "umap" (fixes #134).
hill_bar_pq() new function plotting Hill diversity bar charts (mean ±SE, jittered points, Kruskal-Wallis subtitle, optional Tukey HSD compact letter display) for one or multiple Hill orders via a patchwork layout.
tax_bar_pq() fixes a bug where nb_seq = FALSE with a grouping fact would sum binary per-sample presence values across samples sharing the same modality, inflating bar heights beyond the true OTU count. Each OTU is now counted at most once per group (present in ≥1 sample of that group), so bar segments correctly show the number of distinct OTUs in each taxonomic rank per modality.
tax_bar_pq() gains a n_sample_text_size parameter (default 2) controlling the font size of the per-group sample count label. The (n=X) annotation is now displayed below each bar rather than appended to the group x-axis label.
New transformation/normalisation functions collected in R/normalize_pq.R, documented in a new article (articles/normalization.html).
css_pq() new function wrapping metagenomeSeq::cumNorm() for Cumulative Sum Scaling normalization.
gmpr_pq() new function implementing the Geometric Mean of Pairwise Ratios normalization (Chen et al. 2018) in pure R.
mcknight_residuals_pq() new function computing depth-robust alpha diversity as residuals of log-richness on log-depth (McKnight 2018; Mikryukov 2023).
rarefy_pq() new function wrapping phyloseq::rarefy_even_depth() with optional averaging over n rarefaction repetitions.
srs_pq() new function wrapping SRS::SRS() for Scaling with Ranked Subsampling normalization.
tmm_pq() new function wrapping edgeR::calcNormFactors(method = "TMM") for Trimmed Mean of M-values normalization.
transform_pq() new function providing a unified interface to common count transformations (tss, hellinger, clr, rclr, log1p, z, pa, rank) via vegan::decostand().
vst_pq() new function wrapping DESeq2::varianceStabilizingTransformation().
biplot_pq() gains a color_rank parameter (default NULL): when set to a taxonomic rank (e.g. "Class"), bars are colored by that rank instead of by sample modality, giving a taxonomic-composition view of the biplot. The fill legend is automatically titled with the rank name.
biplot_pq() gains a taxa_names_rank parameter (default NULL): when set to a taxonomic rank (e.g. "Genus"), the taxon axis labels display that rank instead of taxa_names(). Each OTU remains a separate bar regardless of shared rank values.
biplot_pq() no longer displays “Samples” on the taxon axis; the position used for the modality name annotations is now unlabeled.
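The tax_bar_pq() counting fix described in this release can be pictured with a small presence/absence matrix. This is a sketch under simplifying assumptions (a single taxonomic rank, hypothetical object names), not the function's actual code.

```r
# Toy presence/absence table: 3 OTUs (rows) x 4 samples (columns).
pa <- matrix(
  c(1, 1, 0, 0,
    1, 0, 0, 1,
    0, 0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("otu", 1:3), paste0("s", 1:4))
)

# Hypothetical modality of each sample.
modality <- c("A", "A", "B", "B")

# Summing presences per modality counts otu1 twice in group A.
summed <- t(rowsum(t(pa), group = modality))
inflated_counts <- colSums(summed)

# Counting each OTU at most once per group (present in >= 1 sample).
distinct_counts <- colSums(summed > 0)

print(inflated_counts)
print(distinct_counts)
```

With this toy table, each group contains two distinct OTUs, while the summed presences report three.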

MiscMetabar 0.15.1

New features

unwanted_tax_patterns is a new exported named character vector of regex patterns for common problematic taxonomy values (NA-like strings, "unclassified", "unknown", "Incertae_sedis", empty QIIME-style ranks, etc.). verify_tax_table() now uses it as the default for replace_to_NA, and other pqverse packages (e.g. dbpq::count_unwanted_tax()) can reuse it to keep patterns in sync.

Breaking changes

compare_pairs_pq(), ggbetween_pq(), hill_pq(), hill_tuckey_pq(), plot_refseq_extremity_pq(), and psmelt_samples_pq() now use divent::div_hill() instead of vegan::renyi() for Hill number computation, and compare_pairs_pq() uses divent::ent_shannon() / divent::ent_simpson() instead of vegan::diversity() for Shannon and Simpson indices. The default estimator is now "UnveilJ" (bias-corrected) rather than the naive plug-in estimator — diversity values will differ from previous versions. Pass estimator = "naive" via ... to restore old numeric behavior.
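For readers unfamiliar with the plug-in estimator mentioned above, the following base-R sketch computes naive Hill numbers from observed relative abundances. The function hill_naive() is hypothetical and written for illustration only; it is neither the divent implementation nor the code used by MiscMetabar.

```r
# Naive (plug-in) Hill number of order q for a vector of counts.
hill_naive <- function(abundances, q) {
  x <- abundances[abundances > 0]  # keep observed taxa only
  p <- x / sum(x)                  # relative abundances
  if (q == 1) {
    exp(-sum(p * log(p)))          # limit case: exponential of Shannon entropy
  } else {
    sum(p^q)^(1 / (1 - q))
  }
}

# A perfectly even community of 4 taxa has Hill number 4 at every order.
even_counts <- c(10, 10, 10, 10)
even_hill <- vapply(c(0, 1, 2), function(q) hill_naive(even_counts, q), numeric(1))

# An uneven community keeps richness (q = 0) but loses diversity at higher orders.
uneven_counts <- c(97, 1, 1, 1)
uneven_hill <- vapply(c(0, 1, 2), function(q) hill_naive(uneven_counts, q), numeric(1))

print(even_hill)
print(uneven_hill)
```

Bias-corrected estimators adjust these values for unobserved taxa, which is why results change relative to the plug-in formula.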

New features

divent_hill_matrix_pq() new exported utility to compute Hill numbers for all samples in an OTU table using divent::div_hill(). Accepts ... to forward any argument to divent::div_hill().
ggbetween_pq() gains a q parameter (default c(0, 1, 2)) to control which Hill diversity orders are computed. One plot is produced per value.
hill_acc_pq() gains a type parameter ("individual" or "sample"). type = "sample" computes sample-based accumulation curves by pooling samples incrementally across random permutations using divent::div_hill(), with a confidence ribbon. When merge_sample_by is set, one curve per group is drawn on the same plot. type = "individual" preserves the previous individual-based behaviour.
profile_hill_pq() new function wrapping divent::profile_hill() |> autoplot() to visualize Hill diversity profiles across all orders for all samples in a phyloseq object.

Deprecated

The hill_scales parameter in hill_pq(), hill_tuckey_pq(), and psmelt_samples_pq() is deprecated in favour of q. Use q = c(0, 1, 2) going forward.

MiscMetabar 0.14.6

Add find_vsearch() and install_vsearch() to make vsearch-based functions work on all platforms including Windows. install_vsearch() downloads the vsearch binary from GitHub, and find_vsearch() automatically locates it. All vsearch-calling functions now default to find_vsearch() instead of a hard-coded "vsearch" path. Users can also set options(MiscMetabar.vsearchpath = "/path/to/vsearch") for custom installations.
Add ridges_sam_pq(), the sample-centric counterpart of ridges_pq(): each ridge represents a taxon (at a given taxonomic level) and the x-axis shows the abundance distribution across samples, colored by a sample factor.
Add parameter output_data_frame to function track_wkflow_samples().
cutadapt_remove_primers() gains a verbose parameter (default TRUE). Set verbose = FALSE to fully silence cutadapt stdout/stderr and the completion message — unlike suppressMessages() or capture.output(), which cannot intercept system command output.
Fix a bug in chimera_removal_vs() where matrix dimensions were dropped when the input had only one sample (one row), causing downstream [, ...] indexing to fail with “incorrect number of dimensions”. All three subsetting branches now use drop = FALSE.
Many functions accepting a fact parameter now handle single-level factors gracefully: functions that require multiple groups (hill_pq(), hill_test_rarperm_pq(), graph_test_pq(), multipatt_pq(), ancombc_pq(), ggbetween_pq(), venn_pq(), ggvenn_pq(), upset_pq(), accu_plot(), accu_plot_balanced_modality(), plot_tsne_pq()) now emit an informative error message, while functions that can produce meaningful output with a single level (circle_pq(), sankey_pq(), are_modality_even_depth()) no longer crash.
Fix a bug in format2sintax() where the pattern_tax parameter was referenced by the wrong internal name (pattern_k), causing an error when using the taxnames argument.
Add reorder_distinct_colors() to reassign fill and color scales in ggplot objects so that adjacent segments have maximally different colors, with optional colorblind optimization and lightness alternation.
tax_bar_pq() gains show_values and minimum_value_to_show parameters to display abundance values (or percentages when percent_bar = TRUE) inside bar segments.
treemap_pq() now uses log10(x + 1) instead of log10(x) so that taxa with a count of 1 are still visible. New parameters show_na (default TRUE) to display NA taxa as a grey area, na_label to customize the NA label, and min_text_size (default 0) to control the minimum font size for tile labels.
biplot_pq() gains split_by_sample, sample_border_col, and sample_border_width parameters. When split_by_sample = TRUE, bars are stacked by sample with visible borders, showing the distribution of sequences across individual samples instead of a merged total.
Add two parameters to tax_bar_pq(): bar_internal_color to color each cell of the colored bars and linewidth_bar_internal to set the line width.
tax_bar_pq() with label_taxa = TRUE now also draws left-side labels for taxa that appear in the first bar but are absent from the last bar, making all taxa visible when using add_ribbon = TRUE across a time factor. A warning is emitted when taxa only appear in intermediate levels and cannot be labelled on either side.
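The single-sample failure fixed in chimera_removal_vs() comes from a general base-R behaviour: subsetting a one-row matrix silently drops its dimensions unless drop = FALSE is given. A minimal sketch with hypothetical object names:

```r
# One-sample sequence table: 1 row (sample) x 3 columns (sequences).
seqtab <- matrix(
  c(12L, 0L, 7L),
  nrow = 1,
  dimnames = list("sample1", c("seqA", "seqB", "seqC"))
)

# Hypothetical logical vector marking the non-chimeric sequences to keep.
keep <- c(TRUE, FALSE, TRUE)

# Default subsetting collapses the result to a named vector.
dropped <- seqtab[, keep]

# drop = FALSE preserves the matrix structure, so later [, ...] indexing still works.
kept <- seqtab[, keep, drop = FALSE]

print(dim(dropped))
print(dim(kept))
```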

MiscMetabar 0.14.5

Bug fix in normalize_prop_pq() when taxa_are_rows(physeq) was FALSE.
Improve the verify_pq() function for cases where taxa_names or sample_names are not consistent and to test for duplicate sequences in @refseq slot.
Add a function verify_tax_table() to verify some classic issues in tax_table.
Fix a bug in aldex_pq() and plot_ordination_pq(). Also fix a bug in plot_ordination_pq() when using phyloseq object where taxa are rows.
Add parameters show_count, facet_by, growing_text and text_size to treemap_pq(): show_count appends raw abundance counts to labels, facet_by splits the treemap into facets by a sample metadata column, and growing_text=FALSE forces all tile labels to the same font size (determined by text_size).
Extend track_wkflow_samples() to accept all input types supported by track_wkflow(): matrix, dada-class, derep-class, lists of dada/derep, and character vectors of fastq file paths (previously only phyloseq objects were accepted).
Fix a bug when the @sam_data slot has only one column.
Fix a bug in the name of plot in the result of hill_pq()
Fix a bug in mumu_pq() not deleting temporary file log.txt when keep_temporary_files=FALSE
Fix a bug in adonis_pq() when using na_remove = TRUE and multiple terms in formula.
Add parameter by to adonis_pq() to choose how to compute p-values (overall model, sequential terms, marginal effects, one-degree-of-freedom contrasts). The default is now by = "terms", which assesses significance for each term.
Add function lefser_pq() to run LEfSe analysis (differential analysis) from a phyloseq object using the package lefser.
Add function aldex_pq() to run ALDEx2 analysis (differential analysis) from a phyloseq object using the package ALDEx2 and the default parameter gamma = 0.5.
Add the parameter rngseed in all functions which used phyloseq::rarefy_even_depth to set the seed for random number generator in order to increase reproducibility.
filter_asv_blast() now emits a message instead of an error when the resulting OTU table is empty.
Improve ancombc_pq() function by allowing custom names in the tax_levels parameter.
Fix a bug in filt_taxa_pq when using both min_nb_seq and min_nb_occurence parameters.

MiscMetabar 0.14.4

New features and improvements

Add function plot_seq_ratio_pq() to explore the number of sequences per sample using the difference ratio of the number of sequences per sample, with samples ordered by their number of sequences.
Add params discard_genus_alone, pattern_to_remove_tip and pattern_to_remove_node to rotl_pq() to enhance the default naming of nodes and tips
Improve documentation consistency following the style guide
Allow DNAStringSet object as input of swarm_clustering() and physeq_or_string_to_dna()
Add param rank_propagation in merge_taxa_vec() to disable the rank propagation of NA when merging taxa. It is useful when merging taxa with information in the tax_table slot that does not follow a strict taxonomic hierarchical structure (e.g. functional guilds).
Add param lulu_exact in mumu_pq() to force the use of the unmodified lulu algorithm (with possible errors) thanks to the option --legacy in mumu software. Add param extra_mumu_args to mumu_pq() to pass extra arguments to mumu software (--minimum_match, --minimum_ratio_type, --minimum_ratio, --minimum_relative_cooccurence, --threads).
Add function plot_ordination_pq to plot ordination from vegan::vegdist object (useful when using aitchison and robust aitchison distances)

Bug fixes

Fix a bug in subset_taxa_pq() when the condition was TRUE only for one taxon
Fix warnings in graph_test_pq() with ggplot2 v.4.0.0
Fix a bug in upset_pq() when using min_nb_seq parameter.
Fix a bug in the blast functions: values equal to the thresholds id_cut, bit_score_cut, min_cover_cut and e_value_cut are now accepted (the comparison was previously strict).
Fix a bug in swarm associated functions (swarm_clustering(), add_swarms_to_pq()) to take into account the d parameter. Also add a parameter fastidious that is automatically set to FALSE if d is different from 1.

BREAKING CHANGE

Replace species_colnames by taxonomic_ranks in rotl_pq()
Parameter name changes in plot_mt() and krona()
- plot_mt(): alpha → pval (aligns with existing pval pattern in other functions)
- krona(): file → file_path (aligns with existing file_path pattern)

MiscMetabar 0.14.3

Better message in subset_taxa_tax_control()
Add parameters text_size and text_size_info to expand or minimize text annotation in summary_plot_pq().
Add function filt_taxa_wo_NA() to filter out taxa with NA values at given taxonomic rank(s)
Fix a bug in format2dada2() by adding semicolons to fill all the taxonomic levels if from_sintax is TRUE
Fix a bug in adonis_pq() for method aitchison and robust.aitchison.

MiscMetabar 0.14.2

Minor bug fix for CRAN resubmission

MiscMetabar 0.14.1

Add the possibility to resolve conflicts using resolve_vector_ranks() in the assign_sintax() function.
Add numerous parameters to assign_sintax(), in particular vote_algorithm to choose the algorithm used to resolve conflicts.
Add param pattern_to_remove in format2dada2()

MiscMetabar 0.14.0

Better filter of parameters in add_new_taxonomy_pq(). Only parameters used by the assign_* function corresponding to method are used.
Add functions format2sintax(), format2dada2() and format2dada2_species to format fasta database in sintax, dada2 (dada2::assignTaxonomy()) and dada2 Species (dada2::assignSpecies()) format
Add function assign_dada2() to assign taxonomy (with missing ranks if needed) and to assign species using dada2::assignSpecies() with only one database input. Add method dada2_2steps in function add_new_taxonomy_pq(), which uses the assign_dada2() function.

MiscMetabar 0.13.0

Add function assign_blastn() and add a method blast in the function add_new_taxonomy_pq().
Add function resolve_vector_ranks() to resolve conflict in a vector of taxonomy values

MiscMetabar 0.12.1

Add parameter name min_bootstrap in add_new_taxonomy_pq()
Bug fix in assign_idtaxa()
Add parameters pattern_to_remove and remove_NA to simplify_taxo()

MiscMetabar 0.12.0

Add functions assign_idtaxa() and learn_idtaxa() to facilitate taxonomic assignment using the idtaxa algorithm from the DECIPHER R package.
Add option idtaxa to method in add_new_taxonomy_pq()
Add function tbl_sum_taxtable() to summarize tax_table from a phyloseq object
In function assign_sintax(), add params too_few (default "align_start") and too_many (default "merge") to allow databases with a variable number of ranks and parentheses in taxonomic names.

MiscMetabar 0.11.1

Add param suffix to add_blast_info(), allowing multiple uses of the function on the same phyloseq object (e.g. in order to use different databases)
Add param return_DNAStringSet to write_temp_fasta() function to return a DNAStringSet object in place of a temporary file.
Add a vignette pkgnet-report.
Add the possibility to send fasta.gz file to count_seq()

MiscMetabar 0.11

Add function filt_taxa_pq() to filter taxa based on the number of sequences/occurences
Add functions no_legend() and hill_curves_pq() to plot hill diversity accumulation curves for phyloseq
Add function umap_pq() to compute Dimensionality Reduction with UMAP
Add function plot_complexity_pq() to plot kmer complexity of reference sequences of a phyloseq object
Add param type to ridges_pq() to plot a cumulative version of the ridges (type = "ecdf")
Introduce the idea of a pq-verse: other packages will complement the MiscMetabar package to make package maintenance easier. The comparpq package (https://github.com/adrientaudiere/comparpq) will facilitate the comparison of phyloseq objects with different taxonomies, different clustering methods, different samples with the same modality or different primers.
Add functions assign_vsearch_lca(), assign_sintax() and internal function write_temp_fasta()
Add param method to add_new_taxonomy_pq() to allow the use of dada2::assignTaxonomy() (default, previously the only available method), assign_sintax() or assign_vsearch_lca()

MiscMetabar 0.10.4

Add functions plot_refseq_pq() and plot_refseq_extremity_pq() to plot the proportion of each nucleotide and the diversity of nucleotides from @refseq of a phyloseq object.

MiscMetabar 0.10.3

Add params type, na_remove and verbose to ggvenn_pq(). Setting type = "nb_seq" plots the Venn diagram with the number of shared sequences instead of shared ASVs.
Add automatic report in json for the function cutadapt_remove_primers().
Add param verbose to track_wkflow() and improve examples for track_wkflow() and list_fastq_files

MiscMetabar 0.10.2

Improve code thanks to {lintr} package
Add option return_file_path to cutadapt_remove_primers() in order to facilitate targets pipeline
Add function sam_data_matching_names() to match and verify congruence between fastq files names and sample metadata (sam_data)

MiscMetabar 0.10.1

CRAN 2024-09-10

Delete function heat_tree_pq() because {metacoder} package is archived from CRAN.

MiscMetabar 0.9.4

Set a seed in the example of build_tree_pq() to resubmit to CRAN
Add a param return_a_vector in function filter_trim() to make it possible to return a vector of paths, which is useful with targets::tar_target(..., format = "file")
Improve list preallocation by replacing list() with vector("list", ...)

MiscMetabar 0.9.3

CRAN 2024-09-09

Homogenize terminology replacing ASV by taxa/taxon in documentation and code
Build an alias function filter_taxa_blast() for filter_asv_blast()
Build an alias function postcluster_pq() for asv2otu()
Add param return_data_for_venn in function ggvenn_pq in order to make more customizable plot following ggVennDiagram tutorial

BREAKING CHANGES

Replacing misnamed param rename_asv by rename_taxons in clean_pq()
Replacing misnamed param reorder_asv by reorder_taxons in clean_pq()

MiscMetabar 0.9.2

Add param default_fun in function merge_samples2() in order to replace the default function that changes the sample data in case of merging. A useful value is default_fun = diff_fct_diff_class.
Add param kruskal_test to hill_pq() function to prevent users from misinterpreting Tukey HSD results (and letters) if the global effect of the tested factor on Hill diversity is not significant.
Add param vioplot to hill_pq() function to allow violin plots instead of boxplots.
Modify rarefy_sample_count_by_modality() to fix the case of a modality with a level of length one.

MiscMetabar 0.9.1

CRAN 2024-04-28

New functions

Add functions taxa_as_rows() and taxa_as_columns() to replace verbose calls to clean_pq()
Add function ggscatt_pq() to plot and test for the effect of a numerical column in sam_data on Hill numbers. It is the equivalent for numerical variables of ggbetween_pq(), which focuses on the effect of a factor.
Add functions var_par_pq(), var_par_rarperm_pq() and plot_var_part_pq() to compute the partition of the variation of community and plot it. This introduces the notion of rarperm in function names: it refers to the fact that the function repeats the rarefaction by sample depth over several permutations to measure the variation due to the random process in rarefaction.
Add function hill_test_rarperm_pq() to test the effect of a factor on hill diversity accounting for the variation due to random nature of the rarefaction by sample depth.
Add function rarefy_sample_count_by_modality() to equalize the number of samples for each levels of a modality (factor)
Add function accu_plot_balanced_modality() to plot accumulation curves with balanced modality (same number of samples per level) and depth rarefaction (same number of sequences per sample)
Add function adonis_rarperm_pq() to compute multiple Permanova analyses on different sample depth rarefaction.
Add function ggaluv_pq() to plot taxonomic distribution in alluvial fashion with ggplot2 (using the ggalluvial package)
Add function glmutli_pq() to use automated model selection and multimodel inference with (G)LMs for phyloseq object

New parameters

Add param taxa_ranks in function psmelt_samples_pq() to group results by samples AND taxonomic ranks.
Add param q in functions hill_tuckey_pq() and hill_pq() to choose the level of the Hill number.
Add param na_remove in function hill_pq() to remove samples with NA in the factor fact.

MiscMetabar 0.8.1

Add param plot_with_tuckey to hill_pq().
Add function formattable_pq() to make beautiful tables of the distribution of taxa across a modality using visualizations inside the table.
Add functions fac2col() and transp() to facilitate manipulation of colors, especially in function formattable_pq()
Add functions signif_ancombc() and plot_ancombc_pq() to plot significant results from ancombc_pq() function
Add function distri_1_taxa() to summarize the distribution of one given taxon across the levels of a modality
Add function normalize_prop_pq() to implement the method proposed by McKnight et al. 2018
Add function psmelt_samples_pq() to build a data frame of sample information including the number of sequences (Abundance) and Hill diversity metrics. Useful with the ggstatsplot package (see examples).
Replace param variable by fact in function ggbetween_pq() and hill_pq() (keeping the variable option in hill_pq() for backward compatibility)
Fix a bug in the class of the object returned by chimera_removal_vs(). It now returns a matrix that can be passed on to dada2::getUniques()

MiscMetabar 0.7

CRAN 2024-03-08

Add functions chimera_detection_vs() and chimera_removal_vs() to process chimera detection and removal using vsearch software
Add functions filter_trim(), sample_data_with_new_names() and rename_samples() to facilitate the use of targets for bioinformatic pipeline.
Add function add_info_to_sam_data() to expand sam_data slot using a data.frame and using nb_asv and nb_seq
Add functions swarm_clustering() and vsearch_clustering() and add swarm method in the function asv2otu()
Add function physeq_or_string_to_dna() mostly for internal use
Add function cutadapt_remove_primers() to remove primers using cutadapt
Add internal functions is_swarm_installed(), is_cutadapt_installed(), is_vsearch_installed() and is_falco_installed() to test for the availability of external software in order to run examples and test from testthat.
Submit to CRAN and change code to comply with their rules (patch 0.7.1 to 0.7.9)
Numerous examples and tests are skipped on CRAN because they take too much time to run. The Rules vignette is updated to detail the strategy for this.

BREAKING CHANGES

Harmonization of parameters names:
- add_nb_sequences -> add_nb_seq in ggvenn_pq()
- db -> db_url in get_funguild_db()
- db -> db_funguild in get_funguild_db()
- file -> file_path in get_file_extension()
- n_seq -> nb_seq in subsample_fastq()
- otutable -> otu_table in lulu()
- alpha -> pval in plot_edgeR_pq() and plot_deseq2_pq() and change default value from 0.01 to more classical 0.05
- sequences -> seq2search in function search_exact_seq_pq()
- seq_names -> dna_seq in function asv2otu()
Remove the function install_pkg_needed(), which does not comply with CRAN policies

MiscMetabar 0.6.0

Add function ancombc_pq() to simplify the call to ANCOMBC::ancombc2(): ANalysis of COmpositions of Microbiomes with Bias Correction 2
Add param taxa_names_from_physeq (default FALSE) to subset_taxa_pq()
Add param rarefy_by_sample (default FALSE) to function ggbetween_pq()
Add function are_modality_even_depth() to test whether sample depth varies significantly among the modalities of a factor
Add functions merge_taxa_vec() and merge_samples2() from the speedyseq package into MiscMetabar to decrease package dependencies (Thanks to Mike R. Mclaren)
Add function reorder_taxa_pq() to replace the only call to package MicroViz, decreasing package dependencies.
Add functions get_funguild_db() and funguild_assign() from the FUNGuildR package into MiscMetabar to decrease package dependencies
Remove all dependencies on packages not available on CRAN or Bioconductor. Improve code using the goodpractice::gp() and devtools::check() functions
Add messages in various cases (NA in samples data, low number of sequences in samples, low number of sequences by taxa) when using verify_pq() with args verbose=TRUE
Fix a bug in multitax_bar_pq() when using nb_seq = FALSE

MiscMetabar 0.52

Add function ggbetween_pq() to facilitate comparison of Hill numbers using the power of ggstatsplot::ggbetweenstats()
Add function plot_SCBD_pq() to plot species contributions to beta diversity (SCBD) of samples

MiscMetabar 0.51

Add functions LCBD_pq() and plot_LCBD_pq() to compute, test and plot local contributions to beta diversity (LCBD) of samples
Add function tbl_sum_samdata() to summarize information from sample data in a table
Add function mumu_pq() to use mumu, a fast and robust C++ implementation of lulu.
Add a (mostly internal) function install_pkg_needed() to install packages (mostly those listed in Suggests in DESCRIPTION) when a function needs them.
Add functions add_funguild_info() and plot_guild_pq() to add and plot fungal guild information from taxonomy using FUNGuild package
Add function build_phytree_pq() to build 3 phylogenetic trees (NJ, UPGMA and ML using phangorn R package) from the refseq slot of a phyloseq object, possibly with bootstrap values. See the vignette Tree visualization for an introduction to tree visualization using ggtree R package.

MiscMetabar 0.5

Phyloseq objects are converted to taxa_are_columns in ggvenn_pq() thanks to issue #31

BREAKING CHANGES

Rename param log_10 in function biplot_pq() to log10trans
Rename param log10transform in function circle_pq() to log10trans

MiscMetabar 0.42

Add argument one_plot (default FALSE, same behavior as before) to the hill_pq() function in order to return a single ggplot2 object with the four plots inside.
Add argument correction_for_sample_size (default TRUE, same behavior as before) to the hill_pq() and hill_tuckey_pq() functions to allow removing any correction for uneven sampling depth.
Add function multitax_bar_pq() to plot 3 levels of taxonomy as a function of sample attributes
Add function ridges_pq() to plot ridges of one taxonomic level as a function of sample attributes
Add function treemap_pq() to plot a treemap of two taxonomic levels

MiscMetabar 0.41

Add function iNEXT_pq() to calculate Hill diversity using the iNEXT package.
Add argument pairs to multi_biplot_pq() in order to indicate all pairs of samples we want to print.
Improve compare_pairs_pq() with information about the number of shared sequences among pairs.
Add function upset_pq() to plot upset of phyloseq object using the ComplexUpset package.
Add function upset_test_pq() to test for differences between intersections (wrapper of ComplexUpset::upset_test() for phyloseq objects).
Add info (param add_info) in subtitle of the hill_pq() function.
Add argument remove_space to simplify_taxo() function.
Add argument simplify_taxo to clean_pq() function.
Replace argument rarefy_nb_seq with rarefy_before_merging and add arguments rarefy_after_merging and add_nb_seq to ggvenn_pq() function.
Add argument rarefy_after_merging to biplot_pq() and upset_pq() functions.
Add argument taxa_fill to upset_pq() function in order to fill the bar with taxonomic rank.
Add a function subsample_fastq() to subsample fastq files in order to test your pipeline with all samples but with a low number of reads.
Add a function accu_samp_threshold() to compute the number of sequences needed to obtain a given proportion of ASV in accumulation curves (accu_plot()).
Add a function tax_bar_pq() in order to plot taxonomic distribution across samples.

MiscMetabar 0.40

Add function multi_biplot_pq() to visualize a collection of pairs of samples for comparison through a list of biplot_pq().
Add options add_info, na_remove, and clean_pq to plot_tax_pq() function.
Add options vsearch_cluster_method and vsearch_args to asv2otu() for more detailed control of the vsearch software.
Remove buggy function MM_idtaxa().
Add a wrapper of write_pq() called save_pq() to save a phyloseq object in the three possible formats at the same time:
- 4 separate tables
- 1 table version
- 1 RData file
Add a function add_blast_info() to add information from blast_pq() to the tax_table slot of a phyloseq object.
Add option keep_temporary_files in asv2otu() function.
Improve the documentation of asv2otu() and fix a minor bug in the name of the conserved ASV after asv2otu().
Test coverage largely improved leading to numerous minor bug fixes.
Add function search_exact_seq_pq() to search for exact matching of sequences using complement, reverse and reverse-complement against a phyloseq object.
Add function add_new_taxonomy_pq() to add new taxonomic rank to a phyloseq object. For example to add taxonomic assignment from a new database.
Add a battery of tests using the testthat package and improve code compatibility with CRAN recommendations.

BREAKING CHANGES

asv2otu() with method="vsearch" changes two default values (to reproduce the previous behavior, use asv2otu(..., vsearch_cluster_method = "--cluster_fast", tax_adjust = 1)):
- vsearch_cluster_method = "--cluster_size"
- tax_adjust = 0

MiscMetabar 0.34

Add option add_nb_samples to ggvenn_pq(), which adds the number of samples to the level names in the plot. Useful to see imbalance in the number of samples among the factor’s levels.
Add options args_makedb and args_blastn to functions blast_pq(), blast_to_phyloseq(), blast_to_derep() and filter_asv_blast().
Add option rarefy_nb_seqs to ggvenn_pq() in order to rarefy samples before plotting.
Add function SRS_curve_pq() to plot scaling with ranked subsampling (SRS) curves using the SRS::SRS_curve() function (see citation("SRS") for reference).
Add option nb_samples_info to biplot_pq() in order to add the number of samples merged by level of factors.
Add a message when two modalities differ greatly (by more than a factor of 2) in their number of sequences in biplot_pq() and ggvenn_pq().
Add options na_remove, dist_method (including Aitchison and robust-Aitchison distance), correction_for_sample_size and rarefy_nb_seqs to adonis_pq() function.
Add option na_remove to graph_test_pq() function.

MiscMetabar 0.33

Add function plot_tax_pq() to plot taxonomic distribution (nb of sequences or nb of ASV) across factor.
Add option add_points and improve the axes of the hill_pq() function
Add function blast_to_derep() in order to facilitate searching some fasta sequences in dereplicated sequences (obtained by dada2::derepFastq())

Function	Database (makeblastdb)	Sequences to blast (blastn)
blast_to_phyloseq()	Built from ref_seq slot (physeq-class)	Custom fasta file
blast_to_derep()	Built from dereplicated sequences (derep-class)	Custom fasta file
blast_pq()	Custom database or custom fasta file	ref_seq slot of a physeq object

Add functions tsne_pq() and plot_tsne_pq() to quickly visualize results of the t-SNE multidimensional analysis based on the Rtsne::Rtsne() function.

MiscMetabar 0.32

Add the possibility to select a folder in the function count_seq()
Add functions track_wkflow_samples() and select_one_sample()
Add option sam_data_first in function write_pq()
Add options reorder_asv and rename_asv to functions write_pq() and clean_pq()
Add a function rotl_pq() to build a phylogenetic tree using the ASV binomial names of a physeq object and the Open Tree of Life tree.

MiscMetabar 0.31

Add argument split_by to ggvenn_pq() to make multiple plots according to a variable in the sam_data slot
Argument seq_names in asv2otu() function allows clustering sequences from a character vector of DNA sequences.
Add a blast_pq() function to blast the sequences of the @ref_seq slot against a custom database
Add a filter_asv_blast() function to filter ASV in phyloseq dataset using blast against a custom database
Add a subset_taxa_pq() function to filter ASV based on a named conditional vector. Used in filter_asv_blast().
Add parameter force_taxa_as_columns (default FALSE) and force_taxa_as_rows (default FALSE) to clean_pq().
Add a first version of the function count_fastq_seq() to count sequences from fastq.gz files directly from R.
Add taxonomic info to track_wkflow() function (parameter taxonomy_rank)

MiscMetabar 0.3

Change some function names, mainly replacing physeq by pq.
Improve documentation using some rules documented in the Rules vignette.
Add an option sam_names to read_pq()
Correction of data_fungi and data_fungi_sp_known metadata

MiscMetabar 0.24

Add supplementary info in summary_plot_physeq()
Better arguments in biplot_physeq()
Add merge_sample_by argument in biplot_physeq()
Better documentation with more examples.
For other minor bug fixes and additions, see the list of commits

MiscMetabar 0.23

Adapt the function asv2otu() to the IdClusters change in the DECIPHER package (commit 254100922f2093cc789d018c18a26752a3cda1e3): the IdClusters function, which was removed from DECIPHER, is replaced by the Clusterize function.
Better functioning of blast_to_phyloseq() when no query sequences are found.
Add tax_adjust argument to asv2otu() function
Add some functions useful for the targets package
Add a biplot_physeq() function to visualize two samples of a physeq object for comparison
Add an argument modality in the tax_datatable() function to split OTU abundance by level of the sample modality
Add a function multiple_share_bisamples() to help compare samples by pairs
Add a new function (ggVenn_phyloseq()) for a better Venn diagram but without area calculation (use venn_phyloseq() in this case).
Add two functions helpful for beta-diversity analysis (adonis_phyloseq() and physeq_graph_test())

MiscMetabar 0.22

Add badge to set the development lifecycle of each function
Add the lulu_phyloseq function to ease the reclustering of phyloseq objects using the lulu algorithm (https://www.nature.com/articles/s41467-017-01312-x) from the lulu package.

MiscMetabar 0.21

This is the first release of pkgdown.