NEWS

cummeRbund 2.9.3

Bugfix:
- Fixed a CHECK error that was introduced by adding to .Rbuildignore.

cummeRbund 2.9.2

Version bump to let the BioC nightly build grab the commit.

cummeRbund 2.9.1

Version bump for BioC devel release 3.1.

cummeRbund 2.8.2

Bugfixes:
- Removed reference to sqliteCloseConnection() (not exported by RSQLite 1.0.0) in vignette.

cummeRbund 2.8.1

Bugfixes:
- Made minimal changes for compatibility with RSQLite 1.0.0.

cummeRbund 2.7.3

Bugfixes:
- Fixed sigMatrix legend argument to comply with ggplot2 deprecations. No longer throws an error.
Notes:
- Trying out a few more indices to speed up queries using sampleIdList.

cummeRbund 2.7.1

Bugfixes:
- Fixed 'fullnames' argument to cuffData::*Matrix() methods so that it does what it's supposed to do.
- Added 'showPool' argument to fpkmSCVPlot. When TRUE, empirical mean and standard deviation are determined across all conditions as opposed to cross-replicate. This is set to TRUE anytime you have n<2 replicates per condition.
- Added stat="identity" to expressionBarplot to comply with ggplot2 0.9.3 enforcement.
- 'labels' argument to csScatter is now working as it's supposed to. You can pass a vector of 'gene_short_name' identifiers to labels and these will be specifically called out in red text on the scatterplot.
- Added repFpkmMatrix() and replicates() methods to CuffFeature objects.
- Removed unnecessary joins to optimize retrieval speed for several key queries.
- Fixed bug in csVolcano matrix that forced ylimits to be c(0,15).
New Features:
- Added csNMF() method for CuffData and CuffFeatureSet objects to perform non-negative matrix factorization. As of now, it's merely a wrapper around the default settings for NMFN::nnmf(), but we hope to expand it in the future.
  * Does not adjust sparsity of matrices after output; this must be done by the user as needed.
- Added csPie() method for CuffGene objects. Allows for visualization of relative isoform, CDS, and promoter usage proportions as a pie chart by condition (or optionally as stacked bar charts by adding + coord_cartesian()).
- Added 'method' argument to csCluster and csHeatmap to allow custom distance functions for clustering. Default = "none" = JSdist(). You can now provide a function that returns a 'dist' object on rows of a matrix.
- Added varModel.info tracking for compatibility with cuffdiff >=2.1. Will now find the varModel.info file if it exists, and incorporate it into the database.
- dispersionPlot() method added for CuffSet object. This now appropriately draws from varModel.info and is the preferred visualization for dispersion of RNA-Seq data with cummeRbund.
- Added diffTable() method to CuffData and CuffFeatureSet objects to allow a 'one-table' snapshot of results for all Features (CuffData) or a set of Features (CuffFeatureSet). This table outputs key values including gene name, gene short name, expression estimates and per-comparison fold-change, p-value, q-value, and significance values (yes/no). A convenient 'data-dump' function to merge across several tables.
- Added coercion methods for CuffGene objects to create GRanges and GRangesList objects (more BioC friendly!). Will work on making this possible on CuffFeatureSet and CuffFeature objects as well.
- Added pass-through to select p.adjust method for getSig (method argument to getSig).
- Added ability to revert to cuffdiff q-values for specific pairwise interrogations with getSig as opposed to re-calculating new ones (useCuffMTC; default=FALSE).
Notes:
- Removed generic for 'featureNames'. Now appropriately uses featureNames generic from Biobase. As a consequence, Biobase is now a dependency.
- Added passthrough to as.dist(...) in JSdist(...).
- Added 'logMode' argument to csClusterPlot.
- Added 'showPoints' argument to PCAplot to allow disabling of gene values in PCA plot. If FALSE, only sample projections are plotted.
- Added 'facet' argument to expressionPlot to disable faceting by feature_id.
- shannon.entropy now uses log2 instead of log10 to constrain specificity scores between 0 and 1.
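The getSig entries above concern whether q-values are recomputed for a chosen comparison or taken from cuffdiff. As a rough illustration of why that choice matters, the sketch below applies a Benjamini-Hochberg adjustment to a full set of tests and to a subset. It is written in Python with a hypothetical helper name (bh_adjust) and made-up p-values; it is not cummeRbund code, and Benjamini-Hochberg is only assumed here as one of the methods that R's p.adjust offers.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    n = len(pvalues)
    # Walk from the largest p-value to the smallest, keeping a running minimum
    # so that the adjusted values stay monotone in the original p-values.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = n - position  # 1-based rank when sorted in ascending order
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values: two tests from the comparison of interest,
# followed by four tests from other comparisons.
of_interest = [0.01, 0.02]
others = [0.5, 0.8, 0.9, 0.7]

q_all = bh_adjust(of_interest + others)[:2]  # corrected across every test
q_subset = bh_adjust(of_interest)            # corrected within the subset only
```

With these illustrative numbers, the two tests of interest receive q-values of about 0.06 when corrected alongside all six tests and about 0.02 when corrected on their own, so they pass an alpha of 0.05 only in the second case.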

cummeRbund 1.99.6

Notes:
- 'annotation' and "annotation<-" generics were moved to BiocGenerics 0.3.2. Now using the appropriate generic function, but requiring BiocGenerics >= 0.3.2.

cummeRbund 1.99.5

Bugfixes:
- Added replicates argument to csDistHeat to view distances between individual replicate samples.
- Appropriately distinguish now between 'annotation' (external attributes) and features (gene-level sub-features).
- csHeatmap now has 'method' argument to pass function for any dissimilarity metric you desire. You must pass a function that returns a 'dist' object applied to rows of a matrix. Default is still JS-distance.

cummeRbund 1.99.3

New Features:
- Added diffTable() method to return a table of differential results broken out by pairwise comparison (more human-readable).
- Added sigMatrix() method to CuffSet objects to draw a heatmap showing the number of significant genes by pairwise comparison at a given FDR.
- A call to fpkm() now emits the calculated (model-derived) standard deviation field as well.
- Can now pass a GTF file as an argument to readCufflinks() to integrate transcript model information into the database backend.
  * Added requirement for rtracklayer and GenomicFeatures packages.
  * You must also indicate which genome build the .gtf was created against by using the 'genome' argument to readCufflinks.
- Integration with Gviz:
  * CuffGene objects now have a makeGeneRegionTrack() method to create a GeneRegionTrack() from transcript model information.
  * Can also make a GRanges object.
  * ONLY WORKS IF YOU READ .gtf FILE IN WITH readCufflinks()
- Added csScatterMatrix() and csVolcanoMatrix() methods to CuffData objects.
- Added fpkmSCVPlot() as a CuffData method to visualize replicate-level coefficient of variation across fpkm range per condition.
- Added PCAplot() and MDSplot() for dimensionality reduction visualizations (principal components and multi-dimensional scaling, respectively).
- Added csDistHeat() to create a heatmap of JS-distances between conditions.
Bugfixes:
- Fixed diffData 'features' argument so that it now does what it's supposed to do.
- Added DB() with signature(object="CuffSet") to NAMESPACE.
Notes:
- Once again, there have been modifications to the underlying database schema, so you will have to re-run readCufflinks(rebuild=T) to re-analyze existing datasets.
- Importing 'defaults' from plyr instead of requiring the entire package (keeps namespace cleaner).
- Set pseudocount=0.0 as default for csDensity() and csScatter() methods. (This prevents a visual bias for genes with FPKM <1, and ggplot2 handles removing true zero values.)
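The pseudocount note above can be illustrated with a few numbers. The following is a minimal Python sketch, not cummeRbund code: the helper name log10_expr is hypothetical, the FPKM values are made up, and a pseudocount of 1 is assumed only as a contrasting example value.

```python
import math


def log10_expr(fpkm, pseudocount=0.0):
    """log10(FPKM + pseudocount), dropping values that cannot be logged."""
    shifted = [x + pseudocount for x in fpkm]
    # With a pseudocount of zero, true zeros are dropped instead of plotted.
    return [math.log10(v) for v in shifted if v > 0]


fpkm = [0.0, 0.01, 0.1, 1.0, 10.0]  # made-up expression values

without_pseudocount = log10_expr(fpkm)                 # zero is removed
with_pseudocount = log10_expr(fpkm, pseudocount=1.0)   # zero is kept
```

In this sketch, adding 1 before the log squeezes every value with FPKM below 1 into the narrow band between 0 and log10(2), whereas a pseudocount of 0 leaves those values spread over their own range and simply omits the true zero.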

cummeRbund 1.99.2

Bugfixes:
- Fixed bug in replicate table that did not apply make.db.names to match samples table.
- Fixed bug for missing values in *.count_tracking files.
- Now correctly applying make.db.names to *.read_group_tracking files.
- Now correctly allows for empty *.count_tracking and *.read_group_tracking files.

cummeRbund 1.99.1

This represents a major set of improvements and feature additions to cummeRbund.
cummeRbund now incorporates additional information emitted from cuffdiff 2.0 including:
- run parameters and information.
- sample-level information such as mass and scaling factors.
- individual replicate fpkms and associated statistics for all features.
- raw and normalized count tables and associated statistics for all features.
New Features:
- Please see updated vignette for overview of new features.
- New dispersionPlot() to visualize model fit (mean count vs dispersion) at all feature levels.
- New runInfo() method returns cuffdiff run parameters.
- New replicates() method returns a data.frame of replicate-level parameters and information.
- getGene() and getGenes() can now take a list of any tracking_id or gene_short_name (not just gene_ids) to retrieve a gene or geneset.
- Added getFeatures() method to retrieve a CuffFeatureSet independent of gene-level attributes. This is ideal for looking at sets of features outside of the context of all other gene-related information (i.e. facilitates feature-level analysis).
- Replicate-level fpkm data now available.
- Condition-level raw and normalized count data now available.
- repFpkm(), repFpkmMatrix(), count(), and countMatrix() are new accessor methods to CuffData, CuffFeatureSet, and CuffFeature objects.
- All relevant plots now have a logical 'replicates' argument (default = F) that when set to TRUE will expose replicate FPKM values in appropriate ways.
- MAPlot() now has 'useCount' argument to draw MA plots using count data as opposed to fpkm estimates.
Notes:
- Changed default csHeatmap colorscheme to the much more pleasing 'lightyellow' to 'darkred' through 'orange'.
- SQLite journaling is no longer disabled by default (the benefits outweigh the moderate reduction in load times).
Bugfixes:
- Numerous miscellaneous bug fixes to improve consistency and performance for large datasets.

cummeRbund 1.2.1

Bugfixes:
- Fixed bug in CuffFeatureSet::expressionBarplot to make compatible with ggplot2 v0.9.
New Features:
- Added 'distThresh' argument to findSimilar. This allows you to retrieve all similar genes within a given JS distance as specified by distThresh.
- Added 'returnGeneSet' argument to findSimilar [default = T]. If true, findSimilar returns a CuffGeneSet of genes matching criteria (default). If false, a rank-ordered data frame of JS distance values is returned.
- findSimilar can now take a 'sampleIdList' argument. This should be a vector of sample names across which the distance between genes should be evaluated. This should be a subset of the output of samples(genes(cuff)).
Notes:
- Added requirement for 'fastcluster' package. There is very little footprint, and it makes a significant improvement in speed for the clustering analyses.

cummeRbund 1.1.5

Bugfixes:
- Fixed minor bug in database setup that caused instability with cuffdiff --no-diff argument.
- Fixed bug in csDendro method for CuffData objects.

cummeRbund 1.1.4

New Features:
- Added MAplot() method to CuffData objects.
Bugfixes:
- Finished abrupt migration to reshape2. As a result, fixed a bug in which 'cast' was still required for several functions and could not be found. Now appropriately using 'dcast' or 'acast'.
- Fixed minor bug in CuffFeature::fpkmMatrix.

cummeRbund 1.1.3

New Features:
- getSig() has been split into two functions: getSig() now returns a vector of ids (no longer a list of vectors), and getSigTable() returns a 'testTable' of binary values indicating whether or not a gene was significant in a particular comparison.
- Added ability in getSig() to limit retrieval of significant genes to two provided conditions (arguments x & y). This reduces the time for the function call if you have a specific comparison in mind a priori.
  * When you specify x & y with getSig(), q-values are recalculated from just those selected tests to reduce impact of multiple testing correction.
  * If you do not specify x & y, getSig() will return a vector of tracking_ids for all comparisons (with appropriate MTC).
- You can now specify an 'alpha' for getSig() and getSigTable() [0.05 by default to match cuffdiff default] by which to filter the resulting significance calls.
- Added csSpecificity() method: This method returns a feature-X-condition matrix (same shape as fpkmMatrix) that provides a 'condition-specificity' score
  * defined as 1-(JSdist(p,q)) where p is the density of expression (probability vector of log(FPKM+1)) of a given gene across all conditions, and q is the unit vector for that condition (i.e. perfect expression in that particular condition)
  * specificity = 1.0 if the feature is expressed exclusively in that condition
- Created csDendro() method: This method returns an object of class 'dendrogram' (and plots using grid) of JS distances between conditions for all genes in a CuffData, CuffGeneSet, or CuffFeatureSet object.
  * Useful for identifying relationships between conditions for subsets of features
- New visual cues in several plot types that indicate the quantification status ('quant_stat' field) of a particular gene:condition. This information is useful to indicate whether or not to trust the expression values for a given gene under a specific condition, and may provide insight into outlier expression values.
  * This feature can be disabled by setting showStatus=F.
- csDensity() is now available for CuffFeatureSet and CuffGeneSet objects.
Bugfixes:
- Fixed bug in getGenes that may have resulted in long query lag for retrieving promoter diffData. As a result, all calls to getGenes should be significantly faster.
- CuffData fpkm argument 'features' now returns appropriate data.frame (includes previously un-reported data fields).
- Replaced all instances of 'ln_fold_change' with the actual 'log2_fold_change'. Values were previously log2 fold change but database headers were not updated to reflect this.
- Fixed bug that could cause readCufflinks() to die with error when using reshape2::melt instead of reshape::melt.
Notes:
- ***The structure of the underlying database has changed in this version. As a consequence, you must rebuild your cuffData.db file to use the new version. readCufflinks(rebuild=T)***
- Updated vignette.
- A 'fullnames' logical argument was added to fpkmMatrix. If TRUE, rownames for fpkmMatrix will be a concatenation of gene_short_name and tracking_id. This has the added benefit of making row labels in csHeatmap easier to read, as well as preserving uniqueness.
- Slight speed improvements to JSdist (noticeable when using csCluster on large feature sets).
- 'testTable' argument to getSig() has been dropped in lieu of new getSigTable() method.
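The condition-specificity score defined in the csSpecificity() entry above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the package's R implementation: the function names are hypothetical, the Jensen-Shannon distance is taken as the square root of the Jensen-Shannon divergence, and base-2 entropy is assumed so that the distance stays within 0 and 1.

```python
import math


def shannon_entropy(p):
    """Base-2 Shannon entropy; zero entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence of two probability vectors."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    divergence = shannon_entropy(m) - (shannon_entropy(p) + shannon_entropy(q)) / 2
    return math.sqrt(max(divergence, 0.0))  # clamp tiny negative rounding error


def specificity(fpkm):
    """One score per condition: 1 - JS distance to that condition's unit vector."""
    # The log base cancels out once the vector is normalized to sum to 1.
    logged = [math.log(x + 1) for x in fpkm]
    total = sum(logged)
    if total == 0:
        raise ValueError("feature is not expressed in any condition")
    p = [v / total for v in logged]
    scores = []
    for i in range(len(p)):
        q = [1.0 if j == i else 0.0 for j in range(len(p))]
        scores.append(1.0 - js_distance(p, q))
    return scores


exclusive = specificity([0.0, 50.0, 0.0])  # expressed in one condition only
uniform = specificity([10.0, 10.0, 10.0])  # expressed equally everywhere
```

Under these assumptions, a feature expressed in a single condition scores 1.0 for that condition and 0.0 for the others, while an evenly expressed feature gets the same intermediate score in every condition.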

cummeRbund 1.1.1

Bugfixes:
- Fixed issue in which there was no graceful error handling of missing CDS or TSS data in cuffdiff output.
- Fixed issue in which distribution test data (promoters, splicing, relCDS) were not appropriately added to objects on creation.
- Fixed bug that would sometimes cause csBoxplot() to throw an error when log-transforming fpkm data. Also added pseudocount argument.
- Fixed bug that would cause diffData() to return a filtered subset of results by default.
- Adjusted indexing of tables to improve performance on large datasets.
- Fixed bug that caused diffData method to not be registered with CuffFeature and CuffGene objects.
- Fixed bug that sometimes caused over-plotting of axis labels in csBarplots.
New Features:
- Added getSig method to CuffSet class for rapid retrieval of significant features from all pairwise tests (as a list of IDs). By default the level is 'genes' but any feature level can be queried.
- csCluster now uses Jensen-Shannon distance by default (as opposed to Euclidean).
- Added 'xlimits' argument to csVolcano to constrain plot dimensions.
- Enforced requirement in csVolcano for x and y arguments (as sample names).
Notes:
- Changed dependency 'reshape' to 'reshape2'.
- Changed the default orientation of expressionBarplot() for CuffFeatureSet objects.
- Changed output of csCluster to a list format that includes clustering information. As a result, I created the function csClusterPlot to replace the previous default drawing behavior of csCluster. This allows for stable cluster analysis.
- For consistency, the 'testId' slot for CuffDist objects was renamed to 'idField'. This brings the CuffDist class in line with the CuffData class.
- CuffGene and CuffGeneSet now include slots for promoter, splicing, and relCDS distribution test results.

cummeRbund 1.0.0

Official public release. No changes from v0.99.5

cummeRbund 0.99.5

Significant speed improvements to readCufflinks() for large cuffdiff datasets.
Tables written first then indexed.
Added slot accessor methods to avoid using slots directly.

cummeRbund 0.99.4

Second beta release and submission to Bioconductor

cummeRbund 0.1.3 (2011-08-18)

First Beta release of cummeRbund and submission to Bioconductor for review and hosting.