NEWS
cummeRbund 2.9.3
- Bugfix:
- Fixed a CHECK error that was introduced by an earlier addition to .Rbuildignore.
cummeRbund 2.9.2
- version bump to let BioC nightly build grab commit.
cummeRbund 2.9.1
- version bump for BioC devel release 3.1
cummeRbund 2.8.2
- Bugfixes:
- removed reference to sqliteCloseConnection() (not exported by RSQLite 1.0.0) in vignette.
cummeRbund 2.8.1
- Bugfixes:
- Made minimal changes for compatibility with RSQLite 1.0.0
cummeRbund 2.7.3
- Bugfixes:
- Fixed sigMatrix legend argument to comply with ggplot2 deprecations. No longer throws an error.
New Features:
Notes:
- Trying out a few more indices to speed up queries using sampleIdList.
cummeRbund 2.7.1
- Bugfixes:
- Fixed 'fullnames' argument to cuffData::*Matrix() methods so that it does what it's supposed to do.
- Added 'showPool' argument to fpkmSCVPlot. When TRUE, empirical mean and standard deviation are determined across all conditions as opposed to cross-replicate. This is set to TRUE anytime you have n<2 replicates per condition.
- Added stat="identity" to expressionBarplot to comply with ggplot2 0.9.3 enforcement.
- 'labels' argument to csScatter is now working as it's supposed to. You can pass a vector of 'gene_short_name' identifiers to labels and these will be specifically called out in red text on the scatterplot.
- Added repFpkmMatrix() and replicates() methods to CuffFeature objects.
- Removed unnecessary Joins to optimize retrieval speed for several key queries.
- Fixed bug in csVolcano matrix that forced ylimits to be c(0,15)
New Features:
- Added csNMF() method for CuffData and CuffFeatureSet objects to perform non-negative matrix factorization. As of now, it's merely a wrapper around the default settings for NMFN::nnmf(), but I hope to expand it in the future.
* Does not adjust sparsity of matrices after output, must be done by user as needed.
- Added csPie() method for CuffGene objects. Allows for visualization of relative isoform, CDS, and promoter usage proportions as a pie chart by condition (or optionally as stacked bar charts by adding + coord_cartesian() ).
- Added 'method' argument to csCluster and csHeatmap to allow custom distance functions for clustering. Default = "none" = JSdist(). You can now provide a function that returns a 'dist' object on rows of a matrix.
- Added varModel.info tracking for compatibility with cuffdiff >=2.1. Will now find varModel.info file if exists, and incorporate into database.
- dispersionPlot() method added for CuffSet object. This now appropriately draws from varModel.info and is the preferred visualization for dispersion of RNA-Seq data with cummeRbund.
- Added diffTable() method to CuffData and CuffFeatureSet objects to allow a 'one-table' snapshot of results for all Features (CuffData) or a set of Features (CuffFeatureSet). This table outputs key values including gene name,
gene short name, expression estimates and per-comparison fold-change, p-value, q-value, and significance values (yes/no). A convenient 'data-dump' function to merge across several tables.
- Added coercion methods for CuffGene objects to create GRanges and GRangesList objects (more BioC friendly!). Will work on making this possible on CuffFeatureSet and CuffFeature objects as well.
- Added pass-through to select p.adjust method for getSig (method argument to getSig)
- Added ability to revert to cuffdiff q-values for specific pairwise interrogations with getSig as opposed to re-calculating new ones (useCuffMTC; default=FALSE)
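
The effect of choosing a correction method, and of recomputing q-values over only the tests of interest, can be sketched with base R's p.adjust(). The comparison labels and p-values below are made up for illustration, and this is not getSig()'s internal code.

```r
# Hypothetical p-values from three pairwise comparisons (made-up data).
tests <- data.frame(
  comparison = rep(c("A_vs_B", "A_vs_C", "B_vs_C"), each = 3),
  p_value    = c(0.001, 0.020, 0.300,
                 0.004, 0.030, 0.600,
                 0.010, 0.045, 0.900)
)

# Correct across all nine tests at once (Benjamini-Hochberg).
tests$q_all <- p.adjust(tests$p_value, method = "BH")

# Correct across the three tests of a single comparison only.
pair <- tests[tests$comparison == "A_vs_B", ]
pair$q_pair <- p.adjust(pair$p_value, method = "BH")

# Any method listed in p.adjust.methods can be substituted, e.g. "bonferroni".
pair$q_bonf <- p.adjust(pair$p_value, method = "bonferroni")

print(pair)
```

In this example the q-values computed from the single comparison are smaller than those computed across all tests, because fewer tests enter the correction.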
Notes:
- Removed generic for 'featureNames'. Now appropriately uses featureNames generic from Biobase. As a consequence, Biobase is now a dependency.
- Added passthrough to as.dist(...) in JSdist(...)
- Added 'logMode' argument to csClusterPlot.
- Added 'showPoints' argument to PCAplot to allow disabling of gene values in PCA plot. If false, only sample projections are plotted.
- Added 'facet' argument to expressionPlot to disable faceting by feature_id.
- shannon.entropy now uses log2 instead of log10 to constrain specificity scores between 0 and 1.
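
The reason for the log2 change can be illustrated with a small sketch. It assumes the usual definition of Jensen-Shannon distance as the square root of the Jensen-Shannon divergence; the helper names entropy() and js_distance() are hypothetical and are not the package's own functions.

```r
# Shannon entropy of a probability vector, treating 0 * log(0) as 0.
entropy <- function(p, base = 2) {
  p <- p[p > 0]
  -sum(p * log(p, base = base))
}

# Jensen-Shannon distance: square root of the Jensen-Shannon divergence.
js_distance <- function(p, q, base = 2) {
  m   <- (p + q) / 2
  jsd <- entropy(m, base) - (entropy(p, base) + entropy(q, base)) / 2
  sqrt(max(jsd, 0))  # guard against tiny negative rounding error
}

# Two distributions with no overlap are as far apart as possible.
p <- c(1, 0, 0)
q <- c(0, 1, 0)

d2  <- js_distance(p, q, base = 2)   # upper bound with log2
d10 <- js_distance(p, q, base = 10)  # upper bound with log10

print(c(log2 = d2, log10 = d10))
```

With log2 the maximum distance is 1, so a score of the form 1 - distance spans the full range from 0 to 1. With log10 the maximum distance is sqrt(log10(2)), so such a score can never reach 0.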
cummeRbund 1.99.6
- Notes:
- 'annotation' and "annotation<-" generics were moved to BiocGenerics 0.3.2. Now using appropriate generic function, but requiring BiocGenerics >= 0.3.2
cummeRbund 1.99.5
- Bugfixes:
- Added replicates argument to csDistHeat to view distances between individual replicate samples.
- Now appropriately distinguishes between 'annotation' (external attributes) and features (gene-level sub-features).
- csHeatmap now has 'method' argument to pass function for any dissimilarity metric you desire. You must pass a function that returns a 'dist' object applied to rows of a matrix. Default is still JS-distance.
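
A function of the required shape (matrix in, 'dist' object over rows out) can be sketched in base R as follows. The name cor_dist and the example matrix are hypothetical; correlation distance is only one possible choice of metric.

```r
# Correlation-based dissimilarity between the rows of a numeric matrix.
# cor() works on columns, so the matrix is transposed first.
cor_dist <- function(m) {
  as.dist(1 - cor(t(m)))
}

# Made-up expression values: rows are genes, columns are conditions.
m <- rbind(
  geneA = c(1, 2, 3, 4),
  geneB = c(2, 4, 6, 8),
  geneC = c(4, 3, 2, 1)
)

d <- cor_dist(m)
print(d)

# The result can be used anywhere a 'dist' object is expected.
hc <- hclust(d)

# Assumed usage with the 'method' argument described above
# ('myGenes' is a hypothetical CuffGeneSet):
# csHeatmap(myGenes, method = cor_dist)
```

Here geneA and geneB are perfectly correlated (distance 0), while geneA and geneC are perfectly anti-correlated (distance 2).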
cummeRbund 1.99.3
- New Features:
- Added diffTable() method to return a table of differential results broken out by pairwise comparison. (more human-readable)
- Added sigMatrix() method to CuffSet objects to draw heatmap showing number of significant genes by pairwise comparison at a given FDR.
- A call to fpkm() now emits calculated (model-derived) standard deviation field as well.
- Can now pass a GTF file as argument to readCufflinks() to integrate transcript model information into database backend
* Added requirement for rtracklayer and GenomicFeatures packages.
* You must also indicate which genome build the .gtf was created against by using the 'genome' argument to readCufflinks.
- Integration with Gviz:
* CuffGene objects now have a makeGeneRegionTrack() method to create a GeneRegionTrack() from transcript model information
* Can also make GRanges object
* ONLY WORKS IF YOU READ .gtf FILE IN WITH readCufflinks()
- Added csScatterMatrix() and csVolcanoMatrix() methods to CuffData objects.
- Added fpkmSCVPlot() as a CuffData method to visualize replicate-level coefficient of variation across fpkm range per condition.
- Added PCAplot() and MDSplot() for dimensionality reduction visualizations (principal components and multi-dimensional scaling, respectively)
- Added csDistHeat() to create a heatmap of JS-distances between conditions.
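
The quantity behind a replicate-level coefficient-of-variation plot can be sketched in base R. The replicate values are made up, and reading 'SCV' as the squared coefficient of variation is an assumption based on the function name, not a description of fpkmSCVPlot()'s internals.

```r
# Made-up replicate FPKM values for one condition:
# rows are genes, columns are replicates.
rep_fpkm <- rbind(
  geneA = c(10, 12, 11),
  geneB = c(100, 80, 120),
  geneC = c(0.5, 1.5, 1.0)
)

mean_fpkm <- rowMeans(rep_fpkm)        # per-gene mean across replicates
sd_fpkm   <- apply(rep_fpkm, 1, sd)    # per-gene standard deviation
cv        <- sd_fpkm / mean_fpkm       # coefficient of variation
scv       <- cv^2                      # squared coefficient of variation

print(data.frame(mean_fpkm, cv, scv))
```

Plotting cv (or scv) against mean_fpkm for each condition shows how replicate variability changes across the expression range.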
Bugfixes:
- Fixed diffData 'features' argument so that it now does what it's supposed to do.
- added DB() with signature(object="CuffSet") to NAMESPACE
Notes:
- Once again, there have been modifications to the underlying database schema so you will have to re-run readCufflinks(rebuild=T) to re-analyze existing datasets.
- Importing 'defaults' from plyr instead of requiring entire package (keeps namespace cleaner).
- Set pseudocount=0.0 as default for csDensity() and csScatter() methods (This prevents a visual bias for genes with FPKM <1 and ggplot2 handles removing true zero values).
cummeRbund 1.99.2
- Bugfixes:
- Fixed bug in replicate table that did not apply make.db.names to match samples table.
- Fixed bug for missing values in *.count_tracking files.
- Now correctly applying make.db.names to *.read_group_tracking files.
- Now correctly allows for empty *.count_tracking and *.read_group_tracking files
cummeRbund 1.99.1
- This represents a major set of improvements and feature additions to cummeRbund.
- cummeRbund now incorporates additional information emitted from cuffdiff 2.0 including:
- run parameters and information.
- sample-level information such as mass and scaling factors.
- individual replicate fpkms and associated statistics for all features.
- raw and normalized count tables and associated statistics for all features.
New Features:
- Please see updated vignette for overview of new features.
- New dispersionPlot() to visualize model fit (mean count vs dispersion) at all feature levels.
- New runInfo() method returns cuffdiff run parameters.
- New replicates() method returns a data.frame of replicate-level parameters and information.
- getGene() and getGenes() can now take a list of any tracking_id or gene_short_name (not just gene_ids) to retrieve
a gene or geneset.
- Added getFeatures() method to retrieve a CuffFeatureSet independent of gene-level attributes. This is ideal for looking at sets of features
outside of the context of all other gene-related information (i.e. facilitates feature-level analysis)
- Replicate-level fpkm data now available.
- Condition-level raw and normalized count data now available.
- repFpkm(), repFpkmMatrix(), count(), and countMatrix() are new accessor methods for CuffData, CuffFeatureSet, and CuffFeature objects.
- All relevant plots now have a logical 'replicates' argument (default = F) that when set to TRUE will expose replicate FPKM values in appropriate ways.
- MAplot() now has 'useCount' argument to draw MA plots using count data as opposed to fpkm estimates.
Notes:
- Changed default csHeatmap colorscheme to the much more pleasing 'lightyellow' to 'darkred' through 'orange'.
- SQLite journaling is no longer disabled by default (the benefits of journaling outweigh the moderate reduction in load times gained by disabling it).
Bugfixes:
- Numerous miscellaneous bug fixes to improve consistency and performance for large datasets.
cummeRbund 1.2.1
- Bugfixes:
- Fixed bug in CuffFeatureSet::expressionBarplot to make compatible with ggplot2 v0.9.
New Features:
- Added 'distThresh' argument to findSimilar. This allows you to retrieve all similar genes within a given JS distance as specified by distThresh.
- Added 'returnGeneSet' argument to findSimilar. [default = T] If true, findSimilar returns a CuffGeneSet of genes matching criteria (default). If false, a rank-ordered data frame of JS distance values is returned.
- findSimilar can now take a 'sampleIdList' argument. This should be a vector of sample names across which the distance between genes should be evaluated. This should be a subset of the output of samples(genes(cuff)).
Notes:
- Added requirement for 'fastcluster' package. There is very little footprint, and it makes a significant improvement in speed for the clustering analyses.
cummeRbund 1.1.5
- Bugfixes:
- Fixed minor bug in database setup that caused instability with cuffdiff --no-diff argument.
- Fixed bug in csDendro method for CuffData objects.
cummeRbund 1.1.4
- New Features:
- Added MAplot() method to CuffData objects.
Bugfixes:
- Finished abrupt migration to reshape2. As a result fixed a bug in which 'cast' was still required for several functions and could not be found. Now appropriately using 'dcast' or 'acast'.
- Fixed minor bug in CuffFeature::fpkmMatrix
cummeRbund 1.1.3
- New Features:
- getSig() has been split into two functions: getSig() now returns a vector of ids (no longer a list of vectors), and getSigTable() returns a 'testTable' of
binary values indicating whether or not a gene was significant
in a particular comparison.
- Added ability in getSig() to limit retrieval of significant genes to two provided conditions (arguments x & y). (reduces time for function call if you have a specific comparison in mind a priori)
* When you specify x & y with getSig(), q-values are recalculated from just those selected tests to reduce impact of multiple testing correction.
* If you do not specify x & y, getSig() will return a vector of tracking_ids for all comparisons (with appropriate MTC).
- You can now specify an 'alpha' for getSig() and getSigTable() [ 0.05 by default to match cuffdiff default ] by which to filter the resulting significance calls.
- Added csSpecificity() method: This method returns a feature-X-condition matrix (same shape as fpkmMatrix) that provides a 'condition-specificity' score
* defined as 1-(JSdist(p,q))
where p is the density of expression (probability vector of log(FPKM+1)) of a given gene across all conditions,
and q is the unit vector for that condition (i.e. perfect expression in that particular condition)
* specificity = 1.0 if the feature is expressed exclusively in that condition
- Created csDendro() method: This method returns an object of class 'dendrogram' (and plots using grid) of JS distances between conditions for all genes in a CuffData, CuffGeneSet, or CuffFeatureSet object.
* Useful for identifying relationships between conditions for subsets of features
- New visual cues in several plot types that indicate the quantification status ('quant_stat' field) of a particular gene:condition. This information is useful to indicate whether or not
to trust the expression values for a given gene under a specific condition, and may provide insight into outlier expression values.
* This feature can be disabled by setting showStatus=F.
- csDensity() is now available for CuffFeatureSet and CuffGeneSet objects
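
The condition-specificity score defined above, 1-(JSdist(p,q)), can be sketched in base R as follows. The helpers entropy(), js_distance() and specificity() are hypothetical stand-ins rather than the package's own code; the sketch assumes Jensen-Shannon distance is the square root of the Jensen-Shannon divergence and uses log base 2 (the base adopted in the 2.7.1 notes).

```r
# Shannon entropy of a probability vector, treating 0 * log(0) as 0.
entropy <- function(p, base = 2) {
  p <- p[p > 0]
  -sum(p * log(p, base = base))
}

# Jensen-Shannon distance between two probability vectors.
js_distance <- function(p, q, base = 2) {
  m   <- (p + q) / 2
  jsd <- entropy(m, base) - (entropy(p, base) + entropy(q, base)) / 2
  sqrt(max(jsd, 0))  # guard against tiny negative rounding error
}

# Specificity of one feature in each condition, given its FPKM values.
# Undefined (NaN) if the feature has zero FPKM in every condition.
specificity <- function(fpkm) {
  p <- log(fpkm + 1)
  p <- p / sum(p)                      # density of expression across conditions
  scores <- vapply(seq_along(p), function(i) {
    q <- numeric(length(p))
    q[i] <- 1                          # unit vector for condition i
    1 - js_distance(p, q)
  }, numeric(1))
  setNames(scores, names(fpkm))
}

# Made-up FPKM values for two features across three conditions.
exclusive <- specificity(c(cond1 = 0, cond2 = 50, cond3 = 0))
even      <- specificity(c(cond1 = 20, cond2 = 20, cond3 = 20))

print(round(rbind(exclusive, even), 3))
```

A feature expressed only in cond2 scores 1 there and 0 elsewhere, while an evenly expressed feature receives the same intermediate score in every condition.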
Bugfixes:
- Fixed bug in getGenes that may have resulted in long query lag for retrieving promoter diffData. As a result all calls to getGenes should be significantly faster.
- CuffData fpkm argument 'features' now returns appropriate data.frame (includes previously un-reported data fields).
- Replaced all instances of 'ln_fold_change' with the actual 'log2_fold_change'. Values were previously log2 fold change but database headers were not updated to reflect this.
- Fixed bug that could cause readCufflinks() to die with error when using reshape2::melt instead of reshape::melt.
Notes:
- ***The structure of the underlying database has changed in this version. As a consequence, you must rebuild your cuffData.db file to use the new version. readCufflinks(rebuild=T)***
- Updated vignette
- A 'fullnames' logical argument was added to fpkmMatrix. If True, rownames for fpkmMatrix will be a concatenation of gene_short_name and tracking_id.
This has the added benefit of making row labels in csHeatmap easier to read, as well as preserving uniqueness.
- Slight speed improvements to JSdist (noticeable when using csCluster on large feature sets).
- 'testTable' argument to getSig() has been dropped in favor of the new getSigTable() method.
cummeRbund 1.1.1
- Bugfixes:
- fixed issue in which there was no graceful error handling of missing CDS or TSS data in cuffdiff output.
- Fixed issue in which distribution test data (promoters, splicing, relCDS) were not appropriately added to objects on creation.
- Fixed bug that would sometimes cause csBoxplot() to throw an error when log-transforming fpkm data. Also added pseudocount argument.
- Fixed bug that would cause diffData() to return a filtered subset of results by default.
- Adjusted indexing of tables to improve performance on large datasets.
- Fixed bug that caused diffData method to not be registered with CuffFeature and CuffGene objects.
- Fixed bug that sometimes caused over-plotting of axis labels in csBarplots.
New Features:
- added getSig method to CuffSet class for rapid retrieval of significant features from all pairwise tests (as a list of IDs).
By default the level is 'genes' but any feature level can be queried.
- csCluster now uses Jensen-Shannon distance by default (as opposed to Euclidean)
- Added 'xlimits' argument to csVolcano to constrain plot dimensions.
- Enforced requirement in csVolcano for x and y arguments (as sample names).
Notes:
- Changed dependency 'reshape' to 'reshape2'
- Changed the default orientation of expressionBarplot() for CuffFeatureSet objects.
- Changed output of csCluster to a list format that includes clustering information. As a result, I created the function csClusterPlot
to replace the previous default drawing behavior of csCluster. This allows for stable cluster analysis.
- For consistency, the 'testId' slot for CuffDist objects was renamed to 'idField'. This brings the CuffDist class in line with the CuffData class.
- CuffGene and CuffGeneSet now include slots for promoter, splicing, and relCDS distribution test results.
cummeRbund 1.0.0
- Official public release. No changes from v0.99.5
cummeRbund 0.99.5
- Significant speed improvements to readCufflinks() for large cuffdiff datasets.
- Tables written first then indexed.
- Added slot accessor methods to avoid using slots directly.
cummeRbund 0.99.4
- Second beta release and submission to Bioconductor
cummeRbund 0.1.3 (2011-08-18)
- First Beta release of cummeRbund and submission to Bioconductor for review and hosting.