GenomicRanges - Representation and manipulation of genomic intervals
The ability to efficiently represent and manipulate genomic annotations and alignments plays a central role in the analysis of high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general-purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.
Last updated 6 days ago
genetics, infrastructure, datarepresentation, sequencing, annotation, genomeannotation, coverage, bioconductor-package, core-package
17.85 score 45 stars 1.3k packages 14k scripts 87k downloads

Biostrings - Efficient manipulation of biological strings
Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.
Last updated 15 days ago
sequencematching, alignment, sequencing, genetics, dataimport, datarepresentation, infrastructure, bioconductor-package, core-package
17.79 score 57 stars 1.2k packages 8.9k scripts 96k downloads

SummarizedExperiment - A container (S4 class) for matrix-like assays
The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.
Last updated 23 days ago
genetics, infrastructure, sequencing, annotation, coverage, genomeannotation, bioconductor-package, core-package
17.03 score 33 stars 1.2k packages 9.5k scripts 82k downloads

GenomeInfoDb - Utilities for manipulating chromosome names, including modifying them to follow a particular naming style
Contains data and functions that define and allow translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1"), including a function that attempts to place sequence names in their natural, rather than lexicographic, order.
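The difference between natural and lexicographic ordering of sequence names can be illustrated with a small, language-neutral sketch. It is written in Python and is not the GenomeInfoDb API; the helper names `natural_key` and `strip_chr_prefix` are hypothetical, and the prefix handling is a deliberate simplification of real naming-style translation.

```python
import re

def natural_key(name):
    """Sort key that compares digit runs as integers (hypothetical helper)."""
    parts = [p for p in re.split(r"(\d+)", name) if p != ""]
    # Tag each chunk so integers and strings are never compared directly:
    # numeric chunks sort before text chunks at the same position.
    return [(0, int(p)) if p.isdigit() else (1, p) for p in parts]

def strip_chr_prefix(name):
    """Simplified style translation: 'chr1' -> '1' (assumption: no special cases)."""
    return name[3:] if name.startswith("chr") else name

names = ["chr10", "chr2", "chrX", "chr1"]
lexicographic = sorted(names)               # 'chr10' sorts before 'chr2'
natural = sorted(names, key=natural_key)    # 'chr2' sorts before 'chr10'
```

With plain string sorting, "chr10" lands before "chr2" because the comparison is character by character; the key above avoids that by comparing the numeric parts as integers.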
Last updated 6 days ago
genetics, datarepresentation, annotation, genomeannotation, bioconductor-package, core-package
16.49 score 31 stars 1.7k packages 1.3k scripts 114k downloads

S4Vectors - Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Last updated 6 days ago
infrastructure, datarepresentation, bioconductor-package, core-package
16.10 score 18 stars 1.8k packages 1.0k scripts 110k downloads

IRanges - Foundation of integer range manipulation in Bioconductor
Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.
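What "finding overlaps" between two sets of integer ranges means can be sketched as follows. This is a naive Python illustration, not the IRanges API and not its algorithm: it assumes closed ranges given as `(start, end)` tuples, uses a quadratic scan where the package uses more efficient methods, and the function name `find_overlaps` is hypothetical.

```python
def find_overlaps(query, subject):
    """Return (query_index, subject_index) pairs for every overlapping pair.

    Ranges are closed integer intervals (start, end) with start <= end.
    Naive O(len(query) * len(subject)) scan, for illustration only.
    """
    hits = []
    for qi, (q_start, q_end) in enumerate(query):
        for si, (s_start, s_end) in enumerate(subject):
            # Two closed intervals overlap when each starts
            # no later than the other one ends.
            if q_start <= s_end and s_start <= q_end:
                hits.append((qi, si))
    return hits

query = [(1, 5), (10, 12)]
subject = [(4, 8), (13, 20), (5, 10)]
hits = find_overlaps(query, subject)
```

Indices in this sketch are 0-based, as usual in Python. The result is a list of index pairs, one per overlapping (query, subject) combination.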
Last updated 6 days ago
infrastructure, datarepresentation, bioconductor-package, core-package
16.09 score 22 stars 1.8k packages 2.1k scripts 105k downloads

DelayedArray - A unified framework for working transparently with on-disk and in-memory array-like datasets
Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays, and data frames.
Last updated 7 days ago
infrastructure, datarepresentation, annotation, genomeannotation, bioconductor-package, core-package
15.60 score 25 stars 1.2k packages 550 scripts 92k downloads

GenomicAlignments - Representation and manipulation of short genomic alignments
Provides efficient containers for storing and manipulating short genomic alignments (typically obtained by aligning short reads to a reference genome). This includes read counting, computing the coverage, junction detection, and working with the nucleotide content of the alignments.
Last updated 23 days ago
infrastructure, dataimport, genetics, sequencing, rnaseq, snp, coverage, alignment, immunooncology, bioconductor-package, core-package
15.50 score 9 stars 525 packages 3.0k scripts 47k downloads

GenomicFeatures - Query the gene models of a given organism/assembly
Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.
Last updated 15 days ago
genetics, infrastructure, annotation, sequencing, genomeannotation, bioconductor-package, core-package
15.39 score 26 stars 340 packages 5.3k scripts 34k downloads

BiocGenerics - S4 generic functions used in Bioconductor
The package defines many S4 generic functions used in Bioconductor.
Last updated 7 days ago
infrastructure, bioconductor-package, core-package
14.04 score 12 stars 2.2k packages 606 scripts 108k downloads

BSgenome - Software infrastructure for efficient representation of full genomes and their SNPs
Infrastructure shared by all the Biostrings-based genome data packages.
Last updated 23 days ago
genetics, infrastructure, datarepresentation, sequencematching, annotation, snp, bioconductor-package, core-package
13.17 score 9 stars 268 packages 1.1k scripts 25k downloads

HDF5Array - HDF5 datasets as array-like objects in R
The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.
Last updated 22 days ago
infrastructure, datarepresentation, dataimport, sequencing, rnaseq, coverage, annotation, genomeannotation, singlecell, immunooncology, bioconductor-package, core-package
12.08 score 11 stars 129 packages 828 scripts 27k downloads

SparseArray - High-performance sparse data representation and manipulation in R
The SparseArray package provides array-like containers for efficient in-memory representation of multidimensional sparse data in R (arrays and matrices). The package defines the SparseArray virtual class and two concrete subclasses: COO_SparseArray and SVT_SparseArray. Each subclass uses its own internal representation of the nonzero multidimensional data: the "COO layout" and the "SVT layout", respectively. SVT_SparseArray objects mimic as much as possible the behavior of ordinary matrix and array objects in base R. In particular, they support most of the "standard matrix and array API" defined in base R and in the matrixStats package from CRAN.
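The general idea behind a coordinate ("COO") layout, storing only the positions and values of the nonzero elements, can be sketched in a few lines. This is a generic Python illustration for a 2-D matrix, not the package's implementation; the internal details of COO_SparseArray may differ, and the names `to_coo` and `to_dense` are hypothetical.

```python
def to_coo(dense):
    """Convert a non-empty list-of-lists matrix to (shape, coords, values)."""
    coords, values = [], []
    for i, row in enumerate(dense):
        for j, value in enumerate(row):
            if value != 0:
                # Keep only nonzero entries, with their (row, col) position.
                coords.append((i, j))
                values.append(value)
    return (len(dense), len(dense[0])), coords, values

def to_dense(shape, coords, values):
    """Rebuild the dense matrix from its coordinate representation."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for (i, j), value in zip(coords, values):
        dense[i][j] = value
    return dense

matrix = [[0, 0, 3],
          [0, 0, 0],
          [7, 0, 0]]
shape, coords, values = to_coo(matrix)
```

For this 3 x 3 matrix, only two coordinate/value pairs are stored instead of nine cells, which is where the memory saving comes from when most entries are zero.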
Last updated 7 days ago
infrastructure, datarepresentation, bioconductor-package, core-package
11.44 score 8 stars 1.2k packages 45 scripts 75k downloads

S4Arrays - Foundation of array-like containers in Bioconductor
The S4Arrays package defines the Array virtual class to be extended by other S4 classes that wish to implement a container with array-like semantics. It also provides: (1) low-level functionality meant to help the developers of such containers implement basic operations like display, subsetting, or coercion of their array-like objects to an ordinary matrix or array, and (2) a framework that facilitates block processing of array-like objects (typically on-disk objects).
Last updated 23 days ago
infrastructure, datarepresentation, bioconductor-package, core-package
11.35 score 5 stars 1.2k packages 7 scripts 73k downloads

Rhtslib - HTSlib high-throughput sequencing library as an R package
This package provides version 1.18 of the 'HTSlib' C library for high-throughput sequence analysis. The package is primarily useful to developers of other R packages who wish to make use of HTSlib. Motivation and instructions for use of this package are in the vignette, vignette(package="Rhtslib", "Rhtslib").
Last updated 23 days ago
dataimport, sequencing, bioconductor-package, core-package
11.34 score 11 stars 590 packages 3 scripts 47k downloads

XVector - Foundation of external vector representation and manipulation in Bioconductor
Provides memory efficient S4 classes for storing sequences "externally" (e.g. behind an R external pointer, or on disk).
Last updated 23 days ago
infrastructure, datarepresentation, bioconductor-package, core-package
11.06 score 2 stars 1.7k packages 67 scripts 94k downloads

UCSC.utils - Low-level utilities to retrieve data from the UCSC Genome Browser
A set of low-level utilities to retrieve data from the UCSC Genome Browser. Most functions in the package access the data via the UCSC REST API but some of them query the UCSC MySQL server directly. Note that the primary purpose of the package is to support higher-level functionalities implemented in downstream packages like GenomeInfoDb or txdbmaker.
Last updated 23 days ago
infrastructure, genomeassembly, annotation, genomeannotation, dataimport, bioconductor-package, core-package
10.03 score 1 stars 1.7k packages 4 scripts 53k downloads

txdbmaker - Tools for making TxDb objects from genomic annotations
A set of tools for making TxDb objects from genomic annotations from various sources (e.g. UCSC, Ensembl, and GFF files). These tools allow the user to download the genomic locations of transcripts, exons, and CDS, for a given assembly, and to import them in a TxDb object. TxDb objects are implemented in the GenomicFeatures package, together with flexible methods for extracting the desired features in convenient formats.
Last updated 23 days ago
infrastructure, dataimport, annotation, genomeannotation, genomeassembly, genetics, sequencing, bioconductor-package, core-package
9.29 score 2 stars 87 packages 77 scripts 4.9k downloads

pwalign - Perform pairwise sequence alignments
The two main functions in the package are pairwiseAlignment() and stringDist(). The former solves (Needleman-Wunsch) global alignment, (Smith-Waterman) local alignment, and (ends-free) overlap alignment problems. The latter computes the Levenshtein edit distance or pairwise alignment score matrix for a set of strings.
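The Levenshtein edit distance mentioned above can be computed with the textbook dynamic-programming recurrence. The sketch below is a plain Python illustration of that recurrence, not the implementation behind stringDist(); the function names `levenshtein` and `distance_matrix` are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the current prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,                        # deletion
                            curr[j - 1] + 1,                    # insertion
                            prev[j - 1] + substitution_cost))   # match or substitution
        prev = curr
    return prev[-1]

def distance_matrix(strings):
    """Symmetric matrix of pairwise edit distances for a set of strings."""
    return [[levenshtein(s, t) for t in strings] for s in strings]

seqs = ["ACGT", "AGT", "ACGG"]
matrix = distance_matrix(seqs)
```

Only two rows of the dynamic-programming table are kept at a time, so memory use grows with the length of the second string rather than with the product of both lengths.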
Last updated 23 days ago
alignment, sequencematching, sequencing, genetics, bioconductor-package
8.10 score 1 stars 106 packages 21 scripts 7.5k downloads

BSgenomeForge - Forge your own BSgenome data package
A set of tools to forge BSgenome data packages. Supersedes the old seed-based tools from the BSgenome software package. This package allows the user to create a BSgenome data package in one function call, simplifying the old seed-based process.
Last updated 23 days ago
infrastructure, datarepresentation, genomeassembly, annotation, genomeannotation, sequencing, alignment, dataimport, sequencematching, bioconductor-package, core-package
6.30 score 4 stars 4 scripts 244 downloads

SplicingGraphs - Create, manipulate, visualize splicing graphs, and assign RNA-seq reads to them
This package allows the user to create, manipulate, and visualize splicing graphs and their bubbles based on a gene model for a given organism. Additionally it allows the user to assign RNA-seq reads to the edges of a set of splicing graphs, and to summarize them in different ways.
Last updated 23 days ago
genetics, annotation, datarepresentation, visualization, sequencing, rnaseq, geneexpression, alternativesplicing, transcription, immunooncology, bioconductor-package
5.26 score 2 stars 8 scripts 278 downloads

updateObject - Find/fix old serialized S4 instances
A set of tools built around updateObject() to work with old serialized S4 instances. The package is primarily useful to package maintainers who want to update the serialized S4 instances included in their package. This is still work-in-progress.
Last updated 23 days ago
infrastructure, datarepresentation, bioconductor-package, core-package
4.48 score 1 stars 3 scripts 98 downloads