R packages by hpages

GenomicRanges - Representation and manipulation of genomic intervals

The ability to efficiently represent and manipulate genomic annotations and alignments plays a central role in the analysis of high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general-purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.

Last updated 6 days ago

genetics infrastructure datarepresentation sequencing annotation genomeannotation coverage bioconductor-package core-package

17.85 score 45 stars 1.3k packages 14k scripts 87k downloads
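
To make the notion of a genomic interval concrete, here is a minimal conceptual sketch in Python. It is not the GenomicRanges API: the `Interval` type and the `find_overlaps` helper are hypothetical names, and the sketch assumes 1-based closed intervals and treats the strand "*" as compatible with any strand.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """Hypothetical genomic interval: sequence name, closed range, strand."""
    seqname: str
    start: int          # 1-based, inclusive (assumption of this sketch)
    end: int            # inclusive
    strand: str = "*"   # "+", "-", or "*" for unstranded


def overlaps(a: Interval, b: Interval) -> bool:
    """True if a and b share at least one position on compatible strands."""
    if a.seqname != b.seqname:
        return False
    if "*" not in (a.strand, b.strand) and a.strand != b.strand:
        return False
    return a.start <= b.end and b.start <= a.end


def find_overlaps(query: List[Interval], subject: List[Interval]) -> List[Tuple[int, int]]:
    """Return (query index, subject index) pairs; quadratic scan for clarity only."""
    return [(i, j)
            for i, q in enumerate(query)
            for j, s in enumerate(subject)
            if overlaps(q, s)]


genes = [Interval("chr1", 100, 200, "+"), Interval("chr2", 50, 80, "-")]
reads = [Interval("chr1", 190, 210, "+"),
         Interval("chr1", 150, 160, "-"),
         Interval("chr2", 80, 90)]
hits = find_overlaps(reads, genes)
```

The second read is not reported because it lies on the opposite strand of the only gene it spans. A production implementation would index the subject intervals rather than compare every pair.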

Biostrings - Efficient manipulation of biological strings

Memory-efficient string containers, string matching algorithms, and other utilities for fast manipulation of large biological sequences or sets of sequences.

Last updated 15 days ago

sequencematching alignment sequencing genetics dataimport datarepresentation infrastructure bioconductor-package core-package

17.79 score 57 stars 1.2k packages 8.9k scripts 96k downloads

SummarizedExperiment - A container (S4 class) for matrix-like assays

The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.

Last updated 23 days ago

genetics infrastructure sequencing annotation coverage genomeannotation bioconductor-package core-package

17.03 score 33 stars 1.2k packages 9.5k scripts 82k downloads

GenomeInfoDb - Utilities for manipulating chromosome names, including modifying them to follow a particular naming style

Contains data and functions that define and allow translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1"), including a function that attempts to place sequence names in their natural, rather than lexicographic, order.

Last updated 6 days ago

genetics datarepresentation annotation genomeannotation bioconductor-package core-package

16.49 score 31 stars 1.7k packages 1.3k scripts 114k downloads
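
The difference between lexicographic and natural ordering of sequence names, and a simple prefix-based translation between naming styles, can be illustrated with a short Python sketch. This is not GenomeInfoDb code: `strip_prefix`, `add_prefix`, `natural_key`, and the `SPECIAL_ORDER` table are hypothetical, and real naming conventions differ in more ways than the "chr" prefix handled here.

```python
# Assumed ranking for non-numeric chromosomes in this sketch only.
SPECIAL_ORDER = {"X": 0, "Y": 1, "M": 2, "MT": 2}


def strip_prefix(name: str) -> str:
    """'chr1' -> '1'; names without the prefix are returned unchanged."""
    return name[3:] if name.startswith("chr") else name


def add_prefix(name: str) -> str:
    """'1' -> 'chr1'; names that already carry the prefix are returned unchanged."""
    return name if name.startswith("chr") else "chr" + name


def natural_key(name: str):
    """Sort key: numeric chromosomes by value, then X/Y/M, then anything else."""
    core = strip_prefix(name)
    if core.isdigit():
        return (0, int(core), "")
    if core in SPECIAL_ORDER:
        return (1, SPECIAL_ORDER[core], "")
    return (2, 0, core)


names = ["chr10", "chrX", "chr2", "chrM", "chr1"]
lexicographic = sorted(names)               # plain string comparison
natural = sorted(names, key=natural_key)    # numeric-aware order
bare = [strip_prefix(n) for n in natural]   # prefix-free style
```

Plain string sorting places "chr10" before "chr2", which is the behavior the natural ordering avoids.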

S4Vectors - Foundation of vector-like and list-like containers in Bioconductor

The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).

Last updated 6 days ago

infrastructure datarepresentation bioconductor-package core-package

16.10 score 18 stars 1.8k packages 1.0k scripts 110k downloads
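
Of the concrete classes mentioned above, Rle is named after run-length encoding. The idea behind that encoding can be sketched in a few lines of Python; this illustrates the technique only, not the S4Vectors implementation, and `rle_encode` and `rle_decode` are hypothetical helper names.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive equal values into (value, run length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run length) pairs back into the full vector."""
    return [value for value, length in runs for _ in range(length)]


depth = [0, 0, 0, 1, 1, 2, 2, 2, 0]
runs = rle_encode(depth)
restored = rle_decode(runs)
```

The encoding pays off when a vector contains long stretches of repeated values, as is common for variables defined along a genome.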

IRanges - Foundation of integer range manipulation in Bioconductor

Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.

Last updated 6 days ago

infrastructure datarepresentation bioconductor-package core-package

16.09 score 22 stars 1.8k packages 2.1k scripts 105k downloads
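
One typical operation in an algebra of integer ranges is merging overlapping ranges into a minimal set. The Python sketch below illustrates that operation under stated assumptions: ranges are closed `(start, end)` pairs, and adjacent ranges such as (4, 8) and (9, 10) are merged as well. `reduce_ranges` is a hypothetical helper, not the IRanges API.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # closed integer range (start, end)


def reduce_ranges(ranges: List[Range]) -> List[Range]:
    """Merge overlapping or adjacent ranges into a sorted, disjoint list."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the last merged range: extend it.
            last_start, last_end = merged[-1]
            merged[-1] = (last_start, max(last_end, end))
        else:
            merged.append((start, end))
    return merged


reduced = reduce_ranges([(15, 20), (4, 8), (1, 5), (9, 10)])
```

Sorting first means every range only has to be compared with the most recently merged one, so the whole pass costs one sort plus a linear scan.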

DelayedArray - A unified framework for working transparently with on-disk and in-memory array-like datasets

Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects such as DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays, and data frames.

Last updated 7 days ago

infrastructure datarepresentation annotation genomeannotation bioconductor-package core-package

15.60 score 25 stars 1.2k packages 550 scripts 92k downloads
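
The two mechanisms described above, delaying an operation and processing the data block by block, can be sketched in Python as follows. This is a toy model, not the DelayedArray implementation: the `DelayedMatrix` class, its `map` and `col_sums` methods, and the `load_rows` callback standing in for on-disk access are all hypothetical.

```python
from typing import Callable, Iterator, List, Tuple

Row = List[float]


class DelayedMatrix:
    """Toy matrix that records element-wise operations instead of applying them."""

    def __init__(self, load_rows: Callable[[int, int], List[Row]], nrow: int, ops: Tuple = ()):
        self._load_rows = load_rows   # reads rows [start, stop) from the backend
        self.nrow = nrow
        self._ops = tuple(ops)        # pending element-wise functions

    def map(self, fn: Callable[[float], float]) -> "DelayedMatrix":
        """Delay fn: return a new object without touching the data."""
        return DelayedMatrix(self._load_rows, self.nrow, self._ops + (fn,))

    def blocks(self, block_size: int) -> Iterator[List[Row]]:
        """Load one block of rows at a time and apply the pending operations."""
        for start in range(0, self.nrow, block_size):
            block = self._load_rows(start, min(start + block_size, self.nrow))
            for fn in self._ops:
                block = [[fn(x) for x in row] for row in block]
            yield block

    def col_sums(self, block_size: int = 2) -> Row:
        """Column sums computed block by block, never holding the full matrix."""
        totals: Row = []
        for block in self.blocks(block_size):
            for row in block:
                totals = list(row) if not totals else [t + x for t, x in zip(totals, row)]
        return totals


data = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
loads = []  # records every backend access, to show when data is actually read


def load_rows(start: int, stop: int) -> List[Row]:
    loads.append((start, stop))
    return data[start:stop]


doubled = DelayedMatrix(load_rows, nrow=len(data)).map(lambda x: x * 2)
loads_before = list(loads)            # still empty: map() loaded nothing
sums = doubled.col_sums(block_size=2)
```

Only the call to `col_sums` triggers backend reads, and each read covers at most two rows.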

GenomicAlignments - Representation and manipulation of short genomic alignments

Provides efficient containers for storing and manipulating short genomic alignments (typically obtained by aligning short reads to a reference genome). This includes read counting, computing the coverage, junction detection, and working with the nucleotide content of the alignments.

Last updated 23 days ago

infrastructure dataimport genetics sequencing rnaseq snp coverage alignment immunooncology bioconductor-package core-package

15.50 score 9 stars 525 packages 3.0k scripts 47k downloads
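
Computing coverage means counting, for each reference position, how many alignments span it. A minimal Python sketch of that computation is shown below; it assumes each alignment is a single gap-free, 1-based closed interval on one reference sequence, which ignores the spliced and gapped alignments that the real package handles. `coverage` is a hypothetical helper name.

```python
from typing import List, Tuple


def coverage(alignments: List[Tuple[int, int]], seq_length: int) -> List[int]:
    """Per-position depth for 1-based closed intervals, via a difference array."""
    diff = [0] * (seq_length + 2)
    for start, end in alignments:
        diff[start] += 1      # depth rises where an alignment begins
        diff[end + 1] -= 1    # and falls just past where it ends
    depth, result = 0, []
    for pos in range(1, seq_length + 1):
        depth += diff[pos]
        result.append(depth)
    return result


cov = coverage([(1, 4), (3, 6), (5, 5)], seq_length=8)
```

The difference array keeps the cost proportional to the number of alignments plus the sequence length, instead of the total number of aligned bases.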

GenomicFeatures - Query the gene models of a given organism/assembly

Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.

Last updated 15 days ago

genetics infrastructure annotation sequencing genomeannotation bioconductor-package core-package

15.39 score 26 stars 340 packages 5.3k scripts 34k downloads

BiocGenerics - S4 generic functions used in Bioconductor

The package defines many S4 generic functions used in Bioconductor.

Last updated 7 days ago

infrastructure bioconductor-package core-package

14.04 score 12 stars 2.2k packages 606 scripts 108k downloads

BSgenome - Software infrastructure for efficient representation of full genomes and their SNPs

Infrastructure shared by all the Biostrings-based genome data packages.

Last updated 23 days ago

genetics infrastructure datarepresentation sequencematching annotation snp bioconductor-package core-package

13.17 score 9 stars 268 packages 1.1k scripts 25k downloads

HDF5Array - HDF5 datasets as array-like objects in R

The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.

Last updated 22 days ago

infrastructure datarepresentation dataimport sequencing rnaseq coverage annotation genomeannotation singlecell immunooncology bioconductor-package core-package

12.08 score 11 stars 129 packages 828 scripts 27k downloads

SparseArray - High-performance sparse data representation and manipulation in R

The SparseArray package provides array-like containers for efficient in-memory representation of multidimensional sparse data in R (arrays and matrices). The package defines the SparseArray virtual class and two concrete subclasses: COO_SparseArray and SVT_SparseArray. Each subclass uses its own internal representation of the nonzero multidimensional data: the "COO layout" and the "SVT layout", respectively. SVT_SparseArray objects mimic the behavior of ordinary matrix and array objects in base R as closely as possible. In particular, they support most of the "standard matrix and array API" defined in base R and in the matrixStats package from CRAN.

Last updated 7 days ago

infrastructure datarepresentation bioconductor-package core-package

11.44 score 8 stars 1.2k packages 45 scripts 75k downloads
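
As a rough illustration of a coordinate (COO) layout, the Python sketch below stores only the nonzero entries of a matrix as parallel lists of coordinates and values. It shows the general idea of coordinate storage, not the internals of COO_SparseArray, and it does not attempt to model the SVT layout; `dense_to_coo` and `coo_to_dense` are hypothetical helpers.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def dense_to_coo(matrix: Matrix) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Return (coordinates, values) for the nonzero entries, in row-major order."""
    coords, values = [], []
    for i, row in enumerate(matrix):
        for j, x in enumerate(row):
            if x != 0:
                coords.append((i, j))
                values.append(x)
    return coords, values


def coo_to_dense(coords: List[Tuple[int, int]], values: List[int],
                 shape: Tuple[int, int]) -> Matrix:
    """Rebuild the dense matrix; positions not listed are zero."""
    nrow, ncol = shape
    dense = [[0] * ncol for _ in range(nrow)]
    for (i, j), x in zip(coords, values):
        dense[i][j] = x
    return dense


m = [[0, 0, 7],
     [0, 0, 0],
     [5, 0, 0]]
coords, values = dense_to_coo(m)
roundtrip = coo_to_dense(coords, values, shape=(3, 3))
```

Here two stored entries stand in for nine cells; the saving grows with the proportion of zeros.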

S4Arrays - Foundation of array-like containers in Bioconductor

The S4Arrays package defines the Array virtual class to be extended by other S4 classes that wish to implement a container with array-like semantics. It also provides: (1) low-level functionality meant to help developers of such containers implement basic operations like display, subsetting, or coercion of their array-like objects to an ordinary matrix or array, and (2) a framework that facilitates block processing of array-like objects (typically on-disk objects).

Last updated 23 days ago

infrastructure datarepresentation bioconductor-package core-package

11.35 score 5 stars 1.2k packages 7 scripts 73k downloads

Rhtslib - HTSlib high-throughput sequencing library as an R package

This package provides version 1.18 of the 'HTSlib' C library for high-throughput sequence analysis. The package is primarily useful to developers of other R packages who wish to make use of HTSlib. Motivation and instructions for use of this package are in the vignette, vignette(package="Rhtslib", "Rhtslib").

Last updated 23 days ago

dataimport sequencing bioconductor-package core-package

11.34 score 11 stars 590 packages 3 scripts 47k downloads

XVector - Foundation of external vector representation and manipulation in Bioconductor

Provides memory-efficient S4 classes for storing sequences "externally" (e.g. behind an R external pointer, or on disk).

Last updated 23 days ago

infrastructure datarepresentation bioconductor-package core-package

11.06 score 2 stars 1.7k packages 67 scripts 94k downloads

UCSC.utils - Low-level utilities to retrieve data from the UCSC Genome Browser

A set of low-level utilities to retrieve data from the UCSC Genome Browser. Most functions in the package access the data via the UCSC REST API but some of them query the UCSC MySQL server directly. Note that the primary purpose of the package is to support higher-level functionalities implemented in downstream packages like GenomeInfoDb or txdbmaker.

Last updated 23 days ago

infrastructure genomeassembly annotation genomeannotation dataimport bioconductor-package core-package

10.03 score 1 star 1.7k packages 4 scripts 53k downloads

txdbmaker - Tools for making TxDb objects from genomic annotations

A set of tools for making TxDb objects from genomic annotations from various sources (e.g. UCSC, Ensembl, and GFF files). These tools allow the user to download the genomic locations of transcripts, exons, and CDS, for a given assembly, and to import them in a TxDb object. TxDb objects are implemented in the GenomicFeatures package, together with flexible methods for extracting the desired features in convenient formats.

Last updated 23 days ago

infrastructure dataimport annotation genomeannotation genomeassembly genetics sequencing bioconductor-package core-package

9.29 score 2 stars 87 packages 77 scripts 4.9k downloads

pwalign - Perform pairwise sequence alignments

The two main functions in the package are pairwiseAlignment() and stringDist(). The former solves (Needleman-Wunsch) global alignment, (Smith-Waterman) local alignment, and (ends-free) overlap alignment problems. The latter computes the Levenshtein edit distance or pairwise alignment score matrix for a set of strings.

Last updated 23 days ago

alignment sequencematching sequencing genetics bioconductor-package

8.10 score 1 star 106 packages 21 scripts 7.5k downloads
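
The Levenshtein edit distance mentioned above is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. A minimal Python sketch of the standard dynamic-programming computation, plus a pairwise distance matrix for a set of strings, follows. It illustrates the technique only and is not the pwalign implementation; `levenshtein` and `distance_matrix` are hypothetical names.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit costs, keeping only two rows of the DP table."""
    prev = list(range(len(b) + 1))          # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def distance_matrix(strings: List[str]) -> List[List[int]]:
    """Symmetric matrix of pairwise edit distances."""
    return [[levenshtein(s, t) for t in strings] for s in strings]


d = levenshtein("kitten", "sitting")
matrix = distance_matrix(["ACGT", "AGT", "ACGA"])
```

Global alignment in the Needleman-Wunsch style fills a table of the same shape, with substitution scores and gap penalties in place of unit costs.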

BSgenomeForge - Forge your own BSgenome data package

A set of tools to forge BSgenome data packages. Supersedes the old seed-based tools from the BSgenome software package. This package allows the user to create a BSgenome data package in one function call, simplifying the old seed-based process.

Last updated 23 days ago

infrastructure datarepresentation genomeassembly annotation genomeannotation sequencing alignment dataimport sequencematching bioconductor-package core-package

6.30 score 4 stars 4 scripts 244 downloads

SplicingGraphs - Create, manipulate, visualize splicing graphs, and assign RNA-seq reads to them

This package allows the user to create, manipulate, and visualize splicing graphs and their bubbles based on a gene model for a given organism. Additionally it allows the user to assign RNA-seq reads to the edges of a set of splicing graphs, and to summarize them in different ways.

Last updated 23 days ago

genetics annotation datarepresentation visualization sequencing rnaseq geneexpression alternativesplicing transcription immunooncology bioconductor-package

5.26 score 2 stars 8 scripts 278 downloads

updateObject - Find/fix old serialized S4 instances