
This function uses ChIPseeker-like logic to annotate each input region (e.g., a peak or other regulatory element) with its distance to the nearest transcription start site (TSS), classifies the region as Promoter (<2 kb from the TSS) or Distal, and adds the nearest gene ID using BioMart-annotated genes. Optionally, it can also report whether a region directly overlaps a TSS, using 1-bp TSS coordinates.
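The classification described above can be illustrated with a minimal base R sketch. It is not the function's actual implementation: the coordinates, the gene names, and the helper `dist_to_pos()` are hypothetical, all regions are assumed to lie on one chromosome, and strand is ignored.

```r
# Hypothetical regions and 1-bp TSS positions on a single chromosome
# (illustrative values only, not taken from any real annotation)
regions <- data.frame(start = c(100, 5000, 12000),
                      end   = c(300, 5400, 12500))
tss <- data.frame(gene = c("geneA", "geneB"),
                  pos  = c(1000, 11000),
                  stringsAsFactors = FALSE)

# Distance from an interval to a position; 0 if the position lies inside it
dist_to_pos <- function(start, end, pos) {
  pmax(start - pos, pos - end, 0)
}

# For each region, keep the TSS with the smallest distance
nearest <- lapply(seq_len(nrow(regions)), function(i) {
  d <- dist_to_pos(regions$start[i], regions$end[i], tss$pos)
  j <- which.min(d)
  data.frame(distanceToTSS = d[j],
             nearestGeneSymbol = tss$gene[j],
             stringsAsFactors = FALSE)
})
annotated <- cbind(regions, do.call(rbind, nearest))

# Promoter if closer than 2 kb to the nearest TSS, Distal otherwise
annotated$annotation <- ifelse(annotated$distanceToTSS < 2000,
                               "Promoter", "Distal")
print(annotated)
```

With real data, the same idea would normally be expressed on GRanges objects rather than plain data frames.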

Usage

annotate_with_nearest(
  retsi,
  txdb,
  annot_dbi,
  protein_coding_only = TRUE,
  keep_mito = FALSE,
  verbose = TRUE,
  add_tss_annotation = FALSE
)

Arguments

retsi

A GRanges object or data.frame containing regulatory elements to annotate.

txdb

A TxDb object used to extract promoter regions.

annot_dbi

An AnnotationDbi annotation database object (e.g., org.Hs.eg.db) for mapping gene IDs to gene symbols and types.

protein_coding_only

Logical indicating whether to restrict promoter regions to protein-coding genes only (default: TRUE).

keep_mito

Logical indicating whether to keep mitochondrial (chrM/MT) chromosomes in promoter extraction (default: FALSE).

verbose

Logical; if TRUE, informative messages are printed during processing (default: TRUE).

add_tss_annotation

Logical; whether to also annotate whether a region overlaps a TSS (default: FALSE).
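As a rough illustration of what an overlap test against 1-bp TSS coordinates involves, here is a minimal base R sketch. The data and column values are hypothetical, a single chromosome is assumed, and the function itself may resolve overlaps differently (for example when several TSSs fall inside one region).

```r
# Hypothetical regions and 1-bp TSS positions on a single chromosome
regions <- data.frame(start = c(900, 5000),
                      end   = c(1100, 5400))
tss <- data.frame(gene = c("geneA", "geneB"),
                  pos  = c(1000, 11000),
                  stringsAsFactors = FALSE)

# For each region, index of the first TSS falling inside [start, end], or NA
hit <- vapply(seq_len(nrow(regions)), function(i) {
  inside <- which(tss$pos >= regions$start[i] & tss$pos <= regions$end[i])
  if (length(inside) > 0) inside[1] else NA_integer_
}, integer(1))

regions$in_TSS   <- !is.na(hit)   # TRUE when a TSS lies within the region
regions$TSS_gene <- tss$gene[hit] # NA when no TSS overlaps
print(regions)
```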

Value

A data.frame with the following columns:

  • distanceToTSS: Distance to the nearest TSS

  • annotation: Promoter/Distal classification based on distance

  • nearestGeneSymbol: Symbol of the nearest gene

  • in_TSS (optional): Logical indicating whether the regulatory element overlaps a TSS

  • TSS_gene (optional): Gene symbol of the overlapping TSS

  • Any other metadata columns present in the original input