Annotate regions with nearest gene and promoter information
Source: R/annotate_re_genes.R
annotate_with_nearest.Rd
This function uses ChIPseeker-like logic to annotate peaks with the distance to the nearest TSS (transcription start site), classifies each peak as Promoter (less than 2 kb from the TSS) or Distal, and adds the nearest gene ID using BioMart-annotated genes. Optionally, it can also flag regions that directly overlap a TSS, using 1-bp TSS coordinates.
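The classification described above can be illustrated with a small base-R sketch. This is not the package's implementation: the helper names (dist_to_pos, annotate_toy), the toy coordinates, and the gene names are all hypothetical, and the sketch ignores strand and the protein-coding and mitochondrial filters. It only shows the idea of taking the nearest TSS, applying a 2 kb cutoff, and flagging direct overlap with a 1-bp TSS.

```r
# Toy regions to annotate (coordinates are made up for illustration).
regions <- data.frame(
  chr   = c("chr1", "chr1", "chr1", "chr2"),
  start = c(900, 5000, 20000, 100),
  end   = c(1100, 5200, 20500, 300)
)

# Toy 1-bp TSS table; gene names are hypothetical.
tss <- data.frame(
  chr  = c("chr1", "chr1", "chr2"),
  pos  = c(1000, 6000, 5000),
  gene = c("geneA", "geneB", "geneC")
)

# Distance from an interval [start, end] to single positions:
# 0 if the position lies inside the interval, otherwise the gap in bp.
dist_to_pos <- function(start, end, pos) {
  pmax(start - pos, pos - end, 0)
}

# Assumed simplification: unsigned distance, no strand handling.
annotate_toy <- function(regions, tss, promoter_cutoff = 2000) {
  out <- regions
  out$distanceToTSS <- NA_real_
  out$nearestGeneSymbol <- NA_character_
  for (i in seq_len(nrow(regions))) {
    cand <- tss[tss$chr == regions$chr[i], ]
    if (nrow(cand) == 0) next  # no TSS on this chromosome: leave NA
    d <- dist_to_pos(regions$start[i], regions$end[i], cand$pos)
    j <- which.min(d)
    out$distanceToTSS[i] <- d[j]
    out$nearestGeneSymbol[i] <- cand$gene[j]
  }
  # Promoter if closer than the cutoff, otherwise Distal.
  out$annotation <- ifelse(out$distanceToTSS < promoter_cutoff, "Promoter", "Distal")
  # A distance of 0 means the region contains the 1-bp TSS.
  out$in_TSS <- out$distanceToTSS == 0
  out
}

result <- annotate_toy(regions, tss)
print(result)
```

In this toy input, the first region contains a TSS, the second lies 800 bp from one, and the last two are more than 2 kb away, so they fall into the Distal class.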
Usage
annotate_with_nearest(
retsi,
txdb,
annot_dbi,
protein_coding_only = TRUE,
keep_mito = FALSE,
verbose = TRUE,
add_tss_annotation = FALSE
)
Arguments
- retsi
A GRanges object or data.frame containing regulatory elements to annotate.
- txdb
A TxDb object used to extract promoter regions.
- annot_dbi
An AnnotationDbi annotation database object (e.g., org.Hs.eg.db) for mapping gene IDs to gene symbols and types.
- protein_coding_only
Logical indicating whether to restrict promoter regions to protein-coding genes only (default: TRUE).
- keep_mito
Logical indicating whether to keep mitochondrial (chrM/MT) chromosomes in promoter extraction (default: FALSE).
- verbose
Logical; if TRUE, informative messages are printed during processing (default: TRUE).
- add_tss_annotation
Logical; whether to also annotate whether a region overlaps a TSS (default: FALSE).
Value
A data.frame with the following columns:
- distanceToTSS: Distance to the nearest TSS
- annotation: Promoter/Distal classification based on distance
- nearestGeneSymbol: Symbol of the nearest gene
- in_TSS (optional): Logical indicating if RE overlaps a TSS
- TSS_gene (optional): Gene symbol for overlapping TSS
- Other original metadata
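A call might look like the following sketch. It assumes the package providing annotate_with_nearest is already loaded, and it picks TxDb.Hsapiens.UCSC.hg38.knownGene and org.Hs.eg.db purely as example annotation sources; whether that particular pairing matches the gene IDs the function expects is an assumption, not something this page confirms. The input coordinates are arbitrary.

```r
# Run only when the Bioconductor packages and the function itself are available.
needed <- c("GenomicRanges", "IRanges",
            "TxDb.Hsapiens.UCSC.hg38.knownGene", "org.Hs.eg.db")
have_pkgs <- all(vapply(needed, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs && exists("annotate_with_nearest")) {
  # Two made-up regions on chr1 (coordinates chosen arbitrarily).
  retsi <- GenomicRanges::GRanges(
    seqnames = "chr1",
    ranges = IRanges::IRanges(start = c(1000000, 2000000), width = 500)
  )

  annotated <- annotate_with_nearest(
    retsi,
    txdb = TxDb.Hsapiens.UCSC.hg38.knownGene::TxDb.Hsapiens.UCSC.hg38.knownGene,
    annot_dbi = org.Hs.eg.db::org.Hs.eg.db,
    protein_coding_only = TRUE,
    keep_mito = FALSE,
    verbose = FALSE,
    add_tss_annotation = TRUE  # request the optional in_TSS / TSS_gene columns
  )

  # Inspect the columns documented under Value.
  print(head(annotated[, c("distanceToTSS", "annotation", "nearestGeneSymbol")]))
  print(table(annotated$annotation))
}
```

Setting add_tss_annotation = TRUE is shown here only to request the optional columns; with the default (FALSE) the in_TSS and TSS_gene columns should not be expected in the result.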