You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi,
I was using ALLCools on the Lee et al. 2019 single-cell methylation data (https://pubmed.ncbi.nlm.nih.gov/31501549/). The code I use to generate the mcds data is something like below:
I load the output data like what is in the example. However, when I look at the data using mcds.get_adata(mc_type = "CHN").to_df(), the data shown is like below:
Is this the problem of the data of am I doing something wrong?
Edit: I guess it is the easiest if you can tell me how you do integration between single-cell methylation and other transcriptomics data in a bit more detail? Especially how to form the cell by gene matrix for methylation? Previously, I directly calculate mCH fraction by first filtering out those CG reads, and among all the rest CH reads, for each cell, I calculate the mean of the fifth column in the gene body divided by the mean methylation of the whole cell. However, this approach failed. Below is the r file if you are interested.
library(foreach)
data(Annotations)
# generate a gene annotation data for hg19.
colnames(hg19Annotations) = c("chr", "start", "end", "strand", "names")
hg19Annotations = hg19Annotations[order(hg19Annotations$names), ]
genes = makeGRangesFromDataFrame(hg19Annotations, keep.extra.columns = TRUE)
# list all the files, all of them are tsv.gz methylation files.
filenames <- list.files('/storage08/kwangmoon/data/Lee2019/methylation/rawdata', full.names = TRUE)
names <- list.files('/storage08/kwangmoon/data/Lee2019/methylation/rawdata', full.names = FALSE)
registerDoParallel(8)
foreach(cell = 1:length(filenames)) %dopar% {
tmp = fread(filenames[cell], select = c(1, 2, 4, 5, 6), fill = TRUE)
if(!grepl('chr', tmp$V1[1])){tmp$V1 = paste0("chr", tmp$V1)} #add "chr" in the front of chromosomes
tmpmCG <- tmp %>% filter(!grepl("CG", V4)) # eliminate all CG and keep only CH.
tmpmCG <- tmpmCG %>% filter(V1 %in% paste0("chr", c(1:22, "X", "Y"))) # only use chr1-22, X, Y.
methy = GRanges(tmpmCG$V1, IRanges(tmpmCG$V2, width = 1))
overlapPos = findOverlaps(methy, genes) # get all the reads that overlap with genes.
output = data.frame(tmpmCG[queryHits(overlapPos), ],
genes = hg19Annotations[subjectHits(overlapPos), ]$names)
mCHmean = mean(output$V5) # this is the mean methylation for all cells
write.table(output %>% group_by(genes) %>% summarize(Methyl_rate = mean(V5) / mCHmean), # we calculate mean methylation in the gene body divide by the mean methylation of all cells
file = paste0("/p/keles/schic/volumeD/HumanPatchSeq/MethylGene/", names[cell]), row.names = FALSE, col.names = FALSE, quote = FALSE, sep = "\t")
}
The text was updated successfully, but these errors were encountered:
Hi,
I was using ALLCools on the Lee et al. 2019 single-cell methylation data (https://pubmed.ncbi.nlm.nih.gov/31501549/). The code I use to generate the mcds data is something like below:
allcools generate-dataset --allc_table /p/keles/schic/volumeD/HumanPatchSeq/allcfiles.tsv --output_path /p/keles/schic/volumeD/HumanPatchSeq/Lee2019.mcds --chrom_size_path hg19.chrom.sizes--obs_dim cell --cpu 30 --chunk_size 50 --regions geneslop2k /p/keles/schic/volumeD/HumanPatchSeq/hg19.bed --quantifiers geneslop2k count CGN,CHN
I load the output data like what is in the example. However, when I look at the data using mcds.get_adata(mc_type = "CHN").to_df(), the data shown is like below:
Some cells are consistently 10 among all genes. Here is the result if you want to check it. https://drive.google.com/drive/folders/1PQ5jhvTFuRsm9qn8oVDRS9MzCkiMQnx3?usp=share_link
Is this the problem of the data of am I doing something wrong?
Edit: I guess it is the easiest if you can tell me how you do integration between single-cell methylation and other transcriptomics data in a bit more detail? Especially how to form the cell by gene matrix for methylation? Previously, I directly calculate mCH fraction by first filtering out those CG reads, and among all the rest CH reads, for each cell, I calculate the mean of the fifth column in the gene body divided by the mean methylation of the whole cell. However, this approach failed. Below is the r file if you are interested.
The text was updated successfully, but these errors were encountered: