R

Compiling with libxml in MacOS Catalina

The problem: When compiling Emacs 28.05, make bootstrap failed with fatal error: 'libxml/tree.h' file not found, even though autogen.sh and configure had run successfully. Troubleshooting: The error message pointed out the culprit, libxml.

Strangled by factors

I am exaggerating, but sometimes stringsAsFactors is almost this deadly. I work with genomic data, and a common quest in my job is to identify interesting features (in most cases, genes) from a pool of 25,000+.

Migrating from Medium to Blogdown

After struggling for a while, I decided to move from Medium to blogdown. While Medium is a beautiful platform for blogging, its philosophy seems a poorer fit when there is more than articles to host.

Dealing with dependency without sudo like a dummy (again)

The more I work with Linux, the more I encounter dependency issues. This is of course not too big a surprise, but it can be painful, especially when you don’t have sudo privileges, so the most obvious solution does not work for you.

Reverse and find complement sequence in R

Recently, I have been continually amazed by how a seemingly simple task is actually implemented in a sophisticated way. I guess I take so many things for granted just because they were implemented and refined to such an extent that I don’t even notice them.

Single or double?: AND operator and OR operator in R

One classmate complained about having trouble subsetting a data frame to keep non-zero rows, like: # I don't want rows of zero here! non_zero <- rna_seq[wt != 0 && mutant !

Installing R package XML on MacOS 10.13.6

I updated my R packages the other day, and not surprisingly, one package failed to compile. This time, it was XML. The error message read configure: error: “libxml not found”, but Homebrew reported that libxml2 was installed and up to date.

K-means exercise in R language

As a novice in genomic data analysis, one of my goals is to benchmark how well a clustering method works. I ran across this k-means practice at R-exercises the other day and felt it might be a nice start, because k-means is easy to perform and conceptually simple enough for me to follow what is happening behind the clustering machinery.

Using Limma to find differentially expressed genes

Ritchie, ME, Phipson, B, Wu, D, Hu, Y, Law, CW, Shi, W, and Smyth, GK (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research 43(7), e47.

Remote connection to Jupyter Notebook

Recently, I analyzed a few single-cell RNA-seq datasets and experimented with several new tools from recent publications. While it was fun, most datasets were just too large for my poor laptop to process, and I relied a lot on our server.