Part 1: Is building a predictive model the right thing to do?

Photo by NeONBRAND on Unsplash

I admit the title is pretty provocative, but do read on to see whether it makes sense by the end. What motivated me to write this article was my growing sense that the increased availability of tools and resources for building such models (e.g. R, Python) has come with what I perceive to be a disproportionately small increase in awareness of how to use those tools and resources properly.

I find it pretty easy to imagine a scenario where a beginner looking to start doing some predictive modeling turns first to Google. If you Google “how…

Getting Started

Photo by Alvaro Reyes on Unsplash


Full disclaimer, I know the title is a little “clickbaity”, but after years of writing R scripts for data analysis, I believe I’ve arrived at a solid picture of what a fully reproducible R workflow should aspire to be. This article is NOT intended for beginners but rather for advanced R users who write functionalized code and may already have personal workflows for managing many scripts. Also, please keep in mind that this article is entirely my opinion, as everyone may have their own notion of an “ultimate” workflow. Hopefully, this article will give you some new ideas!

Imagine you…

A Step-by-step Example using Differential Gene Expression Analysis


This article is aimed at people who are looking to “break into” the bioinformatics realm and have experience with R (ideally using the tidyverse). Bioinformatics can be a scary-sounding concept (at least it is for me) because it is such a vast and fast-developing field that it can be difficult to define exactly what it is. I’ve always thought that bioinformatics was a highly advanced field beyond what I was capable of doing, and that I would need years of technical training to begin actually doing it. …

A Step-by-step Example Using mtcars


This article is aimed at people who have experience with R (ideally the tidyverse) and want to learn how to start making Shiny apps. For those who haven’t heard of Shiny before, it’s a package that allows you to create web applications using R without needing to know any HTML, CSS, or JavaScript. That being said, if you do want to get deeper into app development, learning HTML, CSS, and JavaScript will let you do more powerful things and give you more control over the app development process. However, with access to so many tools, it can be overwhelming…

Photo by Brett Jordan on Unsplash

When I was searching for jobs, I had a lot of time on my hands. Add to that COVID, and well, I had a lot of time on my hands. One day I thought to myself that I could use a little practice with dplyr, given the relatively recent updates with dplyr 1.0.0+ and my personal philosophy that you can always practice fundamental skills. My first thought was to Google “dplyr practice problems”, but if you were to do that right now, you’d find a bunch of tutorial websites that have pretty basic dplyr problems. For intermediate and advanced users…

Getting Started

A More Powerful “Online” Approach

Photo by David Travis on Unsplash


This article is not meant to be a technical article, nor is it meant to be a comprehensive survey of all the different methods that control Type I and Type II error rates. It assumes some background knowledge and focuses primarily on motivating a novel paradigm for combating the multiple hypothesis testing problem and on introducing a set of tools in R and R Shiny that you can use.


If you’ve ever done statistics or read a research paper about a discovery before, the number 0.05 should ring a bell. It refers to a significance threshold
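To make the multiple testing problem concrete, here is a minimal sketch in base R. It is not the "online" approach the article goes on to describe. It assumes every null hypothesis is true and that the tests are independent, and the test counts in `m` are made up for illustration. Under those assumptions, the chance of at least one false positive at the 0.05 threshold is 1 - (1 - 0.05)^m.

```r
# Sketch under stated assumptions: all nulls true, tests independent.
alpha <- 0.05            # per-test significance threshold
m <- c(1, 5, 20, 100)    # hypothetical numbers of tests

# Probability of at least one false positive across m tests
fwer <- 1 - (1 - alpha)^m

# Show the result as a small table
print(data.frame(tests = m, prob_any_false_positive = round(fwer, 3)))
```

With a single test the probability is just the threshold itself, and it rises steadily as more tests are run at the same threshold.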

Getting Started

Photo by Jonathan Borba on Unsplash

“Can you describe what’s going on in these Kaplan-Meier curves?” the interviewers asked me. I of course knew what those were, and I was admittedly stunned when they prodded me to say more — I didn’t know what else I could say. So, I stumbled through an answer, and after a while, the interviewers nodded their heads and thanked me for my time.

I didn’t get the job.

Even though I put survival analysis as one of my skills on my CV — and I had legitimately studied and worked with it — I realized afterwards that I still had…

An attempt at an “unbiased” perspective from a tidyverse fanboy

Photo by Kelly Sikkema on Unsplash

I recently participated in a relatively popular Stack Overflow “contest” (what would “popular” even mean on Stack Overflow??), where the prompt was to write a more “elegant” dplyr or tidyverse solution than the one presented.

The problem statement was to perform two regressions: 1) dep ~ cov_a + cont_a + cont_b and 2) dep ~ cov_b + cont_a + cont_b.

This was the original posted code:

map(.x = names(df)[grepl("cov_", names(df))],
    ~ df %>%
      nest() %>%
      mutate(res = map(data, function(y) tidy(lm(dep ~ cont_a + cont_b + !!sym(.x), data = y)))) %>%

and this was the sample dataset provided:

df <…
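For readers who want to see the shape of the problem, here is a minimal base R sketch of the two regressions from the problem statement. It is neither the original poster's code nor any of the contest answers. Because the sample dataset is cut off above, the `toy` data frame below is entirely made up, and only its column names follow the problem statement.

```r
# Hypothetical stand-in data; the real sample dataset is not shown above.
set.seed(1)
toy <- data.frame(
  dep    = rnorm(20),
  cov_a  = rnorm(20),
  cov_b  = rnorm(20),
  cont_a = rnorm(20),
  cont_b = rnorm(20)
)

# Pick out the covariate columns by their "cov_" prefix
cov_names <- grep("^cov_", names(toy), value = TRUE)

# Fit one model per covariate: dep ~ cov_x + cont_a + cont_b
fits <- lapply(cov_names, function(cv) {
  lm(reformulate(c(cv, "cont_a", "cont_b"), response = "dep"), data = toy)
})
names(fits) <- cov_names

# Coefficient estimates for each of the two models
coefs <- lapply(fits, coef)
print(coefs)
```

Building each formula with `reformulate()` avoids non-standard evaluation, which is one way to sidestep the `!!sym(.x)` step in the posted code. Whether that counts as more "elegant" is a matter of taste.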

Photo by Tianyi Ma on Unsplash

For the past couple of months, I’ve been building a Shiny app that researchers can use to control something called the False Discovery Rate. You can check it out here; I’ll probably write an article about it in the future. Along the way, I learned a lot of cool features from various sources: random Stack Exchange posts, Dean Attali’s blog, and Appsilon’s blog, to name a few. I’ve decided to list some of them here in this article in no particular order. …

After reading the article, which one do you think represents a “&” and which a “&&”?

If you’ve been using R for a while now, you may have come across the double “&” operator. Most people who’ve coded before, whether in R or some other language, have an intuitive feel for what the “&” represents. It’s a logical AND statement. “The sky is blue AND cows can fly” is a logically false statement because even though the sky is blue, the second part of the statement is false. So what the heck then does a “&&” represent?

If you look up the help page using ?"&&", you will read “& and && indicate logical AND…The shorter form…
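A small base R sketch may help make the difference concrete. The vectors `x` and `y` are made up for illustration. The example keeps both operands of `&&` to length one, because how `&&` treats longer vectors has changed across R versions (older versions used only the first element, while newer ones raise an error).

```r
# Hypothetical logical vectors for illustration
x <- c(TRUE, TRUE, FALSE)
y <- c(TRUE, FALSE, FALSE)

# "&" compares element by element and returns a vector
elementwise <- x & y

# "&&" works on single values and returns a single value
single <- TRUE && FALSE

# "&&" stops as soon as the answer is known: the left side is FALSE,
# so the right side (which would throw an error) is never evaluated
short_circuit <- FALSE && stop("never evaluated")

print(elementwise)
print(single)
print(short_circuit)
```

This is why `&&` is the usual choice inside `if ()` conditions, where a single TRUE or FALSE is expected, while `&` suits filtering and other vectorized work.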

Lathan Liou

Data Scientist at Merck. Tidyverse enthusiast and a neRd.
