I have long wanted to write R packages. But for some reason, I thought that I wasn’t capable of such a feat. “R package authors are superstars, I’m just me,” I would think in a mix of despair and admiration, marveling at some new and useful R package, and wishing I could create such functional beauty.
But now the day I have long dreamed of has arrived! I have authored an R package, and it is perhaps the most satisfying feeling I have ever had using R. Trust that I have had many R-related feelings, so I do not make this statement lightly. On top of now holding myself in higher regard, I am also wondering: what the heck took me so long?
I feel like I am on a roll here. Last month, my Bayesian paper (joint work with Krista Gile, Joseph Hogan, Nancy Barnett, and Crystal Linkletter) was published in Statistics in Medicine, and just today my RDS paper (which I worked on with Dr. Krista Gile) was published in the Electronic Journal of Statistics!
I am really interested in RDS (respondent-driven sampling), and here’s why: it is really important to understand the health behaviors and needs of people who are at higher risk for certain health issues, like HIV. But it turns out that it is really hard to get a nice random sample of the people who tend to be at higher risk for HIV.
I’m happy to report that my article Bayesian Peer Calibration with Application to Alcohol Use has been published in Statistics in Medicine. If you love Bayesian statistics, social networks, multi-level models, and boatloads of conjugate prior distributions (check out the appendices), then this just might be the article for you!
I recently(ish) received an email inviting me to speak at the 2015 Web Summit. How did this come to be? Well, let me give you the backstory.
There’s a terrible disease that could threaten an entire colony of chimpanzees. The disease is highly contagious and very deadly. Fortunately, a new vaccine is available. But the vaccine is expensive, and it is difficult to administer to the chimpanzees. Due to these complications, only a select few of the entire colony of chimpanzees can be vaccinated. How do you choose?
Any time one is faced with such a problem, the first step should be to collect some data. In this situation we need to learn about how the disease might spread through the population. We need to learn about which chimpanzees have contact with each other. Perhaps there are certain chimpanzees that bridge different groups. We probably should also know how infectious the disease is, how long a chimpanzee remains infectious, which chimpanzee is likely to be the first to get the disease, and other important factors.
Let’s say that you have your data, you know about how this disease is spread, and you want to identify the optimal set of chimpanzees to vaccinate. How can we do this? Lives are at stake and we have limited resources. The pressure is on and we need to make the right choice!
Every Monday night I like to unwind from a long day of teaching statistics by drawing and painting with my friend (who happens to be an artistic genius) Christopher Tradowsky. Often our drawings and paintings depict scenarios of human/animal interaction that are unlikely to occur in real life. Here is a small sampling of our collaboration:
I am on a class-taking spree. I finished up the Practical Machine Learning Coursera class, and just started in on the Developing Data Products class. What is a data product? Here’s a short description of the course:
Earlier this month, I read in a Gawker article that Bill Cosby’s star on the Hollywood Walk of Fame was defiled. This got me wondering if there are other people who have stars on the Walk of Fame who have been accused of crimes. So I went to Wikipedia’s list to see if I could glean anything. It turns out that there are far more Hollywood stars than I care to read about, so I decided to do some data scraping and analysis.
I’ve decided that it’s time to learn Python. While I feel very comfortable using R to scrape data from the internet, bring together different datasets, clean, analyze, and visualize data, there are a few things that R isn’t totally super at. For example, sometimes I think that my R programs are running slow (especially when they require loops).
At any rate, I love R, and I am by no means leaving R behind (this is starting to feel like the “it’s not you it’s me” talk) but I want to try new things. This can only add to my appreciation of R. At least, that’s what I’m telling some weird anthropomorphized version of R in my head. Apparently I’m trying to break it to R gently.
So enough of that foolishness! Here’s what I am doing to learn Python:
I had such a good time making animated gifs of weather data that I decided to go back for more. Here is an animation of the low temperatures from January 1 to April 1, 2014. See if you can spot a polar vortex:
If you would like to do something similar I am attaching the code here.
More information about the data can be found on my previous post.
Ever since I found this tutorial yesterday, I have been so excited to be making animated gifs in R. In fact, last night as I was laying my head upon my pillow, I almost couldn’t fall asleep because I kept thinking about all of the possibilities! This could be great for visualizing different distributions in my probability class, or for finding patterns in large data sets. There are so many possibilities, my mind is boggled.
So today I put the tutorial to use, and made this animation. Can you guess what it is?
I have been playing around with mapping data in R using the maps package.
Just yesterday I received an email from Dr. Douglas Furton, a Physics Professor at Grand Valley State University. Dr. Furton had read my blog post about One-Handed Solitaire in which I used simulations to find the probability of winning such a game. Finding himself interested in this problem, he contacted me and requested to see my R script. Sadly, my R script from two years ago (when I wrote that post) is no longer with us. Let this be a lesson to me that I never forget: always save to Dropbox!
But Dr. Furton’s email put a bee in my bonnet, so I ended up writing new code, which is here. Now, it seems my new code gives me an answer that contradicts the answer that I had arrived at in my post from 2 years ago.
Here we are in the middle of the summer. The humidity is reaching peak levels. Offices are vacant as people venture out for their much-awaited vacations. And as hard as it is for me to believe, JSM (Joint Statistical Meetings) 2014 is right around the corner.
I am really looking forward to so much about JSM this year, not the least of which is all the restaurants I plan to go to in Boston, which merits its own post. After spending many happy hours perusing the JSM Online Program, I have found several sessions I am really excited about. Here are a few of my picks for JSM 2014. It should be said that I am really interested in Networks and Statistical Education, which will be reflected in my picks below:
The students in my Intro Statistics classes are diligently working on their Statistics Projects. Every time I meet with a project group, I get excited and inspired by their ideas. So what to do with all that inspiration? Why not do a statistical analysis of Humans of New York, one of the most popular photo blogs of all time?
I was reading this article in the New York Times about the same-sex marriage case in Michigan. Nothing particularly remarkable was said until I got to this one-sentence paragraph:
At times, the eight days of testimony resembled a droning college seminar in statistical methodology.
As my eyes alighted upon the words “college seminar in statistical methodology” my heart started beating faster. I leaned forward in my office chair. I found myself hoping that the article would go into detail about the statistical methodology at issue in this case. Sadly, my hopes were dashed! Maybe next time…
Chad Topaz, a Math professor at Macalester College, wrote an interesting blog entry on undergraduate research over at siam.org. Chad explains that many undergraduates tend to be unduly intimidated by research:
I think the view that research necessitates genius is counterproductive and inaccurate. I worry that some students who might make meaningful contributions to the world through research (while of course, there are many other equally valuable ways besides research) are turned off by research-fear before they even start.
Are you a woman undergraduate who is interested in math? Do you know a woman undergraduate who is interested in math? Watch this video and learn about the amazing Summer Math Program at Carleton College. You can learn more about it here: http://www.math.carleton.edu/smp/
The year is 2013… the International Year of Statistics. Dr. Susan Murphy is a biostatistician working in causal inference at the University of Michigan. And she is now a MacArthur Genius! Congratulations to Dr. Murphy on being recognized for her important work!
Want to be a genius? Try statistics!
I just defended my dissertation. I simultaneously want to take a three-year nap and do a victory lap around campus. Hurray!
Here’s some great advice from Smith College President Kathleen McCartney. I absolutely agree. Here she recounts a story from her days as an undergraduate at Tufts:
I was a strong student when I entered college. I was good at “doing school.” But my sophomore year at Tufts University helped me discover my passion. Taking my first class in child development, I was fascinated by the experiments psychologists designed to infer how children learn. Homework for that course felt like play. Despite my reticence (a quality reinforced by my identity as a first-generation college student) I found the strength to seek out my professor during her office hours to volunteer in her research lab. This experience led to graduate school, which led to a career as a professor, which led to the job I have today. I took a risk—for me, a big risk—and it paid off.
There is something about statistics that inspires people to break into song and (sometimes) choreographed dance. Here are a few videos of people who made the sound decision to video themselves doing just that:
Here it is July 22, 2013! Unless the structure of time and/or the academic calendar has changed, that means that many thousands of people are about to embark on the wondrous adventure of graduate school in a mere month or so. I have been on a grad school journey that has involved 2 master’s degrees and an almost complete PhD (I’m getting close!), and has spanned 3 states and 8 years. A little while ago a friend of mine who was about to start a master’s degree emailed me (and her other grad-school-experienced buddies) to ask for general grad school advice. Here is what I wrote:
The American Statistical Association is holding a competition to help commemorate their 175th year:
Entrants will submit a short (<5 min.) video of their statistically themed performance online by January 1, and an esteemed panel of judges will notify finalists by February 1. During the ASA 175th Birthday Party on Tuesday evening at JSM 2014 in Boston, finalists will perform live before the JSM audience, which will vote wirelessly to select the ASA’s Got Talent winners.
Details about the competition can be found here. My mind is reeling… there are so many possibilities.
Reis, a company that analyses real estate trends, has put out new numbers on vacancy and rent for apartments by metro area. Though you need to be a Reis customer in order to gain full access to the report, Reuters and the Wall Street Journal summarize the report with a focus on the metro areas with the highest rents. Since I am a graduate student and not a real estate tycoon, I cannot afford to look at Reis’ market report. So all of my information is coming from the popular media.
New York City remained the most expensive market. Average rents for the city’s four largest boroughs – Manhattan, Queens, Brooklyn and the Bronx – rose 1 percent to $3,017.19 a month, the first time the average rent topped $3,000 since Reis began collecting data in 1980….The average New York rent was more than 50 percent higher than second-place San Francisco, where rent grew 1.1 percent from the first quarter to $1,998.82. Oklahoma City was the cheapest market, at an average of $571.03 a month, up 0.6 percent.
At first glance, the idea that the average rent in New York (not including Staten Island) is about $3,000 boggles the mind! How does anyone who isn’t a millionaire afford to live in New York? The answer to that question becomes apparent when you consider what the average rent means…
I just saw this NSFW (due to language) data visualization & analysis by Reuben Fischer-Baum of when singers of the United States’ national anthem first flub the lyrics, based on 26 YouTube videos. I slightly altered the graphic to make it safe for work, but I think you will get the idea.
Apparently, the lyrics which posed the greatest challenge for the singers in the videos are “were so gallantly streaming”. This is a fun idea for analysis, and I enjoyed reading about it, but I think it could easily be made even more interesting and informative.
Here are my suggestions:
If you have youngsters in your life, you may have had the experience of playing different games of skill and chance with them. And though you may be taller, have many more years of experience to draw from, and be far more patient, these young competitors will not cease until you are utterly vanquished and verbally acknowledge their superiority. I speak from experience, as I have been humbled by 5-year-olds on more than one occasion.
A recent meta-analysis in BMJ on the association between baldness and coronary heart disease has gotten a lot of press (see NYT and Boston Globe). Looking at the comments section for these articles you can see that some readers are jokingly conflating this association with causality:
Andrew Gelman’s blog links to this survey of graduate students’ experiences TA’ing or teaching statistics courses. If you fit the bill, consider filling it out. It only took me about 5 minutes and will add to research on statistics education.
Update: the survey is no longer looking for study participants. I will link to the results if/when they are made available.
The Simply Statistics Blog’s “Sunday Statistics Roundups” are always an interesting read. These posts tend to bring together links about statistics in the news, contests, conferences, and also data sets. This week’s roundup had a link to a really interesting data set on the bike trip histories from Capital Bikeshare.
As you may imagine, Capital Bikeshare is a bike sharing program in Washington, DC. Members of this service can go to any bike station (locations here), borrow a bike, and return the bike to any station they like. Membership plans can vary in length from three days to a year.
I imagine Capital Bikeshare uses their data to try to understand where new stations should be put, which stations need to have more bike parking spaces, and how often to maintain bicycles. While they aren’t making their full data available (which is good because it would be creepy to be able to track individual people’s locations and trips), there is still a lot to be learned from the data they released.
I am pretty new to the role of journal article referee. Recently I searched for tips and guidelines for peer reviewing, and I sure did find a lot of information. I’ve compiled a list of helpful links below. While some of the links are subject specific, I think that each link contains useful tips for anyone looking to improve their reviewing (myself included).
Tomorrow I head off to Orlando for ENAR where I will be presenting a poster on Sunday night. If you are at ENAR and are inclined, drop by my poster to say hi.
I am excited about attending the Joint Statistical Meetings in Montreal for several reasons, but I am even more excited now. Nate Silver of election-predicting fame will be giving the President’s Invited Address.
Happy International Year of Statistics! I am going to celebrate by analyzing social network data in my favorite cafe.
I just found out that my submission is going to be included in the Art of Science Exhibition, which is sponsored by the Brown University Division of Biology and Medicine Office of Graduate and Postdoctoral Studies. Here is the invitation if you are interested in attending the opening on November 14th:
What happens when a Mathematician and a Shakespeare scholar develop a course together? Apparently, one possibility is Mathematics and What it Means to be Human, a course developed and taught by Dr. Manil Suri and Dr. Michele Osherow at UMBC. The Chronicle of Higher Education is publishing a series (installment 1 is here) describing how this course came into being, complete with the syllabus. They also discuss the class in this video.
I enjoyed reading about Dr. Suri’s desire to proselytize a love of math to Humanities majors, and Dr. Osherow’s struggle to overcome her mathphobia. I’m looking forward to the next installments, which will hopefully detail the students’ reactions to this grand experiment.
Thanks Dr. Ben Capistrant for bringing the Chronicle article to my attention!
I am working on many projects with Dr. Melissa Clark. Here is one project that is coming along nicely:
We have access to data on women who have advanced cancer. Each woman (the ego) was asked about the important people in her life (her alters). We are investigating the associations between the characteristics of the alters and the ego’s advance care planning decisions.
I don’t want to get too detailed since we are in the process of writing this up, but I think this paper could have a lot of implications for finding ways to encourage people to make these advance care planning decisions.
I recently led an intense course for incoming undergraduates who are going to concentrate in the sciences. I started the class by talking about sets, then moved into counting problems, followed by calculating probabilities, independence/dependence, unions & intersections, probability distributions, and finally hypothesis testing. It was a bit of a whirlwind! I had a lot of goals for myself for this class, which I won’t list out here at this time, but one of the goals was to balance the really serious public health examples (having TB and HIV, time to death, etc.) with some light-heartedness.
For example, in the final exam I tried to include a little math/stats humor:
Blake needs to do each of these things today:
- Renew subscription to Statistician’s Fashions Quarterly
- Finish writing the love-song titled You are the Only [n choose 0] For Me
- Apply to compete on America’s Next Top Statistical Model
Assuming that Blake could only do one activity at a time, how many different ways could Blake order these activities?
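For anyone who wants to check their answer, here is a quick sketch in Python. The shortened activity labels are mine, just for illustration. It lists every possible ordering of Blake’s three activities and confirms that the count agrees with 3!.

```python
from itertools import permutations
from math import factorial

# The three activities from the exam question
# (labels shortened here for illustration only).
activities = [
    "renew subscription",
    "finish love song",
    "apply to compete",
]

# Enumerate every possible ordering of the activities.
orderings = list(permutations(activities))

# For n distinct items, the number of orderings should equal n!.
n_orderings = len(orderings)
assert n_orderings == factorial(len(activities))

print(n_orderings)  # 6
```

Since Blake can only do one activity at a time, there are 3 choices for the first activity, 2 for the second, and 1 for the last, which gives 3 × 2 × 1 = 6 orderings.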
I just got the news that a paper I have been working on with Sari Reisner, David Wypij, Bryn Austin, Heather Corliss, Margie Rosario, and Allegra Gordon got accepted for publication! They are a wonderful and talented group of people, and I am very lucky that I have had the chance to work with each of them. There were many, many rounds of edits and revisions, and it feels great to have this come to fruition. Hurray!
What are the odds? I get asked this by my non-Stats friends from time to time. They usually don’t expect me to actually go to my thinking place (my thinking place deserves its own blog post) and calculate the odds. Not so with my friend Sarah, who has been steadfast in her curiosity. I started working on her problem yesterday.
Sarah loves to play a game called “Bathroom Solitaire” (also called One-Handed Solitaire) which is a bit of a family tradition for her. Both Sarah and her mother have been playing this game for decades. Bathroom solitaire owes its whimsical name to the fact that you can play it while holding all the cards in one hand, thus inspiring a fair amount of multi-tasking.
I have decided to start this website as a space to bring together my different interests, mainly focusing on Biostatistics. I hope to use this website to record my ideas on journal articles that I have read, share cool statistics applications, and hopefully engage with others who share my interests.