Missing Data in Social Networks

Much of my research is on social networks, which are representations of how people, groups of people, or sometimes chimpanzees interact or are otherwise connected with each other. When analyzing social network data, as when analyzing any data, it is always important to consider how the data were collected. For social networks we have to consider not only missing data on each individual in the network, but also the possibility that information on the relationships between individuals is missing.

Continue reading

Posted in Uncategorized

New Email Address

You know what this blog needs? Considerably more Patti LaBelle:

I’m feeling good from my head to my shoes
Know where I’m going and I know what to do
I tidied up my point of view
I got a new attitude (and by attitude I mean email address/job)

I can now be reached at mott [at] smith [dot] edu and will start teaching in the Statistical and Data Sciences program at Smith College in September!


Posted in Teaching, Uncategorized

Gun Violence Data

After a few minutes of digging around the internet for data on gun violence, I found the Gun Violence Archive, which has a ton of great information on gun violence in the US. You can search for incidents by date, location, age of victim, kind of gun, and much more. On top of that, you can download CSV files directly from the website.

Here’s what I made with 32 days’ worth of data: Continue reading

Posted in Uncategorized

Karate Club Network Club

My research focuses on social networks. Social networks, at least as I deal with them, are representations of how people (or organizations, or animals) perceive or interact with each other. These representations can take the form of visualizations (see below for an example), matrices, or lists of who is connected to whom. How you decide to format your representation of a network depends on what you are trying to learn, how many people and relationships are in the network, and what kind of relationships you are interested in. In this blog post we will just be looking at network visualizations.
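To make the matrix and list formats concrete, here is a minimal Python sketch. The four people and their ties are made up for illustration (they are not from any real data set), and the helper names `to_adjacency_list` and `to_adjacency_matrix` are my own, not from a library.

```python
# Hypothetical toy network: the names and ties are invented for illustration.
edges = [("Ana", "Ben"), ("Ana", "Cal"), ("Ben", "Cal"), ("Cal", "Dee")]


def to_adjacency_list(edges):
    """Map each person to the set of people they are tied to (undirected ties)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def to_adjacency_matrix(edges):
    """Return (sorted labels, 0/1 matrix); entry [i][j] is 1 if i and j are tied."""
    nodes = sorted({person for edge in edges for person in edge})
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # undirected, so the matrix is symmetric
    return nodes, matrix


nodes, matrix = to_adjacency_matrix(edges)
for name, row in zip(nodes, matrix):
    print(name, row)
```

The list form stays small when most pairs of people are not connected, while the matrix form grows with the square of the number of people but makes it easy to look up any pair.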

If you study social networks, it won’t be long until you encounter Zachary’s Karate Club Network:


Wayne Zachary was not actually in the karate club, but for three years he kept track of which club members hung out with each other doing non-karate activities. So in the above representation of the network, you can see that the person labelled 25 interacted with the person labelled 32 outside of the karate club. Maybe 25 taught 32 how to crochet a sweater with a gerbil emblem on it. Maybe 32 and 25 stared into each other’s eyes for hours and vowed to never leave each other’s sides. Maybe 25 and 32 went to a Carpenters concert and heard “Close to You” sung live by Karen Carpenter and never were the same (it was the early 1970s). We don’t know, we just know they interacted outside of the club.

Now I’ve never been a member of a karate club. I took karate classes at the local suburban recreation center with about 40 other 7-year-olds back in 1986 in the hopes that I could learn to crane kick like Ralph Macchio. That did not happen, and I never even made it to the white belt level. Instead I learned how reluctant a karate instructor can be to clean up a gym floor when a child (not me) pees in fear/boredom. So my karate knowledge is pretty much limited to the aforementioned 1980s film franchise and Miss Piggy.

Back to the actual karate club at hand.  As I mentioned, I have never had the experience of being involved in a karate club, but if Zachary’s example is representative of all karate clubs, they are dens of backstabbing and DRAMA.  You see, there were two individuals in the club who had very different ideas of how the club should run.  The karate instructor felt that karate club fees should be raised, and the club president wanted to keep the fees low.  These two didn’t just disagree with each other and then go around karate chopping as per usual.  Instead things got heated.  They each tried to recruit members of the club to be on their side.  Alliances were forged and torn asunder.  Names were called.  People were karate chopped in the heart.  Or not, I wasn’t born yet.  Eventually the group split in two.

Now that you know the embellished karate club backstory, here’s a little challenge for you:

1.) Identify the two ring-leaders of the karate club discord of 1972.

2.) Identify two groups that correspond with how the group split.

Answers below:

Continue reading

Posted in Uncategorized

R Packages, easy as a delicious dessert

I have long wanted to write R packages.  But for some reason, I thought that I wasn’t capable of such a feat.  “R package authors are super stars, I’m just me,” I would think in a mix of despair and admiration, marveling at some new and useful R package, and wishing I could create such functional beauty.

But now the day I have long dreamed of has arrived!  I have authored an R package, and it is perhaps the most satisfying feeling I have ever had using R.  Trust that I have had many R-related feelings, so I do not make this statement lightly.   On top of now holding myself in higher regard, I am also wondering what the heck took me so long?

Continue reading

Posted in Fun with Statistics, Networks, R, Research, Uncategorized

New paper about Random Walks and Edge Sampling

I feel like I am on a roll here.  Last month, my Bayesian paper (joint work with Krista Gile, Joseph Hogan, Nancy Barnett, and Crystal Linkletter) was published in Statistics in Medicine, and just today my RDS paper (which I worked on with Dr. Krista Gile) was published in the Electronic Journal of Statistics!

I am really interested in RDS, and here’s why: it is really important to understand the health behaviors and needs of people who are at higher risk for certain health issues, like HIV. But it turns out that it is really hard to get a nice random sample of the people who tend to be at higher risk for HIV. Continue reading

Posted in Uncategorized

New Publication in Statistics in Medicine

I’m happy to report that my article Bayesian Peer Calibration with Application to Alcohol Use has been published in Statistics in Medicine.  If you love Bayesian statistics, social networks, multi-level models, and boatloads of conjugate prior distributions (check out the appendices), then this just might be the article for you!

Posted in Uncategorized

An “Invitation” to Address World’s Leading Tech Conference or A Brief History of Biostatistics Ryan Gosling

I recently(ish) received an email inviting me to speak at the 2015 Web Summit.  How did this come to be?  Well let me give you the back story.

Continue reading

Posted in Uncategorized

Simulating the Spread of Disease in a Population of Chimpanzees


There’s a terrible disease that could threaten an entire colony of chimpanzees.  The disease is highly contagious and very deadly.  Fortunately, a new vaccine is available.  But the vaccine is expensive, and it is difficult to administer to the chimpanzees.  Due to these complications, only a select few of the entire colony of chimpanzees can be vaccinated.  How do you choose?

Any time one is faced with such a problem, the first step should be to collect some data. In this situation we need to learn about how the disease might spread through the population. We need to learn about which chimpanzees have contact with each other. Perhaps there are certain chimpanzees that bridge different groups. We probably should also know how infectious the disease is, how long a chimpanzee is infectious, which chimpanzee is likely to be the first to get the disease, and other important factors.
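To picture how those ingredients (contacts, infectiousness, infectious period, first case) could fit together, here is a rough Python sketch of a day-by-day outbreak on a contact network. It is only an illustration under my own assumptions, not necessarily the model used in the full post: the seven-chimp network, the 0.3 transmission probability, the two-day infectious period, and the function names are all hypothetical.

```python
import random

# Hypothetical contact network: two groups (A-B-C and E-F-G) bridged by D.
contacts = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D", "F", "G"], "F": ["E", "G"], "G": ["E", "F"],
}


def simulate_outbreak(contacts, first_case, p_transmit, days_infectious,
                      vaccinated=(), seed=0):
    """Return the set of individuals ever infected in one simulated outbreak.

    Assumes days_infectious >= 1 and that vaccination gives full protection.
    """
    rng = random.Random(seed)
    vaccinated = set(vaccinated)
    if first_case in vaccinated:
        return set()
    days_left = {first_case: days_infectious}  # currently infectious chimps
    ever_infected = {first_case}
    while days_left:
        newly_infected = []
        for sick in days_left:
            for other in contacts.get(sick, []):
                if (other in ever_infected or other in vaccinated
                        or other in newly_infected):
                    continue
                if rng.random() < p_transmit:
                    newly_infected.append(other)
        # One day passes: drop chimps whose infectious period has ended.
        days_left = {c: d - 1 for c, d in days_left.items() if d > 1}
        for chimp in newly_infected:
            days_left[chimp] = days_infectious
            ever_infected.add(chimp)
    return ever_infected


def mean_outbreak_size(vaccinated, runs=500):
    """Average outbreak size over many seeded runs, starting from chimp A."""
    sizes = [len(simulate_outbreak(contacts, "A", 0.3, 2, vaccinated, seed=s))
             for s in range(runs)]
    return sum(sizes) / runs


print("vaccinate bridge D:", mean_outbreak_size({"D"}))
print("vaccinate G:       ", mean_outbreak_size({"G"}))
```

Comparing average outbreak sizes for different vaccination choices is one simple way to rank candidates; in this toy network, vaccinating the bridge D means the outbreak can never leave the first group.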

Let’s say that you have your data, you know about how this disease is spread, and you want to identify the optimal set of chimpanzees to vaccinate.  How can we do this?  Lives are at stake and we have limited resources.  The pressure is on and we need to make the right choice!   Continue reading

Posted in Research, Shiny

Silly (Mental) Images Generator

Every Monday night I like to unwind from a long day of teaching statistics by drawing and painting with my friend (who happens to be an artistic genius) Christopher Tradowsky.  Often our drawings and paintings involve scenarios involving human/animal interaction that are unlikely to occur in real life.  Here is a small sampling of our collaboration:

Continue reading

Posted in Art, R, random art, Shiny


I am on a class-taking spree. I finished up the Practical Machine Learning Coursera class, and just started in on the Developing Data Products class. What is a data product? Here’s a short description of the course: Continue reading

Posted in Extracurricular, Fun with Statistics, Statistics, Teaching

Hollywood Walk of Fame

Earlier this month, I read in a Gawker article that Bill Cosby’s star on the Hollywood Walk of Fame was defiled. This got me wondering if there are other people who have stars on the Walk of Fame who have been accused of crimes. So I went to Wikipedia’s list to see if I could glean anything. It turns out that there are far more Hollywood stars than I care to read about, so I decided to do some data scraping and analysis.

Continue reading

Posted in Extracurricular

An Open Relationship with R: Learning Python


I’ve decided that it’s time to learn Python. While I feel very comfortable using R to scrape data from the internet, bring together different datasets, clean, analyze, and visualize data, there are a few things that R isn’t totally super at. For example, sometimes I think that my R programs are running slowly (especially when they require loops).

At any rate, I love R, and I am by no means leaving R behind (this is starting to feel like the “it’s not you it’s me” talk) but I want to try new things.  This can only add to my appreciation of R.  At least, that’s what I’m telling some weird anthropomorphized version of R in my head.  Apparently I’m trying to break it to R gently.

So enough of that foolishness!  Here’s what I am doing to learn Python:

Continue reading

Posted in Uncategorized

Animated Gif with R Code and Dataset

I had such a good time making animated gifs of weather data, I decided to go back for more. Here is an animation of the low temperatures from January 1 to April 1, 2014. See if you can spot a polar vortex:


If you would like to do something similar I am attaching the code here.

More information about the data can be found on my previous post.

Posted in Fun with Statistics

My First Animated Gifs! (with R)

Ever since I found this tutorial yesterday, I have been so excited to be making animated gifs in R. In fact, last night as I was laying my head upon my pillow, I almost couldn’t fall asleep because I kept thinking about all of the possibilities! This could be great for visualizing different distributions in my probability class, or for trying to find patterns in large data sets. There are so many possibilities, my mind is boggled.

So today I put the tutorial to use, and made this animation.  Can you guess what it is?


Continue reading

Posted in Uncategorized

Fun with USGS Earthquake Data

I have been playing around with mapping data in R using the maps package.


Continue reading

Posted in Uncategorized

I Stand Corrected… Or Do I?

Just yesterday I received an email from Dr. Douglas Furton, a Physics Professor at Grand Valley State University.  Dr. Furton had read my blog post about One-Handed Solitaire in which I used simulations to find the probability of winning such a game.  Finding himself interested in this problem, he contacted me and requested to see my R script.  Sadly, my R script from two years ago (when I wrote that post) is no longer with us.  Let this be a lesson to me that I never forget: always save to Dropbox!  

But Dr. Furton’s email put a bee in my bonnet, so I ended up writing new code which is here.  Now, it seems my new code gives me an answer that contradicts the answer that I had arrived at in my post from 2 years ago.  Continue reading

Posted in Card Games, Extracurricular, Fun with Statistics, Statistics

My Picks for JSM 2014

Here we are in the middle of the summer. The humidity is reaching peak levels. Offices are vacant as people venture out for their much awaited vacations. And as hard as it is for me to believe, JSM (Joint Statistical Meetings) 2014 is right around the corner.

I am really looking forward to so much about JSM this year, not the least of which is all the restaurants I plan to go to in Boston, which merits its own post. After spending many happy hours perusing the JSM Online Program, I have found several sessions I am really excited about. Here are a few of my picks for JSM 2014. It should be said that I am really interested in Networks and Statistical Education, which will be reflected in my picks below:

Continue reading

Posted in Uncategorized

A Statistical Analysis of the Humans of New York Blog


The students in my Intro Statistics classes are diligently working on their Statistics Projects. Every time I meet with a project group, I get excited and inspired by their ideas. So what to do with all that inspiration? Why not do a statistical analysis of Humans of New York, one of the most popular photo blogs of all time?

Continue reading

Posted in Extracurricular, Fun with Statistics, Statistics

More about that please…

I was reading this article in the New York Times about the same-sex marriage case in Michigan. Nothing particularly remarkable was said until I got to this one-sentence paragraph:

At times, the eight days of testimony resembled a droning college seminar in statistical methodology. 

As my eyes alighted upon the words “college seminar in statistical methodology” my heart started beating faster.  I leaned forward in my office chair.  I found myself hoping that the article would go into detail about the statistical methodology at issue in this case. Sadly, my hopes were dashed! Maybe next time…  

Posted in Uncategorized

The Stuff that Makes a Researcher



Chad Topaz, a Math professor at Macalester College, wrote an interesting blog entry on undergraduate research over at siam.org. Chad explains that many undergraduates tend to be unduly intimidated by research:

I think the view that research necessitates genius is counterproductive and inaccurate. I worry that some students who might make meaningful contributions to the world through research (while of course, there are many other equally valuable ways besides research) are turned off by research-fear before they even start. 

Continue reading

Posted in Uncategorized

Summer Math Program for Women Undergraduates at Carleton

Are you a woman undergraduate who is interested in math? Do you know a woman undergraduate who is interested in math? Watch this video and learn about the amazing Summer Math Program at Carleton College. You can learn more about it here: http://www.math.carleton.edu/smp/

Video

The year is 2013… the International Year of Statistics.  Dr. Susan Murphy is a biostatistician working in causal inference at the University of Michigan.  And she is now a MacArthur Genius!  Congratulations to Dr. Murphy on being recognized for her important work!  

Want to be a genius?  Try statistics!  

Aside


I just defended my dissertation. I simultaneously want to take a three-year nap, and do a victory lap around campus. Hurray!


Posted in Graduate School, Networks, Statistics

Advice for new college students

Here’s some great advice from Smith College President Kathleen McCartney.  I absolutely agree.  Here she recounts a story from her days as an undergraduate at Tufts:

I was a strong student when I entered college. I was good at “doing school.” But my sophomore year at Tufts University helped me discover my passion. Taking my first class in child development, I was fascinated by the experiments psychologists designed to infer how children learn. Homework for that course felt like play. Despite my reticence (a quality reinforced by my identity as a first-generation college student) I found the strength to seek out my professor during her office hours to volunteer in her research lab. This experience led to graduate school, which led to a career as a professor, which led to the job I have today. I took a risk—for me, a big risk—and it paid off.


Posted in Uncategorized

A Sample of Stats Songs

There is something about statistics that inspires people to break into song and (sometimes) choreographed dance.  Here are a few videos of people who made the sound decision to video themselves doing just that:

Continue reading

Posted in Uncategorized

Emails to New Grad Students

Here it is July 22, 2013! Unless the structure of time and/or the academic calendar has changed, that means that many thousands of people are about to embark on the wondrous adventure of graduate school in a mere month or so. I have been on a grad school journey that has involved 2 master’s degrees and an almost complete PhD (I’m getting close!), and has spanned 3 states and 8 years. A little while ago a friend of mine who was about to start a master’s degree emailed me (and her other buddies with grad school experience) to ask for general grad school advice. Here is what I wrote:

Continue reading

Posted in Advice, Graduate School, Uncategorized

Statistical Performance Art Competitors: I Challenge Thee!

The American Statistical Association is holding a competition to help commemorate their 175th year:

Entrants will submit a short (<5 min.) video of their statistically themed performance online by January 1, and an esteemed panel of judges will notify finalists by February 1. During the ASA 175th Birthday Party on Tuesday evening at JSM 2014 in Boston, finalists will perform live before the JSM audience, which will vote wirelessly to select the ASA’s Got Talent winners.

Details about the competition can be found here.   My mind is reeling… there are so many possibilities.

Posted in conferences, Fun with Statistics, random art, Statistics

Screwing Up the National Anthem

SFW National Anthem Analysis

I just saw this NSFW (due to language) data visualization & analysis by Reuben Fischer-Baum of when singers of the United States’ national anthem first flub the lyrics, based on 26 YouTube videos. I slightly altered the graphic to make it safe for work, but I think you will get the idea.

Apparently, the lyrics which posed the greatest challenge for the singers in the videos are “were so gallantly streaming”.  This is a fun idea for analysis, and I enjoyed reading about it, but I think it could easily be made even more interesting and informative.

Here are my suggestions:

Continue reading

Posted in Fun with Statistics, Statistics, Uncategorized

Getting an edge on “battleship”


If you have youngsters in your life, you may have had the experience of playing different games of skill and chance with them. And though you may be taller, have many more years of experience to draw from, and be far more patient, these young competitors will not cease until you are utterly vanquished and verbally acknowledge their superiority. I speak from experience, as I have been humbled by 5-year-olds on more than one occasion.

Continue reading

Posted in Uncategorized

Bald Associates


A recent meta-analysis in BMJ on the association between baldness and coronary heart disease has gotten a lot of press (see NYT and Boston Globe).  Looking at the comments section for these articles you can see that some readers are jokingly conflating this association with causality:

Continue reading

Posted in Uncategorized

Graduate Student Statistics Teaching Inventory

Andrew Gelman’s blog links to this survey of graduate students’ experiences TA’ing or teaching statistics courses. If you fit the bill, consider filling it out. It only took me about 5 minutes and will add to research on statistics education.

Update: the survey is no longer looking for study participants. I will link to the results if/when they are made available.

Posted in Uncategorized

Capital Bikeshare Data Release

The Simply Statistics Blog’s “Sunday Statistics Roundups” are always an interesting read. These posts tend to bring together links about statistics in the news, contests, conferences, and also data sets. This week’s roundup had a link to a really interesting data set on the bike trip histories from Capital Bikeshare.

As you may imagine, Capital Bikeshare is a bike sharing program in Washington, DC. Members of this service can go to any bike station (locations here), borrow a bike, and return the bike to any station they like. Membership plans can vary in length from three days to a year.

I imagine Capital Bikeshare uses their data to try to understand where new stations should be put, which stations need to have more bike parking spaces, and how often to maintain bicycles. While they aren’t giving full access to their data (which is good because it would be creepy to be able to track individual people’s locations and trips), there is still a lot to be learned from the data they released.

Continue reading

Posted in Fun with Statistics, Networks, Statistics, Teaching, Uncategorized

Peer Review

I am pretty new to the role of journal article referee. Recently I searched for tips and guidelines for peer reviewing, and I sure did find a lot of information.  I’ve compiled a list of helpful links below.  While some of the links are subject specific, I think that each link contains useful tips for anyone looking to improve their reviewing (myself included). Continue reading

Posted in peer review, publications, service to profession


Tomorrow I head off to Orlando for ENAR where I will be presenting a poster on Sunday night. If you are at ENAR and are inclined, drop by my poster to say hi.

Posted in Uncategorized




I am excited about attending the Joint Statistical Meetings in Montreal for several reasons, but I am even more excited now. Nate Silver of election-predicting fame will be giving the President’s Invited Address.

Posted in Uncategorized



Happy International Year of Statistics!  I am going to celebrate by analyzing social network data in my favorite cafe.

Posted in International Year of Statistics, Networks, Statistics, Uncategorized

Art of Science

I just found out that my submission is going to be included in the Art of Science Exhibition, which is sponsored by the Brown University Division of Biology and Medicine Office of Graduate and Postdoctoral Studies.  Here is the invitation if you are interested in attending the opening on November 14th: Continue reading

Posted in Extracurricular, Fun with Statistics, random art, Statistics

Now Online: Article on Changes in Reported Sexual Orientation and Substance Use

The article that I wrote with Sari Reisner, David Wypij, Bryn Austin, Heather Corliss, Margie Rosario, and Allegra Gordon is now online at the Journal of Adolescent Health website. Here is the link.

Posted in LGBT Health, publications

Much Ado about the Null Set

What happens when a Mathematician and a Shakespeare scholar develop a course together?  Apparently, one possibility is Mathematics and What it Means to be Human, a course developed and taught by Dr. Manil Suri and Dr. Michele Osherow at UMBC.  The Chronicle of Higher Education is publishing a series (installment 1 is here) describing how this course came into being, complete with the syllabus.  They also discuss the class in this video.

I enjoyed reading about Dr. Suri’s desire to proselytize a love of math to Humanities majors, and Dr. Osherow’s struggle to overcome her mathphobia. I’m looking forward to the next installments, which will hopefully detail the students’ reactions to this grand experiment.

Thanks to Dr. Ben Capistrant for bringing the Chronicle article to my attention!

Posted in Literature, Teaching, Uncategorized

Projects Galore (Part 1) Ego-Networks of Women with Advanced Cancer

I am working on many projects with Dr. Melissa Clark.  Here is one project that is coming along nicely:

We have access to data on women who have advanced cancer. Each woman (the ego) was asked about the important people in her life (her alters). We are investigating the associations between the characteristics of the alters and the ego’s advance care planning decisions.

I don’t want to get too detailed since we are in the process of writing this up, but I think this paper could have a lot of implications for finding ways to encourage people to make these advance care planning decisions.

Posted in Networks, Uncategorized

Your Random Number Generator Dresses You Funny

I recently led an intense course for incoming undergraduates who are going to concentrate in the sciences. I started the class by talking about sets, then moved into counting problems, followed by calculating probabilities, independence/dependence, unions & intersections, probability distributions, and finally hypothesis testing. It was a bit of a whirlwind! I had a lot of goals for myself for this class, which I won’t list out here at this time, but one of the goals was to balance the really serious public health examples (having TB and HIV, time to death, etc.) with some light-heartedness.

For example, in the final exam I tried to include a little math/stats humor:

Blake needs to do each of these things today:

  • Renew subscription to Statistician’s Fashions Quarterly
  • Finish writing the love-song titled You are the Only [n choose 0] For Me
  • Apply to compete on America’s Next Top Statistical Model

Assuming that Blake could only do one activity at a time, how many different ways could Blake order these activities?
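For anyone who wants to check their counting, a question like this is a permutation of three distinct items, so there are 3! = 3 × 2 × 1 = 6 orderings. Here is a small Python sketch that lists them by brute force; the shortened activity labels are mine.

```python
from itertools import permutations
from math import factorial

# Shortened labels for Blake's three activities (the wording is mine).
activities = [
    "renew subscription",
    "finish love song",
    "apply to compete",
]

# Every possible order in which Blake could do the three activities.
orderings = list(permutations(activities))
for order in orderings:
    print(" -> ".join(order))

# The brute-force count should agree with the formula n! for n distinct items.
print(len(orderings), factorial(len(activities)))
```

Listing the orderings by hand gets tedious fast (10 activities already give 3,628,800 orderings), which is why the factorial formula is worth knowing.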

Continue reading

Posted in Extracurricular, Statistics, Teaching, Uncategorized


I just got the news that a paper I have been working on with Sari Reisner, David Wypij, Bryn Austin, Heather Corliss, Margie Rosario, and Allegra Gordon got accepted for publication!  They are a wonderful and talented group of people, and I am very lucky that I have had the chance to work with each of them.  There were many many rounds of edits and revisions, and it feels great to have this come to fruition.  Hurray!

Posted in LGBT Health, publications

Adventures in Bathroom Solitaire

What are the odds?  I get asked this by my non-Stats friends from time to time.  They usually don’t expect me to actually go to my thinking place (my thinking place deserves its own blog post) and calculate the odds.  Not so with my friend Sarah, who has been steadfast in her curiosity.   I started working on her problem yesterday.

Sarah loves to play a game called “Bathroom Solitaire” (also called One-Handed Solitaire) which is a bit of a family tradition for her.  Both Sarah and her mother have been playing this game for decades.   Bathroom solitaire owes its whimsical name to the fact that you can play it while holding all the cards in one hand, thus inspiring a fair amount of multi-tasking.

Continue reading

Posted in Extracurricular, Statistics, Teaching

Extracurricular activity

I recently read in Inside Science that physicists had applied network analytic tools to discover whether three classic myths (Beowulf, the Iliad, and Tain Bo Cuailnge) were based on real-life events (here is the original article). In a project that probably fell to some undergrad RA, the characters from each of the three stories were entered in a database, and links, coded as either hostile or friendly, were assigned between the characters who had relationships.

According to their coding of the data, Beowulf had 74 characters, the Iliad had 716 characters, and Tain Bo Cuailnge had 404 characters. Once the network was constructed for each myth, they calculated different network statistics, like the mean number of people that each person had links with and whether those with many links tended to be connected with others who had many links (called degree assortativity).  If networks are highly assortative (meaning that people are more likely to be linked with others who have a similar number of links) it is thought that these networks are more true to real life.  They came to the conclusion that the Iliad is the most realistic social network.
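To make those two statistics concrete, here is a minimal Python sketch that computes the mean degree and the degree assortativity of a tiny undirected network. I am assuming the usual definition of degree assortativity as the Pearson correlation between the degrees at the two ends of each link; the star and path networks are toy examples of my own, not data from the myths, and the function names are hypothetical.

```python
from math import sqrt


def degrees(edges):
    """Count how many links each node has in an undirected edge list."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg


def mean_degree(edges):
    """Average number of links per node."""
    deg = degrees(edges)
    return sum(deg.values()) / len(deg)


def degree_assortativity(edges):
    """Pearson correlation of the degrees at either end of each link.

    Undefined (division by zero) when every node has the same degree.
    """
    deg = degrees(edges)
    xs, ys = [], []
    for a, b in edges:
        # Count each undirected link in both directions so the measure is symmetric.
        xs += [deg[a], deg[b]]
        ys += [deg[b], deg[a]]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy networks: a star (one hub, three leaves) and a four-node path.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
path = [("a", "b"), ("b", "c"), ("c", "d")]

print(mean_degree(star), degree_assortativity(star))
print(mean_degree(path), degree_assortativity(path))
```

Positive values mean well-connected nodes tend to link to other well-connected nodes; the star is the extreme opposite, since its one hub only ever links to nodes with a single connection.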

Continue reading

Posted in Extracurricular, Literature, Networks, Statistics, Uncategorized