Quantified Self: Numbers and Self-Realization
John Allen Paulos, noted mathematician and advocate of numeracy, argues that people fear numbers and basic mathematics while, ironically, continually cherishing numerical conventions. Arguably, people use – and depend on – numbers on a daily basis, but fail to appreciate (or simply ignore) the scale at which we truly have a number-based, quantified society. In his book, “Innumeracy: Mathematical Illiteracy and its Consequences,” he writes,
In my opinion, some of the blocks to dealing comfortably with numbers and probabilities are due to quite natural psychological responses to uncertainty, to coincidence, or to how a problem is framed. Others can be attributed to anxiety, or to romantic misconceptions about the nature and importance of mathematics.
While enjoying this book, and in developing a pedagogy of critical statistical literacy and appreciation, I came across groups of people around the globe who use numbers as a means of self-actualization. The Quantified Self movement promotes building knowledge of one’s self through numbers. Moreover, it seeks to harness how even the numerically averse use numbers to track much of their daily lives. According to The Economist, self-quantifiers share,
…a belief that gathering and analyzing data about their everyday activities can help them improve their lives—an approach known as “self-tracking”, “body hacking” or “self-quantifying”.
As soon as I learned more about this group, this movement, I needed to take part. Given my continued fascination with observable evidence and empiricism, it seemed like a natural fit: self-realization through numbers is perfect for my nerdiness.
I did set a few parameters for this adventure. First, it had to measure something that I engage in on a daily basis, but something that varies. Second, I did not want to track seemingly obvious – and often targeted – activities, such as fitness, diet, or sleeping (which, based on the availability of applications and computer programs, seem to be a primary focus of self-tracking).
After some thought, I came up with a daily ritual, and pleasure, that has nothing to do with these topics, directly…coffee consumption!
The other decision I needed to make was how I was going to gather this information in order to quantify my intake of java. Using the guide available through the Q.S. (that’s quantified self) blog, I sifted through a few resources and came up with daytum.com. It’s simple, intuitive, and, in honesty, I really like the name. I also looked at other web interfaces, such as TallyZoo, which creates simple data pictures, but I could not get over the name. I already use an online running journal, Dailymile, yet I wanted something that conformed to my second parameter – no health / exercise – so I stuck with my first choice, Daytum.
I decided to track my consumption based on where I typically consume my favorite beverage. At home, I use a large, handmade, kiln-cured ceramic mug that bears the likeness of the Big Dipper. I call this mug “Stars”. On campus I use a glass mug that has the glasses / facial likeness / bow tie similar to that found on this blog. I call this mug “Prof. G.”. My third category is “other”.
Each day I simply log on to the Daytum website and enter how many cups of each I enjoyed that day. The Daytum interface allows me to see different representations of these data to compare between categories, including a bar chart, pie graph, or for a full summary, a simple mathematical average (mean) combining the three categories.
My next and final step was to find something against which I could compare my daily coffee consumption. Because I do not have personal acquaintances who do this same exercise (or not publicly, at least) and have little direct contact with Q.S.’ers, I did a quick Google search and found this Quantified Self Blog Post. However, the links to a fellow coffee tracker are broken.
When I thought all was lost, a beloved data source came through – Gallup. Each week on the public radio program Marketplace Morning Report, Frank Newport, the Gallup Chief, has a spot called “Attitude Check” which presents data on timely public opinion polls. On a recent show, Newport was discussing economic attitudes, but quipped at the end of the interview,
The average coffee drinker, by the way, drinks 2.5 cups of coffee a day — how’s that for a number?
This statistic is based on this poll, which also measured soda consumption and has a margin of error of +/- 4 percentage points. Finally a baseline! A point of comparison! A reference group.
This exuberance is not just my own nerdiness, but provides a vital point for statistical literacy – numbers, even personal numbers, do not exist in a vacuum. We do not live solitary lives. There are many people out in the world that do very similar, if not the same exact things that we engage with on a daily basis. From the mundane coffee consumption to the vital economic activities of earning an income and spending disposable income, the point is that we can understand ourselves, and people like us, by comparing what we do with what others do. More generally, we can compare the activities, attitudes, outcomes, and other phenomena for seemingly disparate groups and come to varied conclusions on the themes and variations between them.
All we have to do, to start, is be willing to ask a question and gather the data necessary to uncover some answers.
By the way, as of the day of this posting, I average 4.61 cups of coffee per day (not measured cups, but my Stars, Prof. G. or Other cups – however much coffee they hold). That puts me in exclusive company – only 10% of respondents drink four or more cups of coffee per day! The mean number of daily cups is only 2.5! Thus (to be fully informative),
based on telephone interviews conducted July 9-12, 2012 with a random sample of 1,014 adults, aged 18 and older, living in all 50 U.S. states and the District of Columbia.
I am well above average! This Q.S. stuff is great!
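As a rough sanity check on the margin of error quoted above, here is a minimal sketch using the textbook formula for a simple random sample at 95% confidence. The formula and the worst-case proportion p = 0.5 are my assumptions, not Gallup's published method; the reported figure of +/- 4 points is larger, presumably because it accounts for weighting and design effects that this simple formula ignores.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error (as a proportion) for a simple random sample.

    n: sample size; p: assumed proportion (0.5 is the worst case);
    z: critical value (1.96 corresponds to roughly 95% confidence).
    """
    return z * math.sqrt(p * (1 - p) / n)

n = 1014  # sample size quoted from the poll description
moe = margin_of_error(n)
print(f"Simple-random-sample margin of error: about {moe * 100:.1f} percentage points")
```

The function name and defaults are hypothetical; the point is only that a sample of about a thousand people already puts the simple estimate in the neighborhood of three percentage points.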
Behind the Scenes of a Data Construction Site
Recently, the Christian Science Monitor (CSM) published an extended article documenting the behind-the-scenes polling operations of the famed, and respected, Gallup Organization. The article, in and of itself, is well written, but this is not a literary critique. Its substance, and the forum in which it was published, illuminate the reality of such operations, their procedures and products, and their bearing on the national conversation. Well, at least on the national conversation constructed by the mass media.
Public opinion surveys have become a ubiquitous element of American political culture. The numbers – some reliable, others less so – are pawed over and interpreted for headlines, insight, and horse-race drama by newspaper reporters, cable news talking heads, and bloggers of all party persuasions.
Yet before the media–editorial boards and writers–get a hold of these data, they are already the product of human ingenuity, scientific processes, and a balanced measure between compromise and luck. A Gallup blog response announcing the CSM article notes,
While Gallup’s mission is to scientifically measure and quantify the views of the people, the article notes, “Gallup uses humans to craft the polls and conduct them.” [The author] details the many steps of the polling process from how the survey questions are selected, to the interviewing center to how Gallup ensures that its polls are nationally representative.
The article also quotes an official from the Pew Research Center (PRC), Scott Keeter, discussing the significance of independent polling organizations such as Gallup and the PRC:
“Polling is important because it gives every voter and every nonvoter an equal chance of having their voice represented,” says Scott Keeter, the director of survey research at the Pew Research Center. “When properly done, without bias and malice, polls can give you a view of what the public is experiencing or wanting, which you don’t get from interest groups or the candidates or even elections, which are very blunt instruments.”
However, I feel these views–at least as presented here–are short-sighted. First, everything about polling, from representation to the validity of its estimates, is based on a sample. So everyone has an equal chance of having their “voice” represented, but not all voices are recorded. Also, when selected, respondents’ voices are constrained to the questions and response choices determined to be important by professional pollsters.
I liken this to the use of Auto-Tune in popular music recordings. Auto-Tune is computer recording software that corrects imperfections in vocal and instrumental recordings in order to rein in spurious notes, though it often results in a synthesized sound.
I am not attempting to be critical of the techniques used by pollsters in order to gather, process, analyze, and report data. I am, however, illuminating a piece in the popular press that allows those among us with less experience (and passion for these polls) to see through the cracks of the operation. It is true, most people digest these numbers uncritically, while others perceive polls as one part of a biased media machine.
In the end, as Frank Newport, the Editor-in-Chief of Gallup, states, polling matters. It shapes political, social, cultural, and economic discussions, influences politicians, and in the end, is in and of itself, an institution. We, as consumers, need to be aware of how this institution is built, and its influence on our daily lives.
The median, mean, and wealth
The Federal Reserve Bank’s Survey of Consumer Finances report recently confirmed that household wealth did indeed fall from 2007 through 2010. Thanks to the fallout from the housing bubble and the Great Recession, these statistics, especially the median and mean net worth of families and households, which come most often from housing assets, reflect just how hard most consumers were hit.
Accordingly, as highlighted by many news outlets, median net worth declined by nearly 40%, from $126,400 in 2007 to $77,300 in 2010. Median income also fell, from $49,600 in 2007 to $45,800 in 2010 (a decline of approximately 7.7%), but the change in wealth is far starker.
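The percentages quoted above can be checked with a short calculation. This is only a sketch of the standard percent-change formula applied to the dollar figures in the text; the function and variable names are mine.

```python
def percent_change(old, new):
    """Percent change from old to new; negative values indicate a decline."""
    return (new - old) / old * 100

# Figures quoted from the text (2007 -> 2010)
net_worth_change = percent_change(126_400, 77_300)
income_change = percent_change(49_600, 45_800)

print(f"Median net worth: {net_worth_change:.1f}%")
print(f"Median income: {income_change:.1f}%")
```

Median net worth comes out at a drop of roughly 38.8 percent ("nearly 40%"), and median income at a drop of roughly 7.7 percent.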
Also as reported, the median net worth declined by more than the mean net worth between 2007 and 2010 (see Figure 2 at right from the Fed Report), which reinforces the fact that the median is a much more appropriate measure of the ‘typical case’, especially when dealing with family and household incomes and wealth. A larger observed gap between the median and the mean for any numerical measure indicates a skew in the data; if the mean is higher than the median then the skew is positive, but if the mean is lower, then the skew is negative.
On its face, the skew of the mean is not a surprise, and the observation that the mean wealth declined by a smaller percentage than the median is also not surprising, given the well-documented historical income and wealth inequality in the United States. This positive skew means that those with higher incomes and wealth experienced a much smaller change compared to those with modest or low incomes and wealth.
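The mean-versus-median rule of thumb for skew can be illustrated with a tiny sketch. The wealth figures below are invented for illustration and are not from the Fed Report; comparing the mean and median is also only a heuristic for the direction of skew, not a formal skewness statistic.

```python
from statistics import mean, median

def skew_direction(values):
    """Rule-of-thumb skew direction based on comparing the mean to the median."""
    m, md = mean(values), median(values)
    if m > md:
        return "positive"
    if m < md:
        return "negative"
    return "none"

# Hypothetical household net worth, in thousands of dollars
hypothetical_wealth = [10, 20, 30, 40, 400]

print(mean(hypothetical_wealth), median(hypothetical_wealth))
print(skew_direction(hypothetical_wealth))
```

One very wealthy household pulls the mean (100) far above the median (30), which is the pattern of positive skew that wealth data typically show.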
But there is more to this story. Buried in this report are two caveats that offer much more information on how we can analyze and understand just how this figure demonstrates wealth inequality. The first is more accessible because there are data available in the report. The following portion of Table 4, from the Fed Report, shows that the median wealth decreased for each income category except the top 10% of families – the economic elite, owners of large international corporations, and financiers and investors – whose median net worth marginally increased by 1.8 percent.
Or are they?
The second caveat is not as accessible, because it is buried deep within the “Appendix of Survey Procedures and Statistical Measures”. This second note, most importantly, shows how data of this sort–how they are collected, analyzed, and reported–are socially constructed. Yes, these data come from a sample, actually a series of samples, and yes, the researchers are careful to include geographical, social, and economic conditions within the sampling design. However, the design is also careful to exclude certain persons. According to the appendix,
…a supplemental sample is selected to disproportionately include wealthy families, which hold a relatively large share of such thinly held assets as non-corporate businesses and tax-exempt bonds…this group is drawn from a list of statistical records derived from tax returns. These records are used under strict rules governing confidentiality, the rights of potential respondents to refuse participation in the survey, and the types of information that can be made available. Persons listed by Forbes magazine as being among the wealthiest 400 people in the United States are excluded from sampling. (p. 78; emphasis added)
That’s correct: the 400 wealthiest persons in the United States are not included in the data presented, which skews downward (shrinks) the measured wealth inequality in the United States.
Had these persons been included in the sample (an exclusion discovered days later by the media, but not as widely reported as the original findings), it is most probable that the median and mean net worth of the top 10% of families would have reflected much more inequality compared to the bottom 90%. With some of the 400 wealthiest people included in the sample, the median wealth could have increased across all households, but the mean wealth would have shown much more skew than is already observed. The greater the observed gap between the median and mean, the greater the disparity and, in this case, the greater the inequality in wealth.
This represents a few fundamentals of statistical literacy:
- know your subject: families, households, and wealth through a recession
- know the source of the data: governmental statistics gathered through tax returns and other sources
- know who is being counted (or not): are the most wealthy really so rich that they should be excluded?
My reading of these dubious data is that an official report, and the implication of its claims, risks underreporting the problem of the decline in wealth and the difference between income categories. By doing so, the impact is softened, slightly, and the image is that the pain of the recession is possibly more widespread than may actually be the case…well, unless you are on the Forbes 400 (Net Worth of the Wealthy). The non-wealthy are easy to trace because what wealth they own is usually concentrated in the housing sector. But the most wealthy have assets that come in many forms ranging from owning corporations and international conglomerates to fine art.
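To illustrate the earlier point about excluding the very wealthiest from a sample, here is a minimal sketch with made-up numbers. The ten "households" below are hypothetical and far too few to resemble the actual survey, so the shift in the median is exaggerated; the sketch only shows the direction of the effect on the gap between mean and median.

```python
from statistics import mean, median

# Hypothetical net worth in thousands of dollars; the last value stands in
# for an extremely wealthy household of the kind excluded from the survey.
full_sample = [20, 40, 60, 80, 100, 150, 200, 400, 1000, 50000]
trimmed_sample = full_sample[:-1]  # same households, wealthiest one excluded

def mean_median_gap(values):
    """Gap between mean and median; a larger gap suggests stronger skew."""
    return mean(values) - median(values)

gap_full = mean_median_gap(full_sample)
gap_trimmed = mean_median_gap(trimmed_sample)

print(f"With wealthiest: mean={mean(full_sample):.0f}, median={median(full_sample):.0f}")
print(f"Without wealthiest: mean={mean(trimmed_sample):.0f}, median={median(trimmed_sample):.0f}")
```

In this toy sample, dropping the single wealthiest household barely moves the median but collapses the mean, so the measured gap, and with it the apparent inequality, shrinks dramatically.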

