Inspired by big data advances in astronomy and physics, a group of researchers wants to use the same approach for social sciences. The plan is to recruit 10,000 New Yorkers and track everything technology can record about their lives over 20 years. From this ocean of data, the scientists hope to spot patterns that we otherwise might have missed about our health, financial situation, parenting — just about everything.
The Kavli HUMAN Project is ambitious, to say the least. It will be “the first true study of all of the factors that make humans… human,” as stated on the project’s website.
Enabled by wearables, smart sensors, and the ubiquity of smartphones and mobile apps, the team led by Paul Glimcher, a social scientist at New York University, wants to learn as much as is humanly possible about the lives of 10,000 people from 4,000 households. Each participant’s record will include a full genome sequence, IQ and personality scores, financial records, buying habits, employment, time spent on email or with pets, stool samples and so on. There is even a proposal to use Bluetooth sensors to track how often participants interact with family members. And all of this should go on for two decades.
There is no immediate goal in mind. It’s just about mining data in the hope that eventually among all the noise we will spot patterns. For instance, we could learn why some people are healthier, happier and live longer than others, then use this information to make everyone else’s lives at least a bit better.
Kavli HUMAN took its cues from big data astronomy projects like the revolutionary Sloan Digital Sky Survey, which has recorded millions of stars, galaxies and quasars. Just as Sloan expanded our understanding of the universe, Kavli HUMAN will work to unravel the human universe: an insane mix of biology, behavior, and environment.
“… Cutting-edge questions are unanswered because we lack the data related to the genetic regulators of aging processes; the impact of intrauterine growth restrictions and child maltreatment; the interaction of aging with cognitive stimulation in early, mid, and later life; the interaction of stress and physical activity; and the interaction of all of these with economic status,” a recent paper discussing Kavli HUMAN reads.
The whole setup might alarm some as being, well, very creepy. But the prospects could be truly revolutionary. Unlike results in physics or math, findings from social science studies often cannot be reproduced, meaning the initial findings could have been false. Partly to blame is the lack of resources at scientists’ disposal: they are often forced to work with small groups of 50 to 100 participants, control some factors, experiment and draw conclusions. Yet, for results to be relevant, reliable, and reproducible, it’s preferable to have thousands of participants.
People aren’t protons either. While there are stereotypes and behavioral patterns, these differ according to race, upbringing, occupation, location and so on. Kavli HUMAN promises to address these shortcomings by involving many demographic groups and recording their interactions at a finer granularity than any study before.
It’s basically a longitudinal study, only an order of magnitude more complex. The Framingham Heart Study, which began in 1948 with 5,209 adult subjects from Framingham, Massachusetts, is the source of a great deal of current knowledge about heart disease. Most of what we know today about the impact that diet, exercise and common medications like aspirin have on heart disease is owed to this study, which also helped establish that smoking is lethal. But while Framingham researchers gathered half a kilobyte of data per subject per year, Kavli HUMAN will log a gigabyte or more.
“If you have breast cancer, there is a suite of genetic structures. There are actually six or eight main genes and about 10 minor genes, and they interact in defining your tumor’s sensitivity. Your treatment program is not as simple as, ‘You have breast cancer. Take this drug.’ Nor is it as simple as, ‘Your tumor is one of five types. Take one of these five drugs.’
“Your treatment is a function, a mathematical function, of the interaction of the different genes that make up your tumor and hence define your tumor’s vulnerability.
“The genotype of your tumor will be mapped, and it will yield a very detailed treatment program, through a machine-learned algorithm,” Glimcher told Vox.
The multimillion-dollar project is slated to start soon: Glimcher hopes to recruit the first volunteers in mid-2017. If it goes well, the study might be expanded to include other cities.