Amino is building a new way to connect people to the best possible care for their needs, using an unprecedented database of insurance claims from the American healthcare system. Building this database is not easy: there are many systematic and technical hurdles to assembling a robust database and analyzing it accurately.
Amino’s database contains de-identified patient records covering 220 million people, 951,000 doctors and health care facilities, over 10 billion health insurance claims, and $1.8 trillion in medical bills. For more than 168 million of those 220 million people, we have extensive data spanning more than a year of medical visits. The rest have less than a year of visits in our database, either because we are missing some of their data or because they did not visit a doctor for long periods of time.
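As a rough illustration of the "more than a year of medical visits" figure (168 million of 220 million people, or roughly 76%), the sketch below counts how many patients have a visit history spanning more than 365 days. It is a minimal sketch and not Amino's actual method: the function names, the in-memory data layout, and the choice to measure history as the gap between a patient's first and last recorded visit are all assumptions made for illustration.

```python
from datetime import date

def observed_span_days(visit_dates):
    """Days between a patient's first and last recorded visit (assumed definition)."""
    return (max(visit_dates) - min(visit_dates)).days

def share_with_long_history(visits_by_patient, min_days=365):
    """Fraction of patients whose recorded visits span more than min_days."""
    if not visits_by_patient:
        return 0.0
    long_history = sum(
        1
        for dates in visits_by_patient.values()
        if dates and observed_span_days(dates) > min_days
    )
    return long_history / len(visits_by_patient)

# Hypothetical de-identified patients, keyed by an opaque ID.
example_visits = {
    "patient-a": [date(2015, 1, 10), date(2015, 8, 2), date(2016, 3, 1)],
    "patient-b": [date(2016, 5, 1)],                     # a single visit
    "patient-c": [date(2014, 2, 1), date(2014, 11, 30)],  # under a year
}

share = share_with_long_history(example_visits)
```

A short span under this definition cannot distinguish between missing data and a patient who simply did not see a doctor, which is the same ambiguity described above.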
Because we work with data that is de-identified at the patient level, it is difficult to ensure that individual patients are tracked consistently over time, and our data about individual patients is limited. We do know some things about individual patients—like their age, sex, and approximate geographical location—but we do not have detailed medical histories or demographic information for these individuals beyond what appears in medical claims records. We design our analyses to account for the effects of these difficulties.
While we have data on many people and our goal is to be as comprehensive as possible, there are many people about whom we have no data. We gather information from at least 120 private insurance providers, Medicare (2009 onward), and other data sources, but we do not have information about medical visits where the patient did not use health insurance (e.g., cash payments, visits to certain community health centers, uninsured patients). We do not receive information from every private insurer, and we do not have all of the data from any one private insurer.
We also have only the information coded into medical claims. When doctors submit claims to health insurance companies, they include information such as medical diagnosis and procedure codes that describe the services they provided to their patients. How doctors, health care provider systems, and medical staff translate doctor-patient interactions into claims codes can vary depending on their interpretation of what the codes mean, their billing incentives, and their knowledge of claims coding best practices. We have thoroughly considered our algorithms for turning these claims codes into accurate representations of patient experience, but there are certainly some mismatches between how some doctors code their claims and how we interpret those codes.
Last updated May 17, 2017