You + Your Data

We’re creating the apps and infrastructure to put your small data to good use for you.

The Building Blocks of Small Data Apps

Creating Access to Your Data for You

Small Data is the myriad of data traces we each generate every day. Unfortunately, that data is often unavailable to us in a form that we can make sense of or act upon. Imagine a special kind of app running in the cloud that privately and securely turns your small data into big insights.

Open-Source and Modular

We've designed our infrastructure and apps to be modular and reusable so that they can be remixed and repurposed for new services and platform architectures. And everything, down to the front-end UIs of our small data apps, is open source. Fork away.

Developed Through Applied Research

We're not just building tools and services for theoretical use cases and "what if?" scenarios. Through our collaborations with researchers spanning behavioral economics, healthcare, human-computer interaction, and policy, we iteratively develop and evolve our services to address issues and problems in the real world.


We're going beyond recommendation engines to services that contextually parse your small data to provide the right insights at the right time, to help you make (hopefully) the right decision. As for the data that drives these insights, we take the utmost care with its security while advancing architectural frameworks that maximize transparency and individual control of small data usage.

Small Data Apps

Small-data-powered services that solve real-world problems and create new experiences



Newsfie is a research prototype that enables instant news personalization using small data. It recommends news articles based on users' tweets, Facebook posts, email, Slack communications, or YouTube watch history. You can try it with your own data, or see what Newsfie recommends to famous people such as Barack Obama, Bill Gates, and Neil Patrick Harris, based on their Twitter data. Also, make sure to check out the recommendations for Charles Dickens, based on letters he wrote in the 19th century.
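To make the idea of recommending articles from personal text concrete, here is a minimal sketch of content-based matching. It is not Newsfie's actual method, which this page does not describe: the bag-of-words profile, the cosine similarity scoring, the function names, and the sample tweets and articles are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def term_vector(texts):
    """Count words longer than three letters across a list of texts."""
    return Counter(tok for t in texts for tok in tokenize(t) if len(tok) > 3)


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recommend(user_texts, articles, top_k=2):
    """Rank articles by similarity to the user's own text (hypothetical helper)."""
    profile = term_vector(user_texts)
    scored = [
        (cosine(profile, term_vector([body])), title)
        for title, body in articles.items()
    ]
    scored.sort(reverse=True)
    # Drop articles that share no vocabulary with the profile.
    return [title for s, title in scored[:top_k] if s > 0]


# Made-up sample data, standing in for a user's tweets and a news feed.
tweets = [
    "Training for the marathon this weekend",
    "New running shoes arrived",
]
articles = {
    "Marathon record broken": "A runner broke the marathon record after months of training",
    "Stock markets fall": "Markets fell sharply as investors worried about interest rates",
    "Best running shoes": "We review running shoes for every budget",
}
recommendations = recommend(tweets, articles)
```

With this sample data, the two sports articles share words with the tweets and are returned, while the finance article scores zero and is left out. A real system would likely use richer text representations than raw word counts.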


GroupLink is a group event recommendation system that suggests events to promote group members’ face-to-face interactions in non-work settings. GroupLink uses small data to address the challenge of finding events that appeal to a collection of individuals with diverse interests. It learns preferences from individual members’ personal digital traces, including social media, email, and online streaming histories.



Limbr helps you monitor your progress toward freedom from back pain and a healthier life. A suite of applications makes it easy for you to follow your care plan, report your daily progress and setbacks to your care team, and receive personalized coaching to give you the best opportunity for success.


Your Activities of Daily Living (YADL) is an image-based survey for patients with arthritis. It uses images of Activities of Daily Living (ADLs) to improve patients' experience. The interface offers several added benefits: wider coverage of ADLs, a more engaging and personalized experience, and accurate capture of individual health situations. YADL holds promise for improving the efficacy of ADL reporting and enhancing doctor-patient communication.


This work aims to transform the derivation of clinically-actionable pain measures from patient-generated, mobile-health data.

In particular, we are designing, implementing and evaluating:


Fully integrated into the online grocery services Instacart, FreshDirect, and Peapod, Pushcart is a small data email service that helps users define nutritional goals and provides direct feedback on how their grocery purchases support their personal health goals.

Pushcart works seamlessly with your online grocery service to analyze your purchases and map them to your personal health goals. No receipt scanning, no apps to install. Just clear insights and tips on achieving your health goals delivered straight to your email inbox.


PlateClick is a novel system for efficient food preference elicitation using a simple, visual quiz-based user interface. We use a pairwise comparison approach with only visual content and propose a CNN-based online learning framework that learns users' preferences across a large-scale dataset from a small number of interactions. The online learning framework could be used to enhance the interplay between human hedonic and content similarities in solving general human-in-the-loop problems. We envision that PlateClick could be used to capture personal diet profiles and fuel a wide range of applications in healthcare and commercial recommender systems.
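The pairwise, online flavor of this approach can be illustrated with a small sketch. This is not PlateClick's framework: it swaps in a plain logistic (Bradley-Terry style) update over hand-made feature vectors, whereas the real system is described as CNN-based. The food names, the three-dimensional embeddings, the learning rate, and all function names are assumptions for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def score(weights, features):
    """Linear preference score for one item."""
    return sum(w * f for w, f in zip(weights, features))


def update(weights, preferred, rejected, learning_rate=0.5):
    """One online logistic step for a single 'A over B' answer."""
    diff = [p - r for p, r in zip(preferred, rejected)]
    # Probability the current model assigns to the choice the user made.
    p_correct = sigmoid(score(weights, diff))
    # Gradient ascent on the log-likelihood of the observed choice.
    return [w + learning_rate * (1.0 - p_correct) * d for w, d in zip(weights, diff)]


# Hypothetical embeddings: (vegetable-heavy, fried, sweet).
# In a CNN-based system these would come from image features instead.
FOODS = {
    "salad": [1.0, 0.0, 0.0],
    "fries": [0.0, 1.0, 0.0],
    "cake": [0.0, 0.0, 1.0],
    "tempura_vegetables": [0.6, 0.7, 0.0],
}


def learn_preferences(choices, dims=3):
    """Process quiz answers one at a time, updating after each."""
    weights = [0.0] * dims
    for preferred, rejected in choices:
        weights = update(weights, FOODS[preferred], FOODS[rejected])
    return weights


def rank(weights):
    """Order all foods, including ones never shown, by learned score."""
    return sorted(FOODS, key=lambda name: score(weights, FOODS[name]), reverse=True)


# Three quiz answers, each a (preferred, rejected) pair.
choices = [("salad", "fries"), ("salad", "cake"), ("cake", "fries")]
weights = learn_preferences(choices)
ranking = rank(weights)
```

After only three comparisons the sketch ranks salad first and fries last, and it also places tempura_vegetables, which never appeared in the quiz, by generalizing through the shared features. That generalization from a few answers to unseen items is the point of learning over content features rather than per-item scores.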

Built on PlateClick, Yum-me is a personalized healthy meal recommender system designed to meet individuals’ health goals, dietary restrictions, and fine-grained food preferences.

Health Platforms


ohmage-omh is an open-source, open-architecture mobile health platform intended for rapid prototyping and piloting of mobile health applications. Study participants run specified mobile applications on their smartphones (Android 3.0.x+ and iOS). A data storage unit securely stores each participant's information under a separate account. ohmage-omh supports the Open mHealth (omh) read and write APIs. The Admin dashboard retrieves data from the database using the omh read API in accordance with the access credentials of the logged-in user.


ResearchStack is an SDK and UX framework for building research study apps on Android.

An overriding goal of ResearchStack is to help developers and researchers with existing ResearchKit apps on iOS more easily adapt those apps for Android. Though the correspondence of features between the two SDKs isn’t one-to-one, they offer enough shared functionality to greatly speed up both the adaptation of ResearchKit apps to Android (and ResearchStack apps to iOS) and the procedural aspects of running a study on a new platform (such as IRB approval and secure connectivity with a data collection backend).

ResearchStack is developed in collaboration with Open mHealth and touchlab. Initial funding for the project is provided by the Robert Wood Johnson Foundation.

Click here to get started!

Personal Analytics


Coming Soon!



The Small Data Lab Team

Research Team
Arnaud Sahuguet,
Director of the Foundry @ Cornell Tech
JP Pollak,
Senior Researcher-in-Residence
Faisal Alquadoomi,
PhD Student
Andy Hsieh,
PhD Student
Aliza Lena Selter,
mHealth Research Coordinator
Hongyi Wen,
PhD Student

Lab Advisors
Deneen Vojta,
United Health Group
Dan Stein,
Clinical Advisor in Residence
Thomas Tsang,
Clinical Advisor in Residence

Software Collaborators
Michael Carroll,
Matthew Griffith,
Developer in Residence
James Kizer,
Developer in Residence
Jared Sieling,
Solutions Architect
Adrian Vatchinsky,
Developer in Residence

Masters Students
Ran Godrich
Yanbo Li
Rachel Ruiheng Wang
Yating Zhan

Former Members
Aaron Baum
Anas Bouzoubaa
Diana Freed
Lucky Gunasekara
Neil Lakin
Narumi Toida
Judy Wu

Interested in learning more?
We'd love to hear from you.

How to find us.
Drop us a line