You + Your Data

We’re creating the apps and infrastructure to put your small data to good use for you.

The Building Blocks of Small Data Apps

Creating Access to Your Data for You

Small data is the myriad of data traces we each generate every day. Unfortunately, that data is often unavailable to us in a form that we can make sense of or act upon. Imagine a special kind of app running in the cloud that privately and securely turns your small data into big insights.

Open-Source and Modular

We've designed our infrastructure and apps to be modular and reusable so that they can be remixed and repurposed for new services and platform architectures. And everything, down to the front-end UIs of our small data apps, is open source. Fork away.

Developed Through Applied Research

We're not just building tools and services for theoretical use cases and "what if?" scenarios. Through our collaborations with researchers spanning behavioral economics, healthcare, human-computer interaction, and policy, we iteratively develop and evolve our services to address issues and problems in the real world.


We're going beyond recommendation engines to services that contextually parse your small data to provide the right insights at the right time, to help you make (hopefully) the right decision. And as for the data that drives these insights, we take the utmost care and precaution with regard to its security while advancing architectural frameworks that maximize transparency and individual control of small data usage.

Small Data Apps

Small-data-powered services that solve real problems in the world and create new experiences

Let's take a quick walk through Pushcart's email experience.

1. Set Your Goal
You set your nutritional and health goals around your eating habits.
and Calibrate Pushcart to Your Household
By sharing some basic but important details about your household, Pushcart gets smarter in its feedback and recommendations.
2. Easy to Use
With a personal, custom Pushcart email address, auto-forwarding your order emails is a snap.
3. Effortless Email Receipt Handling
With your Pushcart email set up, your receipts are processed seamlessly.
4. Clear Feedback
Get clear data analysis and feedback on your grocery purchases and how well they map to your health goals.


PlateClick is a novel system for efficient food preference elicitation using a simple, visual, quiz-based user interface. We leverage a pairwise comparison approach with only visual content and propose a CNN-based online learning framework that learns users' preferences across a large-scale dataset from a small number of interactions. The online learning framework could be used to enhance the interplay between human hedonic and content similarities in solving general human-in-the-loop problems. We envision that PlateClick could be used to capture personal diet profiles and fuel a wide range of applications in healthcare and commercial recommender systems.
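
To make the idea of learning from pairwise picks concrete, here is a minimal sketch. It is not PlateClick's actual model: it assumes each food image has already been reduced to a small embedding vector (the three-number vectors below are made up for illustration, standing in for CNN features), and it uses a plain online logistic update on the difference between the chosen and the rejected item. The class and method names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class PairwisePreferenceLearner:
    """Online logistic model over the difference of two item embeddings."""

    def __init__(self, dim, learning_rate=0.5):
        self.weights = [0.0] * dim
        self.learning_rate = learning_rate

    def score(self, embedding):
        # Higher score means the model believes the user likes the item more.
        return dot(self.weights, embedding)

    def update(self, chosen, rejected):
        # Probability the current model assigns to the choice the user made.
        diff = [c - r for c, r in zip(chosen, rejected)]
        p_chosen = 1.0 / (1.0 + math.exp(-dot(self.weights, diff)))
        # One gradient-ascent step on the log-likelihood of that choice.
        step = self.learning_rate * (1.0 - p_chosen)
        self.weights = [w + step * d for w, d in zip(self.weights, diff)]

    def rank(self, items):
        # items maps an item name to its embedding vector.
        return sorted(items, key=lambda name: self.score(items[name]), reverse=True)


# Hypothetical embeddings; a real system would compute these from images.
items = {
    "salad": [1.0, 0.0, 0.2],
    "burger": [0.0, 1.0, 0.3],
    "sushi": [0.8, 0.1, 0.9],
}

learner = PairwisePreferenceLearner(dim=3)
learner.update(items["salad"], items["burger"])  # user picked salad over burger
learner.update(items["sushi"], items["burger"])  # user picked sushi over burger
ranking = learner.rank(items)
```

After these two picks, the burger, which lost both comparisons, sits at the bottom of `ranking`. Each answer updates the model immediately, which is what lets a short quiz shape the ranking of items the user has never been shown.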


Newsfie enables instant news personalization using users' diverse digital breadcrumbs. It recommends news articles based on users' tweets, Facebook posts, email, and Slack communications. Everyone is encouraged to try it with their own digital traces, or take a peek at what Newsfie recommends to famous people based on their public Twitter data, including Barack Obama, Bill Gates, and Neil Patrick Harris. Newsfie works with any kind of text record. For example, you can check out Newsfie's recommendations for Charles Dickens based on his letters written in the 19th century. Newsfie also lets users fine-tune the recommendations through "More like this" and "Less like that" buttons that continuously personalize their news experience.
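
As a rough illustration of text-based recommendation with "More like this" / "Less like that" feedback, the sketch below builds a word-count profile from a user's text traces, ranks articles by cosine similarity to that profile, and nudges the profile when the user reacts to an article. This is an assumed, simplified stand-in, not Newsfie's actual technology: the `NewsProfile` class, the feedback weight, and the sample texts are all hypothetical, and a real system would at least filter stop words.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    # Cosine similarity between two sparse term-weight mappings.
    shared = set(a) & set(b)
    numerator = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return numerator / (norm_a * norm_b)


class NewsProfile:
    def __init__(self, traces):
        # traces: any text records, e.g. tweets, posts, or letters.
        self.terms = Counter()
        for text in traces:
            self.terms.update(tokenize(text))

    def recommend(self, articles, top_n=1):
        # articles maps a title to the article text.
        ranked = sorted(
            articles,
            key=lambda title: cosine(self.terms, Counter(tokenize(articles[title]))),
            reverse=True,
        )
        return ranked[:top_n]

    def feedback(self, article_text, more_like_this, weight=2.0):
        # Move the profile toward or away from the article's vocabulary.
        sign = 1.0 if more_like_this else -1.0
        for term, count in Counter(tokenize(article_text)).items():
            self.terms[term] += sign * weight * count


traces = ["Training for the marathon this weekend", "New running shoes arrived"]
articles = {
    "Marathon tips": "marathon training plans for running beginners",
    "Market update": "stocks fall as markets react to interest rates",
}

profile = NewsProfile(traces)
before = profile.recommend(articles)

# The user presses "Less like that" on the marathon article.
profile.feedback(articles["Marathon tips"], more_like_this=False)
after = profile.recommend(articles)
```

Here `before` is `["Marathon tips"]`, since that article shares vocabulary with the traces, and `after` is `["Market update"]`, because the negative feedback pushes the running-related terms below zero.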


Your Activities of Daily Living (YADL) is an image-based survey for patients with arthritis. It uses images of Activities of Daily Living (ADLs) to improve patients' experience. The interface offers several added benefits: wider coverage of ADLs, a more engaging and personalized experience, and more accurate capture of individual health situations. YADL holds promise for improving the efficacy of reporting ADLs and enhancing doctor-patient communication.


Coming Soon!



ohmage-omh is an open-source, open-architecture mobile health platform intended for rapid prototyping and piloting of mobile health applications. Study participants run the specified mobile applications on their smartphones (Android 3.0.x+ and iOS). A data storage unit securely stores each participant's information under a separate account. ohmage-omh supports the Open mHealth (omh) read and write APIs. The Admin dashboard retrieves data from the database using the omh read API in accordance with the access credentials of the logged-in user.


The EAF is a personal text analysis framework that focuses on extracting statistics and structured information from private text sources, such as email, and generating inferences about the internal state of the user in terms of the frequency and style in which they communicate with others. The EAF is middleware in the sense that it does not perform visualization or predictions on its own; instead, the computed data are provided to third-party applications at the user's discretion, through a fine-grained per-application access control mechanism.
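
The per-application access control described above can be pictured with a small sketch. This is an assumption about the general shape of such a mechanism, not the EAF's real interface: the `AccessPolicy` class, the application ids, and the computed field names and values are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    # Maps an application id to the set of computed fields the user shares with it.
    grants: dict = field(default_factory=dict)

    def grant(self, app_id, field_name):
        self.grants.setdefault(app_id, set()).add(field_name)

    def revoke(self, app_id, field_name):
        self.grants.get(app_id, set()).discard(field_name)

    def view_for(self, app_id, computed):
        # Return only the fields this application has been granted.
        allowed = self.grants.get(app_id, set())
        return {name: value for name, value in computed.items() if name in allowed}


# Hypothetical statistics the middleware might compute from a user's email.
computed = {
    "emails_per_day": 14.5,
    "mean_reply_hours": 3.2,
    "top_contacts": ["a@example.com"],
}

policy = AccessPolicy()
policy.grant("mood-tracker", "emails_per_day")

shared_view = policy.view_for("mood-tracker", computed)
unknown_view = policy.view_for("unknown-app", computed)

policy.revoke("mood-tracker", "emails_per_day")
revoked_view = policy.view_for("mood-tracker", computed)
```

With this setup `shared_view` contains only `emails_per_day`, while `unknown_view` and `revoked_view` are empty. The default is to share nothing, so an application sees a field only after the user explicitly grants it.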

The EAF analysis core is being updated to make use of large public corpora of conversational text, specifically the recently released Reddit corpus, in order to robustly train models that can then be applied to sparse personal text streams such as email.

Check it out!


ResearchStack is an SDK and UX framework for building research study apps on Android.

An overriding goal of ResearchStack is to help developers and researchers with existing apps on iOS more easily adapt those apps for Android. Though the correspondence of features between the two SDKs isn’t one-to-one, the two SDKs offer enough shared functionality to greatly speed up adaptation of ResearchKit apps to Android (and ResearchStack apps to iOS) and the procedural aspects of running a study on a new platform (such as IRB approval and secure connectivity with a data collection backend).

ResearchStack is developed in collaboration with Open mHealth and touchlab. Initial funding for the project is provided by the Robert Wood Johnson Foundation.

Click here to get started!

The Small Data Lab Team

Faculty and Staff
JP Pollak, Senior Researcher-in-Residence
Narumi Toida, Study Coordinator
Dan Stein, Clinical Advisor in Residence
Thomas Tsang, Clinical Advisor in Residence

Aaron Baum
Michael Carroll
Diana Freed

PhD Students
Faisal Alquadoomi
Andy Hsieh
Fabian Okeke
Longqi Yang

Software Collaborators
Jared Sieling
Judy Wu

Masters Students
Anas Bouzoubaa
Yanbo Li
Rachel Ruiheng Wang

Interested in learning more?
We'd love to hear from you.

How to find us.
Drop us a line