We present SensorLM, a new family of sensor–language foundation models trained on 60 million hours of data, connecting multimodal wearable sensor signals to natural language for a deeper understanding of our health and activities.
Wearable devices, from smartwatches to fitness trackers, have become ubiquitous, continuously capturing a rich stream of data about our lives. They record our heart rate, count our steps, track our fitness and sleep, and much more. This deluge of information holds immense potential for personalized health and wellness. However, while we can easily see what our body is doing (e.g., a heart rate of 150 bpm), the crucial context of why (say, "a brisk uphill run" vs. "a stressful public speaking event") is often missing. This gap between raw sensor data and its real-world meaning has been a major barrier to unlocking the full potential of these devices.
The primary challenge lies in the scarcity of large-scale datasets that pair sensor recordings with rich, descriptive text. Manually annotating millions of hours of data is prohibitively expensive and time-consuming. To solve this, and to truly let wearable data "speak for itself", we need models that can learn the intricate connections between sensor signals and human language directly from the data.
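To make "learning the connections between sensor signals and language directly from the data" concrete, the sketch below shows one common recipe for this kind of alignment: a CLIP-style contrastive objective that pulls matched sensor-window and caption embeddings together in a shared space. This is a hypothetical toy illustration, not SensorLM's actual architecture; the `SensorTextAligner` class, the feature dimensions, and the linear "encoders" are placeholder assumptions of ours.

```python
# Hypothetical sketch of contrastive sensor-language alignment.
# Not SensorLM's implementation; encoders are stand-in linear projections
# over pre-extracted features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SensorTextAligner(nn.Module):
    def __init__(self, sensor_dim=256, text_dim=768, embed_dim=128):
        super().__init__()
        # Stand-in encoders: in practice these would be deep sensor and text models.
        self.sensor_proj = nn.Linear(sensor_dim, embed_dim)
        self.text_proj = nn.Linear(text_dim, embed_dim)
        self.log_temperature = nn.Parameter(torch.zeros(()))

    def forward(self, sensor_feats, text_feats):
        # Project both modalities into a shared space and L2-normalize.
        s = F.normalize(self.sensor_proj(sensor_feats), dim=-1)
        t = F.normalize(self.text_proj(text_feats), dim=-1)
        # Pairwise similarities, scaled by a learned temperature.
        logits = s @ t.T * self.log_temperature.exp()
        # Matched sensor/text pairs sit on the diagonal of the logits matrix.
        targets = torch.arange(logits.size(0))
        # Symmetric contrastive loss: sensor-to-text and text-to-sensor.
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.T, targets))

# Toy usage: random tensors stand in for real sensor windows and caption features.
model = SensorTextAligner()
loss = model(torch.randn(8, 256), torch.randn(8, 768))
loss.backward()
```

Trained at scale on paired sensor segments and text descriptions, an objective like this rewards the model for telling which caption goes with which stretch of sensor data, which is one way the "why" behind a signal can be learned without manual feature engineering.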