Contributors:
Onur Yürüten and Pearl Pu
Description:
This dataset was curated to discover people's common physical activity routines and to analyse the effects of social interventions on behavior patterns. More specifically, it was curated via a longitudinal user study that involved a wearable sensor (Fitbit) and our custom mobile application, HealthyTogether. The dataset contains the calorie expenditure and step counts of each participant in the longitudinal study.
More details on the curated data can be found in:
- Onur Yürüten and Pearl Pu. Factoring the Habits: Comparing Methods for Discovering Behavior Patterns from Large Scale Activity Datasets. International Conference on Big Data Analytics, Data Mining and Computational Intelligence (BIGDACI), part of the Multi Conference on Computer Science and Information Systems (MCCSIS), 2016 (forthcoming).
- Onur Yürüten, Jiyong Zhang, and Pearl Pu. Decomposing Activities of Daily Living to Discover Routine Clusters. In the 28th AAAI Conference on Artificial Intelligence (AAAI-14), Quebec City, Canada, July 27-31, 2014.
We distribute two versions of this dataset:
– HT-48, the original one, created as explained in the AAAI paper;
– HT-83, where we expanded the dataset to 83 users.
Contributors:
Onur Yürüten and Pearl Pu
Description:
This dataset contains the daily step counts of 1000 participants in a social exercising campaign, who wear different sensors to measure their activity level. The dataset also contains each participant's gender, height, weight, exercise group id, company id, and age (all anonymized).
Contributors:
Valentina Sintsova and Pearl Pu
Description:
Tweets with explicit emotional hashtags
We collected 17.6 million tweets with explicit emotional hashtags corresponding to the GEW emotion categories, using the Twitter Streaming API between 27th February and 26th May 2014. From these, we extracted 1,729,980 tweets that had those hashtags at the end of the text, were not repeated, were not retweets, did not contain URLs, and were assigned to only one emotion category. Using 500,000 of these pseudo-labeled tweets, we built the PMI-Hash emotion lexicon, as described above.
Unfortunately, Twitter's terms of service do not allow us to share this dataset directly, so we share the tweets via their identifiers.
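For readers who want to apply similar filtering to their own collection, the selection criteria listed above can be sketched roughly as follows. This is an illustrative sketch, not the code used to build the dataset. The hashtag-to-category map `EMOTION_HASHTAGS` is a hypothetical placeholder rather than the actual GEW hashtag list, and several details are assumptions: retweets are detected by an "RT @" prefix, "not repeated" is read as keeping the first occurrence of each normalized text, and the category is decided from the run of hashtags at the end of the text only.

```python
import re

# Hypothetical hashtag-to-category map (placeholder, NOT the actual GEW list).
EMOTION_HASHTAGS = {
    "#happy": "happiness",
    "#joy": "happiness",
    "#sad": "sadness",
    "#angry": "anger",
}

URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
# An unbroken run of hashtags that reaches the end of the text.
TRAILING_TAGS_RE = re.compile(r"(?:\s*#\w+)+\s*$")


def trailing_hashtags(text):
    """Return the lowercased hashtags in the run that ends the text."""
    match = TRAILING_TAGS_RE.search(text)
    if match is None:
        return []
    return [tag.lower() for tag in re.findall(r"#\w+", match.group(0))]


def pseudo_label(text):
    """Return the tweet's single emotion category, or None if it is rejected."""
    # Assumption: retweets are recognizable by the conventional "RT @" prefix.
    if text.startswith("RT @") or URL_RE.search(text):
        return None
    categories = {
        EMOTION_HASHTAGS[tag]
        for tag in trailing_hashtags(text)
        if tag in EMOTION_HASHTAGS
    }
    # Keep only tweets that map to exactly one emotion category.
    return next(iter(categories)) if len(categories) == 1 else None


def filter_tweets(texts):
    """Return (text, category) pairs for the tweets that pass every filter."""
    seen = set()
    kept = []
    for text in texts:
        # Assumption: a repeat is a tweet with the same case-folded,
        # whitespace-normalized text; only the first occurrence is considered.
        key = " ".join(text.lower().split())
        if key in seen:
            continue
        seen.add(key)
        label = pseudo_label(text)
        if label is not None:
            kept.append((text, label))
    return kept
```

Calling `filter_tweets` on a list of raw tweet texts returns only the tweets that end in hashtags from exactly one category, paired with that category as a pseudo-label.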
Contributors:
Valentina Sintsova and Pearl Pu
Description:
This dataset contains a set of sports-related tweets manually annotated with emotion categories. The annotation was performed by workers from the Amazon Mechanical Turk platform. This dataset was the basis for the OlympLex emotion lexicon.
More details on the collection and annotation process can be found in:
Valentina Sintsova, Claudiu Musat, and Pearl Pu. Fine-Grained Emotion Recognition in Olympic Tweets Based on Human Computation. In Proceedings of the NAACL/HLT Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA), ACL, 2013.
Unfortunately, Twitter's terms of service do not allow us to share this dataset directly, so we share the tweets via their identifiers. In the current distribution (version 1.2), we provide annotations for the 1265 tweets for which we have Twitter identifiers, rather than for all 1957 tweets used in the paper.