STEER: Sentiment Tracking and Evaluation Engine for Research

Transforming Noisy Social Media Data into Real-Time Actionable Insights
Date: Fall 2024


This project, led by a team from Smith College’s Data Science Clinic in collaboration with 99P Labs, builds a tool for analyzing social media sentiment at scale. STEER, the Sentiment Tracking and Evaluation Engine for Research, automates the tedious process of collecting and evaluating Reddit posts, turning raw user opinions into high-level insights. It uses GPT-4o mini to handle messy, unstructured text and to produce concise summaries of prevalent sentiments and recurring issues.

STEER’s workflow begins with a targeted web-scraping function that collects up to 100 posts from a selected subreddit. Users can refine the results by specifying keywords or thematic categories relevant to their needs. Once the data is gathered, an in-house caching mechanism checks whether the same subreddit was scraped in the past five days, which keeps results fresh while avoiding redundant scraping. The pipeline then uses GPT-4o mini to assign a sentiment label (negative, neutral, or positive) to each post component and to generate an executive summary. Concurrently, a zero-shot classification framework powered by Meta’s BART model predicts which category (e.g., “Reliability,” “Safety,” “Appearance,” “Mileage”) each post belongs to, offering a granular view of user discussions.
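The caching and keyword-filtering steps described above could look roughly like the sketch below. This is an illustration under stated assumptions, not STEER's actual code: the function names (`is_cache_fresh`, `filter_posts`, `get_posts`), the dictionary-based cache, the post fields `title` and `body`, and the injected `scrape` callable are all hypothetical. It also assumes the freshness check decides whether a new scrape is needed. Only the five-day window and the 100-post cap come from the description.

```python
from datetime import datetime, timedelta, timezone

CACHE_TTL = timedelta(days=5)  # freshness window from the description
MAX_POSTS = 100                # per-subreddit cap from the description


def is_cache_fresh(scraped_at, now):
    """Return True if the last scrape happened within the past five days."""
    return scraped_at is not None and now - scraped_at <= CACHE_TTL


def filter_posts(posts, keywords):
    """Keep posts mentioning any keyword (case-insensitive), capped at MAX_POSTS."""
    lowered = [k.lower() for k in keywords]
    if not lowered:
        return posts[:MAX_POSTS]
    kept = []
    for post in posts:
        text = (post.get("title", "") + " " + post.get("body", "")).lower()
        if any(k in text for k in lowered):
            kept.append(post)
    return kept[:MAX_POSTS]


def get_posts(subreddit, keywords, cache, scrape, now=None):
    """Reuse cached posts when fresh; otherwise call the (hypothetical) scraper."""
    now = now or datetime.now(timezone.utc)
    entry = cache.get(subreddit)
    if entry is None or not is_cache_fresh(entry["scraped_at"], now):
        # Stale or missing entry: scrape again and record the timestamp.
        entry = {"scraped_at": now, "posts": scrape(subreddit)}
        cache[subreddit] = entry
    return filter_posts(entry["posts"], keywords)
```

Passing `scrape` and `now` as arguments keeps the sketch independent of any particular Reddit client and makes the five-day rule easy to exercise in isolation.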

A key part of STEER is its interactive user interface, which packages these insights into real-time visualizations and dashboards. Simple input fields let users specify their subreddit, keywords, and desired categories. In return, they receive a high-level summary of sentiment trends, enabling quick detection of problem areas (like safety concerns) or emergent themes (such as frequent part shortages). A built-in chatbot lets users ask tailored follow-up questions and dig deeper, for instance by pinpointing which specific models or product features spark the most frustration or satisfaction among customers.
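One way to picture how per-post labels roll up into the sentiment trends shown on a dashboard is the aggregation sketch below. It is a hypothetical illustration rather than STEER's implementation: the record fields `category` and `sentiment`, the helper names, and the 50% negative-share threshold are assumptions chosen for the example.

```python
from collections import Counter, defaultdict

SENTIMENTS = ("negative", "neutral", "positive")


def summarize_by_category(records):
    """Compute the share of each sentiment label within every category."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["category"]][record["sentiment"]] += 1
    summary = {}
    for category, counter in counts.items():
        total = sum(counter.values())
        summary[category] = {s: counter[s] / total for s in SENTIMENTS}
    return summary


def problem_areas(summary, threshold=0.5):
    """List categories whose negative share meets the (assumed) threshold, worst first."""
    flagged = [
        (category, shares["negative"])
        for category, shares in summary.items()
        if shares["negative"] >= threshold
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

A dashboard could chart the output of `summarize_by_category` directly and use `problem_areas` to highlight categories, such as safety, that deserve a closer look.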

In essence, STEER gives organizations a straightforward way to gather, interpret, and act on large volumes of social media data. By combining data science methods with current language models, the project shows how automated pipelines can make traditional sentiment analysis more scalable, precise, and user-friendly. As social platforms continue to expand, STEER’s adaptable architecture is designed to extend beyond Reddit, offering a foundation for analyzing, and ultimately improving, customer perceptions and experiences on other platforms.

Stay Connected

Follow our journey on Medium and LinkedIn.