Please visit https://capture.udel.edu/channel/Data+Science+Community+Hour for recordings of previous events (must be logged in).

A natural language processing (NLP) themed week with three 15-minute talks
and industry guest Dr. Eder Santana from Twitch

Matthew Mauriello, CIS, UD;
Arshiya Khan, CCRG, ECE, UD;
Austin Brockmeier, ECE/CIS, UD

Time: March 25, 2021 @ 4:30 PM to 5:30 PM
Location: Zoom

“SAD: A Stress Annotated Dataset for Recognizing Everyday Stressors in SMS-like Conversational Systems [a CHI 2021 late-breaking work]” — Prof. Matthew Mauriello, UD CIS.
There is limited infrastructure for providing stress management services to those in need. To address this problem, chatbots are viewed as a scalable solution. However, one limiting factor is the lack of clear definitions and examples of daily stress on which to build models and methods for routing appropriate advice during conversations. In this talk, we will discuss recent work to develop a dataset of 6850 SMS-like sentences that can be used to classify SMS-like input using a scheme of 9 stressor categories derived from stress management literature, live conversations from a prototype chatbot system, crowdsourcing, and targeted web scraping from an online repository. In addition to practical considerations around building the dataset, we’ll touch on analysis that demonstrates its potential efficacy, look at how real-time events result in topic drift, and describe its implementation in a future SMS-based chatbot.
“GPT-3” — Arshiya Khan, CCRG, ECE, UD.
This talk is an introduction to GPT-3 (Generative Pre-trained Transformer 3), a highlight of NLP research in 2020. It is a language model capable of aggressively tackling a major NLP pain point, namely context. The model has been successful in recognizing facts, recalling trivia, applying reasoning and logic, and, most surprisingly, reverse engineering code.
“Text Mining for Medicine and Health” — Prof. Austin Brockmeier, UD ECE/CIS.
An introduction to text-mining data representations, as well as use cases for text document clustering, topic modeling, and prioritization.

Matthew Louis Mauriello is an Assistant Professor in the Department of Computer and Information Sciences at the University of Delaware. His work is in the area of Human-Computer Interaction (HCI) and focuses on applying user-centered design and computer science techniques (e.g., information visualization, machine learning) to societal challenges, with an emphasis on those in our health, education, environmental, and computing systems. He is a member of the Association for Computing Machinery (ACM), the ACM Special Interest Group on Computer-Human Interaction, and the International Game Developers Association.
Arshiya Khan is currently pursuing her Ph.D. in ECE (Cybersecurity) at the University of Delaware. Her areas of interest include network security, artificial general intelligence, and robust machine learning. She wrote her M.S. thesis on a feature taxonomy of network traffic for machine learning algorithms.
Austin Brockmeier is an Assistant Professor in the Departments of Electrical & Computer Engineering and Computer & Information Sciences at the University of Delaware. He is a resident faculty member of the Data Science Institute. He has published over 30 articles related to data science, including work on text mining for medicine and public health.

