Researchers dive into data discussions
University of Delaware faculty turned out in force at a recent symposium to shape the future of data science at UD. The ability to extract meaning from “big data” — quadrillions and quintillions of bytes of information — stands to transform research, scholarship and innovation in profound ways.
The symposium, hosted by the Research Office and organized by a faculty committee, featured keynote talks, breakout sessions and short research presentations to help identify UD’s strengths, gaps and aspirations in data science. With that input, a white paper is now being developed by a faculty committee co-chaired by Eric Wommack and Nii Attoh-Okine. It will serve as a roadmap to the future, including a possible data science institute at UD.
“I’m fond of big data,” University President Dennis Assanis told the audience during his keynote address. He said he was “immersed in algorithms” early on as a graduate student at MIT. Later, as provost at Stony Brook University, he raised the funds to establish a data science institute with core strengths in bioinformatics, math and social sciences. It quickly became a hub for collaboration, involving faculty and students from multiple disciplines, as well as industry.
“Different places have different strengths,” Assanis said, in discussing approaches to moving the data science agenda forward at UD. “We need to identify the right things for us — perhaps coastal resilience, biomedical informatics with NIIMBL, the Biden Institute, social sciences, cancer…. Where do we want to be the strongest? What are we trying to address? It’s an amazing opportunity, and I’m very excited about it.”
Data science at UD
The University has a rich variety of data science research underway. That became evident very quickly during discussions and a “lightning round” of two-minute faculty and student presentations. As just a few examples:
- Jeff Buehler, assistant professor of wildlife ecology, uses a digital mountain of NEXRAD weather radar data to map migratory bird distributions and analyze how humans may be impacting bird flight patterns.
- Ben Bagozzi, assistant professor of political science and international relations, is analyzing political texts to try to forecast and help prevent atrocities against civilians.
- Sally Dodson-Robinson, associate professor of physics and astronomy, is collaborating with colleagues at Yale on the “100 Earths Project,” an effort to discover 100 Earth-like planets. Her team will comb through vast data to filter out so-called “parasitic planets” (stars with a spot, which are sometimes misidentified as planets).
- Antony Beris, Arthur B. Metzner Professor of Chemical Engineering, models complex fluids, including blood, which is thixotropic, meaning its viscosity decreases over time under flow and recovers when at rest. His data-intensive work simulates arterial blood flow to improve understanding of how cardiovascular diseases develop.
- Lindsay Hoffman, associate professor of communication, has collected extensive social media data from the 2016 U.S. presidential election, which she wants to explore and analyze with potential collaborators.
- Kevin Brinson directs the Delaware Environmental Observing System, which has over 1.2 billion data points. He’s seeking collaborators to leverage this weather and climate data for the development of analytical products, perhaps in public health or economics.
“This meeting shows how extraordinary the participation across the University is in data science and how translational it is,” said Eric Wommack, deputy dean of the College of Agriculture and Natural Resources, and co-chair of the planning committee. “By bringing people together, we hope to see lots of new collaborations launched.”
Andrew Ho, professor in the Harvard Graduate School of Education, spoke about “Big Data as Public Good: An Example from Education.” He’s on the project team at the Stanford Education Data Archive (SEDA), which is harnessing data to help scholars, policymakers and parents learn how to improve educational opportunity for all children.
SEDA examines how much educational outcomes vary across U.S. communities, why they vary and how to reduce those inequalities. The publicly accessible data archive encompasses more than 11,000 geographic school districts, covering grades 3–8, with more than 215 million scores from state performance tests, along with racial and ethnic data and demographic data such as family characteristics.
Srinivas Aluru, professor of computational science and engineering, shared his experiences establishing Georgia Tech’s new Institute for Data Engineering and Science (IDEaS), which he co-directs.
The institute links research centers and initiatives horizontally in such foundational areas as machine learning and high-performance computing. It helps match industry with data partners on campus, serves as an incubator for economic development and aids other institutions in the state, including the Centers for Disease Control and Prevention.
Chaitan Baru, the National Science Foundation’s senior adviser for data science, Skyped in to talk about “Harnessing Data for 21st-Century Science and Engineering,” the agency’s initiative to develop a national-scale approach to research data infrastructure and to build a data-savvy workforce.
“We need fair, interpretable, transparent and trustworthy data science,” he said, “and a STEM-capable workforce for a range of needs, from the technician through the Ph.D.”
Through the Transdisciplinary Research in Principles of Data Science (TRIPODS) program, NSF is bringing together the statistics, math and theoretical science communities to develop the theoretical foundations of data science. It also is supporting the development of small collaborative institutes, as well as a smaller number of larger institutes, Baru said.
– Article by Tracey Bryant | Photos by David Barczak