Breakout Sessions

Participants will attend five of the eight breakout workshops, taught by instructors from CMU, Pitt, and elsewhere. You must register for your preferred breakout sessions here by Friday, 5/25.

None of the workshops require previous experience in math or programming.

Research Design for Large Scale Text Analysis

Matthew Lavin, Clinical Assistant Professor of English and Director of the Digital Media Lab, University of Pittsburgh

Benefiting from the large-scale opportunities enabled by computational text analysis typically involves developing a central research question, designing an act of quantification related to that question, modifying existing data or collecting new data, and interpreting a principal measurement in a way that informs the initial question. Research design is the work of ensuring that the principal measurement will actually inform that question. This workshop will focus on strategies that make planning and implementing computational text analysis more manageable and increase the chances of an informative finding.
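
As a deliberately tiny illustration of an "act of quantification": if the research question concerns a term's prominence, the principal measurement might be that term's frequency per 1,000 words. The documents and term below are invented for illustration, not drawn from the workshop:

```python
# Hypothetical documents; the rate of a term of interest per
# 1,000 words serves as the "principal measurement".
docs = {
    "essay_1815": "the nation the people the nation liberty",
    "essay_1860": "liberty union liberty the people",
}

def rate_per_thousand(text, term):
    """Occurrences of `term` per 1,000 words of `text`."""
    words = text.lower().split()
    return 1000 * words.count(term.lower()) / len(words)

for name, text in sorted(docs.items()):
    print(name, round(rate_per_thousand(text, "liberty"), 1))
```

Whether such a measurement actually answers the research question is exactly what research design, in the sense above, is meant to settle.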

Note: Do not register for this workshop if you have already taken Lavin's Spring 2018 workshop of the same title, held at Hillman Library.

Computer Simulations as a Method for Philosophy

Kevin Zollman, Associate Professor of Philosophy, Carnegie Mellon University

I will discuss the ways that computer simulations can be thought of as a philosophical method. I will connect simulation with extant philosophical problems and show why it is a superior tool for certain types of questions. In the accompanying workshop, I will work with students on exploring a simulation built in the NetLogo platform. We will learn how to study simulations and discuss what one can and cannot learn from them. (Scott Weingart’s addendum: In this workshop, you will teach a computer simple instructions encoding how you believe the world works, then use those instructions to simulate the possible consequences of those modeled beliefs. This can be used to explore, for example, how language evolves, how flocks of birds form, how social networks grow, and whether segregation can occur even when homebuyers hope for diverse neighbors.)
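
The segregation example in Weingart's addendum can be sketched outside NetLogo as well. Below is a minimal, hypothetical Python version of a Schelling-style model on a ring: agents of two types count as unhappy only when neither neighbor shares their type, and pairs of unhappy agents trade places. This is our own illustration of the simulation idea, not material from the workshop itself:

```python
import random

def is_unhappy(ring, i):
    """An agent is unhappy if neither ring neighbour shares its type."""
    left, right = ring[i - 1], ring[(i + 1) % len(ring)]
    return ring[i] != left and ring[i] != right

def simulate(n=40, steps=500, seed=0):
    """Place two agent types randomly on a ring, then repeatedly let
    pairs of unhappy agents trade places."""
    rng = random.Random(seed)
    ring = [rng.choice("AB") for _ in range(n)]
    for _ in range(steps):
        unhappy = [i for i in range(n) if is_unhappy(ring, i)]
        if len(unhappy) < 2:
            break
        i, j = rng.sample(unhappy, 2)
        ring[i], ring[j] = ring[j], ring[i]
    return ring

final = simulate()
boundaries = sum(1 for i in range(len(final)) if final[i] != final[i - 1])
print("".join(final))
print("type boundaries on the ring:", boundaries)
```

Even with this very mild preference (agents tolerate mixed neighborhoods, they only object to total isolation), long same-type runs tend to form: segregation emerging from agents who do not demand it.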

Software requirements: Download and install NetLogo

Data Visualization with Tableau

Emma Slayton, CLIR Fellow for Data Visualization, Carnegie Mellon University

Are you currently involved in work that could be enhanced by visualizing your data, whether for analysis or to communicate your findings to the public? Would you like to sharpen your skills or learn new ways to make charts, graphs, maps, or other kinds of visualizations? Come to a workshop that provides a grounding in creating successful graphs and charts. Data visualization, the set of techniques used to visually display and communicate data, is a natural output of research and data analysis: the goal is to present data quickly and clearly for analysis or presentation. Effectively communicating your data to an audience is a necessary part of any project, and it is made easier by visualization programs like Tableau. In this workshop, we will discuss Tableau's basic capabilities and take a hands-on approach to using the program.

Software requirements: Please come to the workshop with a copy of Tableau downloaded on your laptop. You can register for a free one-year student license through this link: https://www.tableau.com/academic/students

There Is No Spoon: Networks and Digital History

Zoe LeBlanc, DH Developer at Scholars’ Lab, University of Virginia

Digital history, much like the hit 1999 film The Matrix, seems to be making waves. But for those new to the field, there is still a big gap between having an idea and actually executing a project. While there is no download button for digital history, this workshop will try to help participants bridge that gap. We'll explore why so many digital history projects use networks. We'll discuss the foundations of network analysis, how these methods can shape your research, and, in turn, how historical research questions are suited to different types of methods and data. Participants will work together to think about how their own research might incorporate digital history methods and to chart the future steps they might take toward a research agenda. Just as Neo learned there is no spoon, participants in this workshop will (hopefully) learn that digital history is not one set of practices, but rather part of a broader constellation of disciplines that can inform their research and change how they think as historians.
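
As a small taste of the network-analysis foundations mentioned above, the sketch below computes degree, the most basic centrality measure, for a hypothetical network of historical correspondents. The names and letters are invented for illustration, not taken from the workshop:

```python
# Each edge is a pair of (hypothetical) correspondents who exchanged letters.
edges = [("Adams", "Jefferson"), ("Adams", "Rush"), ("Jefferson", "Rush"),
         ("Jefferson", "Madison"), ("Madison", "Monroe")]

# Degree centrality: how many distinct ties each person has in this network.
degree = {}
for a, b in edges:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1

for person, d in sorted(degree.items(), key=lambda kv: -kv[1]):
    print(person, d)
```

Even this crude count raises the historiographical questions the workshop gestures at: does a high degree reflect real influence, or just which letters happened to survive and be digitized?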

Note: Prior to the workshop participants will fill out a brief survey, and should follow all the instructions here.

Creative Writing with Natural Language Processing

Allison Parrish, Faculty at the Interactive Telecommunications Program, New York University

Computational tools and statistical analysis are often deployed as a method to “read” texts. But what about using these same techniques to write them? In this workshop, we’ll investigate the state of the art of natural language processing with an eye toward using the sometimes-unintuitive abstractions of language produced by computational models to make programs that create surprising and poetic creative writing. Topics include: a whirlwind tour of spaCy for parsing English into syntactic constituents; a discussion of techniques for classifying and summarizing documents; and an explanation and demonstration of “word vectors” (like Google’s word2vec), an innovative language technology that allows computers to process written language less as discrete units and more like a continuous signal. Workshop participants will develop a number of small projects in text analysis and poetics using a public domain text of their choice. In becoming familiar with contemporary techniques for computational language analysis, critics and researchers will be able to reason better about language-based media on the Internet. Artists and writers, meanwhile, might just learn a few new techniques to add to their creative palette.
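
The neural techniques listed above are beyond a short snippet, but the underlying move, generating new text from statistics of an input corpus, can be illustrated with a much simpler classic method: a first-order Markov chain. This is our own illustrative choice, not part of the workshop syllabus:

```python
import random

def build_model(words):
    """Map each word to the list of words that follow it in the text."""
    model = {}
    for cur, nxt in zip(words, words[1:]):
        model.setdefault(cur, []).append(nxt)
    return model

def generate(model, start, length=12, seed=1):
    """Walk the model from `start`, sampling each following word."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < length and out[-1] in model:
        out.append(rng.choice(model[out[-1]]))
    return " ".join(out)

# A tiny made-up "corpus"; any public domain text works the same way.
text = "the sea rose and the sea fell and the moon rose over the sea"
model = build_model(text.split())
print(generate(model, "the"))
```

Because frequent continuations are sampled more often, the output echoes the source's cadence while recombining it, which is a crude ancestor of the "surprising and poetic" generation the workshop explores with modern models.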

Software requirements: Download and install Anaconda for Python 3.6 for your platform before the workshop.

GIS and Mapping

Jessica Benner, Computer Science and GIS Librarian, Carnegie Mellon University

This workshop will introduce you to geographic information systems (GIS): tools that allow you to explore the geographic aspects of your data. Whether you study the material culture of a particular community, the works of an eighteenth-century writer, or the history of the civil rights movement, there is likely a spatial component to your data, which you can capture by collecting spatial information alongside your other data. Spatial information can be as coarse as countries, states, counties, or zip codes, or as precise as addresses and coordinate pairs. A GIS tool uses these types of data to help you visualize your data in a spatial context (i.e., a map). During this workshop, you will (1) learn what a GIS is, the different models for representing spatial data, and the fundamental aspects of a map; (2) explore several existing DH projects that have a spatial focus or spatial components; and (3) practice determining when the use of GIS is appropriate for a project.
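
Once your data carries coordinate pairs, simple spatial computations are possible even before you reach for a full GIS. As one sketch (not part of the workshop materials), the haversine formula gives the great-circle distance between two latitude/longitude points; the coordinates below are our approximations for two Pittsburgh-area landmarks:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) pairs."""
    r = 6371.0  # mean Earth radius, km
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlmb / 2) ** 2
    return 2 * r * asin(sqrt(a))

# Approximate coordinates: Cathedral of Learning (Pitt) to Hamerschlag Hall (CMU)
d = haversine_km(40.4443, -79.9532, 40.4428, -79.9467)
print(round(d, 2), "km")
```

A GIS packages this kind of computation, along with projections, layers, and cartographic display, behind an interface, which is part of what makes it worth learning when the spatial questions get serious.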

Thinking Through Word Embeddings for the Humanities

Ben Schmidt, Assistant Professor of History, Northeastern University

Word embeddings are a family of algorithms that try to represent the meanings of all the words within a given set of texts as positions in space that you can use for comparison, computation, or visualization. They are a big topic in natural language processing: they can be very effective at exploring changes in meaning or patterns of usage, and they also serve as a useful introduction to the sorts of strategies most modern machine learning algorithms use.

This workshop will cover the basics of word embeddings: what they do, how to train a model using word2vec, and how to use them to search for synonyms and analogies. We’ll also look at issues more specific to the humanities and social sciences, including how to compare models trained on different sets of texts to each other, and strategies for visualizing models. Finally, we’ll talk about the social biases embodied in the space of language models, both as a technical problem with solutions and as an opportunity for algorithmic criticism.

Hands-on analysis and visualization will be done editing pre-written scripts in the R statistical environment; no programming experience is necessary (and not much will be gained!). We’ll distribute several pre-trained models at the workshop. If you have a large set of your own texts you’d like to try and train ahead of time, please contact the instructor.
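
The synonym and analogy searches mentioned above reduce to vector arithmetic. The workshop's hands-on work is in R, but the idea can be sketched in Python with tiny hand-made 3-D vectors standing in for a trained word2vec model (real embeddings have hundreds of dimensions learned from a large corpus; these values are invented for illustration):

```python
import math

# Hand-made toy "embedding"; the arithmetic is the same as with a real model.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.8, 0.05, 0.05],
}

def cosine(u, v):
    """Cosine similarity: how closely two vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# The classic analogy query: king - man + woman ≈ ?
target = [k - m + w for k, m, w in
          zip(vectors["king"], vectors["man"], vectors["woman"])]
best = max((w for w in vectors if w not in ("king", "man", "woman")),
           key=lambda w: cosine(vectors[w], target))
print(best)
```

Subtracting "man" and adding "woman" shifts the query along the dimension the two words differ on, so the nearest remaining vector is "queen". The same offset trick, applied to models trained on different corpora, underlies the comparisons the workshop discusses.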

Software requirements: A laptop with R and RStudio installed is recommended. Instructions for installing R and RStudio are available online; see, for example: http://web.cs.ucla.edu/~gulzar/rstudio/

Distant Viewing with Deep Learning: An Introduction to Analyzing Large Corpora of Images

Lauren Tilton, Assistant Professor of Digital Humanities, and Taylor Arnold, Assistant Professor of Statistics, University of Richmond

This tutorial provides a hands-on introduction to the use of deep learning techniques in the study of large image corpora. The TensorFlow and Keras libraries within the Python programming language are used to facilitate this analysis. No prior programming experience is required. Image analysis tasks covered in the tutorial include object detection, facial recognition, image similarity, and image clustering. We will make open-access image corpora available in order to test these methods.
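
Of the tasks listed, image similarity is the easiest to sketch without TensorFlow or Keras: represent each image as a vector and compare distances. The toy 3x3 "images" below are invented for illustration; a real pipeline, like the one the tutorial covers, would compare features extracted by a deep network rather than raw pixels:

```python
# Toy 3x3 "images" flattened to 9 pixel values (0 = dark, 9 = bright).
images = {
    "solid_dark":  [0, 0, 0, 0, 0, 0, 0, 0, 0],
    "solid_light": [9, 9, 9, 9, 9, 9, 9, 9, 9],
    "mostly_dark": [0, 1, 0, 0, 0, 1, 0, 0, 0],
}

def distance(a, b):
    """Euclidean distance between two flattened pixel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

query = [0, 1, 0, 0, 0, 1, 0, 1, 0]
best = min(images, key=lambda name: distance(images[name], query))
print("most similar image:", best)
```

Deep features make this comparison robust to shifts, lighting, and scale in ways raw pixels are not, which is why the tutorial builds on networks rather than pixel arithmetic.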

Software requirements: Download and install Anaconda for Python 3.6 for your platform before the workshop.