Love Data Week at Pitt
♥ Pitt loves data! ♥
Join us for Love Data Week 2022, February 14-18, for a collection of fascinating, data-oriented research talks, explorations of art and data, and hands-on software skills workshops.
This year, we are excited to invite the following Pitt community members to present about their research and art:
- I got my advisor’s paper retracted. Open Data could have prevented that paper. • Prof. Sergey Frolov, Department of Physics and Astronomy faculty
- “The incentives made me do it!”: Really? • Prof. Edouard Machery, Director, Center for Philosophy of Science
- Input/Output - Creative Approaches to Source Material and Sonic Result in Electronic Music • Dr. Aaron Myers-Brooks, Department of Music Faculty
- Digital Textual Analysis of Early Modern Witchcraft and Murder Pamphlets, or, How to Love Your Small Data • Briana Wipf, Literature Program PhD candidate
- CCVG Data - a journey from book to dataset and interactive platform • members of the Contemporary Chinese Village Gazetteer Data project
What is Love Data Week? It’s a time to celebrate working with and managing data and code, brought to you by your friendly neighborhood data librarians.
Want to make it social media official? Follow #LoveDataPgh on Twitter!
Love Data Week at a glance
Click titles to jump to full descriptions below. Researcher talks are in bold.
Monday, 2/14: Getting Started in R and RStudio – Preparing for the New NIH Data Management & Sharing Plan – Digital Textual Analysis of Early Modern Witchcraft and Murder Pamphlets, or, How to Love Your Small Data (Briana Wipf) – R and RStudio Open Office Hour
Tuesday, 2/15: Publicly Available Social Justice Data – Data Sharing for Open Science – Public 3D Scanning Events @ the Open Lab
Wednesday, 2/16: “The incentives made me do it!”: Really? (Prof. Edouard Machery) – I got my advisor’s paper retracted. Open Data could have prevented that paper. (Prof. Sergey Frolov)
Thursday, 2/17: Spreadsheet data organization workshops
Friday, 2/18: CCVG Data - a journey from book to dataset and interactive platform (Contemporary Chinese Village Gazetteer Data Project) – Input/Output - Creative Approaches to Source Material and Sonic Result in Electronic Music (Dr. Aaron Myers-Brooks) – Intro to Glitch Art – Intro to QGIS
Getting Started in R and RStudio (Online)
12-1pm on 2/14, Zoom, hosted by University Library System
R is a free, open-source software package and programming language for working with data, especially statistical analysis. RStudio is a popular software environment and toolset for using R. In this one-hour, online workshop, we will become oriented within the RStudio environment and explore in a cursory way several common tasks for data work, such as examining and filtering data, generating linear models, and creating simple plots. Participants are welcome to follow along hands-on or merely watch.
More info and RSVP: https://pitt.libcal.com/event/8670585
Preparing for the New NIH Data Management & Sharing Plan: Session 1 – Elements, Costs, & Tools (Online)
12-1pm on 2/14, Zoom, hosted by Health Sciences Library System
NIH has a new policy going into effect on January 25, 2023 that will require NIH-funded researchers to prospectively submit a plan outlining how scientific data from their research will be managed and shared. This session will cover the plan’s elements and allowable costs as well as tools to help with your own plan creation. This session is the first in a three-part series. Session 2 will cover locating and evaluating repositories (the “where” of data sharing). Session 3 will cover writing up documentation and metadata (the “how” of data sharing).
More info and RSVP: https://www.hsls.pitt.edu/instruction/preparing-new-nih-data-management-sharing-plan-session-1-elements-costs-tools/6796
Digital Textual Analysis of Early Modern Witchcraft and Murder Pamphlets, or, How to Love Your Small Data
Briana Wipf, PhD candidate, Literature Program
4-5pm on 2/14, Hillman Library, G-74 Amy Knapp Room, hosted by University Library System
Digital humanists who work with relatively small corpora to do digital textual analysis must wrestle with the difficulty of drawing conclusions based on statistical analysis that prefers much larger datasets. Indeed, some methods are too data-hungry to yield reliable, generalizable results. However, I argue that the difficulty and careful design needed to execute digital textual analysis on small corpora can indeed provide insights into them. In this talk, I will discuss a project that involves comparing the role of an embodied Devil figure in two genres of the early modern English popular press: witchcraft accusation pamphlets and murder pamphlets. I contend that closely pairing digital methods with close reading and attention to historical and rhetorical contexts allows us to draw conclusions about the development of the figure of the Devil over the period of the late sixteenth century to early eighteenth century.
More info and RSVP: https://pitt.libcal.com/calendar/today/wipf-pamphlets
R and RStudio Open Office Hour (On Campus)
5-6pm on 2/14, Hillman Library, G-74 Amy Knapp Room, hosted by University Library System
Users of R and RStudio are invited to bring their questions to this weekly in-person office hour. Tutoring-style assistance is offered. If we don’t know the answer, we’ll help you look for it!
More info and RSVP: https://pitt.libcal.com/event/8671418
Publicly Available Social Justice Data (Online)
8:30-10am on 2/15, Zoom, hosted by Health Sciences Library System
Thousands of federal, state, and local government sites link the public to their data. As with much of the internet, it is easy to get lost looking for data useful to your research unless you know where to go. This class introduces participants to commonly used measures of social justice through publicly available data sites. We will begin by exploring data sites that focus on social justice issues, such as income, education, pollution, housing, and healthy/risky behaviors. The social justice sites pull data from the federal government, but since data is generally most up to date at the home organization, we will follow the trail to the agency that originally gathered the data to look for updates and other valuable information.
More info and RSVP: https://www.hsls.pitt.edu/instruction/publicly-available-social-justice-data/5831
Data Sharing for Open Science (Online)
12:30-1pm on 2/15, Zoom, hosted by University Library System
Data sharing is a key component of Open Science, and a research practice increasingly sought by funders. In this online presentation, we will discuss what is meant by “data sharing;” motivations and considerations in sharing your data; and how to share your data in practical terms.
More info and RSVP: https://pitt.libcal.com/event/8669324
Public 3D Scanning Events @ the Open Lab (On Campus)
1-4pm on 2/15, Hillman Library, Open Lab (Ground Floor), hosted by University Library System
Join us at the Open Lab @ Hillman on the ground floor for weekly 3D scanning demonstrations! No registration is required–just drop in.
“The incentives made me do it!”: Really?
Prof. Edouard Machery, Director, Center for Philosophy of Science
12-1pm on 2/16, Hillman Library, G-74 Amy Knapp Rm., hosted by University Library System
The behavioral and biomedical sciences have been confronting a replication crisis for over a decade: An unexpectedly large number of findings, including influential ones, fail to replicate in these disciplines. A common explanation of the low replicability of findings in these sciences appeals to scientists’ incentives: Scientists are incentivized to engage in practices that undermine the reliability of their results. This talk challenges this explanation and proposes an alternative.
More info and RSVP: https://pitt.libcal.com/calendar/today/machery-replicability
I got my advisor’s paper retracted. Open Data could have prevented that paper.
Prof. Sergey Frolov, Associate Professor, Department of Physics & Astronomy
4-5pm on 2/16, Hillman Library, G-74 Amy Knapp Rm., hosted by University Library System
Spreadsheet Data Organization Workshops, 12-2pm
for Health Sciences users:
Data Organization in Spreadsheets (Online)
12-1:30pm on 2/17, Zoom, hosted by Health Sciences Library System
After this session you will be able to answer: What are some common challenges with formatting data in spreadsheets and how can we avoid them? How can we carry out basic quality assurance in spreadsheets? How can we export data from spreadsheets in a way that is useful for downstream applications? In this hands-on class we will be using the built-in data validation tools in Excel, so please have access to a computer with Excel installed if you plan to follow along. Note: We will not be covering data analysis or statistics in this session.
More info and RSVP: https://www.hsls.pitt.edu/instruction/data-organization-spreadsheets/5837
for general Pitt users:
Data TLC: Organizing Data with Spreadsheets Part 1 (Online)
12-1pm on 2/17, Zoom, hosted by University Library System
Good data organization is the foundation of any research project. Typically, we organize data in spreadsheets in ways that we as humans want to work with the data. However, computers require data to be structured in particular ways in order to use tools that make computation and analysis more efficient. In this workshop, participants will learn common uses for spreadsheets, good data entry and formatting practices, and how to avoid common formatting mistakes. This workshop will not cover data analysis with spreadsheets but will focus on the important initial “data wrangling” stage that enables you to do proper analysis later.
More info and RSVP: https://pitt.libcal.com/event/8671219
Data TLC: Organizing Data with Spreadsheets Part 2 (Online)
1-2pm on 2/17, Zoom, hosted by University Library System
Good data organization is the foundation of any research project. Typically, we organize data in spreadsheets in ways that we as humans want to work with the data. However, computers require data to be structured in particular ways in order to use tools that make computation and analysis more efficient. In this workshop, participants will learn how to document data for future use, basic quality control and data manipulation in spreadsheets, and how to export data from spreadsheets. This workshop will not cover data analysis with spreadsheets but will focus on the important initial “data wrangling” stage that enables you to do proper analysis later.
More info and RSVP: https://pitt.libcal.com/event/8671226
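For a rough sense of what "basic quality control" and "export" can look like outside a spreadsheet program, here is a minimal Python sketch. It is not the workshop's material: the column names (`sample_id`, `collected`, `weight_g`), the sample values, and the helper names `check_rows` and `to_csv` are all invented for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical spreadsheet export; columns and values are invented for illustration.
RAW = """sample_id,collected,weight_g
A1,2022-02-14,12.5
A2,02/15/2022,n/a
A3,2022-02-16,13.1
"""

def check_rows(text):
    """Split rows into clean rows and a list of (line number, id, issues)."""
    clean, problems = [], []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        issues = []
        try:
            datetime.strptime(row["collected"], "%Y-%m-%d")
        except ValueError:
            issues.append("date is not in YYYY-MM-DD form")
        try:
            float(row["weight_g"])
        except ValueError:
            issues.append("weight is not a number")
        if issues:
            problems.append((line_no, row["sample_id"], issues))
        else:
            clean.append(row)
    return clean, problems

def to_csv(rows, fieldnames):
    """Write rows as plain CSV text, a format most downstream tools can read."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

clean_rows, problems = check_rows(RAW)
exported = to_csv(clean_rows, ["sample_id", "collected", "weight_g"])
```

Here the row for `A2` is flagged for both its date format and its non-numeric weight, and only the remaining rows are exported. Consistent date formats and a single convention for missing values are the kind of habits that make this sort of check pass.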
CCVG Data - a journey from book to dataset and interactive platform
Members of the Contemporary Chinese Village Gazetteer Data project
10-11am on 2/18, Zoom, hosted by University Library System
- Prof. Daqing He, Associate Chair, Department of Informatics and Networked Systems
- Haihui Zhang, Head, East Asian Library, and lead for the Contemporary Chinese Village Gazetteer Data project
- Ruoyun Zheng, Project Coordinator of the Student Team
In July 2018, the East Asian Library (EAL) of the University of Pittsburgh Library System (ULS) initiated the Contemporary Chinese Village Gazetteer Data (CCVG Data) project to create an open-access online dataset of statistics extracted from Chinese village gazetteers (村志). CCVG Data is an ongoing project. So far, about 307,760 data values from 1,500 villages have been extracted and made available for access and download. This unique initiative has produced a dataset of significant value to the humanities and social sciences, with quantitative and qualitative data critical to supporting contemporary Chinese studies in fields including politics, economics, sociology, environmental science, history, and public health. This presentation will start with a review of the background of the project, followed by a presentation and demonstration of the data extraction procedures, data structure, data dictionary, downloading instructions, and the interactive user platform. The current stage and the final goal of the project will be discussed as well.
More info and RSVP: https://pitt.libcal.com/calendar/today/ccvg-data
Input/Output - Creative Approaches to Source Material and Sonic Result in Electronic Music
Dr. Aaron Myers-Brooks, Pitt Department of Music faculty
12-1pm on 2/18, Hillman Library, G-74 Amy Knapp Rm., hosted by University Library System
Electronic music presents a fascinating opportunity for the mixing and matching of disparate input and output data. My presentation will demonstrate three very different input methods, which will generate three very different sonic results. I will begin with drawings generated in HighC, a piece of software based around 20th century composer Iannis Xenakis’s UPIC program. These drawings will generate synthesized soundscapes. I will then perform an improvised hip-hop beat using a video game controller in conjunction with Ableton Live. Finally, I will demonstrate an excerpt from an in-progress piece for microtonal guitar and real-time effects processing.
More info and RSVP: https://pitt.libcal.com/calendar/today/myers-brooks-music
Intro to Glitch Art (Online)
1-2pm on 2/18, Zoom, hosted by University Library System
This online workshop will cover some of the tools and techniques associated with the creation of Glitch Art. In addition, participants will be introduced to some of the history and conceptual theories surrounding glitch and Glitch Art. This workshop will use hex editor Notepad++ (PC, make sure to use the 32-bit version) or Hex Fiend (Mac) and audio editing software Audacity for data bending. Please download both ahead of time if you want to follow along.
More info and RSVP: https://pitt.libcal.com/event/8671329
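The workshop does its data bending by hand in a hex editor and Audacity. The same basic idea, changing bytes in a file's body while leaving its opening bytes alone, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bend`, the 64-byte protected region, and the stand-in data are hypothetical and are not taken from the workshop or from any file format.

```python
import random

def bend(data: bytes, header_len: int = 64, n_edits: int = 10, seed: int = 0) -> bytes:
    """Return a copy of `data` with up to `n_edits` body bytes overwritten at random.

    The first `header_len` bytes are left untouched so that a viewer is more
    likely to still open the file. The value 64 is an arbitrary guess, not a
    property of any particular image or audio format.
    """
    if len(data) <= header_len:
        return data
    rng = random.Random(seed)  # seeded so that a result can be reproduced
    out = bytearray(data)
    for _ in range(n_edits):
        pos = rng.randrange(header_len, len(out))
        out[pos] = rng.randrange(256)
    return bytes(out)

original = bytes(range(256)) * 4  # stand-in for the bytes of an image file
glitched = bend(original)
```

With a real file, the bytes would come from opening a copy of the file in binary mode and the result would be written to a new file. Working on a copy matters, since a bent file may no longer open at all.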
Intro to QGIS (On Campus)
2:30-4pm on 2/18, Hillman Library, G-49 Digital Scholarship Commons, hosted by University Library System
This workshop will introduce you to QGIS – free and open source GIS software that allows you to compose maps and interactively explore spatial data. QGIS supports numerous vector, raster, and database formats and functionalities. If you’re already familiar with ArcGIS and want to learn more about an open source alternative or if you’re new to digital mapping, this workshop will introduce you to the basics.
More info and RSVP: https://pitt.libcal.com/calendar/today/qgis
…and what else?
We love data all year! This is just a small sampling of our workshops. Check out the upcoming events at these institutions:
This page is a collaboration between folx at the Pitt University Library System and Pitt Health Sciences Library System.