Building Data Justice: Reflections from the Second Community Data Justice Collaborative (CDJC) Workshop

The Black Equity Coalition (BEC) and the City of Pittsburgh recently hosted the second Community Data Justice Collaborative (CDJC) workshop via Zoom. This workshop continued the critical work of bringing community voice into the data practices of the City of Pittsburgh. 

The purpose of the workshop was to give the City guidance on its data collection and data sharing priorities.

To set the stage for our discussion, participants were asked to reflect on the topic of missing data and on the kinds of data that they wish existed or wish they had access to. The work of artist Mimi Ọnụọha provided inspiration for our initial activity. Her Library of Missing Datasets exhibition encourages reflection on missing data. She notes that the places where data does not exist hint at what we think is important and reveal power dynamics and our biases. Her exhibition included a filing cabinet full of empty file folders, each labeled with the name of a dataset that does not exist.

In our first activity, we asked our workshop participants to create their own library of missing datasets. In addition to listing missing datasets, we also asked them to suggest potential uses of this data. This listing can be used by the City of Pittsburgh to identify data collection and publication priorities of community members.

Datasets suggested during the workshop included:

  • Frequency of fires in a neighborhood can be used to analyze their causes and plan interventions
  • Student loan debt data can help people pay off loans more quickly
  • Police officer complaints and policy infractions can provide transparency about which officers tend to violate policing policies
  • Number of students in each school can help to determine education needs
  • How long residents stay in neighborhoods can measure and evaluate stability
  • Environmental justice data and data about energy burden can be used to help residents heat their homes
  • Race and gender for mental health crisis calls can be used to assess the needs of people by race and gender and to prevent mental health crises
  • The rate of volunteering of people who watched PBS as a child can be used to generate support for public television
  • Metadata for resilient online communities can provide context on how communities use the internet to access housing, food, resources, etc.
  • A definitive listing of Pittsburgh businesses owned by Black people can be used by the African American Chamber of Commerce to help bring more of these businesses into Black neighborhoods
  • Neighborhood health statistics (including chronic diseases, STIs, pregnancies, etc.) can be analyzed to suggest ways to improve health outcomes
  • Data about when people inquire about hospitals and other medical offices in or near communities can be used to study how, where, and why people with disabilities move
  • Neighborhood investment of public funds by source, zip code/community, and year can identify opportunities to create equity in neighborhoods with less investment and to secure funding and investment from other sources

These datasets were shared without use cases: 

  • Amount of rental housing and housing stock in an area
  • Businesses in a neighborhood
  • Data about how many times people get bad advice on how to treat health problems
  • Transportation barriers
  • Abandoned houses
  • Counting the unaccountable – those who are not counted in the census
  • The number of City-owned properties and those available for auction and sale throughout the city

Breakout Activity 2: Why Does Data Go Missing?

In our second breakout activity, participants were asked to work together to create a list of reasons why they think data can go missing. To prepare this summary of the workshop, we grouped their responses into five categories:

Technological/Infrastructure Factors: 

Change in software, Outdated software, File corruption, Older servers cannot handle capacity, Not in a place that is accessible, Lack of a system to keep data updated, Difficult to clean datasets to meet needs and make them usable, Collection processes not standardized or consistent across agencies, Organization eliminates/changes positions

Human Factors: 

Human error, People don’t know how to utilize the data, People don’t know the data exists, People don’t want to share personal info, Racism, Lack of trust, Fear of sharing, Lack of training to collect data, Misrepresentation in data collection, Bias in how and what data is tracked, Lack of human-in-the-loop data entry, Underreporting, Misplaced, No reason for data collection

Ethical Factors: 

People with power/authority don’t want data collected, Data hidden (positive/negative), Discarded due to lack of need, Investment in responsible innovation needed, Misrepresentation, To avoid responsibility, Nobody cares/Lack of care, Deliberate erasure, Hyper innovation with lack of ethics, Racism

Resource and Economic Factors: 

Lack of monetization, Lack of investment in regenerative practices, Cost, Capitalism, Divestment at an infrastructural level, Solutions not warranted, Disregarded due to capacity, Lack of social/economic mobility incentives

Policy & Governance: 

Confidentiality/Restricted, Privacy/Security, No enabling environment (policy, zoning, ordinances), No incentive for community participation, Community organization underestimated, No consensus collection, Preventing responsibility

As a tool for further reflection, participants were then given a list of reasons for missing data drawn from two different reports (cite) and asked to compare it with their group’s list. The list from the reports included the following reasons:

  • Data threatens people with power 
  • Apathy and neglect 
  • Access to data collection technologies 
  • Collection practices are inadequate or biased 
  • Costs outweigh benefits of collection 
  • Collecting data too burdensome 
  • Cost to acquire or collect data  
  • Lack of technical capacity  
  • Privacy risks and fear of misuse 
  • People seek to be excluded for protection 
  • Data tough to quantify (e.g. people’s emotions)  
  • Indigenous/local knowledge not seen as important 
  • Laws/policies restrict collection and sharing 

Following this activity, participants were asked to what degree focusing on what’s missing or absent helped them adopt a different mindset about data. Their responses included:

  • Missing data can reflect the interconnection between will and systemic issues.
  • Apathy, neglect, and racism are evidence that a change in mindset and a shift of perspective toward equity are necessary.
  • How can we create an infrastructure and environment that enable the community to be an equitable participant in data? We are OK with not hearing community voice.
  • We can “embrace the gray area” and be curious about what we don’t know about data. Making an investment and effort to discover the story behind missing data can take time.
  • Where accountability is concerned, it’s imperative that we include people within and affected by the data and data systems.
  • Technology moves fast and outpaces the speed of ethics.
  • Having conversations about data can create engagement. When a community is not equipped to participate in the conversation, it can create issues. Awareness, knowledge, and literacy are important for inclusivity.
  • Focusing on what’s missing is a way for us to identify and focus on biases.
  • Being inclusive and speaking about data makes the City stronger.
  • Anyone working in a community needs to understand that community and be active in it. Community members can provide important contextual data.

We then concluded the conversation about missing data by:

  • highlighting the consequences that can result from being excluded from data;
  • pointing out that it is a burden to have to quantify the existence of something in order for people in power to take it seriously; and
  • pointing out that having data doesn’t usually shift power without collective action and legal challenges.

City Data Governance Discussion

Dr. Chris Belasco, Chief Data Officer of the City of Pittsburgh, delivered a presentation to the members of the Collaborative about how the City prioritizes data collection and data releases. He emphasized the importance of public data requests in shaping actions and priorities. These requests can come directly to the City or through the Western Pennsylvania Regional Data Center. He also discussed:

  • Why the City collects data
  • Processes for making it open and available 
  • Importance of the City’s data service standard 
  • Principles underlying the City’s data services standard 
  • Challenges to making data open and available, including technology, resource constraints, and privacy regulations.

The Collaborative members were then asked in what ways they’d like to see themselves involved in the open data process. Their feedback included:

  • It’s important to think about how people might cause harm by collecting and sharing data.
  • It’s critical to develop priorities around what is collected and how it is collected.
  • Involving people in the process of these data decisions is important. The community should determine the importance and worth of data, and roles for the City can involve providing organization, context, and accessibility.
  • It’s necessary for the City to put information out and get it to the people in the simplest form so that they can read and understand it.
  • It’s important for the data systems to be based on integrity and community, not tied to the whims of funders or those with power.
  • It’s important to understand users and how they hope to use data. Providing transparency and data context for users can help to minimize distrust.
  • Could there be a ticketing system for data requests so that people can track the status of their data request? A lack of feedback about what data can be shared and when it will be available can cause frustration and challenges for projects and work.
  • The City needs to understand how data is being exchanged in valuable ways, even if that involves the private sector.
  • Can we have a better understanding of the City’s capacity and budget for data systems?
  • Reply: The City established a committee (2021) to help identify its data assets: who owns them, who makes the rules around them, and who maintains them.
  • Follow-up question: Is there an assessment of how well the process has worked that includes potential community impact? The aim is to identify the data and how well it’s maintained.

When asked what publishing with a purpose means to them, members provided this additional feedback:

  • Including people that are affected
  • Advocacy along with data
  • Community governance
  • Be transparent about purpose
  • Establish a quality assurance mechanism

Reflections and Moving Forward

As we move forward, ongoing dialogue and action will be crucial in ensuring that community members who have often been excluded from data decisions play a meaningful role in guiding data governance practices. The Black Equity Coalition and the City of Pittsburgh remain committed to fostering these conversations and advocating for policies that empower communities through data transparency and equity.
