How Pew Research Center is – and is not – using large language models in our work
In this post, we’ll share our current guidelines for the internal use of large language models and potential areas of experimentation.
A behind-the-scenes blog about research methods at Pew Research Center.
For our latest findings, visit pewresearch.org.
In this post, we discuss reproducibility as a part of Pew Research Center’s code review process.
In this post, we discuss three methods to identify and remove specific words and phrases in unstructured text data.
In this post, we delve into Kubernetes – the back-end tool that powers the systems our research team interacts with.
We explore the connection between Americans’ survey responses and their digital activity using data from our past Twitter research.
PMI is a quick and easy way to identify words that distinguish one group of documents from another.
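To make the PMI idea concrete, here is a minimal sketch of scoring words by pointwise mutual information with a document group. The toy documents and the `pmi_scores` helper are illustrative assumptions, not the Center's actual code; the post it summarizes describes the method in full.

```python
import math
from collections import Counter

def pmi_scores(group_docs, all_docs):
    """Pointwise mutual information of each word with a document group.

    PMI(w, g) = log( P(w | group) / P(w) ). Positive scores mark words
    that are over-represented in the group relative to the full corpus.
    """
    group_counts = Counter(w for doc in group_docs for w in doc.split())
    total_counts = Counter(w for doc in all_docs for w in doc.split())
    n_group = sum(group_counts.values())
    n_total = sum(total_counts.values())
    return {
        w: math.log((c / n_group) / (total_counts[w] / n_total))
        for w, c in group_counts.items()
    }

# Hypothetical toy corpora: which words distinguish the "cats" documents?
cats = ["cats purr softly", "cats chase mice"]
dogs = ["dogs bark loudly", "dogs chase cats"]
scores = pmi_scores(cats, cats + dogs)
```

Here "purr" scores higher than "chase", since "chase" appears equally often in both groups while "purr" is unique to the cat documents.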
After venturing into the world of computational social science in 2015, the Center needed to develop new tools and workflows.
The final post in our series examines how topic models can and can’t help when classifying large amounts of text.
Keyword oversampling can be a powerful way to analyze uncommon subsets of text data.
The Pareto principle, or “80/20 rule,” holds that in many systems, a minority of cases produce the majority of outcomes.
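The Pareto principle is easy to see with a small worked example. The activity counts below are made up for illustration; they are not Center data.

```python
# Hypothetical activity counts, e.g. posts per account in a small sample
counts = [400, 250, 150, 80, 40, 30, 20, 15, 10, 5]
counts.sort(reverse=True)

total = sum(counts)                # 1,000 posts in all
top_n = max(1, len(counts) // 5)   # the top 20% of accounts (2 of 10)
top_share = sum(counts[:top_n]) / total
```

Here the top 20% of accounts produce 65% of all posts, so a minority of cases drives the majority of outcomes even without hitting the 80/20 split exactly.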