{"id":101794,"date":"2016-02-25T11:38:47","date_gmt":"2016-02-25T16:38:47","guid":{"rendered":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/2016\/02\/25\/appendix-b-6\/"},"modified":"2024-04-14T04:14:40","modified_gmt":"2024-04-14T09:14:40","slug":"appendix-b-6","status":"publish","type":"post","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/methods\/2016\/02\/25\/appendix-b-6\/","title":{"rendered":"Appendix B"},"content":{"rendered":"<p class=\"wp-block-paragraph\">An additional analysis stemming from the Kuriakose &amp; Robbins piece is a scatter plot showing the number of questions (x-axis) against the percentage of near duplicates (y-axis). This raised the question &#8212; if more questions in a survey cause a lower near duplicate rate, shouldn&#8217;t that be apparent in the scatter plot? One of the problems with the graph is that a simple bivariate analysis of this kind confounds several important survey characteristics.<\/p>\n\n<figure class=\"wp-block-image alignright\"><a href=\"https:\/\/alpha.pewresearch.org\/pewresearch-org\/?attachment_id=278023\"><img data-dominant-color=\"ebe6da\" data-has-transparency=\"false\" style=\"--dominant-color: #ebe6da;\" decoding=\"async\" sizes=\"(max-width: 840px) 100vw, 840px\" class=\"wp-image-278023 not-transparent\" src=\"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-content\/uploads\/2016\/02\/PM_2016.02.25.png\" alt=\"High Matches in International Surveys\" ><\/a><\/figure>\n\n<p class=\"wp-block-paragraph\">To demonstrate this point, we conducted an additional empirical analysis. 
Using 367 Pew Research Center international surveys, we ran a regression predicting the percentage of near duplicates from the following covariates: the number of questions, the sample size, and the percentage of questions with five or more response options.<\/p>\n\n<p class=\"wp-block-paragraph\">As the results show, two survey characteristics are significant predictors of the percentage of near duplicates. The number of questions was not a significant predictor; however, the sample size and the proportion of questions with five or more response options are both strongly associated with the percentage of near duplicates. The larger the sample size, the higher the percentage of near duplicates. The greater the proportion of questions with five or more response options, the lower the percentage of near duplicates. This adds to the evidence in our paper that a uniform threshold of 85% is wrong-headed because survey characteristics affect the likelihood of there being a high match rate. All of the 367 datasets in this analysis are publicly available on our website, and the R code for this regression is available upon <a href=\"mailto:info@pewresearch.org\">request<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>An additional analysis stemming from the Kuriakose &amp; Robbins piece is a scatter plot showing the number of questions (x-axis) against percentage of near duplicates (y-axis). This raised the question &#8212; if more questions in a survey cause a lower near duplicate rate, shouldn&#8217;t that be apparent in the scatter plot? 
One of the problems [&hellip;]<\/p>\n","protected":false},"author":294,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"sub_headline":"","sub_title":"","_crdt_document":"","_prc_public_revisions":[],"_ppp_expiration_hours":0,"_ppp_enabled":false,"ai_generated_summary":"","relatedPosts":[],"reportMaterials":[],"multiSectionReport":[],"package_parts__enabled":false,"package_parts":[],"_prc_fork_parent":0,"_prc_fork_status":"","_prc_active_fork":0,"datacite_doi":"","datacite_doi_citation":"","_prc_seo_qr_attachment_id":0,"spoken_article_player_enabled":true,"bylines":[],"acknowledgements":[],"displayBylines":true,"footnotes":"","prc_watchers":[]},"categories":[356],"tags":[],"bylines":[],"collection":[],"datasets":[],"level_of_effort":[],"primary_audience":[],"information_type":[],"_post_visibility":[],"formats":[458],"_fund_pool":[],"languages":[],"regions-countries":[],"research-teams":[528],"workflow-status":[],"class_list":["post-101794","post","type-post","status-publish","format-standard","hentry","category-international-survey-methods","formats-report","research-teams-methods"],"label":false,"post_parent":101838,"word_count":256,"canonical_url":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/methods\/2016\/02\/25\/appendix-b-6\/","art_direction":false,"_embeds":[],"watchers":[],"table_of_contents":[{"id":101838,"title":"Evaluating a New Proposal for Detecting Data Falsification in Surveys","slug":"evaluating-a-new-proposal-for-detecting-data-falsification-in-surveys-2","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/global\/2016\/02\/23\/evaluating-a-new-proposal-for-detecting-data-falsification-in-surveys-2\/","is_active":false},{"id":101855,"title":"Appendix A","slug":"appendix-a-10","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/methods\/2016\/02\/23\/appendix-a-10\/","is_active":false},{"id":101794,"title":"Appendix 
B","slug":"appendix-b-6","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/methods\/2016\/02\/25\/appendix-b-6\/","is_active":true},{"id":101828,"title":"Works Cited","slug":"works-cited","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/methods\/2016\/02\/23\/works-cited\/","is_active":false}],"report_materials":"","report_pagination":{"current_post":{"id":101794,"title":"Appendix B","slug":"appendix-b-6","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/methods\/2016\/02\/25\/appendix-b-6\/","is_active":true,"page_num":3},"next_post":{"id":101828,"title":"Works Cited","slug":"works-cited","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/methods\/2016\/02\/23\/works-cited\/","is_active":false,"page_num":4},"previous_post":{"id":101855,"title":"Appendix A","slug":"appendix-a-10","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/methods\/2016\/02\/23\/appendix-a-10\/","is_active":false,"page_num":2},"pagination_items":[{"id":101838,"title":"Evaluating a New Proposal for Detecting Data Falsification in Surveys","slug":"evaluating-a-new-proposal-for-detecting-data-falsification-in-surveys-2","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/global\/2016\/02\/23\/evaluating-a-new-proposal-for-detecting-data-falsification-in-surveys-2\/","is_active":false,"page_num":1},{"id":101855,"title":"Appendix A","slug":"appendix-a-10","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/methods\/2016\/02\/23\/appendix-a-10\/","is_active":false,"page_num":2},{"id":101794,"title":"Appendix B","slug":"appendix-b-6","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/methods\/2016\/02\/25\/appendix-b-6\/","is_active":true,"page_num":3},{"id":101828,"title":"Works Cited","slug":"works-cited","link":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/methods\/2016\/02\/23\/works-cited\/","is_active":false,"page_num":4}]},"parent_info":{"parent_title":"Evaluating a New Proposal for Detecting Data Falsification in 
Surveys","parent_id":101838},"materialsOrdered":[],"chaptersOrdered":[],"partsOrdered":[],"partsEnabled":false,"datacite_doi":"","prc_seo_data":{"title":"Appendix B","description":"An additional analysis stemming from the Kuriakose &amp; Robbins piece is a scatter plot showing the number of questions (x-axis) against percentage of near duplicates (y-axis). This raised the question&hellip;","og_title":"Appendix B","og_description":"","schema_type":"Article","noindex":false,"canonical_url":"","primary_terms":[],"custom_schema":[],"og_image":0,"indexnow_submitted_at":null,"gsc_index_status":null},"prepublish_checks":{"prc-image-alt-text":{"status":"complete","message":"All images have alt text.","data":null},"prc-about-this-research":{"status":"incomplete","message":"Add an \"About this research\" details block.","data":null},"prc-paragraph-count":{"status":"complete","message":"Found 3 paragraphs.","data":{"count":3}},"prc-internal-link":{"status":"complete","message":"Found 1 internal link.","data":{"count":1}}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"relatedPostsOrdered":[],"bylinesOrdered":[],"acknowledgementsOrdered":[],"_links":{"self":[{"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/posts\/101794","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/users\/294"}],"replies":[{"embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/comments?post=101794"}],"version-history":[{"count":2,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/posts\/101794\/revisions"}],"predecessor-version":[{"id":134406,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/pos
ts\/101794\/revisions\/134406"}],"wp:attachment":[{"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/media?parent=101794"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/categories?post=101794"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/tags?post=101794"},{"taxonomy":"bylines","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/bylines?post=101794"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/collection?post=101794"},{"taxonomy":"datasets","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/datasets?post=101794"},{"taxonomy":"level_of_effort","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/level_of_effort?post=101794"},{"taxonomy":"primary_audience","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/primary_audience?post=101794"},{"taxonomy":"information_type","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/information_type?post=101794"},{"taxonomy":"_post_visibility","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/_post_visibility?post=101794"},{"taxonomy":"formats","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/formats?post=101794"},{"taxonomy":"_fund_pool","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/_fund_pool?post=101794"},{"taxonomy":"languages","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/languages?post=101794"},{"taxonomy":"regions-countries","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org
\/wp-json\/wp\/v2\/regions-countries?post=101794"},{"taxonomy":"research-teams","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/research-teams?post=101794"},{"taxonomy":"workflow-status","embeddable":true,"href":"https:\/\/alpha.pewresearch.org\/pewresearch-org\/wp-json\/wp\/v2\/workflow-status?post=101794"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}