ChinaFile’s research for “Message Control: How a New For-Profit Industry Helps China’s Leaders ‘Manage Public Opinion’” relied primarily on a dataset comprising some 3,100 procurement notices that central and local government offices posted for goods and services related to public opinion monitoring. This article explains how we arrived at that dataset—how we selected notices for inclusion, as well as how we processed and sifted through them in order to draw analytical conclusions.

ChinaFile undertook this type of data-driven research project specifically because of the difficulties in reporting on the ground in China right now. We based our analysis on procurement notices, discussions with experts, and additional research, but we were not able to conduct first-hand reporting in China.

Central and local governing bodies, such as state government departments and Communist Party offices, publish public procurement notices on the Chinese Government Procurement Network (CGPN) website. (We use the terms “officials,” “authorities,” and “government” to refer to either state or Party entities.) The website includes procurement notices from as early as 2001. These notices typically include basic information about the type of equipment or services local governments are hoping to buy, how much they plan to spend, and how and when companies can submit a tender. Local government purchasers do not directly manage the bidding processes. Companies act as middlemen, posting procurement notices on the CGPN website, creating the sometimes lengthy addenda included in these notices, and otherwise handling related administrative tasks. The CGPN website has separate webpages for each phase of the procurement process, from the initial notice to the final announcement of the winning bid.

Some notices include supplemental documents that elaborate on the basic terms of the notice. These can include contract templates, rules for bidding procedures, and information about the specific products or work requested. In some cases, these addenda explain the rationale for the requested purchases in rich detail; in others, the rationale is implicit in the kinds of technology officials seek.

Generating and Analyzing Our Dataset

To create this dataset of approximately 3,100 procurement notices, we generated a short list of keywords related to the Chinese government’s efforts to monitor online speech in China and abroad. These keywords were all terms related to the ideas of public opinion, public sentiment, and/or commenting online.

Any procurement notice included in the dataset had to contain at least one of these keywords in either the title of the procurement notice or in the name of the government agency seeking to make the purchase.
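
As a rough illustration of this inclusion rule, the Python sketch below keeps a notice if any keyword appears in its title or in the name of the purchasing agency. It is not the code we used: the `Notice` fields, the function names, and the two keywords are placeholders chosen for the example, and the actual keyword list is not reproduced here.

```python
from dataclasses import dataclass

# Placeholder keywords for illustration only (assumption);
# the article does not list the actual search terms.
KEYWORDS = ("舆情", "舆论")


@dataclass
class Notice:
    # Hypothetical minimal record: only the two fields the rule inspects.
    title: str
    agency: str


def matches_keywords(notice, keywords=KEYWORDS):
    """Return True if any keyword occurs in the title or the agency name."""
    return any(kw in notice.title or kw in notice.agency for kw in keywords)


def select_notices(notices, keywords=KEYWORDS):
    """Keep only the notices that satisfy the inclusion rule."""
    return [n for n in notices if matches_keywords(n, keywords)]
```

Under this rule, a notice for office furniture would still be selected if the purchasing agency's name contained one of the keywords, which is why the check covers both fields.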

The 3,108 procurement notices that include these keywords were posted to the CGPN website between January 9, 2007, and August 23, 2020. Among them we found approximately 600 notices containing supplemental documents. These included Microsoft Word or PDF documents, as well as other file types, such as JPEGs. We assume that we failed to collect some supplemental documents, meaning that figures or estimates based on them are likely undercounts.

In some cases, we looked specifically at “awarded bid notices,” that is, notices announcing that a government contract had been awarded to a winning company. The initial dataset of 3,108 included a number of procurement notices for which we could not determine whether a contract had been awarded: these notices may not have made it through the full bidding process, the purchasing agency may have neglected to post the final announcement stating which company won the bid, or authorities may have removed that final announcement from the website. To account for this, we created a second dataset containing only notices labeled as having been awarded to a contractor. This narrower dataset comprised 1,271 notices.
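
The narrowing step amounts to a filter on each notice's phase label. The sketch below is a minimal illustration, assuming each notice is a dictionary with a `phase` field whose value is `"awarded"` for awarded bid notices; both the field name and the label are hypothetical stand-ins for however the label appears in the collected data.

```python
def awarded_only(notices, phase_field="phase", awarded_label="awarded"):
    """Return only the notices labeled as awarded to a contractor.

    `phase_field` and `awarded_label` are hypothetical names (assumption);
    notices with a missing or different label are left out.
    """
    return [n for n in notices if n.get(phase_field) == awarded_label]
```

A notice with no phase label at all is excluded rather than guessed at, which matches the conservative approach described above.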

It appears that fewer procurement notices overall were posted to the CGPN website at the end of 2019 and the beginning of 2020, likely due in part to COVID-19. This dip reduces the number of public opinion-related notices during this period, but it tracks the decline in overall notices posted on the CGPN website and is unlikely to be related to public opinion monitoring trends specifically.

One procurement notice that we reference in “Message Control” did not come from this dataset. Instead, we downloaded it manually from the Liaoning province government procurement website. This notice, from the Tieling city Public Security Bureau, was dated October 2020.

Possible Flaws in the Data

This dataset contains only procurement notices that government organs publicly posted and that were still available on the CGPN website when we retrieved them. Therefore, we believe that the figures we cite in the article are most likely undercounts. Further, it is clear that some provinces are assiduous in posting these notices online, while others are less so. We don’t always have a way of knowing whether certain localities are making fewer purchases, whether they are publicizing notices or procuring goods and services outside of the Chinese Government Procurement Network, whether previous notices have been removed from the website because they were retroactively deemed “sensitive,” or whether notices are missing from the site for some other reason. This means not only that some notices are certainly missing, but also that the missing notices are not a random sample. As noted above, our primary dataset does not include notices from provincial-level procurement websites.

There are, however, a few ways in which the data, and our methods of searching, might have led us to over-count. As described above, many purchases may appear multiple times in the dataset; for example, one purchase might appear first as an initial announcement, second as a modification announcement, and finally in an announcement of who won the bid. We have done our best to eliminate these duplicates in our smaller set of 1,271 entries, but we may have unintentionally included a small percentage of them.
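
To make the over-counting risk concrete, the sketch below groups notices by purchase and counts purchases rather than notices. It is an illustration only: it assumes every notice carries a shared `project_id` linking the announcements for one purchase, which is a hypothetical field, and it does not describe how duplicates were actually identified in our dataset.

```python
from collections import defaultdict


def group_by_purchase(notices, key="project_id"):
    """Group notices that belong to the same purchase.

    `key` names a hypothetical identifier shared by every announcement
    (initial, modification, award) for a single purchase (assumption).
    """
    groups = defaultdict(list)
    for notice in notices:
        groups[notice[key]].append(notice)
    return dict(groups)


def count_purchases(notices, key="project_id"):
    """Count distinct purchases instead of individual notices."""
    return len(group_by_purchase(notices, key))
```

Three announcements for a single purchase would then count once, whereas a raw tally of notices would count them three times.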

Jessica Batke and Mareike Ohlberg