In the spirit of open science and transparency, we provide all data and code used for our public dashboards.
Aggregated Legislator Profiles
Our online dashboards are based on real-time data from a variety of sources, allowing us to build a profile for each legislator in Congress. The profile data for each legislator can be downloaded in its entirety below. This dataset is useful for running quick analyses comparing legislators’ communication profiles, campaign finances, ideological profiles, and legislative effectiveness.
Data for the legislator profiles come from a variety of sources. More information about each source is given below:
The rhetoric profiles we build for each legislator are based on over a million snippets of public communication data from a variety of online sources (Twitter, floor speeches, newsletters, and public statements), collected and processed in-house.
We then use ChatGPT to tag each snippet with any of our rhetoric categories that apply.
Finally, we aggregate the tags to see what each legislator spends most of their time talking about, and we rank each legislator against their colleagues.
Updated weekly.
(The full dataset of classified text can be found below.)
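The aggregation and ranking steps described above can be sketched roughly as follows. This is a minimal illustration and not our production pipeline: the legislator names, category labels, and the functions `category_shares` and `rank_by_category` are hypothetical, and it assumes each snippet carries a set of zero or more category tags.

```python
from collections import Counter, defaultdict

# Hypothetical tagged snippets: (legislator, set of categories on that snippet).
# Names and category labels are made up for illustration.
snippets = [
    ("Rep. A", {"policy"}),
    ("Rep. A", {"policy", "attack"}),
    ("Rep. B", {"attack"}),
    ("Rep. B", {"attack"}),
    ("Rep. B", set()),
    ("Rep. C", {"policy"}),
]

def category_shares(rows):
    """Return {legislator: {category: share of that legislator's snippets with the tag}}."""
    totals = Counter()               # snippets per legislator
    tagged = defaultdict(Counter)    # tagged snippets per legislator and category
    for legislator, categories in rows:
        totals[legislator] += 1
        for category in categories:
            tagged[legislator][category] += 1
    return {
        leg: {cat: n / totals[leg] for cat, n in tagged[leg].items()}
        for leg in totals
    }

def rank_by_category(shares, category):
    """Rank legislators on one category, highest share first (rank 1 = highest)."""
    ordered = sorted(
        shares,
        key=lambda leg: shares[leg].get(category, 0.0),
        reverse=True,
    )
    return {leg: rank for rank, leg in enumerate(ordered, start=1)}

shares = category_shares(snippets)
attack_ranks = rank_by_category(shares, "attack")
```

Because a snippet may carry several tags, the shares for one legislator need not sum to one. Ties are broken by input order here, which is an arbitrary choice for this sketch.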
Our ideology scores are based on how members of Congress vote in relation to their party. We use publicly available data from Voteview, and calculate an ideology score based on the approach of Duck-Mayr & Montgomery (2022). The aggregated data we use can be downloaded below.
Updated weekly.
Information on legislative effectiveness is based on the number of bills a legislator proposes and how many of those bills are signed into law. Legislation data is originally sourced from ProPublica. The aggregated data we use can be downloaded below.
Updated weekly.
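As a rough illustration of the two counts behind the effectiveness measure, the sketch below tallies bills proposed and bills enacted per sponsor. The `Bill` record, its field names, and the `enacted_share` ratio are assumptions made for this example; the text above does not specify how the two counts are combined into a single score.

```python
from dataclasses import dataclass

@dataclass
class Bill:
    sponsor: str       # hypothetical field: legislator who proposed the bill
    became_law: bool   # hypothetical field: True if the bill was signed into law

# Made-up records for illustration only.
bills = [
    Bill("Rep. A", True),
    Bill("Rep. A", False),
    Bill("Rep. A", False),
    Bill("Rep. B", False),
]

def effectiveness(records):
    """Count bills proposed and enacted per sponsor, plus the enacted share."""
    summary = {}
    for bill in records:
        row = summary.setdefault(bill.sponsor, {"proposed": 0, "enacted": 0})
        row["proposed"] += 1
        row["enacted"] += int(bill.became_law)
    for row in summary.values():
        # Every sponsor listed has at least one bill, so this never divides by zero.
        row["enacted_share"] = row["enacted"] / row["proposed"]
    return summary

effectiveness_summary = effectiveness(bills)
```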
Attendance measures how often a member of Congress shows up to vote on legislation. Attendance data is originally sourced from Voteview. The aggregated data we use can be downloaded below.
Updated weekly.
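One simple way to express attendance is the share of roll-call votes in which a member cast a vote. The sketch below assumes each vote has already been reduced to a plain status string; the status labels, the `attendance_rate` function, and the choice to count "present" as attending are illustrative assumptions, and mapping Voteview's own vote codes onto these labels is left out.

```python
# Hypothetical status labels; which ones count as "attended" is an assumption.
ATTENDED = frozenset({"yea", "nay", "present"})

def attendance_rate(votes, counts_as_attended=ATTENDED):
    """Return the fraction of roll calls the member took part in, or None if there were none."""
    if not votes:
        return None
    attended = sum(1 for status in votes if status in counts_as_attended)
    return attended / len(votes)

# Made-up voting record for one member across four roll calls.
example_rate = attendance_rate(["yea", "nay", "not_voting", "present"])
```

Returning `None` for an empty record keeps a member with no eligible votes distinct from one who missed every vote.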
Campaign Finance
Campaign finance data is sourced directly from the Federal Election Commission’s database. The aggregated data we use can be downloaded below.
Updated quarterly.
Classified Communication Data
The communication profile of each legislator is based on text snippets classified by ChatGPT. The full annotated dataset can be found below. It is suited to deeper dives into the political communication of legislative elites.
Updated daily.