Data

In the spirit of open science and transparency, we provide all data and code used for our public dashboards.

Aggregated Legislator Profiles

Our online dashboards are based on real-time data from a variety of sources, allowing us to build a profile for each legislator in Congress. The profile data for each legislator can be downloaded in its entirety below. This dataset is useful for running quick analyses comparing a legislator’s communication profile, campaign finances, ideological profile, and legislative effectiveness.

[meta data]

Data for the legislator profiles come from a variety of sources. More information about each source is given below.

Communication

The rhetoric profiles we build for each legislator are based on over a million snippets of public communication data from a variety of online sources (Twitter, floor speeches, newsletters, and public statements), all collected and processed in-house.

We then leverage ChatGPT to tag each snippet as belonging to any of our rhetoric categories.

We then aggregate the data to get a sense of what legislators spend most of their time talking about, and we rank each legislator based on how they compare to their colleagues.
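As a rough illustration of the aggregation and ranking step, the sketch below computes, for each legislator, the share of their snippets tagged with each category, then ranks legislators within one category. It is not the code behind the dashboards: the input layout, the function names `category_shares` and `rank_legislators`, the category labels, and the tie-breaking rule are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def category_shares(snippets):
    """Map each legislator to {category: share of their snippets with that tag}.

    `snippets` is an iterable of (legislator, categories) pairs, where
    `categories` lists the tags assigned to one snippet (possibly none).
    This layout is hypothetical, not the published dataset's schema.
    """
    tag_counts = defaultdict(Counter)
    totals = Counter()
    for legislator, categories in snippets:
        totals[legislator] += 1
        for category in set(categories):  # count each tag once per snippet
            tag_counts[legislator][category] += 1
    return {
        leg: {cat: n / totals[leg] for cat, n in tag_counts[leg].items()}
        for leg in totals
    }


def rank_legislators(shares, category):
    """Rank legislators by their share for one category (1 = highest share).

    Ties are broken alphabetically, an arbitrary choice for this sketch.
    """
    ordered = sorted(shares, key=lambda leg: (-shares[leg].get(category, 0.0), leg))
    return {leg: position for position, leg in enumerate(ordered, start=1)}


# Toy data with made-up legislators and categories.
example_snippets = [
    ("A", ["policy"]),
    ("A", ["attack"]),
    ("B", ["policy"]),
    ("B", ["policy", "attack"]),
    ("C", []),
]
example_shares = category_shares(example_snippets)
example_ranks = rank_legislators(example_shares, "policy")
```

Using shares rather than raw counts keeps legislators who communicate a lot from dominating every ranking, though the dashboards may normalize differently.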

Updated weekly.

View Code

(Full dataset of classified text can be found below)

Ideology

Our ideology scores are based on how members of Congress vote in relation to their party. We use publicly available data from Voteview, and calculate an ideology score based on the approach of Duck-Mayr & Montgomery (2022). The aggregated data we use can be downloaded below.

Updated weekly.

View Code

Effectiveness

Information on legislative effectiveness is based on the number of bills a legislator proposes and how many of those bills are signed into law. Legislation data is originally sourced from ProPublica. The aggregated data we use can be downloaded below.
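The two counts described above can be sketched as follows. This is a minimal example under assumed inputs, not the dashboards' actual scoring: the (sponsor, became_law) record layout, the function name `effectiveness_summary`, and the `enacted_share` ratio are hypothetical additions for illustration.

```python
from collections import defaultdict


def effectiveness_summary(bills):
    """Count bills proposed and enacted per sponsor.

    `bills` is an iterable of (sponsor, became_law) pairs, a hypothetical
    layout rather than the schema of the downloadable data.
    """
    counts = defaultdict(lambda: {"proposed": 0, "enacted": 0})
    for sponsor, became_law in bills:
        counts[sponsor]["proposed"] += 1
        if became_law:
            counts[sponsor]["enacted"] += 1
    # Every sponsor listed has at least one bill, so the division is safe.
    return {
        sponsor: {**c, "enacted_share": c["enacted"] / c["proposed"]}
        for sponsor, c in counts.items()
    }


# Toy data with made-up sponsors.
example_bills = [("A", True), ("A", False), ("B", False)]
example_summary = effectiveness_summary(example_bills)
```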

Updated weekly.

View Code

Attendance

Attendance measures how often a member of Congress shows up to vote on legislation. Attendance data is originally sourced from Voteview. The aggregated data we use can be downloaded below.
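One simple way to express this measure is the fraction of roll calls in which a legislator cast a vote, out of those in which they were eligible to vote. The sketch below assumes each record has already been reduced to a (legislator, voted) pair. How the source vote codes map to that boolean, and the function name `attendance_rates`, are assumptions of this example rather than details taken from the dashboards' code.

```python
from collections import Counter


def attendance_rates(roll_calls):
    """Return {legislator: share of eligible roll calls in which they voted}.

    `roll_calls` is an iterable of (legislator, voted) pairs covering only
    the votes each legislator was eligible for (a hypothetical layout).
    """
    eligible = Counter()
    attended = Counter()
    for legislator, voted in roll_calls:
        eligible[legislator] += 1
        if voted:
            attended[legislator] += 1
    return {leg: attended[leg] / eligible[leg] for leg in eligible}


# Toy data with made-up legislators.
example_roll_calls = [
    ("A", True),
    ("A", True),
    ("A", False),
    ("A", True),
    ("B", False),
]
example_rates = attendance_rates(example_roll_calls)
```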

Updated weekly.

View Code

Campaign Finance

Campaign finance data is sourced directly from the Federal Election Commission’s database. The aggregated data we use can be downloaded below.

Updated quarterly.

View Code


Classified Communication Data

The communication profile of each legislator is based on text snippets classified by ChatGPT. The full annotated dataset can be found below. It is suited to deeper analyses of how legislative elites communicate.

Year Download
2022 Download
2023 Download
2024 Download

[meta data]

Updated daily.

View Code