stroke_essay_llm

Authors:
Rohan Khera MD MS, Aline F Pedroso PhD, Vipina K Keloth PhD, Hua Xu PhD, Gisele S Silva MD PhD, Lee H Schwamm MD

Links:

[Manuscript – TBD]
CarDS Lab Website

Repository Overview

This repository contains the Python scripts and supporting materials used for the analyses in a manuscript that characterizes the linguistic features distinguishing AI-generated from human-authored scientific text and evaluates how well AI-detection tools perform at this task.

Repository Structure

1. `1_cohort_creation.py`

Contains scripts for processing and annotating the main dataset, including:

Parsing feature values from rater annotations
Cleaning and transforming data for modeling
Creating categorical variables for AI and human impact
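
As a rough illustration of the steps listed above, the sketch below parses feature values out of a free-text rater annotation and bins a numeric impact score into a category. The annotation format (`name=value` pairs separated by semicolons), the field names, and the cut-points are assumptions made for this example; they are not taken from `1_cohort_creation.py`.

```python
from typing import Dict


def parse_annotation(raw: str) -> Dict[str, float]:
    """Parse a hypothetical 'name=value; name=value' rater annotation."""
    features: Dict[str, float] = {}
    for chunk in raw.split(";"):
        chunk = chunk.strip()
        if not chunk or "=" not in chunk:
            continue  # skip empty or malformed pieces
        name, _, value = chunk.partition("=")
        try:
            features[name.strip().lower()] = float(value)
        except ValueError:
            continue  # non-numeric values are dropped in this sketch
    return features


def impact_category(score: float, low: float = 2.0, high: float = 4.0) -> str:
    """Bin a numeric impact score; the cut-points are illustrative only."""
    if score < low:
        return "low"
    if score < high:
        return "moderate"
    return "high"


# Hypothetical annotation string, not real study data.
row = parse_annotation("AI_impact=4; human_impact=1.5; notes=n/a")
categories = {name: impact_category(value) for name, value in row.items()}
print(categories)
```

Returning plain dictionaries keeps the sketch free of dependencies; the same two functions could be applied column-wise to a data frame if the real inputs are loaded from Excel or CSV.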

2. `2_tables.py`

Contains scripts for:

Calculating best essay ratios
Generating summary tables
Statistical comparisons of text features by reviewer ratings
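
One possible reading of "best essay ratio" is the share of rater votes in which each essay source was chosen as the best. The sketch below computes that share from a list of votes. The vote labels and data layout are hypothetical, and the definition used in `2_tables.py` may differ.

```python
from collections import Counter
from typing import Dict, Iterable


def best_essay_ratios(votes: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of 'best essay' votes received by each source."""
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        return {}  # no votes, so no ratios to report
    return {source: n / total for source, n in counts.items()}


# Hypothetical votes, one per rater decision (not real study data).
example_votes = ["human", "llm", "llm", "human", "llm"]
ratios = best_essay_ratios(example_votes)
for source, ratio in sorted(ratios.items()):
    print(f"{source}: {ratio:.2f}")
```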

3. `3_figures.py`

Includes code for visualizing:

Distributions of AI/Human impact annotations
Essay classification metrics such as GPTZero prediction and subjectivity

4. `4_supplement_material.py`

Includes code for:

Merging individual essay PDF files into a combined supplemental file
Adding labels to essays and formatting the output for manuscript submission
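
A minimal sketch of the merge step is shown below, assuming the third-party `pypdf` package is installed and the essays sit in a single folder. The file naming, the `Essay N` label format, and the use of PDF bookmarks to carry the labels are assumptions for illustration; `4_supplement_material.py` may place labels on the pages themselves.

```python
import re
from pathlib import Path
from typing import List


def natural_key(path: Path) -> list:
    """Sort key so that 'essay_2.pdf' comes before 'essay_10.pdf'."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", path.name)]


def merge_essays(pdf_dir: Path, output: Path) -> List[str]:
    """Merge every PDF in pdf_dir into output and return the labels used.

    Write the output outside pdf_dir so it is not picked up on a rerun.
    """
    from pypdf import PdfWriter  # third-party; assumed to be installed

    writer = PdfWriter()
    labels: List[str] = []
    pdfs = sorted(pdf_dir.glob("*.pdf"), key=natural_key)
    for i, pdf in enumerate(pdfs, start=1):
        label = f"Essay {i}"  # hypothetical label format
        # Each label is stored as a bookmark pointing at the essay's first page.
        writer.append(str(pdf), outline_item=label)
        labels.append(label)
    with open(output, "wb") as fh:
        writer.write(fh)
    writer.close()
    return labels
```

A call such as `merge_essays(Path("essays"), Path("supplement.pdf"))` would then produce one combined file, where both paths are placeholders.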

How to Use

Ensure all required Excel or CSV input files are present in the correct paths as expected by the scripts.
Run each script in sequence to process data, compute tables/figures, and compile supplemental PDFs.
For figures, additional visualization libraries such as matplotlib or seaborn may be required.
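
Running the scripts in sequence can be sketched with a small driver like the one below. The script names come from this repository; the driver itself, including the `run_pipeline` function and its `dry_run` flag, is a hypothetical convenience and not part of the repository.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

# Script names as listed under Repository Structure, in run order.
SCRIPTS = [
    "1_cohort_creation.py",
    "2_tables.py",
    "3_figures.py",
    "4_supplement_material.py",
]


def run_pipeline(repo_dir: Path, dry_run: bool = False) -> List[List[str]]:
    """Run each script in order with the current interpreter.

    With dry_run=True the commands are returned without being executed.
    """
    repo_dir = repo_dir.resolve()
    commands = [[sys.executable, str(repo_dir / name)] for name in SCRIPTS]
    if dry_run:
        return commands
    for cmd in commands:
        # check=True stops the sequence at the first failing script.
        subprocess.run(cmd, check=True, cwd=repo_dir)
    return commands


if __name__ == "__main__":
    # Print the planned commands only; pass dry_run=False to execute them.
    for cmd in run_pipeline(Path("."), dry_run=True):
        print(" ".join(cmd))
```

Stopping at the first failure is a deliberate choice here, since each later script presumably depends on outputs from the earlier ones.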

Contact: Rohan Khera - [email protected]

Version: June 2025

