Topic Modeling of Regulatory Documents using Natural Language Processing
The 2023 collapse of Silicon Valley Bank (SVB) strongly influenced financial regulation. Our project applies natural language processing (NLP) topic modeling techniques to identify the primary themes in a corpus of financial regulatory documents scraped from regulations.gov. The objectives are to identify and visualize topics, reveal shifts in regulatory focus, and examine how topics track market trends.
In this project, we compare the texts of proposed and implemented regulations in a 36-month window surrounding the SVB collapse: 18 months before and 18 months after. Our methods range from naive keyword counts to basic and advanced topic modeling techniques (TF-IDF, BERTopic).
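As a simple illustration of the naive keyword-count approach, a minimal sketch might look like the following (the seed keyword list and toy document are hypothetical, not the project's actual configuration):

```python
import re
from collections import Counter

# Hypothetical seed keywords reflecting post-SVB regulatory themes
KEYWORDS = {"liquidity", "capital", "deposit", "insurance", "stress"}

def keyword_counts(text: str) -> Counter:
    """Count occurrences of seed keywords in a document (case-insensitive)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t in KEYWORDS)

# Example usage on a toy document
doc = "The proposal raises capital requirements and tightens liquidity rules."
print(keyword_counts(doc))  # Counter({'capital': 1, 'liquidity': 1})
```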
We expect to see increased scrutiny of mid-sized banks relative to the historical focus on Global Systemically Important Banks (G-SIBs). We also expect key themes to include heightened capital requirements, liquidity risk, and debate over the appropriate level of FDIC deposit insurance. Uncovering the root causes of the SVB collapse and its impact on regulatory trends should enable better impact mitigation should similar crises arise. Our experimental results provide strong statistical evidence supporting these hypotheses; detailed findings are available in the final report.
The main packages used in our project:
- pandas
- numpy
- scikit-learn
- nltk
- gensim
- matplotlib
- openai
- keybert
- rake-nltk
- yake
- spacy
- bertopic
- umap-learn

We also use the Python standard-library modules os, re, and json.
Core methods, all implemented in Python:
- Keyword Identification & Extraction
- Topic Modeling (TF-IDF, BERTopic) - see the TF-IDF sketch below
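A minimal sketch of TF-IDF-based keyword surfacing with scikit-learn, assuming a toy corpus in place of the actual regulatory documents:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus standing in for the regulatory documents
docs = [
    "Capital requirements for mid-sized banks were raised.",
    "FDIC deposit insurance limits are under review.",
    "Liquidity risk management and stress testing guidance.",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(docs)
terms = vectorizer.get_feature_names_out()

# Top-weighted terms per document serve as candidate keywords
for i, row in enumerate(tfidf.toarray()):
    top = sorted(zip(terms, row), key=lambda x: -x[1])[:3]
    print(f"doc {i}: {[t for t, w in top]}")
```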
The corpus of documents used in our project can be accessed in the documents folder. Implementation and method-specific preprocessing for our three primary methods live in their respective folders (naive_model, keywords for TF-IDF, and BERTopic). Key dependencies are described below:
- pandas (A data manipulation and analysis library providing data structures like DataFrames for Python.)
- numpy (A library for numerical computing in Python, providing support for large, multi-dimensional arrays and matrices.)
- scikit-learn (A machine learning library for Python, offering tools for classification, regression, clustering, and dimensionality reduction.)
- nltk (The Natural Language Toolkit, a platform for building Python programs to work with human language data.)
- gensim (A library for topic modeling and document similarity analysis in Python.)
- bertopic (A topic modeling library that leverages BERT embeddings for creating interpretable topics.)
- PyTorch (An open-source machine learning framework for deep learning.)
- scipy (A library for scientific computing in Python.)
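For reference, a minimal BERTopic run might look like the sketch below. The corpus here is toy data repeated for illustration (BERTopic needs a reasonably large document set); in practice the documents folder would be loaded instead:

```python
from bertopic import BERTopic

# Toy stand-in corpus; replace with the actual regulatory document texts
docs = [
    "Capital requirements for mid-sized banks were raised.",
    "FDIC deposit insurance limits are under review.",
    "Liquidity risk management and stress testing guidance.",
] * 100

# BERTopic embeds documents, reduces dimensionality with UMAP,
# clusters with HDBSCAN, and labels clusters via class-based TF-IDF
topic_model = BERTopic(language="english", verbose=True)
topics, probs = topic_model.fit_transform(docs)

# Inspect the discovered topics
print(topic_model.get_topic_info().head())
```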
Each of the Python script files serves a separate purpose and can be used for keyword extraction or topic modeling over our corpus of regulatory documents. Sample visualizations can be found in the figures folder.
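On the keyword-extraction side, a minimal KeyBERT call might look like this (the document text is a toy example, not drawn from our corpus):

```python
from keybert import KeyBERT

doc = (
    "The agency proposes stronger capital requirements and liquidity "
    "standards for banks with assets between $100 and $250 billion."
)

# KeyBERT ranks candidate phrases by embedding similarity to the document
kw_model = KeyBERT()
keywords = kw_model.extract_keywords(
    doc,
    keyphrase_ngram_range=(1, 2),  # consider unigrams and bigrams
    stop_words="english",
    top_n=5,
)
print(keywords)  # list of (phrase, similarity) pairs
```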
Contributors:

| Name | Handle |
|---|---|
| Priscilla Clark | @priscillaoclark |
| Nicholas Wong | @nicwjh |
| Harsh Kumar | @harshk02 |
| Elaine Zhang | @ElainehxZhang |
Distributed under the MIT License. See LICENSE for more information.
Repository Link: https://github.com/priscillaoclark/15.S08-applied-nlp-final
We would like to thank Mike Chen, Andrew Zachary, and Chengfeng Mao for their help and guidance throughout this project. The exceptional learning environment and resources provided by the Massachusetts Institute of Technology (MIT) have also been instrumental in shaping this work.