Skip to content
View Kquant03's full-sized avatar

Block or report Kquant03

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Please don't include any personal information such as legal names or email addresses. Maximum 100 characters, markdown supported. This note will be visible to only you.
Report abuse

Contact GitHub support about this userโ€™s behavior. Learn more about reporting abuse.

Report abuse
Kquant03/README.md
Header Image

Advancing Artificial Intelligence With Data Science

๐Ÿš€ Projects

Project Name Description Technologies
Guide on Everything AI Maintaining a comprehensive guide on everything I've learned regarding artificial intelligence. Transformers, PEFT, TTS, Latent Diffusion, Fine-tuning techniques
Operation Athena Curating a large database of reasoning tasks that updates in real time and has self moderation capabilities Next.js, Node.js, MongoDB
The Caduceus Project
(Code) (Dataset)
Utilizing GPT-4o to convert highly technical medical and scientific protocls into markdown files. Python, OpenAI API
Pneuma (wip) Training an LLM on realistic human interactions, conversations, and experiences. Python, Fine-Tuning, Direct-Preference-Optimization
Apocrypha and Sandevistan (wip) Datasets representing experiences, imaginative scenarios, and other such things we normally don't train LLMs on. Python, Together.ai API, Prompt-Engineering, Mirroring of Neural Patterns.
Study Guide for LLMs (wip) A dataset to help LLMs study for the ARC-C and MMLU benchmarks by utilizing the Nemotron model. Python, Nvidia API, Prompt-Engineering
System Prompt Generator (unreleased) A python script that synthetically generates system prompts for ShareGPT datasets Python, Together.ai API, Prompt-Engineering
Interactive Experience Generator (unreleased) A data pipeline that generates interactions between a human and an AI in ShareGPT format. Python, Together.ai API, Prompt-Engineering
Nemotron 340B Data Generation Pipeline (unreleased) A data pipeline to generate multiturn data with Nvidia's new 340B Nemotron model, in the ShareGPT format. Python, Nvidia API, Prompt-Engineering
Multiversal Data (wip) A potential solution to help Large Language Models predict future events. Python, Together.ai API, Prompt-Engineering

๐Ÿ’ผ Skills

Python
Next.js
PyTorch

๐ŸŒฑ Learning

I'm currently exploring the following technologies:

- Machine Learning

- Cloud Computing

- Data Science

- Deep Learning

๐Ÿ“ซ Get in Touch

Feel free to reach out to me for collaboration, discussions, or just to say hi!

Email

Pinned Loading

  1. operation-athena-mongo operation-athena-mongo Public

    Curating reasoning tasks for LLMs.

    TypeScript 3

  2. Replete-Guide-pro Replete-Guide-pro Public

    MDX

  3. LOGSHAPER-OOBAtoDPO LOGSHAPER-OOBAtoDPO Public

    Convert first and second message in logs to the "question" and "rejected" fields of DPO format.

    Python 1

  4. LOGSHAPER-OOBAtoShareGPT LOGSHAPER-OOBAtoShareGPT Public

    A script for reformatting Oobabooga chat logs into multiturn data.

    Python 1

  5. Benchmark-Contamination-Checker Benchmark-Contamination-Checker Public

    Python