CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
Updated Mar 17, 2024 - Python
🔥 🔥 🔥 Open-source, AI-driven data onboarding platform: a free flatfile.com alternative
This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (NeurIPS22).
[NeurIPS'22 Spotlight] A Contrastive Framework for Neural Text Generation
Calculate perplexity on a text with pre-trained language models. Supports MLM (e.g. DeBERTa), recurrent LM (e.g. GPT3), and encoder-decoder LM (e.g. Flan-T5).
High-performance small model evaluation: Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task
Embeddings: State-of-the-art Text Representations for Natural Language Processing tasks, an initial version of the library focusing on the Polish language
TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis
The PreTENS shared task hosted at SemEval 2022 aims at focusing on semantic competence with specific attention on the evaluation of language models with respect to the recognition of appropriate taxonomic relations between two nominal arguments (i.e. cases where one is a supercategory of the other, or in extensional terms, one denotes a superset…
Code for "Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization"
Informal to formal dataset mask MLM
A 78.5% word sense disambiguator based on Transformers and RoBERTa (PyTorch)
A project that harnesses the Stanford NLP library to gauge sentiment from provided text via an intuitive graphical interface.
translatorlab: a machine translation tool that uses artificial intelligence models to provide accurate and fast translations between different languages
The PowerShell Random Text Generator is a script that generates random text based on a given model.
This project scrapes and cleans Shakespeare's public domain texts, trains a character-level LSTM model in PyTorch, and generates fresh, Shakespeare-like text. Perfect for literature and NLP enthusiasts, it provides metrics (loss, perplexity, accuracy) and a platform for tuning hyperparameters and exploring the art of AI-driven language modeling.
The project generates a sentence given a pre-defined starting phrase from the user such as "Ilbierah kont" and the script attempts to build a sentence off of that phrase. Structurally, the generator works in an n-gram fashion but the main structures used to generate the sentences were the unigram, bigram and trigram. The perplexity for each n-gr…
A personality test that classifies into four personality types. Classification uses a natural language processing algorithm, Multinomial Naive Bayes.
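Several of the projects listed above report or compute perplexity. As a rough illustration of what that metric measures, here is a minimal sketch that assumes per-token natural-log probabilities are already available from some model. The function name `perplexity` and the toy inputs are hypothetical and are not taken from any of the listed repositories.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the negative mean log-probability of the tokens.

    token_log_probs: natural-log probabilities that a model assigned to
    each observed token (hypothetical input for this sketch).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Toy case: a model that gives every token probability 0.25 behaves like
# a uniform choice among four options, so its perplexity is close to 4.
uniform = [math.log(0.25)] * 4
print(perplexity(uniform))

# A more confident model (higher token probabilities) scores lower.
confident = [math.log(0.9), math.log(0.8), math.log(0.95)]
print(perplexity(confident))
```

Lower values mean the model found the text less surprising. Real tools differ in how they obtain the token probabilities (for example, masked versus left-to-right scoring), which this sketch does not cover.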