Camfranglais-LLM #42

Open
7 of 21 tasks
Zaker237 opened this issue Apr 12, 2023 · 3 comments

Comments

@Zaker237
Member

Introduction

A lightweight language model that can understand our language (Camfranglais).

Description

Goal

The goal of the project is to build (and hopefully train) a language model that understands Camfranglais, so that the model can later be used for translation (for example, into pure English or pure French).

Process

I don't have the whole process in mind yet, but my idea is the following:

  • We start by finding a way to collect enough text in Camfranglais. For this we can build a web application, open to the community, where people contribute text in Camfranglais together with its translation into French and/or English. Oss already has an application of this kind in development that we can use. Here is the link to the app.
  • As soon as we have a usable amount of data, we can start building and training our models.
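As a rough illustration of the collection step above, here is a minimal sketch of how one contributed sentence could be stored and filtered before training. The `Contribution` class, its field names, and the JSON Lines output are assumptions made for this example; they do not describe the schema of the existing application.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Contribution:
    """One community submission (hypothetical schema, not the existing app's)."""
    camfranglais: str
    french: Optional[str] = None
    english: Optional[str] = None

    def is_valid(self) -> bool:
        # Require the Camfranglais text plus at least one non-empty translation.
        has_source = bool(self.camfranglais.strip())
        has_translation = any(t and t.strip() for t in (self.french, self.english))
        return has_source and has_translation


def to_jsonl(contributions):
    """Serialize the valid contributions, one JSON object per line."""
    return "\n".join(
        json.dumps(asdict(c), ensure_ascii=False)
        for c in contributions
        if c.is_valid()
    )


if __name__ == "__main__":
    # Placeholder strings only; real entries would come from contributors.
    rows = [
        Contribution("<Camfranglais sentence>", french="<French translation>"),
        Contribution("<sentence without translation>"),  # dropped by is_valid()
    ]
    print(to_jsonl(rows))
```

Keeping both translations optional but requiring at least one mirrors the "French and/or English" wording; a stricter rule may turn out to be more useful once real data exists.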

Why this project?

  • First of all, because it can be fun to work on.
  • People can learn a lot about ML by working on such a project.
  • In the end, we could have an English/French-Camfranglais translator.

Relevant Technology

For the technology, it will mainly be Python with the following libraries:

  • PyTorch/TensorFlow
  • NumPy/pandas
  • transformers

and also some web technologies.

Complexity

  • Beginner - This project requires no or little prior knowledge of the technolog(y|ies) specified to contribute to the project
  • Intermediate - The user should have some prior knowledge of the technolog(y|ies) to the point where they know how to use it, but not necessarily all the nooks and crannies of the technology
  • Advanced - The project requires the user to have a good understanding of all components of the project to contribute

Required time

  • Little work - A couple of days
  • Medium work - A week or two
  • Much work - The project will take more than a couple of weeks and serious planning is required

Categories

  • Mobile app
  • IoT
  • Web app
  • Frontend/UI
  • AI/ML
  • APIs/Backend
  • Voice Assistant
  • Developer Tooling
  • Extension/Plugin/Add-On
  • Design/UX
  • AR/VR
  • Bots
  • Security
  • Blockchain
  • Futuristic Tech/Something Unique
@github-actions

It's great having you contribute to this project

Welcome to the community 🤓, we will carefully review your project idea and get back to you.

If you would like to follow our community's work you should join us on our Telegram chat group and Channel, we help and encourage each other to contribute to open source.
You can also support us financially here to help us build Cameroon one open source at a time.

@billmetangmo

Good, fun work to do @Zaker237. I could be interested.
BTW, I don't think it's necessary to train a new model, at least for preliminary versions (v0.x).

Why? Because Camfranglais is mostly a mix of French/English words, with fewer words from local dialects (I don't know the real proportions, however).
Assuming this is true, and given that ChatGPT can easily understand a French/English mix:
[screenshot, 2023-04-12]

and knows Camfranglais but makes some errors:
[screenshot]

The easiest way would be to give ChatGPT a dictionary as context, like this one from Valery Ndongo https://docsend.com/view/avvk5ef9qpvzy5zd, as shown below:

[screenshot]
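The idea of passing dictionary entries as context could be sketched roughly as follows. This only builds the prompt text and does not call any model API. The function name `build_prompt`, the prompt wording, and the choice to include only the glossary entries that appear in the sentence (to keep the context short) are assumptions for illustration, and the glossary values are placeholders rather than real dictionary content.

```python
def build_prompt(glossary, sentence, target_language="French"):
    """Assemble a translation prompt using only glossary entries found in the sentence."""
    # Normalize the sentence into lowercase words without surrounding punctuation.
    words = {w.strip(".,!?;:").lower() for w in sentence.split()}
    # Keep only the entries whose term occurs in the sentence (single-word terms only).
    relevant = {
        term: meaning
        for term, meaning in glossary.items()
        if term.lower() in words
    }
    lines = [f"- {term}: {meaning}" for term, meaning in sorted(relevant.items())]
    glossary_block = "\n".join(lines) if lines else "(no glossary entries matched)"
    return (
        "You are translating Camfranglais.\n"
        "Glossary:\n"
        f"{glossary_block}\n\n"
        f"Translate into {target_language}: {sentence}"
    )


if __name__ == "__main__":
    # Placeholder glossary; real entries would come from the dictionary.
    glossary = {"term1": "<meaning of term1>", "term2": "<meaning of term2>"}
    print(build_prompt(glossary, "A sentence containing term1."))
```

Matching is done word by word, so multi-word expressions in the dictionary would need a different lookup.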

This could be a good use case for a ChatGPT plugin. Does anyone already have access?

@Zaker237
Member Author

@billmetangmo That dictionary from Valery Ndongo is indeed a great resource for this project. Thanks, I didn't know about it.
