```mermaid
graph TD;
    DomainKnowledge[Domain Knowledge]-->MachineLearning[Machine Learning];
    MachineLearning-->StatisticalLearning[Statistical Learning];
    MachineLearning-->DeepLearning[Deep Learning];
    DomainKnowledge-->BackendDevelopment[Backend Development];
    DeepLearning-->ImageClassification[Image Classification];
    ImageClassification-->LabelNoise[Label Noise];
    DeepLearning-->NaturalLanguageProcessing[Natural Language Processing];
    NaturalLanguageProcessing-->TopicModeling[Topic Modeling];
    NaturalLanguageProcessing-->LLMApplication[LLM Application];
    LLMApplication-->RAG[Retrieval-Augmented Generation];
    LLMApplication-->AIAgent[AI Agent];
    DeepLearning-->Multimodality[Multimodality];
    Multimodality-->ImageTextClassification[Image-Text Classification];
    Multimodality-->VisualQuestionAnswering[Visual Question Answering];
```
- Research Interests: LLM Applications / Multimodal Learning
- Research:
  - Cross-modal medical image representation learning (USYD Engineering VRI Scholarship)
  - Topic Modeling for the Evolution through Descriptions of Applications Longitudinal (TopMEDAL)
- Internship:
🌟Multilayer Perceptron from Scratch using NumPy↗
A robust implementation of multilayer perceptrons, entirely built upon the powerful NumPy library.
Advantages of our implementation:
- Keras-like API (a usage sketch follows this list): `import numpy_keras as keras`
- Autograd integration: `import numpy_keras.autograd as keras`
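To illustrate the Keras-like claim, training a small network could look like the following. This is a hypothetical sketch that assumes `numpy_keras` mirrors the Keras `Sequential`/`Dense` API; the exact layer names and signatures may differ from the actual package.

```python
import numpy as np
import numpy_keras as keras

# Hypothetical Keras-style usage: layer names and signatures are assumed
# to mirror Keras and may differ from the actual numpy_keras API.
X_train = np.random.rand(256, 784)                   # toy inputs
y_train = np.eye(10)[np.random.randint(0, 10, 256)]  # toy one-hot labels

model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.fit(X_train, y_train, epochs=5, batch_size=32)
```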
🌟Non-negative Matrix Factorization using NumPy↗
This project implements nine Non-negative Matrix Factorization (NMF) algorithms and compares each algorithm's robustness to five types of noise found in real-world data applications.
- Sufficient experiments: We conducted a series of experiments (2 datasets × 5 noise types × 2 noise levels × 5 random seeds), whose results are displayed in the repository and can serve as a baseline when you develop your own algorithms.
- Flexible development: Our development framework empowers you to effortlessly create your own NMF algorithms with minimal Python scripting.
- Mature pipeline: Our framework offers well-established pipelines for both standard and customized NMF tests. For personalized NMF models, the `nmf` parameter accepts a `BasicNMF` object, so you can seamlessly insert your own model into our pipeline to evaluate its performance.
- Multiprocessing experiments: We've harnessed multiprocessing for extensive experiments, significantly enhancing efficiency and reducing the total runtime to 30%-50% of what sequential execution would take.
For a comprehensive analysis of your algorithm, our platform lets you conduct multiple experiments across various datasets:
```python
from algorithm.pipeline import Experiment

exp = Experiment()
exp.choose('L1NormRegularizedNMF')
exp.execute()
```
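For context, the classic baseline that many NMF variants build on is the Lee-Seung multiplicative update. Below is a minimal, self-contained NumPy sketch of that baseline (an illustration of the technique, not this repository's implementation):

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=200, eps=1e-10, seed=0):
    """Frobenius-norm NMF via Lee-Seung multiplicative updates: X ≈ W @ H."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update H with W fixed
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update W with H fixed
    return W, H

# Toy usage on a random non-negative matrix.
W, H = nmf_multiplicative(np.random.rand(100, 50), k=10)
```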
This project aims to reproduce various convolutional neural networks and modify them to our specific requirements.
| | AlexNet | VGGNet | SpinalNet | ResNet |
|---|---|---|---|---|
| Accuracy | 87.95% | 89.80% | 87.15% | 89.28% |
| Precision | 87.62% | 90.01% | 86.18% | 89.24% |
| Recall | 87.95% | 89.80% | 87.15% | 89.28% |
| F1 score | 86.59% | 88.42% | 85.28% | 88.30% |

| | AlexNet | VGGNet | SpinalNet | ResNet |
|---|---|---|---|---|
| Accuracy | 86.96% | 87.24% | 85.92% | 86.88% |
| Precision | 85.55% | 86.43% | 85.92% | 86.88% |
| Recall | 86.96% | 87.24% | 85.92% | 86.88% |
| F1 score | 85.58% | 85.66% | 84.07% | 85.68% |
This project involves a multi-label, multi-class classification problem. We deployed four pre-trained image models and two pre-trained text models. To enhance performance, we developed 12 multimodal models using self-attention and cross-attention mechanisms. The project poster showcases some valuable techniques and intriguing discoveries.
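As a flavor of the cross-attention variants, here is a generic PyTorch sketch of fusing text and image token embeddings. It illustrates the mechanism only, not one of the project's 12 models; the dimensions and label count are placeholders.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuse image and text token embeddings with cross-attention."""
    def __init__(self, dim=768, num_heads=8, num_labels=18):  # placeholder sizes
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(dim, num_labels)

    def forward(self, image_tokens, text_tokens):
        # Text queries attend over image tokens (query=text, key/value=image).
        fused, _ = self.cross_attn(text_tokens, image_tokens, image_tokens)
        pooled = fused.mean(dim=1)        # mean-pool the fused tokens
        return self.classifier(pooled)    # multi-label logits (train with BCEWithLogitsLoss)

# Toy usage: batch of 4, 49 image tokens and 16 text tokens of width 768.
model = CrossAttentionFusion()
logits = model(torch.randn(4, 49, 768), torch.randn(4, 16, 768))
```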
This project is an experimental repository focusing on datasets with a high level of label noise (50% and above). It features experiments on the `FashionMNIST` and `CIFAR` datasets, using `ResNet34` as the baseline classifier.

The repository explores various training strategies (`Trainer` objects), including `ForwardLossCorrection`, `CoTeaching`, `JoCoR`, and `O2UNet`. For datasets with unknown transition matrices, `DualT` is employed as the transition matrix estimator (a sketch of the forward-correction idea follows the results below).
- Persuasive Results

| Actual Transition Matrix | | | Estimated Transition Matrix | | |
|---|---|---|---|---|---|
| 0.5 | 0.2 | 0.3 | 0.473 | 0.209 | 0.309 |
| 0.3 | 0.5 | 0.2 | 0.306 | 0.485 | 0.232 |
| 0.2 | 0.3 | 0.5 | 0.221 | 0.306 | 0.460 |
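For reference, the core idea behind `ForwardLossCorrection` (Patrini et al., 2017) is to push the model's clean-label posterior through the transition matrix before computing the loss. A minimal PyTorch sketch, assuming `T` is known or estimated (e.g., by `DualT`); it shows the general technique, not this repository's exact `Trainer` code:

```python
import torch
import torch.nn.functional as F

def forward_corrected_loss(logits, noisy_targets, T):
    """Forward loss correction: T[i, j] = P(noisy label j | clean label i)."""
    clean_probs = F.softmax(logits, dim=1)   # model's clean-label posterior
    noisy_probs = clean_probs @ T            # predicted noisy-label distribution
    return F.nll_loss(torch.log(noisy_probs + 1e-12), noisy_targets)

# Toy usage with the 3x3 matrix from the table above (hypothetical batch of 8).
T = torch.tensor([[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]])
loss = forward_corrected_loss(torch.randn(8, 3), torch.randint(0, 3, (8,)), T)
```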
🌟Transformers for Tabular Data↗
A PyTorch-based implementation that leverages Transformer architectures to model tabular data.
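The usual trick is to treat every column as a token. Below is a condensed FT-Transformer-style sketch in PyTorch; it is a generic illustration under assumed dimensions, not this repository's exact model.

```python
import torch
import torch.nn as nn

class TabularTransformer(nn.Module):
    """Embed each column as a token, then apply a Transformer encoder."""
    def __init__(self, num_numeric, cat_cardinalities, dim=64, num_classes=2):
        super().__init__()
        # One learned embedding per numeric column, scaled by the cell value.
        self.numeric_embed = nn.Parameter(torch.randn(num_numeric, dim))
        self.cat_embeds = nn.ModuleList(nn.Embedding(c, dim) for c in cat_cardinalities)
        self.cls = nn.Parameter(torch.randn(1, 1, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=3)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x_num, x_cat):
        num_tok = x_num.unsqueeze(-1) * self.numeric_embed   # (B, n_num, dim)
        cat_tok = torch.stack(
            [emb(x_cat[:, i]) for i, emb in enumerate(self.cat_embeds)], dim=1)
        cls = self.cls.expand(x_num.size(0), -1, -1)
        tokens = torch.cat([cls, num_tok, cat_tok], dim=1)
        return self.head(self.encoder(tokens)[:, 0])         # classify from [CLS]

# Toy usage: 3 numeric columns, 2 categorical columns with 5 and 8 categories.
model = TabularTransformer(num_numeric=3, cat_cardinalities=[5, 8])
logits = model(torch.randn(16, 3), torch.randint(0, 5, (16, 2)))
```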
🌟MultiCLIP: Multimodal-Multilabel-Multistage Classification using Language Image Pre-training↗
A framework for multimodal-multilabel-multistage classification utilizing advanced pretrained models like CLIP and BLIP.
Implementation diagrams are provided in the repository.
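As an illustration of one plausible first stage, extracting frozen CLIP features with Hugging Face `transformers` might look like this. This is a sketch of the general approach, not the framework's actual code; the model ID and the fusion step are placeholders.

```python
import torch
from transformers import CLIPModel, CLIPProcessor
from PIL import Image

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

images = [Image.new("RGB", (224, 224))]   # stand-in for a dataset batch
captions = ["a placeholder caption"]

inputs = processor(text=captions, images=images, return_tensors="pt", padding=True)
with torch.no_grad():
    img_feat = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt_feat = model.get_text_features(input_ids=inputs["input_ids"],
                                       attention_mask=inputs["attention_mask"])

# The concatenated features would then feed a multi-label head
# (trained with BCEWithLogitsLoss) in a later stage.
fused = torch.cat([img_feat, txt_feat], dim=-1)
```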
🌟Hands-on LoRa: Practical Fine-tuning LLMs using LoRa↗
🌟Llama3Ops: From LoRa to Deployment with Llama3↗
- Model weights: XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k (a minimal inference sketch follows this list)
- Fine-tuning frameworks: `LLaMA-Factory` | `PEFT` | `Unsloth`
- Quantization frameworks: `llama.cpp` | `AutoAWQ` | `AutoGPTQ`
- Deployment frameworks: `llama.cpp` | `ollama` | `TensorRT-LLM` & `Triton` | `vLLM`
- RAG frameworks: `LangChain` | `LlamaIndex`
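Loading the published weights with `transformers` could look like the sketch below; the generation settings are illustrative defaults, not the project's recommended configuration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# "Hello, please introduce yourself." (the model is fine-tuned for Chinese)
messages = [{"role": "user", "content": "你好，请介绍一下你自己。"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```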
🌟Awesome Tutorials for TensorFlow2↗
| Specialization | Provider | Completion Date | Credential |
|---|---|---|---|
| Generative Adversarial Networks (GANs) | DeepLearning.AI | Jun 2024 | Link |
| Natural Language Processing | DeepLearning.AI | Oct 2023 | Link |
| Deep Learning | DeepLearning.AI | Aug 2023 | Link |
| Mathematics for Machine Learning and Data Science | DeepLearning.AI | Aug 2023 | Link |
| Applied Data Science with Python | University of Michigan | Jul 2023 | Link |
| Machine Learning | DeepLearning.AI & Stanford University | Jul 2023 | Link |
| Mathematics for Machine Learning | Imperial College London | Jun 2023 | Link |
| Expressway to Data Science: Python Programming | University of Colorado Boulder | Dec 2022 | Link |
| Python 3 Programming | University of Michigan | Dec 2022 | Link |
| Introduction to Scripting in Python | Rice University | Nov 2022 | Link |
| Statistics with Python | University of Michigan | Nov 2022 | Link |
| Excel Skills for Data Analytics and Visualization | Macquarie University | Oct 2022 | Link |
| Python for Everybody | University of Michigan | Oct 2022 | Link |
| Excel Skills for Business | Macquarie University | Sep 2022 | Link |
If you have any questions or need further information, please don't hesitate to open an issue: Ask a Question.