Hi, I'm Rayan 👋
I'm a data science grad student who's into machine learning and software engineering. I really enjoy figuring out how to use AI to solve real-world problems.

About

Pursuing an M.S. in Data Science at the University of Michigan-Dearborn with some experience in machine learning, NLP, and data analysis.

Work Experience

Math Learning Center (University of Michigan)

September 2023 - Present
Math and Statistics Tutor
Providing drop-in tutoring for a variety of mathematics courses (calculus, statistics, and biostatistics using R). Facilitating one-on-one and group appointments, as well as project management, to support students with course-related questions.

Emrhod Consulting

January 2022 - Present
AI & Machine Learning Intern / Junior Research Analyst
● Designed client satisfaction surveys for Stellantis (Peugeot and Citroën), using CATI and online methodologies to assess consumer perception and brand performance.
● Supported UNDP-backed research projects by implementing quantitative and qualitative research methods, including focus groups and in-depth interviews.
● Conducted statistical analysis (descriptive, inferential, and predictive) and managed large-scale survey and claims datasets in R, ensuring data accuracy, consistency, and readiness for insight generation.
● Fine-tuned transformer models (mBART, MarianMT) to transliterate alphanumeric text into Arabic script and to translate Tunisian dialect into French, improving managers' time-management efficiency by 18.75%.
● Implemented web scraping, data preprocessing and augmentation, and NLP techniques to enhance the dataset, and evaluated model performance using BLEU, ROUGE, METEOR, and TER scores.

Vermeg

June 2024 - August 2024
AI/ML Intern
Conducted a comparative analysis of key tools for designing RAG pipelines, including LlamaIndex, LangChain, and vector databases. Defined performance and quality criteria (ease of use, scalability, integration) and created a consistent testing framework to evaluate each tool. Drafted a detailed report with benchmark results and recommendations, and presented the findings to the Development and Management teams.

Skills

Python
R
SAS
SQL
Java
C
C#
JavaScript
Scikit-learn
TensorFlow
PyTorch
GCP
Spark
Tableau
Power BI
Matplotlib
NumPy
Pandas
My Projects

Check out my latest work

I've worked on a variety of projects, from language translation pipelines to complex machine learning applications. Here are a few of my favorites.

Brain Tumor Classification with CNNs and Transfer Learning

● Built and evaluated four deep learning models (Custom CNN, VGG16, EfficientNetB0, MobileNetV2) for classifying brain MRI images into four categories: glioma, meningioma, pituitary tumor, and no tumor.
● Used a labeled dataset of 3,268 training images and 1,052 testing images, preprocessed with grayscale conversion and resizing to 128×128, then augmented to improve generalization.
● MobileNetV2 outperformed all other models, achieving ~92% accuracy, a macro-averaged F1-score above 0.90, and low overfitting, making it well suited for deployment in resource-constrained environments.
● Implemented a full ML pipeline: loading and preprocessing data, visualizing class distributions, training models, evaluating with classification reports and confusion matrices, and comparing performance across models.
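
As a rough illustration of the evaluation step above, here is a minimal sketch of how accuracy and a macro-averaged F1-score can be derived from a confusion matrix. The function names and the toy counts are made up for illustration; they are not the project's code or results, and the project itself used Scikit-learn's classification reports.

```python
def macro_f1(confusion):
    """Macro-averaged F1 from a square confusion matrix.

    confusion[i][j] counts samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    f1_scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, truly another class
        fn = sum(confusion[k]) - tp                       # truly k, predicted another class
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    # Macro averaging weights every class equally, regardless of its size.
    return sum(f1_scores) / n


def accuracy(confusion):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


# Hypothetical counts for the four classes (illustration only).
CLASSES = ["glioma", "meningioma", "pituitary", "no_tumor"]
toy_confusion = [
    [8, 1, 1, 0],
    [1, 7, 1, 1],
    [0, 1, 9, 0],
    [0, 0, 0, 10],
]

if __name__ == "__main__":
    print("accuracy:", accuracy(toy_confusion))
    print("macro F1:", macro_f1(toy_confusion))
```

Macro averaging is a reasonable choice when classes are imbalanced, since a weak class pulls the score down as much as a strong class lifts it.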

Python
Keras
OpenCV
VGG16
EfficientNetB0
MobileNetV2
CNN
Scikit-learn
Matplotlib
TensorFlow

Infant Mortality Prediction

● Conducted exploratory data analysis on a dataset spanning 2,923 country-year health records to assess socioeconomic and healthcare determinants of infant mortality.
● Engineered a predictive log-log linear regression model achieving R² = 0.988 and RMSE = 0.16, outperforming standard linear and stepwise models.
● Diagnosed model fit through residual analysis, Shapiro-Wilk, and Durbin-Watson tests, and recommended hierarchical and panel-data extensions for future modeling.
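
To make the log-log idea above concrete, here is a minimal sketch that fits log(y) = a + b·log(x) by ordinary least squares and reports R² and RMSE. It assumes a single predictor and computes both metrics on the log scale; the actual model used several predictors, and the function names here are hypothetical.

```python
import math


def fit_log_log(x, y):
    """Ordinary least squares fit of log(y) = a + b * log(x); returns (a, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx            # slope: elasticity of y with respect to x
    a = mean_y - b * mean_x  # intercept on the log scale
    return a, b


def r2_and_rmse(x, y, a, b):
    """R^2 and RMSE of the fitted line, both measured on the log scale."""
    ly = [math.log(v) for v in y]
    pred = [a + b * math.log(v) for v in x]
    mean_y = sum(ly) / len(ly)
    ss_res = sum((t - p) ** 2 for t, p in zip(ly, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in ly)
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / len(ly))


if __name__ == "__main__":
    # Synthetic data following y = 3 * x^(-0.5), for illustration only.
    xs = [1.0, 2.0, 4.0, 8.0]
    ys = [3.0 * v ** -0.5 for v in xs]
    a, b = fit_log_log(xs, ys)
    print("intercept:", a, "slope:", b)
    print("R^2, RMSE:", r2_and_rmse(xs, ys, a, b))
```

In a log-log model the slope reads as an elasticity: a 1% change in the predictor is associated with roughly a b% change in the outcome.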

Python
Google Colab Pro
Linear Regression
Logistic Regression
Shapiro-Wilk
Durbin-Watson
EDA
Matplotlib
Scikit-learn

Fraud Detection Patterns in Insurance Claims

This project focuses on identifying fraudulent insurance claims using advanced machine learning techniques. By leveraging detailed claim data and employing extensive preprocessing and feature engineering, we developed and compared models such as Random Forest, Logistic Regression, XGBoost, Naive Bayes, and KNN. The Random Forest model emerged as the most effective, achieving high accuracy and AUC scores. This work highlights the potential of data-driven approaches in addressing real-world challenges like fraud detection.
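
For readers less familiar with AUC, here is a minimal sketch of what the score measures: the probability that a randomly chosen fraudulent claim receives a higher model score than a randomly chosen legitimate one. The function name and the sample labels and scores are illustrative assumptions, not the project's code or data; in practice a library routine such as Scikit-learn's `roc_auc_score` would be used.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: 1 for the positive class (e.g. fraud), 0 for the negative class.
    scores: model scores, where higher means "more likely positive".
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up labels and scores, for illustration only.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of samples, so it is only meant to show the definition; rank-based implementations scale better on large claim datasets.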

Python
PySpark
Scikit-learn
XGBoost
Pandas
NumPy
Apache Spark
Matplotlib
Seaborn
Gaussian Noise Data Augmentation

Diabetes Prediction Using AutoML and BigQuery

This project focuses on developing a diabetes prediction system using Google Cloud services, leveraging BigQuery for data preprocessing, AutoML for machine learning model training, Vertex AI for deployment, and Looker Studio for data visualization. It emphasizes early detection of diabetes, aiming to improve patient outcomes and resource efficiency. The project utilizes a dataset from Kaggle, cleans and transforms it into a structured format, trains models for predictions, and creates dashboards to present actionable insights, all while considering data privacy and ethical implications.

GCP
AutoML
Vertex AI
BigQuery
Looker Studio

Translating Tunisian Dialect to French using Machine Learning

Developed a machine learning pipeline to automate the translation of Tunisian dialect sentences into French, addressing the operational inefficiencies of manual translation at Emrhod Consulting. The project utilized advanced NLP techniques, including fine-tuned MarianMT and mBART models, combined with semi-supervised learning and a transliteration module for converting alphanumeric inputs to Arabic script. This approach significantly reduced translation time, improved accuracy, and contributed to advancing NLP solutions for low-resource languages.
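
To show what "alphanumeric input" means here, the sketch below replaces the digits commonly used in Latin-script Arabic chat writing with the Arabic letters they stand for. This is a simple lookup table swapped in for illustration, not the project's transliteration module; the table covers only a few well-known digit conventions, Latin letters are left untouched, and the names are hypothetical.

```python
# Common digit conventions in Latin-script Arabic writing (partial table, illustration only).
ARABIZI_DIGITS = {
    "2": "\u0621",  # hamza
    "3": "\u0639",  # ain
    "5": "\u062e",  # khah
    "7": "\u062d",  # hah
    "9": "\u0642",  # qaf
}


def replace_arabizi_digits(text):
    """Replace known digit stand-ins with Arabic letters; leave everything else as-is."""
    return "".join(ARABIZI_DIGITS.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(replace_arabizi_digits("sa7a"))
```

A lookup like this cannot resolve vowels or context-dependent spellings, which is part of why a learned model is a better fit for the full task.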

Python
Google Colab Pro
Selenium
Pandas
NumPy
Matplotlib
Scikit-learn
PyTorch
TensorFlow
Contact

Get in Touch

Have a question or want to connect? Feel free to reach out to me on LinkedIn with a direct message and I'll do my best to respond promptly.