Projects - Dhanesh B.B. | Data Science Portfolio

Featured Projects

Showcasing innovative solutions that demonstrate expertise in machine learning and data analytics

PyPI Package

InsightfulPy

Python Toolkit for Exploratory Data Analysis

v0.2.0

Python toolkit for exploratory data analysis with 30+ functions across 11 files, covering statistical summaries, data quality checks, and batch visualizations. Built with a modular architecture (10 specialized modules) and a constants-driven design. Works in Jupyter, IPython, and the terminal. Features environment detection, a built-in help system (help(), quick_start(), examples(), list_all()), and multi-dataset comparison. Published on PyPI.

9 Releases

Beta Status

Open Source

Python 3.8+

Pandas

NumPy

Matplotlib

Seaborn

SciPy

Missingno

TableOne

ResearchPy


Dec 2024 - Present

Automation

AutoCSV Profiler Suite

Multi-Environment CSV Analysis Orchestrator

v2.0.0

Multi-environment CSV analysis orchestrator that resolves NumPy/pandas/SciPy conflicts through isolated conda environments. Provides a unified interface to the YData Profiling, SweetViz, and DataPrep engines with subprocess isolation. Features a Rich CLI, memory chunking (1GB default), parallel environment setup, lazy loading, and graceful degradation. 28 tests, cross-platform support.

4 Engines

Isolated Envs

Cross-platform

Python

Conda

YAML

EDA Tools

Automation


Mar 2025 - Present

AI Tools

Multi-AI Chat Manager

AI Services Orchestration Platform

Demo

Hybrid Python + Electron app for 15+ AI platforms (including ChatGPT, Claude, Gemini, Perplexity, Grok, and DeepSeek). Features a browser extension for simultaneous prompt distribution, grid/side-by-side layouts, multi-display support, Groq API prompt enhancement, profile switching, and an HTTP polling server with JSON-RPC IPC.

15+ Platforms

Extension

IPC Ready

Electron

Python

Windows

Productivity

Automation


Oct 2025 - Present

Machine Learning Projects

Classification, regression, and predictive analytics

Home Loan Default Predictor

Home loan default prediction analyzing 58.44M records across 7 datasets. An XGBoost classifier with 143 engineered features achieves 83% accuracy, 52.8% recall, and 0.785 ROC AUC at a threshold of 0.60. Addresses class imbalance (8% default rate), memory usage (68.5% reduction), and multicollinearity, and saves 1,200+ hours of manual review through model-based risk assessment. Grade A evaluation.

Python

XGBoost

Scikit-learn

Jupyter


Jun 2024 - Mar 2025

INX Employee Performance Analysis

Multiclass classification (3 classes) for employee performance using XGBoost with 92.5% accuracy, 0.9756 ROC AUC, and SHAP interpretability. Analyzes 1,200 records across 11 features with class weights for imbalance. Top drivers: environment satisfaction, salary hike, promotion history. Includes department-wise analysis and HR recommendations.

Python

XGBoost

SHAP

Scikit-learn



Mar 2025 - Apr 2025
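
The class weighting mentioned for INX Employee Performance Analysis can be illustrated with the common inverse-frequency heuristic, n_samples / (n_classes * class_count). This is only a minimal sketch: the label values and counts below are made up, and the project itself may compute its weights differently.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical 3-class label distribution (not the project's data).
labels = [2] * 6 + [3] * 3 + [4] * 1
weights = balanced_class_weights(labels)

# Rarer classes receive larger weights.
for cls in sorted(weights):
    print(cls, round(weights[cls], 3))
```

Weights like these can be mapped onto per-row sample weights when fitting a classifier, so that errors on rare classes cost more during training.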

FicZon Sales Effectiveness

Sales lead quality prediction using XGBoost with 0.8106 ROC AUC and 84.74% recall on 7,420 leads. Handles 24% missing data, 26-category high cardinality (via frequency encoding), and 1.6:1 class imbalance. Business impact: $142K cost savings, $380K revenue gains, 45% junk lead reduction. Grade A evaluation.

Python

XGBoost

Scikit-learn

Jupyter


Feb 2025 - Mar 2025
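
Frequency encoding, named in the FicZon Sales Effectiveness summary, replaces each category with its share of the training rows. The sketch below shows the idea with the standard library only; the category names and the default of 0.0 for unseen categories are assumptions for illustration, not details from the project.

```python
from collections import Counter


def fit_frequency_encoding(values):
    """Map each category to its relative frequency in the training data."""
    counts = Counter(values)
    total = len(values)
    return {category: n / total for category, n in counts.items()}


def apply_frequency_encoding(values, mapping, default=0.0):
    """Encode values; categories unseen during fitting get `default`."""
    return [mapping.get(value, default) for value in values]


# Hypothetical lead-source column (illustrative values only).
train = ["web", "web", "referral", "call", "web", "call"]
mapping = fit_frequency_encoding(train)
encoded = apply_frequency_encoding(["web", "call", "email"], mapping)
print(encoded)
```

Fitting the mapping on training data only, then applying it to validation data, keeps category frequencies from leaking across the split.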

Portuguese Bank Campaign Predictor

Term deposit prediction using LightGBM on the UCI dataset (41,188 records). Addresses 7.87:1 class imbalance, multicollinearity (VIF >26K), and data leakage (4 temporal features removed). Achieves 87.8% accuracy and 60.9% recall, with 20% lower CV variance than XGBoost. 5 models compared, with threshold optimization over 0.1-0.9. Grade A evaluation.

Python

LightGBM

Scikit-learn

Jupyter



Jun 2024 - Mar 2025
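
The threshold optimization over 0.1-0.9 described for the Portuguese Bank Campaign Predictor can be pictured as a simple sweep that scores each cutoff. The sketch below picks the threshold with the best F1; the labels, probabilities, step size, and the choice of F1 as the selection metric are all assumptions for illustration, since the summary does not say which metric the project optimized.

```python
def recall_precision(y_true, y_prob, threshold):
    """Compute recall and precision for a given probability cutoff."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


def sweep_thresholds(y_true, y_prob, start=0.1, stop=0.9, step=0.1):
    """Return (threshold, recall, precision, f1) for each cutoff in the range."""
    results = []
    n_steps = int(round((stop - start) / step))
    for i in range(n_steps + 1):
        threshold = round(start + i * step, 2)
        recall, precision = recall_precision(y_true, y_prob, threshold)
        total = recall + precision
        f1 = 2 * recall * precision / total if total else 0.0
        results.append((threshold, recall, precision, f1))
    return results


# Hypothetical labels and predicted probabilities (illustrative only).
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.05, 0.15, 0.2, 0.35, 0.4, 0.55, 0.7, 0.9]
results = sweep_thresholds(y_true, y_prob)
best = max(results, key=lambda row: row[3])
print("best threshold:", best[0], "F1:", round(best[3], 3))
```

On imbalanced data the default 0.5 cutoff often trades away recall, which is why a sweep like this is usually run on a validation split rather than the test set.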

Capital Bikeshare Demand Forecaster

Bike rental demand prediction using Ridge Regression on Capital Bikeshare data (731 days). R²=0.832, CV R²=0.815±0.032, with an overfitting gap of only 0.5% (vs 9.4% for XGBoost). Reduces multicollinearity (VIF 662→37). 10 algorithms compared; 72x faster than XGBoost. Provides fleet rebalancing recommendations. Grade A+ evaluation.

Python

Ridge Regression

Scikit-learn

Jupyter



Jun 2024 - Mar 2025

Automobile Price Predictor

1985 automobile price prediction using Lasso on UCI dataset (200 samples, 42 features). 91.7% R² with 29 sparse coefficients, 600x faster inference than XGBoost. Resolves extreme multicollinearity (VIF 16,676→8.36) via PCA. 10 algorithms compared with 2.3-point CV-test gap vs XGBoost's 8.3-point gap. Grade A+ evaluation.

Python

Lasso

PCA

Scikit-learn


Jun 2024 - Mar 2025

Development Tools & Automation

CLI tools, desktop apps, and automation frameworks

SpectraTact

Hybrid Python + Electron app for intelligent window orchestration on Windows. Features grid/side-by-side layouts, multi-monitor support (span, distribute, overflow modes), profile-based management, keyboard shortcuts, dark/light themes, and first-run onboarding. Uses JSON-RPC IPC, two-tier caching, and live config reload.

Electron

Python

Windows Automation


Aug 2025 - Present
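
The JSON-RPC IPC mentioned for SpectraTact follows a simple request/response shape under the JSON-RPC 2.0 specification. The sketch below dispatches one request string to a handler and builds the reply; the `set_layout` method and its `mode` parameter are hypothetical names invented for illustration, and the app's real methods and transport are not described in this summary.

```python
import json


def set_layout(mode):
    """Hypothetical handler: pretend to apply a window layout."""
    return {"applied": mode}


METHODS = {"set_layout": set_layout}


def make_error(code, message, req_id=None):
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps(
        {"jsonrpc": "2.0", "error": {"code": code, "message": message}, "id": req_id}
    )


def handle_request(raw, methods=METHODS):
    """Dispatch one JSON-RPC 2.0 request string and return the response string."""
    try:
        request = json.loads(raw)
    except json.JSONDecodeError:
        return make_error(-32700, "Parse error")
    if not isinstance(request, dict):
        return make_error(-32600, "Invalid Request")

    req_id = request.get("id")
    handler = methods.get(request.get("method"))
    if handler is None:
        return make_error(-32601, "Method not found", req_id)

    params = request.get("params", {})
    # Parameters may be positional (list) or named (object).
    result = handler(*params) if isinstance(params, list) else handler(**params)
    return json.dumps({"jsonrpc": "2.0", "result": result, "id": req_id})


request = '{"jsonrpc": "2.0", "method": "set_layout", "params": {"mode": "grid"}, "id": 1}'
print(handle_request(request))
```

Keeping the dispatcher independent of the transport means the same handler table can sit behind a pipe, a socket, or an HTTP endpoint.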

PyPI

AutoCSV Profiler

Automated CSV profiling with an interactive CLI, automatic delimiter detection, and memory-efficient chunked processing. Features a Rich console interface, an exception hierarchy, a public Python API with an analyze() function, and TableOne/ResearchPy integration. 20 tests, supports Python 3.8-3.13. PyPI v2.0.0.

Python 3.8+

CLI

Pandas

Rich


Feb 2025 - Present
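
Automatic delimiter detection of the kind AutoCSV Profiler advertises can be approximated with the standard library's csv.Sniffer. This is a sketch under stated assumptions, not the package's implementation: the candidate delimiter set and the comma fallback are choices made here for illustration.

```python
import csv


def detect_delimiter(sample, candidates=",;\t|", fallback=","):
    """Guess the delimiter from a text sample, falling back if sniffing fails."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Sniffer raises csv.Error when it cannot determine a delimiter.
        return fallback


# Hypothetical semicolon-separated sample (illustrative only).
sample = "id;name;score\n1;Ana;9.5\n2;Ben;7.0\n"
delimiter = detect_delimiter(sample)
rows = list(csv.reader(sample.splitlines(), delimiter=delimiter))
print(repr(delimiter), rows[0])
```

Sniffing only the first few kilobytes of a file is usually enough and avoids loading large CSVs just to find the separator.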

Private

Notebook Extractor

Python CLI tool for extracting and organizing content from Jupyter notebooks. Extracts code cells, function/class definitions, imports, markdown, and Base64 embedded images. Features structured output directories, section-based outlines from headers, and execution count preservation. 37 tests with pytest. v0.0.1 Alpha.

Python

nbformat

CLI

pytest

Dec 2025 - Present
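
The extraction steps listed for Notebook Extractor (code cells, imports, execution counts) can be sketched against the notebook's JSON structure directly. The example below uses only json and ast rather than nbformat, and the function names are hypothetical; the tool's actual interface is not shown in this summary.

```python
import ast
import json


def extract_code_cells(notebook_json):
    """Return source and execution count for each code cell in a notebook."""
    notebook = json.loads(notebook_json)
    cells = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # notebooks may store source as a list of lines
            source = "".join(source)
        cells.append({"execution_count": cell.get("execution_count"), "source": source})
    return cells


def extract_imports(source):
    """List imported module names; cells that do not parse yield no imports."""
    try:
        tree = ast.parse(source)
    except SyntaxError:  # e.g. cells containing IPython magics
        return []
    names = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.append(node.module)
    return names


# Hypothetical two-cell notebook (illustrative only).
notebook_json = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "execution_count": 3,
         "source": ["import os\n", "from pathlib import Path\n", "print(os.getcwd())"]},
    ],
    "nbformat": 4,
    "nbformat_minor": 5,
})
code_cells = extract_code_cells(notebook_json)
imports = extract_imports(code_cells[0]["source"])
print(code_cells[0]["execution_count"], imports)
```

Parsing with ast rather than regular expressions means imports inside functions or conditionals are found too, at the cost of skipping cells that are not valid Python.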

Private

StatementForge

Personal financial data extraction system for processing PDF statements from Indian banks, credit cards, and UPI platforms. Supports 10+ institutions. Features institution-specific extractors, unified transaction schema, FY-based consolidation, and CSV outputs covering 5 years of financial data.

Python

pdfplumber

Pandas

ETL

Jun 2025 - Present

Web Development

Portfolio sites and web applications

Portfolio Website (dhaneshbb.github.io)

Static portfolio with vanilla JS ES6 modules and Bootstrap 5. Features typing animation, 3D card tilt effects, infinite scroll carousels, light/dark theme with FOUC prevention, custom PDF viewer, and certificate viewer. Lighthouse CI, ESLint/Stylelint/Prettier, GitHub Pages deployment.

JavaScript

Bootstrap 5

GitHub Pages

ESLint


May 2025 - Present

Private

JobSync

Job search workflow app with Next.js 15, React 19, Prisma, and shadcn/ui. Features application tracking, AI document analysis (LangChain with Ollama/OpenAI/Vertex AI), Google Calendar/Tasks sync, RSS job aggregation, and browser extension for 50+ portals. Rate limiting, CSRF, optional SQLite encryption.