Toolkit for fine-tuning, ablating and unit-testing open-source LLMs.
Updated Oct 25, 2024 - Python
Distribution-transparent Machine Learning experiments on Apache Spark
This project implements 30+ variants of ANN algorithms to find the K nearest neighbors in high-dimensional vector spaces. It is meant as a convenient sandbox: drop in your own ANN code, run a one-liner, and instantly compare build/search speed and recall against the bundled baselines.
Do models distinguish between declared-true and declared-false premises?
Attentively Embracing Noise for Robust Latent Representation in BERT (COLING 2020)
O(N) attention with a bounded inference KV cache. D4 Daubechies wavelet field + content-gated Q·K gather at dyadic offsets.
Reproducible research comparing GNN (GraphSAGE, GCN, GAT) vs ML baselines (XGBoost, RF) on Elliptic++ Bitcoin fraud detection. Features ablation experiments revealing when tabular models outperform graph neural networks.
Emotiwave is a research project investigating how well AI systems can recognise human emotions from video when one or more sensors fail. The core question: if you lose the audio, or the camera, or the transcript — does the system fall apart, or does it adapt?
Multi-agent verification for AI outputs: claim verification, RAG diagnostics, pre-action verification for agentic AI. Includes ablation studies proving multi-agent vs single-prompt tradeoffs, FaithBench benchmarks, and bias-triggering evaluation methodology
Machine Learning analysis for an imbalanced dataset. Developed as final project for the course "Machine Learning and Intelligent Systems" at Eurecom, Sophia Antipolis
🧠 Automated neural network ablation studies using LLM agents and LangGraph. Systematically remove components, test performance, and gain insights into architecture importance through an intelligent multi-agent workflow.
A multimodal deep learning project for classifying mental health-related memes, combining both textual and visual features.
Six Ways to Forget: Biologically-grounded forgetting mechanisms for LLM agent memory systems. 18 experiments, 4 falsified hypotheses, STDP ablation (Cohen's d = 3.163).
Re-implementation of the paper titled "Noise against noise: stochastic label noise helps combat inherent label noise" from ICLR 2021.
Evaluation framework for self-hosted LLMs. Systematic prompt ablation (baseline, CoT, few-shot, self-consistency voting) on Llama 3.1 8B via lm-evaluation-harness, with Wilson CI statistical analysis, determinism validation, and load testing under concurrency. Found chain-of-thought degrades accuracy 25pp at small scale.
Phase zero of Artificial Neuroplasticity: giving models self-editing capacity through a trained triumvirate of models (Analyzer / Trainee / Evaluator). The Analyzer uses TransformerLens to watch the Trainee. The Evaluator is the Review Board, confirming the Trainee has become smarter than itself. This IS NOT implemented in this phase zero.
Binary image classification project to detect drones vs non-drone aerial objects (birds) using a pretrained ResNet-18 model. Built with PyTorch and transfer learning, includes class-imbalance handling, validation metrics, confusion matrix analysis, and an ablation study comparing frozen vs fine-tuned backbones.
This study compares lung disease detection from X-ray scans across three datasets and three neural network architectures in PyTorch, and performs an ablation study over learning rates. The learned feature space is visualised with t-SNE, and Grad-CAM is used to visualise disease regions in the X-ray scans.
Ablation Study of CapsuleNetwork on TimeSeries
Intelligent layer pruning toolkit for LLMs featuring iterative optimization, self-healing algorithms, and comprehensive benchmarking.