Posts

Showing posts from May, 2024

GSoC'24 Week 1 & 2 Progress: Building a Cross-Validation Framework in Octave

Over the past two weeks, I have been working on the statistics package for GNU Octave, implementing a cross-validation framework for classification models. The work so far covers a 'crossval' method in the 'ClassificationKNN' class, a new 'ClassificationPartitionedModel' class, and a 'kfoldPredict' method. Together, these form the basis for cross-validation in Octave. This post walks through the progress, the challenges, and the solutions implemented.

Cross-validation is a method for evaluating and improving the performance of machine learning models. It is essential for classification models, where the goal is to assign data to predefined classes. Cross-validation estimates how well a model will generalize to an independent dataset, which is critical for ensuring its robustness and reliability. The process involves partitioning the dataset into a set of folds, training the model on some folds, and then validating it on ...
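The fold-based procedure described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the Octave statistics package API: the function names (`kfold_indices`, `knn_predict`, `kfold_predict`), the toy data, and the choice of a 1-nearest-neighbour classifier are all assumptions made for the example.

```python
import random
from collections import Counter


def kfold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, n_neighbors=1):
    """Majority vote among the nearest training points (squared Euclidean distance)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:n_neighbors])
    return votes.most_common(1)[0][0]


def kfold_predict(X, y, k=5, n_neighbors=1, seed=0):
    """Predict every sample using a model that never saw that sample's fold."""
    predictions = [None] * len(X)
    for held_out in kfold_indices(len(X), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in held_out:
            predictions[i] = knn_predict(train_X, train_y, X[i], n_neighbors)
    return predictions


# Hypothetical toy data: two well-separated clusters.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
     (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
y = ["a"] * 5 + ["b"] * 5

predicted = kfold_predict(X, y, k=5)
accuracy = sum(p == t for p, t in zip(predicted, y)) / len(y)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Because each prediction comes from a model trained without the sample being predicted, the resulting accuracy is an estimate of performance on unseen data rather than a measure of fit to the training set.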

GSoC'24 Project: Adding GAM and Discriminant Classification Classes and Implementing Missing Methods

I am Ruchika Sonagote from India, and I am currently studying Computer Science at the Indian Institute of Technology Gandhinagar. I am thrilled to share that I will be contributing to the GNU Octave Statistics Package this summer, and I am thankful to Andreas Bertsatos for mentoring me throughout this project. I am committed to making meaningful contributions to the package's development.

This project aims to significantly extend the classification capabilities of the GNU Octave statistics package by implementing additional methods for the existing `ClassificationKNN` class and introducing two new classification classes, `ClassificationDiscriminant` and `ClassificationGAM`. These enhancements will narrow the gap between GNU Octave and MATLAB in advanced data analysis and machine learning functionality, fostering a richer environment for scientific computing.

Proposed Timeline ...