Added Readme

This commit is contained in:
Namonay
2025-06-04 04:40:16 +02:00
committed by GitHub
parent e86fb8e357
commit 3233aeb8f4

84
README.md Normal file
View File

@@ -0,0 +1,84 @@
# ft_linear_regression
A simple linear regression project in Python for educational purposes. The goal is to predict the price of a car based on its mileage using linear regression, and to visualize the results and the model's confidence.
## Features
- Reads car mileage and price data from CSV datasets.
- Learns optimal parameters (`theta0`, `theta1`) using gradient descent.
- Normalizes input data for effective training.
- Exports learned parameters to a CSV file.
- Predicts car prices for any given mileage.
- Calculates R² (coefficient of determination) to indicate model confidence.
- Visualizes the regression line, cost function evolution, and confidence score.
## Project Structure
```
ft_linear_regression/
├── datasets/
│ ├── data.csv # Default dataset
│ ├── big_data.csv
│ ├── negative_data.csv
│ ├── nonlinear_data.csv
│ ├── perfectpositive_data.csv
│ ├── small_data.csv
│ └── variance_data.csv
├── model.py # Main file: trains, saves, and visualizes the model
├── estimate.py # Script to estimate a price given a mileage
├── confidence.py # Calculates R² confidence score
└── thetas.csv # Saved learned parameters after running model.py
```
## Usage
### 1. Train the Model
Simply run:
```bash
python model.py
```
- Trains the linear regression model on `datasets/data.csv`.
- Saves the learned parameters (`thetas.csv`).
- Displays a plot with the data, regression line, confidence score (R²), and cost vs epochs.
### 2. Estimate Car Price
After training, estimate a car price by mileage:
```bash
python estimate.py
```
- Enter the mileage when prompted.
- Outputs the predicted price.
### 3. Confidence Score
The R² (confidence) score is printed on the regression plot, indicating how well the model fits the data.
## Custom Datasets
You can swap out the dataset by changing the `data_path` variable in `model.py` or by providing your own CSV file in the `datasets/` folder. The CSV must have columns: `km,price`.
## Requirements
- Python 3.x
- pandas
- numpy
- matplotlib
Install dependencies with:
```bash
pip install pandas numpy matplotlib
```
## Notes
- The model uses simple linear regression (1 feature: km).
- Data is normalized for stable and faster learning.