Weather Delay 🚂⛈️

Gianluca Rea

🚂⛈️ Overview

A machine learning research project that studies algorithms for predicting train delays from departure delay and weather conditions. The goal is to make rail travel more attractive than driving and, in turn, reduce carbon footprint.

🎯 Project Motivation

Understanding and predicting train delays is crucial for improving passenger satisfaction and encouraging a modal shift toward sustainable transportation. By combining weather data with historical delay information, we can develop models that:

  • Provide better delay predictions to passengers
  • Help operators optimize scheduling
  • Encourage rail travel as a reliable alternative to cars
  • Reduce carbon footprint through sustainable transportation

🦾 Repository

See the code: Weather Delay Repository

📊 Methodology

This study explores classification and regression algorithms to:

  1. Classification: Predict whether a train's delay will exceed a given threshold
  2. Regression: Predict the duration of the delay

Both approaches analyze the relationship between:

  • Departure delay (key historical indicator)
  • Weather conditions at the time of travel
  • Route and temporal factors
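
The two framings can be illustrated with a minimal sketch. It does not reproduce the project's notebook: the data is synthetic, and the column names, the 5-minute threshold, and the choice of random forests are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.metrics import accuracy_score, mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: every column name and coefficient is hypothetical.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "departure_delay_min": rng.exponential(5.0, n),
    "rain_mm": rng.exponential(1.0, n),
    "wind_kmh": rng.uniform(0.0, 60.0, n),
    "hour": rng.integers(0, 24, n),
})
df["arrival_delay_min"] = (
    0.9 * df["departure_delay_min"]
    + 1.5 * df["rain_mm"]
    + 0.05 * df["wind_kmh"]
    + rng.normal(0.0, 1.0, n)
)

THRESHOLD_MIN = 5.0  # assumed cut-off for calling a train "delayed"
features = ["departure_delay_min", "rain_mm", "wind_kmh", "hour"]
X = df[features]
y_reg = df["arrival_delay_min"]                  # regression target: duration
y_clf = (y_reg > THRESHOLD_MIN).astype(int)      # classification target: over threshold?

# One split shared by both tasks so the scores refer to the same test rows.
X_train, X_test, y_reg_train, y_reg_test, y_clf_train, y_clf_test = train_test_split(
    X, y_reg, y_clf, test_size=0.25, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_clf_train)
accuracy = accuracy_score(y_clf_test, clf.predict(X_test))

reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_reg_train)
mae = mean_absolute_error(y_reg_test, reg.predict(X_test))

print(f"accuracy: {accuracy:.3f}  MAE: {mae:.2f} min")
```

Deriving the classification label from the regression target keeps the two tasks comparable: both models see the same features and the same test rows, and only the question being asked changes.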

📈 Data Sources

The project utilizes data from:

🛠️ Tech Stack

  • Python: Core implementation language
  • Pandas: Data manipulation and analysis
  • NumPy: Numerical computations
  • Scikit-Learn: Machine learning algorithms and evaluation
  • Jupyter Notebooks: Interactive data analysis and experimentation

🚀 Project Structure

Files and their purposes:

  • 02-Data Analytics: Main data analysis and model development notebook
    • Data loading and exploration
    • Feature engineering from weather and delay data
    • Model training and evaluation
    • Comparison of classification and regression approaches
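
As an illustration of the feature engineering step, the sketch below attaches the most recent weather observation to each departure with `pandas.merge_asof` and derives two temporal features. The table layouts, column names, and the one-hour tolerance are hypothetical; the notebook may organize its data differently.

```python
import pandas as pd

# Hypothetical train records: one row per departure.
trains = pd.DataFrame({
    "station": ["A", "A", "B"],
    "scheduled_departure": pd.to_datetime(
        ["2024-01-01 08:05", "2024-01-01 09:40", "2024-01-01 08:30"]
    ),
    "departure_delay_min": [2.0, 7.0, 0.0],
})

# Hypothetical hourly weather observations per station.
weather = pd.DataFrame({
    "station": ["A", "A", "B", "B"],
    "observed_at": pd.to_datetime(
        ["2024-01-01 08:00", "2024-01-01 09:00",
         "2024-01-01 08:00", "2024-01-01 09:00"]
    ),
    "rain_mm": [0.0, 3.2, 1.1, 0.4],
})

# merge_asof requires both frames to be sorted by their time keys.
trains = trains.sort_values("scheduled_departure")
weather = weather.sort_values("observed_at")

# For each departure, take the latest observation at the same station
# that is at most one hour old.
features = pd.merge_asof(
    trains,
    weather,
    left_on="scheduled_departure",
    right_on="observed_at",
    by="station",
    direction="backward",
    tolerance=pd.Timedelta("1h"),
)

# Simple temporal features derived from the scheduled departure time.
features["hour"] = features["scheduled_departure"].dt.hour
features["is_weekend"] = features["scheduled_departure"].dt.dayofweek >= 5
```

Using `direction="backward"` means a departure is only ever matched with weather observed at or before it, which avoids leaking future information into the features.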

Note: Data files must be downloaded from the sources above before preprocessing.

📝 Research Focus

  • Comparing different classification algorithms for delay prediction
  • Evaluating regression models for delay duration estimation
  • Feature importance analysis of weather and temporal factors
  • Model performance trade-offs between accuracy and simplicity
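
A comparison of this kind might look like the sketch below, which scores a trivial baseline, a shallow decision tree, and a random forest with 5-fold cross-validation, then reads the forest's impurity-based feature importances. The data is synthetic, and the feature names and model choices are assumptions for illustration, not the models evaluated in the project.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with hypothetical feature names.
rng = np.random.default_rng(42)
n = 400
feature_names = ["departure_delay_min", "rain_mm", "hour"]
X = np.column_stack([
    rng.exponential(5.0, n),   # departure delay
    rng.exponential(1.0, n),   # rainfall
    rng.integers(0, 24, n),    # hour of day
])
# By construction the label depends on delay and rain but not on the hour.
y = (X[:, 0] + 2.0 * X[:, 1] + rng.normal(0.0, 1.0, n) > 6.0).astype(int)

# Candidates ordered from simplest to most complex.
candidates = {
    "baseline (most frequent)": DummyClassifier(strategy="most_frequent"),
    "decision tree (depth 3)": DecisionTreeClassifier(max_depth=3, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy per candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")

# Impurity-based importances from a forest fitted on all rows.
forest = candidates["random forest"].fit(X, y)
importances = dict(zip(feature_names, forest.feature_importances_))
print(importances)
```

Including a trivial baseline and a shallow tree alongside the forest makes the accuracy versus simplicity trade-off visible: if the depth-3 tree comes close to the forest, the simpler and more interpretable model may be the better choice.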

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a new branch for your feature or bugfix
  3. Commit your changes and push them to your fork
  4. Submit a pull request with a detailed description of your changes

Please ensure that your code follows the project's coding standards and includes appropriate documentation.

📜 License

This project is licensed under the MIT License. See the LICENSE file for more details.