This project aims to develop a predictive model for forecasting stock prices in the Tunisian stock market using historical data and machine learning techniques.

FirasKahlaoui/tunisia-stock-market

Tunisia_Stock_Market

This project aims to develop a predictive model for forecasting stock prices in the Tunisian stock market using historical data and machine learning techniques.

Project Structure

The project is organized as follows:

  • notebooks/: Contains Jupyter notebooks for data analysis and preprocessing.
    • check_data.ipynb: Notebook for initial data checking.
    • Data_Preprocessing.ipynb: Notebook for data cleaning and preprocessing.
    • data/: Directory containing various stages of stock market data.
      • weekly_stock_market.csv: Raw weekly stock market data.
      • checked_weekly_stock_market.csv: Data after initial checks.
      • cleaned_weekly_stock_market.csv: Data after cleaning.
      • normalized_weekly_stock_market.csv: Data after normalization.
  • stock_scraper/: Contains the web scraping scripts to collect stock market data.
    • companies_data/: JSON files with data for individual companies.
    • companies.json: List of companies to scrape.
    • import_test.py: Script for testing data import functionality.
    • scrapy.cfg: Configuration file for Scrapy.
  • README.md: This file, containing project documentation.
  • requirements.txt: List of Python libraries required for the project.

Requirements

All dependencies are listed in the requirements.txt file at the root of the repository. It covers the libraries and frameworks used for data analysis, machine learning, deep learning, web scraping, and web development.

Installation

  1. Clone the Repository:

    First, clone the repository to your local machine:

    git clone https://github.com/FirasKahlaoui/tunisia-stock-market.git
    cd tunisia-stock-market
  2. Create a Virtual Environment:

    Next, create and activate a Python 3 virtual environment using venv:

    python3 -m venv env
    source env/bin/activate
  3. Install the Required Libraries:

    You can install the required libraries using the following command:

    pip install -r requirements.txt

Data Collection and Scraping Approach

The website displays Tunisian stock market data as canvas graphs, so the values never appear in the page's HTML and cannot be scraped from it directly. To work around this, we inspected the network traffic to identify the server requests that fetch the stock data.

Steps to Scrape Data from Canvas Graphs

  1. Inspect Network Traffic:

    • Open the website where the stock market data is displayed.
    • Use the browser's Developer Tools (usually accessible by pressing F12 or right-clicking and selecting "Inspect") to monitor the network traffic.
    • Navigate to the "Network" tab and filter by XHR (XMLHttpRequest) to observe the API calls made by the webpage.
  2. Identify Data Requests:

    • Look for requests that fetch the stock data. These requests are often made to an API endpoint and return data in JSON format.
    • Analyze the request headers, method (GET or POST), and any query parameters or payloads used to retrieve the data.
  3. Craft Custom Requests with Scrapy:

    • Using Scrapy, a popular web scraping framework in Python, create a spider to mimic the identified requests that fetch the stock data.
    • Include any headers, cookies, or parameters identified in the previous step, so that the server accepts and processes the request as it would one sent by the browser.
  4. Parse and Save the Data:

    • Once the data is retrieved, parse the JSON response to extract the necessary information.
    • Save the parsed data in a structured format such as CSV, or in a database, so that it can be used for data analysis and machine learning models.
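The parse-and-save step can be sketched with the standard library alone. This is a minimal illustration, not the repository's spider: the payload shape (`symbol`, `quotes`) and the sample values are hypothetical, and the real field names must be read from the responses seen in the Network tab. In a Scrapy callback, the same function could be fed `response.text`.

```python
import csv
import io
import json

# Hypothetical response body; the real endpoint's structure may differ.
SAMPLE_BODY = json.dumps({
    "symbol": "XYZ",
    "quotes": [
        {"date": "2024-01-05", "openingPrice": 10.0, "closingPrice": 10.4,
         "highestPrice": 10.6, "lowestPrice": 9.9, "volume": 1200},
        {"date": "2024-01-12", "openingPrice": 10.4, "closingPrice": 10.1,
         "highestPrice": 10.8, "lowestPrice": 10.0, "volume": 950},
    ],
})

FIELDS = ["symbol", "date", "openingPrice", "closingPrice",
          "highestPrice", "lowestPrice", "volume"]


def parse_quotes(body):
    """Turn one JSON response body into a list of flat row dicts."""
    payload = json.loads(body)
    symbol = payload["symbol"]
    rows = []
    for quote in payload["quotes"]:
        row = {"symbol": symbol}
        # Missing keys become None instead of raising, so one bad record
        # does not abort the whole export.
        for field in FIELDS[1:]:
            row[field] = quote.get(field)
        rows.append(row)
    return rows


def write_csv(rows, handle):
    """Write the rows to any file-like object as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


rows = parse_quotes(SAMPLE_BODY)
buffer = io.StringIO()  # stands in for open("some_file.csv", "w", newline="")
write_csv(rows, buffer)
csv_text = buffer.getvalue()
```

Keeping the parser as a plain function makes it possible to test it on saved responses without hitting the server.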

Feature Engineering

Feature engineering is crucial in preparing raw data for machine learning models by creating meaningful input features. For the analysis of weekly Tunisian stock market data, the following steps were implemented:

Date Extraction

  • Converted the date column to datetime format to facilitate extraction of temporal features.
  • Extracted features such as year, month, day_of_month, and week_of_year to capture seasonal and time-related patterns.
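As a rough sketch of the date extraction, assuming the column is named `date` and holds ISO-formatted strings (both are assumptions; the sample rows are made up):

```python
import pandas as pd

# Toy data; the column name "date" and its format are assumptions.
df = pd.DataFrame({"date": ["2024-01-05", "2024-01-12", "2024-12-27"]})

df["date"] = pd.to_datetime(df["date"])
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["day_of_month"] = df["date"].dt.day
# isocalendar() yields ISO 8601 week numbers; cast to a plain integer dtype.
df["week_of_year"] = df["date"].dt.isocalendar().week.astype(int)
```

Note that ISO week numbering can assign the first days of January to week 52 or 53 of the previous ISO year, which may matter if `week_of_year` is combined with `year`.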

Price Features

  • Calculated price_range as the difference between highestPrice and lowestPrice to capture weekly price volatility.
  • Derived price_change as the difference between closingPrice and openingPrice to gauge weekly price movement.
  • Computed weekly_return as the percentage change from openingPrice to closingPrice, that is, price_change relative to openingPrice, which makes weekly moves comparable across stocks trading at different price levels.
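The three price features could be computed along these lines. The column names come from the description above; the sample values are invented, and scaling `weekly_return` by 100 is an assumption (the notebook may keep it as a fraction).

```python
import pandas as pd

# Made-up sample rows using the column names from the text.
df = pd.DataFrame({
    "openingPrice": [10.0, 20.0],
    "closingPrice": [11.0, 19.0],
    "highestPrice": [11.5, 20.5],
    "lowestPrice": [9.5, 18.5],
})

df["price_range"] = df["highestPrice"] - df["lowestPrice"]
df["price_change"] = df["closingPrice"] - df["openingPrice"]
# Assumption: return expressed in percent of the opening price.
df["weekly_return"] = df["price_change"] / df["openingPrice"] * 100
```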

Volume Transformation

  • Applied a logarithmic transformation (log_volume) to the volume column to normalize the distribution and reduce skewness, making the volume data more suitable for modeling.
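A minimal sketch of the volume transformation follows. Using `np.log1p` (the logarithm of 1 + x) rather than a plain logarithm is an assumption made here so that zero-volume weeks stay finite; the notebook may use a different variant.

```python
import numpy as np
import pandas as pd

# Invented volumes spanning several orders of magnitude.
df = pd.DataFrame({"volume": [0, 9, 99, 99_999]})

# log1p(x) = log(1 + x): maps 0 to 0 and compresses large values.
df["log_volume"] = np.log1p(df["volume"])
```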

Moving Averages and Volatility

  • Calculated moving_avg_4 as the 4-week rolling average of closingPrice to smooth out short-term fluctuations and expose the underlying trend.
  • Computed an exponential moving average (ema_4), which gives more weight to recent prices and so reflects recent market sentiment.
  • Determined volatility_4 as the 4-week rolling standard deviation of closingPrice to quantify the weekly price fluctuation or risk.
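The rolling features might look like this in pandas. The sketch assumes a single stock sorted by date, and `span=4` with `adjust=False` for the EMA is an assumption about the smoothing setup; for a dataset covering several companies, the same calls would need to run per company (for example through `groupby`).

```python
import pandas as pd

# Invented closing prices for one stock, oldest first.
df = pd.DataFrame({"closingPrice": [10.0, 12.0, 11.0, 13.0, 14.0]})
close = df["closingPrice"]

# The first three rows are NaN because a full 4-week window is not available yet.
df["moving_avg_4"] = close.rolling(window=4).mean()
# span=4 gives a smoothing factor alpha = 2 / (4 + 1) = 0.4.
df["ema_4"] = close.ewm(span=4, adjust=False).mean()
# Sample standard deviation (ddof=1) over the same 4-week window.
df["volatility_4"] = close.rolling(window=4).std()
```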

Data Handling

  • Handled missing values in rolling statistics by filling NaN values appropriately, ensuring continuity in feature calculations.
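The text does not say which fill strategy was used, so the following is only one plausible option, chosen because it never looks at future rows: shrink the window at the start of the series with `min_periods`, and set the one remaining undefined volatility value to 0.

```python
import pandas as pd

# Invented closing prices for one stock, oldest first.
df = pd.DataFrame({"closingPrice": [10.0, 12.0, 11.0, 13.0, 14.0]})
rolling = df["closingPrice"].rolling(window=4, min_periods=1)

# With min_periods=1 the mean uses however many weeks are available so far.
df["moving_avg_4"] = rolling.mean()
# A standard deviation over a single observation is undefined (NaN);
# treating it as 0 is an assumption, not the project's documented choice.
df["volatility_4"] = rolling.std().fillna(0.0)
```

Back-filling would also remove the gaps, but it copies later values into earlier rows, which can leak future information into the training data.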

These engineered features are intended to improve the accuracy of machine learning models by describing the dynamics of weekly stock market behavior. The processed dataset, containing the engineered features alongside the target variable (closingPrice), is then used to train and evaluate predictive models in subsequent steps.
