Leveraging SIR and SEIR models to predict the spread of COVID-19 in India

Authored by:

Ajit Dhamale
Amol Dhekane
Anima Jain
Biswajit Panigrahi
Preethi Narayan
Sohil Sodhiya


In this paper we study the spread of COVID-19 infections across the states and Union Territories (UTs) of India. The objective is to build a predictive machine learning model to predict:

  • Number of confirmed cases
  • Number of cases that would recover or be removed (fatalities)

As part of the study we analyse the data published by various sources related to COVID-19 in India between March and July 2020.
We use inferences from the Susceptible-Infected-Recovered (SIR) and Susceptible-Exposed-Infectious-Recovered (SEIR) models to project the future trajectory of infections and recoveries.
The predictions are then used to determine the demand for, and shortfall in, hospital bed availability so that local authorities can plan accordingly.
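
To make the two compartmental models concrete, the following is a minimal sketch of how SIR and SEIR trajectories can be projected. It is not the authors' implementation: the function names, the forward-Euler integration scheme, and every parameter value (transmission rate `beta`, incubation rate `sigma`, removal rate `gamma`, population size) are illustrative assumptions, not fitted results from this study.

```python
def simulate_sir(population, infectious0, beta, gamma, days, steps_per_day=10):
    """Integrate the SIR equations with forward Euler.

    Returns one (S, I, R) tuple per day, starting with day 0.
    """
    s = float(population - infectious0)
    i = float(infectious0)
    r = 0.0
    dt = 1.0 / steps_per_day
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_removed = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removed
            r += new_removed
        history.append((s, i, r))
    return history


def simulate_seir(population, exposed0, infectious0, beta, sigma, gamma,
                  days, steps_per_day=10):
    """Integrate the SEIR equations with forward Euler.

    Returns one (S, E, I, R) tuple per day, starting with day 0.
    """
    s = float(population - exposed0 - infectious0)
    e = float(exposed0)
    i = float(infectious0)
    r = 0.0
    dt = 1.0 / steps_per_day
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt  # exposed become infectious
            new_removed = gamma * i * dt     # recoveries plus fatalities
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
        history.append((s, e, i, r))
    return history


# Illustrative run with assumed (not fitted) parameters.
sir_history = simulate_sir(population=1_000_000, infectious0=10,
                           beta=0.3, gamma=0.1, days=200)
peak_day = max(range(len(sir_history)), key=lambda d: sir_history[d][1])
print("SIR peak of infectious cases on day", peak_day)
```

In practice, `beta`, `sigma` and `gamma` would be estimated from the observed state-level time series rather than fixed by hand, and a library ODE solver could replace the simple Euler loop.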

Materials and Methods:

For this research we analysed data from various sources available in the public domain. We searched for state-wise data from sources such as the Ministry of Health and Family Welfare, Government of India COVID-19 dashboard (https://www.mygov.in/covid-19), Kaggle, and crowd-sourced databases for COVID-19. The datasets were evaluated on completeness and recency. Common issues observed during evaluation were missing data, stale data that was not up to date, and insufficient granularity. These issues were identified and a strategy to address them was finalized. Because city-wise data was not available, the analysis was performed at the state level.
After a detailed evaluation of the sources, the crowd-sourced database for COVID-19 statistics was selected as the golden data source. The https://api.covid19india.org/ datasets consist of state-wise time series of Confirmed, Recovered and Deceased numbers.
Since the datasets are available in the public domain, no specific approval or permissions were required.
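
The Confirmed, Recovered and Deceased series can be mapped onto SIR compartments before any model is fitted. The sketch below shows one common way to do this. It assumes the three series are cumulative daily counts and that the "removed" compartment combines recoveries and fatalities. The function name `to_sir_compartments` and the sample numbers are hypothetical and are not taken from the dataset.

```python
def to_sir_compartments(confirmed, recovered, deceased, population):
    """Convert cumulative case counts into (S, I, R) tuples, one per day.

    Assumes all three input series are cumulative and aligned by date.
    """
    compartments = []
    for c, r, d in zip(confirmed, recovered, deceased):
        removed = r + d                # recovered or deceased
        infected = c - removed         # currently active cases
        susceptible = population - c   # never confirmed as infected
        compartments.append((susceptible, infected, removed))
    return compartments


# Made-up numbers for illustration only.
sample = to_sir_compartments(
    confirmed=[100, 150],
    recovered=[10, 30],
    deceased=[2, 5],
    population=1000,
)
print(sample)
```

If the source series were daily increments instead of cumulative totals, they would need to be summed cumulatively first.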
Since Maharashtra had the highest number of confirmed cases over the period under study, the analysis and model building focused on this state.
Maharashtra was also compared against some of the metro and non-metro states. This comparison was made to examine the effect of demographics, industrial development, prevalence of the IT service industry and other such key internal and external factors on the progression of infection within these states.

