Yale Chang • over 11 years ago
UPDATE: clustering and regression methods reveal some interesting patterns in the data
1) To predict the number of new cases or deaths in a given region at a given time, we applied a clustering algorithm to the data and found that we may need different regressors to predict new cases. We also found a strong correlation between the number of deaths and the number of cases 19 days earlier, which is consistent with practical experience.
2) Lasso Poisson regression can capture the pattern of new cases each day, but cannot predict the exact number. It also identifies past case numbers in neighboring regions as predictive of the case number in a given region. Quarantine is also relevant.
3) We computed the transmission rate in each region by assuming exponential growth, then used a nonlinear dependency measure, copula correlation, to find relevant factors. Factors such as 'bad water', 'small house', 'urban', 'tap water', and 'latitude' were found to be relevant. Further linear/nonlinear regression analysis is needed to identify the exact functional form.
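
As a rough illustration of two of the steps above (the lagged cases/deaths correlation in 1 and the exponential-growth rate in 3), here is a minimal sketch in plain Python. It is not the code behind the slides: the function names `pearson`, `best_lag`, and `growth_rate` are hypothetical, the data is synthetic, and plain Pearson correlation is used for the lag search rather than the copula correlation mentioned in 3. The Lasso Poisson model in 2 is not covered.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_lag(cases, deaths, max_lag):
    """Return (lag, r) with the highest correlation between
    cases[t - lag] and deaths[t], for lag in 0..max_lag."""
    best = (0, -1.0)
    for lag in range(max_lag + 1):
        x = cases[:len(cases) - lag]   # cases, shifted back by `lag` days
        y = deaths[lag:]               # deaths on the matching later days
        r = pearson(x, y)
        if r > best[1]:
            best = (lag, r)
    return best

def growth_rate(counts):
    """Least-squares slope of log(count) against the day index,
    i.e. r in the assumed model count(t) = c0 * exp(r * t)."""
    t = list(range(len(counts)))
    logc = [math.log(c) for c in counts]
    n = len(t)
    mt, ml = sum(t) / n, sum(logc) / n
    num = sum((a - mt) * (b - ml) for a, b in zip(t, logc))
    den = sum((a - mt) ** 2 for a in t)
    return num / den

# Synthetic data (assumption): deaths follow cases with a 19-day delay.
cases = [100 + 40 * math.sin(t / 7) for t in range(120)]
deaths = [0.0] * 19 + [0.5 * c for c in cases[:-19]]

# Synthetic counts (assumption) growing at rate 0.1 per day.
counts = [10 * math.exp(0.1 * t) for t in range(30)]

lag, r = best_lag(cases, deaths, max_lag=40)
rate = growth_rate(counts)
print(lag, round(r, 3), round(rate, 3))
```

On real data, the log-linear fit only makes sense while counts are strictly positive and growth is roughly exponential, so it would likely need to be applied per region over a chosen time window.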
Link for slides and data: https://www.dropbox.com/sh/ief4x9yshmd6619/AADJgB49a_xQcwl9aYynv2Pua?dl=0