This project aims to impute the missing values in the given datasets. You have to write Python code that does the following:
Step 1: Read the dataset. You will receive datasets with numerical and/or categorical attributes in XLS and/or CSV format. Do not hard-code a specific data size or dimension; your code must be able to load data of any size and dimension.
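As an illustration of Step 1, here is a minimal sketch of a loader that picks the reader from the file extension. The function name `load_dataset` is hypothetical, and the sketch assumes pandas is installed along with an Excel engine (for example `xlrd` for `.xls` or `openpyxl` for `.xlsx`).

```python
from pathlib import Path

import pandas as pd


def load_dataset(path):
    """Load a CSV or Excel file of any size into a DataFrame (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in {".xls", ".xlsx"}:
        # Needs an Excel engine to be installed alongside pandas.
        return pd.read_excel(path)
    raise ValueError(f"Unsupported file type: {suffix!r}")
```

Nothing in the loader depends on the number of rows or columns, so the same call works for every dataset; empty cells are read as NaN by default.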
Step 2: Identify the missing data. Determine how many values are missing and where they are located. For instance, returning the indices of the missing values lets you identify the missing-data pattern (univariate, monotone, or arbitrary).
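One possible way to approach Step 2 is sketched below. The helper names are hypothetical, and the pattern check assumes the usual textbook definitions: "univariate" means only one column has missing values, and "monotone" means the columns can be ordered so that a value missing in one column is also missing in every later column.

```python
import numpy as np
import pandas as pd


def describe_missing(df):
    """Return the number and the (row label, column name) locations of missing values."""
    mask = df.isna().to_numpy()
    rows, cols = np.nonzero(mask)
    locations = [(df.index[r], df.columns[c]) for r, c in zip(rows, cols)]
    return {"count": int(mask.sum()), "locations": locations}


def missing_pattern(df):
    """Classify the pattern as complete, univariate, monotone, or arbitrary."""
    mask = df.isna().to_numpy()
    cols_with_missing = mask.any(axis=0)
    if not cols_with_missing.any():
        return "complete"
    if cols_with_missing.sum() == 1:
        return "univariate"
    # Order columns from fewest to most missing values, then check that each
    # column's missing rows are contained in the next column's missing rows.
    order = np.argsort(mask.sum(axis=0), kind="stable")
    ordered = mask[:, order]
    if np.all(ordered[:, :-1] <= ordered[:, 1:]):
        return "monotone"
    return "arbitrary"
```

Sorting by missing count is enough for the monotone check because nested sets are automatically ordered by size.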
Step 3: Read the reference paper for the assigned method, understand the algorithm, and write code that imputes the missing data using that approach. In this case, the method is multiple imputation via Markov chain Monte Carlo (MI via MCMC).
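To give a feel for what Step 3 may involve, the sketch below implements one common form of MI via MCMC: data augmentation under a multivariate normal model with a noninformative prior. This is an assumption, not necessarily the algorithm of the reference paper, which remains the authority. The function names and default settings are hypothetical, and the sketch handles numeric columns only and assumes more rows than columns.

```python
import numpy as np


def draw_inverse_wishart(rng, df, scale):
    """Draw Sigma ~ InvWishart(df, scale) by inverting a Wishart(df, scale^-1) draw."""
    chol = np.linalg.cholesky(np.linalg.inv(scale))
    z = rng.standard_normal((df, scale.shape[0])) @ chol.T
    return np.linalg.inv(z.T @ z)


def mcmc_impute(X, n_imputations=5, burn_in=100, thin=20, seed=0):
    """Return a list of completed copies of X (a sketch of data augmentation)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    miss = np.isnan(X)
    n, p = X.shape
    # Initialise the chain with column-mean imputation.
    filled = np.where(miss, np.nanmean(X, axis=0), X)
    mu = filled.mean(axis=0)
    sigma = np.atleast_2d(np.cov(filled, rowvar=False)) + 1e-6 * np.eye(p)
    imputations = []
    for it in range(1, burn_in + n_imputations * thin + 1):
        # I-step: draw each row's missing values given its observed values.
        for i in np.flatnonzero(miss.any(axis=1)):
            m, o = miss[i], ~miss[i]
            if o.any():
                coef = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
                cond_mean = mu[m] + coef @ (filled[i, o] - mu[o])
                cond_cov = sigma[np.ix_(m, m)] - coef @ sigma[np.ix_(o, m)]
            else:
                cond_mean, cond_cov = mu[m], sigma[np.ix_(m, m)]
            cond_cov = (cond_cov + cond_cov.T) / 2  # remove round-off asymmetry
            filled[i, m] = rng.multivariate_normal(cond_mean, cond_cov)
        # P-step: draw (mu, sigma) from their posterior given the completed data.
        ybar = filled.mean(axis=0)
        centered = filled - ybar
        scatter = centered.T @ centered + 1e-6 * np.eye(p)
        sigma = draw_inverse_wishart(rng, n - 1, scatter)
        mu = rng.multivariate_normal(ybar, sigma / n)
        # Keep every `thin`-th completed dataset after the burn-in period.
        if it > burn_in and (it - burn_in) % thin == 0:
            imputations.append(filled.copy())
    return imputations
```

Each element of the returned list is one completed dataset. A single set of imputed values for comparison against the complete data could be obtained by averaging them, for example `np.mean(mcmc_impute(X), axis=0)`.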
Step 4: Return the imputed data and compare it with the complete data to measure the accuracy and reliability of your results. Your code must return the imputed values so that they can be compared with the original complete data to compute the error (NRMS). You can generate diagrams, automatically or manually, to present your results and compare them with the original complete datasets.
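The error computation in Step 4 can be sketched as follows. The definition used here is an assumption, since the brief does not spell it out: NRMS is taken as the Frobenius norm of the difference between the imputed and complete data, divided by the Frobenius norm of the complete data. Check the reference paper for the exact formula expected. The function name `nrms` is hypothetical.

```python
import numpy as np


def nrms(imputed, complete):
    """Assumed NRMS: ||imputed - complete||_F / ||complete||_F (numeric data only)."""
    imputed = np.asarray(imputed, dtype=float)
    complete = np.asarray(complete, dtype=float)
    # For 2-D arrays, np.linalg.norm defaults to the Frobenius norm.
    return np.linalg.norm(imputed - complete) / np.linalg.norm(complete)
```

A value of 0 means the imputed data match the complete data exactly; larger values mean larger imputation error.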