Determining data distributions

This project covers statistical analysis of a data collection and data reduction methods, in particular dimensionality reduction through feature extraction. You are given two datasets, each containing 1000 vectors with 100 attributes (i.e., dimensions). Each dataset is split across two text table files of 500 samples each; taken together, each dataset is a 1000 x 100 matrix in which every row is a vector. You are further told that, within each dataset, the component values of every sample (i.e., vector) follow the same distribution.

1. Determine the distribution of the vector component values for each of the two datasets. For each dataset, randomly pick 10 samples and report the distribution parameters for each of the 10 samples.

2. Compute the norms for all the samples for both datasets. Then determine the distributions for the norms of both datasets, respectively, and report their distribution parameters.

3. Implement PCA and DCT methods and apply them for feature extraction to the two datasets, respectively. Report the reduced dimensionalities for the two datasets after the feature extraction for PCA and DCT, respectively.

4. Compare the feature extraction results between the two methods for the two datasets, respectively, and report your comparison conclusion.
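For tasks 1 and 2, a minimal sketch of one possible starting point is shown here. It assumes NumPy, uses synthetic stand-in data because the actual files are not part of this posting, and the file names in the comment are hypothetical. The helper `moment_summary` is an illustrative name, not part of any library. Sample moments only hint at the distribution family (skewness and excess kurtosis are both 0 for a normal distribution, while a uniform distribution has excess kurtosis -1.2), so a goodness-of-fit test would still be needed to confirm the choice.

```python
import numpy as np

def moment_summary(x):
    """Return basic descriptive parameters of a 1-D array."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std()
    z = (x - mean) / std
    return {
        "mean": mean,
        "std": std,
        "skew": np.mean(z ** 3),                   # 0 for symmetric distributions
        "excess_kurtosis": np.mean(z ** 4) - 3.0,  # 0 for normal, -1.2 for uniform
        "min": x.min(),
        "max": x.max(),
    }

rng = np.random.default_rng(0)

# Stand-in data (assumption). With the real files this might look like:
# data = np.vstack([np.loadtxt("part1.txt"), np.loadtxt("part2.txt")])
# where "part1.txt" and "part2.txt" are hypothetical file names.
data = rng.normal(loc=2.0, scale=3.0, size=(1000, 100))

# Task 1: parameters of the component values for 10 randomly chosen samples.
rows = rng.choice(data.shape[0], size=10, replace=False)
sample_stats = [moment_summary(data[i]) for i in rows]
for i, s in zip(rows, sample_stats):
    print(f"row {i}: mean={s['mean']:.3f} std={s['std']:.3f} "
          f"skew={s['skew']:.3f} kurt={s['excess_kurtosis']:.3f}")

# Task 2: Euclidean norm of every sample, then the same summary on the norms.
norms = np.linalg.norm(data, axis=1)
norm_stats = moment_summary(norms)
print(f"norms: mean={norm_stats['mean']:.3f} std={norm_stats['std']:.3f}")
```

A histogram of the component values and of the norms is a useful visual check alongside these numbers.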

You can use whatever programming language you are comfortable with.
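In Python, for example, task 3 could be sketched roughly as follows. This is an illustration under stated assumptions, not a required approach: it uses NumPy only, builds an orthonormal DCT-II matrix by hand, runs on synthetic stand-in data, and picks the reduced dimensionality as the smallest number of leading components that retain 95% of the total energy. The 95% threshold and the function names (`components_for_energy`, `pca_fit`, `dct_matrix`) are assumptions, since the posting does not specify a selection criterion.

```python
import numpy as np

def components_for_energy(energies, threshold=0.95):
    """Smallest k whose first k energies reach `threshold` of the total."""
    energies = np.asarray(energies, dtype=float)
    cumulative = np.cumsum(energies) / np.sum(energies)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(energies))

def pca_fit(data):
    """Return principal directions (rows) and their variances, largest first."""
    centered = data - data.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s ** 2 / (data.shape[0] - 1)
    return vt, variances

def dct_matrix(d):
    """Orthonormal DCT-II matrix of size d x d (row k is the k-th basis vector)."""
    n = np.arange(d)
    k = n.reshape(-1, 1)
    c = np.sqrt(2.0 / d) * np.cos(np.pi * (n + 0.5) * k / d)
    c[0] /= np.sqrt(2.0)  # rescale the DC row so all rows have unit norm
    return c

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 100))  # stand-in for one real dataset (assumption)

# PCA: keep the leading directions that explain 95% of the variance.
vt, variances = pca_fit(data)
k_pca = components_for_energy(variances)
reduced_pca = (data - data.mean(axis=0)) @ vt[:k_pca].T

# DCT: transform each row, then keep the leading (low-frequency) coefficients
# holding 95% of the average energy. Keeping them in frequency order rather
# than sorting by energy is a design choice made for this sketch.
coeffs = data @ dct_matrix(data.shape[1]).T
k_dct = components_for_energy(np.mean(coeffs ** 2, axis=0))
reduced_dct = coeffs[:, :k_dct]

print("PCA dims:", k_pca, "DCT dims:", k_dct)
```

For task 4, the two values `k_pca` and `k_dct` can then be compared per dataset. PCA adapts its basis to the data, whereas the DCT basis is fixed, so the two methods are not expected to reduce every dataset equally well.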

Skills: Python, Data Mining


About the Employer:
( 6 reviews ) BINGHAMTON, United States

Project ID: #21738468

1 freelancer is bidding on average $50 for this job


Hi there, I have read your project description and I'm confident I can do this project for you perfectly. I still have a few questions. Please leave a message on my chat so we can discuss the budget and deadline of the ...

$50 USD / hour
(14 Reviews)