18IT039-Practical Exam_Nov-2021_7IT1

Nihar javiya
Nov 18, 2021


Kindly perform the following tasks for the given dataset.

Dataset: https://archive.ics.uci.edu/ml/machine-learning-databases/00475/

Task-1:
Dataset description using the Orange tool.
What needs to be done to improve the classification accuracy on the given dataset? Get the maximum classification accuracy possible by applying the following methods.
→Pre-processing
o Encoding
o Normalization
o Missing value handling
o Feature Selection

Compare your accuracy with and without applying the pre-processing steps. Perform the classification and visualize the accuracy before and after pre-processing in Orange/Python.

Task-2:
Generate a dashboard of the preprocessed dataset from Task-1.
Find the maximum data insights by plotting a bar chart, box plot, pie plot, and stack plot using Power BI dashboard visualization.

The following answers need to be submitted in a single PDF file:
1. Provide a screenshot of the data description and explain it in brief.
2. Provide screenshot(s) of the data pre-processing steps showing their significance.
3. Provide a screenshot showing the accuracy before and after pre-processing.
4. Provide a screenshot of the Power BI dashboard with a description.

Solution:

Ans-1. The dataset provided to us is audit data. First, we import the data into the Orange tool.

Next, we will use the Orange tool to preprocess the data.

data table

Ans-2. In this task, we first apply encoding, normalization, missing value handling, and feature selection to the data.

preprocess step
preprocess step
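
For readers who prefer Python over the Orange canvas, the same four steps can be approximated with scikit-learn. This is only a rough sketch, not the Orange workflow used above: the data frame is synthetic, the column names (`score_a`, `score_b`, `location`) and the target are made up for illustration, and the choice of mean imputation, min-max scaling, and `SelectKBest` is an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in for the audit data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "score_a": rng.normal(5, 2, n),
    "score_b": rng.normal(100, 30, n),
    "location": rng.choice(["north", "south", "east"], n),
})
df.loc[::10, "score_a"] = np.nan            # inject some missing values
y = (df["score_b"] > 100).astype(int)       # hypothetical binary target

# Missing value handling + normalization for the numeric columns.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", MinMaxScaler()),
])

preprocess = Pipeline([
    ("columns", ColumnTransformer([
        ("num", numeric_steps, ["score_a", "score_b"]),
        # Encoding for the categorical column.
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["location"]),
    ])),
    # Feature selection: keep the 3 columns with the highest ANOVA F-score.
    ("select", SelectKBest(score_func=f_classif, k=3)),
])

X_ready = preprocess.fit_transform(df, y)
print(X_ready.shape)
```

The pipeline keeps every step in one object, so the same transformations can be fitted on training data and reused on new data.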

After preprocessing, I need to set the target variable for prediction. I set it as shown in the following screenshot.

Assign target variable

After that, we send the data to the Data Sampler widget, which divides the dataset.
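
In Python, a comparable split can be sketched with scikit-learn's `train_test_split`. The data here is random and the 70/30 split size is an assumption for illustration, not necessarily the setting used in the Data Sampler widget.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Random placeholder data: 100 rows, 4 features, balanced binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = np.array([0, 1] * 50)

# Hold out 30 rows for testing; stratify keeps the class balance in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=30, stratify=y, random_state=0
)
print(X_train.shape, X_test.shape)
```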

After data sampling, we select the models used for prediction. Here I selected three methods: kNN, Naive Bayes, and Neural Network. After that, we add the Predictions widget, where we can see the scores of our models on the dataset.
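
A minimal Python sketch of the same idea trains the three model types and prints their test accuracy. It assumes scikit-learn equivalents of the Orange widgets (`KNeighborsClassifier`, `GaussianNB`, `MLPClassifier`) and uses a synthetic dataset from `make_classification`, so the numbers it prints say nothing about the audit data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic placeholder data, not the audit dataset.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hyperparameters below are illustrative defaults, not tuned values.
models = {
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
    "Neural Network": MLPClassifier(
        hidden_layer_sizes=(50,), max_iter=1000, random_state=0
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)   # classification accuracy
    print(f"{name}: {scores[name]:.3f}")
```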

My whole data flow has two parts: the first part is with data preprocessing, and the second part is without data preprocessing. The whole data flow looks like this:

data flow

Ans-3. Now we will look at the output for the preprocessed data. By clicking on the Predictions widget, we can see the scores for our dataset.

Output with preprocessing

Now we look at the output without preprocessing.

Output without preprocessing
preprocess vs simple data flow

The screenshot shows the difference between the results with and without preprocessing.
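
To illustrate why preprocessing can change accuracy, here is a small hedged sketch in Python. It builds a synthetic dataset with one informative feature on a small scale and one pure-noise feature on a large scale, then compares kNN with and without min-max normalization. The data and settings are assumptions chosen to make the effect visible; they are not the audit data or the scores from the screenshots.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, n)

# Informative feature: centered at +1 or -1 depending on the class.
informative = np.where(y == 1, 1.0, -1.0) + rng.normal(0, 0.5, n)
# Noise feature: unrelated to the class, but on a much larger scale.
noise = rng.normal(0, 1000, n)
X = np.column_stack([informative, noise])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Without preprocessing: distances are dominated by the noise feature.
raw = KNeighborsClassifier(n_neighbors=5)
acc_raw = raw.fit(X_train, y_train).score(X_test, y_test)

# With normalization: both features contribute on a comparable scale.
scaled = make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=5))
acc_scaled = scaled.fit(X_train, y_train).score(X_test, y_test)

print(f"without preprocessing: {acc_raw:.3f}")
print(f"with normalization:    {acc_scaled:.3f}")
```

Distance-based models such as kNN are the most sensitive to feature scale, which is why this sketch uses kNN for the comparison.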

Ans-4. Now we will generate the Power BI dashboard for the provided dataset. To generate the dashboard, we first create a report in Power BI and publish it to the workspace.

The report contains a bar chart, box plot, pie plot, and stack plot.

The dashboard looks like this:

dashboard
dashboard
