| 1 | +# EXPLORATORY DATA ANALYSIS (EDA): |
| 2 | + |
| 3 | +Exploratory Data Analysis is an approach to analyzing data, chiefly through visual techniques. It is used to discover trends and patterns and to check assumptions with the help of statistical summaries and graphical representations. |
| 4 | + |
| 5 | +There are two main types of analysis: |
| 6 | +* Univariate Analysis |
| 7 | +* Multivariate Analysis |
| 8 | + |
| 9 | + |
| 10 | +# UNIVARIATE ANALYSIS: |
| 11 | + |
| 12 | +Univariate analysis is the simplest form of data analysis. "Uni" means one, so the analysis considers only one variable at a time. Its main purpose is to describe the data and find patterns that exist within it. |
| 13 | + |
| 14 | +There are several options for describing univariate data: |
| 15 | +1. Frequency Distribution Tables |
| 16 | +2. Bar Charts |
| 17 | +3. Histogram |
| 18 | +4. Pie Charts |
| 19 | + |
| 20 | +## Frequency Distribution Tables: |
| 21 | + |
| 22 | +Frequency tells you how often something happens: the frequency of an observation is the number of times it occurs in the data. |
| 23 | +* Tables can show either categorical variables (sometimes called qualitative variables) or quantitative variables (sometimes called numeric variables). You can think of categorical variables as categories (like eye color or sex) and quantitative variables as numbers. |
| 24 | +Example: `df['Sex'].value_counts()` returns \ |
| 25 | + male 577 \ |
| 26 | + female 314 \ |
| 27 | + Name: Sex, dtype: int64 |
| 28 | + |
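As a rough sketch of how such a table can be built, the snippet below computes both absolute and relative frequencies with pandas. The tiny `df` here is hypothetical stand-in data, not the dataset behind the counts shown above, and the column names `count` and `share` are chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: a tiny stand-in for the `df` used in the example above.
df = pd.DataFrame({"Sex": ["male", "female", "male", "male", "female"]})

# Absolute frequencies: how many times each category occurs.
counts = df["Sex"].value_counts()

# Relative frequencies: the share of rows that fall in each category.
shares = df["Sex"].value_counts(normalize=True)

# Combine both into one frequency distribution table.
freq_table = pd.DataFrame({"count": counts, "share": shares})
print(freq_table)
```

The same two calls work for any categorical column; for a quantitative column it usually helps to group the values into ranges first.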
| 29 | +## Bar Charts: |
| 30 | + |
| 31 | +A bar chart is a graph with rectangular bars, usually used to compare different categories. Although the bars can be plotted vertically (standing up) or horizontally (lying flat from left to right), the vertical bar chart is the most common. |
| 32 | +* In a vertical bar chart, the horizontal (x) axis represents the categories; the vertical (y) axis represents a value for those categories. |
| 33 | + |
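A minimal sketch of a vertical bar chart with matplotlib, reusing the two category counts from the `value_counts()` output in the frequency table section. The axis labels and title are illustrative choices, not part of the original data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Category counts taken from the value_counts() output shown earlier.
categories = ["male", "female"]
values = [577, 314]

fig, ax = plt.subplots()
bars = ax.bar(categories, values)  # vertical bars; ax.barh(...) would lay them flat
ax.set_xlabel("Sex")               # x axis: the categories
ax.set_ylabel("Number of rows")    # y axis: the value for each category
ax.set_title("Bar chart of a categorical variable")
```

To view the result, save it with `fig.savefig(...)`, or drop the `matplotlib.use("Agg")` line and call `plt.show()` in an interactive session.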
| 34 | +## Histogram: |
| 35 | + |
| 36 | +Histograms are similar to bar charts; both display counts of data. A bar chart plots counts against categories: the height of each bar indicates the number of items in that category. A histogram instead groups the values of a quantitative variable into ranges called “bins”, and the height of each bar indicates the number of observations that fall in that bin. |
| 37 | + |
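To make the idea of bins concrete, here is a small sketch that bins a numeric variable with NumPy and prints a text histogram. The `ages` values, the number of bins, and the range are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical quantitative data: ages of ten people.
ages = np.array([2, 5, 18, 22, 25, 31, 38, 44, 59, 71])

# Split the range 0 to 80 into four equal-width bins and count the values in each.
counts, edges = np.histogram(ages, bins=4, range=(0, 80))

# Print one row per bin: its range, a bar of '#' characters, and the count.
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:>4.0f} to {right:<4.0f}: {'#' * int(n)} ({n})")
```

Passing the same `bins` and `range` arguments to `matplotlib.pyplot.hist` draws these counts as bars instead of text.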
| 38 | +## Pie Charts: |
| 39 | + |
| 40 | +A pie chart displays data as a circle divided into slices. Each slice is proportional to the fraction of the whole that its category represents. The entire “pie” represents 100 percent of the whole, while the “slices” represent portions of it. |
| 41 | + |
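The proportionality described above can be sketched as follows: the percentages are computed by hand, and matplotlib draws slices with the same proportions. The counts are the two values from the `value_counts()` example earlier; the title is an illustrative choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

labels = ["male", "female"]
counts = [577, 314]

# Each slice's share of the whole, in percent; the shares sum to 100.
total = sum(counts)
percentages = [100 * c / total for c in counts]

fig, ax = plt.subplots()
# autopct prints each slice's percentage inside the slice.
ax.pie(counts, labels=labels, autopct="%1.1f%%")
ax.set_title("Share of each category")
```

Pie charts tend to read best with only a few categories; with many small slices, a bar chart is usually easier to compare.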
| 42 | + |
| 43 | +# MULTIVARIATE ANALYSIS: |
| 44 | + |
| 45 | +Multivariate analysis is part of exploratory data analysis. It lets us examine several variables together and visualize the relationships among them. |
| 46 | + |
| 47 | +Multivariate analysis (MVA) is a statistical procedure for analyzing data that involve more than one type of measurement or observation. It may also mean solving problems where more than one dependent variable is analyzed simultaneously with other variables. |
| 48 | + |
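One simple way to look at two variables together is a cross-tabulation, sketched below with pandas. The data and the `Survived` column are hypothetical and exist only to show the shape of the result.

```python
import pandas as pd

# Hypothetical data: two variables recorded for the same six people.
df = pd.DataFrame({
    "Sex": ["male", "female", "male", "female", "male", "female"],
    "Survived": [0, 1, 0, 1, 1, 0],
})

# Cross-tabulation: counts for every combination of the two variables.
table = pd.crosstab(df["Sex"], df["Survived"])
print(table)
```

Each cell counts the rows that share one value of `Sex` and one value of `Survived`, which is the two-variable analogue of a frequency table.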
| 49 | +There are many different techniques for multivariate analysis, and they can be divided into two categories: |
| 50 | + |
| 51 | +* Dependence techniques |
| 52 | +* Interdependence techniques |
| 53 | + |
| 54 | +## Dependence Techniques: |
| 55 | + |
| 56 | +Dependence methods are used when one or more of the variables depend on others. In machine learning, dependence techniques are used to build predictive models. |
| 57 | +* The analyst enters input data into the model, specifying which variables are independent and which are dependent; in other words, which variables the model should predict and which variables it should use to make those predictions. |
| 58 | + |
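As one illustration of a dependence technique, the sketch below fits a straight line by ordinary least squares with NumPy. Linear regression is only one of many such techniques, and the `hours` and `score` data are hypothetical, constructed so that the fit is exact.

```python
import numpy as np

# Hypothetical data: hours studied (independent) and exam score (dependent).
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
score = np.array([52.0, 54.0, 56.0, 58.0, 60.0])  # exactly 50 + 2 * hours

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(hours), hours])

# Ordinary least squares: find the coefficients minimizing ||X @ coef - score||.
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
intercept, slope = coef

# Use the fitted model to predict the dependent variable for a new input.
predicted = intercept + slope * 6.0
print(f"score = {intercept:.1f} + {slope:.1f} * hours; prediction for 6 hours: {predicted:.1f}")
```

Here the analyst's choice is explicit in the code: `hours` is the variable the model uses, and `score` is the variable it predicts.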
| 59 | +## Interdependence Techniques: |
| 60 | + |
| 61 | +Interdependence methods are used to understand the structural makeup and underlying patterns within a dataset. In this case, no variable is treated as dependent on the others, so you’re not looking for causal relationships. Rather, interdependence methods seek to give meaning to a set of variables or to group them together in meaningful ways. |
| 62 | + |
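Principal component analysis (PCA) is one commonly used interdependence technique. The sketch below runs it through a singular value decomposition in NumPy; the `data` array is hypothetical, built so that the second variable is roughly twice the first.

```python
import numpy as np

# Hypothetical data: 6 observations of 3 variables; the second is about twice the first.
data = np.array([
    [1.0, 2.1, 0.5],
    [2.0, 3.9, 0.4],
    [3.0, 6.2, 0.6],
    [4.0, 7.8, 0.5],
    [5.0, 10.1, 0.4],
    [6.0, 12.0, 0.6],
])

# Center each variable; PCA looks at variation around the mean.
centered = data - data.mean(axis=0)

# SVD of the centered data: the rows of `components` are the principal directions.
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)

# Share of the total variance captured by each component, largest first.
explained = singular_values**2 / np.sum(singular_values**2)
print(np.round(explained, 3))
```

No variable is singled out as the one to predict. Because the first two variables move together, most of the variance should land on the first component, which is the kind of underlying structure these methods are meant to surface.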