# What does a negative kurtosis mean

## Descriptive statistics and probability theory

### 3.6 Measures of symmetry and curvature

**Explanatory video for section 3.6 (slides 88-96)**

After a short detour to graphical representations, we now turn to two further measures used to characterize cardinally scaled features or their original lists. So far we have seen how key figures for the location and the spread of the feature values can be formed; now we add key figures for assessing the **symmetry** (or rather asymmetry) and the so-called **curvature** of original lists.

### Symmetry and skew

First of all, let us consider what *symmetry* of an original list or a feature should mean. In common parlance, an object is called *symmetrical* if its mirror image cannot be distinguished from the object itself. An analogous definition suggests itself, given the already familiar option of displaying an original list as a bar chart. Accordingly, a feature is called **symmetrical** if its bar chart is symmetric about an axis parallel to the \(y\) axis.

It can be shown that, in the case of such symmetry, the axis of symmetry must lie exactly at \(\overline{x}\). Accordingly, a feature is symmetric (about \(\overline{x}\)) if and only if the list formed from \(2 \cdot \overline{x} - x_i\) for \(i \in \{1, \ldots, n\}\) does not differ from the original list apart from the order of its entries (or, equivalently, the frequency distribution of the feature \(X\) does not differ from that of the "transformed" feature \(2 \cdot \overline{x} - X\)). Subtracting \(\overline{x}\) from both lists gives the equivalent characterization that the lists of \(x_i - \overline{x}\) and \(\overline{x} - x_i\) agree apart from order (or, correspondingly, that the features \(X - \overline{x}\) and \(\overline{x} - X\) have the same frequency distribution).
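The reflection criterion can be checked mechanically. The following is a minimal Python sketch, not part of the original notes: the function name `is_symmetric` and the second sample list are illustrative assumptions, and exact fractions are used so that rounding cannot spoil the comparison.

```python
from fractions import Fraction

def is_symmetric(values):
    """Return True if the list equals its mirror image at the arithmetic mean."""
    xs = [Fraction(v) for v in values]        # exact arithmetic, no rounding
    mean = sum(xs) / len(xs)
    reflected = [2 * mean - x for x in xs]    # the list of 2 * mean - x_i
    # Sorting both lists compares them "apart from the order of the entries".
    return sorted(xs) == sorted(reflected)

print(is_symmetric([1, 2, 3, 4, 5, 6, 7, 8, 9]))  # True, symmetric about 5
print(is_symmetric([1, 1, 2, 3, 10]))             # False
```

Comparing sorted lists is equivalent to comparing the two frequency distributions, which is the second formulation given above.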

If a feature does not have this symmetry property, one is interested in a measure of the degree of asymmetry. For this purpose, the so-called **empirical skewness** (or simply **skewness**) according to Definition 3.7 below is used.

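**Definition 3.7 (empirical skewness)**

Let \(X\) be a feature with original list entries \(x_1, \ldots, x_n\), arithmetic mean \(\overline{x}\) and empirical standard deviation \(s > 0\). Then \[\text{skewness}(X) = \frac{1}{n} \sum_{i=1}^n \left( \frac{x_i - \overline{x}}{s} \right)^3\] is called the **empirical skewness** of \(X\).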

It is easy to see that this key figure measures skewness at least in the sense that it always takes the value \(0\) for symmetrical features: since the lists of \(x_i - \overline{x}\) and \(\overline{x} - x_i\) then agree apart from order, we obviously have \(\text{skewness}(X) = -\text{skewness}(X)\), which can only hold for \(\text{skewness}(X) = 0\).
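This argument can be tried out numerically. The sketch below assumes the calculation rule "mean of the third powers of \((x_i - \overline{x})/s\)" with the \(1/n\) form of the empirical variance used in this section; the function name `empirical_skewness` and the sample lists are illustrative and do not come from the original notes.

```python
import math

def empirical_skewness(values):
    """Mean of the cubed standardized deviations (1/n convention throughout)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sum(((x - mean) / s) ** 3 for x in values) / n

symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
skewed = [1, 1, 1, 2, 2, 3, 7]        # many small values, one large value
mirrored = [-x for x in skewed]       # mirror image of the list above

print(empirical_skewness(symmetric))  # 0 up to floating-point rounding
print(empirical_skewness(skewed))     # positive
print(empirical_skewness(mirrored))   # same magnitude, negative sign
```

Mirroring a list flips the sign of every deviation and therefore of every third power, which is why a list that equals its own mirror image must have skewness \(0\).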

It can also be seen that the skewness measure defined in this way (unlike the empirical variance) can take both positive and negative values; it therefore quantifies not only the *strength* but also the *direction* of the skew. The following figure compares bar charts of a symmetrical feature, a feature with positive skewness, and a feature with negative skewness.

Figure 3.14: Example: empirical skewness of features

In the present example, the feature with positive skewness shows a relatively strong concentration of “small” feature values on the left side of the bar chart combined with a more spread-out distribution of the “large” feature values on the right side. The opposite holds for the feature with negative skewness: a more spread-out distribution of the “small” original list entries on the left side combined with a high concentration of the “large” entries on the right side.

Following the visual impression of the steeper or flatter rise or fall of the bars on the left or right side of the bar chart, the following terms have become established:

A feature \(X\) is called

- **left-steep** or **right-skewed** if \(\text{skewness}(X) > 0\), and
- **right-steep** or **left-skewed** if \(\text{skewness}(X) < 0\).

Also of interest is how the relative position of the arithmetic mean \(\overline{x}\) and the median \(x_{\text{med}}\) (uniquely determined under the known convention) relates to the asymmetry of a feature: for symmetrical features we **always** have \(\overline{x} = x_{\text{med}}\), while for left-steep features we **tend to** have \(\overline{x} > x_{\text{med}}\) and for right-steep features we **tend to** have \(\overline{x} < x_{\text{med}}\).
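The relation between mean and median can be illustrated with a few lines of Python. This is only a sketch on hand-picked lists (they are assumptions, not data from the notes), and since the relation is a tendency for asymmetric features, other lists may behave differently. The helper name `mean_minus_median` is hypothetical.

```python
from statistics import mean, median

def mean_minus_median(values):
    """Positive if the mean lies right of the median, negative if left of it."""
    return mean(values) - median(values)

left_steep = [1, 1, 1, 2, 2, 3, 7]           # positive skewness
right_steep = [-x for x in left_steep]       # negative skewness
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print(mean_minus_median(left_steep))    # positive: mean > median
print(mean_minus_median(right_steep))   # negative: mean < median
print(mean_minus_median(symmetric))     # 0: mean equals median
```

In the left-steep list the single large value 7 pulls the mean to the right, while the median stays with the bulk of the small values.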

### Curvature / Kurtosis

The next key figure for characterizing cardinally scaled features is the so-called **curvature** or **kurtosis**. It is used to determine whether the original list entries tend to spread evenly around the center of the frequency distribution, or whether comparatively many entries lie very close to the center combined with a few (possibly only isolated) feature values at a large distance from the center.

The term *curvature* is best explained by looking at the associated bar charts or histograms, where a connection between the key figure and the shape of the diagram is often clearly recognizable. Features with a *small* curvature have more evenly distributed feature values, which leads to *flatter* peaks in bar charts and histograms. Features with a *large* curvature have especially many values near the center and some very distant values, which leads to *steeper* peaks (combined with some “sparsely populated” values or classes at the edge), as the following example shows.

Figure 3.15: Example: Features with different empirical curvature / kurtosis

For the usefulness and interpretation of the above comparison, it is important that both features have identical (arithmetic) means and variances, and that the scaling of the \(x\) axis has been chosen so that the sparsely populated classes at the outer edge (recognizable only on closer inspection) are just still shown. The “proximity” of the original list entries to the center used in the explanation above must therefore always be assessed relative to the variance or standard deviation of the feature. This is also reflected (otherwise analogous to the definition of skewness) in the calculation rule of Definition 3.8 below.

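**Definition 3.8 (empirical curvature, kurtosis)**

Let \(X\) be a feature with original list entries \(x_1, \ldots, x_n\), arithmetic mean \(\overline{x}\) and empirical standard deviation \(s > 0\). Then \[\text{kurtosis}(X) = \frac{1}{n} \sum_{i=1}^n \left( \frac{x_i - \overline{x}}{s} \right)^4\] is called the **empirical curvature (kurtosis)** of \(X\).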

For the quantitative classification of a calculated *curvature* or *kurtosis*, it should first be noted that the kurtosis is not only obviously always nonnegative; it can also be shown that the kurtosis of a feature is always at least \(1\) and can otherwise, in principle, be arbitrarily large. A special value of the kurtosis is \(3\) (why this is the case will become somewhat clearer later in the course), which forms the boundary between low and high kurtosis. Taking this boundary into account, the following characterization of features by their curvature or kurtosis is common:

A feature \(X\) is called

- **platykurtic** or **flat-topped** if \(1 \le \text{kurtosis}(X) < 3\), and
- **leptokurtic** or **steep-peaked** if \(\text{kurtosis}(X) > 3\).
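A minimal Python sketch of this classification follows, assuming the calculation rule "mean of the fourth powers of \((x_i - \overline{x})/s\)" with the \(1/n\) form of the variance. The names `empirical_kurtosis` and `classify`, the label for the boundary value and the sample lists are illustrative assumptions, not part of the original notes.

```python
import math

def empirical_kurtosis(values):
    """Mean of the fourth powers of the standardized deviations (1/n convention)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sum(((x - mean) / s) ** 4 for x in values) / n

def classify(kurtosis):
    """Apply the boundary value 3 between low and high kurtosis."""
    if kurtosis < 3:
        return "platykurtic"
    if kurtosis > 3:
        return "leptokurtic"
    return "exactly at the boundary"   # the text names no term for this case

two_point = [-1, 1]                    # all mass at equal distance from the mean
even = [1, 2, 3, 4, 5, 6, 7, 8, 9]     # evenly spread values
peaked = [-10] + [0] * 8 + [10]        # many central values, two distant ones

print(empirical_kurtosis(two_point))   # 1.0, the smallest possible value
for data in (even, peaked):
    k = empirical_kurtosis(data)
    print(round(k, 2), classify(k))    # about 1.77 and about 5
```

The two-point list shows that the lower bound \(1\) is actually attained, and the third list shows how a few distant values push the kurtosis above \(3\).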

**Attention:** Care is always required when interpreting a reported kurtosis. Instead of the (actual) kurtosis, it is also widespread to report the so-called *excess kurtosis* (or, for short, the *excess*), which is calculated as \(\text{kurtosis}(X) - 3\), and unfortunately this is occasionally just called *kurtosis*, which creates considerable potential for confusion.
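Because the actual kurtosis is always at least \(1\), a reported value below \(1\), and in particular any negative value, can only be an excess. The following sketch makes the two conventions explicit; the function names are hypothetical helpers introduced here for illustration.

```python
def excess_from_kurtosis(kurtosis):
    """Excess kurtosis: the kurtosis shifted so that the boundary value 3 becomes 0."""
    return kurtosis - 3

def kurtosis_from_excess(excess):
    """Undo the shift to recover the actual kurtosis."""
    return excess + 3

def must_be_excess(reported):
    """A value below 1 cannot be an actual kurtosis, so it must be an excess."""
    return reported < 1

print(round(excess_from_kurtosis(1.77), 2))  # -1.23: platykurtic, negative excess
print(kurtosis_from_excess(2.0))             # 5.0: leptokurtic
print(must_be_excess(-1.23))                 # True
print(must_be_excess(2.0))                   # False: ambiguous without documentation
```

A negative reported "kurtosis" therefore indicates a platykurtic feature under the excess convention. For values of \(1\) or more, only the documentation of the software can settle which convention applies; `scipy.stats.kurtosis`, for example, returns the excess with its default setting `fisher=True`.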

For a better understanding, it is worth taking a closer look, in retrospect, at the calculation rules not only for the empirical kurtosis but also for the empirical skewness and variance. A comparison of the corresponding formulas \[s^2 = \frac{1}{n} \sum_{i=1}^n \left( x_i - \overline{x} \right)^2, \quad \text{skewness}(X) = \frac{1}{n} \sum_{i=1}^n \left( \frac{x_i - \overline{x}}{s} \right)^3, \quad \text{kurtosis}(X) = \frac{1}{n} \sum_{i=1}^n \left( \frac{x_i - \overline{x}}{s} \right)^4\] makes the analogy visible: essentially, the empirical variance, skewness and kurtosis are means of the second, third and fourth powers of the deviations of the original list entries from \(\overline{x}\), where in the formulas for the empirical skewness and kurtosis the spread of the original list entries has additionally been “factored out”.

It is easy to see that this “factoring out” is essential for a sensible use of these key figures. If, for example, all entries in the original list are multiplied by the factor \(2\), then an “unstandardized” version of the empirical skewness would grow by the factor \(8\) and an “unstandardized” version of the empirical kurtosis by the factor \(16\), because the standard deviation \(s\), which also grows by the factor \(2\), would be missing with its corresponding power in the denominator of the calculation rules. Yet this “stretching” of the original list only changes the scaling of the \(x\) axis in the bar chart or histogram; the shape in terms of skewness or curvature remains completely unchanged and should therefore not be accompanied by different values of the associated key figures.
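The factors \(8 = 2^3\) and \(16 = 2^4\) can be reproduced with a short Python sketch. The helper names `central_moment` and `standardized_moment` and the sample list are assumptions made for this illustration; "unstandardized" is taken to mean the plain mean of the \(k\)-th powers of the deviations from the mean.

```python
import math

def central_moment(values, k):
    """Unstandardized version: mean of the k-th powers of the deviations."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n

def standardized_moment(values, k):
    """Same mean of powers, with the spread divided out via s**k."""
    s = math.sqrt(central_moment(values, 2))
    return central_moment(values, k) / s ** k

original = [1, 1, 1, 2, 2, 3, 7]
stretched = [2 * x for x in original]   # every entry multiplied by 2

for k in (3, 4):
    ratio = central_moment(stretched, k) / central_moment(original, k)
    print(k, ratio)                     # factor 2**k: 8 for k=3, 16 for k=4
    print(math.isclose(standardized_moment(stretched, k),
                       standardized_moment(original, k)))   # True
```

Dividing by \(s^k\) cancels the factor \(2^k\) exactly, so the standardized skewness and kurtosis do not change under stretching.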

To make the empirical skewness and kurtosis of features with different spread comparable, standardizing the (already “centered”) original list entries, by dividing by the corresponding power of \(s\), is therefore unavoidable. Centering and then factoring out the spread (which can often also be interpreted as factoring out the unit) is called **standardization** (also **Studentization**). Starting from a feature \(X\) with original list entries \(x_i\), \(i \in \{1, \ldots, n\}\), arithmetic mean \(\overline{x}\) and empirical standard deviation \(s\), one defines the **standardized feature** \(Y\) with the original list entries \[y_i = \frac{x_i - \overline{x}}{s}, \qquad i \in \{1, \ldots, n\},\] which has arithmetic mean \(0\) and empirical standard deviation (as well as variance) \(1\), as is easily verified.
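The recalculation takes two short steps and uses only the definitions of \(\overline{x}\) and \(s^2\) from this section. Here \(s > 0\) is assumed, and \(s_Y^2\) is notation introduced only for this check to denote the empirical variance of \(Y\):

\[\overline{y} = \frac{1}{n} \sum_{i=1}^n \frac{x_i - \overline{x}}{s} = \frac{1}{s} \left( \frac{1}{n} \sum_{i=1}^n x_i - \overline{x} \right) = 0, \qquad s_Y^2 = \frac{1}{n} \sum_{i=1}^n \left( y_i - \overline{y} \right)^2 = \frac{1}{s^2} \cdot \frac{1}{n} \sum_{i=1}^n \left( x_i - \overline{x} \right)^2 = \frac{s^2}{s^2} = 1.\]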

Using this standardized feature \ (Y \), the empirical skewness and kurtosis can then be represented in the following particularly simple form:

\(\text{skewness}(X) = \overline{y^3} := \frac{1}{n} \sum_{i=1}^n y_i^3\)

\(\text{kurtosis}(X) = \overline{y^4} := \frac{1}{n} \sum_{i=1}^n y_i^4\)

### Skew and curvature in graphical representations

In this section, we summarize how, even without calculating the corresponding key figures from the original list, at least a tendency regarding skewness and curvature can be read off from the common graphical representations of features (such as box plots and histograms).

First of all, symmetrical features always produce symmetrical box plots. For example, the original list \[1, 2, 3, 4, 5, 6, 7, 8, 9,\] which is symmetric about \(5\), yields the following box plot:

Figure 3.16: Example: box plot of a symmetrical feature

If a feature is asymmetrical, the type of asymmetry or skew can be identified as follows:

- For **left-steep** features, the **right / upper part** (right / upper part of the box and right / upper whisker) *tends to* have a **larger** extent than the left / lower part.
- For **right-steep** features, the **right / upper part** (right / upper part of the box and right / upper whisker) *tends to* have a **smaller** extent than the left / lower part.

Box plots are particularly well suited for assessing the empirical curvature / kurtosis, since the *outliers* that are characteristic of leptokurtic features (those with a large kurtosis) have to be drawn separately here and are therefore easy to recognize. The following applies:

- Features with **larger** empirical kurtosis tend to have **many outliers**, i.e. separately drawn feature values outside the whiskers (on at least one side, depending on the skew).
- Features with **smaller** empirical kurtosis often have **few** or **no outliers at all**.
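The connection between kurtosis and outliers can be tried out with the following Python sketch. It rests on two assumptions that this section does not state: whiskers reach at most \(1.5\) times the interquartile range beyond the quartiles, and quartiles are computed with the `"inclusive"` method of `statistics.quantiles`. The function name `boxplot_outliers` and both sample lists are illustrative.

```python
from statistics import quantiles

def boxplot_outliers(values, whisker_factor=1.5):
    """Values outside [Q1 - f*IQR, Q3 + f*IQR] (assumed whisker convention)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - whisker_factor * iqr
    upper = q3 + whisker_factor * iqr
    return [x for x in values if x < lower or x > upper]

flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]             # kurtosis about 1.77
peaked = [-12, -1, -1, 0, 0, 0, 0, 1, 1, 12]   # kurtosis about 4.86

print(boxplot_outliers(flat))     # []
print(boxplot_outliers(peaked))   # [-12, 12]
```

With a different whisker or quartile convention the exact fences change, but the tendency remains: the distant values that drive the kurtosis up are the ones drawn as separate points.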

How the empirical kurtosis can be assessed from histograms was already described when the term *curvature* was introduced. The skewness can also be assessed easily, since one proceeds essentially as with bar charts: the difference between **left-steep** and **right-steep** can be told from whether the **rise or fall of the frequency densities** is **steeper** on the left or on the right flank of the “peak”.

Finally, the following figures compare examples of histograms and (matching) box plots for characteristics of different empirical skewness and curvature.

Figure 3.17: Example: Histograms for different empirical skewness / kurtosis

Figure 3.18: Example: box plots for different empirical skewness / kurtosis
