Percentiles

What Are Percentiles?

Percentiles are a fundamental concept in statistics and data analysis that indicate the relative standing of a particular value within a dataset. Specifically, the k-th percentile is the value below which k percent of the values in the dataset fall. For instance, if a data point is at the 80th percentile, 80% of the observations in the dataset are below that value. Percentiles are widely used to describe how data are distributed and where an individual data point sits within that distribution. They are particularly useful for comparing a single observation against a larger group.

History and Origin

The concept of percentiles as a statistical measure gained prominence in the late 19th century. The English polymath Francis Galton is widely credited with coining the term "percentile" around 1885, as part of his pioneering work in biometrics and statistics.11,10 Galton's contributions were instrumental in developing methods for describing and understanding variation within populations, laying a foundation for the use of percentiles in various scientific and social applications. His work extended the utility of simple averages by providing a more nuanced view of data spread and individual positions within a distribution.

Key Takeaways

  • Percentiles quantify the position of a data point relative to others in a dataset.
  • The k-th percentile is the value below which k percent of the observations fall.
  • The 50th percentile is specifically known as the median, representing the middle value of a dataset.
  • Percentiles are widely applied in fields like finance, education, and health to assess relative performance or status.
  • While intuitive, percentiles have limitations, especially with small sample sizes or highly skewed data.

Formula and Calculation

Calculating a percentile involves ordering the dataset and then identifying the value at a specific position. Several calculation methods exist, and they can give slightly different results on the same data, particularly for small samples. One common approach is as follows:

Let $N$ be the total number of data points in an ordered dataset (from smallest to largest).
To find the $k$-th percentile ($P_k$), first calculate the index ($i$) for that percentile:

$i = \frac{k}{100} \times N$

  • If $i$ is an integer, the $k$-th percentile is the average of the value at the $i$-th position and the value at the $(i+1)$-th position in the ordered dataset.
  • If $i$ is not an integer, round up to the next whole number. The $k$-th percentile is the value at that rounded-up position.

For example, to find the 75th percentile of 20 data points, $i = (75/100) \times 20 = 15$. Because $i$ is an integer, you average the 15th and 16th values in the sorted list. Likewise, for the 70th percentile of 20 data points, $i = (70/100) \times 20 = 14$, so you average the 14th and 15th values. For the 72nd percentile of the same 20 points, $i = (72/100) \times 20 = 14.4$, which is not an integer, so you round up to 15 and take the 15th value. Whichever rule applies, the data must be sorted first.
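The index rule described in this section can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `percentile` and the sample data (the integers 1 to 20) are assumptions made for the example, and statistical packages may use other interpolation methods that return slightly different values.

```python
def percentile(values, k):
    """Return the k-th percentile (0 < k < 100) using the index rule above.

    The function name and argument checks are illustrative assumptions.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < k < 100:
        raise ValueError("k must be strictly between 0 and 100")

    ordered = sorted(values)
    n = len(ordered)

    # i = (k / 100) * N, computed as a quotient and remainder so that an
    # integer k gives an exact test for "i is an integer".
    whole, remainder = divmod(k * n, 100)
    whole = int(whole)

    if remainder == 0:
        # i is an integer: average the i-th and (i+1)-th values (1-based).
        return (ordered[whole - 1] + ordered[whole]) / 2
    # i is not an integer: round up and take the value at that 1-based position.
    return ordered[whole]


# Hypothetical sample: 20 sorted data points, the integers 1 through 20.
data = list(range(1, 21))
print(percentile(data, 75))  # i = 15, average of 15th and 16th values: 15.5
print(percentile(data, 70))  # i = 14, average of 14th and 15th values: 14.5
print(percentile(data, 72))  # i = 14.4, rounds up to the 15th value: 15
```

Computing the index with `divmod` rather than floating-point multiplication avoids cases where a product that should be a whole number comes out fractionally above or below it.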

Interpreting Percentiles

Interpreting percentiles provides valuable context beyond raw data points. If an investment portfolio's annual return is at the 90th percentile among similar portfolios, its return outperformed 90% of those peer portfolios. This interpretation doesn't state the absolute return, but rather its relative strength within a group. Conversely, a value at the 20th percentile is higher than only about 20% of the dataset, so roughly 80% of the values lie at or above it.

In finance, percentiles are crucial for understanding the performance of assets, funds, or even salaries. For instance, comparing your income to national income percentiles can provide insight into your relative economic standing. The 50th percentile, or median, is a particularly important benchmark as it divides the data exactly in half, providing a clear central point, distinct from the mean, which can be skewed by outlier values.
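Reading a percentile in this direction, from a value to its standing in a group, amounts to counting how many peers fall below it. The sketch below assumes a "strictly below" convention, and the function name `percentile_rank` and the peer returns are hypothetical values chosen for illustration.

```python
def percentile_rank(peer_values, value):
    """Return the percentage of peer values strictly below `value`."""
    if not peer_values:
        raise ValueError("peer_values must not be empty")
    below = sum(1 for v in peer_values if v < value)
    return 100.0 * below / len(peer_values)


# Hypothetical annual returns (in percent) of ten peer portfolios.
peer_returns = [3.1, 4.0, 4.4, 5.2, 5.9, 6.3, 7.0, 7.8, 8.5, 12.0]

# A portfolio returning 9.0% beat nine of the ten peers.
print(percentile_rank(peer_returns, 9.0))  # 90.0
```

Other conventions exist, such as counting values at or below the observation, and they can shift the rank when ties are present.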

Hypothetical Example

Consider a hypothetical scenario where an investor, Maria, is evaluating the annual returns of 100 mutual funds to identify top performers for her portfolio management strategy. She collects the following annual returns (simplified for brevity, sorted in ascending order):

2.1%, 2.5%, 2.8%, ..., 8.0%, 8.2%, ..., 15.0%

Suppose Maria wants to find the 75th percentile return to identify funds that performed better than three-quarters of the group.

  1. Order the data: The returns are already sorted.
  2. Calculate the index: For the 75th percentile ($k=75$) and $N=100$ funds:
    $i = (75/100) \times 100 = 75$
  3. Find the percentile value: Since the index is an integer, the 75th percentile is the average of the 75th and 76th values in the sorted list. Let's say the 75th fund returned 9.8% and the 76th fund returned 10.0%.
    $P_{75} = (9.8\% + 10.0\%) / 2 = 9.9\%$

This means that 75% of the mutual funds had an annual return of 9.9% or less. Maria can use this benchmark to select funds performing in the top quartile of the analyzed group.
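Maria's calculation and the screening step that follows it can be sketched as below. Only the 75th and 76th returns come from the example; the candidate fund names and their returns are hypothetical placeholders added to show how the threshold might be applied.

```python
# Figures taken from the example: 100 funds, 75th percentile.
n_funds = 100
k = 75
return_75th = 9.8   # percent, 75th value in the sorted list
return_76th = 10.0  # percent, 76th value in the sorted list

# Step 2: the index i = (k / 100) * N, kept exact with integer arithmetic.
index, remainder = divmod(k * n_funds, 100)

# Step 3: the index is an integer, so average the 75th and 76th values.
p75 = (return_75th + return_76th) / 2
print(f"index = {index}, P75 = {p75:.1f}%")

# Hypothetical candidates: keep only funds whose return exceeds the threshold.
candidates = {"Fund A": 11.2, "Fund B": 9.5, "Fund C": 7.4, "Fund D": 10.0}
top_quartile = [name for name, ret in candidates.items() if ret > p75]
print(top_quartile)
```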

Practical Applications

Percentiles have extensive practical applications across various financial and economic domains:

  • Investment Performance: Fund managers and investors use percentiles to compare the returns of a specific fund against its peer group. For example, a fund ranking in the 90th percentile outperformed 90% of similar funds, which gives a clear measure of relative performance.
  • Economic Analysis: Government agencies and researchers often report income and wealth distribution in terms of percentiles to illustrate income inequality and economic disparities. For example, the Federal Reserve provides data on the distribution of household wealth and income across different percentiles to track economic well-being and wealth concentration.9,8 The Federal Reserve's economic data series (FRED) also provides extensive data categorized by income percentiles, which are used as key economic indicators.7,6
  • Risk Management: In risk assessment, value-at-risk (VaR) is often expressed using percentiles. For example, a 99th percentile VaR of $1 million indicates that there is a 1% chance of losing $1 million or more over a given period.
  • Credit Scoring: Lenders may use percentile ranks to assess a borrower's creditworthiness relative to the broader population, helping to gauge the probability of default.
  • Compensation and Salary Benchmarking: Companies use percentiles to set competitive salary ranges. For instance, offering a salary at the 75th percentile means paying more than 75% of similar positions in the market.
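As a rough illustration of the risk-management use above, the sketch below estimates a 99th percentile VaR from a history of profit-and-loss figures by taking the 99th percentile of the losses. The P&L series is synthetic, the helper `percentile` applies the index rule from the calculation section without input validation, and the approach shown (a simple historical estimate) is one of several ways VaR is computed in practice.

```python
def percentile(values, k):
    """k-th percentile via the index rule; assumes non-empty data and 0 < k < 100."""
    ordered = sorted(values)
    whole, remainder = divmod(k * len(ordered), 100)
    whole = int(whole)
    if remainder == 0:
        return (ordered[whole - 1] + ordered[whole]) / 2
    return ordered[whole]


# Synthetic daily P&L in thousands of dollars: -100, -99, ..., 99 (200 days).
pnl = list(range(-100, 100))

# Express losses as positive numbers so that large losses sit in the upper tail.
losses = [-p for p in pnl]

var_99 = percentile(losses, 99)
exceed_share = sum(1 for loss in losses if loss >= var_99) / len(losses)

print(var_99)        # 98.5
print(exceed_share)  # 0.01, i.e. losses at or beyond VaR on 1% of days
```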

Limitations and Criticisms

While percentiles offer intuitive insights into data distribution, they also have limitations. One significant criticism is that converting raw data into percentile ranks can lead to a loss of information, transforming ratio-scale data into ordinal-scale data.5,4 This means that while percentiles tell you the relative order of values, they do not convey the magnitude of the differences between them. For example, the difference in underlying values between the 50th and 51st percentile might be vastly different from the difference between the 98th and 99th percentile, especially in highly skewed distributions.3

Furthermore, percentiles can be sensitive to the sample size; in small datasets, a single outlier can disproportionately affect percentile values.2 This sensitivity can make percentiles less representative of the true underlying data distribution in limited observations. They also do not inherently provide information about the shape of the data's distribution, such as its skewness or kurtosis.1 For a comprehensive data analysis, percentiles should ideally be used in conjunction with other statistical measures, such as the mean and standard deviation.
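The point about unequal gaps between adjacent percentiles can be seen in a small sketch. The data here are synthetic (the cubes of 1 to 100, chosen only because they are strongly right-skewed), and the helper `percentile` is an assumed implementation of the index rule from the calculation section.

```python
def percentile(values, k):
    """k-th percentile via the index rule; assumes non-empty data and 0 < k < 100."""
    ordered = sorted(values)
    whole, remainder = divmod(k * len(ordered), 100)
    whole = int(whole)
    if remainder == 0:
        return (ordered[whole - 1] + ordered[whole]) / 2
    return ordered[whole]


# Synthetic right-skewed dataset: 1^3, 2^3, ..., 100^3.
skewed = [x ** 3 for x in range(1, 101)]

# One percentile step near the middle versus one step near the top.
gap_middle = percentile(skewed, 51) - percentile(skewed, 50)
gap_top = percentile(skewed, 99) - percentile(skewed, 98)

print(gap_middle)  # 7804.0
print(gap_top)     # 29404.0
```

Both gaps correspond to a single percentile step, yet the step near the top covers almost four times as much of the underlying scale, which is the information a percentile rank alone does not convey.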

Percentiles vs. Quartiles

The terms percentiles and quartiles are closely related and often confused, but they represent different divisions of a dataset.

| Feature | Percentiles | Quartiles |
|---|---|---|
| Definition | Divide a dataset into 100 equal parts. | Divide a dataset into 4 equal parts. |
| Number of Divisions | 99 specific percentile values (P1 to P99). | 3 specific quartile values (Q1, Q2, Q3). |
| Relationship | Quartiles are specific percentiles: <br> • Q1 = 25th percentile <br> • Q2 (Median) = 50th percentile <br> • Q3 = 75th percentile | A quartile represents a broader range of values, encompassing 25% of the data. |
| Granularity | Offers a more granular view of data distribution. | Provides a simpler, broader summary of data spread. |

Both are types of quantiles, which are values that divide a dataset into equal-sized subgroups. While quartiles offer a quick snapshot of the data's spread across four main sections, percentiles give a more detailed view of where a value sits within the entire range of data.
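The relationship between the two can be checked directly by computing the 25th, 50th, and 75th percentiles. In this sketch the helper `percentile` is an assumed implementation of the index rule from the calculation section, and the eight data points are hypothetical; the standard-library `statistics.median` is used only to confirm that Q2 matches the median.

```python
import statistics


def percentile(values, k):
    """k-th percentile via the index rule; assumes non-empty data and 0 < k < 100."""
    ordered = sorted(values)
    whole, remainder = divmod(k * len(ordered), 100)
    whole = int(whole)
    if remainder == 0:
        return (ordered[whole - 1] + ordered[whole]) / 2
    return ordered[whole]


# Hypothetical dataset of eight observations.
data = [1, 3, 4, 6, 7, 9, 12, 15]

# Quartiles are just the 25th, 50th, and 75th percentiles.
quartiles = {name: percentile(data, k) for name, k in [("Q1", 25), ("Q2", 50), ("Q3", 75)]}

print(quartiles)                # {'Q1': 3.5, 'Q2': 6.5, 'Q3': 10.5}
print(statistics.median(data))  # 6.5, the same as Q2
```

Other quartile conventions can give different Q1 and Q3 values for the same data, so results from a statistics package may not match this sketch exactly.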

FAQs

What is the 50th percentile?

The 50th percentile is the median of a dataset. It is the value below which 50% of the data points fall, and above which the other 50% fall. It represents the middle point of a sorted dataset.

Can a percentile be negative?

Yes, the value at a given percentile can be negative if the dataset contains negative values. The percentile rank itself always lies between 0 and 100, but the value at that rank keeps the sign of the original data. For example, if you are looking at percentage returns on investments and many are negative, the 25th percentile could be a negative return.

What is the difference between a percentile and a percentage?

A percentile indicates the percentage of observations in a dataset that fall below a specific value. A percentage, by contrast, expresses a part of a whole as a fraction of 100. For example, scoring 80% on a test means you answered 80% of the questions correctly, such as 80 out of 100 (a proportion). Being at the 80th percentile on the same test means you scored higher than 80% of the other test-takers (a relative position within a data distribution).

Why are percentiles useful in finance?

Percentiles are useful in finance for performance measurement, risk assessment, and understanding wealth or income distribution. They allow investors and analysts to benchmark a specific asset, portfolio, or individual's financial standing against a larger group, providing crucial context beyond absolute values.