One of the trendiest words nowadays is “predictions”! You hear or read it in every field. In business, we want to predict sales, customer churn, product costs, etc. To face humanitarian emergencies, we want to predict wars, economic disasters, and natural disasters. And so on.

This is very intuitive, right? Unfortunately, prediction is a vague term.

Prediction definitions

What is a prediction? I asked DeepSeek-R1, which gave me a great, pragmatic answer:

A prediction is:
1. A statement about a future event or outcome.
2. Based on current information, evidence, patterns, or reasoning.
3. Inherently uncertain (as the future hasn't happened yet).

Claude Sonnet 4 gave me a more conversational answer:

A prediction is a statement about what will happen in the future based on current information, patterns, or analysis. It involves making an educated guess or forecast about upcoming events, outcomes, or conditions. Predictions can range from simple expectations to complex statistical models, but they're always uncertain since the future hasn't happened yet.

Cambridge Dictionary simply states:

 A statement about what you think will happen in the future.

The first two definitions don’t really suit oracles’ responses. When I asked this question specifically, both LLMs confirmed that oracles don’t make predictions in that sense. DeepSeek even decided to show me a semi-graphical explanation:

(Yet, no one believed Cassandra…)

But both these definitions still remind me of oracles! Why? Because what they describe is an answer to a question, no more than that. A lot of useful, related information is left out.

Neural networks are oracles

In this respect, neural network predictions are similar to oracles’ responses.

You train a neural network with your sales and other supposedly related data. The neural network analyses those data, finds patterns, and answers with a prediction about sales in the next 6 months: 333, 444, 250, 320, 351, 419.

This prediction might be exceptionally accurate, if the neural network is suitable for this type of prediction and was trained on quality data. And maybe you don’t need any other information. Perhaps this prediction is all you wanted to obtain.

Or maybe not. The prediction might leave you with more questions:

  • How much can you trust these results? Are these numbers almost certain, or are they not much better than wild guesses?
  • What ranges might you reasonably expect, and how wide are they?
  • Which patterns can be identified in the metric’s behaviour? Which seasonalities and cycles are involved? What is the trend? How high is the noise?
  • Which factors affect the prediction? To what extent? Do they affect it positively or negatively?

Neural networks traditionally don’t provide this information. Modern deep learning includes uncertainty quantification methods, but there are important limitations.

However, hybrid models exist. They are halfway between AI and probability. This might be a topic for a future article.

Statistical models

Some statistical and probabilistic models provide the above information, or part of it. Here is some information that statistics can provide, whereas neural networks usually don’t:

  • Covariance: whether two metrics tend to move in the same direction, or in opposite directions, or in uncorrelated ways.
  • How much a metric’s behaviour affects another metric.
  • Confidence intervals (strictly speaking, prediction intervals when the range is about future values): A range that becomes wider as you look farther into the future, together with the probability that values will fall in that range. For example: we have an 85% probability of closing 100-120 contracts in February, and 95-130 in March.
  • P-values: The probability of getting a result at least as extreme as the one in question purely by chance, assuming there is no real effect. Example: we estimate that our marketing campaign causes a 4% increase in sales. A p-value of 0.15 indicates that, if the campaign had no effect at all, there would still be a 15% chance of seeing sales increase by 4% or more, for unrelated reasons.
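
To make two of the items above more concrete, here is a minimal Python sketch that uses only the standard library. All figures are made up for illustration, and the helper names (`covariance`, `correlation`, `permutation_p_value`) are hypothetical, not taken from any library. The p-value is estimated with a permutation test, which is only one of several ways to obtain one.

```python
import random
from statistics import mean


def covariance(xs, ys):
    """Sample covariance: positive when the two series tend to move together."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance rescaled to the range [-1, 1]."""
    return covariance(xs, ys) / (covariance(xs, xs) * covariance(ys, ys)) ** 0.5


def permutation_p_value(control, treated, n_rounds=5_000, seed=0):
    """One-sided p-value estimate: how often does shuffling the group labels
    produce a difference in means at least as large as the observed one?"""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(control) + list(treated)
    hits = 0
    for _ in range(n_rounds):
        rng.shuffle(pooled)
        fake_control = pooled[:len(control)]
        fake_treated = pooled[len(control):]
        if mean(fake_treated) - mean(fake_control) >= observed:
            hits += 1
    return hits / n_rounds


# Made-up figures: monthly ad spend and monthly sales.
ad_spend = [10, 12, 9, 15, 14, 11]
sales = [100, 118, 95, 140, 135, 108]
print("covariance: ", covariance(ad_spend, sales))
print("correlation:", correlation(ad_spend, sales))

# Made-up figures: weekly sales without and with a marketing campaign.
weeks_without_campaign = [100, 104, 98, 101, 97, 103]
weeks_with_campaign = [105, 102, 108, 101, 106, 104]
print("p-value:    ", permutation_p_value(weeks_without_campaign, weeks_with_campaign))
```

A correlation close to 1 says the two series rise and fall together, and a small p-value says that random relabelling rarely reproduces a gap as large as the observed one. Neither number, on its own, proves that one metric causes the other.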

It’s also very common to perform the same prediction using different models. One obvious reason is that people need to reduce the risk of relying on a flawed approach. You can’t know for sure if a model is suitable for your prediction. But different models also capture different patterns, and provide different sets of information.
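
As a hedged sketch of what comparing models can look like in practice, the Python snippet below holds out the last three months of a made-up sales series and scores two deliberately simple forecasters on them. The data, the function names, and the choice of mean absolute error as the score are all assumptions made for illustration; real comparisons usually involve richer models and more careful validation.

```python
from statistics import mean


def naive_forecast(history, horizon):
    """Repeat the last observed value for every future period."""
    return [history[-1]] * horizon


def moving_average_forecast(history, horizon, window=3):
    """Repeat the mean of the last `window` observed values."""
    return [mean(history[-window:])] * horizon


def mean_absolute_error(actual, predicted):
    """Average size of the forecast errors, ignoring their sign."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Made-up monthly sales; the last 3 months are hidden from the models.
monthly_sales = [310, 325, 298, 340, 355, 330, 362, 348, 371, 366, 380, 375]
train, holdout = monthly_sales[:-3], monthly_sales[-3:]

models = {
    "naive": naive_forecast,
    "moving average": moving_average_forecast,
}
scores = {
    name: mean_absolute_error(holdout, model(train, len(holdout)))
    for name, model in models.items()
}

# Lower error is better on this particular holdout period.
for name, error in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: mean absolute error = {error:.2f}")
```

A model that wins on one holdout period may lose on another, which is one more reason to look at several models rather than trusting a single one.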

Statistics is not just about predictions. An often underestimated branch of statistics is descriptive statistics. It summarises key characteristics of a time series. For example:

  • Minimum and maximum values. They indicate what we can expect in corner cases.
  • Percentiles are “more reasonable” minimums and maximums, once we exclude the lowest and highest values. For example, the 95th percentile means 95% of observations are below this value.
  • A range is the “space” between the boundaries we observe.
  • A mean is a summary of a series of values. We all know about the arithmetic mean: the arithmetic mean of 2, 3, and 4 is 3. But the arithmetic mean is not suitable for every series. Many types of means exist, and it’s important to choose the most meaningful one for each particular case.
  • Dispersion measures indicate how much the observed values diverge from the mean. The measure to use depends on the mean we use. For example, for the arithmetic mean one can use the standard deviation or the variance.
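
The summaries listed above can be computed with Python’s standard `statistics` module. This is a minimal sketch on made-up numbers; the series deliberately contains one extreme value (90) to show how differently the summaries react to it. The variable names are illustrative only.

```python
import statistics

# Made-up daily order counts, with one unusually busy day (90).
daily_orders = [12, 15, 11, 14, 90, 13, 16, 12, 15, 14, 13, 17]

lowest, highest = min(daily_orders), max(daily_orders)
value_range = highest - lowest

# quantiles(n=20) returns 19 cut points in steps of 5%:
# the first is the 5th percentile, the last is the 95th.
cuts = statistics.quantiles(daily_orders, n=20, method="inclusive")
p5, p95 = cuts[0], cuts[-1]

arithmetic_mean = statistics.mean(daily_orders)
median = statistics.median(daily_orders)  # a related measure of centre
std_dev = statistics.stdev(daily_orders)
variance = statistics.variance(daily_orders)

# For growth factors, the geometric mean is usually the meaningful average.
growth_factors = [1.10, 0.95, 1.20]
average_growth = statistics.geometric_mean(growth_factors)

print(f"min={lowest} max={highest} range={value_range}")
print(f"5th percentile={p5:.2f} 95th percentile={p95:.2f}")
print(f"mean={arithmetic_mean:.2f} median={median}")
print(f"standard deviation={std_dev:.2f} variance={variance:.2f}")
print(f"average growth factor={average_growth:.4f}")
```

The single busy day pulls the arithmetic mean well above the median, which is a small example of why the choice of summary matters.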

Before predicting the future, you should probably look at current and historical data. This will give you a better understanding of the context.

Why would I care?

Maybe you shouldn’t care. Maybe you don’t have a reason to. Maybe you don’t know how to interpret these data. Maybe certain information, while theoretically relevant, can’t be used to set a course of action. Plus, neural networks usually (not always!) work very well.

That said, statistics is not just theoretical speculation. Here are some examples of how to use such information in business:

  • Standard deviation: Looking at historical data, costs tend to vary by 25%. We expect to spend 15,000 pounds per month in the next 6 months. But let’s be prepared for variations.
  • Covariance and correlation: When the costs of materials increase, materials delivery delays also increase. Let’s be prepared.
  • Regression: There seems to be a relationship between a store’s size and how much each customer spends. A regression alone doesn’t prove causation, but if further checks support it, let’s buy bigger stores.
  • Seasonality: We have a peak of sales every second Tuesday of the month. Whatever the reason, we might want to have more personnel in the stores.
  • Cycle: We have occasional sales drops. After further investigation, these drops seem to occur when big concerts take place in the city. Next time, we might come up with a promotion.
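
To illustrate the regression example, here is a minimal Python sketch of a straight-line fit (ordinary least squares) between store size and spend per customer. The figures are invented and `fit_line` is a hypothetical helper written for this example. The slope describes an association in the data; it does not establish that bigger stores cause higher spending.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for the line y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = (
        sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs)
    )
    intercept = my - slope * mx
    return slope, intercept


# Made-up figures: store size in square metres, average spend per customer.
store_size_m2 = [100, 150, 200, 250, 300]
spend_per_customer = [20.0, 24.0, 27.0, 33.0, 36.0]

slope, intercept = fit_line(store_size_m2, spend_per_customer)
print(f"Each extra square metre is associated with {slope:.3f} more spend per customer.")

# Estimate for a store size inside the observed range.
estimated_spend = intercept + slope * 220
print(f"Estimated spend per customer for a 220 m2 store: {estimated_spend:.2f}")
```

Estimates are most trustworthy inside the range of sizes already observed. Extrapolating to much bigger stores assumes the straight-line relationship keeps holding.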

IMAGE CREDIT AND NOTES:

  • Image itself: DALL·E 3
  • Image concept: Claude Sonnet 4

I had to use Claude for the image concept, because ChatGPT wasn’t able to conceive an image that would represent the article. Claude did a great job composing a fantastic prompt, but unfortunately DALL·E 3’s concept is still not great. The Oracle shouldn’t predict numbers and the word “crytcpedicion” is a bit dull. Still, this blog is partly about the current state of AI, not about art.