UX for AI
Forecasting with Line graphs: a Definitive Guide for Serious UX for AI Practitioners, Part 3

It’s time to move on to the “Eye Meat” -- various ways you can use Line Graphs in your UX for AI designs. Let’s begin with Forecasting.

Greg Nudelman
January 05, 2024

Now that we have the basics out of the way with Part 1 (dos) and Part 2 (don’ts) of the Line Graph series, it’s time to move on to the “Eye Meat” (to borrow a phrase from John Maeda). The next few installments of our newsletter will feature various ways you can use Line Graphs in your UX for AI designs.

Let’s begin with Forecasting.

Forecasting Basics

The practice of forecasting has been around for a very long time. Even the Pharaohs of Egypt 5000 years ago relied on soothsayers, bones, entrails, and the like, and who can forget the religious fervor that sprang up around 1400 BC at the Oracle of Delphi? If you want a cool refresher and like comic books and six-pack abs (and who doesn’t!), take a look at the movie “300”: there is a truly epic scene where the drugged-up young girl is “encouraged” by the ugly, twisted, corrupt, corpse-like priests (the Ephors) to make the Oracular pronouncement that will send King Leonidas and the 300 heroes to their eventual glorious deaths in the Battle of Thermopylae. A truly spectacular example of forecasting! (Just Google “300 - The Ephors & The Oracle.” WARNING: you might get a video rated “Mature” that might not necessarily be, ahem, workplace-appropriate, but I guess that depends on where you do your forecasting…)

Today (fortunately), we no longer need to drug young virgins to get our predictions. The Forecasting UX for AI design pattern actually looks quite simple, almost mundane. Most often, it shows up as a line graph in which a solid line shows the actual collected data, followed by a dashed line showing the forecasted values:

Source: https://exceljet.net/charts/line-chart-actual-with-forecast

Optionally, in addition to the dashed line, you may drop a vertical “now” line as well, as a sort of “you are here” marker to indicate the current date and time, and a “confidence interval” cone (more on that later). Here’s an example of both techniques in a temperature differential forecast:

Source: https://www.climate.gov/media/15588

Why should you, as a UX Designer or Researcher, care about the design of forecasting interfaces? Simply put, it’s one of the most important uses for AI. In addition to forecasting sales and weather, you can use this simple interface design pattern to forecast weight loss/gain on a diet plan, product demand, stock market performance, how long it takes for crops to grow or a pipe to rust, or a septic tank to fill, or for global warming to kill us all… Hopefully, you get the idea. Or better yet — let’s make that your homework!

Come up with three different ways you can use Forecasting to help display predictions in your own project. What variables would you want to forecast and why? How would the forecast affect the decisions of your customers? Is it better to overshoot or undershoot these predictions? Why?

Now let’s dig into the finer points of forecasting. For this, we’ll have to get into the orbit of planet Math, but I assure you, the concepts are quite simple, and an investment of 10 minutes today will empower you to have better conversations with your colleagues for the next 10 years of your career.

Let’s dig in!

Linear Regression

One of the most important forecasting techniques is linear regression. Essentially, the idea is simple: draw a straight line through the available data points. Then, you can use this line to predict the value of Y for any X; no AI is required! The math for this is actually pretty straightforward: after we put a line through the data points, we measure the vertical distance from each data point to the line, called a Residual. Intuitively, it is easy to see that the line that fits best will have the smallest distances from all the data points. These distances (Residuals) are typically squared so that points above and below the line count equally (and big misses count for more), and the best-fit line is the one with the smallest sum of squared Residuals. (This is covered in accessible language in this video: https://youtu.be/nk2CQITm_eo?si=0fgAnCW5PYH_5G3d if you want to dig deeper.)

Source: https://youtu.be/8iqzFQ_nZI8?si=MJrhk59_b-oPyY4F
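If you would like to see the “line with the smallest squared Residuals” idea in action, here is a minimal sketch in plain Python. The monthly sales numbers and the names fit_line and forecast_month_8 are made up for illustration; your Data Science colleagues will almost certainly use a library instead.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y move together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: six months of sales (illustration only).
months = [1, 2, 3, 4, 5, 6]
sales = [10.0, 12.0, 13.0, 15.0, 16.0, 18.0]

slope, intercept = fit_line(months, sales)

# Residual = actual value minus the value on the line.
residuals = [y - (slope * x + intercept) for x, y in zip(months, sales)]
sum_of_squares = sum(r ** 2 for r in residuals)

# The "forecast" is simply the same line extended to a future X.
forecast_month_8 = slope * 8 + intercept

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"sum of squared residuals={sum_of_squares:.3f}")
print(f"forecast for month 8={forecast_month_8:.2f}")
```

The solid line on your graph is the sales list; the dashed line is whatever the fitted line returns for months that have not happened yet.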

R-squared

As a proxy to how well the forecast will work, we can use a standard measure of how well the line we drew matches the existing data points. Intuitively, we can see that the line in Figure A (left) fits “looser” to the data points than the line in Figure B (on the right).

Source: Greg Nudelman

This “fitness” can be measured mathematically and is called R-squared, which is a number between 0 and 1. Again, no fancy AI is needed: the math is pretty straightforward and is explained in this video in an entertaining and accessible manner: https://www.youtube.com/watch?v=bMccdk8EdGo.

The important thing to understand about R-squared is that the closer it is to 1, the better the fit and, therefore, presumably the better the forecast; the lower the number (i.e., the closer R-squared is to 0), the worse the fit and, therefore, the less trustworthy our prediction. R-squared is easy to work with because it reads as a proportion: it is the share of the variation in the data that the line explains, so an R-squared of 0.8 explains twice as much of the variation as an R-squared of 0.4. (I realize this is somewhat confusing: a squared variable that behaves linearly, which is why I thought I’d point that out.) (You’re welcome.)
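For the curious, here is a small sketch of how R-squared can be computed: one minus the ratio of the squared Residuals to the total squared variation around the average. The two sets of predictions (“tight” and “loose”) are invented numbers standing in for Figures B and A; the function name r_squared is my own.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that the predictions explain."""
    mean_actual = sum(actual) / len(actual)
    # Squared misses of the line (what is left unexplained).
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared variation of the data around its own average.
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_residual / ss_total


# Hypothetical data points and two hypothetical fitted lines.
actual = [10.0, 12.0, 13.0, 15.0, 16.0, 18.0]
tight_fit = [10.1, 11.9, 13.2, 14.8, 16.1, 17.9]   # hugs the points
loose_fit = [12.0, 10.0, 15.0, 13.0, 18.0, 16.0]   # misses by 2 every time

r_tight = r_squared(actual, tight_fit)
r_loose = r_squared(actual, loose_fit)

print(f"tight fit R-squared: {r_tight:.3f}")
print(f"loose fit R-squared: {r_loose:.3f}")
```

For a least-squares line scored against the data it was fitted on, the result always lands between 0 and 1; predictions that come from somewhere else can score below 0, which is a good thing to ask about if you ever see a negative number.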

To help your customer see how well the prediction fits, you can show the dashed forecast line with the “confidence interval,” which is kind of like a “cone of shame” for a dog who has been recently neutered (we saw a confidence interval earlier in the article, in the graph predicting the El Niño temperature change). The shaded area of the confidence interval marks the supposed limits of where the line (or the dog) may go. The further out we forecast, the more uncertainty we introduce, creating a larger possible space for the line to move. Of course, forecasting is not an exact science, so the confidence interval (like a cone with most dogs) is more of an idea than a rule. It is, therefore, meant to be an indication of the increase in uncertainty, not something that is necessarily “set in stone.” However, the confidence interval does provide a helpful, intuitive visual guide as to the possible goodness of the forecast:

Source: Greg Nudelman

You can read more about confidence interval math here: https://blogs.sas.com/content/sastraining/2013/12/19/how-to-plot-a-forecast-and-confidence-interval/
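To make the widening cone less mysterious, here is a rough sketch of one textbook way to compute it for a straight-line forecast (statisticians usually call this particular band a prediction interval). It assumes the Residuals are roughly bell-shaped and, as a simplification, uses the normal distribution rather than the t-distribution a statistician would prefer for so few points. The data and the name forecast_with_band are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def forecast_with_band(xs, ys, future_xs, level=0.95):
    """Return (x, low, center, high) for each future x of a fitted line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x

    # Typical size of a miss, estimated from the squared residuals.
    ss_residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    spread = sqrt(ss_residual / (n - 2))

    # Assumption: normal approximation (about 1.96 for a 95% band).
    z = NormalDist().inv_cdf(0.5 + level / 2)

    bands = []
    for x0 in future_xs:
        center = slope * x0 + intercept
        # The (x0 - mean_x) term grows as we forecast further out,
        # which is what makes the cone widen.
        half_width = z * spread * sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)
        bands.append((x0, center - half_width, center, center + half_width))
    return bands


# Hypothetical history, then three months into the future.
months = [1, 2, 3, 4, 5, 6]
sales = [10.0, 12.0, 13.0, 15.0, 16.0, 18.0]
bands = forecast_with_band(months, sales, future_xs=[7, 8, 9])

for x0, low, center, high in bands:
    print(f"month {x0}: {low:.2f} to {high:.2f} (center {center:.2f})")
```

The shaded area on the graph is simply the space between the low and high values at each future point.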

R vs. R-squared

One potential disadvantage of R-squared is that it does not indicate the direction of the error (higher or lower): because the Residuals are squared, all we know is the size of the miss. As you might recall from the “Accuracy is Bullshit” article, sometimes over-shooting can have much more serious consequences than under-shooting your forecast, and in those cases, you really do want to account for the direction of the difference.

Imagine, for example, that you are forecasting how much food you would need for a 1-month journey to the North Pole. Do you think that both directions of a forecast error will have the same consequence? If you overestimate the amount of food, you will carry some extra beef jerky 1000 miles, a relatively minor inconvenience. If you underestimate the amount of food, your expedition will starve.
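The expedition example can be sketched in a few lines. The weights, the costs per kilogram, and the name forecast_cost are all invented; the point is only that two forecasts with an identical squared error can carry wildly different consequences once direction is taken into account.

```python
def forecast_cost(forecast, actual, under_cost, over_cost):
    """Cost of a forecast error when each direction is priced differently."""
    error = forecast - actual  # positive = overshoot, negative = undershoot
    if error < 0:
        return -error * under_cost
    return error * over_cost


# Hypothetical expedition: the team actually needs 60 kg of food.
actual_need_kg = 60

# Made-up costs: hauling an extra kg is a nuisance (1),
# being a kg short is a disaster (100).
cost_over = forecast_cost(70, actual_need_kg, under_cost=100, over_cost=1)
cost_under = forecast_cost(50, actual_need_kg, under_cost=100, over_cost=1)

# Squaring the error hides the direction: both misses look identical.
squared_over = (70 - actual_need_kg) ** 2
squared_under = (50 - actual_need_kg) ** 2

print(f"overshoot:  squared error={squared_over}, cost={cost_over}")
print(f"undershoot: squared error={squared_under}, cost={cost_under}")
```

Filling in the real under_cost and over_cost values is exactly the kind of question only your PMs, SMEs, and Customers can answer.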

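R-squared on its own is unsuitable for occasions where over- and undershooting the forecast have different monetary or humanitarian impacts. Note that plain R (the correlation coefficient) will not rescue you here: its sign tells you whether the line slopes up or down, not whether the forecast runs high or low. What you want instead is the signed error (forecast minus actual), weighed by what each direction costs. How do you know what occasion your specific use case represents? Well, naturally, you would ask your PMs, SMEs, and Customers, once again proving Richard Saul Wurman’s maxim: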

❝

While most professions make a living with their knowledge, UX people make a living through their ignorance

Richard Saul Wurman

Meaning, of course, that the quality of our questions really matters.

Forecasting with AI

You might say: “All this is well and good, but you did not tell us anything about how AI could help us with forecasting. It was all just high-school math!” That is true. In many cases, asking better questions means sussing out from your Data Science colleagues just how sophisticated the prediction algorithm really is.

Most times, you will find simple math works just fine, and using AI/ML methods will only complicate things. Really.

However, there are several key forecasting techniques shown as line graphs, where AI/ML methods are pretty much the only way to create an accurate forecast. I have the most personal experience with two of them: non-linear regression and seasonality. Let us cover those two next.

Non-Linear Regression

While a near-term prediction can often be approximated via a straight line and simple math, few things in nature have a true linear relationship across the entire spectrum of data.

For example, here’s a graph of chlorine degradation in a product as a function of the time it spent sitting on a shelf, taken from a paper on non-linear regression techniques: https://www.statgraphics.com/blog/nonlinear_regression

Source: https://www.statgraphics.com/blog/nonlinear_regression

While we can certainly put a straight line through these data points, it should be fairly obvious that a straight line will not be a great fit. This is a case where we will do much better with non-linear regression. Essentially, the same considerations apply to both linear and non-linear regression, except that the graph is not a straight line but a more complex curve, with a longer equation, that best fits the data points and provides the highest R-squared. The best-fit equation is usually determined by some kind of AI/ML algorithm, which tries various standard equation forms to determine which formula creates the best fit.
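Here is a minimal sketch of the “try a curve, compare R-squared” idea. It pits a straight line against one simple curve, an exponential decay, and fits the curve with a common shortcut (fitting a straight line to the logarithm of the values) rather than a full non-linear solver. The shelf-life numbers are invented and are not the data from the paper.

```python
from math import exp, log


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def r_squared(actual, predicted):
    """Share of the variation in `actual` that the predictions explain."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_residual / ss_total


# Hypothetical measurements: chlorine left after N weeks on the shelf.
shelf_weeks = [0, 5, 10, 15, 20, 25, 30]
chlorine = [50.0, 39.0, 30.5, 23.5, 18.5, 14.5, 11.0]

# Candidate 1: straight line.
slope, intercept = fit_line(shelf_weeks, chlorine)
line_pred = [slope * x + intercept for x in shelf_weeks]

# Candidate 2: exponential decay y = a * exp(b * x), fitted via
# the log shortcut (a simplification, labeled as such).
b, log_a = fit_line(shelf_weeks, [log(y) for y in chlorine])
curve_pred = [exp(log_a + b * x) for x in shelf_weeks]

r2_line = r_squared(chlorine, line_pred)
r2_curve = r_squared(chlorine, curve_pred)

print(f"straight line R-squared: {r2_line:.3f}")
print(f"decay curve R-squared:   {r2_curve:.3f}")
```

A real model search would try many more equation forms, but the comparison step looks much like the last two lines.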

The paper goes into a great deal of detail regarding various techniques, so I recommend reading it in its entirety. However, the main takeaway for UX designers who do not want to get into the math is that multiple different equations might work (nearly) equally well, and you should work with your Data Science and Engineering colleagues on the AI/ML approaches to figure out the best fit non-linear model:

Source: https://www.statgraphics.com/blog/nonlinear_regression

One point I want to caution the reader on: just because a model fits the data well does not mean it is the correct one for predicting the next data point. Here is one unfortunate example where the model, while fitting the existing data well, does not match what physically happens in the system: this curve predicts that the amount of chlorine will increase with prolonged shelf life, an obvious hallucination:

Source: https://www.statgraphics.com/blog/nonlinear_regression

Thus, as a UX designer working on non-linear regression forecasting, it is part of your job to ask good questions of Data Scientists, SMEs, and Customers to determine if the curve matches physical reality in order to help the team avoid situations like the one above.
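One of those good questions can even be turned into an automatic check. The sketch below walks a model’s forecast forward and reports the first point where the predicted chlorine level goes up, which cannot happen on a shelf. Both models and all of their coefficients are hypothetical, chosen only so that each tracks a decay early on.

```python
from math import exp


def first_increase(model, xs):
    """Return the first x where the forecast rises, or None if it never does."""
    xs = list(xs)
    for earlier, later in zip(xs, xs[1:]):
        if model(later) > model(earlier):
            return later
    return None


def quadratic_model(weeks):
    # Hypothetical curve: falls at first, then bends back upward.
    return 50.0 - 2.0 * weeks + 0.03 * weeks ** 2


def decay_model(weeks):
    # Hypothetical curve: keeps falling, as chlorine physically must.
    return 50.0 * exp(-0.05 * weeks)


horizon = range(0, 61)  # forecast weeks 0 through 60

quadratic_problem = first_increase(quadratic_model, horizon)
decay_problem = first_increase(decay_model, horizon)

print(f"quadratic model first rises at week: {quadratic_problem}")
print(f"decay model first rises at week:     {decay_problem}")
```

The rule being checked (“this value can only go down”) is domain knowledge, which is precisely what you bring back from your conversations with SMEs and Customers.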

Seasonality

Another common forecasting technique where AI/ML techniques are very helpful is seasonality. Consider typical website traffic: it peaks Monday-Friday during US working hours and drops off every night and on weekends. In addition to the typical weekly variation, there are peak times of increased demand for e-commerce websites, such as Black Friday, Cyber Monday, Labor Day, and other holidays, which occur every year:

Source: https://www.searchenginejournal.com/seo-seasonality-overcoming-dips-during-slow-season/372742/

This type of variation is called Seasonality, and it is very difficult to predict using typical regression methods. The most practical way to account for this type of variation and make accurate demand forecasts is to collect enough history to cover several full cycles and feed it to an ML model. Fortunately, ML usually works quite well in this case.

Again, as a designer, it is your job to understand the underlying forces that drive seasonality so that you can ask good questions about the quality and limitations of your team’s prediction algorithm. Keep in mind not only the weekly but also monthly and yearly seasonality so that the model’s predictions best describe reality. Ask if over-shooting and under-shooting will have the same consequences: most often, they will not!

For example, if you overestimate Cyber Monday traffic, your AWS bill will be slightly higher. If you underestimate the traffic, your whole website will crash, erasing shopping carts and search session information and costing you millions for every minute of outage.

Once you complete your homework and feel that you understand the real-life model and forecast consequences well, set up a quality discussion with your Engineering and Data Science colleagues. Ask how much data was used to build the forecast model: one week’s worth of data or several? Does the model account for yearly trends such as Black Friday? Make sure the data you are collecting meets the seasonality requirements of your use case and that everyone is on the same page with regard to the consequences of under/overshooting your target, then adjust the model accordingly.
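If you want a feel for what “the data shows a weekly cycle” means in numbers, one simple check is to compare the series with a copy of itself shifted by seven days (an autocorrelation). The sketch below uses four weeks of invented daily visits; with real traffic the numbers would be noisier, and your colleagues will have better tools for this.

```python
def autocorrelation(series, lag):
    """How strongly the series resembles itself shifted by `lag` steps."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((v - mean) ** 2 for v in series)
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


# Hypothetical daily visits, Monday through Sunday, repeated for 4 weeks.
one_week = [100, 105, 110, 108, 102, 60, 55]
visits = one_week * 4

weekly_signal = autocorrelation(visits, lag=7)    # same weekday, a week apart
midweek_signal = autocorrelation(visits, lag=3)   # arbitrary 3-day shift

print(f"lag 7 autocorrelation: {weekly_signal:.2f}")
print(f"lag 3 autocorrelation: {midweek_signal:.2f}")
```

A strong value at a shift of 7 and a weak or negative one elsewhere is the signature of weekly seasonality. Note that with only one week of data, the lag-7 check cannot be computed at all, which is one concrete reason to ask how much history went into the model.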

Forecasting an Aggregate Variable

Finally, I’d like to present a use case where a line graph will not be a great option for the forecast. As we discussed in the previous installment of this column (Part 2, “Don’ts”), the use case where we are plotting and forecasting an aggregate variable, like the daily volume of something, is best displayed as a bar graph. Here’s an example where the AI system is used to forecast the weekly pattern of water demand:

Source: https://www.nature.com/articles/s41598-022-17177-0

Using a line graph for this use case would not be a great choice; a bar chart is much better because we are using an aggregate variable, total daily water consumption. Note that this example demonstrates the strong weekly seasonality we touched upon in the previous section.

What’s the use case for this kind of prediction? If you can accurately forecast the demand for water, you can use cheaper methods to pump it, like pumping it to a high storage tank at night to save on the electrical bill or using a cheaper and more reliable but slower low-volume pump to increase efficiency and decrease cost. Another value of this forecast is simply knowing that you have enough water to meet the needs of your constituents.

Now, if your system simply calculated the average week's worth of data and forecasted the same amounts for next week, that would be pretty straightforward, and no AI would need to be involved. However, a more realistic model might include yearly seasonality: perhaps people use more water in the summer or around certain holidays. Our AI-based model can help forecast seasonal demand more accurately.
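That “no AI needed” baseline really is just a few lines. Here is a sketch that averages each weekday across the weeks on hand and repeats the result as next week’s forecast. The consumption figures are invented, and the function assumes the list starts on a Monday and contains whole weeks.

```python
def average_week_forecast(daily_totals):
    """Average each weekday across full weeks; the list starts on a Monday."""
    weeks = len(daily_totals) // 7
    return [
        sum(daily_totals[week * 7 + day] for week in range(weeks)) / weeks
        for day in range(7)
    ]


# Hypothetical total daily water consumption for three weeks (Mon..Sun).
week_1 = [120, 118, 121, 119, 125, 140, 138]
week_2 = [122, 120, 119, 121, 127, 142, 136]
week_3 = [118, 122, 120, 120, 126, 144, 140]

forecast = average_week_forecast(week_1 + week_2 + week_3)

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
for day, value in zip(days, forecast):
    print(f"{day}: {value:.0f}")
```

This is a handy yardstick: if a fancier model cannot beat the average week, it is worth asking what the extra complexity is buying you.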

Things get even more interesting when we train the model to account for environmental factors. Recall that in our previous column, How to Pick a Use Case, we discussed how a smart irrigation system can be used to reduce water consumption. An AI model that “knows” the daily watering requirements of various plants and can take into account environmental factors such as precipitation (rain, fog, dew, etc.), as well as temperature and humidity, would be very useful in forecasting the total watering needs for a field of crops. A model such as this could be used to keep water consumption to a minimum while maintaining crop yield, thereby maximizing profit (while also reducing water waste and pumping costs, complying with industry regulations, and reducing global warming).

Nifty, no?

In the picture below, the actual water consumption is marked in dark blue bars. Lighter purple bars represent the demand forecast produced by our model:

Source: Greg Nudelman

For your homework, think of three different aggregate variables you can forecast for your project. What factors would be ideal to include when training such a model? What data do you have readily available? What data do you still need? Who do you need to ask in order to figure out how to get the missing data? (Remember, you can always start by asking ChatGPT for help!)

In conclusion

In this article, we have covered line-graph and aggregate variable-based forecasting in some detail. While the picture of the forecast itself is often straightforward (a dashed line or bar chart), treating it as “simple” would be a mistake. There are many nuances to consider, and field research and quality conversations with your colleagues are a must.

I hope that we were also able to assure you that as a UX Designer, you don't need a deep understanding of all of the complex math that could be involved, yet your work will be easier, and project outcomes will be better if you at least understand the concepts involved. Learning more about various statistical forecasting methods will help you maximize your value to the team and increase your effectiveness as a UX Designer.

As Robert Sheckley said so well in his incomparable short story, Ask a Foolish Question:

❝

In order to ask a [good] question you must already know most of the answer.

Robert Sheckley

Happy Forecasting!

Greg & Daria
