
AIFP Part I.3: From Idea to AI Models: A Complete Example with AI-Assisted Rapid Prototyping

  • Writer: Jinghong Chen
  • Feb 6
  • 11 min read

In the previous article, we established that AI is really about three things: Data, Model, and Learning. It turns out that if you take a disciplined approach based on this "AI Trinity" and use the AI tools available today wisely, you are already in a position to rapidly prototype systems that can solve real problems.


In this article, I invite you to watch me do exactly that. We will start with a real, urgent problem and come up with an AI-based solution. Then, we will work with AI assistants to turn our idea into a working model, all in less than a day. You will discover that building AI systems is easier than ever before (and, I promise you, it will keep getting easier), and you will see that you can build AI to address real problems, too.



From Problems to AI Solutions


Let's attack a problem that's going to stay relevant for the next 10 years, not a toy problem like handwritten digit recognition or facial expression generation.


You don't need the exact numbers to know that energy is a critical issue for the entire human race: fossil fuels will run out. And we as a race are investing much more in technologies that consume energy (e.g., AI is very energy-hungry) than in technologies that produce or save energy. An alien observer looking at us may well think that the human race is accelerating towards self-destruction.


One way out of this dead end is to use energy more efficiently. One well-known inefficiency is that demand for electricity is cyclic, with high peaks (e.g., people turn on their dishwashers at approximately the same time). To meet these peaks (typically 30%-50% higher than average) and prevent outages, the generators at power stations need to work extra hard at very inefficient operating points (up to 50% less efficient). As a result, most power stations have "back-up" generators that essentially serve peak usage only and are otherwise idle. We are looking at a substantial waste of energy.


One possible solution is to spread usage over a longer time period, for example, having some people start their dishwashers an hour or two later than usual. This relieves pressure on power stations and also lowers homeowners' utility bills where electricity is dynamically priced. While changing people's behavior at scale is impractical, we can introduce smart appliances that decide when to consume electricity based on their own predictions of energy usage, i.e., smart dishwashers that know when to start their one-hour washing job to avoid the peak.


As you can imagine, energy usage patterns shift across seasons, locations, and long periods of time. For our smart appliances to work optimally in a wide range of settings, they need to update their peak-hour estimates accordingly. This presents an opportunity for an AI solution: if we can continuously learn from past electricity usage to predict future usage, we can make sure the system stays applicable to different parts of the world and to the future, even if very different usage patterns emerge.



Data: Find it, Understand it, and Sanity-Check it.


Let's start with the Data for our prototype. We first need to find relevant data, understand whether the data can be used, and finally sanity-check whether the data is of good quality.


Finding Data. We need historical electricity usage data. I typed "Can you find me datasets for household electrical power consumption prediction to build AI systems?" into ChatGPT, and it replied with the Individual Household Electric Power Consumption dataset from the UCI Machine Learning Repository. It is also one of the top results on Google. Most of the time, finding prototype data is easier than you expect.


Understand Data. You may think you need coding skills to access a dataset, let alone understand it. The name "dataset" seems to suggest some technical prerequisites. This may have been true five years ago, but not today. If you go to the website, you will find that the dataset is just a standalone txt file. In truth, it's really a big table (i.e., like an Excel spreadsheet) stored as a text file. You can easily ask an AI assistant to turn it into a table and print a few lines for you. That's what I did with Claude, and here's what I got. (Don't worry, I will provide the list of all AI tools used at the end.)


| Timestamp        | Total Power (kW) | Kitchen (Wh) | Laundry (Wh) | Water/AC (Wh) |
| ---------------- | ---------------- | ------------ | ------------ | ------------- |
| 2006-12-16 17:00 | 4.223            | 0.000        | 0.528        | 16.861        |
| 2006-12-16 18:00 | 3.632            | 0.000        | 6.717        | 16.867        |


From these few lines it is apparent that the dataset contains exactly what we want: total power usage tagged with the date and time of the measurement. It also contains sub-meter data that further breaks down power usage. To keep the prototype focused, let's restrict our attention to Total Power for now.
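For readers who want to see what that looks like in practice, here is a minimal sketch of the kind of loading code an AI assistant might produce. It assumes the raw file has been downloaded as household_power_consumption.txt and uses pandas; the column names follow the published dataset, but treat the details as illustrative rather than Claude's exact output.

```python
# A rough sketch (not Claude's exact code): load the raw UCI text file into a
# table with pandas and resample it to hourly readings.
import pandas as pd

df = pd.read_csv(
    "household_power_consumption.txt",  # assumed download location
    sep=";",                             # values are semicolon-separated
    na_values="?",                       # missing readings are marked with "?"
    low_memory=False,
)

# Combine the separate Date and Time columns into a single timestamp index.
df["Timestamp"] = pd.to_datetime(df["Date"] + " " + df["Time"],
                                 format="%d/%m/%Y %H:%M:%S")
df = df.set_index("Timestamp").drop(columns=["Date", "Time"])

# The raw file has one reading per minute; average the readings into hourly
# values and print a few rows, similar to the table above.
hourly = df.resample("1h").mean(numeric_only=True)
print(hourly[["Global_active_power",      # total power (kW)
              "Sub_metering_1",           # kitchen (Wh)
              "Sub_metering_2",           # laundry (Wh)
              "Sub_metering_3"]].head())  # water heater / AC (Wh)
```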


Sanity-Check Data. We are not done yet! We need to check that our data makes sense and is not just random noise. From common sense, we know that there should be seasonal, weekly, and daily cycles in power usage. Let's see if our dataset shows these patterns. To produce the following statistics and graphs, I again instructed Claude to write the code for me and did not need to write a single line myself.


Let's first see if electricity usage is indeed peaky by comparing the average, maximum, and minimum usage:

Average daily consumption: 1.09 kilowatts
Maximum daily consumption: 3.31 kilowatts (303% of average)
Minimum daily consumption: 0.17 kilowatts
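For reference, here is a sketch of how such numbers could be computed from the hourly table built above. Interpreting "daily consumption" as the mean power over each day is my assumption; the actual script Claude wrote may differ.

```python
# A sketch of the peakiness check (assumes the `hourly` table from the loading
# sketch above).
daily = hourly["Global_active_power"].resample("1D").mean().dropna()

avg, peak, trough = daily.mean(), daily.max(), daily.min()
print(f"Average daily consumption: {avg:.2f} kilowatts")
print(f"Maximum daily consumption: {peak:.2f} kilowatts ({peak / avg:.0%} of average)")
print(f"Minimum daily consumption: {trough:.2f} kilowatts")
```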

These numbers show the usage is peakier than I originally thought. But this could be due to seasonal change: winter usage is usually much higher than summer usage. So let's look at seasonal usage next:


Monthly power consumption from the UCI Individual Household Electric Power Consumption dataset. The bars show the standard deviation for each month. We observe heavy usage during winter (Nov-Feb) and lighter usage during summer (Jul-Aug).

The monthly chart makes perfect sense: more electricity in winter than in summer. You can imagine that this would be reversed for countries in the Southern Hemisphere. That's why we need learning-based instead of rule-based systems! Let's continue on to daily patterns.



Daily power usage. We see clear peaks in the mornings and at night before bedtime. Weekends also have higher peaks than weekdays.

Looking closely at the data gives you insights about the problem. On weekdays, we can see clear peaks around 8am and around 7pm. These are probably associated with activities like making breakfast and watching television. On weekends, the peaks are higher, and there is more activity overall. What's more, mid-day usage is particularly high on Sundays compared to other days of the week. This makes sense because people tend to stay at home on Sundays.
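The plots themselves are also only a few lines of (AI-written) code away. Here is a rough sketch of how the monthly and hour-of-day summaries could be produced with matplotlib, again building on the hourly table from earlier; the styling of the actual figures above is Claude's and is not reproduced here.

```python
# A rough sketch of the monthly and daily-profile summaries (matplotlib assumed).
import matplotlib.pyplot as plt

power = hourly["Global_active_power"].dropna()
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

# Monthly pattern: mean and standard deviation of usage per calendar month.
monthly = power.groupby(power.index.month).agg(["mean", "std"])
ax1.bar(monthly.index, monthly["mean"], yerr=monthly["std"], capsize=3)
ax1.set(xlabel="Month", ylabel="Power (kW)", title="Monthly consumption")

# Daily profile: average usage at each hour of the day, weekdays vs. weekends.
is_weekend = power.index.dayofweek >= 5
for label, mask in [("Weekday", ~is_weekend), ("Weekend", is_weekend)]:
    profile = power[mask].groupby(power.index[mask].hour).mean()
    ax2.plot(profile.index, profile.values, label=label)
ax2.set(xlabel="Hour of day", ylabel="Power (kW)", title="Daily profile")
ax2.legend()

plt.tight_layout()
plt.show()
```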


So we can conclude that our data is useful for our prototype. In making sense of the data, we also learned more about the problem: for a model to be successful, it has to base its prediction on the month, date, and time. Now we are in a position to move on to modeling.



Model: Choosing Inputs with Sufficient Predictive Power


A model consists of an architecture and parameters. At the moment, you cannot make an educated choice of architecture because you don't yet know the common classes of model architectures (we will cover them in Part II). But don't worry: for prototyping, today's AI knows enough to give you a sound starting point. And there is one thing you control that is more important than which type of model to use: deciding on model inputs that have enough predictive power to solve the problem. Let me explain.


Suppose in our prototype we only feed the model the time of day (e.g., 18:00) without telling it the date. Can we expect the model to perform well? No. As we discovered in the previous section, usage patterns vary substantially between months (i.e., seasons) and between weekdays and weekends. In this case, we say the input does not have enough predictive power to support a good prediction. The model does not have enough information to make an informed judgment; as the proverb goes, "You can't make bricks without straw." What's worse, the model has no means of telling you that it needs more information. All you will observe is that the model struggles to get good at the task, and you might fall into the rabbit hole of fixing the modeling approach when it is really the inputs that need fixing. But if you understand the problem, you are immune to this type of error. That's the benefit of starting with data inspection to understand the problem!


We now know the model should definitely take in the date and time to predict total power usage. We will also feed in the usage of the previous hours as input. Although peaks can sometimes appear quite abruptly, usage in the previous hour is generally a good indicator of the next hour. This is again something we learned by inspecting the daily usage figure.
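To make this concrete, here is a sketch of how those inputs could be assembled into a feature matrix. The specific feature set (month, day of week, hour, plus the last few hourly readings) follows the reasoning above, but the lag length is my illustrative choice; Claude's generated code may organize this differently.

```python
# A sketch of assembling the model inputs: month, day of week, hour of day,
# plus the last few hourly readings. Assumes the `hourly` table from earlier.
import numpy as np

power = hourly["Global_active_power"].interpolate().dropna()
N_LAGS = 3  # how many previous hourly readings to feed the model (my choice)

features, targets = [], []
for i in range(N_LAGS, len(power)):
    ts = power.index[i]
    calendar = [ts.month, ts.dayofweek, ts.hour]  # when we are predicting for
    lagged = power.values[i - N_LAGS:i]           # usage over the previous hours
    features.append(np.concatenate([calendar, lagged]))
    targets.append(power.values[i])               # usage we want to predict

X = np.array(features, dtype=np.float32)
y = np.array(targets, dtype=np.float32)
print(X.shape, y.shape)  # (num_examples, 3 + N_LAGS) and (num_examples,)
```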


Now it's time to call up our AI coder to write runnable code for us. To instruct Claude, I simply fed it this section of the article and added:


"Let's train a simple neural network for regression on the UCI dataset which has already been downloaded. Please use a subsample of the dataset and give me training and evaluation code. "


After 30 seconds, the code was ready to run! But hold on, how do we know whether training works and we actually have a good model? In other words, we will learn very little even if the code runs, because we have not decided how to test the model. In our AI Trinity, testing is part of the learning process, which we will look at next.



Learning: Correct Evaluation is Half the Success.


We know that learning consists of four basic steps. To refresh your memory with our learning-to-use-ovens analogy, the four steps are Forward (bake), Evaluate (taste), Update (note), and Repeat. These are the basic units that form any learning algorithm. They are also the general steps for developing the final system, which may involve training multiple models and comparing which is best. For now, we are less concerned with the learning process inside training one model (i.e., the inner loop of learning) because (1) you don't yet have the knowledge to modify the inner loop and (2) your AI code writer will have written the algorithm for you along with the model.


We are more interested in the "outer loop" that governs how we build towards the ultimate, best-performing system. The figure below shows how the outer loop works.


The "Outer-Loop" of learning involves multiple training runs of multiple model to get the best-performing system. Developers go thourgh "train-evaluate-update-repeat" cycles similar to the inner-loop of training model ("forward-evaluate-update-repeat")
The "Outer-Loop" of learning involves multiple training runs of multiple model to get the best-performing system. Developers go thourgh "train-evaluate-update-repeat" cycles similar to the inner-loop of training model ("forward-evaluate-update-repeat")

In the outer loop, we human developers actively participate in the learning process by updating our approach (to the data, the modeling, or the learning algorithm) based on how well the model is performing. As you can imagine, this process will not work unless we evaluate models correctly. Here's the good news: you can reason about how models should be evaluated using common sense.


In evaluating how well a particular model works, we are generally interested in two things: (1) how well the training process went (Training Monitoring) and (2) how well the system performs on the final task we are interested in (End-to-End Performance).


Training monitoring is relatively straightforward. For any training algorithm, there will be some metrics to monitor, the most common being the "loss value", which describes how well the model is achieving its training objective. The smaller, the better, as the value represents a "loss". Here are the numbers from training the simple neural network Claude wrote:


Training new model...
Epoch [10/50], Loss: 0.3298
Epoch [20/50], Loss: 0.3161
Epoch [30/50], Loss: 0.3128
Epoch [40/50], Loss: 0.3118
Epoch [50/50], Loss: 0.3074

The epoch number is equivalent to "how many times the model has seen the whole training set". We can see that as the model learns, the loss value keeps decreasing. This means that our model is achieving the learning objective on the training set. That's a promising sign!


Is it enough to just look at the loss value and conclude that we have a working model? Unfortunately, no. The model can learn well on the training set but still perform badly when faced with actual problems, which are new scenarios not present in the training set. It is easy to see why: an (unfortunate) student can do very well on the exercise book but very poorly in the actual exam. The same applies to our AI learner. That's why we always reserve some data for testing the model that is not used for training. Conventionally, we call this reserved data the "validation split" or the "test split" of the dataset and refer to the data used for training as the "train split".
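In code, reserving a test split can be as simple as the sketch below. For time-series data like ours, the split is usually chronological so the model is evaluated on a "future" it has never seen; the 720-hour window matches the test set quoted in the next paragraph, but the exact split point is my assumption.

```python
# A sketch of a chronological train/test split on the feature arrays from earlier.
TEST_HOURS = 720  # reserve the last 720 hours (30 days) for testing

X_train, X_test = X[:-TEST_HOURS], X[-TEST_HOURS:]
y_train, y_test = y[:-TEST_HOURS], y[-TEST_HOURS:]
```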


Now let's look at end-to-end performance, that is, how well the model does on the actual task. We've designed our model to predict electricity usage so that peak hours can be avoided. In other words, the end task is whether peak hours are correctly identified. If we set the threshold defining a peak hour at 2.331 kW, there are 72 peak hours to be identified out of the 720 hours of test data. This pick-one-from-ten task is not trivial. Yet our trained model achieved an accuracy of 89.4%, which is quite impressive given it was built in less than a day.
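Here is a sketch of how such an end-to-end check could be computed from the trained model and the test split. The 2.331 kW threshold is the one quoted above; since the exact metric behind the 89.4% figure isn't spelled out, the sketch reports both overall per-hour agreement and the fraction of true peaks that are caught.

```python
# A sketch of the end-to-end check: does the model flag the same hours as
# "peak" as the ground truth does? Assumes the trained model and test split
# from the earlier sketches.
import torch

PEAK_THRESHOLD = 2.331  # kW, as quoted in the article

with torch.no_grad():
    preds = model(torch.from_numpy(X_test)).squeeze(1).numpy()

pred_peak = preds >= PEAK_THRESHOLD
true_peak = y_test >= PEAK_THRESHOLD

agreement = (pred_peak == true_peak).mean()               # over all test hours
caught = (pred_peak & true_peak).sum() / true_peak.sum()  # true peaks identified
print(f"Peak hours in test data: {true_peak.sum()} / {len(true_peak)}")
print(f"Per-hour agreement: {agreement:.1%}, peaks caught: {caught:.1%}")
```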


Below, I've visualized a particular instance from the test split and overlaid the actual usage and the predicted usage (again, with Claude's help). You can see that our model does quite well at predicting the actual usage, though it does have a tendency to under-estimate peaks. But for our purposes, it is more than capable as a prototype!



At this point, both training monitoring and end-to-end performance suggest we have a working system. In general, you would go on to the next step of the outer loop and figure out how to make the model even better. At this stage, however, you will need to leave that part entirely to your engineering friends until you read Part II.



Reflection: An Exciting Era Ahead for Change-Makers


Congratulations! You just watched me build a working AI prototype from start to finish. Now it's time to take a step back and revisit the big picture: why are we doing this?


Rapid prototyping is nothing new, and certainly not new to software development. In the dot-com boom of the late 1990s, Internet startups would hire aggressively once they got funded, pulling in as many capable programmers as they could to prototype rapidly. Coming up with a useful prototype in a matter of days is nothing new.


But today, we are able to prototype at this speed single-handedly. All we need is the right guiding understanding and AI tools that are available to virtually anyone. In the old days, you would need to convince a board of investors to test your idea. Today, you only need the idea and your own commitment. The bar for innovation is lowering at an unprecedented rate. Everyone will be empowered to solve problems. The only challenge remaining is whether you have asked the right questions and understood the real problem.


This example is testimony to exactly that. My biggest contribution, from a problem-solving perspective, was instructing Claude that the model should take the date, time, and usage in the previous hour as input, and should ignore the other data in the dataset. In other words, the most challenging part was understanding the problem and identifying what matters for solving the task. This corresponds to the effort we spent finding, understanding, and sanity-checking the data. And it proved worthwhile. If we hadn't done this, an AI coder could easily miss crucial features or incorporate irrelevant ones, such as the sub-meter readings. And we would be blaming Claude for "not doing its thing" and probably get stuck.


If you look at it this way, you will find that the core of problem-solving has not changed. You still need to understand the problem and come up with ideas. What has changed is how we implement ideas: it has become incredibly easy with AI helpers. This sets the stage for change-makers and, in my opinion, an excitingly creative era ahead.


I hope you enjoyed reading. In the next article, we will make a start on understanding AI technology so you can engage in generating good ideas for modeling and for learning. Stay tuned, and I will see you soon!





 
 
 
