AIFP Part I.1: What exactly is an AI system?
- Jinghong Chen
- Jan 29
- 7 min read
Updated: Jan 30
Welcome to Part I: The Big Picture. In this part, I ask you to be a curious child and imagine seeing an AI system for the first time in your life, much like when you first see an aeroplane. I invite you to forget all pre-existing assumptions about AI, just as you wouldn't assume an aeroplane is made only for flying when you first see it. It could be a device to spread seeds and scare off birds. Why not?

When something first comes into existence, it usually has only the bare minimum of features that define what it is. In the case of the Wright brothers' plane, the defining feature is the giant wings - aircraft work because of the lift they generate. Similarly, a good starting place for understanding what defines an AI system is all the way back to when it all began - to when it was born.
The first useful AI by mankind, in LEGO terms
The DENDRAL system, built at Stanford University in 1965, is widely considered the first AI system with practical use. It was designed to help chemists identify possible molecular structures. We won't dive into chemistry here; instead, I will use an analogous system, in the LEGO world, to explain how DENDRAL works.
Suppose you are a LEGO man living in the LEGO world, where everything around you is built from simple building blocks - cubes, pillars, triangles, various shapes of various sizes. As their creators, we can easily re-make anything in the LEGO world: we just need to break things up and re-arrange the blocks. Sadly, LEGO people don't have the strength to break items up. But they are aware that everything in their world is composed of simple block units, and that there are good items (e.g., ice cream shops) that they want to make as many of as possible. The problem for the LEGO people is: how do you figure out the number and types of blocks needed to build an ice cream shop?
The old way of doing it is to take all the basic units known to the LEGO people and re-make the shop by trying all combinations. This works for smaller items, say, a bus stand. But something like an ice cream shop takes hundreds of blocks, and the LEGO people know there is only one way to make the shop they want (otherwise, the shop will only sell durian-flavour ice creams instead of chocolate ones). They had some success with easier items, but the dreamy ice cream shop remains unattainable, even though the LEGO men and women work hard for it day and night.
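To feel how quickly "trying all combinations" gets out of hand, here is a minimal Python sketch. The block types and item sizes are made-up assumptions for illustration; the point is only that the number of candidates grows exponentially with the number of blocks.

```python
from itertools import product

# Hypothetical inventory of basic block types (names made up for illustration).
BLOCK_TYPES = ["cube", "pillar", "triangle", "sphere"]

def all_combinations(n_blocks):
    """Enumerate every way to fill n_blocks positions with block types."""
    return product(BLOCK_TYPES, repeat=n_blocks)

# A 5-block bus stand: 4**5 = 1,024 candidates - feasible to try.
print(sum(1 for _ in all_combinations(5)))

# A 100-block ice cream shop: 4**100 candidates - hopeless to enumerate.
print(len(BLOCK_TYPES) ** 100)
```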
A group of sympathetic and innovative LEGO researchers come along. They think: surely there's a way to relieve the burden on our fellow people, who have better things to do than trying all combinations. However, building LEGO robots to automate the whole block-guessing, item-building process was impossible. How can the LEGO researchers do better with what they have now?
Here's what the LEGO people have in hand to improve their situation:
They know all the basic building blocks, and they know what the final product should look like.
They have many failed attempts, and some successful ones, at piecing basic units together into useful items.
They know, from past attempts, a set of rules about which blocks go together (e.g., two cubes) and which don't (e.g., a sphere and a triangle).
The third item is incredibly useful because it can rule out some impossible combinations outright. Since it applies to the LEGO world, let's call it "LEGO knowledge". The LEGO researchers think: if there were a way to automatically derive LEGO knowledge from past experiments, we could save big time, because we could exclude numerous infeasible combinations with no effort. Better still, all the experiments are already done and documented. So we just need to figure out the "automatic derivation" part!
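As a toy illustration of what "automatic derivation" might look like, here is a minimal sketch. The attempt records and the rule format (forbidden block pairings) are assumptions made up for this post, not the actual DENDRAL mechanism: the sketch simply flags pairings that appear in failed builds but never in successful ones.

```python
# Hypothetical records of past attempts: the block pairings used in each
# build, plus whether the build succeeded. All data is made up.
past_attempts = [
    ({("cube", "cube"), ("cube", "pillar")}, True),
    ({("cube", "cube"), ("sphere", "triangle")}, False),
    ({("sphere", "triangle"), ("cube", "pillar")}, False),
]

def derive_lego_knowledge(attempts):
    """Derive rules: pairings seen in failures but never in successes."""
    in_success = set().union(*[p for p, ok in attempts if ok])
    in_failure = set().union(*[p for p, ok in attempts if not ok])
    return in_failure - in_success

forbidden = derive_lego_knowledge(past_attempts)
print(forbidden)  # {('sphere', 'triangle')}

def is_feasible(candidate_pairings):
    """Rule out any candidate using a forbidden pairing - no building needed."""
    return not (candidate_pairings & forbidden)
```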
Who would be able to derive knowledge from experience? Humans certainly can. Some smart primates, like chimpanzees, also can. This ability seems to be related to what we consider "intelligent". "Alright," the LEGO researchers arrive at a satisfying name, "this kind of system should be said to possess 'Artificial Intelligence'". But the quest is not straightforward, as even the best of the LEGO people can struggle to derive LEGO knowledge from past experiences.
If you replace LEGO blocks with atoms, ice cream shops with organic molecules, and LEGO knowledge with scientific laws, you pretty much get DENDRAL, the first useful AI built by mankind. DENDRAL was built to help chemists identify complex organic molecules, and it did so by automatically deriving rules about molecular structures from past experimental results to eliminate infeasible starting combinations. You see, even at the onset of AI, we posed very difficult problems for AI systems - chemistry is difficult even for the human mind.
The Defining Feature of AI Systems
We looked at DENDRAL, the first useful AI system, to understand the defining feature of AI systems. The story may seem complicated to interpret, but we are in fact very close to the answer. Now it's time to apply "First Principles".
Consider how an AI system like DENDRAL differs from normal computer software, say, the browser you use to read this post. The browser accepts your inputs (e.g., clicks) and gives you outputs (e.g., webpages) in an expected manner. If you click the "go back" button, you go to the previous page; otherwise you have a bug. In this sense, conventional software is like your car. Sure, it can take you to new places you've never visited before, but it is meant to act according to a pre-existing specification. Any "surprising" behavior is considered "buggy" or "malfunctioning". You don't want your car to surprise you!
But AI systems are different. We expect responses from DENDRAL to potentially be something we don't know when we build it (i.e., chemistry knowledge yet unknown). In a sense, the more unexpected the response, the better, as the goal was to discover unknown scientific laws in the first place. We still have control over the AI's input and range of outputs, but we leave the exact response to the AI system. To us, outputs from AI systems are non-deterministic (or "probabilistic"): we never know for sure until we have observed the output. If deterministic software is like your car, then an AI system is like your pet. Yes, you know your puppy cannot do your maths problems and go to university for you. But within its capabilities, you cannot know for sure what it's going to do next. You would know this if you have ever walked a dog: you never know what might catch its attention.
Is any software system with non-deterministic outputs AI? That surely can't be right. Consider a random number generator: it would be a stretch to say it has any intelligence at all. So we are still missing some defining property. To find it, we need to consider the basis of AI's probabilistic outputs: what makes AI non-deterministic?
Again, let's find out via thought experiments and comparisons. Compare the following two systems in the LEGO world for building an ice cream shop:
System A outputs likely combinations of LEGO blocks by first proposing many combinations and then filtering them by rules that the system learned from past experimental data.
System B outputs likely combinations of LEGO blocks by first proposing many combinations and then filtering them by rules that are pre-determined by human experts.
Which one do you think describes an AI system?
System A essentially describes DENDRAL, whereas System B describes a helpful conventional software tool. The key difference is learning from "past experimental data" (or, more generally, "data"). System A is non-deterministic because we have no definite control over the outcome of its learning. System B is deterministic because we know all the rules in advance. Put another way, the non-determinism of AI systems originates in learning, not in pure chance as with random number generators.
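Here is a minimal sketch of the contrast, reusing the made-up forbidden-pairing rule format from the earlier sketch. Both systems share the same propose-then-filter machinery; the only difference is where the rules come from.

```python
def filter_candidates(candidates, forbidden):
    """Propose-then-filter machinery shared by both systems."""
    return [c for c in candidates if not (c & forbidden)]

# System B: rules fixed by a human expert when the software is written.
# Its behavior is fully predictable - just read the rules.
expert_forbidden = {("sphere", "triangle")}

# System A: rules fall out of a learning step over past data. We only
# know them after learning has run - hence the non-determinism to us.
def learn_forbidden(attempts):
    in_success = set().union(*[p for p, ok in attempts if ok])
    in_failure = set().union(*[p for p, ok in attempts if not ok])
    return in_failure - in_success

candidates = [{("cube", "cube")}, {("sphere", "triangle")}]
print(filter_candidates(candidates, expert_forbidden))   # [{('cube', 'cube')}]

attempts = [({("cube", "cube")}, True), ({("sphere", "triangle")}, False)]
print(filter_candidates(candidates, learn_forbidden(attempts)))  # same here,
# but only because of what the data happened to teach System A
```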
Learning is an abstract term, and we will come to concrete implementations of learning in Part II. But it is really a familiar notion to everyone. At a high level, learning means we change the way we behave based on our experiences (i.e., data). It is the foundation of AI systems and is more fundamental than non-determinism: if something can learn, its behavior can change over time, and thus it is non-deterministic. The golden rule for identifying an AI system is therefore simply: "Does the system learn from data?".
Let's put this golden rule to test:
Your phone's alarm program automatically resets the alarm every day. Does the alarm program learn from data? No. Hence it is not AI.
Your fancy alarm program automatically sets your alarm for the next morning based on your sleeping pattern over the previous days. Does it learn from data? Yes: it learns from your sleeping pattern. Hence it is AI.
And so we can describe the defining feature of AI systems in one concise statement:
An AI system learns from data and produces actions based on what it has learned.
You will see in due course that learning from data is not as high-tech and abstract as you might think. Consider your alarm program. If it simply chooses the time you most often woke up at over the past three weeks, that's learning already, though admittedly a very limited form of it. The point is: AI systems can be very simple in practice.
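Here is a minimal sketch of such an alarm, assuming the phone keeps a simple log of wake-up times (the log below is made up):

```python
from collections import Counter
from datetime import time

# Hypothetical wake-up times logged over the past three weeks.
wake_up_log = [
    time(7, 0), time(7, 0), time(7, 30), time(7, 0),
    time(8, 0), time(7, 0), time(7, 30),
]

def next_alarm(log):
    """'Learning' here is just picking the most frequent wake-up time."""
    most_common_time, _count = Counter(log).most_common(1)[0]
    return most_common_time

print(next_alarm(wake_up_log))  # 07:00:00
```

Feed it different data and the alarm it sets changes: behavior shaped by experience, which is exactly what the golden rule asks about.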
Conclusion: AI (Artificial Intelligence) and Machine Learning (ML)
In this post, we established what makes a system an Artificial Intelligence system: it is able to learn from data. It is no surprise, then, that mentions of "Artificial Intelligence" (AI) are often entangled with "Machine Learning" (ML). AI is all about learning, and many people assume ML is the only way to do AI. But are machines the only option for learning? Can we do "Biological Learning"? I will leave these thoughts to you; feel free to explore and enjoy the benefits of starting from first principles. For practical reasons, though, we will primarily look at ML as the means of building AI systems.
Next time you see an AI product, think about the questions: "What does the AI system learn? From what data?" This often takes you to the core of the business. For example:
Large Language Models learn to write like humans from human-written texts in books, on the Internet, etc.
AI search engines learn, from previous search histories, to retrieve the documents a user wants given their query.
AI recommenders learn to recommend products to a certain individual from that individual's browsing and purchase history.
AI drug discovery systems learn to predict likely molecular structures for a certain drug from previous failed and successful trials.
At this point, you can already reason about AI systems at a very fundamental level. Going forward, we want to ask the more critical, practical question: "Can machines learn the alleged knowledge or capability from the available data?" We will put together a framework for understanding how any AI system works so that we can answer this question systematically. That's the topic of the next post.
Stay tuned. Keep learning!
Resources
Part 0 with syllabus: https://www.bridgewayai.com/post/aifp-part-0-why-learn-about-ai-from-first-principles