How AI Learns – One Gesture at a Time 

Shehara · Mar 30, 2026

If you’re curious about how artificial intelligence actually learns, try this experiment. It uses Google’s Teachable Machine, a free, browser-based tool that lets anyone train a real AI model without writing a single line of code.

What You’ll Need Before You Start

A laptop or desktop with a webcam

A modern browser (Chrome works best)

15–20 minutes of uninterrupted time

No coding skills required

My Mini Project: Gesture-Based Game Controls

I trained an image classifier to recognise three hand gestures via webcam and mapped each one to a game action:

Thumbs Up → “Go”

Open Palm → “Stop”

Fist → “Jump”

Using Teachable Machine’s image project and my laptop webcam, I captured training images, trained the model, and exported it to run directly in a browser. Within about 20 minutes I had a working gesture controller triggering events in a simple game. No Python, no machine learning background, no special hardware.
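To give a sense of the “mapped each one to a game action” part: Teachable Machine’s JavaScript export returns predictions as a list of class names with probabilities, and a small function can turn that into an action. This is a minimal sketch; the function name, the action table, and the 0.8 confidence threshold are my own choices, not part of Teachable Machine.

```javascript
// Map a Teachable Machine-style prediction array to a game action.
// Assumes predictions look like: [{ className: "Fist", probability: 0.95 }, ...]
const ACTIONS = { "Thumbs Up": "Go", "Open Palm": "Stop", "Fist": "Jump" };

function pickAction(predictions, threshold = 0.8) {
  // Take the class with the highest probability.
  const top = predictions.reduce((a, b) => (b.probability > a.probability ? b : a));
  // Ignore low-confidence frames so ambiguous hand positions trigger nothing.
  if (top.probability < threshold) return null;
  return ACTIONS[top.className] ?? null;
}
```

In the game loop, you call this on every prediction frame and fire the returned action (or do nothing on null).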

Three Things I Learned About AI Along the Way

1. AI learns patterns, not meaning. The model never “understood” a thumbs-up. It learned to associate a specific pixel pattern with the label “Go.” Change the label, and the meaning changes completely. Without clear, consistent labelling, there is no intelligence, just noise.

2. Diversity in training data is everything. My first model failed the moment I moved to a different room. I had trained it under a single lighting condition, so that was the only condition it knew. Adding images with shadows, different angles, and varied backgrounds fixed it. AI generalises only as far as its training data allows.

3. Good AI products need human-centred design. When switching gestures quickly, the model flickered between labels. The model itself was fine. The experience was broken. Fixing it required a smoothing layer, and this pattern shows up across real AI products: the model is one piece of a larger puzzle.
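One way to build that smoothing layer is a majority vote over the last few raw predictions, so a single misread frame can’t flip the output. This is a sketch of the idea, not the exact code I used; the class name and window size of 8 are illustrative.

```javascript
// Majority-vote smoother: only report a label once it dominates
// the last `windowSize` raw predictions.
class LabelSmoother {
  constructor(windowSize = 8) {
    this.windowSize = windowSize;
    this.recent = [];
  }
  push(label) {
    this.recent.push(label);
    if (this.recent.length > this.windowSize) this.recent.shift();
    // Count how often each label appears in the window.
    const counts = {};
    for (const l of this.recent) counts[l] = (counts[l] ?? 0) + 1;
    const [top, count] = Object.entries(counts).sort((a, b) => b[1] - a[1])[0];
    // Require a strict majority before reporting anything.
    return count > this.windowSize / 2 ? top : null;
  }
}
```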

Step-by-Step Guide

Phase 1: Set Up

Go to teachablemachine.withgoogle.com and select Image Project, then Standard Image Model. You’ll see three default class slots. Rename them “Thumbs Up,” “Open Palm,” and “Fist.”

Phase 2: Capture

Use the built-in webcam capture to collect 100–150 images per gesture. Vary your lighting, hand position, distance from the camera, and background as you go. This is the most important step. Rushed, uniform captures are the single biggest reason beginner models fail.

Phase 3: Train

Click Train Model. This usually takes under a minute. You’ll see a live accuracy reading. If any class is consistently low, go back and add more varied images for that gesture before moving on.

Phase 4: Test

Use the Preview panel on the right to test your model in real time. Wave your hand, switch gestures, step back, change angles. Try to break it. Better to find the gaps now than after you’ve exported.

Phase 5: Export

Click Export Model. You can either upload to Google Cloud and get a shareable URL, or download locally. The local download gives you three files: model.json, metadata.json, and weights.bin. Keep them together. They only work as a set.

Phase 6: Connect and Refine

Connect your model’s predictions to game logic or any interactive event. Add a debounce or smoothing buffer, which is a short delay (usually 200–500 milliseconds) that stops the model from flickering between labels when your hand is mid-transition. Without it, fast gesture switches feel glitchy even when the model is accurate.
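A simple way to implement that debounce is to only commit a label after it has been stable for a hold period. This is a hedged sketch under my own assumptions: the function name, the 300 ms default, and the injected `now` timestamp (passed in so the logic stays testable; in a browser you’d pass `performance.now()`) are all illustrative.

```javascript
// Debounce: only emit a label after it has been stable for `holdMs`.
// `now` is a timestamp in milliseconds, injected by the caller.
function makeDebouncer(holdMs = 300) {
  let candidate = null; // label currently being watched
  let since = 0;        // when the candidate first appeared
  let emitted = null;   // last label actually reported
  return function update(label, now) {
    if (label !== candidate) {
      // New label seen: restart the hold timer.
      candidate = label;
      since = now;
    } else if (label !== emitted && now - since >= holdMs) {
      // Label held steady long enough: commit it.
      emitted = label;
    }
    return emitted;
  };
}
```

Mid-transition misreads reset the timer, so the output only changes once your hand has clearly settled on a new gesture.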

Common Mistakes (and How to Avoid Them)

Problem: Model works at desk, fails elsewhere
Likely cause: Trained in one lighting condition
Fix: Recapture with varied backgrounds and lighting

Problem: One gesture keeps getting confused with another
Likely cause: Visual similarity between classes
Fix: Exaggerate the gesture difference, add more images

Problem: Predictions flicker rapidly
Likely cause: No smoothing logic
Fix: Add a debounce buffer in your integration code

Problem: Low accuracy on one class
Likely cause: Too few or too similar images
Fix: Add 30–50 more varied captures for that class

Why This Is Worth Your Time

You don’t need a PhD or a Python environment. In under an hour you get hands-on experience with supervised learning, and you see first-hand why data quality beats data quantity, how accuracy and generalisation differ, and why design matters as much as the model itself.

These are concepts that show up in every serious AI conversation. Building something small makes them stick in a way that reading about them never quite does.

What to Do Next

Once your gesture controller works, try one of these:

Swap gestures for facial expressions or body poses using Teachable Machine’s Pose or Face projects

Write a short post about what broke and how you fixed it. That reflection is where the real learning happens.

Share your exported model URL and let someone else test it. You’ll immediately discover edge cases you hadn’t considered.

The model you built today is small. The instinct to build and learn from it is worth developing.
