Here is a rather long paper I have been writing on the subject. I welcome your thoughts!
One Click, One Treat
Why I prefer this approach to positive reinforcement training
Some Nuts & Bolts about Clicker Training, some key (for me) points:
- Positive Reinforcement Training or Operant Conditioning (and the more colloquial name Clicker Training) is a training system based on science - the study of brain function, how we learn, behaviourism - that’s one very cool thing about it - it is science based!
The science of operant conditioning shows that by reinforcing a behaviour, you increase the likelihood of that behaviour happening again. The converse also holds: by ignoring a behaviour (not reinforcing it), the behaviour will likely extinguish!
- Reinforcement can take many forms - it doesn’t necessarily mean food or treats or rewards. It can be paying attention! So even yelling at the dog to stop barking might be reinforcing in some way for the dog - he has got your attention!
It is important to remember that “reinforcement” is whatever the learner finds reinforcing, NOT what the teacher finds reinforcing! Example: If I gave you chocolate every time you swept the floor but you had an allergy to chocolate, then chocolate would not be reinforcing for you! You might even quit sweeping (the sweeping behaviour extinguishes!)
So while one donkey might like scratches (Rose), another donkey may not (Siog); one dog may love his frisbee while another doesn’t, etc. It’s up to us to find out what each individual animal finds reinforcing.
- Operant conditioning also means that the learner becomes “operant” or in other words, makes choices, becomes the “operator” of their situation and learns that cues and behaviours/ actions trigger consequences.
Animals are individuals and we tailor our training plans to suit each one. As Alexandra Kurland likes to say, “the horse will tell you what he/she needs to learn.” By this she means that if a horse is, say, pushing into your space uninvited, you can see that that particular horse needs to learn to stand politely, and perhaps needs to learn to back up or station to a mat, while another horse may be timid and need to learn to approach or “come!”
- It is important to remember that we are not simply training behaviours (do this, do that); we are also reinforcing a calm attitude, good energy and good balance, and we are establishing a trusting relationship. This is a 2-way street - we need to trust our animals and they need to trust us.
Clicker training, done properly, teaches both the trainer and the learner patience and gives us a lot of training tools to draw from as we teach and experience a variety of situations.
- The donkey is never “wrong” - if he/she doesn’t understand what we are trying to teach, it’s never the donkey’s fault! Perhaps we are unclear, working too fast, working in BIG steps instead of baby steps, or maybe trying to teach something the donkey cannot do. Through a well-planned clicker training lesson, the donkey learns to trust that we will not punish or hurt her. How often have you seen someone use punishment when they are afraid of their animal? We can fall into that trap so easily; we humans are very reactive!
- We work within the animal’s own comfort zone, allowing her to progress at her own speed and as the animal becomes fluent with the new behaviour, we add a cue and put it under stimulus control. When she shows us that she is ready to move on, then we must move on, changing criteria by building duration and adding new variables to the lesson, such as changing location, building training loops and chains, and shaping the behaviour by asking for a tiny bit more proficiency.
Also, we work a lot at liberty, so the donkey learns that she has choice - she can walk away, she isn’t restrained. She becomes our partner - we are engaging her, not bossing her around. And choice, the freedom to choose and the freedom from any fear of punishment, is also a huge reinforcer for our donkeys.
Follow the principles, have a plan, work in baby steps according to the learner’s needs (not your own!), practice your timing and your cues, pay attention to body language (yours and the learner’s) and you will be able to use these principles to train anybody.
———————————————————
For really comprehensive reading (and video watching) on clicker training, I must refer to some really sophisticated trainers whom I have had the honour to work with, whose presentations I have heard at conferences, or whose papers or books I have read. It would be silly of me to write about what has already been so eloquently presented. Some people who really shine in their work with animals are: Karen Pryor, Alexandra Kurland (and quite a few of her students who are also professional trainers), Ken Ramirez, Kay Laurence … but there are others, many others, and I could come up with a much longer list!
I admit that I have been convinced by their careful methodologies, their deep understanding of behaviourism and their long experience with a multitude of different animals, from marine mammals and all kinds of animals in zoo environments to equines and dogs. The list of trainers who really excel in positive reinforcement training is growing and I certainly don’t claim to know them all. It’s exciting to see the increasing understanding and acceptance of this approach to animal training and welfare!
———————————————————
One Click, One Treat ... or ...?
So without going into a step-by-step explanation of how to train, I want to write about one approach on which there are differing points of view and which is rather controversial. In fact this whole post stems from a video I just watched in which a professional trainer teaches students to click (mark the behaviour they want) but NOT reinforce each time.
So sometimes you click and treat, and sometimes you click and don’t treat but use the sound of the click as a keep-going signal, meaning, “don’t stop and no reinforcement is coming.”
This trainer (and probably others too) suggests that when you consistently click and treat,
- you create frustration in your animal,
- the animal will throw all kinds of unwanted behaviours at you,
- the animal is only interested because of the food,
- your animal could become dangerous or aroused,
- the animal stops whenever they hear a click and this interrupts the flow of the training session.
It's understandable that we humans wouldn't want to encourage any of the above. However, we have to question whether the sound of the “click” (the conditioned reinforcer, or marker/bridge signal) can mean more than one thing, i.e. sometimes this and sometimes that, to the animal.
Can it be both a keep-going signal AND a marker signal?
Some people say yes; I am, respectfully, in the “no” camp. Here's how I address the concerns listed above:
Clicker training is a thoughtful training system - a process that takes the learner through a whole developmental learning program. So, while the problems listed above may in fact occur at early stages of learning, they are simply part of the training process, not the end result.
Training an animal partner is always dynamic - shifting through micro-shaping to reflect progress and advancement. We use the clicker training process to teach our animals step by step.
We learn how to avoid frustrating our animal by using a high rate of reinforcement, then by gradually building duration, loops and chains. We work with a micro-shaping strategy that teaches the smallest shifts in nuance using a high rate of reinforcement.
At their current level of training, my donkeys will now move through long sequences and I only click and treat at the end. Now I can ask for numerous things in a row and they happily respond. Now I don’t use my clicker when I pick hooves unless a hoof is sore. Now I don’t click and treat for every step along our walks … they have learned those things and we can move on.
When the donkey understands the game, sure, she will offer behaviours to see what might get you to “click!” but we learn how to manage this by putting each behaviour on cue and then under stimulus control.
We stay safe by using protective contact, if need be, until we have taught the donkey good manners around food (the #1 most important thing to teach!) And it is well understood that although food IS the primary reinforcer, it’s really the game, the engagement, the problem/puzzle solving and the enrichment that our animals are enjoying.
The unique sound of the click (the conditioned reinforcer) takes away any confusion in the learning process when it is timed correctly and followed by a primary reinforcer (usually food). Operant conditioning explains that a behaviour that gets reinforced is more likely to recur.
Is your animal following you around just because you have food? Donkeys and horses respond well to food (grazing is usually the most reinforcing thing for an equine). But, interestingly, once our animal partner “gets the game” or becomes “operant,” we soon see that her eagerness to engage with us isn’t just about the food; it’s also about the relationship we are building through the training interaction and the enrichment of the animal's environment.
Equines, like many animals, are naturally curious - given the opportunity and encouragement to think for themselves, they love to solve puzzles. I give 1 to 3 tiny hay pellets at a time, that’s it, and yet my donkeys will stay glued to my side as long as we are playing the clicker game.
If the sound of the click ends the behaviour and the donkey stops - what a great opportunity to begin again! I get to practice “walk on” or the beginning of any other behaviour again. Repetition is one key to successful training.
Therefore, training needs to be a form of clear, easily understood communication, not ambiguous. The click, or marker signal, is a distinctive sound that is thought to be processed in a non-cognitive part of the brain called the amygdala, and it communicates information to the animal that is easily and quickly understood: YES! That’s what I wanted (behaviour, attitude) and now I’m going to give you something that you want (food, scratches!)
We actually don't want the animal learner to have to interpret, dissect or think about what the click means. We want him/her to “get it” immediately - the fact that this distinctive sound is recognized by the oldest part of the mammalian brain makes this approach to training all the more effective.
So, can a red light sometimes mean “stop” and sometimes mean something else? No. When I am in my car, I want everyone to understand that a red light means STOP - clear and unambiguous! The same applies to the kind of clicker training that I practice.
When I click, my animals know that this signal marks the exact moment they have done the small thing I am looking for and this is reinforced. It’s clear information for all of us, one click, one treat.
Some References:
1.) Dr. Jesús Rosales-Ruiz, 2014, Resurgence and Regression: Understanding Extinction So You Can Master It. From a presentation given by Dr. Jesús Rosales-Ruiz during the 2014 Five Go To Sea Conference cruise. Read Parts 1-15 here:
2.) Martin, S. & Friedman, S.G. (2011, November). Blazing clickers. Paper presented at the Animal Behavior Management Alliance conference, Denver, CO.
3.) Alexandra Kurland, 2014, What is Clicker Training and http://www.theclickercenter.com/New--On-Line-Course.html
4.) Katie Bartlett's site: http://www.equineclickertraining.com/articles/articles_new.html
Nice post. Thanks! Do you switch to a vsr, then fade click & treat once behavior is on cue? I do, but see "newer" trainers not ever fading the click. I always give verbal praise and a rub or scritch, but do not use the clicker once the animal is reliably performing the desired behavior under stimulus control. Looking forward to seeing you in CA in November!
Hi Ann, I find there is almost always more to work on to make the behaviour loop clean - i.e. no unwanted behaviour occurring. True stimulus control is often wishful thinking - lol! But something we aspire to, of course. Once a single "new" behaviour is learned I always pair it with another, building longer loops and then not clicking until the end.
Great post!
Wonderful, thoughtful, courageous post! Thank you!
The title contradicts itself, as one click one treat is a positive R system. And yes, I stick with one click one treat too, and if I want to condition a kgs I use another signal.