Welcome to my blog - a diary about living with donkeys, notes about care, my training sessions and the absolute pleasure of donkey companionship.


Leave a comment! Just click on Comments at the bottom of each post and a box will appear. If you have a question, I always respond!


Tuesday, September 29, 2015

One Click, One Treat: why I prefer this approach to positive reinforcement training


Here is a rather long paper I have been writing on the subject.  I welcome your thoughts!


One Click, One Treat
Why I prefer this approach to positive reinforcement training


Some Nuts & Bolts about Clicker Training, some key (for me) points:

  • Positive Reinforcement Training, or Operant Conditioning (more colloquially, Clicker Training), is a training system based on science - the study of brain function, how we learn, behaviourism. That’s one very cool thing about it: it is science based!

The science of operant conditioning shows that by reinforcing a behaviour, you increase the likelihood of that behaviour happening again.  The flip side is also true: when you ignore a behaviour (do not reinforce it), it will likely extinguish!

  • Reinforcement can take many forms - it doesn’t necessarily mean food or treats or rewards.  It can be paying attention!  So even yelling at the dog to stop barking might be reinforcing in some way for the dog - he has got your attention!  

It is important to remember that “reinforcement” is what the learner finds reinforcing, NOT what the teacher finds reinforcing!  Example: If I gave you chocolate every time you swept the floor but you had an allergy to chocolate, then chocolate would not be reinforcing for you! You might even quit sweeping (the sweeping behaviour extinguishes!)

So while one donkey might like scratches (Rose), another donkey may not (Siog), or one dog may love his frisbee while another doesn’t.  It’s up to us to find out what each individual animal finds reinforcing.

  • Operant conditioning also means that the learner becomes “operant” or in other words, makes choices, becomes the “operator” of their situation and learns that cues and behaviours/ actions trigger consequences.  

Animals are individuals and we tailor our training plans to suit each one.  As Alexandra Kurland likes to say, “the horse will tell you what he/she needs to learn.”  By this she means that if a horse is, say, pushing into your space uninvited, you can see that that particular horse needs to learn to stand politely, and perhaps needs to learn to back up or station to a mat - while another horse may be timid and need to learn to approach or “come!”

  • Important to remember that we are not simply training behaviours, (do this, do that) but we are also reinforcing a calm attitude, good energy  and good balance and we are establishing a trusting relationship.  This is a 2-way street - we need to trust our animals and they need to trust us.

Clicker training, done properly, teaches both the trainer and the learner patience and gives us a lot of training tools to draw from as we teach and experience a variety of situations.

  • The donkey is never “wrong” - if he/ she doesn’t understand what we are trying to teach, it’s never the donkey’s fault!  Perhaps we are unclear, working too fast, working in BIG steps instead of baby steps, or maybe trying to teach something the donkey cannot do.  
    Through a well planned clicker training lesson, the donkey learns to trust that we will not punish or hurt her. How often have you seen someone use punishment when they are afraid of their animal? We can fall into that trap so easily; we humans are very reactive!


  • We work within the animal’s own comfort zone, allowing her to progress at her own speed and as the animal becomes fluent with the new behaviour, we add a cue and put it under stimulus control. When she shows us that she is ready to move on,  then we must move on, changing criteria by building duration and adding new variables to the lesson, such as changing location, building training loops and chains, and shaping the behaviour by asking for a tiny bit more proficiency. 

Also, we work a lot at liberty, so the donkey learns that she has choice - she can walk away, she isn’t restrained.  She becomes our partner - we are engaging her, not bossing her around.  And choice, meaning the freedom to choose and the freedom from any fear of punishment, is also a huge reinforcer for our donkeys.

Follow the principles, have a plan, work in baby steps according to the learner’s needs (not your own!), practice your timing and your cues, pay attention to body language (yours and the learner’s) and you will be able to use these principles to train anybody.
———————————————————

For really comprehensive reading (and video watching) on clicker training, I must refer to some really sophisticated trainers whom I have had the honour to work with, whose presentations I have heard at conferences, or whose papers or books I have read.  It would be silly of me to write about what has already been so eloquently presented.  Some people that really shine in their work with animals are:  Karen Pryor, Alexandra Kurland (and quite a few of her students who are also professional trainers), Ken Ramirez, Kay Laurence … but there are others, many others, and I can come up with a much longer list!

I admit that I have been convinced by their careful methodologies, their deep understanding of behaviourism and their long experience with a multitude of different animals, from marine mammals and all kinds of animals in zoo environments to equines and dogs.  The list of trainers that really excel in positive reinforcement training is growing and I certainly don’t claim to know them all. It’s exciting to see the increasing understanding and acceptance of this approach to animal training and welfare!

———————————————————

One Click, One Treat ... or ...?

So without going into a step by step explanation of how to train, I want to write about one approach which has differing points of view, and is rather controversial. In fact this whole post stems from a video I just watched in which a professional trainer teaches students to click (mark the behaviour they want) but NOT reinforce each time.

So sometimes you click and treat and sometimes you click and don’t treat, but use the sound of the click as a keep-going signal, meaning, “don’t stop and no reinforcement is coming.” 

This trainer (and probably others too) suggests that when you consistently click and treat,
  •  you create frustration in your animal, 
  •  the animal will throw all kinds of unwanted behaviours at you, 
  •  the animal is only interested because of the food, 
  •  your animal could become dangerous or aroused, 
  •  the animal stops whenever she hears a click and this interrupts the flow of the training session.  

It's understandable that we humans wouldn't want to encourage any of the above. However, we have to question whether the sound of the “click” (the conditioned reinforcer, or marker/bridge signal) can mean more than one thing to the animal, i.e. sometimes this and sometimes that. Can it be both a keep-going signal AND a marker signal?

Some people say yes; I am, respectfully, in the “no” camp. Here's how I address the concerns listed above:

Clicker training is a thoughtful training system - a process that takes the learner through a whole developmental learning program.  So, while the above list may in fact happen at early stages of learning, this is simply part of a training process, not the end result.  

Training an animal partner is always dynamic - shifting through micro-shaping to reflect progress and advancement. We use the clicker training process to teach our animals step by step.

We learn how to avoid frustrating our animal by using a high rate of reinforcement, then by gradually building duration, loops and chains.  We work with a micro-shaping strategy that teaches the smallest shifts in nuance using a high rate of reinforcement.

At their current level of training, my donkeys will now move through long sequences and I only click and treat at the end.  Now I can ask for numerous things in a row and they happily respond.  Now I don’t use my clicker when I pick hooves unless a hoof is sore.  Now I don’t click and treat for every step along our walks … they have learned those things and we can move on.

When the donkey understands the game, sure, she will offer behaviours to see what might get you to “click!” but we learn how to manage this by putting each behaviour on cue and then under stimulus control.  

We stay safe by using protective contact, if need be, until we have taught the donkey good manners around food (#1 important thing to teach!)   And it is well understood that although food IS the primary reinforcer, it’s really the game, the engagement, the problem/ puzzle solving, the enrichment that our animals are enjoying.

The unique sound of the click (the conditioned reinforcer) takes away any confusion in the learning process when it is timed correctly and followed by a primary reinforcer (usually food.)  Operant conditioning explains that a behaviour that gets reinforced is more likely to recur.

Is your animal following you around just because you have food? Donkeys and horses respond well to food (grazing is usually the most reinforcing thing for an equine). But, interestingly, once our animal partner “gets the game” or becomes “operant,” we soon see that her eagerness to engage with us isn’t just about the food; it’s also about the relationship we are building through the training interaction and the enrichment of the animal’s environment.

Equines, like many animals, are naturally curious - given the opportunity and encouragement to think for themselves, they love to solve puzzles. I give 1 - 3 tiny hay pellets at a time, that’s it, and yet my donkeys will stay glued to my side as long as we are playing the clicker game.

If the sound of the click ends the behaviour and the donkey stops - what a great opportunity to begin again! I get to practice “walk on” or the beginning of any other behaviour again. Repetition is one key to successful training.

Therefore, training needs to be a form of clear, easily understood communication, not ambiguous.  The click, or marker signal, is a distinctive sound that is processed in the non-cognitive part of the brain called the amygdala, and communicates information to the animal that is easily and quickly understood:  YES! that’s what I wanted (behaviour, attitude) and now I’m going to give you something that you want (food, scratches!)

We actually don't want the animal learner to have to interpret, dissect, or think about what the click means. We want him/her to “get it” immediately - the fact that this distinctive sound is recognized by the oldest part of the mammalian brain makes this approach to training all the more effective.

So, can a red light sometimes mean “stop” and sometimes mean something else?  No.  When I am in my car, I want everyone to understand that a red light means STOP - clear and unambiguous!  The same applies to the kind of clicker training that I practice.
When I click, my animals know that this signal marks the exact moment they have done the small thing I am looking for and this is reinforced.  It’s clear information for all of us, one click, one treat.  


Some References:

1.) Dr. Jesús Rosales-Ruiz, 2014, Resurgence and Regression: Understanding Extinction So You Can Master It
From a presentation given by Dr. Jesús Rosales-Ruiz during the 2014 Five Go To Sea Conference cruise.

Read Parts 1-15 here:


2.) Martin, S. & Friedman, S.G. (2011, November). Blazing clickers. Paper presented at the Animal Behavior Management Alliance conference, Denver, CO.


3.) Alexandra Kurland, 2014, What is Clicker Training and http://www.theclickercenter.com/New--On-Line-Course.html








5 comments:

  1. Nice post. Thanks! Do you switch to a vsr, then fade click & treat once behavior is on cue? I do, but see "newer" trainers not ever fading the click. I give verbal praise and a rub or scritch always, but do not use the clicker once the animal is reliably performing the desired behavior under stimulus control. Looking forward to seeing you in CA in November!

    1. Hi Ann, I find there is usually more to work on to make the behaviour loop clean - i.e. no unwanted behaviour occurring. True stimulus control is often wishful thinking - lol! But something we aspire to, of course. Once a single "new" behaviour is learned I always pair it with another, building longer loops and then not clicking until the end.

  2. Wonderful, thoughtful, courageous post! Thank you!

  3. The title contradicts itself, as one click one treat is a positive R system. And yes, I stick with one click one treat too, and if I want to condition a kgs I use another signal.
