Sunday, 28 July 2019

Principles of Engelmann's Direct Instruction: Speed Principle

This post is part of a series where I go into detail on some of the main aspects of Direct Instruction as laid out by Siegfried Engelmann in this book.

Below are links to the other posts in this series. Scroll past them to read the article.

Speed Principle

Maintaining a high pace is central to the DI method. This is for several reasons:
  • There are multiple activities and repeated review built into the schemes of work. Activities need to be short in order to cover the large amount of GCSE content in this way.
  • In order to get sufficient practice of a skill within these short activities, the time a student has per question must necessarily be short.
  • If a component cannot be explained and grasped in <1 minute, it is likely too complex, and the initial instruction will fail for several students. This is a sure sign that the component should be further atomised.
  • Time between each example and each student question needs to be very short in order to ensure that students are seeing the questions as a whole topic and drawing the correct conclusions about the range/scope of the topic.
  • If students get bogged down with the calculations, they will be unlikely to make inferences from one question to the next. In this case, the component should either be further atomised or taught as a routine.

Friday, 26 July 2019

Principles of Engelmann's Direct Instruction: Difficulty and Motivation

Difficulty and Success

Ensuring success, in the acquisition of both individual components of knowledge and whole schemes of work, is at the heart of Direct Instruction. Every part of it was built and tested against this single criterion. Each activity in a DI program is tested and improved until it achieves a success rate of over 90% on all questions across all students. Even when doing remedial work to correct mistakes and learned misconceptions, DI takes great pains to ensure that students will successfully answer at least 2/3 of questions directed at them (more on this in the correcting mistakes section).
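
To make the 90% criterion concrete, here is a minimal sketch of how a success rate "on all questions across all students" could be computed from a record of answers. The function names, the data layout and the sample class are assumptions made for illustration; they do not come from Engelmann's book.

```python
def success_rate(results):
    """Fraction of correct answers over every question and every student.

    `results` is a hypothetical record: student name -> list of booleans,
    one per question (True means answered correctly).
    """
    attempts = [correct for answers in results.values() for correct in answers]
    if not attempts:
        raise ValueError("no answers recorded")
    return sum(attempts) / len(attempts)


def meets_threshold(results, threshold=0.9):
    """True when the activity achieves over `threshold` success overall."""
    return success_rate(results) > threshold


# Made-up trial of one activity: 4 students, 5 questions each.
trial = {
    "student_a": [True, True, True, True, True],
    "student_b": [True, True, False, True, True],
    "student_c": [True, True, True, True, True],
    "student_d": [True, True, True, True, True],
}

print(success_rate(trial))     # 19 correct out of 20
print(meets_threshold(trial))  # over 90%, so no further atomising needed
```

The same check with `threshold=2/3` would correspond to the lower bar mentioned for remedial work.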

When I am planning a lesson now, I use the following rubric:
  • For an initial instruction activity, can I explain this in <1 minute of teacher direction so that >90% of students will be able to perform the task first time, without additional support?
    • If not, I need to atomise this further into components that fit this criterion.
  • For review activities, am I confident that students will remember this topic?


Direct Instruction does not have much to say about student motivation. This, in my view, is one of its main shortfalls, as we all know how much of a factor student motivation is for successful learning. However, it is highly focused on achieving student success. This successful progress is a big factor in most of the popular current theories on motivation. I will demonstrate this below by looking at three theories in particular: the Expectancy-Value Theory, Maslow's Hierarchy of Needs, and Dweck's Growth-Mindset Theory.

Expectancy-Value Theory

According to this theory, a student's motivation is determined by:
  1. How much they value the goal(s) that are set
  2. To what extent they expect to succeed at these goals
These factors are said to multiply, so that if either value is zero (or very low), the overall motivation is zero:
Motivation = Value × Expectancy
  • Value: the value of the learning to the learner
  • Expectancy: the extent to which the learner expects success in the learning
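
The multiplicative relationship can be illustrated with a short sketch. The 0-to-1 scale for both factors is an assumption chosen for the example; the theory as summarised here only says that the factors multiply.

```python
def motivation(value, expectancy):
    """Expectancy-value product; both inputs on an assumed 0..1 scale."""
    if not (0 <= value <= 1 and 0 <= expectancy <= 1):
        raise ValueError("value and expectancy must be between 0 and 1")
    return value * expectancy


# A student who values the goal but does not expect to succeed:
print(motivation(0.8, 0.0))  # 0.0

# Raising expectancy (e.g. through reliable success) lifts the product:
print(motivation(0.8, 0.5))  # 0.4
```

Because the factors multiply rather than add, raising one of them cannot compensate for the other being zero.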

Expectancy is obviously heavily tied to your previous experiences of success and failure in that subject. Therefore if, as with the DI model, you are regularly and reliably making successful progress from lesson to lesson, your expectancy of success in future lessons will increase.

DI does not say anything about Value, and so other techniques should be added to a DI program in order to increase this. At some point I plan on making a blog post specifically about this.

Maslow's Hierarchy of Needs

The crux of this theory is that individuals’ most basic needs must be met before they become motivated to achieve higher level needs. I feel like this plays right to the biggest strength of DI. This is because, whilst enquiry-based learning, project-based learning, collaborative problem solving, open-ended, rich tasks, etc. all attempt to service a student's need for self-actualisation in one way or another, DI is an attempt to service the deeper, psychological need to build self-esteem. In my experience, it does this very successfully with a wide range of students.

Dweck's Growth-Mindset Model

This model states that some people believe their success is based on innate ability (a fixed mindset) while others believe their success is based on hard work, learning, training and doggedness (a growth mindset). This theory has been met with serious criticism and I only include it here because of its popularity within the profession.

A student's beliefs about what causes learning are formed by their experiences of their own and others' successful learning. In DI, a student gets to experience rapid and reliably successful learning. The learning even feels easy for them (so long as they follow along with each section). This means that students get to regularly experience the positive effect of practice on their abilities. This experience can be signposted to make sure that they appreciate it using several techniques:
  • Teacher praise
  • Pre and post-learning check-in tests*
  • Assessment score tracking for students
  • Dan Meyer's headache-aspirin set-up to learning*
*these are not included directly in DI, but I have found adding them useful

Addressing concerns

How does this fit with the theory of desirable difficulty?

This theory states that the best learning happens when students work on problems that are at the right difficulty - challenging but doable. On the face of it, this runs completely counter to the DI model, which aims to make each individual task very easy for students to replicate. Indeed there are several elements of the pure DI model that contradict desirable difficulty:

  • Individual tasks are designed to be (or seem to be) easy in all stages
  • Feedback is designed to always be instant (or as close as possible to this) whereas some studies have shown that delayed feedback is preferable in certain situations.
  • Differentiation is only present in DI in the section under 'correcting mistakes'

However, there are also many ways in which DI fits nicely with the theory of desirable difficulty:
  • Spaced retrieval practice/Interleaving: This is at the heart of every DI session (85% of each lesson spent on review)
  • The scheme of work remains unchanged; students still learn topics to the same level of difficulty as before - DI just gives a different approach on how to lead up to that level of difficulty.

To account for these discrepancies I have adjusted somewhat:
  • I usually do not add any differentiation to initial instruction (as I do want all students to be successful at understanding the central concept). However I do differentiate the expansion and context-shaping activities, as some learners will immediately see wider generalisations and range of applications than others. During these activities I will plan a range of difficulties that students can choose or be directed to.
  • For homework (not mentioned in DI) - it is important that it is accessible for all without additional support (it otherwise risks widening the achievement gap). The only way to do this and achieve desirable difficulty is with differentiation. This can be achieved by choice of task or the amount of scaffolding/prompting on the sheet.
  • I also try to give feedback instantly during initial instruction (as stated in DI, this is a particularly important time to make sure that students are not learning mis-rules) but will often use delayed feedback for other activities. I still find it difficult to know when to use different types of feedback, but here is a decision tree by that aims to help:

Monday, 22 July 2019

Principles of Engelmann's Direct Instruction: Lesson structure and Schemes of Work

Lesson Structure

Due to atomisation and the speed principle, the lesson can be split into 3-5 minute activities (in practice I find this hard to achieve with expansion and cumulative review activities).

These activities should be a mix of 15% initial instruction on new material and 85% review (including a combination of the different types of review listed here).
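
As a quick piece of arithmetic, the 15%/85% split can be turned into minutes. The 60-minute lesson length and the function name are assumptions for illustration only.

```python
def split_lesson(total_minutes, new_percent=15):
    """Return (minutes of initial instruction, minutes of review)."""
    new_minutes = total_minutes * new_percent / 100
    return new_minutes, total_minutes - new_minutes


new_minutes, review_minutes = split_lesson(60)
print(new_minutes, review_minutes)  # 9.0 51.0
```

With 3-5 minute activities, 9 minutes of new material is roughly two or three short initial-instruction activities in an hour.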

Here is an example lesson plan:
  1. Starter: review of two recent topics, prompted by pre-written worked examples with similar features (this task needs to be doable without prompting from me as I need to be settling the class and taking register)
  2. Prompted review of an older topic (I will pick either one that students have previously struggled with, or one that will be required for the topics to be learnt in coming lessons).
  3. Initial teaching of a new topic, blocked practice followed by an expansion activity.
  4. Re-teaching of last lesson's new topic, alternating question practice followed by expansion.
  5. Context-shaping or question discrimination activity on the new topics.
  6. Cumulative review including the topics used in activities 2, 3, and 4.

Scheme of Work

A track is a collection of skills grouped not by topic, but by similarity of the required student response. For example, ratio sharing, fraction of an amount, percentages of an amount and pie charts may all be taught in a single track. Each of these skills should be taught using highly similar language and highly similar overt responses. The aim is to reduce teaching time by making the connections in maths (that teachers often take for granted) as clear to students as possible.
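
One way to see why these four skills can share a track is that each can be answered with the same response: split the amount into equal parts, then take some of them. The sketch below is an illustration of that idea under this assumption; the function name and the numbers are made up and are not taken from a DI program.

```python
from fractions import Fraction


def part_of_amount(amount, part, whole):
    """Divide `amount` into `whole` equal parts, then take `part` of them."""
    return Fraction(amount) / whole * part


# Ratio sharing: share 32 in the ratio 2:5:9 (16 parts in total).
shares = [part_of_amount(32, p, 2 + 5 + 9) for p in (2, 5, 9)]  # 4, 10 and 18

# Fraction of an amount: 3/4 of 32.
three_quarters = part_of_amount(32, 3, 4)  # 24

# Percentage of an amount: 25% of 32 (25 parts out of 100).
quarter_percent = part_of_amount(32, 25, 100)  # 8

# Pie chart: a 90 degree sector when the whole chart represents 32 people.
sector = part_of_amount(32, 90, 360)  # 8

print([str(s) for s in shares], three_quarters, quarter_percent, sector)
```

The single shared function plays the role of the "highly similar overt response": only the meaning of `part` and `whole` changes between the skills.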

Three or four of these tracks should be taught in parallel. That way, the interleaving of topics is woven right into initial instruction. This is very unlike most teaching models, where interleaving, if used at all, is employed only during topic review/revision.

Each track chosen should be highly different (both visually and in terms of required student response) to avoid blending/confusion.

Friday, 19 July 2019

Principles of Engelmann's Direct Instruction: Defining range/scope

Deciding on the range

Once you have atomised your curriculum into components, you need to think about the range of all possible examples of each of them. When doing this, you should think about how students will use this skill in the long term, not just about the current topic.

For example, if you were teaching "Solving one-step equations using opposite operations", you would need to include equations like:
as well as ×, ÷, + and -.

It is important to note that this does not mean that these all must be taught straight away. Instead, you can show these as examples of the topic that "we haven't learned how to do yet". Then you can teach these sub-components at a separate date before blending them into the other example types.

There are several advantages to this approach:
  • It effectively pre-teaches elements of several topics that will later build upon the component (in this example, trigonometry, formulae, solving quadratic equations, etc.).
  • It highlights the links between these topics by teaching the components of the topics that require the same response as a single component.
  • It allows students to see the full scope of the new skill and its potential applications. This also demonstrates the value of learning the skill.
  • By showing the full range and limitations early, you are avoiding stipulation where students under-generalise or over-generalise the skill or pattern.

Demonstrating range efficiently

This process often involves vastly broadening what may previously have been a simple component. In order to maintain a high pace on your activities, it is important to use an efficient method to demonstrate the full range of a component and its limitations. In DI, this is achieved by juxtaposing examples using the sameness principle and the difference principle.

Difference principle

In order to demonstrate the full range of applications of a component, you should juxtapose (put next to) examples and questions that are as different as possible (in a single dimension). This allows students to interpolate all the questions in between that would also work.

For example:

Between examples 1-2 and 2-3, only one dimension has changed (the number on the right-hand side), but it has changed in drastic ways:

  • Between 1-2, the answers turn from natural numbers to negative decimals.
  • Between 2-3, students must now divide a decimal by 5.
Between 3-4, a new dimension changes (the coefficient of x becomes a power).

Through the use of four examples, we have shown how the same technique works in several different situations, and allowed students to infer how to apply the method to many questions in between.

Sameness principle

In order to demonstrate the limitations of a component, you should juxtapose pairs of positive/negative examples that are as similar to each other as possible. This allows students to see the exact boundary of where the component stops working and extrapolate to other questions that this skill will not work for. Also, because "any sameness shared by both positive and negative examples rules out a possible interpretation", every dimension that has not changed between the examples cannot be the cause of whether a technique works or does not. 

For example:

Each question now looks highly similar, with many features remaining the same, but question 2 cannot be solved as a one-step equation. From this students will quickly infer that only equations with single variables can be solved using this technique.

By applying the sameness and difference principles, you can effectively and efficiently show the full range and limitations of a skill.

Principles of Engelmann's Direct Instruction: Review


According to Direct Instruction, only 15% of each session should be on new content, with the rest dedicated to review. This seems minuscule to me and I have not managed <40%, let alone 15%, whilst trying it last term. However, I have massively upped the amount of review that I now plan for in my lessons. This has given me much more scope for the reminding and interleaving of topics and it seems to have made a positive impact on student confidence and grades.

Types of review

  • Reteach: In the follow-up lesson from initial instruction, reteach the component with a set of examples. This is made feasible as instructions now take <1 minute (see atomisation).
  • Blocked practice: Over the next few lessons, practice the skill in a set of questions on this topic only. This may start prompted, with worked examples or some scaffolding, but these prompts should fade until students can perform on the questions without prompts.
  • Question discrimination: Where there is another skill whose set-up looks highly similar to the skill just learnt (say worded LCM questions and worded HCF questions), teach how to differentiate between the two and then present a random set of questions on the two topics. Note that to save time, you may get students to only perform the discrimination and perhaps the first step or two of the working.
  • Cumulative review/Interleaving: A review of up to 6 topics. These topics will change between each review session, allowing students to refresh their memory on a wide range of skills. The topics should include:
    • The two most recently taught topics
    • One topic that is highly similar in set-up to the most recent topic
    • One topic that students have struggled with in the past (and has been re-taught)
    • Two other topics not reviewed recently
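
The selection rule for a cumulative review can be sketched as a small function. Everything here is a hypothetical illustration: the data structures, the topic names and the choice to prefer the least recently reviewed topics are assumptions, not something specified by DI.

```python
def pick_review_topics(taught_order, last_reviewed, similar, struggled):
    """Pick up to 6 topics for a cumulative review.

    taught_order:  topics in the order they were taught (oldest first)
    last_reviewed: topic -> lesson number of its most recent review
    similar:       topic -> list of topics with a similar-looking set-up
    struggled:     set of topics students have struggled with (and been re-taught)
    """
    if not taught_order:
        return []

    # 1. The two most recently taught topics (newest first).
    chosen = list(reversed(taught_order[-2:]))

    def add_first(candidates, how_many=1):
        added = 0
        for topic in candidates:
            if added == how_many:
                break
            if topic in taught_order and topic not in chosen:
                chosen.append(topic)
                added += 1

    # Topics sorted so that the least recently reviewed come first.
    stalest_first = sorted(taught_order, key=lambda t: last_reviewed.get(t, 0))

    # 2. One topic whose set-up looks similar to the most recent topic.
    add_first(similar.get(taught_order[-1], []))
    # 3. One topic students have struggled with in the past.
    add_first([t for t in stalest_first if t in struggled])
    # 4. Two other topics not reviewed recently.
    add_first(stalest_first, how_many=2)
    return chosen


taught = ["fractions", "percentages", "ratio", "angles", "area", "HCF", "pythagoras", "LCM"]
reviewed = {"fractions": 3, "percentages": 9, "ratio": 5, "angles": 2,
            "area": 8, "HCF": 7, "pythagoras": 10, "LCM": 11}

print(pick_review_topics(taught, reviewed, {"LCM": ["HCF"]}, {"ratio", "area"}))
# ['LCM', 'pythagoras', 'HCF', 'ratio', 'angles', 'fractions']
```

The `last_reviewed` mapping stands in for whatever record is kept of when each topic was last revisited.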

In order to keep track of which topics have been reviewed recently, I use an Excel spreadsheet (I - initial instruction, R - review, S - reviewed but struggled with):

Building to cumulative review

Also mentioned in DI is a step-by-step approach for building up to the full cumulative review. I have found this particularly helpful with lower-ability sets. Previously, I avoided doing unprompted cumulative reviews with these classes as I knew that students would not be able to remember the topics (indeed they often forget everything from the lesson on the day before!).
Steps to build to cumulative review:
  1. Prompted, blocked questions
  2. Unprompted, blocked questions
  3. Questions alternated between the new topic and a mix of highly different, secure topics
  4. Continue number 3, but add more and more questions in between each question on the new topic
  5. Full cumulative review (after teaching discrimination with highly similar topics)
The idea is to do two or three of these in a single activity, slowly building up the complexity as students become more comfortable. At first you may do 1-2, then 1-2-3, then 2-3, then 3-4, then 4-5, then only 5 (or you may build it up much faster or slower, depending on the performance of the class).
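
Steps 3 and 4 above can be sketched as a question sequence in which the gap between new-topic questions grows. Widening the gap by exactly one question each time is an assumption made for the example; in practice the pace would depend on the class.

```python
from itertools import cycle


def spaced_sequence(new_questions, secure_questions, start_gap=1):
    """Interleave new-topic questions with secure ones, widening the gap each time."""
    if not secure_questions:
        raise ValueError("need at least one secure question to interleave")
    secure = cycle(secure_questions)  # reuse secure questions if they run out
    sequence = []
    gap = start_gap
    for question in new_questions:
        sequence.append(question)
        sequence.extend(next(secure) for _ in range(gap))
        gap += 1  # one more secure question before the next new-topic question
    return sequence


print(spaced_sequence(["N1", "N2", "N3"], ["S1", "S2", "S3", "S4"]))
# ['N1', 'S1', 'N2', 'S2', 'S3', 'N3', 'S4', 'S1', 'S2']
```

Here `N` questions are on the new topic and `S` questions are on highly different, secure topics.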

DI also recommends avoiding moving on to new topics whilst students are still struggling with the cumulative review.

Example of question discrimination

I find SSDD problems to be a great place to start thinking about which topics are similar to the one taught. For some classes, I know that I can give them some SSDD questions immediately after initial instruction and they will be fine with just this. However, for many students, particularly with English as a foreign language, I find it useful to first teach the discrimination as a correlated feature before they work up to the SSDD problems themselves. So with Pythagoras for example:

  1. First define the correlation: "I know that I can use Pythagoras because I can make a right-angled triangle where I know two sides"
  2. Now show various example questions, some on Pythagoras and some on similar-looking topics (like trig). The aim is not to answer the questions, but to decide when Pythagoras is applicable. Demonstrate the first few before students have to answer.
  3. The answer is either:
    • "Yes" Why? "Because I can make a right-angled triangle where I know two sides"
    • "No" Why? "Because I cannot make a right-angled triangle"
    • "No" Why? "Because I do not know two sides"
  4. Practice this until they are firm.
Notice here that I defined it as "When can I use Pythagoras?" rather than "When should I use Pythagoras?". Otherwise the correlation would become "I know that I should use Pythagoras because I can make a right-angled triangle where I know two sides and I want to work out the third side". This is a further example of atomisation - a way of splitting up the "When should I..." into more easily digestible components - to facilitate a high success rate of initial instruction.

Therefore a final instruction plan would be to teach "When can I..." then "When should I..." before finally actually answering SSDD type problems.

Practicalities and Addressing Concerns

This may seem like it is going to slow down the teaching of new topics drastically. This will be addressed in the next section on "lesson structure". I have found some slow-down, but it is not drastic. This effect may also reduce over time as students are (hopefully) more likely to remember the topics from last year when learning new topics which build on them in the next.

DI recommends choral response for most of its activities. I think this only really works with classes of up to 10 students. Any more than this and it is hard to hear what everyone says and to make sure everyone is taking part. I like to do these reviews on mini-whiteboards, as this allows me to quickly assess everyone and to give instant feedback on misconceptions.

It is important to keep a high pace during these activities, so with longer, more complex procedures, I may ask students: "What's the first step?" or "What's the next step in the working shown?"

Wednesday, 17 July 2019

Principles of Engelmann's Direct Instruction: Cognitive Load

Cognitive Load Theory

The basic idea of Cognitive Load Theory is that everyone has an inherent limit to how much information can be stored in their working memory at one time. Students will therefore struggle to grasp concepts when their working memories are overloaded with data.
Though Direct Instruction does not reference Cognitive Load Theory directly, it still has a lot to say about ways to reduce the cognitive load for students, particularly during initial instruction.

Reduce clutter

Remove all elements of a problem that are non-essential to the skill, allowing students to infer what is essential/non-essential to the concept. This means changing this question:
Abby, Charles and David share out 32 sweets in the ratio 2:5:9. How many more sweets does David get than Charles?
into this question:
Share 32 in the ratio 2:5:9
This also means removing non-essential elements from diagrams and avoiding the use of 'real-life' pictures or situations (more on this in the "real-world" maths section).

The wording/setup principle

Every example and practice problem that students do during the initial instruction should have the exact same setup, right down to the wording and the font of each question. In practice, along with the "single dimension transformations" discussed below, this looks a lot like Craig Barton's Variation Theory problems:

Single dimension transformations

As seen in the problem set above, only a single aspect of the question changes from problem to problem. This reduces cognitive load as there is a single aspect to concentrate on between questions. Craig Barton points to other benefits:
By holding constant as much as possible and varying one element, we can direct students’ attention to that element that has varied. Any change (or lack of) in the answer may then be attributed to the change in the element. Moreover, because each question or example is related to the one that preceded it, students are able to form expectations as to what the answer will be. I call this process reflect, expect, check. This can lead to significant moments of revelation and discussion when these expectations are not realised, compelling students to think more deeply about the processes involved, instead of just cruising through an exercise on autopilot.
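
When writing a problem set of this kind, a short check can confirm that only one aspect changes from each problem to the next. This is a hypothetical sketch: describing each problem as a dictionary of features, and the ratio-sharing problems used, are assumptions made for the example.

```python
def changed_dimensions(a, b):
    """Names of the features that differ between two problem descriptions."""
    return sorted(key for key in a.keys() | b.keys() if a.get(key) != b.get(key))


def is_single_dimension_sequence(problems):
    """True when every consecutive pair of problems differs in exactly one feature."""
    return all(
        len(changed_dimensions(a, b)) == 1
        for a, b in zip(problems, problems[1:])
    )


# "Share <amount> in the ratio <ratio>", varying one feature at a time.
problem_set = [
    {"amount": 32, "ratio": (1, 3)},
    {"amount": 32, "ratio": (3, 1)},  # only the ratio changes
    {"amount": 64, "ratio": (3, 1)},  # only the amount changes
    {"amount": 64, "ratio": (3, 5)},  # only the ratio changes
]

print(is_single_dimension_sequence(problem_set))  # True

# Changing the amount and the ratio together breaks the pattern.
print(is_single_dimension_sequence([problem_set[0], {"amount": 60, "ratio": (2, 3)}]))  # False
```

Surface features such as wording and font, which the wording/setup principle also holds constant, are outside what this check covers.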


Discussed in more detail here, atomisation, or the splitting of skills down to their component parts, also means that students are learning a single concept at a time. This means there is less information that needs to be stored in their working memory at any one moment.

Addressing concerns

  • Students won't be able to generalise to similar questions which are worded differently.
    • This is taught as a separate skill during expansion activities
  • Students won't be able to repeat the process when a question is not preceded by several highly similar questions
    • This is also taught separately during review and context shaping activities.
  • So no context? No treasure hunts, code-breakers, funny pictures, practical demos or anything?! Won't students find maths incredibly boring?
    • Most models for motivation put a high emphasis on 'perceived ability to succeed' as a main contributor to overall motivation. Atomisation is a fantastic tool for facilitating regular and obvious successful progress. I have found my students have enjoyed my lessons much more this year because they can 'feel' the progress they are making.
    • All of those activities are possible to fit into a Direct Instruction program, and should only be strictly avoided during the initial instruction phase. Because each component is small, each initial instruction phase should only take a few minutes. This frees up time to do the code breaker/treasure hunt as an expansion or review activity in a future lesson.

In the Direct Instruction model, the aim of initial instruction is to teach the central concept only, whilst avoiding the learning of misrules. Separating the concept, question identification and recall and teaching them as separate skills is a further tool for reducing cognitive load.

Monday, 15 July 2019

Principles of Engelmann's Direct Instruction: Expansion and Context-Shaping


Expansion takes place after initial teaching, where things have been simplified to only the essential features (see cognitive load). Expansion then introduces more variety, showing how questions can be worded differently and still mean the same operation.

One type of expansion activity would be 'implied conclusion' tasks. These are tasks in which you use the current skill (possibly alongside other skills which are already firm) to solve new problems. An example of this is ordering fractions after being taught how to compare fractions.

Matching ("circle all the fractions greater than two thirds") and "here is the answer, make up the question" tasks are other types of expansion activities.

Context Shaping

Context shaping is the name given in Direct Instruction for demonstrating the range of contexts where a skill can be used.

This may be covered in a single activity or multiple activities, depending on the number of different possible contexts and the amount of difficulty added with each new context.

DI recommends juxtaposing highly different contexts one after the other (before interleaved practice). This maximises the chances that students will interpolate and infer any contexts in between those shown.

Both expansion and context shaping activities should happen as soon as possible after students are firm on the initial teaching. This is so that students do not stipulate by viewing the skill in too narrow a way.


These questions from show a variety of contexts for using direct proportional reasoning:
*Note, there is also some question discrimination here as students need to decide if the numbers are in direct proportion or not.

I would likely split this into several activities in order to reduce the difficulty of individual sessions. In order to juxtapose highly different contexts I might show them in this order (though other orders would also work):

  • Speed/cycling and driving
  • Lines
  • Similar shapes/triangles
  • Maps
  • Money and Petrol


Textbooks generally do this in their exercises. However, I find the structure that DI gives to these activities helpful for phasing in the wider activities. So for a high ability set, I may just give them the textbook and let them get on with it and, for a low ability set, I can expand their understanding in small stages over several days. This allows students of all abilities to feel successful and to deepen their understanding of a topic, something I definitely struggled to get right in the past!
