## Tuesday, 14 July 2020

### Quick wins from Direct Instruction: Dimensions of Difficulty

This post was inspired by an episode of the Craig Barton podcast with Kris Boulton. Kris was acting as a salesman for Engelmann's Direct Instruction, but he didn't seem to have much to offer when Craig asked him for Direct Instruction 'quick wins' that teachers could implement and improve their teaching right away (besides atomisation and minimal variation, which followers of Craig Barton will already be very familiar with).

One of the things that has really helped me after reading Direct Instruction is the way that it breaks down, in great detail, the different elements that cause students to find a topic difficult. As an experienced teacher, I had already figured out some of these, but Direct Instruction really clarified my thinking and it has made me much more confident in predicting how successful students will be on a topic. This is vitally important to the Direct Instruction method, as designing every task so that over 90% of the students will be instantly successful is a core element of the method. I believe that these ideas would be even more useful for newer teachers. First of all, here is the list of difficulty dimensions:
• Similarity to other topics - e.g. the letter 'd' is harder to learn than the letter 'o'. Because of its superficial similarity to 'b', the student needs additional practice at discriminating between the two (note that this only applies to whichever of 'b' and 'd' they learn last).
• Method is shared with another topic - conversely, if the method is identical to that of another topic, the difficulty of the new topic is reduced. This is because you can teach the new topic as a "double transformation", where students convert the new topic into the topic they already know. e.g. 35% of 250 can be transformed into 35/100 of 250 if they already know fractions of amounts.
• Prior knowledge required - each additional piece of prior knowledge required will increase the load on long-term memory.
• The length of time elapsed since students last successfully practiced on that topic (or any prior knowledge topics).
• Number of times students have successfully practiced the topic.
• The number of steps in the method - this increases the load on working memory. It should include any decisions that are required, not just the overt written steps. e.g. for a basic trigonometry question you have the following steps: (1) decide whether it is trig or Pythagoras, (2) decide which side is h, o and a, (3) work out which sides are used in the question, (4) decide whether to use sin, cos or tan, (5) remember the formula for the trig function, (6) substitute the correct numbers into the formula, (7) rearrange the formula, (8) type the formula into the calculator, (9) round the answer to the correct accuracy.
• The context of the problem on the page (this includes how the question is worded as well as any real-life situation the problem is put in).
• Blocked/Interleaved questions - Are the questions in a block of the same question type? Are they interleaved with other topics? How many other topics? Have they been primed to think about this topic earlier in the lesson (via instruction or earlier questions)?
• Difficulty of response - How are students expected to respond to a question? These are some options going from easiest to most difficult:
  • Yes/No binary option
  • Multiple choice
  • Scaffolded response
  • Single line answer (without scaffold)
  • Multi-step process (with scaffold then without)
  • Reasoned explanation

As mentioned at the start, teaching so that each new topic has over 90% of students 'getting it' on their first attempt is central to DI. This is crucial for motivation, by setting up an expectation of success in the subject, and for the high pace required in the DI method. Knowing and considering these dimensions of difficulty is essential for pitching the work correctly so that this happens.

Though all of this is discussed in DI, I find the way that they consider each of these dimensions separately to be complicated and unwieldy. Which brings me to the 'win' that may take a bit more explanation...

#### My interpretation/application of the dimensions of difficulty concept

For each topic I teach I assign a sort of 'difficulty score'. Then, through knowledge of the students and testing of the class, I decide what is the maximum difficulty score I can give to that class, where I will still get >90% success (this varies wildly from class to class). If the 'difficulty score' of the new topic is too high, I find ways to reduce the score to an acceptable level. Though I've tried to codify it here, it is important to note that in my lesson planning I rarely work out the exact score, but consider each dimension to get a rough feel for the difficulty before deciding how many difficulty reduction steps I should take. Here is how I assign my score and how I reduce it:

I know that this sounds like quite an intense thing to do whilst planning for every lesson, but I find doing a rough version of this in my head takes less than a minute and it helps my planning immensely.

## Monday, 5 August 2019

### Principles of Engelmann's Direct Instruction: Categories for different types of knowledge

This post is part of a series where I go into detail on some of the main aspects of Direct Instruction as laid out by Siegfried Engelmann in this book.

Below are links to the other posts in this series. Scroll past them to read the article.

### Categories for different types of knowledge

After atomisation, Direct Instruction then attempts to categorise each of these components according to common features. Each category then has a section detailing how best to convey this type of knowledge to students. In this post, I will describe and give examples for each category. In future posts, I will discuss the different teaching methods for each. Here is a list of the categories I will be detailing:
• Basic types
  • Noun
  • Non-comparative, single-dimension concept
  • Comparative, single-dimension concept
  • Correlated feature
  • Single transformation
  • Double transformation
• Complex forms
  • Fact systems
  • Cognitive routines

### Basic Types

#### Noun

Nouns are labels for object classes (Cars or trees, for example). They are multi-dimensional concepts (for example, there are several features that an object must have in order to be considered a car: wheels, 2-7 seats, steering wheel, etc.). The boundary between positive and negative examples is not usually precisely defined (many people would disagree about at what point a car becomes a jeep, or when a car becomes a go kart, etc.).

I find that there are very few of these in maths and, where they do occur, their boundaries are clearly defined. For example, a square is multi-dimensional (it has 2 dimensions, all sides straight, four equal angles and four equal sides), but there is no disagreement in maths about which shapes are positive and which are negative examples of a square.

For this reason, I do not think that the teaching method detailed in Direct Instruction applies much to the subject of maths and I will not be going through it in much detail.

#### Single-dimension non-comparative

Examples include: horizontal, improper and multiple. These are concepts that are defined by a single dimension (for example: a fraction is improper if the value of the numerator is larger than the value of the denominator. All other features of the fraction are unimportant for this definition.)

#### Single-dimension comparative

These concepts are also defined by a single dimension, but are about comparing two things. Examples include: getting steeper, increasing and getting heavier. Note that many non-comparative concepts can be transformed into comparative ones (heavy → getting heavier, large → getting larger).

#### Correlated feature

Correlated features are two (or more) concepts where one is implied by the other (in the form if... then...). Examples include: if an equation has more than one variable then it cannot be solved on its own; if a shape is an enlargement then it is similar; if a fraction is improper then it is greater than 1.

Note here how teaching these as correlated features can make the links between topics explicit. This is an element I really like about DI.

#### Single Transformation

Instead of having a single dimension which defines positive/negative examples, a transformation is a concept where you follow a single rule. An easy way to spot these is if the required student response is not the same every time ("Is it a multiple?" always has the answer yes or no; "What is the value of the 7 in this number?" can have a range of answers, depending on the number being displayed). Examples include: "Add a numerator to make this fraction improper", "3 + 5 = ?" and solving one-step equations of the form ax = b.

I find that a lot of maths is made up of single transformations when you atomise the more complex concepts.

#### Double Transformation

Double transformations are a set of two single transformation concepts which share many similar features. Examples include: single-digit addition ↔ tens-digit addition, addition ↔ subtraction, 5 x 4 = ? ↔ 5 x ? = 25.

As I'm sure you can imagine, there is a large proportion of topics in maths which can be thought of as double transformations. There are two reasons to consider them as such:
• You can leverage the similarities to make the new concept easier to learn (it's just like this, but...)
• You can spend time explicitly teaching the discrimination between the topics. In the past I did not spend nearly enough time considering the added difficulty of discriminating between similar-looking questions.

### Complex forms

#### Fact System

A fact system is a series of facts which make up some larger 'whole'. Examples include: properties of 2D shapes and angle rules. When teaching a fact system, it is important to note that you should not be teaching the meaning of individual facts (these should be pre-taught as more basic types), but you are teaching their place as part of a whole system.

#### Cognitive routines

Most of the topics in maths that you will atomise into several basic and linked types of knowledge are cognitive routines. They are built up from multiple different atoms (and smaller cognitive routines) and often require several steps and distinctions to successfully complete. Examples include: solving linear equations, adding fractions and finding the missing side using trigonometry.

## Sunday, 28 July 2019

### Principles of Engelmann's Direct Instruction: Speed Principle

This post is part of a series where I go into detail on some of the main aspects of Direct Instruction as laid out by Siegfried Engelmann in this book.

### Speed Principle

Maintaining a high pace is central to the DI method. This is for several reasons:
• There are multiple activities and repeated review built in to the schemes of work. Activities need to be short in order to cover the large amount of GCSE content in this way.
• In order to get sufficient practice of a skill, within these short activities, the time a student has per question must necessarily be short.
• If a component cannot be explained and grasped in <1 minute, it is likely too complex and the initial instruction will fail for several students. This is a sure sign that the component should be further atomised.
• Time between each example and each student question needs to be very short in order to ensure that students are seeing the questions as a whole topic and making the correct conclusions about the range/scope of the topic.
• If students get bogged down with the calculations, they will be unlikely to make inferences from one question to the next. In this case, the component should either be further atomised or taught as a routine.

## Friday, 26 July 2019

### Principles of Engelmann's Direct Instruction: Difficulty and Motivation

This post is part of a series where I go into detail on some of the main aspects of Direct Instruction as laid out by Siegfried Engelmann in this book.

### Difficulty and Success

Ensuring success, in the acquisition of both individual components of knowledge and whole schemes of work, is at the heart of Direct Instruction. Every part of it was built and tested against this single criterion. Each activity in a DI program is tested and improved until it achieves a success rate of over 90% on all questions across all students. Even when doing remedial work to correct mistakes and learned misconceptions, DI takes great pains to ensure that students will successfully answer at least 2/3 of questions directed at them (more on this in the correcting mistakes section).

When I am planning a lesson now, I use the following rubric:
• For an initial instruction activity, can I explain this in <1 minute of teacher direction so that >90% of students will be able to perform the task first time, without additional support?
• If not, I need to atomise this further into components that fit this criterion.
• For review activities, am I confident that students will remember this topic?

### Motivation

Direct Instruction does not have much to say about student motivation. This, in my view, is one of its main shortfalls, as we all know how much of a factor student motivation is for successful learning. However, it is highly focused on achieving student success. This successful progress is a big factor in most of the popular current theories on motivation. I will demonstrate this below by looking at three theories in particular: the Expectancy-Value Theory, Maslow's Hierarchy of Needs, and Dweck's Growth-Mindset Theory.

#### Expectancy-Value Theory

According to this theory, a student's motivation is determined by:
1. How much they value the goal(s) that are set
2. To what extent they expect to succeed at these goals
These factors are said to multiply, so that if either value is zero (or very low), the overall motivation is zero:
Motivation = Expectancy × Value

Expectancy is obviously heavily tied to your previous experiences of success and failure in that subject. Therefore if, as with the DI model, you are regularly and reliably making successful progress from lesson to lesson, your expectancy of success in future lessons will increase.

DI does not say anything about Value, and so other techniques should be added to a DI program in order to increase this. At some point I plan on making a blog post specifically about this.

#### Maslow's Hierarchy of Needs

The crux of this theory is that individuals’ most basic needs must be met before they become motivated to achieve higher level needs. I feel like this plays right to the biggest strength of DI. This is because, whilst enquiry-based learning, project-based learning, collaborative problem solving, open-ended, rich tasks, etc. all attempt to service a student's need for self-actualisation in one way or another, DI is an attempt to service the deeper, psychological need to build self-esteem. In my experience, it does this very successfully with a wide range of students.

#### Dweck's Growth-Mindset Model

This model states that some people believe their success is based on innate ability (with a fixed mindset) while others believe their success is based on hard work, learning, training and doggedness  (with a growth mindset). This theory has been met with serious criticism and I only include it here because of its popularity within the profession.

A student's beliefs about what causes learning are formed by their experiences of their own and others' successful learning. In DI, a student gets to experience rapid and reliably successful learning. The learning even feels easy for them (so long as they follow along with each section). This means that students get to regularly experience the positive effect of practice on their abilities. This experience can be signposted, to make sure that they appreciate it, using several techniques:
• Teacher praise
• Pre and post-learning check-in tests*
• Assessment score tracking for students
• Dan Meyer's headache-aspirin set-up to learning*
*these are not included directly in DI, but I have found adding them useful

#### How does this fit with the theory of desirable difficulty?

This theory states that the best learning happens when students work on problems that are at the right difficulty - challenging but doable. On the face of it, this runs completely counter to the DI model, which aims to make each individual task very easy for students to replicate. Indeed there are several elements of the pure DI model that contradict desirable difficulty:

• Individual tasks are designed to be (or seem to be) easy in all stages
• Feedback is designed to always be instant (or as close as possible to this) whereas some studies have shown that delayed feedback is preferable in certain situations.
• Differentiation is only present in DI in the section under 'correcting mistakes'

However, there are also many ways in which DI fits nicely with the theory of desirable difficulty:
• Spaced retrieval practice/Interleaving: This is at the heart of every DI session (85% of each lesson spent on review)
• The scheme of work remains unchanged; students still learn topics to the same level of difficulty as before - DI just gives a different approach on how to lead up to that level of difficulty.

To account for these discrepancies I have adjusted somewhat:
• I usually do not add any differentiation to initial instruction (as I do want all students to be successful at understanding the central concept). However I do differentiate the expansion and context-shaping activities, as some learners will immediately see wider generalisations and range of applications than others. During these activities I will plan a range of difficulties that students can choose or be directed to.
• For homework (not mentioned in DI) - it is important that it is accessible for all without additional support (it otherwise risks widening the achievement gap). The only way to do this and achieve desirable difficulty is with differentiation. This can be achieved by choice of task or the amount of scaffolding/prompting on the sheet.
• I also try to give feedback instantly during initial instruction (as stated in DI, this is a particularly important time to make sure that students are not learning mis-rules) but will often use delayed feedback for other activities. I still find it difficult to know when to use different types of feedback, but here is a decision tree by improvingteaching.co.uk that aims to help:

## Monday, 22 July 2019

### Principles of Engelmann's Direct Instruction: Lesson structure and Schemes of Work

This post is part of a series where I go into detail on some of the main aspects of Direct Instruction as laid out by Siegfried Engelmann in this book.

### Lesson Structure

Due to atomisation and the speed principle, the lesson can be split into 3-5 minute activities (in practice I find this hard to achieve with expansion and cumulative review activities).

These activities should be a mix of 15% initial instruction on new material and 85% review (including a combination of the different types of review listed here).

Here is an example lesson plan:
1. Starter: review of two recent topics, prompted by pre-written worked examples with similar features (this task needs to be doable without prompting from me as I need to be settling the class and taking register)
2. Prompted review of an older topic (I will pick either one that students have previously struggled with, or one that will be required for the topics to be learnt in coming lessons).
3. Initial teaching of a new topic, blocked practice followed by an expansion activity.
4. Re-teaching of last lesson's new topic, alternating question practice followed by expansion.
5. Context-shaping or question discrimination activity on the new topics.
6. Cumulative review including the topics used in activities 2, 3, and 4.
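As a rough illustration of the arithmetic behind this split, here is a minimal Python sketch. The 60-minute lesson, the 4-minute activity length and the function name `plan_activity_mix` are assumptions made for the example; only the 15%/85% split and the 3-5 minute activity length come from the text above.

```python
def plan_activity_mix(lesson_minutes=60, activity_minutes=4, new_percent=15):
    """Estimate how a lesson divides into short activities and new/review time.

    The default values are illustrative assumptions, not figures from DI.
    """
    # Number of whole activities that fit in the lesson
    activities = lesson_minutes // activity_minutes
    # Time on initial instruction of new material vs. time on review
    new_minutes = lesson_minutes * new_percent / 100
    review_minutes = lesson_minutes - new_minutes
    return {
        "activities": activities,
        "new_minutes": new_minutes,
        "review_minutes": review_minutes,
    }


plan = plan_activity_mix()
print(plan)
```

With these assumed numbers, only about 9 minutes of an hour-long lesson go on new material, which is roughly two short activities; everything else is some form of review.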

### Scheme of Work

A track is a collection of skills grouped not by topic, but by similarity of the required student response. For example, ratio sharing, fraction of an amount, percentages of an amount and pie charts may all be taught in a single track. Each of these skills should be taught using highly similar language and highly similar overt responses. The aim is to reduce teaching time by making the connections in maths (that teachers often take for granted) as clear to students as possible.

Three or four of these tracks should be taught in parallel. That way, the interleaving of topics is woven right into initial instruction. This is very unlike most teaching models, where interleaving, if used at all, is employed only during topic review/revision.

Each track chosen should be highly different (both visually and in terms of required student response) to avoid blending/confusion.

## Friday, 19 July 2019

### Principles of Engelmann's Direct Instruction: Defining range/scope

This post is part of a series where I go into detail on some of the main aspects of Direct Instruction as laid out by Siegfried Engelmann in this book.

### Deciding on the range

Once you have atomised your curriculum into components, you need to think about the range of all possible examples of each of them. When doing this, you should think about how students will use this skill in the long term, not just about the current topic.

For example, if you were teaching "Solving one-step equations using opposite operations", you would need to include equations like:
as well as ×, ÷, + and -.

It is important to note that this does not mean that these all must be taught straight away. Instead, you can show these as examples of the topic that "we haven't learned how to do yet". Then you can teach these sub-components at a separate date before blending them into the other example types.

There are several advantages to this approach:
• It effectively pre-teaches elements of several topics that will later build upon the component (in this example, trigonometry, formulae, solving quadratic equations, etc.).
• It highlights the links between these topics by teaching the components of the topics that require the same response as a single component.
• It allows students to see the full scope of the new skill and its potential applications. This also demonstrates the value of learning the skill.
• By showing the full range and limitations early, you are avoiding stipulation where students under-generalise or over-generalise the skill or pattern.

### Demonstrating range efficiently

This process often involves vastly broadening what may previously have been a simple component. In order to maintain a high pace on your activities, it is important to use an efficient method to demonstrate the full range of a component and its limitations. In DI, this is achieved by juxtaposing examples using the sameness principle and the difference principle.

#### Difference principle

In order to demonstrate the full range of applications of a component, you should juxtapose (put next to) examples and questions that are as different as possible (in a single dimension). This allows students to interpolate all the questions in between that would also work.

For example:

From example 1-2 and 2-3, only one dimension has changed (the number on the right-hand side), but it has changed in drastic ways:

• Between 1-2, the answers turn from natural numbers to negative decimals.
• Between 2-3, students must now divide a decimal by 5.
Between 3-4, a new dimension changes (the co-efficient of x becomes a power).

Through the use of four examples, we have shown how the same technique works in several different situations, and allowed students to infer how to apply the method to many questions in between.

#### Sameness principle

In order to demonstrate the limitations of a component, you should juxtapose pairs of positive/negative examples that are as similar to each other as possible. This allows students to see the exact boundary of where the component stops working and extrapolate to other questions that this skill will not work for. Also, because "any sameness shared by both positive and negative examples rules out a possible interpretation", every dimension that has not changed between the examples cannot be the cause of whether a technique works or does not.

For example:

Each question now looks highly similar, with many features remaining the same, but question 2 cannot be solved as a one-step equation. From this, students will quickly infer that only equations with a single variable can be solved using this technique.

By applying the sameness and difference principles, you can effectively and efficiently show the full range and limitations of a skill.

### Principles of Engelmann's Direct Instruction: Review

This post is part of a series where I go into detail on some of the main aspects of Direct Instruction as laid out by Siegfried Engelmann in this book.

### Review

According to Direct Instruction, only 15% of each session should be on new content, with the rest dedicated to review. This seems minuscule to me and I did not manage to get below 40%, let alone 15%, whilst trying it last term. However, I have massively upped the amount of review that I now plan for in my lessons. This has given me much more scope for the reminding and interleaving of topics and it seems to have made a positive impact on student confidence and grades.

#### Types of review

• Reteach: In the follow-up lesson from initial instruction, reteach the component with a set of examples. This is made feasible as instructions now take <1 minute (see atomisation).
• Blocked practice: Over the next few lessons, practise the skill in a set of questions on this topic only. This may start prompted, with worked examples or some scaffolding, but these prompts should fade until students can perform on the questions without prompts.
• Question discrimination: Where there is another skill whose set-up looks highly similar to the skill just learnt (say worded LCM questions and worded HCF questions), teach how to differentiate between the two and then present a random set of questions on the two topics. Note that to save time, you may get students to only perform the discrimination and perhaps the first step or two of the working.
• Cumulative review/Interleaving: A review of up to 6 topics. These topics will change between each review session, allowing students to refresh their memory on a wide range of skills. The topics should include:
  • The two most recently taught topics
  • One topic that is highly similar in set-up to the most recent topic
  • One topic that students have struggled with in the past (and has been re-taught)
  • Two other topics not reviewed recently
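
The selection rule for a cumulative review can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pick_review_topics`, the data structures and the example topics are all hypothetical, and "not reviewed recently" is interpreted here as "reviewed longest ago".

```python
def pick_review_topics(taught_order, similar_to, struggled, last_reviewed):
    """Choose up to 6 topics for a cumulative review.

    taught_order:  topics in the order they were taught (most recent last)
    similar_to:    dict mapping a topic to topics with a similar-looking set-up
    struggled:     topics students struggled with in the past (and were re-taught)
    last_reviewed: dict mapping a topic to the lesson number of its last review
    """
    if not taught_order:
        return []

    chosen = []

    def add(topic):
        # Add a topic once only; report whether it was actually added
        if topic not in chosen:
            chosen.append(topic)
            return True
        return False

    # 1. The two most recently taught topics
    for topic in reversed(taught_order[-2:]):
        add(topic)

    # 2. One topic that is highly similar in set-up to the most recent topic
    for topic in similar_to.get(taught_order[-1], []):
        if add(topic):
            break

    # 3. One topic that students have struggled with in the past
    for topic in struggled:
        if add(topic):
            break

    # 4. Two other topics, taking those reviewed longest ago first
    others = [t for t in taught_order if t not in chosen]
    others.sort(key=lambda t: last_reviewed.get(t, -1))
    for topic in others[:2]:
        add(topic)

    return chosen


topics = pick_review_topics(
    taught_order=["fractions of amounts", "ratio sharing", "angle rules",
                  "HCF word problems", "percentages of amounts",
                  "LCM word problems"],
    similar_to={"LCM word problems": ["HCF word problems"]},
    struggled=["angle rules"],
    last_reviewed={"fractions of amounts": 3, "ratio sharing": 1,
                   "angle rules": 5, "HCF word problems": 6,
                   "percentages of amounts": 7},
)
print(topics)
```

Because the "other" topics are sorted by when they were last reviewed, running this before each review session naturally rotates through the older material.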

In order to keep track of which topics have been reviewed recently, I use an excel spreadsheet (I - initial instruction, R - review, S - reviewed but struggled with):

#### Building to cumulative review

Also mentioned in DI is a step-by-step approach for building up to the full cumulative review. I have found this particularly helpful with lower-ability sets. Previously, I avoided doing unprompted cumulative reviews with these classes as I knew that students would not be able to remember the topics (indeed they often forget everything from the lesson on the day before!).
Steps to build to cumulative review:
1. Prompted, blocked questions
2. Unprompted, blocked questions
3. Questions alternated between the new topic and a mix of highly different, secure topics
4. Continue number 3, but add more and more questions in between each question on the new topic
5. Full cumulative review (after teaching discrimination with highly similar topics)
The idea is to do two or three of these in a single activity, slowly building up the complexity as students become more comfortable. At first you may do 1-2, then 1-2-3, then 2-3, then 3-4, then 4-5, then only 5 (or you may build it up much faster or slower, depending on the performance of the class).
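
Steps 3 and 4 amount to widening the gap between questions on the new topic. A minimal Python sketch of one way to build such a question sequence is below; the function name `widening_sequence`, the gap growing by exactly one each time and the placeholder question labels are assumptions for illustration, not a rule taken from DI.

```python
from itertools import cycle


def widening_sequence(new_questions, secure_questions, start_gap=1):
    """Alternate new-topic questions with a growing number of secure-topic questions."""
    fillers = cycle(secure_questions)  # reuse secure questions in rotation
    sequence = []
    gap = start_gap
    for index, question in enumerate(new_questions):
        if index > 0:
            # Put `gap` secure questions between consecutive new-topic questions
            for _ in range(gap):
                sequence.append(("secure", next(fillers)))
            gap += 1  # the next gap is one question longer
        sequence.append(("new", question))
    return sequence


sequence = widening_sequence(["N1", "N2", "N3"], ["A", "B"])
for kind, question in sequence:
    print(kind, question)
```

Starting with a larger `start_gap`, or stopping the gap from growing, gives the faster or slower build-up mentioned above.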

DI also recommends avoiding moving on to new topics whilst students are still struggling with the cumulative review.

#### Example of question discrimination

I find SSDD problems to be a great place to start thinking about which topics are similar to the one taught. For some classes, I know that I can give them some SSDD questions immediately after initial instruction and they will be fine with just this. However, for many students, particularly those with English as a foreign language, I find it useful to first teach the discrimination as a correlated feature before they work up to the SSDD problems themselves. So with Pythagoras, for example:

1. First define the correlation: "I know that I can use Pythagoras because I can make a right-angled triangle where I know two sides"
2. Now show various example questions, some on Pythagoras and some on similar-looking topics (like trig). The aim is not to answer the questions, but to decide when Pythagoras is applicable. Demonstrate the first few before students have to answer.
• "Yes" Why? "Because I can make a right-angled triangle where I know two sides"
• "No" Why? "Because I cannot make a right-angled triangle"
• "No" Why? "Because I do not know two sides"
4. Practise this until they are firm.
Notice here that I defined it as "When can I use Pythagoras?" rather than "When should I use Pythagoras?". Otherwise the correlation would become "I know that I should use Pythagoras because I can make a right-angled triangle where I know two sides and I want to work out the third side". This is a further example of atomisation - a way of splitting up the "When should I..." into more easily digestible components - to facilitate a high success rate of initial instruction.

Therefore a final instruction plan would be to teach "When can I..." then "When should I..." before finally actually answering SSDD type problems.
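
The "When can I..." discrimination above is effectively a two-condition rule, which can be written out as a short Python sketch. The function name `can_use_pythagoras`, its parameters and the order in which the two conditions are checked are assumptions for illustration; the three responses are the ones quoted in the list above.

```python
def can_use_pythagoras(has_right_angled_triangle, known_sides):
    """Return the yes/no answer and the reason a student would give."""
    if not has_right_angled_triangle:
        return ("No", "Because I cannot make a right-angled triangle")
    if known_sides < 2:
        return ("No", "Because I do not know two sides")
    return ("Yes", "Because I can make a right-angled triangle where I know two sides")


# Three hypothetical questions: (can a right-angled triangle be made?, sides known)
for question in [(True, 2), (False, 2), (True, 1)]:
    answer, reason = can_use_pythagoras(*question)
    print(answer, "-", reason)
```

Note that the rule only decides whether Pythagoras *can* be used; the separate "When should I..." step would add a further condition about wanting the third side.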