

Why leadership development programs fail without assessment

Simon Carvi · May 8, 2026 · 9 min read

Most companies run a leadership development program by buying training. They book the workshops, design the curriculum, deliver the content, and count completions. The data they did not collect before any of that started is the only thing that could have shown whether the program ever changed behavior on the team.

The pattern is the same in every failed program

The shape is consistent. HR books a leadership program for the manager population. Two days off-site, four virtual modules, an executive sponsor at kickoff. Participants leave energized. Six months later, peers and reports describe their managers in the same language they used before the program. The program ran. The metric the program was meant to move did not move.

The reason is not the trainer or the content. It is that the program was designed without a diagnosis. There was no role baseline. There was no assessment of where each participant actually stood on the competencies the program claimed to develop. There was no observable target. The curriculum was generic because the gaps were unknown, and the gaps were unknown because nobody assessed them.

This is the default state of corporate leadership training. The fix is not better content. It is sequencing assessment in front of training, and using the assessment data to decide who needs which content.

Why does leadership training fail without assessment? Because training without a baseline cannot target a gap, and training without observable behaviors cannot prove movement. Assessment defines what good looks like at each role, surfaces the gap between current and target proficiency, and gives the development plan something specific to practice. Without it, training is content delivery, not capability building.

What “fails” actually means

Three failure modes show up in leadership development programs that skipped assessment.

No targeting. Every participant gets the same content because there is no data to differentiate them. A senior manager strong on coaching but weak on strategic framing sits through the same coaching module as a first-line manager who has never given written feedback. Both finish the program. Neither is meaningfully more capable of the work their role actually requires.

No measurable outcome. Completion counts as success. The program report shows attendance and a satisfaction score. Neither number tells anyone whether managers are now coaching their teams more often, framing decisions more clearly, or holding their reports accountable more consistently. The metric that would answer those questions was never collected, because nobody took a baseline before the program began.

No behavioral target. The program objective is “improve leadership capability.” That phrase is not coachable. It does not name a behavior, a frequency, or a context. Without behavioral anchors at each proficiency level, neither the trainer nor the participant can describe what success looks like at the end of a module, let alone six months out.

Three reasons assessment changes the outcome

Assessment is not an add-on to a leadership development program. It is the layer that decides whether the program can target, measure, or improve anything.

A baseline gives the program a target. Define the role profile (the competencies and target proficiency level for each leadership level), assess each participant against it, and the gap is now specific. A participant rated proficient on coaching but advanced beginner on decision framing has a different program than someone with the inverse profile. The curriculum stops being a conveyor belt. It becomes a set of modules selected against actual gaps.
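
As a minimal sketch of that gap computation, assuming each competency is scored as a Dreyfus integer from 0 to 5. The competency names, targets, and levels here are illustrative, not Huneety's schema:

```python
# Gap analysis sketch: role profile targets vs. assessed levels.
# Competency names, targets, and levels are illustrative only.

ROLE_PROFILE = {  # target Dreyfus level (0-5) per competency for the role
    "coaching": 4,
    "decision_framing": 4,
    "written_feedback": 3,
}

def gap_report(assessed: dict[str, int]) -> dict[str, str]:
    """Classify each competency as below / at / above the role target."""
    report = {}
    for competency, target in ROLE_PROFILE.items():
        level = assessed.get(competency, 0)  # unassessed counts as novice
        if level < target:
            report[competency] = f"below target ({level} vs {target})"
        elif level == target:
            report[competency] = "at target"
        else:
            report[competency] = f"above target ({level} vs {target})"
    return report

# Two participants with inverse profiles get two different programs.
print(gap_report({"coaching": 4, "decision_framing": 2, "written_feedback": 3}))
print(gap_report({"coaching": 2, "decision_framing": 4, "written_feedback": 3}))
```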

Multi-rater data reduces single-manager bias. A leadership program designed off the manager’s view of the participant carries the manager’s blind spots into the design. Self, peers, and direct reports see different angles of the same person. The aggregate is closer to what the team experiences than any single rater on their own. The 360 also surfaces self-awareness gaps directly: when self-rating diverges from rater-rating, the development conversation has somewhere concrete to start.
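
A sketch of the divergence check, assuming all rater groups use the same 0 to 5 scale. The numbers and the one-level threshold are illustrative choices, not a prescribed cutoff:

```python
# 360 divergence sketch: compare the self-rating with the mean of all
# other rater groups on one competency. Numbers are illustrative.
from statistics import mean

ratings = {  # rater group -> rating(s) on "coaching"
    "self": 4,
    "manager": 3,
    "peers": [2, 3, 3],
    "reports": [2, 2, 3],
}

others = [ratings["manager"], *ratings["peers"], *ratings["reports"]]
divergence = ratings["self"] - mean(others)

# A self-rating a full level or more above the raters is a concrete
# self-awareness gap to open the development conversation with.
if divergence >= 1.0:
    print(f"self rates {divergence:.1f} levels above raters on coaching")
```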

Behavioral anchors make the program coachable. A competency rated on the Dreyfus 0 to 5 scale, with an observable behavioral anchor at each level, gives the trainer a reference point and the participant a recognizable target. “Move from level 3 to level 4 on giving written feedback” is a goal a coach can work with. “Improve coaching skills” is not.

The two views: role baseline and 360 perception

Most leadership development programs need both, and skipping either produces a half-built program.

A role baseline assessment compares the leader against the expected proficiency level for their role. The output is a gap analysis: which competencies are below target, which are at target, which exceed it. It answers the question: what should this role be capable of, and where does this person stand against that? The aggregate view tells the company where the leadership bench is thin across the population, which is the input for program scoping in the first place.

A 360-degree assessment captures perception from self, manager, peers, and reports against the same competency framework. The output is a perception map: where the leader is rated by each rater group, where self-perception aligns or diverges from rater perception, and which behaviors show up as strengths or risks. It answers a different question: how is this person actually showing up in the work?

The two views share a vocabulary because they run against the same competency framework. Run both, and the development plan can target the gap between what the role requires and what raters observe. Run one, and the plan is built on half the picture.

Two views, one framework

Role baseline

  • Compares the leader against expected proficiency for the role
  • Output: gap analysis (below / at / above target)
  • Aggregate view scopes the program at the population level
  • Answers what the role should do, and where this person stands

360 perception

  • Captures self, manager, peer, and report ratings
  • Output: perception map across rater groups
  • Self-vs-rater divergence surfaces self-awareness gaps
  • Answers how this person is actually showing up at work

Reference

The 7 leadership competencies and 21 behaviors

Huneety's leadership taxonomy: 3 pillars, 7 competencies, behavioral anchors at each Dreyfus level, and the rating scales used in the consultant 360.

See the framework

What gets measured: the leadership behaviors

A leadership program cannot be assessment-led if leadership stays undefined. The Huneety leadership framework, used inside our 360 product, organizes the work into 3 pillars, 7 competencies, and 21 observable behaviors. Each behavior is rated by frequency on a 5-point scale, not as a personality judgment. The structure is the vocabulary that makes assessment, IDP design, and re-assessment all run on the same definitions.

Pillar 1: Lead Self. How the leader manages their own internal state. Two competencies. Self-Awareness & Self-Management covers understanding emotions and triggers, seeking and integrating feedback, and regulating emotions under pressure. Resilience covers overcoming setbacks and barriers, maintaining progress and quality, and learning from difficult experiences.

Pillar 2: Lead Others. How the leader builds, develops, and influences the team. Three competencies. Team Leadership covers delegation and clear team ownership, building motivation and engagement, and driving performance outcomes. Influential Communication covers tailoring messages to different audiences, gaining stakeholder buy-in, and countering resistance in conversations. Developing People covers coaching and mentoring, providing developmental feedback, and setting development goals and plans.

Pillar 3: Lead Results. How the leader makes calls and gets things done. Two competencies. Decision-Making covers assessing risks before acting, evaluating strategic trade-offs, and taking ownership of decisions. Execution & Accountability covers following through on decisions, maintaining quality standards, and taking ownership of outcomes.
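
Because the taxonomy is small and fixed, it can be written down as plain data, which is what lets the assessment, the IDP, and the re-assessment share one vocabulary. A sketch in Python; the names are taken from this section, but the structure is illustrative, not Huneety's actual schema:

```python
# 3 pillars, 7 competencies, 21 behaviors, encoded as plain data.
# Names come from the section above; the layout is illustrative.

LEADERSHIP_FRAMEWORK = {
    "Lead Self": {
        "Self-Awareness & Self-Management": [
            "Understanding emotions and triggers",
            "Seeking and integrating feedback",
            "Regulating emotions under pressure",
        ],
        "Resilience": [
            "Overcoming setbacks and barriers",
            "Maintaining progress and quality",
            "Learning from difficult experiences",
        ],
    },
    "Lead Others": {
        "Team Leadership": [
            "Delegation and clear team ownership",
            "Building motivation and engagement",
            "Driving performance outcomes",
        ],
        "Influential Communication": [
            "Tailoring messages to different audiences",
            "Gaining stakeholder buy-in",
            "Countering resistance in conversations",
        ],
        "Developing People": [
            "Coaching and mentoring",
            "Providing developmental feedback",
            "Setting development goals and plans",
        ],
    },
    "Lead Results": {
        "Decision-Making": [
            "Assessing risks before acting",
            "Evaluating strategic trade-offs",
            "Taking ownership of decisions",
        ],
        "Execution & Accountability": [
            "Following through on decisions",
            "Maintaining quality standards",
            "Taking ownership of outcomes",
        ],
    },
}

behaviors = [b for competencies in LEADERSHIP_FRAMEWORK.values()
             for bs in competencies.values() for b in bs]
assert len(behaviors) == 21  # 3 pillars, 7 competencies, 21 behaviors
```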

How one competency breaks into 3 observable behaviors
Developing People
  • Coaching and mentoring
  • Providing developmental feedback
  • Setting development goals and plans

Each behavior gets a Dreyfus 0 to 5 anchor describing what good looks like at each level. “Move from level 3 to level 4 on providing developmental feedback” is coachable because both rater and ratee can describe what level 3 and level 4 actually look like in real conversations. “Improve coaching skills” is not.
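
A sketch of how an anchor turns that into a concrete goal. The anchor wordings below are invented for illustration; a real framework writes its own:

```python
# Behavioral anchor sketch for one behavior at two adjacent levels.
# Anchor texts are hypothetical examples, not Huneety's anchors.

ANCHORS = {
    "Providing developmental feedback": {
        3: "gives specific written feedback after major deliverables",
        4: "gives written feedback routinely, tied to a development goal",
    },
}

def coaching_goal(behavior: str, current: int, target: int) -> str:
    """Render a goal both rater and ratee can recognize in real work."""
    levels = ANCHORS[behavior]
    return (f"{behavior}: move from level {current} "
            f"({levels[current]}) to level {target} ({levels[target]})")

print(coaching_goal("Providing developmental feedback", 3, 4))
```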

From gap to development plan

A gap report is not a development plan. It is the input to one. The plan is what closes the loop between assessment and behavior change.

The 70/20/10 framework applies once the gap is specific. On the job (70 percent) means a stretch assignment that requires the missing behavior. A leader weak on decision framing gets a project where the deliverable is a written decision memo with the rationale, alternatives considered, and reversal conditions. The behavior is the deliverable. Through others (20 percent) means pairing with a peer or mentor scoring at least one Dreyfus level above the participant on the same competency. The peer demonstrates the behavior in real meetings, not in a classroom. Formal learning (10 percent) means a module targeted to the specific skill: a 45-minute course on written decision framing, not a 12-hour generic leadership program.

This is where most leadership development programs invert the ratio. They allocate the bulk of the budget to formal learning because that is what shows up on a vendor invoice. The 70 and the 20 are operational, not procurable, and require manager involvement to design and supervise. They are also where the actual behavior change happens.

The IDP is short, specific, and reviewable. Each goal names a competency, a behavioral target on the Dreyfus scale, the on-the-job assignment that will produce the behavior, the peer or mentor providing through-others learning, the formal learning module, and the artifact that would prove movement (a memo, a meeting recording, a 1:1 note, a feedback message). If a goal cannot be written this way, the gap is still too abstract to develop against.
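
One way to make that last rule mechanical is to model the goal as a record and treat any empty field as a sign the gap is still too abstract. The field names here are illustrative, not a product schema:

```python
# IDP goal sketch: one record per goal, with the six fields named above.
from dataclasses import dataclass, fields

@dataclass
class IDPGoal:
    competency: str       # e.g. "Decision-Making"
    dreyfus_target: str   # e.g. "level 3 -> level 4 on decision framing"
    on_the_job: str       # the 70: stretch assignment producing the behavior
    mentor: str           # the 20: peer/mentor a level above on this competency
    formal_learning: str  # the 10: targeted module, not a generic program
    artifact: str         # proof of movement: memo, recording, 1:1 note

def is_developable(goal: IDPGoal) -> bool:
    """If any field cannot be filled in, the gap is still too abstract."""
    return all(getattr(goal, f.name).strip() for f in fields(goal))

goal = IDPGoal(
    competency="Decision-Making",
    dreyfus_target="level 3 -> level 4 on decision framing",
    on_the_job="Own the vendor decision; deliver a written decision memo",
    mentor="Peer director rated level 4+ on decision framing",
    formal_learning="45-minute course on written decision framing",
    artifact="The decision memo itself",
)
assert is_developable(goal)
```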

How to know if the program worked

The honest measure of a leadership development program is whether the competency scores moved between assessment cycles. Re-assess against the same framework, with the same rater types, after the development cycle has run. The score movement is the answer.

Three patterns show up in the re-assessment data.

A score that moved up on a competency that was below target, with corroborating data from peers and reports, is the program working. The behavior is observable on the team, which is what the program was supposed to produce.

A score that did not move, despite training completion, is a program that delivered content but did not change behavior. The cause is usually that the development plan stayed in the formal learning lane and never engaged the 70 or the 20. The participant attended the workshop. They did not practice the behavior on a real assignment. Re-assessment makes this visible. Completion data does not.

A score that moved on the self-rating but not on rater scores is self-perception inflation, not a development gain. The participant believes they are operating differently. The team does not see it. This pattern is common in programs that emphasize confidence and presentation over observable behaviors.
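
A sketch of that three-way read, assuming baseline and re-assessment scores are averaged per rater group. The half-level threshold is an illustrative choice, not a standard:

```python
# Re-assessment sketch: classify score movement into the three patterns.

def classify_movement(raters_before: float, raters_after: float,
                      self_before: float, self_after: float) -> str:
    raters_moved = raters_after - raters_before >= 0.5
    self_moved = self_after - self_before >= 0.5
    if raters_moved:
        return "program working: the behavior is observable on the team"
    if self_moved:
        return "self-perception inflation: participant sees it, team does not"
    return "content delivered, behavior unchanged: check the 70 and the 20"

print(classify_movement(raters_before=2.4, raters_after=3.3,
                        self_before=3.0, self_after=3.5))
```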

The point is not to grade the program. It is to feed the next cycle. The competencies where scores did not move become the program’s input for next year, and the assessment-led design starts again with a sharper baseline.

How to redesign your leadership development program around assessment

The sequence is not complicated. The discipline is in not skipping any step under deadline pressure.

Assessment-led leadership program
  1. Define the framework

    Leadership competencies with behavioral anchors at each Dreyfus level.

  2. Assign role profiles

    8–12 competencies and target proficiency per leadership role.

  3. Run the baseline

    Assess each participant against the role profile. Output: gap report.

  4. Run the 360

    Self, manager, peers, reports rate the same competencies. Output: perception map.

  5. Build the IDP

    Each goal: competency, target, on-the-job assignment, mentor, learning module, artifact.

  6. Deliver the program

    Mix on-the-job, peer pairing, and targeted formal learning per IDP.

  7. Re-assess

    Same framework, same rater types. Score movement is the metric.

Start by defining the leadership competency framework with behavioral anchors at each Dreyfus level. Without anchors, the rest of the program runs on impression.

Assign role profiles. For each leadership role (first-line manager, senior manager, director, executive), select 8 to 12 competencies and set the target proficiency. The role profile is the assessment criteria, not a job description.

Run a baseline assessment against the role profile. The output is the gap report by individual and the aggregate gap report by role and population. The aggregate is the program scoping input.

Run a 360 assessment against the same competency framework. The output is the perception map per individual. Compare against the baseline. The two views together drive the IDP.

Build IDPs from the gap data. Each plan names competency, target, on-the-job assignment, mentor pairing, formal learning module, and artifact. Manager and HR review.

Deliver the program. Mix on-the-job assignments, peer pairing, and targeted formal learning against the IDP. The program is no longer one curriculum delivered to everyone. It is a portfolio of development paths, each targeted to a specific gap.

Re-assess against the same framework after the cycle. Score movement is the metric. Use it to scope the next year’s program.
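
A sketch of that scoping step: count, across the population, which competencies did not move, and rank them as the next cycle's input. Names and data are illustrative:

```python
# Next-cycle scoping sketch: competencies whose rater scores did not
# move become next year's program input. Data is illustrative.
from collections import Counter

# Per participant: competencies still below target after re-assessment.
unmoved = [
    ["decision_framing", "written_feedback"],
    ["decision_framing"],
    ["coaching", "decision_framing"],
]

counts = Counter(c for gaps in unmoved for c in gaps)
for competency, n in counts.most_common():
    print(f"{competency}: {n} participants still below target")
```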

Assessment is the first step. The roadmap is what comes next.

Assessment is data, not change. Between the baseline and the re-assessment a year later, three roles own different parts of the loop. Skip any one and the IDP becomes a document HR keeps in a folder.

The participant owns the development plan. They commit to the on-the-job assignments, attend the formal learning, and ask their peer or mentor for the through-others practice. Without participant ownership, the rest of the system has nothing to support.

The participant's manager owns accountability. The manager assigns the stretch project, observes whether the new behavior shows up in real meetings, and signals during the cycle whether the participant is on track. A development program without manager involvement is a development program without a feedback loop.

HR owns monitoring. HR runs the cadence (quarterly check-ins are a reasonable rhythm), tracks IDP progress at the population level, surfaces stalled plans, and operates the re-assessment. HR is not coaching individuals; HR is holding the system accountable.

Built for HR teams

Leadership assessment in Huneety

Role baselines and 360 reports against your competency framework, behavioral anchors built in, gap reports that drive IDPs.

See how it works

Frequently asked questions

Why does a leadership development program fail without assessment?

A leadership development program without assessment cannot target a gap, measure movement, or differentiate participants by need. Training is delivered to everyone uniformly, completion replaces capability as the success metric, and the program's effect on actual leadership behavior is invisible. Assessment defines the target, surfaces the gap, and produces the post-program measurement that says whether the program worked.

What is a leadership development program?

A leadership development program is the structured set of activities that build defined leadership competencies in a target population over a development cycle. It usually combines on-the-job assignments, peer or mentor pairing, and formal learning. An assessment-led program defines the competencies and target proficiency first, assesses participants against that baseline, and designs the activities to close specific gaps.

Should 360 assessment results be used for compensation decisions?

No. The moment raters believe their feedback affects compensation, the data flattens. Peers cluster around safe scores, self-ratings inflate, and the program's data input collapses. Use 360 assessment results for development plans only. Use performance reviews, on a separate cycle, for compensation decisions.

How is a 360 assessment different from a performance review?

A 360 assessment uses self, manager, peers, and reports to rate observable behaviors against a competency framework, on a development scale such as Dreyfus 0 to 5. The output is a gap report that drives a development plan. A performance review uses the manager (sometimes with skip-level oversight) to rate output against goals and KPIs. The output is a comp letter that drives bonus and promotion decisions. Different methodology, different decision, different cycle.

How long does an assessment-led leadership development cycle take?

A typical cycle runs 9 to 12 months. Framework definition and role profile assignment take 4 to 8 weeks. The baseline and 360 assessments run over 4 to 6 weeks. IDPs are written over 2 to 3 weeks. The development cycle itself runs 6 to 9 months, with the re-assessment in the final month. Subsequent cycles compress the framework and role profile work because both already exist.

Ready to close the gaps?

Book a demo. We'll show you how it works with your competency framework.