Normalizing story points across an ART?


Working in a SAFe shop, 4 teams new to agile on the same release train. Does anyone have any tips/tricks to normalize points across the teams without using man-days?


Gut reaction, trying hard not to be snarky: “New to agile” suggests to me that they probably want to focus on delivering working software for a few sprints before worrying about normalizing anything.

Less snarky, possibly more helpful thought: When looking across teams, I tend to ignore story points and just look at the number of stories.


Snarky response: well duh.

Non-snarky response: the teams are already working on learning how to “do” agile, I’m thinking of next steps, and I have never heard any type of suggestion (good/bad/indifferent) about exercises or activities to help a train normalize.

That make more sense? I typed out the post on the train and didn’t take the time to fully explain :smile:


You might want to give the #NoEstimates idea a chance. It advocates dropping story points altogether, on the grounds that they cause more problems than they solve.

Here is a good - and balanced - article about the idea and the reasoning behind it:


I’m halfway thru Vasco’s book, love it. However…I would have better luck getting pregnant than introducing this concept in the enterprise, agile was a tough enough sell (I do bring up the concept and talk about it, but for now it’s too forward). At this point I’m picking my battles.


IMO… the only way you can normalize is by reverting to person hours.

I have many beefs with SAFe in general, but in particular, the “normalizing story points” thing drives me a little nuts. Just call it what it is: You’re estimating in hours.

This is called out specifically in the framework, actually. 1 pt = .5 days (


@JayHorsecow As far as I know, SAFe doesn’t propose normalising SPs. I’ve searched the SPC 4 training documents and the only reference to normalising is for the quick start (which @mattdominici references), where for your very first PI Planning you use it to help new teams who don’t understand SPs. Thereafter you would never normalise. Have you read which gives a good description of the metrics at each level (team, program, VS and portfolio)? We encouraged using the Program Predictability Measure, where you try to get each team’s committed-to-achieved ratio within the 80-100% range.

@mattdominici Did you notice the bold text below what you refer to?
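
For what it's worth, the committed-to-achieved ratio mentioned above can be sketched in a few lines of Python. This follows only the description in this post (achieved divided by committed, per team, with an 80-100% target band); SAFe's own definition of the measure may differ. The function names, team names and numbers are all made up for illustration.

```python
def predictability(committed, achieved):
    """Ratio of what a team achieved to what it committed to (hypothetical helper)."""
    if committed <= 0:
        raise ValueError("committed must be positive")
    return achieved / committed


def in_target_band(ratio, low=0.80, high=1.00):
    """True if the ratio falls inside the 80-100% band mentioned in the post."""
    return low <= ratio <= high


# Made-up numbers: (committed, achieved) per team, in each team's own units.
teams = {"Team A": (40, 36), "Team B": (12, 9)}

for name, (committed, achieved) in teams.items():
    ratio = predictability(committed, achieved)
    status = "in band" if in_target_band(ratio) else "out of band"
    print(f"{name}: {ratio:.0%} ({status})")
```

Note that each team is only ever compared against its own commitment, so nothing here needs the teams' points to mean the same thing.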


Why did I think I read somewhere that all the teams on an ART should have normalized points? @Leanleff didn’t we talk about this?


Straight from the SAFe framework… here is the thought process behind normalizing points in SAFe:
Relative Estimating, Velocity, and Normalizing Story Point Estimating
Agile Teams use relative estimating [2, 3] to estimate the size of a story in story points. With relative estimating, the size for each story is estimated relative to the size of other stories. The team’s velocity for an iteration is equal to the sum of the size of all the stories completed in the iteration. Knowing a team’s velocity assists with planning and is a key factor in limiting WIP, as teams simply don’t take on more stories than their prior velocity would allow. Velocity is also used to estimate how long it takes to deliver larger Features or Epics, which are likewise estimated in story points.
Normalizing Story Point Estimating

In standard Scrum, each team’s story point estimating—and the resultant velocity—is a local and independent matter. The fact that a small team might estimate in such a way that they have a velocity of 50, while a larger team estimates so as to have a velocity of 12, is of no concern to anyone. In SAFe, however, story point velocity must be normalized to a point, so that estimates for features or epics that require the support of many teams are based on rational economics. After all, there is no way to determine the return on potential investment if you don’t know what the investment is. In order to do this, SAFe teams start down a path where a story point for one team means about the same as a story point for another. In this way, with adjustments for economics of location (the U.S., Europe, India, China, etc.), work can be estimated and prioritized based on economics by converting story points to cost. This is particularly helpful in initial PI planning, as many teams will be new to Agile and will need a way to estimate the scope of work in their PI. One starting algorithm is as follows:
  1. For every individual contributor on the team, excluding the Product Owner, give the team eight points (adjust for part timers).
  2. Subtract one point for every team member vacation day and holiday.
  3. Find a small story that would take about a half-day to develop and a half-day to test and validate. Call it a 1.
  4. Estimate every other story relative to that one.
Example: Assuming a 6-person team composed of 3 developers, 2 testers, and 1 PO, with no vacations, etc., then the estimated initial velocity = 5 * 8 points = 40 points per iteration. (Note: The team may need to adjust a bit lower if one of the developers and testers is also the Scrum Master.)
In this way, all teams estimate the size of work in a common fashion, so management can thereby fairly quickly estimate the cost for a story point for teams in a specific region. They then have a meaningful way to establish the aggregate cost estimate for an upcoming feature or epic.
Note: There is no need to recalibrate team estimating or velocities after that point. It is just a common starting point.
While teams will tend to increase their velocity over time—and that is a good thing—in fact, the number tends to remain fairly stable, and a team’s velocity is far more affected by changing team size, makeup, and technical context than by productivity changes. And, if necessary, financial planners can adjust the cost per story point a bit. This is a minor concern compared to the wildly differing velocities that teams of comparable size may have in the non-normalized case.
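
If it helps to see the starting algorithm quoted above as arithmetic, here is a minimal Python sketch. The function name and parameters are my own (hypothetical, not from SAFe); it only covers the first two steps (eight points per contributor, minus one per day off), since picking the reference story is a team conversation rather than a formula.

```python
def initial_velocity(contributors, days_off=0, points_per_person=8):
    """Starting velocity for one iteration, per the quoted algorithm.

    contributors: individual contributors, excluding the Product Owner
                  (use a fraction such as 0.5 for a part-timer).
    days_off:     total vacation days and holidays across the team.
    """
    if contributors < 0 or days_off < 0:
        raise ValueError("contributors and days_off must be non-negative")
    # Never report a negative velocity if time off exceeds capacity.
    return max(0, contributors * points_per_person - days_off)


# The quoted example: 3 developers + 2 testers (PO excluded), no time off.
print(initial_velocity(contributors=5))  # 40

# Same team with three vacation days in the iteration.
print(initial_velocity(contributors=5, days_off=3))  # 37
```

Since the reference story is "half a day to develop, half a day to test", this is effectively a person-day calculation with a small discount, which is the point several people make elsewhere in this thread.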

Link for Reference


Do you want to normalize story points or do you want to normalize teams?

You could run a magic estimation session on a shared backlog, so that every team starts from the same reference point when estimating.

We have fiddled around with normalized story points for a while. The result was always a single person predefining the story points, which the team could then only argue about rather than estimate themselves. That is not the purpose of story points.


@warren I did notice that, and it kind of confused me a little further due to the way portfolio backlog management works in SAFe.

What is described here: would imply that story points must be normalized across teams.

Here’s why I feel this way:

User stories that make up these features and epics have yet to make it into a team’s workflow. And those features and epics are estimated in story points. So what happens when an epic or feature that was estimated at 1000 points (by people who aren’t doing the work) ends up in the workflow of a team where 1 pt typically gets done in a day vs another team where 1 pt typically gets done in 5 days? Without normalization, your forecasting can be off by a factor of 5 (or more! or less!)
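
To make the factor of 5 concrete, here is a tiny Python sketch using the numbers from this post. The function name is hypothetical, and treating "days per point" as a fixed conversion rate is a simplifying assumption made purely for illustration.

```python
def forecast_days(epic_points, days_per_point):
    """Naive forecast: epic size times a team's typical days per point."""
    return epic_points * days_per_point


epic_points = 1000  # estimated by people who aren't doing the work

fast_team = forecast_days(epic_points, days_per_point=1)
slow_team = forecast_days(epic_points, days_per_point=5)

print(fast_team)              # 1000
print(slow_team)              # 5000
print(slow_team / fast_team)  # 5.0
```

The same 1000-point estimate yields forecasts that differ by the ratio of the two teams' point sizes, so the portfolio number is only meaningful once you know whose points it was estimated in.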

I’d really love to understand how forecasting at the portfolio level would work without teams having normalized story point (hour-based) estimates. I don’t ask that derisively, I honestly would like to know :slight_smile:



Let’s go back to some basic questions we always ask as coaches:

  • “Why might we want to normalize story points?”
  • “What is the problem we are trying to solve?”


  1. Is it for ART capacity planning?
  2. Is it for Feature ROI determination?
  3. Is it “just because some VP said so”?

If it turns out that “normalizing points” does make sense, perhaps try the Bockman Method.

It’s similar to the Magic Estimation technique, except that the “points” only come into play at the end. Things are also less time-based and more about complexity + risk + effort per team.

In my experience normalized points will survive a PI or two. Then each team’s velocity changes, due to any number of factors.

At that point capacity estimates for a PI get a little bit more complicated, but with only a handful of teams it’s not too bad.


For features at the program level, you use Feature Points (FP), which are not the same as Story Points. In effect you could say that FPs are normalised following the process below, which removes the need for normalised SPs. A feature is broken down into multiple stories and each of these stories could potentially be worked on by a different team. The FPs are not just the sum of the SPs of all its stories: the feature is estimated prior to PI Planning, and hence before it is broken down into estimated stories. At the end of a PI, the total FPs of completed features gives you a feature velocity which can be used for planning the next PI.

At my previous company, what we used to do is have refinement sessions towards the end of each PI involving product management, architects, POs and senior technical people to estimate the FPs. For each feature, an estimate was given for size (T-shirt sizes) and duration (how many sprints and how many teams would be required to complete it) - this data was then used to calculate the FP.

In a similar manner, epics and enablers at the portfolio level are the sum of the features that make up that particular item.
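
As a rough illustration of the feature velocity described above, here is a minimal Python sketch. It only shows the summing step (total FPs of features completed in a PI); how an FP is derived from T-shirt size and duration is not covered. The feature names, FP values and function name are all made up.

```python
def feature_velocity(features):
    """Sum the feature points of features completed in the PI.

    features: iterable of (name, feature_points, completed) tuples.
    """
    return sum(points for _name, points, completed in features if completed)


# Hypothetical PI: two of three features were completed.
pi_features = [
    ("Feature A", 13, True),
    ("Feature B", 8, False),
    ("Feature C", 5, True),
]

print(feature_velocity(pi_features))  # 18
```

Only completed features count, so a feature that slips contributes nothing to this PI's number, however many of its stories were finished.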


Thanks all for the suggestions/feedback. I did some more digging and it appears the impetus for this idea is ostensibly to start comparing teams to one another regarding velocity, productivity, etc. So now I have a completely different battle to fight!


Ah, I was missing that there was a disconnection between points at the story and feature level. I still have my reservations, but I do now see how it’s not dependent on normalizing estimates at the team level - thank you!

What calculation is done between the T-shirt size and duration, that allows you to arrive at a feature point estimation?


Jay - this topic sends me into a blind rage every time it comes up - and it comes up beyond just SAFe. Every “leader” with little agile experience wants to know why an “underperforming team” isn’t as good as one that delivers the “most” points. So I want to make sure people don’t think this is just a SAFe issue.

But - let me be an enabler for you…

If you want to normalize points, that equation is basically a man-day calculation. If you are committed (or if your enterprise is) to using this model, make every story no more than 2 points. Every story needs to be decomposed into half-day, 1-day, or 2-day increments of work. This will take estimation in Fibonacci out of the equation, and force teams to deliver a series of micro stories. But they shouldn’t deviate too far from estimates, because the stories are so small.

Also - a word of caution… if your teams are highly imbalanced in their DEV:QA ratios, or if you have a metric shit ton of manual testing, or if your environments are as similar as snowflakes or fingerprints - this will be hard to achieve. There are probably no fewer than 3 dozen other reasons why this will break, but I would start with extremely small stories and see how that works.

Then while you distract people with this, you can start coaching the people asking for it on why it is awful. Or for pure amusement - ask them how they will normalize points across trains. :joy:


I typed about 5 different comments and ended up deleting them all because I didn’t like how they sounded.

My 2 cents: Normalization is a band-aid for the bigger problem of trying to know the unknown. Teams become more effective at estimating as the unknown becomes less complex and ambiguous. Normalization implies that “someone” is dictating what should or shouldn’t be a certain size. This would be similar to asking an inventor to normalize the complexity of creating their ideas.


I admit to feeling a little vindicated in my initial snarky reaction. :head_bandage:

Also, I can totally identify with your predicament. One tool I found helpful in a recent coaching situation, with a team that was being unfairly compared to other teams, was to remind everyone of Tuckman’s stages. The teams that were judged “better” had been together longer and were well into the performing stage. The team that was judged “at risk” was newer and recently had people added to and removed from the team, so they were clearly going through a prolonged storming stage. The thing is, once I actually talked to the people on the team, it became apparent that they were already transitioning into norming on their own – I really didn’t have to do anything other than pointing it out to everyone. That actually helped management chill out a bit, and they became more thoughtful about the impact of making further changes to the membership of the team.


Recently overheard a VP suggest all should move to normalization of story points - across 30 teams. Same 1 pt = 1 dev day, under the guise of “makes capacity calculations easier”

One of my team leads asked if we could script JIRA to create one ticket per dev per day of the sprint and run the script during iteration planning.

Then when a dev-team member showed up for work in the AM, they simply moved their ticket to done.

All real work would then be done on post its.


Not sure if I should applaud the brilliance of the script or cry because of the VP.

I actually defused this whole discussion in my org and no one is even asking anymore. That’s a win in my book.