Yesterday saw the first reported accident on the new Saltmarsh Junction at the western end of the Itchen Bridge in Southampton. This is notable as it is the recently opened first step of the implementation of Southampton's ~~Cycle Superhighway~~Eastern Cycle Route. I also previously threatened to look at the junction's specific accident rate – this seemed like a good time.

Helpfully, I've just seen there is a new instruction leaflet on how to use the junction. One wonders, if such a leaflet is necessary, how effective the junction is to begin with, but I digress.

Following my previous post on the junction here, some fortuitous retweets and Reddit, my post received nearly a thousand hits over the following week. I was also at-ted (if that's the verb) in various angry Twitter exchanges about how awful the junction is and how it's an accident waiting to happen. These are points I generally agree with (the other post explains why) though I assume the work was done with the best intentions rather than the malice some assume, but it is the last point I want to pick up on here.

"Accident Waiting To Happen"

Those are emotive words but what are we actually saying? As much as Vision Zero is to be supported and even despite the fact that GB roads are up there with the safest on the planet, we still get in the order of 150,000 recorded injury accidents per year on the roads. To some extent then, all roads (and indeed most activities in life – lest we forget the 40-ish tea-cozy-related A&E admissions each year) carry some degree of risk which, no matter how hard we try, cannot be eliminated. What really concerns us here is: has the junction been made worse (i.e. riskier) by this intervention?

Clearly, we can't say yet. Fortunately accidents happen infrequently enough that you'd want to wait for at least 2–3 years before making that determination statistically. However, one thing we can at least try to answer now:

How likely is it that we will see a cycle accident on this junction, and so what is the probability that this accident would have been observed this soon, given historical trends?

#### Assumptions and SQL

If we were looking at this at a high level, as traffic engineers we would generally consult the DfT WebTAG statistics for a baseline. However, as long(er)-term readers will have seen before, we have access to real accident data, so need not bother with the high-level averages. The SQL-fu I deployed can be found in those posts (save for the addition of WHERE clauses to narrow the geographical extent to the junction and expand the severity to all types from fatal-only).

There are some other statistical assumptions we will work with here:

- Accidents occur as a Poisson process: the number of accidents in a given period follows a Poisson distribution and, more specifically for our purposes, the interval between accidents follows an exponential distribution. This has what's referred to as "no memory", as events are independent, i.e. the occurrence (or not) of an accident yesterday does not affect the probability of observing one today. If we are assuming a constant rate, the result is that measuring the time period from the junction's re-opening 10 days ago, rather than from the previous accident, does not affect our calculations. Which brings us to...
- The accident rate doesn't change with time. A bit questionable but not entirely unsupported by national-level data, as we saw previously.
- Recorded accidents are representative of the true rate. Now this is a big assumption, especially as we are concerned with the interval here. However, we are timing this to an accident that made the local paper. It may therefore be that recordable (but not actually recorded) accidents have happened in the interim and we just didn't hear about them. The underlying reasons behind the under-recording of accidents are likely to be similar and so this is a reasonable assumption.
- Accident locations are accurately represented. STATS19 reports reduce OS references for the location down to a 10m level of accuracy. I've constrained consideration to accidents at or around the junction and as the junction is relatively isolated this is likely to be overall accurate.
- Not quite statistical, but I will only include data from 1999 through 2012. We stop at end-2012 as the 2013 data will not be released until mid-2014. We start at beginning-1999 as I can only be certain (from Google aerial images) that the junction was broadly in its previous state back to that time (though plausibly it was a roundabout since construction in the 1970s).
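To make the "no memory" point concrete, here is a minimal Python sketch. It is mine rather than part of the original analysis, it assumes exponentially distributed intervals, and the 160-day mean and the 10-day windows are illustrative placeholders, not measured values.

```python
import math

def prob_no_accident(days, mean_interval_days):
    """P(no accident within `days`) when intervals are exponential."""
    return math.exp(-days / mean_interval_days)

MEAN_INTERVAL_DAYS = 160.0  # illustrative placeholder, not a measured value
already_waited = 10.0       # days that have already passed without an accident
further_wait = 10.0         # additional days we are asking about

# P(no accident in the next 10 days | none in the first 10 days)
conditional = (prob_no_accident(already_waited + further_wait, MEAN_INTERVAL_DAYS)
               / prob_no_accident(already_waited, MEAN_INTERVAL_DAYS))
# P(no accident in any fresh 10-day window)
unconditional = prob_no_accident(further_wait, MEAN_INTERVAL_DAYS)

print(math.isclose(conditional, unconditional))
```

The two probabilities agree, which is why it does not matter when we start the clock, so long as the rate really is constant.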

Finally, a general point: I formulated my question before running this analysis. People always forget to do that. Also, as I type this, I have no idea what the outcome will be. Either way (or indeed if it is inconclusive), this shall still be posted. People always forget to do that one too.

#### Underlying Stats

Firstly, I SQL down to the 1km OS grid square (442000,111000) – no mean feat as there's a lot of data and Access throws a wobbly when it has to do anything with over 2GB of data (and sadly cycles are involved in about half to two thirds of accidents) – and dump it out to Excel. This gives us 110 cycle accidents recorded in the date range 01/01/1999–31/12/2012.
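For anyone without Access to hand, the grid-square filter can be sketched in a few lines of Python. This is not the query I actually ran. It assumes the coordinate pair names the south-west corner of the square, and the records and their field layout are made up for illustration.

```python
def in_grid_square(easting, northing, origin_e=442000, origin_n=111000, size_m=1000):
    """True if an OS grid point lies in the square whose south-west corner is the origin."""
    return (origin_e <= easting < origin_e + size_m
            and origin_n <= northing < origin_n + size_m)

# Made-up records for illustration only: (easting, northing, involves_cycle)
records = [
    (442350, 111420, True),
    (442900, 111050, False),  # inside the square, but no cycle involved
    (443100, 111500, True),   # outside: easting is beyond the square
]

cycle_accidents_in_square = [
    r for r in records if r[2] and in_grid_square(r[0], r[1])
]
print(len(cycle_accidents_in_square))
```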

Things always look clearer with pictures so, some Helmert Transformation (note an error up to 7m arises as a result of this) to get from OSGB36 to WGS84 that Google Earth uses, exclude those accidents not at or around the junction, and voilà:

[Figure: Saltmarsh Junction Accidents (1999–2012)]

We have 25 recorded injury accidents over the 14-year period occurring on or near the junction, or a little less than 2 per year. The mean interval (i.e. the reciprocal of our rate constant) is 160.3 days between accidents.
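For reference, turning a list of accident dates into intervals and a mean interval looks roughly like this. It is a sketch only: the dates below are invented placeholders (deliberately outside the study period), not the junction's records.

```python
from datetime import date

def intervals_in_days(accident_dates):
    """Gaps in days between consecutive accidents, in date order."""
    ordered = sorted(accident_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]

def mean_interval(accident_dates):
    """Mean gap in days; n accidents give n - 1 gaps."""
    gaps = intervals_in_days(accident_dates)
    return sum(gaps) / len(gaps)

# Invented dates for illustration only
example_dates = [date(2020, 3, 1), date(2020, 3, 2), date(2020, 3, 4), date(2020, 9, 20)]

example_gaps = intervals_in_days(example_dates)
example_mean = mean_interval(example_dates)
print(example_gaps, round(example_mean, 1))
```

The rate constant used later is simply the reciprocal of that mean.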

The distribution of the intervals is perhaps a bit lumpy:

[Figure: Distribution of cycle accident intervals at the Saltmarsh Junction]

Happily though, a quick Chi-Squared test for goodness-of-fit with an assumed Poisson Distribution tells us it's good enough. I love it when a plan comes together.
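If you want to try something similar yourself, one way to run such a check is to bin the intervals and compare the counts with what an exponential distribution predicts. The sketch below is not necessarily how my spreadsheet did it. The intervals, the bin edges and the function names are all made up for illustration, and with so few observations the bins are sparse, so treat any result as rough.

```python
import math

def exponential_cdf(t, mean_interval):
    """P(interval <= t) for an exponential distribution."""
    return 1.0 - math.exp(-t / mean_interval)

def chi_squared_statistic(intervals, bin_edges, mean_interval):
    """Pearson statistic for binned intervals against an exponential fit.

    bin_edges are ascending lower edges starting at 0; the last bin is open-ended.
    """
    upper_edges = list(bin_edges[1:]) + [math.inf]
    n = len(intervals)
    statistic = 0.0
    for lower, upper in zip(bin_edges, upper_edges):
        observed = sum(1 for x in intervals if lower <= x < upper)
        p_upper = 1.0 if math.isinf(upper) else exponential_cdf(upper, mean_interval)
        expected = n * (p_upper - exponential_cdf(lower, mean_interval))
        statistic += (observed - expected) ** 2 / expected
    return statistic

# Made-up intervals in days, for illustration only (not the junction data)
sample_intervals = [3, 12, 30, 45, 70, 95, 120, 160, 210, 260, 340, 500]
sample_mean = sum(sample_intervals) / len(sample_intervals)
bin_edges = [0, 60, 180]  # three bins: [0, 60), [60, 180), [180, inf)

statistic = chi_squared_statistic(sample_intervals, bin_edges, sample_mean)
# 3 bins - 1 - 1 estimated parameter = 1 degree of freedom;
# the 5% critical value of chi-squared with 1 d.o.f. is 3.841.
CRITICAL_5PCT_DF1 = 3.841
print(statistic < CRITICAL_5PCT_DF1)
```

A statistic below the critical value means we cannot reject the assumed distribution at the 5% level, which is all "good enough" means here.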

#### Number-Crunching

So we know our overall rate constant (λ = 1/160.3 accidents per day) and we know that an exponential distribution follows the cumulative distribution function below (i.e. the probability of the event occurring at or before time t):

$$F(t; \lambda) = 1 - e^{-\lambda t}, \quad t \ge 0$$

(Formula as given on Wikipedia.)

So all we need do is plug in the numbers for 10 days and we have a probability of 0.060, i.e. a 6% chance of there being an accident in any given ten days.
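The arithmetic is easy to check. A minimal Python version, using the 160.3-day mean interval from above (the function name is just my own label):

```python
import math

def prob_accident_within(days, mean_interval_days):
    """Exponential CDF: P(at least one accident within `days`)."""
    rate_per_day = 1.0 / mean_interval_days
    return 1.0 - math.exp(-rate_per_day * days)

p_ten_days = prob_accident_within(10, 160.3)
print(f"{p_ten_days:.3f}")  # 0.060
```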

#### What does that even mean?

Ah-ha! The junction's a death trap now then. Well, maybe. We can't say. Yes, perhaps a 6% chance looks pretty definitive to you. It's certainly low. But... of those 25 accidents in the period 1999–2012, one of the intervals is 9 days, another is 1 day and a third is 2 days. In fact, regarding those last two, 2001 saw 3 accidents occur in a 4 day period. So shorter intervals have happened before, a number of times.
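A rough sanity check on that: under the same exponential model, how many intervals of 10 days or less should we expect across the whole data set? The sketch below assumes that 25 accidents give 24 independent intervals, which is my reading of the numbers rather than something spelled out above.

```python
import math

MEAN_INTERVAL_DAYS = 160.3  # mean interval from the text
N_INTERVALS = 24            # assumption: 25 accidents give 24 gaps
WINDOW_DAYS = 10

# Probability that any single interval is 10 days or shorter
p_short = 1.0 - math.exp(-WINDOW_DAYS / MEAN_INTERVAL_DAYS)
# Expected number of such short intervals among all the gaps
expected_short = N_INTERVALS * p_short
# Probability that at least one gap is that short
p_at_least_one_short = 1.0 - (1.0 - p_short) ** N_INTERVALS

print(f"{expected_short:.2f}")
print(f"{p_at_least_one_short:.2f}")
```

That comes out at roughly 1.45 expected short intervals and about a 0.78 chance of seeing at least one, so a handful of short gaps over 14 years is unremarkable under the model.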

To sum up, what does this tell us? Well, someone got hurt, there's no getting away from that. A fair proportion of the fibres of my being (both as a professional and a cyclist) tell me the junction is more dangerous than it need be and I'm not happy about it. But is the junction statistically more dangerous than before? It's simply too early to tell and there is not enough data. As I said at the beginning, a usual analysis would tend to use something in the order of +/- 3 years from intervention.

You perhaps don't need statistics to know some things are dangerous. I'm all right assuming I probably shouldn't drink the LD50 of arsenic in the hope I'm in the 50% that it wouldn't kill. 6% is pretty unlucky and fortunately, as far as I know, the cyclist only suffered minor injuries. There are questions to be answered about what sort of person mows down a cyclist then drives off, and I fear the CCTV coverage of the junction isn't clear enough for the police to get a number plate. I assume, from the appeal for information, that no one bothered to take it down either.

Fortunately, there are no fatalities in the data I've presented here. I just hope this isn't the start of a new, more dangerous Saltmarsh Junction.

#### Data

Data sources are linked at point of appearance above. Additionally, STATS19 data is available from: