Why don't we model bicycles?

Published 2014-07-04 at 07:49

Tl;dr: We can't, but we're working on it.

Ok, so you didn't click here for that. The question is poorly framed, though. Put simply, the answer is indeed that we can't. But we (the field) are trying to do something about that, I promise.

Spurred by an average evening of Twitter ire about A. N. Other Scheme, which apparently serves to make no one happy, this post will explore a little of why we can't model bicycles, rather than why we don't. So let's go; it's a long one.

Transport Modelling

For all the progress we have made in maths, computation and engineering, the design of highway space is in no small part still informed by the core principles of the 70-ish years of modelling behind it. Manuals for Streets, Streetscape design and similar – much though we like to think they are informed by modern design principles and aspirations – are still driven (pun intended) by the need to get traffic safely through a space, whatever form that traffic might take.

Now, we can argue about the best way to do that and whether we should be designing for the modes we want, not the modes we have, but this isn't about that. At the top level, the scale of a space as originally constructed, or as reallocated, is defined by the volumes of flow we expect through it. We could go out and measure them today if we wanted to, but we are usually building schemes with an expected life of 60 years and a realistic life of probably well over a century, so today's problems are only a small part of informing the design process. A stepping-off point, as it were. Predicting the future is a matter fraught with peril. You will be wrong; there can be certainty only of that. The question is, how wrong, and does it matter?

The Four-Stage Model

The classic 4 Stage Model

Almost all, if not all, transport models are built around the core of what is referred to as "The Four-Stage Model". Pictured right, the model is so called (somewhat obviously) because it simplifies the modelling process down to four stages. Overall, the modelling process involves: working out where people are; working out where people want to go; working out how they will choose to get there; and working out what route they will take to do so.

The following explores each of those and how they relate to cycling. Please note that I am massively oversimplifying the process of transport modelling here for the purpose of brevity. Doing this properly can (and does) take many months, and explaining it fully would take an entire semester-long module.

Trip Generation

Trip generation is the process of working out where people are and how much they will travel. A trip is defined by its start and end points, and trip generation is the process of fixing these start and end points.

We do this by first defining our area of interest. This could be the scale of a city or the scale of a country. As an example, for a scheme of 6000 houses I worked on in the Thames Valley, we needed to consider trips from the whole of Great Britain. This is not unusual.

The area of interest is broken up into zones. In that Thames Valley example, Scotland was one zone and the North two, with zones becoming progressively smaller the closer they were to the immediate area of concern. Ideally, we define zones so they coincide with Local Authority and/or census areas, which gives us a good definition of population; or at least, as good as the census. We then use this information along with other information (e.g. on the local scale we will likely have household surveys) to produce trip production estimates. I.e. trips are produced based on where people live.

Separately, though with the same zones, we use employment information, land use observations etc. to establish trip destinations. I.e. trips are "consumed" based on where people work. Note that I'm discussing employment here as we tend to consider the commuter peaks. A proper model will repeat this process for a range of trip types. In some circumstances, such as around a large shopping centre or a stadium, non-employment-based trips will be a (if not the) major contributor. I'm also talking road trips, but the principles are the same whether we mean train, tube or flight routes.

The information we have to make these measures is often not great. Census data is often out of date. Zone definitions do not match authority or statistical definitions. Much comes down to the skill of the modelling team to make educated guesses as to appropriate (and replicable) determinations with (ideally) errors cancelling each other out.
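As a very rough sketch of the production and attraction steps described above (the zone figures and trip rates are invented for illustration, and a real model would derive them from census and survey data), the calculation for a single trip purpose might look something like this:

```python
# Hypothetical zone data: households (where people live) and jobs (where they work).
zones = {
    "A": {"households": 1200, "jobs": 300},
    "B": {"households": 800, "jobs": 1500},
    "C": {"households": 500, "jobs": 900},
}

# Assumed rates for the morning commuter peak; real values come from surveys.
TRIPS_PER_HOUSEHOLD = 0.8
TRIPS_PER_JOB = 0.6


def trip_ends(zones):
    """Return (productions, attractions) per zone for one trip purpose."""
    productions = {
        name: data["households"] * TRIPS_PER_HOUSEHOLD for name, data in zones.items()
    }
    attractions = {name: data["jobs"] * TRIPS_PER_JOB for name, data in zones.items()}
    return productions, attractions


productions, attractions = trip_ends(zones)
total_produced = sum(productions.values())
total_attracted = sum(attractions.values())
print("produced:", total_produced, "attracted:", total_attracted)
```

Because the two totals come from different data sources, they will generally disagree, which is one of the things the later stages have to reconcile.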

Trip Distribution

The trip distribution process comprises linking the trip productions with trip destinations to form origin destination pairs. There are a range of different methods for this, each with various issues. Again though, it is an educated decision as to which is taken. Despite what a client or member of the public might think about the end product, a model is only as robust as the data that goes into it.

A catch-all term for this process (though it does have a specific meaning) is "matrix estimation". The Origin-Destination matrix we end up with is what feeds into the model you're thinking of when you think of transport modelling. Remember though that the whole model is backed by OD Matrices. Like sausages and laws, you probably don't want to know how they're made.

As I say, there are a range of methods. A common method is the use of roadside surveys. People are stopped as they cross a cordon line and amongst other things are asked for their home and destination post codes. This is why we ask you this whenever you're surveyed. We (probably) aren't trying to steal your personal information and it really does help us massively.

The results of the survey give us a real origin-destination matrix of real people on real trips. What we need to do as modellers is link our calculated productions and attractions to one another such that we produce an OD matrix that resembles reality. I won't go into it here but if you want to look it up, commonly a gravity model is used for this. We then find that our productions and attractions don't match up; there is a surplus of one or the other (different information sources, errors etc.). To correct this we then undergo a process of matrix balancing which (again, there is a range of methods) is an iterative process by which we adjust the matrix until it balances numerically whilst still retaining overall proportions in line with OD surveys.
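To make that a little more concrete, here is a minimal sketch of a gravity model followed by one common balancing approach, usually called the Furness method (or iterative proportional fitting). The exponential deterrence function, the `beta` value, the costs and the zone totals are all assumptions chosen for illustration; nothing here says which method any given model uses.

```python
import math


def gravity_seed(productions, attractions, cost, beta=0.1):
    """Unbalanced gravity matrix: trips fall off exponentially with travel cost."""
    return [
        [p * a * math.exp(-beta * cost[i][j]) for j, a in enumerate(attractions)]
        for i, p in enumerate(productions)
    ]


def furness(matrix, row_targets, col_targets, iterations=50):
    """Alternately scale rows and columns until both sets of totals match."""
    m = [row[:] for row in matrix]
    for _ in range(iterations):
        for i, target in enumerate(row_targets):
            total = sum(m[i])
            if total > 0:
                m[i] = [v * target / total for v in m[i]]
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in m)
            if total > 0:
                for row in m:
                    row[j] *= target / total
    return m


# Hypothetical trip ends for three zones; the raw totals deliberately disagree.
productions = [960.0, 640.0, 400.0]
raw_attractions = [180.0, 900.0, 540.0]

# Scale attractions so the grand totals agree before balancing (a modelling choice).
scale = sum(productions) / sum(raw_attractions)
attractions = [a * scale for a in raw_attractions]

# Hypothetical zone-to-zone travel costs in minutes.
cost = [[5, 15, 25], [15, 5, 12], [25, 12, 5]]

od = furness(gravity_seed(productions, attractions, cost), productions, attractions)
for row in od:
    print([round(v) for v in row])
```

In practice the seed matrix would be pulled towards the surveyed OD pattern rather than relying on a deterrence function alone, but the alternating row-and-column scaling is the same idea.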

You'll have noticed again that there is some haziness in this process, and we haven't even started on the network yet. Nor have we worried about a bike.

Mode Choice

This is a messy process by which the probabilities of going by car, bike, public transport, plane, whatever, are used to determine which of the OD paired trips will be undertaken by which mode. The underlying data comes from travel surveys, demographic information and research. I oversimplify spectacularly here, but the split is normally based on the relative valuations of the different modes: the traveller seeks to maximise their own personal utility (or technically minimise disutility; people don't want to travel), and their choices are expressed as relative probabilities.
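One standard way of turning disutilities into probabilities is a multinomial logit model. The sketch below assumes that form; the disutility values and the `scale` parameter are made up, and calibrating them for cycling is exactly the sort of data this post argues we lack.

```python
import math


def mode_shares(disutility, scale=1.0):
    """Multinomial logit: a lower generalised cost gives a higher probability."""
    weights = {mode: math.exp(-scale * cost) for mode, cost in disutility.items()}
    total = sum(weights.values())
    return {mode: weight / total for mode, weight in weights.items()}


# Hypothetical disutilities (in generalised minutes) for a single OD pair.
disutility = {"car": 25.0, "bus": 35.0, "bike": 40.0}
shares = mode_shares(disutility, scale=0.1)

# Split the trips for that OD pair across the modes.
trips = 500
trips_by_mode = {mode: trips * share for mode, share in shares.items()}
for mode, count in trips_by_mode.items():
    print(mode, round(count, 1))
```

The interesting (and hard) part is not the formula but the inputs: what goes into a cyclist's disutility, and how it changes with weather, traffic or season.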


Trip Assignment

The final stage of the process is trip assignment. The trips for each mode are assigned to the appropriate model network (e.g. rail and vehicles have different networks). The network is built from observation of the real-life one. Nodes (junctions) are connected by links (the bits between junctions), and the capacity of each link and node in the network can be estimated based on its type and/or measured, with speed-flow curves (as traffic increases, speeds fall) used to determine the level of congestion on the link.
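As an illustration of what a speed-flow relationship looks like, here is a sketch using the Bureau of Public Roads (BPR) function, one widely used form. This is my choice of example, not necessarily the curve any particular model uses, and the link length, free-flow time and capacity are invented.

```python
def bpr_travel_time(free_flow_time, flow, capacity, alpha=0.15, beta=4.0):
    """BPR volume-delay function: travel time grows as flow approaches capacity."""
    return free_flow_time * (1.0 + alpha * (flow / capacity) ** beta)


def average_speed(length_km, free_flow_time_min, flow, capacity):
    """Average link speed in km/h implied by the BPR travel time."""
    minutes = bpr_travel_time(free_flow_time_min, flow, capacity)
    return length_km / (minutes / 60.0)


# Hypothetical 2 km link: 2 minutes at free flow, capacity 1800 vehicles per hour.
for flow in (0, 900, 1800, 2700):
    speed = average_speed(2.0, 2.0, flow, 1800)
    print(f"{flow:>5} veh/h -> {speed:.1f} km/h")
```

The equivalent curve for a cycle facility, relating the number of cyclists to the speed they can achieve, is the sort of thing we do not yet have good numbers for.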

Again, there are various methods for this, but essentially the traffic is iteratively loaded onto the network with the overall aim of reducing the generalised cost of each user's trip, as that user perceives it. Time is a cost, distance travelled is a cost (vehicle operation/fuel) and so congestion is a cost (there are also other more nuanced costs, but I'll gloss over those). If a user can take an alternative route which they perceive to reduce their overall costs, then they will do so. These behaviours are summarised by Wardrop's principles and eventually the network comes to an equilibrium.
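Wardrop's first principle says that, at equilibrium, every route in use between two points takes the same time and no unused route is quicker. A minimal sketch of iterative loading, using the method of successive averages on two parallel routes, is below. The route parameters, the demand and the BPR-style cost function are all assumptions for illustration; real assignment software works on full networks with more refined algorithms.

```python
def route_time(free_flow, flow, capacity):
    """Assumed BPR-style travel time in minutes for a single route."""
    return free_flow * (1.0 + 0.15 * (flow / capacity) ** 4)


def assign_msa(demand, routes, iterations=2000):
    """Method of successive averages over parallel routes.

    Each iteration sends all demand to the currently quickest route
    (an all-or-nothing load), then averages that with the running flows
    using a step of 1/n.
    """
    flows = [0.0] * len(routes)
    for n in range(1, iterations + 1):
        times = [route_time(ff, f, cap) for (ff, cap), f in zip(routes, flows)]
        best = times.index(min(times))
        target = [demand if i == best else 0.0 for i in range(len(routes))]
        step = 1.0 / n
        flows = [f + step * (t - f) for f, t in zip(flows, target)]
    return flows


# Hypothetical routes as (free-flow minutes, capacity in vehicles per hour).
routes = [(10.0, 2000.0), (15.0, 3000.0)]
demand = 4000.0

flows = assign_msa(demand, routes)
times = [route_time(ff, f, cap) for (ff, cap), f in zip(routes, flows)]
for f, t in zip(flows, times):
    print(f"flow {f:.0f} veh/h, time {t:.2f} min")
```

With these numbers the shorter route fills up until it is no quicker than the longer one, at which point nobody gains by switching. That is the equilibrium.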

But we aren't done yet. You'll have noticed the feedback arrows in the diagram above and here's where they come in. Depending on the model, the timescale and the objectives, the result of the assignment can feed back. People will change mode in response to congestion (e.g. switch to public transport or bike or to car), people may change where they work, people may even change where they live. Depending on the transport policy, we may change where we put houses and we may change where we locate employment. These are all interrelated things. We all do this all the time and we don't tend to even think about it in this context, but the modeller must. The modeller must work off stated policy objectives measured in decades and this is why those DfT projections of ever increasing road traffic (despite the last 10 or so years of stagnation) are so destructive.

These interactions are not just a modelling artefact either. They're a reflection of real behaviours. It's why the provision of urban cycle hire has greatest impact on walking and bus use. It's why a new bypass will always eventually fill up with traffic. It's why the Congestion Zone in London led to a 20-ish% increase in goods vehicles. All we are ever doing is shifting an equilibrium.

What does this mean for cycling?

You may be quite annoyed at this point. I've conned you into reading I don't know how many words with barely a sideways mention of cycling. Hopefully, you have got a better feel for the modelling process and how it is both complex and yet simplified. It will never reflect reality and should be treated as such. As with any model, good enough is good enough. We want a model that is good enough for the purpose. We establish that by calibrating it against real data. For cars we've been doing it for 70 years. For bikes, we have barely begun.

The trip generation and distribution stages are mode independent. They are the same whether we care about cars, bikes or spaceships. It's mode choice where bikes come in. Here, we simply have a lot less data than we do about cars. Knowing someone cycles is not necessarily enough; we need to know about their *propensity* to cycle. Why might they not cycle sometimes and cycle other times? If someone drives to work, they probably drive to work all the time. But if someone cycles, they might not cycle in the winter. They might not cycle when it is raining. They might not have cycled the day the roadside survey was taken. How someone internally values cycling is not well studied at all.

Also, the actual assignment of cycles to the road is a big unknown. Simply measuring current (self-selected "brave") cyclists is not good enough. We need to know why they are there. Why do people choose the routes they do? How does traffic (or the fear of it) cause cyclists to take a different route, or to use a separated facility or not? Or to travel at a different time, or do part of the journey by bike but another by train? If we push traffic off a road, does the reduced traffic attract cyclists from other routes (or were they taking roughly shortest-distance routes anyway), inadvertently force it off other routes, or must we depend on it simply generating demand for new trips in the future? Where will those new trips come from, both demographically and spatially? How will the behaviour of cyclists as a whole change as the underlying population of regular cyclists changes? All these things are huge unknowns – even if we weren't trying to quantify them – and actual research is extremely limited at best. But to build a model that is of any use to anyone, proper quantification is a necessity.

We can't possibly design well (though obviously we can do a hell of a lot better) for the cyclists we want to have or think we might have on the basis of the current research. Even if we could, we don't have the tools to get from a known quantity of cyclists to an appropriate facility. How much space is enough? Is space needed at all? Is too much space a waste? When can shared use work well? Bearing in mind that urban congestion can affect buses as much as cars, is it worth inconveniencing hundreds of bus users for a facility that won't do the job cyclists want it to do? Even amongst the disparate group that is "the cycle lobby", there is hardly agreement on any of this. Clearly some of these have obvious policy answers. We want cycling. We must encourage cycling. But modelling is all about numbers, and few of these exist. Even world-leading design standards are based on scant numerical data and provide no way to directly justify individual schemes numerically, just as here in the UK.

That last part is my specific field of research. I think we are good at the rest of it; we've had a lot of practice. But a simple question of how wide a cycle lane needs to be for a given number of cyclists remains unanswered. By contrast, we have had that answer for motor vehicles and pedestrians for over half a century. How the speed and quality of service of a cycle facility degrades with increasing users (cycle facilities suffer from congestion like any other) is unanswered.

Some research that does exist makes ridiculous assumptions, such as cyclists not impeding one another even up to a density that is essentially a pile-up. Most of the rest of it is based entirely on an "isolated cyclist". Working out the basic principles of how bikes interact with each other, and how we can then integrate that into modelling, is my hoped-for contribution to the field. Others are working on the issue from the other end; i.e. how can we integrate bicycles into our existing tools? This tends to entail treating them as vehicles, something I don't personally think is a good idea. We did that in the 1950s and look where that left us.

But in any case, that's why we don't model bicycles now. We can't do it well enough to be of use to anyone and even if we did: garbage in, garbage out.