Estimating Claim Reserves using Python

In this article I would like to discuss an important technique for estimating claims (losses) in the Property & Casualty industry, known as the ‘Development Triangle’. I will be using the Python programming language for the number crunching.

This article is not specific to any particular line of business. Its purpose is to demonstrate the technique in general, using fabricated data.

Before getting into the technique, let’s discuss what a ‘Development Triangle’ is.

Put simply, a Development Triangle is a table that shows how the value of some quantity changes over time. In our case the quantity is the ‘Losses Incurred’ on insurance claims. Using past experience of claim losses, we can build a table that helps estimate future losses, and insurance carriers can use it to estimate the reserves they need to hold for their claims.

For example, let’s say that for all the accidents that took place in 1981, an insurance company has paid $4999 million by the end of 1981. Claims are usually long tailed, meaning that not every claim arising from a 1981 accident is reported and closed in the same year, so payments keep coming in 1982 as well. Let’s say the total losses paid by the end of 1982 for accidents that happened in 1981 are $8392 million, the total paid by the end of 1983 is $11030 million, and so on. We continue adding up the losses paid until the end of 120 months for the accidents that took place in 1981.

Similarly, we tabulate the losses paid by the end of each year for the accidents that took place in 1982, 1983 and so on, for 10 accident years up to, let’s say, 1990.

The following is a section of that table (note that this is fabricated data for the purpose of illustration).

AccidentYear   LossDevelopmentYear   IncurredLoss ($million)  
-------------- --------------------- -------------------------
1981 1981 4999
1982 1982 229
1983 1983 3533
1984 1984 5778
1985 1985 1215
1986 1986 1636
1987 1987 680
1988 1988 1474
1989 1989 3256
1990 1990 2186
1981 1982 8392
1982 1983 4408
1983 1984 9115
1984 1985 11678
1985 1986 9688
1986 1987 6568
1987 1988 4143
1988 1989 7070
1989 1990 5518
1981 1983 11030
… … …
… … …

As we can see, it is not easy to draw insights from the tabular format above. If the information is laid out differently, it becomes easier to observe how the losses change over time. The losses for a 3-year period starting from 1981 are shown in triangle format below.

| AccidentYear | 12Months | 24Months | 36Months |
|--------------|----------|----------|----------|
| 1981 | 4999 | 8392 | 11030 |
| 1982 | 229 | 4408 | |
| 1983 | 3533 | | |
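
The reshaping from the long table into the triangle can be sketched in plain Python as follows. This is only an illustration under a few assumptions: the helper name `to_triangle` is hypothetical, only the six rows needed for the three-year excerpt are used, and the development age is taken to be 12 months for each valuation year elapsed since the accident year.

```python
from collections import defaultdict

# Rows copied from the long table:
# (accident year, valuation year, cumulative loss in $ million).
rows = [
    (1981, 1981, 4999), (1982, 1982, 229), (1983, 1983, 3533),
    (1981, 1982, 8392), (1982, 1983, 4408),
    (1981, 1983, 11030),
]


def to_triangle(records):
    """Group records by accident year and key each value by age in months."""
    triangle = defaultdict(dict)
    for accident_year, valuation_year, loss in records:
        # Assumption: the first valuation is at 12 months, the next at 24, etc.
        age_months = 12 * (valuation_year - accident_year + 1)
        triangle[accident_year][age_months] = loss
    return dict(triangle)


triangle = to_triangle(rows)
ages = sorted({age for row in triangle.values() for age in row})

# Print one row per accident year, leaving unobserved cells blank.
print("AccidentYear " + " ".join(f"{age:>6}M" for age in ages))
for year in sorted(triangle):
    cells = " ".join(f"{triangle[year].get(age, ''):>7}" for age in ages)
    print(f"{year:<12} {cells}")
```

The blank cells in the lower right are what give the triangle its shape: later accident years simply have not been observed at the older ages yet.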

For the accidents in 1981, the losses developed to $11030 million over 3 years. They developed by an age-to-age factor of 1.6787 (8392 / 4999) from 12 to 24 months, and then by 1.3143 (11030 / 8392) from 24 to 36 months. Similarly, the 1982 losses developed by a factor of about 19.25 (4408 / 229). In this way we can see how losses develop over a period of, say, 10 years or 120 months, and eventually ascertain the amount of reserve needed over a given period for similar claims.
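
The factors quoted above are simple ratios of consecutive cumulative values, which can be checked with a few lines of Python. This is a minimal sketch; the function name `age_to_age` is hypothetical and the inputs are the 1981 and 1982 values from the long table.

```python
# Cumulative losses ($ million) by development age in months.
losses_1981 = {12: 4999, 24: 8392, 36: 11030}
losses_1982 = {12: 229, 24: 4408}


def age_to_age(cumulative):
    """Return {(from_age, to_age): factor} for each pair of consecutive ages."""
    ages = sorted(cumulative)
    return {
        (start, end): cumulative[end] / cumulative[start]
        for start, end in zip(ages, ages[1:])
    }


factors_1981 = age_to_age(losses_1981)
factors_1982 = age_to_age(losses_1982)

for year, factors in ((1981, factors_1981), (1982, factors_1982)):
    for (start, end), factor in factors.items():
        print(f"{year}: {start} to {end} months -> {factor:.4f}")
```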

This representation is called a development triangle, and it has properties such as link-ratio, tail-factor and ultimate projection, which help in ascertaining how a loss or claim is expected to develop. To discuss these properties I will be using Python. I have activated a Python environment and imported the necessary libraries: pandas, NumPy and chainladder.

I have created a data-frame named ‘data’ from the original table. Now we will use the Triangle data type available in the chainladder package (imported here as cl). I have saved the triangle as ‘lossinfo’.

# cl is the chainladder package: import chainladder as cl
# cumulative=True because each value is the running total paid to date
lossinfo = cl.Triangle(data, origin='AccidentYear', development=['LossDevelopmentYear'], columns=['IncurredLoss'], cumulative=True)
print(lossinfo)

This should print the triangle, with accident years as rows and development ages as columns.

Link-Ratio: this ratio measures the change in claims from one valuation period to the next. It is also called the age-to-age factor.

print(lossinfo.link_ratio)
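
To make the link ratio concrete without relying on the library, here is a hand-rolled sketch that computes the 12-to-24-month link ratio for every accident year that has both values in the long table (1981 to 1989). It also shows two common ways of summarising a column of link ratios, a simple average and a volume-weighted average. The variable names are illustrative and not part of chainladder, and no claim is made here about which summary the library reports.

```python
# Cumulative losses ($ million) at 12 and 24 months for accident years 1981-1989,
# copied from the long table earlier in the article.
at_12 = {1981: 4999, 1982: 229, 1983: 3533, 1984: 5778, 1985: 1215,
         1986: 1636, 1987: 680, 1988: 1474, 1989: 3256}
at_24 = {1981: 8392, 1982: 4408, 1983: 9115, 1984: 11678, 1985: 9688,
         1986: 6568, 1987: 4143, 1988: 7070, 1989: 5518}

# One link ratio per accident year: value at 24 months / value at 12 months.
link_ratios = {year: at_24[year] / at_12[year] for year in at_12}

# Simple average treats every accident year equally.
simple_average = sum(link_ratios.values()) / len(link_ratios)

# Volume-weighted average divides total losses at 24 months by total at 12 months,
# so years with larger losses carry more weight.
volume_weighted = sum(at_24.values()) / sum(at_12.values())

for year, ratio in link_ratios.items():
    print(f"{year}: {ratio:.4f}")
print(f"simple average:  {simple_average:.4f}")
print(f"volume weighted: {volume_weighted:.4f}")
```

With this data the simple average is pulled upwards by the very large 1982 ratio, which is one reason a volume-weighted figure is often preferred when small years produce extreme ratios.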

Tail Factor: this factor accounts for any development beyond the last period in the triangle, so it is tied to the question of how many periods we need to analyse. Should we analyse the development for 3 years, 8 years or 20 years? If the data is available, then ideally the development should be reviewed until there is no further development, which implies a development factor of 1.0.

However, for our illustration a period of 10 years or 120 months should be fine.

Ultimate Projection: this is the projection of the ultimate losses for each accident year.

# cl is the chainladder package: import chainladder as cl
cl_ult = cl.Chainladder().fit(lossinfo).ultimate_
print(cl_ult)
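
To show what an ultimate projection involves, here is a hand-rolled sketch of the basic chain-ladder calculation on the three-year excerpt of the triangle. It is not the chainladder library's implementation and will not reproduce its output for the full ten-year data. The assumptions are that development factors are volume-weighted averages, that the tail factor is 1.0 (no development beyond 36 months), and that the function names are hypothetical.

```python
# Three-year excerpt: cumulative losses ($ million) by development age in months.
triangle = {
    1981: {12: 4999, 24: 8392, 36: 11030},
    1982: {12: 229, 24: 4408},
    1983: {12: 3533},
}
TAIL_FACTOR = 1.0  # assumption: no further development after 36 months


def development_factors(tri):
    """Volume-weighted age-to-age factor for each pair of consecutive ages."""
    ages = sorted({age for row in tri.values() for age in row})
    factors = {}
    for start, end in zip(ages, ages[1:]):
        # Only accident years observed at both ages contribute.
        years = [y for y in tri if start in tri[y] and end in tri[y]]
        factors[(start, end)] = (
            sum(tri[y][end] for y in years) / sum(tri[y][start] for y in years)
        )
    return factors


def ultimates(tri, factors, tail=1.0):
    """Roll each year's latest value forward through the remaining factors."""
    result = {}
    for year, row in tri.items():
        latest_age = max(row)
        projected = row[latest_age]
        for (start, _end), factor in sorted(factors.items()):
            if start >= latest_age:
                projected *= factor
        result[year] = projected * tail
    return result


factors = development_factors(triangle)
ult = ultimates(triangle, factors, tail=TAIL_FACTOR)

for year in sorted(triangle):
    latest = triangle[year][max(triangle[year])]
    # The reserve estimate is the projected ultimate minus the losses to date.
    print(f"{year}: latest={latest}, ultimate={ult[year]:.1f}, "
          f"reserve={ult[year] - latest:.1f}")
```

The difference between the projected ultimate and the latest observed value for each accident year is the reserve estimate this method suggests for that year.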

When will this technique not be useful? The development triangle technique assumes that future claim development will be similar to past claim development, i.e. that the patterns observed in past claims will be observed again in future claims. It is therefore less useful when that assumption does not hold, or when large claims in the historical data drastically distort the patterns.