Random Analytics

Charts, Infographics & Analytics. No Spinning the Data. No Juking the Stats

Month: February, 2014

Random Analytics: David Warner First Chance Average (to 24 Feb 2014)

Note: I first published this blog using the acronym Earned Run Average (ERA) and Earned Run Differential (ERD). I have subsequently amended the acronyms to First Chance Average (FCA) and First Chance Differential (FCD). See: Random Analytics: Shane Watson First Chance Average (to Cape Town 2013/14) for the detail.

David Warner’s recent form has been fantastic. Two centuries plus two half centuries in the Australian leg of the Ashes and one century plus two half centuries in the first two tests in South Africa are a good return.

However, I haven’t been completely convinced that Warner is in the best of form, and while discussing the subject over a beer a mate of mine suggested using a Moneyball metric to test the theory.

I could be wrong but this might be a cricket first: a look at David Warner’s Test Earned Run Average statistics (thanks to Daryn Webster for the suggestion and Adrian Storen for the sanity check).

1 - AvgVsERA_DWarner_140226

The first chart looks at Warner’s standard Test average (currently 42.88 after a 70 and a 66 at Port Elizabeth, South Africa) versus his Earned Run Average, which sits at 34.92 (-7.96).

The Earned Run Average (ERA) is calculated using the score he would have got if a legitimate chance had been taken by the opposing team. So far I’ve only had to consider dropped catches and missed stumpings as legitimate chances, but I could foresee a missed referral being added in the future. As an example, in the 2nd innings at Port Elizabeth Warner was put down by Duminy in the 16th over on 36. Thus although he scored 66 in that innings his Earned Runs were just 36.
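
For anyone wanting to reproduce the idea, here is a minimal Python sketch of the calculation described above. It rests on assumptions of mine: the `Innings` structure and function names are hypothetical, a not-out innings that included a missed chance is treated as a dismissal at the chance score, and only the 66 (dropped on 36) is taken from the Port Elizabeth example; the other two innings are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Innings:
    runs: int                            # runs actually scored
    dismissed: bool                      # False for a not-out innings
    first_chance: Optional[int] = None   # score when the first chance was missed, if any


def batting_average(innings: List[Innings]) -> float:
    """Standard average: total runs divided by number of dismissals."""
    dismissals = sum(1 for i in innings if i.dismissed)
    if dismissals == 0:
        raise ValueError("average is undefined without a dismissal")
    return sum(i.runs for i in innings) / dismissals


def earned_run_average(innings: List[Innings]) -> float:
    """Average if every first chance had been taken by the fielding side."""
    earned = sum(i.runs if i.first_chance is None else i.first_chance
                 for i in innings)
    # Assumption: a missed chance converts the innings into a dismissal.
    dismissals = sum(1 for i in innings
                     if i.dismissed or i.first_chance is not None)
    if dismissals == 0:
        raise ValueError("average is undefined without a dismissal")
    return earned / dismissals


sample = [
    Innings(runs=66, dismissed=True, first_chance=36),  # Port Elizabeth, 2nd innings
    Innings(runs=12, dismissed=True),                   # illustrative only
    Innings(runs=40, dismissed=False),                  # illustrative only
]

print(f"Standard average: {batting_average(sample):.2f}")     # 59.00
print(f"Earned Run Average: {earned_run_average(sample):.2f}")  # 44.00
```

The gap between the two numbers (here 15 runs) is the same kind of difference the chart shows for Warner's career.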

2 - AvgVsERA_DWarner_Summer2013~14_140226

The next chart looks at Warner’s standard Test average for the Australian and South African summer series. Although Warner has had an outstanding summer with the bat, his average over seven Tests stands at 60.46 yet his ERA is a much lower 41.00 (-19.46).

3 - ERDevEtChances_DWarner_140224

The final chart looks at two datasets.

The first (in blue) is the Earned Run Deviation (ERD) which for Warner has increased from a career low of 8.8% at the start of summer to now hit a career high of 18.6%.

The Earned Run Deviation (ERD) is calculated as the runs scored after an offered chance divided by total Test runs. In Warner’s case he has currently scored 2,187 runs but would have scored 406 fewer if opposing teams had taken his offered chances (406 / 2,187 = 18.6%).
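
As a quick check of that figure, here is a small Python sketch using the two numbers quoted above (2,187 total runs, 406 of them scored after a missed chance). The function name is my own and simply restates the ratio.

```python
def earned_run_deviation(total_runs: int, runs_after_chances: int) -> float:
    """Share of total runs that were scored after a chance was missed."""
    return runs_after_chances / total_runs


total_runs = 2187           # Warner's Test runs to date, from the text
runs_after_chances = 406    # runs he would not have scored, from the text

career_erd = earned_run_deviation(total_runs, runs_after_chances)
earned_runs = total_runs - runs_after_chances

print(f"Earned runs: {earned_runs}")                 # 1781
print(f"Earned Run Deviation: {career_erd:.1%}")     # 18.6%
```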

The second dataset (in red) is the chances that David Warner has been given.

On the positive side his twelve chances have a first chance average of 33 but a multiple chance of 38.7, thus demonstrating he doesn’t throw away his wicket early. On the negative side:

  • He has had two-thirds (8/12) of all his chances in the last two series;
  • His Average Chances (AC) over his career was 0.27 (a chance every fourth innings). Over the recent summer this has doubled to 0.57 (a chance every second innings);
  • His summer 2013/2014 Earned Run Deviation is 32.2%.

FINAL THOUGHTS

Australia’s coach, Darren Lehmann, when asked if Warner was too reckless, recently stated: “That’s just the way he is, and we’re very comfortable with that”.

Warner’s current form is excellent so any coach would be hard pressed to drop him.

That said, Warner’s Earned Run Average, Earned Run Deviation and Average Chances are all moving in the wrong direction. Unless he can turn that around in the short term he might find his luck running out.

UPDATES

6 Mar 2014: Updated title to First Chance Average and added note plus link to Shane Watson’s FCA.

Random Analytics: H7N9 by Employment (to 250 confirmed)

The Avian Influenza A(H7N9) continues its steady attrition.  According to Flutrackers there have been 358-cases of H7N9. With Wave 1 (45) and Wave 2 (32) fatality counts as confirmed by Xinhua my unofficial fatality total stands at 77 (a Case Fatality Rate of 21.5%).

While adding the most recent case (a 29-year-old female from Changsha, Hunan) to my personal H7N9 database today, I noticed that we had reached an interesting milestone. Of the 358 cases thus far I have now been able to confirm 250 of their job titles.

Let’s look at the data to date.

1 - JobTitle_H7N9Top20_140218

Looking at the Job Titles we still find that the leading data item (occupation) is Farmer (35.6%), then Retired (24.4%) then the two paediatric titles of Primary School (5-12) and Child (0-4) with a combined total of 21 (8.4%). I’ve now been able to record 40 different titles with the top half accounting for 61.2% of the entire data, the bottom half just 8.6% and unknown 30.2%.

Some further points of interest:

  • In Wave 1 (to case #136) Farmers represented 28/136 of all cases (20.6%). Currently in Wave 2 there have been 222 cases of which 61 were Farmers (27.5%);
  • The current average age of the H7N9 impacted Farmer is 62-years while the average age of all H7N9 victims is 54.5-years;
  • The average age of a H7N9 Retiree is 70.4;
  • In Wave 1 Paediatric cases (0-15) represented 7/136 of all cases (5.1%). Currently in Wave 2 there have been 15 paediatric cases out of 222 (6.8%), which shows a slight increase.
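
The wave-by-wave shares in the points above are simple ratios, and a short Python sketch makes them easy to re-run as the case list grows. The counts come straight from the bullets; the dictionary layout and the `share` helper are just my own way of arranging them.

```python
def share(count: int, total: int) -> float:
    """Percentage of `total` represented by `count`."""
    return 100 * count / total


# (group count, all cases in that wave), taken from the bullet points
farmers = {"Wave 1": (28, 136), "Wave 2": (61, 222)}
paediatric = {"Wave 1": (7, 136), "Wave 2": (15, 222)}

for label, groups in (("Farmers", farmers), ("Paediatric", paediatric)):
    for wave, (count, total) in groups.items():
        print(f"{label}, {wave}: {count}/{total} = {share(count, total):.1f}%")
```

This prints 20.6% and 27.5% for Farmers, and 5.1% and 6.8% for Paediatric cases, matching the figures quoted.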

2 - JobFamily_H7N9_140218

When we roll all the Job Titles into a Job Family the top-3 groups are Non-Participatory (26.5% comprising children, retirees and the unemployed), Farming, Fishing & Forestry (25.4%) and then Food Preparation & Serving (5.3% including catering, chef/cook, food sales, live poultry trade & market vendor).

Of interest:

  • The first two groups have remained largely unchanged in 2014 but the Food Preparation & Serving group has been steadily declining in recent weeks (down from 6.9% recorded on the 1st February);
  • Only one Healthcare Practitioner (an ER Surgeon from Shanghai) has been recorded;
  • Apart from that single Healthcare Practitioner, no Healthcare Support (Enrolled Nurses, Vet Assistants or Orderlies) or Protective Services (Police, Ambulance, Fire & Wildlife Rangers) cases have yet been recorded, so these groups together equate to just 0.3% of all cases. This is a marked contrast to MERS where, as Ian M. Mackay noted on 5 February 2014, Health Care Workers accounted for 18% of all cases and 2.7% of all deaths;
  • The average age of all H7N9 victims without a job title is 57.

3 - MainJobs_H7N9_140218

Last chart is a look at some main Job Titles in a running total. I’ve included child cases up to the age of 15-years in response to some of Ian M. Mackay’s concerns about an increasing paediatric count.

Given that the cases with unknown job titles have an average age of 57-years I believe that Retirees are somewhat underreported, but given the older age of Chinese farmers it’s hard to estimate a breakdown without some local knowledge (which I don’t possess).

FINAL THOUGHTS

Without wishing for more H7N9 cases I’ll plan for another employment update as I confirm the first 300 Job Titles.

There is a lot of interesting data in the first 250 Job Titles that I have been able to confirm. I only wish we had some more clarification on the almost 1/3rd of missing data items.

I’ll continue to scrabble for information as it comes in. Public sourced journals with detailed case studies are excellent sources and I am sure we will be seeing some of the Wave 2 case studies in coming weeks and months.

Random Analytics: Abbott’s Promise. 1-million jobs in 5-years (to Jan 2014)

“The next Coalition Government will create one million jobs in five years and two million jobs in 10 years,” he said. “This pledge is achievable given our record and policies.” Tony Abbott (27 Nov 2012)

First of all, the Labor side should be commended for its employment story during its term in office (2007-2013), when 955,200 jobs were created. That said, one of my key criticisms of the then Employment Minister, Bill Shorten, was that the spotlight was always on total job creation rather than on the full-time and part-time breakdown. During the Labor years 450,400 part-time jobs were created against 504,800 full-time ones.

There is a lot of discussion in Australia around employment and unemployment at the moment. In the past month many companies have announced large job cuts either in the immediate or near future. Recent examples with direct jobs lost include Holden (2,900), Toyota (2,500), Forge (1,400), Rio Tinto (1,100), Qantas (1,000), Electrolux (544) and just today Alcoa (980). The seasonally adjusted unemployment rate hit 6% for the first time since July 2003 (when it peaked at 6.1%).

The RBA has for some months forecast that the unemployment rate would hit 6.25% during 2014, then steadily improve from 2015. That view remained unchanged in its most recent Economic Outlook.

Thus it would be unfair to immediately thrust blame on to Tony Abbott and the recently elected Coalition government as many in the opposition camp are doing.

To that end I thought I might shine a light on the Abbott promise. 1-million jobs in five years. Here is a look at the data for the first four months to January 2014.

1-AUSEmployGainsLossestoJan2014_140218

First chart is a look at employment gains and losses since the Coalition took power in September 2013. Two points:

  • The total jobs are slightly negative, that is 9,949 jobs lost; and
  • The sample size is way too small to start analysing and unemployment figures from the ABS are generally considered a lagging indicator.

2-JobCreationtoJan2014_140218

Second and last chart looks at job creation in three parts. Total job creation (green), full-time employment (blue) and part-time employment (maroon).

In effect there have been 56-thousand full-time jobs lost against 46-thousand part-time jobs gained for a net loss of 9,949 jobs.
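
To put those numbers in context, here is a rough Python sketch of the arithmetic. The full-time and part-time figures are the rounded thousands quoted above, so the result only approximates the 9,949 figure. The monthly pace is my own back-of-envelope assumption that the 1-million target is spread evenly over 60 months, which the pledge itself does not specify.

```python
# Rounded changes since September 2013, as quoted in the text
full_time_change = -56_000
part_time_change = 46_000

# Rounded inputs give about -10,000, close to the 9,949 jobs lost
net_change = full_time_change + part_time_change

# Assumption: the 1-million-jobs promise spread evenly over five years
target_jobs = 1_000_000
months = 5 * 12
required_per_month = target_jobs / months

print(f"Net change (rounded inputs): {net_change:,}")
print(f"Average pace needed: {required_per_month:,.0f} jobs per month")
```

On that even-spread assumption the promise needs roughly 16,667 new jobs a month, so a small negative start is not fatal but does lift the pace required later on.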

Final Thoughts

Bill Shorten’s recent commentary around 54,000 job losses (or one job every three minutes) might make a good sound grab but actually only reflects full-time employment losses over a very short timeframe.

I think it’s disingenuous of him as the former Minister to use total employment figures then but now only concentrate on one set of numbers.

That aside, I wonder if the RBA has underestimated the unemployment peak at 6.25%, which would make it much harder for Tony Abbott to hit his 1-million jobs in five years promise.

Only time will tell. I’ll keep you updated.

Random Analytics: A H7N9 family cluster in Zhongshan, Guangdong?

Ian M Mackay wrote an update on his Virology Down Under article on Wednesday where he nailed a Wave-2 data-point that I had completely missed. H7N9 snapdate: age with time. Key excerpt:

The interesting line to watch is that of the youngest age group (0-19-years) which has lifted to comprise 50% of cases in the week beginning 27-Jan. Also, the proportion of cases in the oldest age group (70->90-years) has dropped down in the past 2 weeks.

There have been a rash of children in recent announcements; 8 of the last 45 cases have been <10-years of age. For a virus with a median case age sitting at 58-years, this is quite a departure.

 Is this due to an increase in familial clusters? Does it herald a shift in the way the virus is spreading? Interfamilial transmission may provide a hint at increasing transmission efficiency. It might also be a sign of increased testing augmenting clinical observation of close contacts of ill family members.

It was such an interesting thought I started to dig a little deeper into the recent data to see if there were any possible interfamilial patterns that, as yet, might not be confirmed as family clusters but would have a high likelihood of being so.

Consider this.

Look at the Flutrackers.com case list, specifically cases #285 (37M) and #289 (2F). Data points:

  • Onset within 5-days of each other;
  • Hospitalised 2-days apart;
  • Confirmed one day apart;
  • Both are named Liang, although the original translation was Liang Yijun which might stand for ‘someone Liang’. As Crawford Kilian put it, Liang is one of the top 100 Chinese surnames;
  • Both come from Sanjiao Town (original reports had them at Triangle Town but I linked that to Sanjiao Town via local hotel addresses).

The important point to my thinking is that these two cases are the first reported in Zhongshan across both waves. Zhongshan is different from other cities in that it doesn’t have County level administration but rather six inner districts and 18 smaller surrounding towns. Sanjiao Town has a population of just 121K, which by Chinese standards is minuscule.

I don’t believe in coincidences and there is a lot of data which is missing from this picture.

Yet, as we see a lot of P2P denial occurring could we also be seeing the first of many unconfirmed family clusters?

Is this the ‘tip of the iceberg’?

Random Analytics: H7N9 in Hangzhou, Zhejiang (to 4 Feb 2014)

According to the latest updates from Flutrackers.com there have been 299 cases of Avian Influenza A(H7N9) to 1200hrs EST (my time in Brisbane, Queensland) with an unofficial fatality count of 71. The Case Fatality Rate (CFR) plus a comparison between Wave 1 (to case #136) and Wave 2 (from 8 October 2013 to the present) stands at:

Wave 1: 136-cases, 45 known fatalities and a CFR of 33.1%;

Wave 2: 163-cases, 26 known fatalities and a CFR of 16.0%;

Total: 299-cases, 71 known fatalities and a current CFR of 23.7%.
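
The three CFR figures above can be reproduced with a few lines of Python. The case and fatality counts are the ones listed; the helper name and dictionary are my own framing, and the fatality counts remain unofficial.

```python
def case_fatality_rate(deaths: int, cases: int) -> float:
    """Known fatalities as a percentage of confirmed cases."""
    return 100 * deaths / cases


# (cases, known fatalities) per wave, from the figures above
waves = {"Wave 1": (136, 45), "Wave 2": (163, 26)}

total_cases = sum(cases for cases, _ in waves.values())
total_deaths = sum(deaths for _, deaths in waves.values())

for wave, (cases, deaths) in waves.items():
    print(f"{wave}: {cases} cases, {deaths} fatalities, "
          f"CFR {case_fatality_rate(deaths, cases):.1f}%")
print(f"Total: {total_cases} cases, {total_deaths} fatalities, "
      f"CFR {case_fatality_rate(total_deaths, total_cases):.1f}%")
```

Bear in mind the Wave 2 rate is likely to move as outcomes for recent cases become known.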

Since mid-January the province of Zhejiang has moved into triple figures for H7N9 cases. At around the same time the provincial capital, Hangzhou, became the first city to reach more than 50-cases, surpassing Shanghai as the city most impacted by H7N9.

Given those unfortunate statistics I thought it might be worthwhile to crunch some data on Hangzhou, Zhejiang.

Firstly, let us look at the 119 Zhejiang H7N9 onsets by prefecture level city.

1 - CasesbyCity_Zhejiang_140205

Two points:

Lisa Schnirring from CIDRAP stated in the 3 February H7N9 Update that:

Southern provinces lead second-wave cases

Six of the latest cases are from Guangdong province, continuing a strong second-wave tilt toward the mainland’s southernmost areas. In the first wave, locations north of that area were driving most of the outbreak activity: Shanghai, Jiangsu province, and Zhejiang province.

Not sure I agree with that.

Zhejiang has experienced 73-cases in the second wave which is much higher than Fujian (13) and Guangdong (49) below it. Of the 73-cases, Hangzhou alone had 23.

On the second point the infographic also (interestingly) highlights that 90.8% of Zhejiang’s cases are concentrated in the north of the province, emphasising a north/south provincial divide. I can’t suggest a reason for that outside of population density.

Next, let’s look at cases by month of onset with an emphasis on Hangzhou.

2 - CasesbyMonthofOnset_Hangzhou_140205

During April 2013 (the bulk of first wave cases) there was a significant spike in numbers from Hangzhou (28.9%) as compared to Shanghai for the same month (18-onsets at 18.6%).

As you can see from the provisional data for Hangzhou in January the case load is less both in terms of numbers (25) and as a percentage of total cases (18%), although the overall numbers are greater.

Lastly, a look at the second wave case load within Hangzhou.

3 - CasesbyDistrict_Hangzhou_140205

Here is the biggest surprise (IMO). Although farmers make up 10 of the 23 second wave cases in Hangzhou, all of the cases (minus one in Fuyang City and three which are unconfirmed) are located not in the outlying cities and districts of Hangzhou but in the more tightly congested metropolitan areas of the prefecture level city. It seems the Chinese peri-urban divide is a significant risk factor in catching H7N9, at least in Hangzhou.

Final Thought

H7N9 has been around for almost a year and as we verge on the 300th case I think we have spent more than enough time doing provincial level analytics when we now can and should be spending a little more time getting granular with our analysis.

Random Analytics: H7N9 More Employment Graphs (to 31 Jan 2014)

The Avian Influenza A(H7N9) continues its steady attrition. According to CIDRAP there have been 277-cases of H7N9 with the fatality count standing unofficially at 61 (a Case Fatality Rate of 22%).

Earlier in the week I posted some analytics looking at the case list employment data. Subsequently I’ve been involved in a rolling tweet-up with Ian Mackay, A biologist and Potrblog.com on some of those findings. Some of that discussion has caused me to further reflect on the data I presented.

Reflection then turned into action (and some updated/revised charts plus one new one!).

1 - JobTitle_H7N9Top20_140201

The first chart is very similar to the H7N9 incidences by Job Title previously released with the exception that I have updated a number of Job Titles to align with the Chinese data (i.e. amended Maid (Expat) to Domestic Helper) but also to better reflect actual real world situations. Thus School Age (5-17) has been split into both Primary and High School age groups.

The current chart reflects Job Title data in 204 of 277 cases (73.6% of all data inputs). The two predominant employment types continue to be either a Farmer (33.8%) or Retired (27.5%). Farming job titles are up slightly and Retired job titles are down slightly from data released earlier in the week.

Some further points of interest and conjecture:

  • In Wave 1 (to case #136) Farmers represented 28 of all 136 cases (20.6%). Currently in Wave 2 the following 141 cases have included 41 Farmers (29.1%);
  • The current average age of the H7N9 impacted farmer is 61.9-years, more than 5.1-years above the average age of all those impacted by the virus, which probably demonstrates an ageing issue for Chinese agriculture; and
  • The average age of the H7N9 retiree is 71.3. How good is the Chinese economy, its medical system and its infrastructure compared to barely three decades ago?

Before I go on to my next three charts I want to discuss the importance of job titles. During my tweet discussions this week I brought up the issue of the differentials between a small cropping and a pig farmer. Everyone agreed with the issue but, by chance, I found a great example as I was completing my data updates today.

Via CIDRAP (29 Jan 2014): Seven new H7N9 cases, plus family cluster, reported. Detail:

The family cluster reported today involves three people from Zhejiang province, a 49-year-old man, his wife, and their 23-year-old daughter, according to a report from Xinhua, China’s state news agency. All three cases were previously reported. The man’s infection, which ultimately proved fatal, was confirmed on Jan 20. His daughter got sick 3 days after taking her father to the hospital, and she is in serious condition. The man’s wife’s infection was confirmed on Jan 27, and her illness is mild, according to Xinhua. Media reports in China yesterday, citing officials from China’s Center for Disease Control and Prevention, said the parents are from Xiaoshan and worked as vegetable dealers in a live-bird market before they got sick and that their daughter had worked at the market for a short time, the South China Morning Post, an English-language newspaper based in Hong Kong, reported today.

That detail might have made me change my job title for the parents to a Market Vendor, yet I suspect they are a small cropping family (who as first reported are ‘Farmers’) who also ran a small vegetable stall in a local poultry market (thus a secondary occupation of ‘Market Vendor’/‘Vegetable Dealers’). Their daughter, who also became ill, was first reported as ‘Staff’, potentially reflecting her role in running their market stall.

For all the conjecture that I put forward they might have caught H7N9 from wild birds at their vegetable farm, rather than the poultry market.

2 - JobFamily_H7N9_140201

The second chart looks at employment by Job Family (see previous H7N9 employment related blog for methodology). Unlike the previous post about Job Families I thought it important to include the unknown data inputs, which have been relatively unchanged since the commencement of the outbreak in February 2013. The largest group is Non-Participatory (27.4% comprising children, retirees, students and the unemployed), followed by unknown employments (26.4% or more than one in four) and then Farming, Fishing & Forestry (25.6%). After those three groups come Food Preparation & Serving (6.9% including catering, chef/cook, food sales, live poultry trade, market vendor) and Production, Factory & Food Processing (2.5% comprising butcher, factory worker, poultry abattoir, sheet-metal worker and stone processor). Those five groups equate to 88.8% of all cases.

3 - MainJobs_H7N9_140201

The final chart asks the question: has the recent spike in H7N9 cases been over represented by farmers?

Short answer is No.

The above chart displays acquisition by employment type (at onset) with four main groups represented: Farmer, Retired, All Other Known Employments and those that are currently unknown.

Two key months dominate: April 2013 and January 2014. By the end of April Farmers represented 19.7% of cases; currently they have increased by more than 5-points to 24.9%, while Retired have reduced from 31.8% to 20.2%. Farmers moving from one in five to one in four H7N9 cases is still a reasonable movement but a trend has not (yet) been proven.

Let us give it one or two more months…

Stay safe, stay healthy and continue to make good choices.