|
95% of Gaelic football results since 2010 can be explained by a mix of population, sport preferences, county financial resources, historical success, and proximity to cities with significant career/study opportunities. A model built from these factors predicts win/loss results since 2010 with 95% accuracy when games are weighted by relative importance. Worth noting that this year's championship has only been predictable to 70% accuracy, mostly down to rule changes and Louth over-performing, with their ceiling still unknown :) level (Louth) - Posts: 108 - 01/07/2026 22:31:30 2683548 Link 0 |
|
"95% of Gaelic football results since 2010 can be explained by a mix of population, sport preferences, county financial resources, historical success, proximity to cities with significant career/study opportunities." No they can't. GreenandRed (Mayo) - Posts: 8650 - 02/07/2026 12:16:34 2683641 Link 0 |
|
Is there an actual model, or is this just a ChatGPT spiel?
tirawleybaron (Mayo) - Posts: 1970 - 02/07/2026 12:40:52 2683650 Link 0 |
|
Here you go... I made a few tweaks, but this will get you to 80% accuracy using only socio-economic data and football pedigree from more than 10 years before the observation window. The last 15% takes some real work.

NOTE -- with this approach, Kerry comes out completely average given all their history from more than 10 years ago. Mayo and Monaghan are the big over-performers vs the model. So you can think of that as some reflection of manager skill and player skill/commitment, i.e., it's not just history and socio-economic factors that explain current performance. Meath gets hammered because it's close to Dublin, the 90s success is recent, and I'm a Louthman.

The model (with all the terms explained)

PPIᵢ = β₀ + β₁·P_advancedᵢ + β₂·Incomeᵢ + β₃·AirportTimeᵢ + β₄·Pedigreeᵢ + eᵢ, with Σe = 0 and OIᵢ = eᵢ/σₑ.

P_advanced - Effective Structural Population

The heart of the model. Each county's gross population is converted into an "effective playing base" by a chain of discounts:

P_advanced = (Population × C_community × C_hurling) / (urban drain × city-proximity drain × training-commute tax)

where
urban drain = [log₁₀(Density)]^1.0
city-proximity drain = 1 + 0.48·(120 − t_hub)/120
training-commute tax = 1 + 0.70·max(0, t_CoE − 30)/60

Density term - denser counties lose share to alternative sports infrastructure (soccer academies etc.); Dublin's 1,582/km² divides its base by 3.2, Leitrim's 22/km² by 1.34.

Hub term - proximity to Dublin/Belfast/Cork/Galway/Limerick is penalized (talent and attention drain to cities); a county 120+ minutes out pays nothing.

Commute tax - real drive time from the county's second population centre to its training base; every minute beyond 30 erodes the base. Cork loses 21% of its effective population to this term, Down 18%, Donegal 14%.

C_community - participation footprint applied to the six NI counties only: the Catholic share of "religion or religion brought up in" (Census 2021). Down retains 32.3% of its base, Tyrone 66.5%, ROI counties 100%.
C_hurling = 1 − α·(hurling intensity) - hurling intensity is per-capita hurling All-Ireland semi-final appearances 2000-2025, normalized to the max county (Kilkenny). At the calibrated α = 1, Kilkenny's football base is fully written off; Tipperary retains ~62%.

Exogenous regressors

Income - disposable income per person, 2023, harmonised all-island (PPS). Enters negatively (β₂ = −0.00094): richer counties systematically underperform their structural base, consistent with opportunity-cost pressures on an amateur sport.

Airport access - OSRM drive time to the nearest of eight international airports; a broad connectivity/remoteness control (β₃ ≈ 0 because it's largely absorbed by the pedigree term, so it's not important).

Pedigree - decayed tradition, with the last 10 years excluded

NOTE -- you lose about 5-10% accuracy if you push the exclusion window out to 20 years (i.e., by setting it to 10, you capture some current players who have known recent-ish success).

To capture footballing tradition without letting current squads predict themselves:

Pedigree(county, season t) = Σ over All-Ireland finals in years y ≤ t−10 of weight(final) × 0.5^((t − y − 10)/5)

Only finals at least 10 years before the measured season count (recent-squad-bias filter). Beyond that boundary, value halves every 5 years - recent tradition matters far more than ancient tradition. A won final is worth 1.0; a lost final only 0.25.
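A minimal sketch of the two constructed regressors described above, for anyone who wants to poke at the formulas. The function and parameter names are made up for illustration, the clamp on the hub term is an assumption taken from "a county 120+ minutes out pays nothing", and the density floor mentioned in the weights section is not modelled because its value isn't given.

```python
import math


def effective_population(population, density, t_hub, t_coe,
                         c_community=1.0, hurling_intensity=0.0,
                         alpha=1.0, density_exp=1.0,
                         hub_coef=0.48, commute_coef=0.70):
    """P_advanced: gross population pushed through the chain of discounts.

    density is people per km², t_hub and t_coe are drive times in minutes.
    """
    c_hurling = 1.0 - alpha * hurling_intensity
    urban_drain = math.log10(density) ** density_exp
    # Assumption: clamp at zero so counties 120+ minutes from a hub pay nothing.
    hub_drain = 1.0 + hub_coef * max(0.0, 120.0 - t_hub) / 120.0
    commute_tax = 1.0 + commute_coef * max(0.0, t_coe - 30.0) / 60.0
    return (population * c_community * c_hurling
            / (urban_drain * hub_drain * commute_tax))


def pedigree(finals, season, lookback=10, half_life=5.0,
             win_weight=1.0, loss_weight=0.25):
    """Decayed tradition for one county.

    finals is an iterable of (year, won) pairs; finals inside the
    lookback window are ignored, older ones halve every half_life years.
    """
    total = 0.0
    for year, won in finals:
        if year <= season - lookback:
            age_beyond_window = season - year - lookback
            weight = win_weight if won else loss_weight
            total += weight * 0.5 ** (age_beyond_window / half_life)
    return total


# Density divisors quoted in the post: Dublin 1,582/km², Leitrim 22/km².
print(round(math.log10(1582), 2), round(math.log10(22), 2))

# Hypothetical county, not real data: 100,000 people, 100/km²,
# 60 minutes from the nearest hub, 45-minute training commute.
print(effective_population(100_000, 100, t_hub=60, t_coe=45))

# Hypothetical finals record measured for season 2025.
print(pedigree([(2015, True), (2010, False), (2020, True)], season=2025))
```

The 2020 final in the last call contributes nothing because it falls inside the 10-year exclusion window, which is the whole point of the filter.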
Data sources (all fetched 2026-07-02)

Input | Source | Notes
NFL final standings, 2010-2025 | Wikipedia season pages (_National_Football_League_(Ireland)) | All four divisions; 2023 D4 table via finalwhistle.ie (Wikipedia render truncated), points cross-checked
Championship match logs, 2010-2025 | Wikipedia AI-SFC season pages (_All-Ireland_Senior_Football_Championship) | AI series + provincial finals; 2011-2017 bracket details confirmed via searches grounded in RTÉ/GAA/Irish Examiner match reports
All-Ireland finals, 1887-2025 | List of All-Ireland SFC finals | Pedigree input; 138 finals, replays counted once
Hurling semi-finalists, 2000-2025 | Championship record (compiled) | Drives C_hurling
County populations | List of Irish counties by population | Census 2022 (ROI) / Census 2021 (NI)
County areas & densities | List of Irish counties by area |
NI community background | Religion in Northern Ireland (NISRA county-level Census 2021 table) | "Religion or religion brought up in", Catholic share
Disposable income (PPS 2023) | CSO County Incomes & GDP 2024 + CSO/ONS all-island comparison | ROI: 8 published anchors, rest derived from the deviation index (≤0.9% off anchors). NI: banded between the three published anchors - approximation, flagged
All driving times (hub, commute, airport) | OSRM public router, live table/route calls | Principal-town and second-town gazetteer coordinates; snap distances <105 m; no time-of-day traffic (internally consistent)

Every scraped value is preserved as a flat CSV (obviously, I have that, but it's easy to recreate with just the above).

The weights, and how each was arrived at

This matters for how much to trust each number because there's some amount of over-fitting:

(a) Specified (fixed by the project brief, never tuned): all PPI tier weights; the 120-minute hub ceiling; the 30-minute commute baseline; the density floor.
(b) Decided (explicit modelling choices made during the build): 10-year pedigree lookback (chosen over 20 - the 20-year variant fits worse and was reported); 5-year half-life (chosen for maximal decay; note the leave-one-out sweep favoured 20-30, so this setting trades some out-of-sample robustness for a stronger recency gradient within tradition); exclusion of 2026; county-level spatial granularity.

(c) Calibrated (grid-searched - 2,304 configurations - with leave-one-out R², not raw R², as the selection criterion, to resist overfitting on 32 observations):

Constant | Original spec | v4 calibrated
Density exponent | 0.72 | 1.00
Hub coefficient | 0.48 | 0.48 (survived)
Commute-tax coefficient | 0.35 | 0.70
Hurling penalty α | 0.50 | 1.00
Pedigree runner-up weight | 0.50 | 0.25
log-transform of P_advanced | - | rejected

The configuration that maximised raw R² (0.829) was rejected because it cross-validated worse (LOOCV 0.713 vs 0.743). The regression coefficients themselves (β₀…β₄) are ordinary least squares - nothing hand-set.

Fitted coefficients: β₀ = 19.71, β₁ = 1.03×10⁻⁴ per effective person, β₂ = −0.00094 per PPS, β₃ = +0.0009 per minute (≈0), β₄ = 7.90 per pedigree unit.

Diagnostics: R² = 0.805 · adjusted R² = 0.777 · F(4,27) = 27.9, p = 3×10⁻⁹ · LOOCV R² = 0.743 · Σresiduals = 0 exactly · OI mean 0, sd 1.
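For anyone wanting to reproduce the fit and diagnostics above, here is a rough sketch of the mechanics using NumPy: an OLS fit with intercept, the standardised residual (OI), adjusted R², and a leave-one-out R². The data below is random and purely a stand-in for the 32 counties, all function names are hypothetical, and two details are assumptions because the post doesn't state them: OI divides by the population standard deviation of the residuals, and LOOCV R² is computed as 1 − PRESS/SS_total.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict(beta, X):
    return beta[0] + X @ beta[1:]


def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Penalise R² for k regressors fitted on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def loocv_r_squared(X, y):
    """Refit n times, each time predicting the single held-out row."""
    n = len(y)
    held_out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta_i = fit_ols(X[keep], y[keep])
        held_out[i] = predict(beta_i, X[i])
    return r_squared(y, held_out)


# Synthetic stand-in data (NOT the real county inputs): 32 rows, 4 regressors.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
y = 10.0 + X @ np.array([3.0, -1.0, 0.0, 4.0]) + rng.normal(scale=2.0, size=32)

beta = fit_ols(X, y)
residuals = y - predict(beta, X)
oi = residuals / residuals.std()  # assumption: population sd (ddof=0)

r2 = r_squared(y, predict(beta, X))
print("R2", r2, "adjusted", adjusted_r_squared(r2, n=32, k=4))
print("LOOCV R2", loocv_r_squared(X, y))
print("sum of residuals", residuals.sum(), "OI sd", oi.std())
```

Plugging the reported R² = 0.805 with n = 32 and k = 4 into `adjusted_r_squared` gives about 0.776, in line with the 0.777 quoted above. The leave-one-out figure can never exceed the in-sample R² for an OLS fit, which is why it's the more honest number to select on with only 32 observations.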
The v4 Overperformance leaderboard (2010-2025)

(don't know if this will render as a nice clean table or not; if you copy/paste it into Excel or an AI client, it'll render it)

Rank | County | Actual PPI | Predicted PPI | OI
1 | Mayo | 23.25 | 12.47 | +2.81
2 | Monaghan | 14.06 | 4.52 | +2.48
3 | Donegal | 19.23 | 14.44 | +1.25
4 | Roscommon | 11.09 | 6.52 | +1.19
5 | Galway | 18.45 | 14.46 | +1.04
6 | Armagh | 12.39 | 9.82 | +0.67
7 | Derry | 11.73 | 10.44 | +0.33
8 | Carlow | 3.15 | 1.96 | +0.31
9 | Dublin | 36.81 | 35.63 | +0.31
10 | Louth | 6.26 | 5.34 | +0.24
11 | Cavan | 7.12 | 6.22 | +0.23
12 | Tyrone | 20.19 | 19.92 | +0.07
13 | Westmeath | 4.62 | 4.47 | +0.04
14 | Clare | 5.53 | 5.39 | +0.04
15 | Down | 8.26 | 8.17 | +0.02
16 | Tipperary | 4.23 | 4.23 | +0.00
17 | Kildare | 8.02 | 8.74 | −0.19
18 | Waterford | 2.08 | 2.86 | −0.20
19 | Kerry | 30.67 | 31.58 | −0.24
20 | Kilkenny | 0.00 | 1.02 | −0.27
21 | Limerick | 3.19 | 4.32 | −0.29
22 | Leitrim | 3.15 | 4.43 | −0.33
23 | Fermanagh | 4.81 | 6.31 | −0.39
24 | Laois | 5.57 | 7.57 | −0.52
25 | Sligo | 3.52 | 5.92 | −0.62
26 | Longford | 3.24 | 5.74 | −0.65
27 | Offaly | 3.28 | 6.05 | −0.72
28 | Wicklow | 3.03 | 6.20 | −0.83
29 | Cork | 14.56 | 17.90 | −0.87
30 | Wexford | 3.82 | 8.33 | −1.17
31 | Antrim | 3.41 | 9.60 | −1.61
32 | Meath | 9.13 | 17.24 | −2.1

level (Louth) - Posts: 108 - 02/07/2026 22:27:38 2683792 Link 0 |
|
"This explains why high-population counties like Kilkenny, Antrim, and Waterford..." Kilkenny is not a high-population county. Cockney_Cat (UK) - Posts: 2922 - 03/07/2026 00:31:01 2683808 Link 1 |
|
Definitely a post from Omahant USA top drawer. Saynothing (Tyrone) - Posts: 2825 - 03/07/2026 10:09:47 2683838 Link 1 |
|
[level's model post, quoted in full] You took the words out of my mouth. avonali (Dublin) - Posts: 2059 - 03/07/2026 10:53:39 2683857 Link 0 |
|
An essay on a format change is generally his trait
Gaa_lover (USA) - Posts: 4025 - 03/07/2026 12:44:49 2683905 Link 0 |